Optimization of virtual casing method in SPEC #32
In the STELLOPT virtual_casing_mod.f90 module used in DIAGNO, FIELDLINES, and BEAMS3D, we construct splines of the required quantities over the surface. This alleviates some of the overhead of evaluating the trigonometric functions but isn't necessarily optimal for this problem. To calculate the field from the virtual casing we need three vector quantities: the field on the plasma-vacuum interface (or rather the related surface current), the position where we want to evaluate the field, and the position of the interface. In SPEC the second quantity does not change during the simulation, so it can be computed once on a finite grid. The position of the interface and the surface current, on the other hand, change quite a bit. One hack is to apply stellarator symmetry to reduce the number of points at which the field needs to be evaluated: for stellarator-symmetric configurations we only need to evaluate the field on a grid covering phi=[0,pi/NFP] and theta=[0,pi], because of the mirror symmetry. Compared to a non-stellarator-symmetric simulation, the number of evaluation points is thus reduced by a factor of 4. (VMEC does this when sampling the vacuum field.) However, this is just a 'trick.' Another method one could employ is to pre-calculate the Fourier terms into a matrix: if we always want to evaluate quantities on a known (u,v) mesh, then evaluating the Fourier double integral is just a matrix multiply and only requires an initial evaluation of the trig coefficients. See the following code (from virtual_casing_mod.f90) and the matrix-multiply sketch after it:
SUBROUTINE uvtomn_local(k1,k,mnmax,nu,nv,xu,xv,fmn,xm,xn,f,signs,calc_trig)
IMPLICIT NONE
! INPUT VARIABLES
INTEGER, INTENT(in) :: k1
INTEGER, INTENT(in) :: k
INTEGER, INTENT(in) :: mnmax
INTEGER, INTENT(in) :: nu
INTEGER, INTENT(in) :: nv
DOUBLE PRECISION, INTENT(in) :: xu(1:nu)
DOUBLE PRECISION, INTENT(in) :: xv(1:nv)
DOUBLE PRECISION, INTENT(inout) :: fmn(1:mnmax,k1:k)
INTEGER, INTENT(in) :: xm(1:mnmax)
INTEGER, INTENT(in) :: xn(1:mnmax)
DOUBLE PRECISION, INTENT(in) :: f(1:nu,1:nv,k1:k)
INTEGER, INTENT(in) :: signs
INTEGER, INTENT(in) :: calc_trig
! LOCAL VARIABLES
INTEGER :: mn, i, j, ier, ik
DOUBLE PRECISION :: xn_temp(1:mnmax,1)
DOUBLE PRECISION :: xm_temp(1:mnmax,1)
DOUBLE PRECISION :: pi2_l
DOUBLE PRECISION :: mt(1:mnmax,1:nu)
DOUBLE PRECISION :: nz(1:mnmax,1:nv)
DOUBLE PRECISION :: xu_temp(1,1:nu)
DOUBLE PRECISION :: xv_temp(1,1:nv)
DOUBLE PRECISION, ALLOCATABLE, SAVE :: fnuv(:)
DOUBLE PRECISION, ALLOCATABLE, SAVE :: cosmt(:,:)
DOUBLE PRECISION, ALLOCATABLE, SAVE :: sinmt(:,:)
DOUBLE PRECISION, ALLOCATABLE, SAVE :: cosnz(:,:)
DOUBLE PRECISION, ALLOCATABLE, SAVE :: sinnz(:,:)
! BEGIN SUBROUTINE
pi2_l = 8.0D+0 * ATAN(1.0D+0) ! 2*pi; double-precision literal avoids truncating pi
IF (calc_trig == 1) THEN
IF (ALLOCATED(cosmt)) DEALLOCATE(cosmt)
IF (ALLOCATED(sinmt)) DEALLOCATE(sinmt)
IF (ALLOCATED(cosnz)) DEALLOCATE(cosnz)
IF (ALLOCATED(sinnz)) DEALLOCATE(sinnz)
IF (ALLOCATED(fnuv)) DEALLOCATE(fnuv)
ALLOCATE(fnuv(1:mnmax),STAT=ier)
ALLOCATE(cosmt(1:mnmax,1:nu),sinmt(1:mnmax,1:nu),&
cosnz(1:mnmax,1:nv),sinnz(1:mnmax,1:nv),STAT=ier)
FORALL(i=1:mnmax) xm_temp(i,1)=DBLE(xm(i))
FORALL(i=1:mnmax) xn_temp(i,1)=DBLE(xn(i))
FORALL(i=1:mnmax) fnuv(i) = 2.0D+0/DBLE(nu*nv)
WHERE(xm == 0) fnuv = 5.0D-1*fnuv
FORALL(i=1:nu) xu_temp(1,i)=xu(i)
FORALL(i=1:nv) xv_temp(1,i)=xv(i)
mt = MATMUL(xm_temp,xu_temp)
nz = MATMUL(xn_temp,xv_temp)
FORALL(mn=1:mnmax,i=1:nu) cosmt(mn,i) = dcos(pi2_l*mt(mn,i))
FORALL(mn=1:mnmax,i=1:nu) sinmt(mn,i) = dsin(pi2_l*mt(mn,i))
FORALL(mn=1:mnmax,i=1:nv) cosnz(mn,i) = dcos(pi2_l*nz(mn,i))
FORALL(mn=1:mnmax,i=1:nv) sinnz(mn,i) = dsin(pi2_l*nz(mn,i))
END IF
fmn = zero
IF (signs == 0) THEN
DO mn = 1, mnmax
DO i = 1, nu
DO j = 1, nv
fmn(mn,k1:k) = fmn(mn,k1:k) + f(i,j,k1:k)*(cosmt(mn,i)*cosnz(mn,j) &
- sinmt(mn,i)*sinnz(mn,j))*fnuv(mn)
END DO
END DO
END DO
ELSE IF (signs == 1) THEN
DO mn = 1, mnmax
DO i = 1, nu
DO j = 1, nv
fmn(mn,k1:k) = fmn(mn,k1:k) + f(i,j,k1:k)*(sinmt(mn,i)*cosnz(mn,j) &
+ cosmt(mn,i)*sinnz(mn,j))*fnuv(mn)
END DO
END DO
END DO
END IF
! END SUBROUTINE
END SUBROUTINE uvtomn_local
If we get clever about the grid spacing, it may be possible to simply integrate over fixed grids, removing the overhead of DCUHRE.
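For the signs == 0 (cosine-parity) branch above, once cosmt, sinmt, cosnz, sinnz, and fnuv have been tabulated, the triple accumulation loop can be collapsed into two MATMUL calls plus an element-wise reduction. Below is a minimal single-surface sketch; uvtomn_matmul is a hypothetical helper, not an existing routine in virtual_casing_mod.f90, and the sine-parity (signs == 1) branch is analogous.
! Sketch only: contract the real-space samples f(u,v) against the precomputed
! trig tables with MATMUL instead of the explicit triple loop.
SUBROUTINE uvtomn_matmul(mnmax,nu,nv,f,cosmt,sinmt,cosnz,sinnz,fnuv,fmn)
  IMPLICIT NONE
  INTEGER, INTENT(in) :: mnmax, nu, nv
  DOUBLE PRECISION, INTENT(in)  :: f(nu,nv)                ! real-space samples on the fixed (u,v) mesh
  DOUBLE PRECISION, INTENT(in)  :: cosmt(mnmax,nu), sinmt(mnmax,nu)
  DOUBLE PRECISION, INTENT(in)  :: cosnz(mnmax,nv), sinnz(mnmax,nv)
  DOUBLE PRECISION, INTENT(in)  :: fnuv(mnmax)             ! normalization weights, as in uvtomn_local
  DOUBLE PRECISION, INTENT(out) :: fmn(mnmax)              ! cosine-parity Fourier coefficients
  DOUBLE PRECISION :: cf(mnmax,nv), sf(mnmax,nv)
  ! cf(mn,j) = sum_i cosmt(mn,i)*f(i,j); one dense matrix multiply per parity
  cf = MATMUL(cosmt, f)
  sf = MATMUL(sinmt, f)
  ! fmn(mn) = fnuv(mn) * sum_j ( cf(mn,j)*cosnz(mn,j) - sf(mn,j)*sinnz(mn,j) )
  fmn = fnuv * SUM(cf*cosnz - sf*sinnz, DIM=2)
END SUBROUTINE uvtomn_matmul
Both MATMUL calls can be handed to an optimized BLAS (DGEMM), which is where the speedup over the scalar loop would come from.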
@SRHudson Regarding your debugging of DCUHRE, here's the code used by virtual_casing_mod.f90:
SUBROUTINE bfield_virtual_casing_adapt_dbl(x,y,z,bx,by,bz,istat)
IMPLICIT NONE
! INPUT VARIABLES
DOUBLE PRECISION, INTENT(in) :: x, y, z
DOUBLE PRECISION, INTENT(out) :: bx, by, bz
INTEGER, INTENT(inout) :: istat
! LOCAL VARIABLES
LOGICAL :: adapt_rerun
INTEGER(KIND=8), PARAMETER :: ndim_nag = 2 ! theta,zeta
INTEGER(KIND=8), PARAMETER :: nfun_nag = 3 ! Bx, By, Bz
INTEGER(KIND=8), PARAMETER :: lenwrk_nag = IWRK
INTEGER(KIND=8) :: maxcls_nag,mincls_nag, subs, restar, wrklen, rulcls, wrksbs, n, m, funcls
DOUBLE PRECISION :: absreq_nag, relreq_nag
DOUBLE PRECISION :: wrkstr_nag(lenwrk_nag)
DOUBLE PRECISION :: a_nag(ndim_nag), b_nag(ndim_nag), &
finest_nag(nfun_nag), absest_nag(nfun_nag)
DOUBLE PRECISION, ALLOCATABLE :: vrtwrk(:)
#ifdef NAG
EXTERNAL :: D01EAF
#else
EXTERNAL :: dcuhre
#endif
! BEGIN SUBROUTINE
IF (adapt_tol < 0) THEN
CALL bfield_virtual_casing_dbl(x,y,z,bx,by,bz)
RETURN
END IF
a_nag(1:2) = zero
b_nag(1:2) = one
mincls_nag = MIN_CLS
maxcls_nag = IWRK
!absreq_nag = zero
absreq_nag = adapt_tol ! Talk to Stuart about proper values
relreq_nag = adapt_rel ! Talk to Stuart about proper values
finest_nag = zero
absest_nag = zero
x_nag = x
y_nag = y
z_nag = z
adapt_rerun = .true.
subs = 1
restar = 0
DO WHILE (adapt_rerun)
#ifdef NAG
CALL D01EAF(ndim_nag,a_nag,b_nag,mincls_nag,maxcls_nag,nfun_nag,funsub_nag_b,absreq_nag,&
relreq_nag,lenwrk_nag,wrkstr_nag,finest_nag,absest_nag,istat)
IF (istat == 1 .or. istat == 3) THEN
maxcls_nag = maxcls_nag*10
mincls_nag = -1
restar = 1
WRITE(6,*) '!!!!! WARNING Could not reach desired tolerance !!!!!'
WRITE(6,*) ' BX = ',finest_nag(1),' +/- ',absest_nag(1)
WRITE(6,*) ' BY = ',finest_nag(2),' +/- ',absest_nag(2)
WRITE(6,*) ' BZ = ',finest_nag(3),' +/- ',absest_nag(3)
WRITE(6,*) '!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!'
ELSE IF (mincls_nag <= 32) THEN
mincls_nag = 64
restar = 1
ELSE IF (istat < 0) THEN
bx = zero
by = zero
bz = zero
adapt_rerun=.false.
ELSE
bx = finest_nag(1)
by = finest_nag(2)
bz = finest_nag(3)
adapt_rerun=.false.
END IF
#else
IF (.not.ALLOCATED(vrtwrk)) THEN
wrklen = ((maxcls_nag-ndim_nag)/(2*ndim_nag) + 1)*(2*ndim_nag+2*nfun_nag+2) + 17*nfun_nag + 256
ALLOCATE(vrtwrk(wrklen),STAT=istat)
IF (istat .ne. 0) THEN
WRITE(6,*) ' ALLOCATION ERROR IN: bfield_virtual_casing_adapt_dbl'
WRITE(6,*) ' VARIABLE: VRTWRK, SIZE: ',wrklen
WRITE(6,*) ' ISTAT: ',istat
RETURN
END IF
END IF
CALL dcuhre(ndim_nag,nfun_nag,a_nag,b_nag,mincls_nag,maxcls_nag,funsub_nag_b,absreq_nag,&
relreq_nag,0,wrklen,restar,finest_nag,absest_nag,funcls,istat,vrtwrk)
IF (istat == 1) THEN
maxcls_nag = maxcls_nag*10
mincls_nag = funcls
restar = 1
istat = 0
ELSE IF (istat > 1) THEN
bx = zero
by = zero
bz = zero
adapt_rerun=.false.
DEALLOCATE(vrtwrk)
ELSE
bx = finest_nag(1)
by = finest_nag(2)
bz = finest_nag(3)
adapt_rerun=.false.
DEALLOCATE(vrtwrk)
END IF
#endif
END DO
nlastcall=mincls_nag
RETURN
! END SUBROUTINE
END SUBROUTINE bfield_virtual_casing_adapt_dbl
SUBROUTINE funsub_nag_b(ndim, vec, nfun, f)
IMPLICIT NONE
! INPUT VARIABLES
INTEGER, INTENT(in) :: ndim, nfun
DOUBLE PRECISION, INTENT(in) :: vec(ndim)
DOUBLE PRECISION, INTENT(out) :: f(nfun)
! LOCAL VARIABLES
INTEGER :: ier
DOUBLE PRECISION :: bn, bx, by, bz , xs, ys, zs, gf, gf3, nx, ny, nz, ax, ay, az
! BEGIN SUBROUTINE
! Spline-interpolate the plasma-boundary position (xs,ys,zs) and surface
! current (ax,ay,az) at the quadrature point (u,v) = (vec(1),vec(2))
xs = zero; ys = zero; zs = zero
CALL EZspline_interp(x_spl,vec(1),vec(2),xs,ier)
CALL EZspline_interp(y_spl,vec(1),vec(2),ys,ier)
CALL EZspline_interp(z_spl,vec(1),vec(2),zs,ier)
CALL EZspline_interp(kx_spl,vec(1),vec(2),ax,ier)
CALL EZspline_interp(ky_spl,vec(1),vec(2),ay,ier)
CALL EZspline_interp(kz_spl,vec(1),vec(2),az,ier)
bn = zero   ! normal-field term; zero here
! 1/|x-x'| and its cube for the Biot-Savart-like kernel
gf = one/DSQRT((x_nag-xs)*(x_nag-xs)+(y_nag-ys)*(y_nag-ys)+(z_nag-zs)*(z_nag-zs))
gf3 = gf*gf*gf
! Components of K x (x-x')/|x-x'|^3 (plus the bn term), scaled by norm_nag
f(1) = norm_nag*(ay*(z_nag-zs)-az*(y_nag-ys)+bn*(x_nag-xs))*gf3
f(2) = norm_nag*(az*(x_nag-xs)-ax*(z_nag-zs)+bn*(y_nag-ys))*gf3
f(3) = norm_nag*(ax*(y_nag-ys)-ay*(x_nag-xs)+bn*(z_nag-zs))*gf3
RETURN
END SUBROUTINE funsub_nag_b
The virtual casing method in SPEC is far from optimal for the problem at hand. We should address this.
Background links
- The integration of the equations of propagation of electric waves
- Use of the virtual-casing principle in calculating the containing magnetic field in toroidal plasma systems
- The virtual-casing principle for 3D toroidal systems
- A magnetic diagnostic code for 3D fusion equilibria
- The virtual-casing principle and Helmholtz's theorem
Background info from @SRHudson
It seems that there was some bug introduced when replacing the previous NAG routine with DCUHRE, but this is not the main problem. If anyone wants to fix this bug, please go ahead (I started, and have committed/pushed my changes, but I need to go home now).
In SPEC we need to calculate the magnetic field on a regular grid, Nt in theta and Nz in zeta, on the computational boundary. For each one of these Ntz = Nt * Nz virtual casing integrals, we need to perform a surface integral over the plasma boundary. I think that the best algorithm in SPEC would be to compute the geometry and tangential field on the plasma boundary just once, and then to use this information for all of the virtual casing integrals required on the Nt * Nz grid on the computational boundary. The point is that there is a lot of information that is common, so if we compute the common information only once then SPEC will be faster.
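As an illustration of this structure, here is a minimal, self-contained sketch (not existing SPEC or STELLOPT code; all names are hypothetical) in which the plasma-boundary geometry xs and surface current ks are tabulated once on a fixed Nu x Nv grid and then reused for every observation point on the computational boundary. The per-point sum is the same Biot-Savart-like kernel as in funsub_nag_b, evaluated with uniform weights on the fixed grid instead of at DCUHRE-chosen points.
! Sketch only: reuse precomputed plasma-boundary data for all Ntz field evaluations.
SUBROUTINE vc_field_on_grid(Nu,Nv,Ntz,xs,ks,xobs,B)
  IMPLICIT NONE
  INTEGER, INTENT(in) :: Nu, Nv                  ! fixed grid on the plasma boundary
  INTEGER, INTENT(in) :: Ntz                     ! number of observation points (Nt*Nz)
  DOUBLE PRECISION, INTENT(in)  :: xs(3,Nu,Nv)   ! boundary positions, computed once
  DOUBLE PRECISION, INTENT(in)  :: ks(3,Nu,Nv)   ! surface current (area element and mu0/4pi folded in), computed once
  DOUBLE PRECISION, INTENT(in)  :: xobs(3,Ntz)   ! points on the computational boundary
  DOUBLE PRECISION, INTENT(out) :: B(3,Ntz)
  INTEGER :: p, i, j
  DOUBLE PRECISION :: dx(3), gf3, w
  w = 1.0D+0/DBLE(Nu*Nv)                         ! uniform weight of the periodic trapezoidal rule
  B = 0.0D+0
  DO p = 1, Ntz
     DO j = 1, Nv
        DO i = 1, Nu
           dx  = xobs(:,p) - xs(:,i,j)
           gf3 = 1.0D+0/SQRT(SUM(dx*dx))**3
           ! accumulate K x (x - x') / |x - x'|^3 with uniform weights
           B(1,p) = B(1,p) + w*gf3*(ks(2,i,j)*dx(3) - ks(3,i,j)*dx(2))
           B(2,p) = B(2,p) + w*gf3*(ks(3,i,j)*dx(1) - ks(1,i,j)*dx(3))
           B(3,p) = B(3,p) + w*gf3*(ks(1,i,j)*dx(2) - ks(2,i,j)*dx(1))
        END DO
     END DO
  END DO
END SUBROUTINE vc_field_on_grid
In this arrangement the Nu*Nv boundary arrays only need to be recomputed when the plasma boundary changes, i.e. once per SPEC iteration rather than once per observation point.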
Presently I am using a slow Fourier transform to compute the magnetic field at whatever point DCUHRE requires, and because DCUHRE is adaptive, the points on the surface where DCUHRE needs the geometry/tangential field of the plasma boundary will vary as the position on the computational boundary varies. One approach, which I first suggested, is to use/develop a quadrature algorithm that only requires the integrand on a fixed regular grid. This will probably be fastest, but we will potentially lose the accuracy that adaptive routines usually guarantee. Another approach is to first compute the geometry/tangential field on the plasma boundary on a regular grid and thereafter compute the geometry/tangential field using splines (or something fast). I am not sure that this will really solve the problem, because if we need such adaptivity it probably means we need to run SPEC at higher Fourier resolution in any case.
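One point in favour of the fixed-grid option: when the observation point lies on the computational boundary, away from the plasma boundary, the virtual-casing integrand is smooth and doubly periodic, and for such integrands the uniform-grid (trapezoidal) rule converges spectrally. The toy program below (a quick check of this property, not SPEC code, using an arbitrary smooth periodic test integrand) illustrates the behaviour in 1D.
! Toy check of spectral convergence of the uniform-grid rule for a smooth
! periodic integrand; the error drops to machine precision at small N.
PROGRAM periodic_trapezoid_demo
  IMPLICIT NONE
  INTEGER :: n
  DOUBLE PRECISION :: pi2, ref, approx
  pi2 = 8.0D+0*ATAN(1.0D+0)
  ref = quad(4096)                                ! fine-grid reference value
  DO n = 4, 32, 4
     approx = quad(n)
     WRITE(6,'(A,I4,A,ES12.4)') ' N = ', n, '   |error| = ', ABS(approx-ref)
  END DO
CONTAINS
  DOUBLE PRECISION FUNCTION quad(n)
    INTEGER, INTENT(in) :: n
    INTEGER :: i
    DOUBLE PRECISION :: theta
    quad = 0.0D+0
    DO i = 0, n-1
       theta = pi2*DBLE(i)/DBLE(n)
       quad = quad + EXP(COS(theta))              ! smooth 2*pi-periodic test integrand
    END DO
    quad = quad*pi2/DBLE(n)                       ! uniform weights = trapezoidal rule on a periodic interval
  END FUNCTION quad
END PROGRAM periodic_trapezoid_demo
The same argument carries over to the doubly periodic (theta,zeta) integrals, which is why a fixed grid can compete with an adaptive routine as long as the computational boundary is not too close to the plasma boundary.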