Fornax Code Description
We have developed an entirely new multi-dimensional, multi-group
radiation/hydrodynamic code, Fornax, for the study of core-collapse
supernovae (Skinner et al. 2019). Most of the code is written in C, with only a few Fortran 95
routines for reading in microphysical data tables, and we use an MPI
parallelism model for the Blue Waters Cray/XE6 architecture and a
hybrid MPI/OpenMP parallelism model for KNL architectures. Fornax employs
spherical coordinates in one, two, and three spatial dimensions, solves
the comoving-frame, multi-group, two-moment, velocity-dependent transport
equations to O(v/c), and uses the M1 tensor closure for the second and
third moments of the radiation fields (Dubroca & Feugeas 1999; Vaytet et
al. 2011). Three species of neutrinos are followed using an explicit
Godunov characteristic method applied to the radiation transport
operators, with an implicit solver for the radiation source terms. In
this way, the radiative transport and transfer are
handled locally, without the need for a global solution on the entire
mesh. This is also the recent approach taken by Just, Obergaulinger, &
Janka (2015) and O'Connor & Couch (2015), though with some important
differences. By addressing the transport operator with an explicit method,
we significantly reduce the computational complexity and communication
overhead of traditional multi-dimensional radiative transfer solutions by
bypassing the need for global iterative solvers that have proven to be
slow and/or problematic beyond ~10,000 cores. Strong scaling of the
transport solution in three dimensions using Fornax is excellent beyond
100,000 tasks on KNL and Cray architectures (Skinner et al. 2019). The
light-crossing time of a zone generally sets the timestep, but since the
speed of light and the speed of sound in the inner core are not far apart
in the core-collapse problem after bounce, this numerical stability
constraint on the timestep is similar to the CFL constraint of the
explicit hydrodynamics. Radiation quantities are reconstructed with linear
profiles and the calculated edge states are used to determine fluxes via
an HLLE solver. In the non-hyperbolic regime, the HLLE fluxes are
corrected to reduce numerical diffusion (O'Connor & Ott 2013). The
momentum and energy transfer between the radiation and the gas are
operator-split and addressed implicitly.
The hydrodynamics in Fornax is based on a directionally unsplit
Godunov-type finite-volume method. Fluxes at cell faces are computed with
the fast and accurate HLLC approximate Riemann solver based on left and
right states reconstructed from the underlying volume-averaged states.
The reconstruction is accomplished via a novel algorithm we developed
specifically for Fornax that uses moments of the coordinates within each
cell and the volume-averaged states to reconstruct TVD-limited parabolic
profiles, while requiring one less "ghost cell" than the standard PPM
approach. The profiles always respect the cells' volume averages and, in
smooth parts of the solution away from extrema, yield third-order accurate
states on the faces. We have taken special care in treating the
reconstruction of quantities on the mesh and have formulated our
reconstruction methods to be agnostic to choices of coordinates and mesh
equidistance. As in Mignone (2014), we do this by forming our
reconstruction expressions in terms of volume-averaged quantities and
moments of the coordinates in each cell. The moments are calculated at
startup for the given mesh and coordinates. In the radial direction, we
have a uniform mesh in a coordinate x, where the spherical radius r is
an arbitrary mapping of x. We reconstruct quantities in x, i.e., we
locally compute f(x) to specify f on the faces of each cell. We have no
need of, and do not construct, f(r), for which the samples of f would be
non-uniformly distributed in r. The
same is true in the angular direction. Thus, we have no
"non-equidistant" reconstruction and, therefore, our reconstructions
yield the same results as Mignone (2014) (Skinner et al. 2019). To
eliminate the carbuncle and related phenomena, Fornax specifically
detects strong, grid-aligned shocks and employs in neighboring cells HLLE,
rather than HLLC, fluxes, which introduce a small amount of smoothing in
the transverse direction.
Without gravity, the coupled set of radiation/hydrodynamic equations
conserves energy and momentum to machine accuracy. With gravity, energy
conservation is excellent before and after core bounce. However, as with
all other supernova codes, at bounce the total energy as defined in
integral form glitches by ~10^{49} ergs (Skinner et al. 2019). (Most
supernova codes jump in this quantity at this time by more than 10^{50}
ergs.) This is due to the fact that the gravitational terms are handled in
the momentum and energy equations as source terms and are not in
conservative divergence form.
The code is written in a covariant/coordinate-independent fashion, with
generalized connection coefficients, and so can employ any coordinate
mapping. This facilitates the use of any logically-Cartesian coordinate
system and, if necessary, the artful distribution of zones. In the
interior, to circumvent Courant limits due to converging angular zones,
the code can deresolve in both angles (theta and phi) independently with
decreasing radius, conserving hydrodynamic and radiative fluxes in a
manner similar to the method employed in AMR codes at refinement
boundaries. The use of such a "dendritic grid" allows us to avoid angular
Courant limits at the center, while maintaining accuracy and enabling us
to employ the useful spherical coordinate system natural for the supernova
problem.
Importantly, the overheads for Christoffel symbol calculations are
minimal, since the code uses static mesh refinement, and hence the terms
need to be calculated only once (in the beginning). Therefore, the
overhead associated with the covariant formulation is almost nonexistent.
The metric, Christoffel symbols, and other associated information related
to geometry are computed once upfront and stored in memory. In the
context of a multi-species, multi-group, neutrino radiation hydrodynamics
calculation, the additional memory footprint is small (note that the
radiation requires hundreds of variables to be stored per zone). In terms
of FLOPs, the additional costs are associated with occasionally
transforming between contravariant and covariant quantities and in the
evaluation of the geometric source terms. Again, in the context of a
radiation hydrodynamics calculation, the additional expense is extremely
small.
Gravity is handled in 2D and 3D with a multipole solver (Müller &
Steinmetz 1995), where we generally set the maximum spherical harmonic
order to twelve. The monopole gravitational term is
altered to approximately accommodate general-relativistic gravity (Marek
et al. 2006), and we employ the metric terms, g(rr) and g(tt), derived
from this potential in the neutrino transport equations to incorporate
general relativistic redshift effects (in the manner of Rampp & Janka
2002; see also Skinner et al. 2016). For our recent 2D simulations, we
used the SFHo equation of state (EOS) (Steiner et al. 2013), but also the
K = 220 MeV equation of state of Lattimer & Swesty (1991) and the DD2 EOS
(Banik et al. 2014). For these proposed simulations, we will use the SFHo
equation of state, shown to be consistent with all current nuclear physics
constraints (Steiner et al. 2010).
A comprehensive set of neutrino-matter interactions is followed in
Fornax, as described in Burrows, Reddy, & Thompson (2006). These include
weak magnetism and recoil corrections to neutrino-nucleon scattering and
absorption (Horowitz 2002); ion-ion-correlations, weak screening, and
form-factor corrections for neutrino-nucleus scattering; and inelastic
neutrino-electron scattering using the scheme of Thompson, Burrows, &
Pinto (2003)
and the relativistic formalism summarized in Reddy et al. (1999).
Inelastic neutrino-nucleon scattering is also included using a modified
version of the Thompson, Burrows, & Pinto (2003) approach. Neutrino
sources and sinks due to nucleon-nucleon bremsstrahlung and
electron-positron annihilation are included, as described in Thompson,
Burrows, & Horvath (2000). We currently include a many-body structure
factor correction to the axial-vector term in the neutrino-nucleon
scattering rate due to the neutrino response to nuclear matter at (low)
densities below ~10^{13} g cm^{-3} derived by Horowitz et al. (2017)
using a virial approach. Horowitz et al. (2017) attach their fit to the
structure factor to that of Burrows & Sawyer (1998) at higher densities.
We reiterate: Fornax, being explicit, does not require global iterations
and is ~10 times faster than most full-transport supernova codes.
Moreover, it scales extremely well with core count on Intel Xeon Phi (KNL)
and Cray architectures. Of course, Fornax checkpoints often during a
simulation to enable restarts at numerous timesteps/times and the I/O is
fully parallel. In fact, the code uses parallel HDF5 writes to a single
file with underlying MPI-IO coordination. Each processor writes at a
different offset into a single HDF5 file, which we have found useful for
restarting from checkpoints with different processor counts and topologies,
and which has performed well on 2D runs and low-resolution 3D runs.
However, depending on the scaling at very large core counts, we may find
it advantageous to introduce an additional layer in our I/O pipeline where
only a subset of the MPI ranks interact with the file system, writing
their own data and data from nearby non-I/O MPI processes. Simulation
analysis will be performed locally with the software package VisIt and
with custom-written Python analysis software.
As stated, the main advantages of the Fornax code are its efficiency due
to its explicit nature, its excellent strong scaling to hundreds of
thousands of cores, its truly multi-dimensional transport solution, and
the interior static mesh derefinement in the core. In our strong-scaling
study with a total of 3 × 20 radiation groups and a cell size of ~0.5
km on the finest level, it took about 0.5-1.0 seconds to advance one
timestep on ~100,000 cores of Cori II (KNL). On Blue Waters, this is
approximately equivalent to ~10-20 MCPU-hours (~one million node-hours)
per ~1000 milliseconds of physical time, a canonical interval during
which the important early dynamics of the explosion mechanism is
traditionally addressed. For the first time, it is now possible not
only to contemplate, but to carry out and analyze, multiple 3D
radiation/hydro simulations per year, incorporating all the
radiation/hydro feedbacks.
To benchmark the parallel performance of our code under production-run
conditions, we ran a full 3D simulation with 20 energy groups, 12th-order
multipole gravity with GR corrections to the monopole component, and a
resolution of 608 × 256 × 512 (radial, poloidal, toroidal) zones for 10
cycles with an increasing number of pure MPI tasks, out to 110k tasks.
The results demonstrate that our code runs extremely fast on Cori II,
with an excellent strong-scaling efficiency of over 90 percent. In fact,
because of the way we use a greedy algorithm to load-balance the
decomposition of our dendritic mesh, the efficiency peaks at larger task
counts. These results indicate that our code is currently ready for full
production runs at scale.