Fornax Code Description

We have developed an entirely new multi-dimensional, multi-group radiation/hydrodynamic code, Fornax, for the study of core-collapse supernovae (Skinner et al. 2019). Most of the code is written in C, with only a few Fortran 95 routines for reading in microphysical data tables, and we use an MPI parallelism model for the Blue Waters Cray/XE6 architecture and a hybrid MPI/OpenMP parallelism model for KNL architectures. Fornax employs a spherical grid in one, two, and three spatial dimensions, solves the comoving-frame, multi-group, two-moment, velocity-dependent transport equations to O(v/c), and uses the M1 tensor closure for the second and third moments of the radiation fields (Dubroca & Feugeas 1999; Vaytet et al. 2011). Three species of neutrino are followed; the radiation transport operators are handled with an explicit Godunov characteristic method, while the radiation source terms are handled with an implicit solver. In this way, the radiative transport and transfer are handled locally, without the need for a global solution on the entire mesh. This is also the recent approach taken by Just, Obergaulinger, & Janka (2015) and O'Connor & Couch (2015), though with some important differences. By addressing the transport operator with an explicit method, we significantly reduce the computational complexity and communication overhead of traditional multi-dimensional radiative transfer solutions, bypassing the need for global iterative solvers that have proven to be slow and/or problematic beyond ~10,000 cores. Strong scaling of the transport solution in three dimensions using Fornax is excellent beyond 100,000 tasks on KNL and Cray architectures (Skinner et al. 2019). The light-crossing time of a zone generally sets the timestep, but since the speed of light and the speed of sound in the inner core are not far apart in the core-collapse problem after bounce, this numerical stability constraint on the timestep is similar to the CFL constraint of the explicit hydrodynamics. Radiation quantities are reconstructed with linear profiles, and the calculated edge states are used to determine fluxes via an HLLE solver. In the non-hyperbolic regime, the HLLE fluxes are corrected to reduce numerical diffusion (O'Connor & Ott 2013). The momentum and energy transfer between the radiation and the gas are operator-split and addressed implicitly.
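
To make the closure step concrete, the following is a minimal sketch, in C, of an M1-type closure of the Levermore / Dubroca & Feugeas form, in which the radiation pressure tensor interpolates between the diffusion and free-streaming limits. The function and variable names are illustrative only and do not correspond to Fornax's internal routines.

/* Minimal sketch of an M1-type closure (Dubroca & Feugeas 1999; Levermore 1984).
 * Names are illustrative, not Fornax's internals. */
#include <math.h>

/* Eddington factor chi(f), with f = |F|/(cE) the reduced flux. */
static double eddington_factor(double f)
{
    if (f > 1.0) f = 1.0;   /* guard against round-off */
    return (3.0 + 4.0*f*f) / (5.0 + 2.0*sqrt(4.0 - 3.0*f*f));
}

/* Radiation pressure tensor P[i][j] from the energy density E and the
 * flux F[3], interpolating between the diffusion (chi = 1/3) and
 * free-streaming (chi = 1) limits along the flux direction n. */
void m1_pressure(double E, const double F[3], double c, double P[3][3])
{
    double Fmag = sqrt(F[0]*F[0] + F[1]*F[1] + F[2]*F[2]);
    double f    = (E > 0.0) ? Fmag/(c*E) : 0.0;
    double chi  = eddington_factor(f);
    double n[3] = {0.0, 0.0, 0.0};

    if (Fmag > 0.0)
        for (int i = 0; i < 3; i++) n[i] = F[i]/Fmag;

    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            P[i][j] = E*( 0.5*(1.0 - chi)*(i == j ? 1.0 : 0.0)
                        + 0.5*(3.0*chi - 1.0)*n[i]*n[j] );
}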

The hydrodynamics in Fornax is based on a directionally unsplit Godunov-type finite-volume method. Fluxes at cell faces are computed with the fast and accurate HLLC approximate Riemann solver, using left and right states reconstructed from the underlying volume-averaged states. The reconstruction is accomplished via a novel algorithm we developed specifically for Fornax that uses moments of the coordinates within each cell and the volume-averaged states to reconstruct TVD-limited parabolic profiles, while requiring one less "ghost cell" than the standard PPM approach. The profiles always respect the cells' volume averages and, in smooth parts of the solution away from extrema, yield third-order accurate states on the faces. We have taken special care in treating the reconstruction of quantities on the mesh and have formulated our reconstruction methods to be agnostic to choices of coordinates and mesh equidistance. As in Mignone (2014), we do this by forming our reconstruction expressions in terms of volume-averaged quantities and moments of the coordinates in each cell. The moments are calculated at startup for the given mesh and coordinates. In the radial direction, we have a uniform mesh in the coordinate x, where the spherical radius r is an arbitrary mapping of x. We reconstruct quantities in x, i.e. we locally compute f(x) to specify f on the faces of each cell. The mesh in x is entirely uniform; we have no need of, and do not construct, f(r), for which the samples of f would be non-uniformly distributed in r. The same is true in the angular direction. Thus, we have no "non-equidistant" reconstruction and, therefore, our reconstructions yield the same results as Mignone (2014) (Skinner et al. 2019). To eliminate the carbuncle and related phenomena, Fornax specifically detects strong, grid-aligned shocks and employs HLLE, rather than HLLC, fluxes in the neighboring cells, which introduces a small amount of smoothing in the transverse direction.
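
To illustrate how reconstructing in terms of volume averages and coordinate moments works, the sketch below shows a simplified, linear (minmod-limited) version on the uniform mapped coordinate x. Fornax itself builds TVD-limited parabolic profiles from higher coordinate moments, so this is meant only to convey the key point that expanding about each cell's volume centroid preserves the volume average for any mapping r(x); the names are illustrative.

/* Simplified, LINEAR sketch of reconstruction from volume averages on a
 * uniform mesh in the mapped coordinate x (r = r(x) arbitrary).  Fornax
 * uses TVD-limited parabolic profiles built from higher coordinate
 * moments; names here are illustrative only. */
#include <math.h>

static double minmod(double a, double b)
{
    if (a*b <= 0.0) return 0.0;
    return (fabs(a) < fabs(b)) ? a : b;
}

/* qbar[] are volume averages, xc[] the precomputed volume centroids of
 * x in each cell, xf[] the face coordinates (xf[i] is the inner face of
 * cell i).  Expanding about the volume centroid guarantees that the
 * profile reproduces the cell's volume average for any mapping r(x). */
void reconstruct_linear(const double *qbar, const double *xc,
                        const double *xf, int i,
                        double *q_inner, double *q_outer)
{
    double sL = (qbar[i]   - qbar[i-1])/(xc[i]   - xc[i-1]);
    double sR = (qbar[i+1] - qbar[i]  )/(xc[i+1] - xc[i]  );
    double s  = minmod(sL, sR);

    *q_inner = qbar[i] + s*(xf[i]   - xc[i]);
    *q_outer = qbar[i] + s*(xf[i+1] - xc[i]);
}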

Without gravity, the coupled set of radiation/hydrodynamic equations conserves energy and momentum to machine accuracy. With gravity, energy conservation is excellent before and after core bounce. However, as with all other supernova codes, the total energy, as defined in integral form, jumps at bounce by ~10^{49} ergs (Skinner et al. 2019). (Most supernova codes jump in this quantity at this time by more than 10^{50} ergs.) This is because the gravitational terms are handled in the momentum and energy equations as source terms and are not in conservative divergence form.

The code is written in a covariant/coordinate-independent fashion, with generalized connection coefficients, and so can employ any coordinate mapping. This facilitates the use of any logically-Cartesian coordinate system and, if necessary, the artful distribution of zones. In the interior, to circumvent Courant limits due to converging angular zones, the code can deresolve in both angles (theta and phi) independently with decreasing radius, conserving hydrodynamic and radiative fluxes in a manner similar to the method employed in AMR codes at refinement boundaries. The use of such a "dendritic grid" allows us to avoid angular Courant limits at the center, while maintaining accuracy and enabling us to employ the useful spherical coordinate system natural for the supernova problem.
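
The flux handling at a dendritic coarsening boundary can be sketched as follows, assuming for illustration a simple case where several fine angular faces abut one coarse face; the array names are hypothetical. The coarse-face flux is the area-weighted sum of the fine-face fluxes it covers, exactly as at an AMR refinement boundary, so conservation is preserved across the resolution jump.

/* Sketch of conservative flux matching at a dendritic coarsening
 * boundary, where several fine angular faces abut one coarse face
 * (analogous to an AMR refinement boundary).  Names are hypothetical. */
double coarse_face_flux(const double *fine_flux,  /* flux density per fine face */
                        const double *fine_area,  /* area of each fine face     */
                        int nfine, double coarse_area)
{
    double sum = 0.0;
    for (int k = 0; k < nfine; k++)
        sum += fine_flux[k]*fine_area[k];   /* total flux through the fine faces */
    return sum/coarse_area;                 /* flux density on the coarse face   */
}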

Importantly, the overhead of the Christoffel symbol calculations is minimal: since the code uses static mesh refinement, the metric, the Christoffel symbols, and the other associated geometric information need to be computed only once, at startup, and stored in memory, so the overhead associated with the covariant formulation is almost nonexistent. In the context of a multi-species, multi-group neutrino radiation hydrodynamics calculation, the additional memory footprint is small (note that the radiation alone requires hundreds of variables to be stored per zone). In terms of FLOPs, the additional costs are associated with occasionally transforming between contravariant and covariant quantities and with evaluating the geometric source terms. Again, in the context of a radiation hydrodynamics calculation, this additional expense is extremely small.
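
Schematically, the one-time geometric setup looks like the sketch below, written here for ordinary spherical coordinates; the struct layout, the function, and the subset of Christoffel symbols shown are hypothetical, and Fornax's covariant machinery is more general.

/* One-time setup of per-zone geometric data, written for ordinary
 * spherical coordinates (r, theta, phi).  Illustrative only. */
#include <math.h>
#include <stdlib.h>

typedef struct {
    double g_rr, g_thth, g_phph;         /* diagonal metric components   */
    double sqrtg;                        /* sqrt(det g) = r^2 sin(theta) */
    double Gamma_r_thth, Gamma_r_phph;   /* -r and -r sin^2(theta)       */
} ZoneGeom;

ZoneGeom *init_geometry(const double *rc, const double *thc, int nr, int nth)
{
    ZoneGeom *geom = malloc((size_t)nr*nth*sizeof *geom);
    for (int i = 0; i < nr; i++) {
        for (int j = 0; j < nth; j++) {
            double r = rc[i], s = sin(thc[j]);
            ZoneGeom *z = &geom[i*nth + j];
            z->g_rr = 1.0;  z->g_thth = r*r;  z->g_phph = r*r*s*s;
            z->sqrtg = r*r*s;
            z->Gamma_r_thth = -r;
            z->Gamma_r_phph = -r*s*s;
        }
    }
    return geom;   /* computed once at startup; reused every timestep */
}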

Gravity is handled in 2D and 3D with a multipole solver (Müller & Steinmetz 1995), where we generally set the maximum spherical harmonic order to l = 12. The monopole gravitational term is altered to approximately accommodate general-relativistic gravity (Marek et al. 2006), and we employ the metric terms g_{rr} and g_{tt} derived from this potential in the neutrino transport equations to incorporate general-relativistic redshift effects (in the manner of Rampp & Janka 2002; see also Skinner et al. 2016). For our recent 2D simulations we used the SFHo equation of state (EOS) (Steiner et al. 2013), as well as the K = 220 MeV EOS of Lattimer & Swesty (1991) and the DD2 EOS (Banik et al. 2014). For these proposed simulations, we will use the SFHo EOS, which has been shown to be consistent with all current nuclear physics constraints (Steiner et al. 2010).
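
For reference, a minimal sketch of the Newtonian multipole evaluation (axisymmetric case, l up to 12) is given below. The radial moment arrays, their accumulation from the density field, and the general-relativistic correction to the monopole are assumed to be handled elsewhere, and all names are illustrative rather than Fornax's.

/* Sketch of an axisymmetric Newtonian multipole potential up to l = 12,
 * in the spirit of Muller & Steinmetz (1995).  The radial moments Min
 * and Mout are assumed precomputed; the GR monopole correction
 * (Marek et al. 2006) is omitted. */
#include <math.h>

#define LMAX 12
#define G_NEWTON 6.674e-8   /* cgs */

/* Legendre polynomials P_0..P_LMAX at x via the standard recurrence. */
static void legendre(double x, double P[LMAX+1])
{
    P[0] = 1.0;  P[1] = x;
    for (int l = 1; l < LMAX; l++)
        P[l+1] = ((2*l + 1)*x*P[l] - l*P[l-1])/(l + 1);
}

/* Potential at radius r, angle cosine costheta, radial index ir, with
 *   Min[l*nr + ir]  = int_0^r     rho_l(r') r'^(l+2) dr'
 *   Mout[l*nr + ir] = int_r^Rmax  rho_l(r') r'^(1-l) dr'
 *   rho_l(r')       = 2 pi int_0^pi rho(r',th') P_l(cos th') sin th' dth'. */
double phi_multipole(double r, double costheta, int ir, int nr,
                     const double *Min, const double *Mout)
{
    double P[LMAX+1], phi = 0.0;
    legendre(costheta, P);
    for (int l = 0; l <= LMAX; l++)
        phi += P[l]*( Min[l*nr + ir]/pow(r, l+1) + Mout[l*nr + ir]*pow(r, l) );
    return -G_NEWTON*phi;
}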

A comprehensive set of neutrino-matter interactions is followed in Fornax; these are described in Burrows, Reddy, & Thompson (2006). They include weak-magnetism and recoil corrections to neutrino-nucleon scattering and absorption (Horowitz 2002); ion-ion correlations, weak screening, and form-factor corrections for neutrino-nucleus scattering; and inelastic neutrino-electron scattering using the scheme of Thompson, Burrows, & Pinto (2003) and the relativistic formalism summarized in Reddy et al. (1999). Inelastic neutrino-nucleon scattering is also included, using a modified version of the Thompson, Burrows, & Pinto (2003) approach. Neutrino sources and sinks due to nucleon-nucleon bremsstrahlung and electron-positron annihilation are included, as described in Thompson, Burrows, & Horvath (2000). We currently include a many-body structure-factor correction to the axial-vector term in the neutrino-nucleon scattering rate, accounting for the response of nuclear matter at (low) densities below ~10^{13} g cm^{-3}, derived by Horowitz et al. (2017) using a virial approach. Horowitz et al. (2017) attach their fit to the structure factor to that of Burrows & Sawyer (1998) at higher densities.

We reiterate: Fornax, being explicit, does not require global iterations and is ~10 times faster than most full-transport supernova codes. Moreover, it scales extremely well with core count on Intel Xeon Phi (KNL) and Cray architectures. Of course, Fornax checkpoints often during a simulation to enable restarts at numerous timesteps/times, and the I/O is fully parallel. In fact, the code uses parallel HDF5 writes to a single file with underlying MPI-IO coordination. Each processor writes at a different offset into a single HDF5 file, which we have found useful for restarting from checkpoints with different processor counts and topologies, and which has performed well on 2D runs and low-resolution 3D runs.
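
A stripped-down sketch of this kind of collective, single-file parallel HDF5 write is shown below; the dataset name, the one-dimensional layout, and the function signature are illustrative and not Fornax's actual checkpoint format.

/* Sketch of a collective parallel-HDF5 checkpoint write: every MPI rank
 * writes its block of a shared dataset in one file at its own offset.
 * Illustrative only. */
#include <mpi.h>
#include <hdf5.h>

void write_checkpoint(const char *fname, const double *local_data,
                      hsize_t nlocal, hsize_t offset, hsize_t nglobal)
{
    /* Open the file collectively with the MPI-IO driver. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate(fname, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* One global dataset; each rank selects its own hyperslab. */
    hid_t filespace = H5Screate_simple(1, &nglobal, NULL);
    hid_t dset = H5Dcreate2(file, "rho", H5T_NATIVE_DOUBLE, filespace,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &offset, NULL,
                        &nlocal, NULL);
    hid_t memspace = H5Screate_simple(1, &nlocal, NULL);

    /* Collective write through MPI-IO. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, local_data);

    H5Pclose(dxpl); H5Sclose(memspace); H5Sclose(filespace);
    H5Dclose(dset); H5Fclose(file); H5Pclose(fapl);
}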

However, depending on the scaling at very large core counts, we may find it advantageous to introduce an additional layer in our I/O pipeline in which only a subset of the MPI ranks interact with the file system, writing their own data and the data from nearby non-I/O MPI processes. Simulation analysis will be performed locally with the software package VisIt and with custom-written Python analysis software.

As stated, the main advantages of the Fornax code are its efficiency due to its explicit nature, its excellent strong scaling to hundreds of thousands of cores, its truly multi-dimensional transport solution, and the interior static mesh derefinement in the core. In our strong-scaling study with a total of 3 × 20 radiation groups and a cell size of ~0.5 km on the finest level, it took ~0.5-1.0 seconds to advance one timestep on ~100,000 cores of Cori II (KNL). On Blue Waters, this is approximately equivalent to ~10-20 MCPU-hours (~one million node-hours) per ~1000 milliseconds of physical time, a canonical interval during which the important early dynamics of the explosion mechanism is traditionally addressed. For the first time, it is now possible not only to contemplate, but actually to perform and analyze, multiple 3D radiation/hydro simulations per year, incorporating all the radiation/hydro feedbacks.
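
As a rough consistency check of these numbers (assuming, as described above, that the timestep is set by the light-crossing time of the finest ~0.5 km zones):

    dt ~ 0.5 km / c ~ 1.7 × 10^{-6} s, i.e. ~6 × 10^{5} timesteps per 1000 ms of physical time;
    at ~0.75 s per step on ~10^{5} cores, this is ~4.5 × 10^{5} s of wall-clock time,
    or ~1.3 × 10^{7} core-hours (~13 MCPU-hours), consistent with the range quoted above.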

To benchmark the parallel performance of our code under production-run conditions, we ran a full 3D simulation with 20 energy groups, 12th-order multipole gravity with GR corrections to the monopole component, and a resolution of 608 × 256 × 512 (radial, poloidal, toroidal) for 10 cycles with an increasing number of pure MPI tasks, out to 110,000 tasks. The results demonstrate that our code runs extremely fast on Cori II, with excellent strong-scaling efficiency of over 90 percent. In fact, because of the way we use a greedy algorithm to load-balance the decomposition of our dendritic mesh, the efficiency peaks at larger task counts. These results indicate that our code is currently ready for full production runs at scale.