Brad Whitlock and Earl Duque
Simulations running at high concurrency on HPC systems generate large volumes of data that are impractical to write to disk due to time and storage constraints. Applications often adapt by saving data infrequently, resulting in datasets with poor temporal resolution. This can make datasets difficult to interpret during post hoc visualization and analysis, or worse, it can lead to lost science. In situ visualization moves the visualization process into the simulation code itself, enabling algorithms to operate directly on application data structures to create smaller, more concentrated data products such as rendered images. In situ visualization is inexpensive enough to be applied frequently, and its results are small enough that many data products can be generated while still saving time and storage relative to writing volume data files.
In situ visualization can be applied to generate data products such as extract databases, which consist of extracted polygonal geometry plus fields. Extract databases can drastically reduce the amount of data being saved while still providing enough information to perform useful visualization and data analysis post hoc. They also enable flexibility: most of the costly data reductions occur in situ, and the reduced datasets can be visualized on more modest compute resources. Data reduction and the ability to later explore features of interest, as opposed to static, rendered images, are what make extract databases compelling. Libsim is an in situ infrastructure that brings the features of VisIt, a flexible parallel visualization tool designed for massive data, into the simulation. Libsim and VisIt provide a large set of plots and operators that can be used to isolate features of interest and export them to formats such as the FieldView extract database (XDB) format.

We have instrumented the AVF-LESLIE reactive flow multiphysics solver with Libsim to provide in situ visualization and extract database generation. AVF-LESLIE is mainly used for Direct Numerical Simulation and Large Eddy Simulation (DNS/LES) investigation of canonical reactive flows. It solves the reactive, multi-species, compressible Navier-Stokes equations using a finite volume discretization on a Cartesian grid. We used AVF-LESLIE to simulate an unsteady, turbulent mixing layer (TML) between two fluids. The simulation captures the evolution of turbulent, braided flow structures (our features of interest) that form as the system breaks down into homogeneous turbulence. The rapid evolution of these structures requires access to many simulation time steps to produce a faithful visualization.
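A quantity commonly used to identify such turbulent structures is the vorticity, the curl of the velocity field, which on a uniform Cartesian grid can be approximated with central differences. The NumPy sketch below is purely illustrative of that computation; it is not AVF-LESLIE's implementation, and the function name and argument layout are our own:

```python
import numpy as np

def vorticity_magnitude(u, v, w, dx, dy, dz):
    """Approximate |curl(u, v, w)| on a uniform Cartesian grid using
    second-order central differences (np.gradient). Arrays are indexed
    (x, y, z); dx, dy, dz are the grid spacings along each axis."""
    du_dy = np.gradient(u, dy, axis=1)
    du_dz = np.gradient(u, dz, axis=2)
    dv_dx = np.gradient(v, dx, axis=0)
    dv_dz = np.gradient(v, dz, axis=2)
    dw_dx = np.gradient(w, dx, axis=0)
    dw_dy = np.gradient(w, dy, axis=1)
    wx = dw_dy - dv_dz  # x component of curl
    wy = du_dz - dw_dx  # y component of curl
    wz = dv_dx - du_dy  # z component of curl
    return np.sqrt(wx**2 + wy**2 + wz**2)
```

For example, a solid-body rotation u = -y, v = x, w = 0 has constant vorticity magnitude 2, which the sketch recovers exactly because the field is linear.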
For the case we ran, however, each time step produced hundreds of gigabytes of data, making it infeasible to save enough time steps; post-processing that much data would have been as resource-intensive as computing it. To combat these difficulties, we implemented an in situ workflow to isolate features of interest based on isosurfaces and planar slices of the vorticity vector field. The geometry representing the features of interest is exported to XDB format, and Libsim also renders images in situ. The extract databases are later processed in parallel using Intelligent Light's FieldView software.
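The scale of reduction such extracts achieve can be seen with a toy sketch: a planar slice keeps a single 2D plane of a 3D volume, so per-variable storage drops by a factor of the grid extent along the sliced axis. The array sizes below are hypothetical, and the actual XDB export is performed by Libsim and FieldView, not by code like this:

```python
import numpy as np

def planar_slice(field, axis, index):
    """Extract a 2D planar slice from a 3D field array -- the kind of
    reduction an in situ slice extract performs on a structured grid."""
    return np.take(field, index, axis=axis)

# Hypothetical 512^3 double-precision field: one planar extract is
# 512x smaller than the full volume, before any geometry compression.
n = 512
volume_bytes = n**3 * 8              # full volume, one variable
slice_bytes = n**2 * 8               # one planar extract
reduction = volume_bytes / slice_bytes
```

Isosurface extracts reduce data similarly, since a surface through a volume is also roughly one dimension smaller than the volume itself.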
This paper details the extract database workflow, the modifications required to instrument AVF-LESLIE for in situ processing, the setup of the turbulent mixing layer use case, and the benefits realized from in situ visualization. The instrumented AVF-LESLIE code has run on up to 131K cores on the Titan computer at Oak Ridge National Laboratory, and we include benchmarks comparing volume I/O with extract-based I/O as well as scaling results for Libsim while performing data extraction and rendering.