restart instantly crashes with !Error!!!--> Null dt #24

@astroboylrx

I was playing with p3diso and ran into a weird bug that seems to be related to VTK outputs: restarting a simulation from VTK data files sometimes instantly throws !Error!!!--> Null dt, and sometimes runs forever with a tiny dt.

Below is a minimal example that reproduces this bug.
Starting from the current repo, I added FARGO_OPT += -DFLOOR to p3diso.opt (to avoid other potential issues).
My p3diso.par then looks like:

Setup                  p3diso

### Disk parameters
AspectRatio            0.0375                  Thickness over Radius in the disc
Sigma0                 0.001                   Surface Density at r=1
Nu                     3.0e-5
SigmaSlope             1.0                     Slope of surface
FlaringIndex           0.5

### Planet parameters
PlanetConfig           planets/jupiter.cfg
ThicknessSmoothing     0.1                     Smoothing parameters in disk thickness

### Numerical method parameters
Disk                   YES
OmegaFrame             1.0
Frame                  F
IndirectTerm           No

### Mesh parameters
Nx                     200                     Azimuthal number of zones 
Ny                     96                      Radial number of zones
Nz                     16                      Number of zones in colatitude
Ymin                   0.5                     Inner boundary radius
Ymax                   2.0                     Outer boundary radius
Zmin                   1.3832963267948966
Zmax                   1.57079632679489661922
Xmin                   -3.141592653589793
Xmax                   3.141592653589793

### Output control parameters
Ntot                   12                      Total number of time steps
Ninterm                2                       Time steps between outputs
DT                     0.3141592653589793      Time step length. 2PI = 1 orbit
OutputDir              @outputs/p3diso
VTK                    YES

Field                  gasdens
PlotLine               field[-1,:,:]

I think it is not too far from the built-in setup file, except for using planets/jupiter.cfg and adding the VTK YES line.
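As a quick sanity check on the output control parameters, the sketch below computes the date expected for each output. It assumes that output n is written at t = n * Ninterm * DT, which is consistent with the dates printed in the logs further down but is not taken from the FARGO3D source. The function name `output_times` is made up for this illustration.

```python
# Values copied from the p3diso.par shown above.
NTOT = 12                    # Ntot
NINTERM = 2                  # Ninterm
DT = 0.3141592653589793      # DT

def output_times(ntot, ninterm, dt):
    """Expected date of each output, assuming output n is at n * ninterm * dt."""
    n_outputs = ntot // ninterm
    return [n * ninterm * dt for n in range(n_outputs + 1)]

times = output_times(NTOT, NINTERM, DT)
for n, t in enumerate(times):
    print(f"OUTPUTS {n} at date t = {t:.6f}")
```

Under this assumption, output 5 falls at t = 3.141593 and output 6 at t = 3.769911, so a restart from output 5 with Ntot=12 only has one output interval left to run.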

With these, I can get 6 outputs fine. But restarting from, say, the 5th output triggers the bug. The output is summarized below:

➜➜$ ./fargo3d ./setups/p3diso/p3diso.par -t 2>&1 | tee outputs/p3diso/output.txt
......
===========================
FARGO3D git version 2.0-41-gf3593281-dirty
SETUP: 'p3diso'
===========================
......
The default output directory root is ./
The output directory is ./outputs/p3diso/
I do not output the ghost values
......
Found 0 communicators
Standard version with no ghost zones in X
Time counters initialized
OUTPUTS 0 at date t = 0.000000 OK
!.......................
Process 0 created the directory ./outputs/p3diso/monitor/gas/FG000000/
....................................................
⋮
⋮
OUTPUTS 4 at date t = 2.513274 OK
!..........................................................
Process 0 created the directory ./outputs/p3diso/monitor/gas/FG000004/
...............................................................
OUTPUTS 5 at date t = 3.141593 OK
!...............................................................
Process 0 created the directory ./outputs/p3diso/monitor/gas/FG000005/
...............................................................
OUTPUTS 6 at date t = 3.769911 OK
End of the simulation!

➜➜$ ./fargo3d ./setups/p3diso/p3diso.par -S 5 -o "Ntot=12" -t 2>&1 | tee outputs/p3diso/output_run1.txt
⋮
⋮
Standard version with no ghost zones in X
Time counters initialized
OUTPUTS 5 at date t = 3.141593 OK
!Error!!!--> Null dt

If I add masterprint("[RL-debug]: dt=%.8f\n", StepTime); to src/cfl_fluids_min.c, right before if(StepTime <= SMALLTIME), and redo the test above, I get:

➜➜$ ./fargo3d ./setups/p3diso/p3diso.par -t 2>&1 | tee outputs/p3diso/output.txt
⋮
⋮
.[RL-debug]: dt=0.00504203
.
OUTPUTS 6 at date t = 3.769911 OK
End of the simulation!

➜➜$ ./fargo3d ./setups/p3diso/p3diso.par -S 5 -o "Ntot=12" -t 2>&1 | tee outputs/p3diso/output_run1.txt
⋮
⋮
OUTPUTS 5 at date t = 3.141593 OK
[RL-debug]: dt=0.00486245
![RL-debug]: dt=0.00000000
Error!!!--> Null dt

The timestep behaves fine in the first run but somehow drops to nearly zero after a restart. I can reproduce this bug with either PARALLEL=1 or GPU=1.
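To locate the step at which dt collapses in a longer log, the `[RL-debug]` lines can be pulled out with a short script. This is only a sketch: the threshold `SMALL_DT` is a hypothetical stand-in and not the value of FARGO3D's SMALLTIME, and the sample lines are copied from the restart log above.

```python
import re

# Matches the debug lines, including ones prefixed with "." or "!".
DT_PATTERN = re.compile(r"\[RL-debug\]: dt=([0-9.eE+-]+)")
SMALL_DT = 1e-10  # hypothetical threshold, NOT FARGO3D's SMALLTIME

def extract_dts(lines):
    """Return every dt value found in the given log lines, in order."""
    return [float(m.group(1)) for line in lines for m in DT_PATTERN.finditer(line)]

def first_collapse(dts, threshold=SMALL_DT):
    """Index of the first dt at or below the threshold, or None if there is none."""
    for i, dt in enumerate(dts):
        if dt <= threshold:
            return i
    return None

sample = [
    "OUTPUTS 5 at date t = 3.141593 OK",
    "[RL-debug]: dt=0.00486245",
    "![RL-debug]: dt=0.00000000",
    "Error!!!--> Null dt",
]
dts = extract_dts(sample)
print(dts, first_collapse(dts))
```

Since the debug line uses %.8f, a printed 0.00000000 only shows that dt is below about 5e-9, not that it is exactly zero; printing with %e would give the actual magnitude.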

After lots of experiments, I found that VTK could be the issue, as commenting out VTK YES seems to fix the problem.
So I took a look at OUTPUTS 5 to understand what is wrong with the vtk files compared to the dat files. To the best of my understanding, for spherical coordinates, the data is laid out as (Nz, Nx, Ny) in vtk and (Nz, Ny, Nx) in dat. The first and last numbers do match between vtk and dat. However, I cannot make the two arrays identical using np.transpose or np.swapaxes after np.reshape. Moreover, if I use imshow with origin='lower' to plot the last Ny*Nx numbers in gasdens (which should be the density at the midplane), the vtk data seems to have a problem:

Image

But I cannot find any issue in output_vtk.c or RestartVTK. At this point, I'm not sure how to proceed. I hope the info above is enough for debugging.
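For reference, the layout comparison described above can be sketched with plain index arithmetic on synthetic data. The sketch assumes the two layouts are exactly (Nz, Nx, Ny) for vtk and (Nz, Ny, Nx) for dat, as stated above, and it does not read any real output file. The helper names are hypothetical.

```python
def dat_index(k, j, i, ny, nx):
    """Flat offset of cell (k, j, i) when stored as (Nz, Ny, Nx)."""
    return (k * ny + j) * nx + i

def vtk_index(k, j, i, ny, nx):
    """Flat offset of the same cell when stored as (Nz, Nx, Ny)."""
    return (k * nx + i) * ny + j

def vtk_to_dat(vtk_flat, nz, ny, nx):
    """Reorder a flat (Nz, Nx, Ny) array into flat (Nz, Ny, Nx) order."""
    dat_flat = [None] * (nz * ny * nx)
    for k in range(nz):
        for j in range(ny):
            for i in range(nx):
                dat_flat[dat_index(k, j, i, ny, nx)] = vtk_flat[vtk_index(k, j, i, ny, nx)]
    return dat_flat

# Synthetic field: each cell holds its own dat offset.
nz, ny, nx = 2, 3, 4
dat = list(range(nz * ny * nx))
vtk = [None] * len(dat)
for k in range(nz):
    for j in range(ny):
        for i in range(nx):
            vtk[vtk_index(k, j, i, ny, nx)] = dat[dat_index(k, j, i, ny, nx)]

assert vtk_to_dat(vtk, nz, ny, nx) == dat
print(vtk[0] == dat[0], vtk[-1] == dat[-1], vtk == dat)  # True True False
```

The first and last elements agree under any permutation of the axes, so that match alone does not confirm the layout. If the assumption held for the real files, the NumPy equivalent `vtk.reshape(Nz, Nx, Ny).transpose(0, 2, 1)` would reproduce the dat array, so a failure of that round trip points to either a different layout or different values in the vtk file.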
