Segmentation fault while running cclm job
Hi,
while running a job chain, I get this error for the cclm job:
OPEN: ncdf-file: /scratch/b/b324052/rcp85_41_60_125/output/cclm/2041_03/out06/lffd2041030500p.nc
[m10091:30787:0] Caught signal 11 (Segmentation fault)
.... backtrace
2 0x00000000000572cc mxm_handle_error() /scrap/jenkins/workspace/hpc-power-pack/label/r-vmb-rhel6-u7-x86-64-MOFED-CHECKER/hpcx_root/src/hpcx-v1.6.392-gcc-OFED-3.18-redhat6.7-x86_64/mxm-v3.4/src/mxm/util/debug/debug.c:641
3 0x000000000005743c mxm_error_signal_handler() /scrap/jenkins/workspace/hpc-power-pack/label/r-vmb-rhel6-u7-x86-64-MOFED-CHECKER/hpcx_root/src/hpcx-v1.6.392-gcc-OFED-3.18-redhat6.7-x86_64/mxm-v3.4/src/mxm/util/debug/debug.c:616
....
srun: error: m10090: tasks 0-23: Segmentation fault
srun: Terminating job step 6664508.0
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
srun: error: m10091: tasks 24-47: Segmentation fault
srun: error: m10095: tasks 48-71: Segmentation fault
I use the BULL MPI environment as described here: http://redc.clm-community.eu/projects/cclmdkrz/wiki/Fopts
and the batch setting recommended for mistral (http://redc.clm-community.eu/projects/cclmdkrz/wiki/Run-scripts)
All necessary modules are loaded.
Thanks.
Often this error appears when the basic data for the vertical interpolation have unrealistic or even NaN values. Have you checked the data in the files listed below?
00: /scratch/b/b324052/rcp85_41_60_125/input/cclm/2041_03/lbfd2041030503.nc
00: CLOSING ncdf FILE
00: OPEN: ncdf-file:
00: /scratch/b/b324052/rcp85_41_60_125/output/cclm/2041_03/out01/lffd2041030500.nc
00: CLOSING ncdf FILE
00: OPEN: ncdf-file:
00: /scratch/b/b324052/rcp85_41_60_125/output/cclm/2041_03/out02/lffd2041030500.nc
00: CLOSING ncdf FILE
00: OPEN: ncdf-file:
00: /scratch/b/b324052/rcp85_41_60_125/output/cclm/2041_03/out03/lffd2041030500.nc
00: CLOSING ncdf FILE
00: OPEN: ncdf-file:
00: /scratch/b/b324052/rcp85_41_60_125/output/cclm/2041_03/out04/lffd2041030500.nc
00: CLOSING ncdf FILE
00: OPEN: ncdf-file:
00: /scratch/b/b324052/rcp85_41_60_125/output/cclm/2041_03/out05/lffd2041030500.nc
00: CLOSING ncdf FILE
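In case it helps, here is a minimal sketch of such a check in Python. It is only an illustration, not part of the cclm tool chain: the helper names `summarize_values` and `check_file` are made up for this example, the variable name `T` and its bounds are assumptions you would need to adapt, and `check_file` assumes the third-party packages `netCDF4` and `numpy` are installed.

```python
import math


def summarize_values(values, lower, upper):
    """Count NaN, infinite, and out-of-range entries in an iterable of floats."""
    counts = {"total": 0, "nan": 0, "inf": 0, "out_of_range": 0}
    for v in values:
        counts["total"] += 1
        if math.isnan(v):
            counts["nan"] += 1
        elif math.isinf(v):
            counts["inf"] += 1
        elif v < lower or v > upper:
            counts["out_of_range"] += 1
    return counts


def check_file(path, bounds):
    """Print a summary for each variable in `bounds` (name -> (lower, upper))."""
    # Third-party packages (assumed installed). They are imported here so that
    # summarize_values() above stays usable without them.
    import numpy as np
    from netCDF4 import Dataset

    with Dataset(path) as ds:
        for name, (lower, upper) in bounds.items():
            if name not in ds.variables:
                print(f"{name}: not present in {path}")
                continue
            # compressed() drops points masked as fill values and flattens
            # the array to one dimension.
            data = np.ma.compressed(ds.variables[name][:]).astype(float)
            print(name, summarize_values(data.tolist(), lower, upper))
```

For example, `check_file("/scratch/b/b324052/rcp85_41_60_125/input/cclm/2041_03/lbfd2041030503.nc", {"T": (150.0, 350.0)})` would report how many temperature values are NaN, infinite, or outside 150 to 350 K, provided the file contains a variable named `T`. Any non-zero count of NaN or out-of-range values would be a good place to start looking.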