CCLM simulations fail on Mistral - floating point exception C – in #9: CCLM

Dear all,
by changing the INT2LM I was able to solve the problem I posted before, but now a new, somewhat similar error appears:

  0:  OPEN: ncdf-file:
  0:  /scratch/b/b380794/Ref_run/output/cclm/1999_01/out01/lffd1999010100.nc
  0:  CLOSING ncdf FILE
  0:  OPEN: ncdf-file:
  0:  /scratch/b/b380794/Ref_run/output/cclm/1999_01/out01/lffd1999010100.nc
  0:  CLOSING ncdf FILE
  0:  OPEN: ncdf-file:
  0:  /scratch/b/b380794/Ref_run/output/cclm/1999_01/out01/lffd1999010100z.nc
  0:  CLOSING ncdf FILE
  0:  OPEN: ncdf-file:
  0:  /scratch/b/b380794/Ref_run/output/cclm/1999_01/out01/lffd1999010100p.nc
  0:  CLOSING ncdf FILE
  0:  OPEN: ncdf-file:
  0:  /scratch/b/b380794/Ref_run/output/cclm/1999_01/out01/lffd1999010100.nc
  0:   smoothing pmsl over mountainous terrain
  0:  CLOSING ncdf FILE
  0:  OPEN: ncdf-file:
  0:  /scratch/b/b380794/Ref_run/output/cclm/1999_01/out01/lffd1999010100.nc
549: [m11510:32171:0] Caught signal 11 (Segmentation fault)
 77: [m11397:46159:0] Caught signal 11 (Segmentation fault)
...
  2: [m11394:43474:0] Caught signal 11 (Segmentation fault)
  0:  CLOSING ncdf FILE
  0: [m11394:43472:0] Caught signal 11 (Segmentation fault)
...
380: [m11503:12975:0] Caught signal 11 (Segmentation fault)
 28:  backtrace 
 28:  2 0x000000000005767c mxm_handle_error()  /scrap/jenkins/workspace/hpc-power-pack/label/r-vmb-rhel6-u7-x86-64-MOFED-CHECKER/hpcx_root/src/hpcx-v1.9.7-gcc-OFED-3.18-redhat6.7-x86_64/mxm-v3.6/src/mxm/util/debug/debug.c:641
 28:  3 0x00000000000577ec mxm_error_signal_handler()  /scrap/jenkins/workspace/hpc-power-pack/label/r-vmb-rhel6-u7-x86-64-MOFED-CHECKER/hpcx_root/src/hpcx-v1.9.7-gcc-OFED-3.18-redhat6.7-x86_64/mxm-v3.6/src/mxm/util/debug/debug.c:616
 28:  4 0x0000000000032510 killpg()  ??:0
 28:  5 0x000000000053343b organize_data_()  ??:0
 28:  6 0x000000000058de82 MAIN__()  ??:0
 28:  7 0x00000000004052fe main()  ??:0
 28:  8 0x000000000001ed1d __libc_start_main()  ??:0
 28:  9 0x00000000004051f9 _start()  ??:0
 28: ===============
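
For anyone who wants to see at a glance which tasks crashed, here is a minimal sketch for summarising the log. It assumes that the number at the start of each line is the MPI task (rank) ID and that the bracket holds `host:pid:thread`; both are my reading of the output, not something I have confirmed. The function name `crashed_ranks` and the `SAMPLE` text are only illustrative.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   " 77: [m11397:46159:0] Caught signal 11 (Segmentation fault)"
# Assumption: leading number = MPI task ID, bracket = host:pid:thread.
SIGNAL_RE = re.compile(
    r"^\s*(?P<rank>\d+):\s*\[(?P<host>[^:\]]+):(?P<pid>\d+):\d+\]\s*"
    r"Caught signal (?P<signo>\d+) \((?P<name>[^)]*)\)"
)


def crashed_ranks(log_text):
    """Return {host: sorted list of task IDs that reported a signal}."""
    by_host = defaultdict(set)
    for line in log_text.splitlines():
        match = SIGNAL_RE.match(line)
        if match:
            by_host[match.group("host")].add(int(match.group("rank")))
    return {host: sorted(ranks) for host, ranks in by_host.items()}


# A few lines copied from the log above (the elided "..." parts are left out).
SAMPLE = """\
549: [m11510:32171:0] Caught signal 11 (Segmentation fault)
 77: [m11397:46159:0] Caught signal 11 (Segmentation fault)
  2: [m11394:43474:0] Caught signal 11 (Segmentation fault)
  0:  CLOSING ncdf FILE
  0: [m11394:43472:0] Caught signal 11 (Segmentation fault)
380: [m11503:12975:0] Caught signal 11 (Segmentation fault)
"""

if __name__ == "__main__":
    # Print one line per host with the task IDs that crashed on it.
    for host, ranks in sorted(crashed_ranks(SAMPLE).items()):
        print(host, ranks)
```

In the excerpt the segmentation faults are spread over several nodes rather than confined to one, which is why I suspect the model itself and not a single broken node.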

Could anyone please help me with this problem? I would be very grateful for help.
Thank you very much and best regards,
Eva Nowatzki

  @evanowatzki in #c06843e
