CCLM simulations fail on Mistral - floating point exception C – in #9: CCLM



  @evanowatzki in #c06843e


Dear all,
By changing the INT2LM I was able to solve the problem I posted before, but now a new, somewhat similar error appears:

  0:  OPEN: ncdf-file:
  0:  /scratch/b/b380794/Ref_run/output/cclm/1999_01/out01/lffd1999010100.nc
  0:  CLOSING ncdf FILE
  0:  OPEN: ncdf-file:
  0:  /scratch/b/b380794/Ref_run/output/cclm/1999_01/out01/lffd1999010100.nc
  0:  CLOSING ncdf FILE
  0:  OPEN: ncdf-file:
  0:  /scratch/b/b380794/Ref_run/output/cclm/1999_01/out01/lffd1999010100z.nc
  0:  CLOSING ncdf FILE
  0:  OPEN: ncdf-file:
  0:  /scratch/b/b380794/Ref_run/output/cclm/1999_01/out01/lffd1999010100p.nc
  0:  CLOSING ncdf FILE
  0:  OPEN: ncdf-file:
  0:  /scratch/b/b380794/Ref_run/output/cclm/1999_01/out01/lffd1999010100.nc
  0:   smoothing pmsl over mountainous terrain
  0:  CLOSING ncdf FILE
  0:  OPEN: ncdf-file:
  0:  /scratch/b/b380794/Ref_run/output/cclm/1999_01/out01/lffd1999010100.nc
549: [m11510:32171:0] Caught signal 11 (Segmentation fault)
 77: [m11397:46159:0] Caught signal 11 (Segmentation fault)
...
  2: [m11394:43474:0] Caught signal 11 (Segmentation fault)
  0:  CLOSING ncdf FILE
  0: [m11394:43472:0] Caught signal 11 (Segmentation fault)
...
380: [m11503:12975:0] Caught signal 11 (Segmentation fault)
 28:  backtrace 
 28:  2 0x000000000005767c mxm_handle_error()  /scrap/jenkins/workspace/hpc-power-pack/label/r-vmb-rhel6-u7-x86-64-MOFED-CHECKER/hpcx_root/src/hpcx-v1.9.7-gcc-OFED-3.18-redhat6.7-x86_64/mxm-v3.6/src/mxm/util/debug/debug.c:641
 28:  3 0x00000000000577ec mxm_error_signal_handler()  /scrap/jenkins/workspace/hpc-power-pack/label/r-vmb-rhel6-u7-x86-64-MOFED-CHECKER/hpcx_root/src/hpcx-v1.9.7-gcc-OFED-3.18-redhat6.7-x86_64/mxm-v3.6/src/mxm/util/debug/debug.c:616
 28:  4 0x0000000000032510 killpg()  ??:0
 28:  5 0x000000000053343b organize_data_()  ??:0
 28:  6 0x000000000058de82 MAIN__()  ??:0
 28:  7 0x00000000004052fe main()  ??:0
 28:  8 0x000000000001ed1d __libc_start_main()  ??:0
 28:  9 0x00000000004051f9 _start()  ??:0
 28: ===============

Could anyone please help me with this problem? I would be very grateful for help.
Thank you very much and best regards,
Eva Nowatzki