Problem with CFL-criterion and NaN values – in #9: CCLM

  @mikhailvarentsov in #6636c2c

Dear colleagues,

I’ve run into a serious problem with the CCLM model: my simulations very often produce NaN values in the output, accompanied by the message “!!!!*** WARNING ***!!! CFL-criterion for horizontal advection is violated”. I’m trying to solve this by reducing the time step dt. However, now that I work with version 5.0_clm2 and try different options from PHYCTL and DYNCTL, I’ve noticed that I need a much lower value of dt than I used before with version 5.0 and the standard parameters from the CCLM training.

For example, I ran the model at 0.045 deg. horizontal resolution with dt=30 and it failed (NaN in the output from some point onward), whereas previously I ran the model at 0.025 deg. resolution with the same dt=30 and it worked fine.

So my question is: which other model parameters can affect the appearance of this problem? Which of them are the most important?

Thanks in advance!
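
For a rough sense of scale, the horizontal advective Courant number can be estimated as C = |u| * dt / dx. The sketch below is only an illustration: the Earth radius constant, the function names and the 50 m/s test wind are assumptions made for this example, and the stability limit that CCLM actually applies depends on its time integration and advection scheme, which this does not reproduce.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (assumed), converts degrees to metres


def grid_spacing_m(resolution_deg: float) -> float:
    """Approximate grid spacing in metres near the equator of a rotated lat-lon grid."""
    return math.radians(resolution_deg) * EARTH_RADIUS_M


def courant_number(wind_ms: float, dt_s: float, resolution_deg: float) -> float:
    """Horizontal advective Courant number C = |u| * dt / dx."""
    return abs(wind_ms) * dt_s / grid_spacing_m(resolution_deg)


if __name__ == "__main__":
    # Compare the two resolutions from the post at dt = 30 s and a hypothetical 50 m/s wind.
    for res in (0.045, 0.025):
        dx = grid_spacing_m(res)
        c = courant_number(wind_ms=50.0, dt_s=30.0, resolution_deg=res)
        print(f"{res} deg: dx ~ {dx:.0f} m, C ~ {c:.2f}")
```

With the same dt, the coarser 0.045 deg. grid gives the smaller Courant number for a given wind speed. If this estimate is representative, a CFL warning on the coarser grid may point to unrealistically large winds developing during the run rather than to dt alone.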

  @burkhardtrockel in #fd8a26b

The namelist tool (http://www.clm-community.eu/namelist-tool/namelist-tool_portal/index.htm) includes information on the dependency of namelist parameters on dt, though it may not be complete yet: nrdtau, crltau, nincrad, hincrad
However, I think it is very likely that the problem lies in the namelist settings themselves.
Be aware that CCLM5.0 is still a pre-release version, and there is a warning on the download page:

This pre-released version contains COSMO-model source code that has been evaluated for the forecast mode but not yet for climate simulations!

This also applies to the namelist settings.
The evaluation tests for a new release version are currently underway. The results are planned to be presented at the CLM-Community General Assembly at the end of September.
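
To illustrate why such parameters depend on dt, here is a minimal sketch. It assumes that nincrad is the interval between radiation calls counted in time steps and that hincrad is the same interval given in hours. That reading of the two names should be checked against the namelist tool. The function names are hypothetical and not part of the model.

```python
def nincrad_from_hincrad(hincrad_h: float, dt_s: float) -> int:
    """Number of time steps between radiation calls for an interval given in hours.

    Assumes nincrad = hincrad * 3600 / dt, rounded to the nearest whole step.
    """
    return max(1, round(hincrad_h * 3600.0 / dt_s))


def interval_hours(nincrad: int, dt_s: float) -> float:
    """Actual interval in hours when the step count is held fixed."""
    return nincrad * dt_s / 3600.0


if __name__ == "__main__":
    # An hourly radiation call needs a different step count for each dt.
    for dt in (30.0, 20.0, 10.0):
        print(f"dt={dt:>4} s -> nincrad={nincrad_from_hincrad(1.0, dt)}")
    # Keeping the step count from dt=30 s while reducing dt shortens the interval.
    print(f"nincrad=120 at dt=10 s -> {interval_hours(120, 10.0):.3f} h")
```

Under these assumptions, a step-based setting that is left unchanged when dt is reduced no longer corresponds to the intended physical interval, which is one way the namelist settings can become inconsistent.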

  @vladimirplatonov in #d688bb9

Dear Burkhardt, thank you very much for your hints. I’m working together with Mikhail and have faced the same problems. I would like to note that we used the same model version and model parameters in winter and spring, and we didn’t have these problems during those experiment runs. The technical support of our supercomputer ‘Lomonosov’ gave us a hint for solving this problem: they say it may be related to the available RAM capacity. Do you perhaps have any experience with optimizing RAM consumption? I mean something like compilation options for the binaries, run command options, or perhaps some namelist parameters.
Additionally, we have a question about handling NaN values in the output. Is there any namelist parameter that checks the output values and, for example, terminates the model run (rather than only logging warnings)?
Thanks a lot for any recommendations!

  @burkhardtrockel in #7008347

I am not an expert in memory issues in CCLM. If my information is right, one of the next versions will offer the possibility to run the model with 32-bit real variables instead of 64-bit. This may then be an option, but it has to be carefully tested beforehand.
If you provide information on the compiler you use and attach your Fopts file, maybe someone who uses the same compiler can comment on it.
Regarding NaN values in the output: most compilers have an option that causes the program to stop when a NaN occurs.

  @vladimirplatonov in #90370e2

Dear Burkhardt, thank you for the recommendations. I’m attaching our Fopts file. It contains a lot of commented-out lines because we arrived at the final configuration after trying many variants. We use the following compilers and libraries: Fortran compiler: mpif90, intel-13.1.0/intel-15.0.90; MPI: openmpi-1.8.4-icc.
Thanks to anyone for suggestions about compiler options or parameters.

  @burkhardtrockel in #be6c2a1

Regarding the NaN issue: did you check the option -fpe0?

  @vladimirplatonov in #242829f

No, I didn’t. Could you explain what this option is? Where should I use it: in the compiler settings in Fopts, or somewhere else?

  @burkhardtrockel in #d10386e

-fpe0 is an Intel compiler option. It can be used in Fopts in addition to the other compiler flags.

  @vladimirplatonov in #f25f501

Dear colleagues, I have faced this problem again. A long model run (1 year) starts well, but after a few days (or even a few hours) NaN values appear in the YU* files (attached) and in the model point output (M_Yuzhno-Sakhalinsk). This doesn’t crash the model, but all model variables in the output become NaN (see slurm-1151743). I have tried different compilation variants (CKEYS in Fopts):

CKEYS = -c -fpp -fno-alias -unroll0 -heap-arrays 1000 – gives NaN
CKEYS = -c -fpp -fno-alias -unroll0 -heap-arrays 1000 -g -O0 -traceback – gives NaN
CKEYS = -c -fpp -fno-alias -unroll0 -heap-arrays 1000 -fpe0 -g -O0 -traceback – crashes (see slurm-1151468)
CKEYS = -c -fpp -fno-alias -unroll0 -heap-arrays 1000 -fpe0 – crashes (see slurm-1151420)

Could you suggest any solution to this problem? I’m attaching the script file too (cclm_Sakhalin_0.12.sh).
Thank you.
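
Since the builds without -fpe0 keep running after NaN appears, one possible stopgap is to check the text output from the job script and stop the chain once NaN shows up. The sketch below is not a CCLM feature; it is a hypothetical helper that assumes the YU* files are plain text and that NaN is printed as a literal "NaN" token.

```python
import re
from pathlib import Path

# Match "NaN" as a standalone token (any case), not as part of a longer word.
NAN_TOKEN = re.compile(r"(?<![A-Za-z])nan(?![A-Za-z])", re.IGNORECASE)


def first_nan_line(path):
    """Return the 1-based line number of the first NaN token, or None if there is none."""
    with open(path, "r", errors="replace") as fh:
        for lineno, line in enumerate(fh, start=1):
            if NAN_TOKEN.search(line):
                return lineno
    return None


def scan(directory, pattern="YU*"):
    """Map file name -> first line containing NaN, for files matching the pattern."""
    hits = {}
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():
            lineno = first_nan_line(path)
            if lineno is not None:
                hits[path.name] = lineno
    return hits


def main(run_dir="."):
    """Print any hits and return a shell-style exit code (1 if NaN was found)."""
    found = scan(run_dir)
    for name, lineno in found.items():
        print(f"{name}: first NaN at line {lineno}")
    return 1 if found else 0
```

A job script could call `main()` on the run directory after each model chunk and abort on a non-zero result. The reported line number also helps narrow down when the first NaN appeared, which is the interesting point for debugging.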