problem with gcm_to_cclm test – in #12: CCLM Starter Package Support

Dear colleagues,

We are trying to run the test in directory ../step_by_step/gcm_to_cclm/, but for some reason the run_int2lm script fails partway through. All files from directory ../cclm-sp_2.4/step_by_step/gcm_to_cclm/log_int2lm are attached. No files appear in directory ../step_by_step/gcm_to_cclm/data/int2lm_output. Could you please suggest how to track down or solve this problem?

Kind regards,
Iya Belova

P.S. Maybe this problem is related to the fact that the file int2lm.exe.out-14621 contains the line “Binary name ….: tstint2lm”, although the real binary name is int2lm.exe. We could not find where this name (tstint2lm) comes from.

  @iyabelova in #d5345fc

I assume that you ran the original run_int2lm script without any modifications? In that case it might be a problem with your computer system, e.g. the compiler and its options, the MPI version, etc. If you provide

  • the name of the computing system you use
  • the name of the compiler and its version
  • the name of the MPI library and its version
  • the Fopts file from /home/dokukin/work/cosmo/cclm-sp_2.4/src/int2lm (as an attachment)

maybe someone from the CLM-Community using a similar configuration can help.
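
The details requested above could be gathered with a short shell sketch like the following. The helper name tool_version is hypothetical, and the flags are assumptions: ifort -V and mpif90 -show are the usual forms for the Intel compiler and for MPICH-derived compiler wrappers, but other installations may need different commands.

```shell
#!/bin/sh
# Print the first line of a tool's version output, or a note if it is missing.
# tool_version is a hypothetical helper, not part of the starter package.
tool_version() {
    tool=$1
    shift
    if command -v "$tool" >/dev/null 2>&1; then
        # Some compilers print their banner on stderr, so merge both streams.
        "$tool" "$@" 2>&1 | head -n 1
    else
        echo "$tool: not found in PATH"
    fi
}

echo "system:   $(uname -sr)"
# /etc/redhat-release exists on CentOS/RHEL; other systems may lack it.
if [ -r /etc/redhat-release ]; then
    echo "distro:   $(cat /etc/redhat-release)"
fi
# Assumed flags: adjust them for your compiler and MPI installation.
echo "compiler: $(tool_version ifort -V)"
echo "mpi:      $(tool_version mpif90 -show)"
```

Posting this output together with the Fopts file gives other users the same information as the list above.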

The different binary name is not the cause: it is a piece of information that has to be set by the user (see the following snippet from the subroutine info_int2lm.f90) and it does not have any effect on the model run.


! Currently it is not possible with FORTRAN95 to get the information
! of the full path of binary name like the $0 in C. Additionally
! we cannot determine on which host(s) the binary is running and the
! domain of the data spread through the nodes.
! Therefore this information has to be defined manually. On using info_readnl()
! this information may be defined within the segment /info_defaults/ which
! has to reside within the named namelist of your choice. Missing information
! will be ignored silently.
! Currently following information may be defined within /info_defaults/:
! INFO_Options ..: List of print options
! INFO_BinaryName: Name (best: full path) of the binary
! INFO_RunMachine: The machine (OS) where the program is running
! INFO_Nodes ….: Description of the nodes the binary is running
! INFO_Domain …: The domain the binary is calculating

  @burkhardtrockel in #d8cb29b

Thank you for the answer.

We changed only the lines that call the int2lm.exe file and set the number of CPUs in the run_int2lm script. In any case, I’ll attach it together with the Fopts file used to compile int2lm.

Here is the information about our system

  • system: CentOS 5.2
  • compiler: ifort 10.1
  • mpi: mvapich2 1.0.3

  @iyabelova in #57000e8

I just ran the script with nprocx = 1, nprocy = 1, as you did. No problems. Therefore I assume this is a problem with your computing system. I have no experience with CentOS and mvapich2. Hopefully another member of the CLM-Community has and can help you.

  @burkhardtrockel in #e185555

Thank you for your help.
Could you please attach the resulting log files? This could help us track down our problem.

  @iyabelova in #db0f42e

The computing system at DKRZ is under maintenance today and tomorrow. I will send you the log files after that.

  @burkhardtrockel in #0b600d1

Here are the log and OUTPUT files of the successful job.

  @burkhardtrockel in #33eac9b

Thank you again.
When I compared your files with ours, I found that the following part is quite different (this is from our file):
“Info about KIND-parameters: iintegers / MPI_INT = 4 1275069467 int_ga / MPI_INT = 4 1275069467”
In your file, 7 appears in place of 1275069467. I have also received other int2lm outputs, and they show 7 in that place as well. It seems that this could cause some problems.
I am not sure, but it seems that the variable (?) MPI_INT is broken for some reason.
I tried adding something like “export MPI_INT=7” to the run_int2lm script, but unfortunately it did not work.
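
The comparison described above can be sketched in shell as follows. The helper names kind_lines and compare_kind_lines and the file names in the usage note are hypothetical; the sketch only assumes that the relevant log lines contain the string MPI_INT, as in the quoted excerpt. A differing value only shows that the two runs used different MPI libraries or builds; it is not by itself proof of an error.

```shell
# Extract the KIND-parameter lines that mention MPI_INT from one log file.
kind_lines() {
    grep 'MPI_INT' "$1"
}

# Print the differences between the MPI_INT lines of two logs.
# Returns 0 if the lines are identical and 1 if they differ.
compare_kind_lines() {
    ours=$(mktemp) || return 2
    theirs=$(mktemp) || return 2
    kind_lines "$1" > "$ours"
    kind_lines "$2" > "$theirs"
    status=0
    diff "$ours" "$theirs" || status=$?
    rm -f "$ours" "$theirs"
    return "$status"
}
```

A call might look like compare_kind_lines our_int2lm.out reference_int2lm.out, with both file names standing in for the actual log files.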

  @iyabelova in #9e9f7c6

Hi,
you cannot modify the MPI_INT value. It is defined by the MPI library you use and can differ between computers. You should find more information on this value in the documentation of your MPI library (it is the MPI_INTEGER value). There you can see whether the value 1275069467 is correct or not.
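
Besides the documentation, one way to check the value is to look it up in the Fortran header (mpif.h) that ships with the MPI installation. The sketch below is a hedged illustration: mpi_integer_value is a hypothetical helper, and it assumes the header defines the constant in a PARAMETER statement of the form MPI_INTEGER=<number>, which may not hold for every library. The header path has to be found locally, for example from the include directories of the MPI compiler wrapper.

```shell
# Print the number assigned to MPI_INTEGER in a Fortran MPI header.
# Assumes a line such as "PARAMETER (MPI_INTEGER=<number>)" exists.
mpi_integer_value() {
    # Requiring "=" right after the name skips MPI_INTEGER1, MPI_INTEGER2, ...
    sed -n 's/.*[( ,]MPI_INTEGER *= *\([0-9][0-9]*\).*/\1/p' "$1" | head -n 1
}
```

If the number printed for your header matches the one in the int2lm log, the logged value is simply the handle your MPI library uses for MPI_INTEGER.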
Furthermore, you could check what “Exit code -5” means on your system. Are you running interactively or in batch mode? Maybe you have to increase the stack size for your run (with “ulimit -s unlimited”).
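
The stack-size suggestion can be tried with a sketch like the one below. raise_stack_limit is a hypothetical helper; the only command taken from the advice above is ulimit -s unlimited. The limit applies to the shell that sets it and to processes started from it, so the call would have to sit in the run script before the MPI launch line. Whether processes on remote nodes inherit it depends on the MPI launcher.

```shell
# Report the current soft stack limit and try to raise it to "unlimited".
raise_stack_limit() {
    echo "stack limit before: $(ulimit -s)"
    # The soft limit cannot exceed the hard limit, so this may fail.
    if ulimit -s unlimited 2>/dev/null; then
        echo "stack limit after:  $(ulimit -s)"
    else
        echo "could not raise stack limit; hard limit is $(ulimit -H -s)"
    fi
}
```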

  @ulrichschättler in #a956d42

Thank you for your help.
We found that the problem was caused by trying to run the program on a single core.
It seems that either MPI programs cannot be run on 1 core, or this is a peculiarity of our computing system.
Anyway, when we write
“npx=4 npy=2 mpirun -np 8”
in the script, everything works correctly.
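
When changing the decomposition, it can help to verify that the process count given to mpirun matches it. The sketch below uses a hypothetical helper, check_decomposition, and assumes that the count passed to -np should equal the product of the two decomposition values, with no additional I/O processes configured.

```shell
# Check that the decomposition (nx * ny) matches the mpirun process count.
# Usage: check_decomposition <nx> <ny> <np>
check_decomposition() {
    nx=$1
    ny=$2
    np=$3
    expected=$((nx * ny))
    if [ "$expected" -eq "$np" ]; then
        echo "ok: $nx x $ny = $np processes"
    else
        echo "mismatch: $nx x $ny = $expected, but -np is $np" >&2
        return 1
    fi
}
```

For the working setup above, check_decomposition 4 2 8 succeeds, whereas a call such as check_decomposition 1 1 8 reports a mismatch.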

  @iyabelova in #f8a5589

Dear all,

We have a problem similar to Iya Belova’s. We are trying to run COSMO-CLM (cclm-sp_2.4) at 0.11 resolution using the ERA-Interim dataset for the period from August 1st, 2007 to December 31st, 2009. Although the run period is 29 months, we get the ‘int2lm finished’ message after only 2-3 months of the run period. We previously used cclm-sp_1.5 on the same workstation and did not encounter this kind of problem. Do you have any suggestions about this problem?
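
As a quick cross-check of the period length quoted above, the number of calendar months can be computed with simple arithmetic. months_between is a hypothetical helper that counts both the first and the last month.

```shell
# Count calendar months from <y1> <m1> to <y2> <m2>, both ends included.
# Pass months without leading zeros (8, not 08) to avoid octal parsing.
months_between() {
    echo $(( ($3 - $1) * 12 + ($4 - $2) + 1 ))
}

months_between 2007 8 2009 12   # prints 29
```

Comparing this figure with the number of months for which int2lm actually produced output shows how far the job got before it reported completion.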

Fopts, int2lm_test.log and run_int2lm_eraint_test files are in the attachment.

Best regards,

Cemre Yürük

  @cemreyürük in #d4450e1
