ANSYS FLUENT

Running ANSYS FLUENT calculations in clusters with the SLURM resource manager (hydra, celaeno)

NOTE: Check this page every now and then for the latest information.

Running ANSYS FLUENT calculations in the clusters with SLURM is slightly different from running them in a desktop environment. The main difference is that the user submits the calculation to a work queue instead of starting it directly. SLURM then runs the calculation when the required resources (processors, memory, licenses, etc.) become available. This means that the calculation case has to be fully prepared and ready for calculation when it is submitted to the queue. There is also no graphical user interface (GUI) for ANSYS FLUENT in the cluster, and using ANSYS FLUENT in interactive mode, even with the text user interface, is not allowed without permission from the cluster admin. Even though these may feel like limitations, they are not: everything you can do with ANSYS FLUENT in a desktop environment should also be possible in the clusters.

This guide covers the basics of running ANSYS FLUENT calculations in the clusters. In addition to this guide, you should use the comprehensive documentation provided by ANSYS Inc., especially the FLUENT User's Guide and the FLUENT Text Command List. These are available in the common LUT fileshare at files-tkt/ENTE/common/CFD_development/softwares/Fluent/. There is also a lot of information and a helpful troubleshooting community at CFD-Online. The full documentation of the SLURM resource manager might also be helpful and is available here.

Essential files

You will need the following files to run ANSYS FLUENT in the clusters:

  1. SLURM parallel job script
  2. ANSYS FLUENT batch run journal file
  3. ANSYS FLUENT mesh or case file
  4. ANSYS FLUENT data file (only if you continue a previous calculation)

The purpose and structure of these files are explained below.

1. The SLURM parallel job script is needed to submit your ANSYS FLUENT calculation to the cluster. An example job script with comment lines is available below and can be used as a template. The first part of the script contains SLURM-specific commands, each preceded by #SBATCH, e.g. #SBATCH --nodes 1; these must come before anything else in the script. With the batch commands you specify the number of processors, the maximum amount of memory required by your calculation and other important parameters. Note that a double ## defines a comment line and a single #SBATCH a batch command. The second part contains the information specific to ANSYS FLUENT and your case, such as the solver type, the complete path to your case folder in the cluster and the name of the ANSYS FLUENT batch run journal file. The last part of the file runs an external script and needs to be the last thing in the file; do not modify it in any way.

A list of available batch commands can be found at https://computing.llnl.gov/linux/slurm/sbatch.html. When running ANSYS FLUENT you should, however, need no commands other than those in the example script file.

slurm_fluent_template.sh
#!/bin/bash -l
##
## Parallel job script for ANSYS FLUENT calculations with the SLURM resource manager.
##
 
## SLURM SPECIFIC COMMANDS (START WITH #SBATCH)
## NOTE! SBATCH COMMANDS NEED TO BE DEFINED BEFORE ANYTHING ELSE.
 
## Define the number of parallel tasks, i.e. how many processors you wish to run your case with.
## Use some discretion: more processors does not automatically mean a faster solution, and each
## processor reserves one license, of which we have a limited number. As a rough rule, use at most
## 8 tasks for cases with fewer than 500 000 cells and at most 16 tasks for cases between 500 000
## and 1 000 000 cells. For cases with more than 1 000 000 cells you can consider using more
## processors, but remember that the currently available licenses limit the total number of
## processes to 24, and that would reserve all of the parallel licenses in our common use! If
## running your case takes several days, do not reserve all of the parallel licenses, at least
## not without discussing it with the other FLUENT users. Use a minimum of 4 tasks when running
## cases in the cluster; otherwise run your simulation on your desktop computer.
#SBATCH --ntasks 6
## Number of licenses needed. Always specify 1 aa_r_cfd license, which lets you run 4 parallel
## tasks, and one aa_r_hpc license for each additional parallel task (we have a total of 20
## aa_r_hpc licenses). So always define 1 aa_r_cfd license and then ntasks - 4 aa_r_hpc licenses.
## NOTE: If you run your case with 4 tasks, remove the ,aa_r_hpc:n part and just specify
## --licenses=aa_r_cfd:1
#SBATCH --licenses=aa_r_cfd:1,aa_r_hpc:2
## Maximum estimated memory needed by one process in MiB. The total memory is then
## ntasks * mem-per-cpu. Try to estimate this based on e.g. your previous similar simulations.
## Nodes in the partition phase2 of Celaeno have 128 GiB of memory in total. Again, you should
## not reserve more memory than you need.
#SBATCH --mem-per-cpu=2000
## Maximum estimated time needed by your simulation. Try to give a realistic estimate, as your
## simulation will be shut down automatically after this time. The maximum time for a simulation
## is 14 days. The format is dd-hh:mm:ss (a shorter hh:mm:ss form, as used below, also works).
#SBATCH --time 00:30:00
## Name of your simulation run. You can monitor your cases with squeue and this name will show up in the list.
#SBATCH --job-name my-simulation
## Names of the files where the error and log messages are written. The log contains the output
## data, e.g. the residuals you have chosen to output.
#SBATCH --error slurmerr.txt
#SBATCH --output slurmlog.txt
## Name of the partition (queue) to submit your simulation. See available queues with sinfo command.
#SBATCH --partition phase2
## Send a notification to the specified email address when the simulation finishes.
#SBATCH --mail-type=end
#SBATCH --mail-user=my-email@lut.fi
 
## ANSYS FLUENT SPECIFIC COMMANDS
 
## Name of your Fluent journal file. 
RUNFILE=my-fluent-journal-file.jou
## File containing mpi hosts. This will be automatically generated, but you can change the file name if you like.
HOSTFILE=mpihosts
## Path to your Fluent case. You can check it with the Linux command pwd. NOTE! The path must end with /.
CASEDIR=/home/my-username/my/fluent/case/directory/
## Define the Fluent version. Check available versions with the "module avail" command on the
## cluster. Using the latest version is recommended. At Celaeno the following versions are
## currently available (19.2.2016): 13.0 14.0 14.5 15.0 16.1 16.2 17.0
FLUENTVERSION=17.0
## Solver you wish to use. Available options: 2d 2ddp 3d 3ddp
SOLVER=3d
## MPI version you wish to use. Available options: openmpi intel
MPIVERSION=openmpi
 
## DO NOT CHANGE OR MOVE ANY OF THE ABOVE PARAMETERS BELOW THIS LINE! THE LINE BELOW RUNS YOUR FLUENT SIMULATION.
. /shared/apps/ansys_inc/include_fluent.sh

2. The ANSYS FLUENT batch run journal file is the file that ANSYS FLUENT executes line by line when SLURM starts it. You should read the details about the file from the FLUENT User's Guide. The file contains ANSYS FLUENT text commands (see the FLUENT Text Command List) and in a simple case it can look like this:

/file/read-case "./my-fluent-case.cas"   
/solve/initialize/compute-defaults/mass-flow-inlet inlet  
/solve/initialize/initialize-flow
/file/auto-save/data-frequency 50 
/file/auto-save/case-frequency if-case-is-modified 
/file/auto-save/root-name "./data/my-fluent-case"
/solve/iterate 100
exit

The first line reads in your case file, which you have already prepared on your desktop; it also contains your mesh. The next two lines initialize your case based on the values at your inlet boundary. You then instruct FLUENT to save data files every 50 iterations and to save case files if the case is modified during the simulation. The files are saved under a separate “data” folder in your case directory. Finally, FLUENT performs 100 iterations and exits when finished.

This is a simple example of running a case with a journal file. However, journal files can be used in a much more advanced way to automate your workflow significantly. For example, you can instruct ANSYS FLUENT to automatically save interesting data to external ASCII files during or after the calculation. You can even save your residuals and other convergence monitors to external files, which you can then plot e.g. with MATLAB during the calculation to follow the progress. You can also script modifications to your case between iterations.
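As an illustrative sketch only (the surface name outlet, the variable names and the file names are placeholders, and especially the argument sequence of /file/export/ascii varies between FLUENT versions, so verify the exact commands with the Text Command List or by recording a journal as described below):

/file/read-case-data "./my-fluent-case.cas"
; run a first stage and save an intermediate data file
/solve/iterate 200
/file/write-data "./data/my-fluent-case-0200.dat"
; run a second stage, then export selected variables on a surface to an ASCII file
; NOTE: the export/ascii argument order below is version-dependent, record it interactively first
/solve/iterate 200
/file/export/ascii "./data/outlet-profile.txt" outlet () yes temperature velocity-magnitude quit no
exit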

In addition to the FLUENT User's Guide and Text Command List, a convenient way to find the exact text-command counterparts of the commands available through the GUI is to use the text user interface in your desktop FLUENT and make it record the commands you issue into a text file:

/file/start-journal 
Give a name for the file when prompted: "filename.txt" 
Perform some text commands
.
.
.
/file/stop-journal

3. You can read in either a ready-made case file that you have prepared in your desktop FLUENT, or a mesh file, in which case the whole case setup is done with text commands in the journal file. It is also possible to read in a ready-made case file first and then replace the mesh with a text command in the journal file. This way you can set up your case on your desktop with a coarse mesh and then replace the mesh with a denser one in the cluster, as high resolution grids may require too much memory to even open on your desktop.
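As a minimal sketch of such a journal, assuming the finer mesh is in my-fine-mesh.msh and that your FLUENT version provides the /mesh/replace text command (the menu differs between versions, so check the Text Command List of your version):

/file/read-case "./my-fluent-case.cas"
; replace the coarse setup mesh with the finer one
/mesh/replace "./my-fine-mesh.msh"
; re-initialize and iterate as in the example above
/solve/initialize/initialize-flow
/solve/iterate 100
exit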

4. You can also read in the data file of a previous run to continue your calculation.
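For example, a minimal journal that continues a previous run reads in both the case and data files and iterates further (the file name is a placeholder; reading the case file with read-case-data also reads the matching data file):

/file/read-case-data "./my-fluent-case.cas"
/solve/iterate 100
exit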

Starting calculation

When you have moved all the above files (and any others your case might need) to the cluster with e.g. FileZilla, you are basically ready to submit your calculation to the queue. The basic commands for using SLURM can be found on the CSC web site. The commands you will probably need most are sinfo, squeue, sbatch and scancel.

sinfo

With this command you can see the available partitions (queues) in the cluster. The partitions are typically formed based on differences in hardware or intended usage. In Celaeno, for example, there are partitions grid1, grid2, phase1 and phase2. Partitions grid1 and grid2 are intended for external use coming from the FGI grid; phase1 and phase2 are for local use. The partitions phase1 and phase2 differ slightly in their hardware configuration (see hpc for details).

squeue

With this command you will see the current work queue in the cluster. Your case will show up in the list with the name you have given in the SLURM job script.
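To list only your own jobs you can filter the queue by user name, for example:

squeue -u my-username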

sbatch

With this command you will submit your work to the queue.

sbatch slurm_fluent_template.sh
scancel

With this command you can cancel your job. Check the JOBID of your job with squeue and then cancel it by

scancel JOBID
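You can also cancel all of your own jobs at once by giving your user name instead of a single job id:

scancel -u my-username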

Example calculation

There is a simple example calculation case available on Hydra and Celaeno which can be used to learn how to run ANSYS FLUENT calculations in the clusters. You can copy the case to your home folder on either cluster and test it. When in your home folder, issue

cp -r /shared/tutorials/fluent/example_case_01/ ./

Before running the case, open elbow_slurm.sh and change at least the CASEDIR and #SBATCH --mail-user parameters to the correct values.
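After editing the parameters, submit the example case from its directory:

cd example_case_01
sbatch elbow_slurm.sh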

Interactive use

Learn the batch method first! Occasionally interactive use is handy for certain tasks, but do NOT use it when you really should use batch mode.
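An interactive allocation can be requested, for example, with salloc. The options below are only an illustrative sketch mirroring the batch script above; the license options and the preferred allocation method on hydra and celaeno may differ, so check with the cluster admin:

salloc --ntasks 6 --licenses=aa_r_cfd:1,aa_r_hpc:2 --mem-per-cpu=2000 --time 00:30:00 --partition phase2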

After SLURM allocates the resources for your calculation, you are dropped to the command line. First change to the calculation directory and then get a list of the hosts where the allocated resources are:

cd /home/<user name>/<FLUENT calculations directory>/<some case directory>
srun hostname >hostfile.txt

Choose the version of FLUENT you want to use by loading the corresponding environment module.

module load fluent/16.0

Now start the FLUENT launcher or pass the parameters directly on the command line.

Launcher

fluent

Make the following selections:

  • Processing options > Parallel: select
  • Processing options > Parallel > Solver > Processes: the number must match your reservation (-n <number of tasks>)
  • General Options > Working Directory: check that it is correct
  • Parallel Settings > Interconnects: ethernet on hydra, infiniband on celaeno
  • Parallel Settings > Remote Spawn Command: ssh
  • Parallel Settings > MPI Types: openmpi; others might also work
  • Parallel Settings > Distributed Memory on a Cluster > File containing Machine Names: select the file created earlier (hostfile.txt)

Directly to solver

hydra:

fluent 3ddp -mpi=openmpi -t $SLURM_NPROCS -pethernet -cnf=hostfile.txt -ssh
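When you are done, exit FLUENT and then the shell of the interactive allocation so that the reserved processors and licenses are released:

exit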
 