cgroup.conf

Section: Slurm cgroup configuration file (5)
Updated: December 2010

 

NAME

cgroup.conf - Slurm configuration file for the cgroup support

 

DESCRIPTION

/etc/slurm/cgroup.conf is an ASCII file which defines parameters used by Slurm's Linux cgroup related plugins. The file will always be located in the same directory as the slurm.conf file.

Parameter names are case insensitive. Any text following a "#" in the configuration file is treated as a comment through the end of that line. The size of each line in the file is limited to 1024 characters. Changes to the configuration file take effect upon restart of the Slurm daemons, daemon receipt of the SIGHUP signal, or execution of the command "scontrol reconfigure", unless otherwise noted.
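The parsing rules above can be illustrated with a small shell sketch. This is not Slurm's own parser: the function name cgroup_conf_effective is hypothetical, and the sketch assumes that the 1024-character limit applies to the raw line, before comments are stripped. It prints each Parameter=value setting with the parameter name folded to lower case.

```shell
# Hypothetical helper, not part of Slurm: print the effective settings of a
# cgroup.conf-style file ($1) according to the rules described above.
cgroup_conf_effective() {
    awk '
        # Lines longer than 1024 characters exceed the documented limit.
        length($0) > 1024 {
            printf "line %d: longer than 1024 characters\n", NR > "/dev/stderr"
            next
        }
        {
            sub(/#.*/, "")               # text after "#" is a comment
            gsub(/^[ \t]+|[ \t]+$/, "")  # trim surrounding whitespace
            if ($0 == "") next           # nothing left on this line
            eq = index($0, "=")
            if (eq == 0) next            # not a Parameter=value line
            # Parameter names are case insensitive: fold them to lower case.
            print tolower(substr($0, 1, eq - 1)) "=" substr($0, eq + 1)
        }
    ' "$1"
}
```

Running cgroup_conf_effective /etc/slurm/cgroup.conf lists the settings that remain once comments and blank lines are ignored.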

Two cgroup plugins are currently available in Slurm. The first is a proctrack plugin, the second a task plugin.

The following cgroup.conf parameters are defined to control the general behavior of Slurm cgroup plugins.

CgroupAutomount=<yes|no>
Slurm cgroup plugins require a valid and functional cgroup subsystem to be mounted under /cgroup/<subsystem_name>. When launched, each plugin checks that its subsystem is available. If it is not, the plugin launch fails unless CgroupAutomount is set to yes, in which case the plugin first tries to mount the required subsystems.

CgroupReleaseAgentDir=<path_to_release_agent_directory>
Used to tune the cgroup system behavior. This parameter identifies the location of the directory containing Slurm cgroup release_agent files. A release_agent file is required for each mounted subsystem. The release_agent file name must have the following format: release_<subsystem_name>. For instance, the release_agent file for the cpuset subsystem must be named release_cpuset. See also CLEANUP OF CGROUPS below.
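The two requirements described above, a mounted subsystem and a release_agent file per subsystem, can be checked by hand before the plugins are enabled. The following is a minimal sketch and not part of Slurm. The function names are hypothetical. It assumes that cgroup mounts are listed in /proc/mounts with the subsystem name among the mount options, and that release_agent files must be executable.

```shell
# Hypothetical helpers, not part of Slurm.

# Succeed if cgroup subsystem $1 is mounted at /cgroup/$1 according to a
# mounts table ($2, default /proc/mounts).
cgroup_subsystem_mounted() {
    awk -v mp="/cgroup/$1" -v subsys="$1" '
        # Fields: device, mount point, filesystem type, options, ...
        $2 == mp && $3 == "cgroup" {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == subsys) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}

# Print the release_agent files that are missing or not executable in
# directory $1 for the subsystems named in the remaining arguments.
missing_release_agents() {
    dir=$1
    shift
    for subsys in "$@"; do
        [ -x "$dir/release_$subsys" ] || printf '%s\n' "$dir/release_$subsys"
    done
}
```

For instance, cgroup_subsystem_mounted cpuset reports through its exit status whether the cpuset subsystem is mounted, and missing_release_agents /etc/slurm/cgroup cpuset freezer prints nothing when both release agent files are in place.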

 

PROCTRACK/CGROUP PLUGIN

The Slurm proctrack/cgroup plugin tracks processes using the freezer control group subsystem. It creates a hierarchical set of directories for each step and places the step's tasks in the leaf directory.

This directory structure is like the following:
/cgroup/freezer/uid_%uid/job_%jobid/step_%stepid
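
As an illustration of the layout shown above, the sketch below builds the freezer cgroup path of a job step from its user ID, job ID and step ID. The function name and the sample values are hypothetical.

```shell
# Hypothetical helper: build the freezer cgroup path of a job step from its
# user ID ($1), job ID ($2) and step ID ($3).
freezer_step_path() {
    printf '/cgroup/freezer/uid_%s/job_%s/step_%s\n' "$1" "$2" "$3"
}

freezer_step_path 1000 4242 0
```

With the sample values, the call prints /cgroup/freezer/uid_1000/job_4242/step_0.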

The Slurm cgroup proctrack plugin is enabled with the following parameter in slurm.conf:
ProctrackType=proctrack/cgroup

No particular cgroup.conf parameter is defined to control the behavior of this particular plugin.

 

TASK/CGROUP PLUGIN

The Slurm task/cgroup plugin enforces the constraints of allocated resources, preventing tasks from using resources that were not allocated to them. It currently uses only the cpuset subsystem, but could also use the memory and devices subsystems in the near future.

It creates a hierarchical set of directories for each task and subsystem. The directory structure is like the following:
/cgroup/%subsys/uid_%uid/job_%jobid/step_%stepid/task_%taskid
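
Going the other way, a task cgroup path that follows the layout above can be split back into its components. This is a sketch only: the function name is hypothetical, and it assumes that the user, job, step and task IDs are all numeric.

```shell
# Hypothetical helper: print "subsys uid jobid stepid taskid" for a task
# cgroup path ($1) that follows the layout above; fail on anything else.
parse_task_cgroup_path() {
    printf '%s\n' "$1" | sed -n \
        's|^/cgroup/\([^/]*\)/uid_\([0-9][0-9]*\)/job_\([0-9][0-9]*\)/step_\([0-9][0-9]*\)/task_\([0-9][0-9]*\)$|\1 \2 \3 \4 \5|p' \
        | grep .    # non-zero exit status when the path did not match
}

parse_task_cgroup_path /cgroup/cpuset/uid_1000/job_42/step_0/task_3
```

For the sample path, the call prints "cpuset 1000 42 0 3".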

The Slurm cgroup task plugin is enabled with the following parameter in slurm.conf:
TaskPlugin=task/cgroup

The following cgroup.conf parameters are defined to control the behavior of this particular plugin:

ConstrainCores=<yes|no>
If configured to "yes" then constrain allowed cores to the subset of allocated resources. It uses the cpuset subsystem. The default value is "no".

TaskAffinity=<yes|no>
If configured to "yes" then set a default task affinity to bind each step task to a subset of the allocated cores using sched_setaffinity. The default value is "no".
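
One way to observe the effect of ConstrainCores or TaskAffinity from inside a job step is to look at the CPU list that the Linux kernel reports for a process. The sketch below reads the Cpus_allowed_list field of a /proc/<pid>/status file. The function name is hypothetical and the check is not something Slurm itself provides.

```shell
# Hypothetical helper: print the CPUs a process may run on, as reported by
# the kernel in a /proc/<pid>/status file ($1, default: the calling process).
allowed_cpus() {
    awk -F':[ \t]*' '$1 == "Cpus_allowed_list" { print $2 }' \
        "${1:-/proc/self/status}"
}
```

Calling allowed_cpus from a task launched in a job step should then list only the cores the task is allowed to use, assuming the constraint is in effect.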

The following cgroup.conf parameters may be defined to control the behavior of this plugin in a future version, once memory and devices support has been added:

AllowedRAMSpace=<number>
Constrain the job cgroup RAM to this percentage of the allocated memory. The default value is 100. If the limit is exceeded, the job steps will be killed and a warning message will be written to standard error. Also see ConstrainRAMSpace.

AllowedSwapSpace=<number>
Constrain the job cgroup swap space to this percentage of the allocated memory. The default value is 0. If the limit is exceeded, the job steps will be killed and a warning message will be written to standard error. Also see ConstrainSwapSpace.

ConstrainRAMSpace=<yes|no>
If configured to "yes" then constrain the job's RAM usage. The default value is "no". Also see AllowedRAMSpace.

ConstrainSwapSpace=<yes|no>
If configured to "yes" then constrain the job's swap space usage. The default value is "no". Also see AllowedSwapSpace.

ConstrainDevices=<yes|no>
If configured to "yes" then constrain the job's allowed devices based on GRES allocated resources. It uses the devices subsystem for that. The default value is "no".
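
AllowedRAMSpace and AllowedSwapSpace are both expressed as a percentage of the allocated memory. The arithmetic can be sketched as follows, assuming integer percentages and an allocation given in megabytes. The function name is hypothetical, and the exact rounding Slurm would apply is not stated here.

```shell
# Hypothetical helper: apply a percentage such as AllowedRAMSpace or
# AllowedSwapSpace ($2) to an allocated memory amount in megabytes ($1).
allowed_space_mb() {
    echo $(( $1 * $2 / 100 ))
}

allowed_space_mb 4096 100   # RAM limit with the default AllowedRAMSpace
allowed_space_mb 4096 0     # swap limit with the default AllowedSwapSpace
allowed_space_mb 4096 50    # an illustrative AllowedSwapSpace=50
```

For a 4096 MB allocation, the three calls print 4096, 0 and 2048.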

 

EXAMPLE


###
# Slurm cgroup support configuration file
###
CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm/cgroup"
ConstrainCores=yes
#

 

NOTES

Only one instance of a cgroup subsystem is valid at a time in the kernel. If you try to mount another cgroup hierarchy that uses a subsystem that is already mounted, such as cpuset, the mount will fail. However, you can mount another cgroup hierarchy that uses a different subsystem.
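
Whether a subsystem is already attached to a hierarchy can be read from /proc/cgroups, which lists one line per subsystem with its hierarchy ID. The sketch below is an illustration with a hypothetical function name. It assumes the usual column order of that file (subsystem name, hierarchy ID, number of cgroups, enabled flag), where a hierarchy ID of 0 means the subsystem is not attached to any mounted hierarchy.

```shell
# Hypothetical helper: print the hierarchy ID of subsystem $1 from a
# /proc/cgroups-style table ($2, default /proc/cgroups). Fails if the
# subsystem is not listed at all.
subsystem_hierarchy() {
    awk -v subsys="$1" '
        /^#/ { next }                         # skip the header line
        $1 == subsys { print $2; found = 1 }
        END { exit(found ? 0 : 1) }
    ' "${2:-/proc/cgroups}"
}
```

A non-zero result from subsystem_hierarchy cpuset indicates that a second hierarchy using cpuset cannot be mounted.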

 

CLEANUP OF CGROUPS

To allow cgroups to be removed automatically when they are no longer in use, the notify_on_release flag is set in each cgroup when the cgroup is instantiated. The release_agent file for each subsystem is set up when the subsystem is mounted. The name of each release_agent file is release_<subsystem_name>, and the directory containing these files is specified via the CgroupReleaseAgentDir parameter in cgroup.conf. A simple release agent mechanism that removes Slurm cgroups when they become empty can be set up by creating the release agent file for each required subsystem as a symbolic link to a common release agent script, as shown in the example below:

[sulu] (slurm) etc> cat cgroup.conf | grep CgroupReleaseAgentDir
CgroupReleaseAgentDir="/etc/slurm/cgroup"

[sulu] (slurm) etc> ls -al /etc/slurm/cgroup
total 12
drwxr-xr-x 2 root root 4096 2010-04-23 14:55 .
drwxr-xr-x 4 root root 4096 2010-07-22 14:48 ..
-rwxrwxrwx 1 root root 234 2010-04-23 14:52 release_common
lrwxrwxrwx 1 root root 32 2010-04-23 11:04 release_cpuset -> /etc/slurm/cgroup/release_common
lrwxrwxrwx 1 root root 32 2010-04-23 11:03 release_freezer -> /etc/slurm/cgroup/release_common

[sulu] (slurm) etc> cat /etc/slurm/cgroup/release_common
#!/bin/bash
base_path=/cgroup
progname=$(basename $0)
subsystem=${progname##*_}

rmcg=${base_path}/${subsystem}$@
uidcg=${rmcg%/job*}
if [[ -d ${base_path}/${subsystem} ]]
then

     flock -x ${uidcg} -c "rmdir ${rmcg}"
fi
[sulu] (slurm) etc>
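
The parameter expansions in release_common are compact, so it may help to trace them with sample values. The sketch below repeats the same expansions inside a hypothetical function that prints the results instead of removing anything. The script path and the cgroup path passed to it are made-up inputs, with the cgroup path given relative to the hierarchy root, as the kernel passes it to a release agent.

```shell
# Trace the expansions used by release_common with sample values.
# $1 stands for the name the script was invoked under ($0 in the script),
# $2 for the cgroup path argument it receives.
release_paths() {
    base_path=/cgroup
    progname=$(basename "$1")       # e.g. release_cpuset
    subsystem=${progname##*_}       # strip through the last "_": cpuset
    rmcg=${base_path}/${subsystem}$2
    uidcg=${rmcg%/job*}             # drop the trailing "/job..." part
    printf 'subsystem=%s\nrmcg=%s\nuidcg=%s\n' "$subsystem" "$rmcg" "$uidcg"
}

release_paths /etc/slurm/cgroup/release_cpuset /uid_1000/job_42/step_0
```

The call prints subsystem=cpuset, rmcg=/cgroup/cpuset/uid_1000/job_42/step_0 and uidcg=/cgroup/cpuset/uid_1000. In the script, the step cgroup (rmcg) is therefore removed while an exclusive lock is held on the user-level directory (uidcg).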

 

COPYING

Copyright (C) 2010 Lawrence Livermore National Security. Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER). CODE-OCEC-09-009. All rights reserved.

This file is part of SLURM, a resource management program. For details, see <https://computing.llnl.gov/linux/slurm/>.

SLURM is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

SLURM is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

 

SEE ALSO

slurm.conf(5)


 
