Kinds of job requirements that can be specified
Much of the time, jobs will run fine without specifying every detail about a job. Not specifying more detail than necessary helps the scheduler find nodes on which to run your job, which in turn reduces the time it takes for the job to start. Sometimes you will need to specify additional detail for a job, and we illustrate some of those cases here. Please feel free to send us e-mail at firstname.lastname@example.org if you have questions about how or when to specify additional detail, or if there is something you need to specify that is not illustrated here.
The examples below show how to request jobs with the following kinds of special requirements.
- Requesting a job with a specific processor type
- Requesting a minimum number of processors per node
- Requesting exactly N processors per node
- Requesting exclusive use of a whole node
Requesting a job with a specific processor type
Flux is designed to optimize the number of processors in use at any given time. If your job only requires one processor, that maximizes the number of places the scheduler can look to try to run it. If your job requires more than one processor, then it will be easiest for the scheduler if those can be located anywhere. If your program allows, then specifying only the number of processors will most likely result in the shortest wait until the job starts. For example,
#PBS -l procs=4,pmem=2gb
Flux uses policies to try to keep the processors within a job near each other, so if Flux is not busy, you will likely end up with your processors divided among a small number of nodes. It is usually only when Flux gets very busy that processors get widely distributed, and then it is a trade-off between having the job start sooner but possibly run longer and waiting longer for it to start but having it run faster.
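To see how the scheduler actually distributed a job's processors, a job script can summarize the node file that the batch system provides. The following is a minimal sketch rather than an official Flux tool. It assumes the standard Torque variable $PBS_NODEFILE, which names a file containing one line per assigned processor, and the function name procs_per_node is hypothetical.

```shell
#!/bin/bash
#PBS -l procs=4,pmem=2gb

# Print "hostname count" for each node listed in a node file.
# The file is taken from the first argument if one is given,
# otherwise from $PBS_NODEFILE (one line per assigned processor).
procs_per_node() {
    sort "${1:-$PBS_NODEFILE}" | uniq -c | awk '{print $2, $1}'
}

# Only report when running inside a batch job.
if [ -n "${PBS_NODEFILE:-}" ]; then
    procs_per_node "$PBS_NODEFILE"
fi
```

When the cluster is not busy, the report will typically list only one or two nodes; a longer list indicates that the processors were spread more widely.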
Flux contains nodes with processors of different types. At the time of writing, standard Flux contains nodes with Nehalem (x5650), Sandybridge (x2670), Ivybridge (e5-2680v2), and Haswell (e5-2680v3) processors. Nodes of these types have 12, 16, 20, and 24 processors per node, respectively. When instrumenting code, you may need to state in the published results on which type of processor the measurements were made.
If you do not specify the type of processor, your job could be scheduled to run on any of several different types of processors, so you may want to specify the processor type. To do this when requesting processors, you request the feature that corresponds to the processor type.
#PBS -l procs=4,pmem=2gb
#PBS -l feature=sandybridge
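If the processor type has to be reported alongside published measurements, it can help to have the job record the CPU model it actually ran on. The sketch below assumes a Linux node where /proc/cpuinfo contains a "model name" line; the function name cpu_model is hypothetical and is not part of PBS or Flux.

```shell
#!/bin/bash
#PBS -l procs=4,pmem=2gb
#PBS -l feature=sandybridge

# Print the first CPU model string found in a cpuinfo-style file.
# The file defaults to /proc/cpuinfo when no argument is given.
cpu_model() {
    grep -m1 '^model name' "${1:-/proc/cpuinfo}" | cut -d: -f2- | sed 's/^ *//'
}

# Record the model at the top of the job's output.
echo "CPU model: $(cpu_model)"
```

Keeping this line in the job output gives a record that can be checked against the feature that was requested.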
Many algorithms benefit greatly from a more organized assignment of processors to physical machines, sometimes because of network communication patterns and sometimes because of the nature of the computation. The following sections discuss increasingly specific ways to specify how processors are allocated among nodes.
Requesting a multi-node job with heterogeneous processor types
Certain applications run better if the processors are all of one type. The Flux operators have chosen to create nodesets in which to put machines with the same processor. By default, your job will be assigned to nodes that all have the same processor type. There may be circumstances in which you want to mix processor types, and you can do that by using the nodeset option. For example, to request nodes that can be either sandybridge or ivybridge, you would use
#PBS -l nodeset=ONEOF:Feature:ivybridge:sandybridge
You can add additional types to the list, separated by colons.
The complete list of types is
listed from oldest and fewest cores to newest and most cores. It is recommended that you not mix the haswell and broadwell types with any other. We have not seen problems when mixing sandybridge and ivybridge. In general, it is probably best not to mix types.
Note: This only applies to Flux. Armis is a homogeneous cluster of sandybridge nodes, so this is not relevant (there is one large memory nehalem node, but it requires a special flag to request).
All of the more specific examples use the node as the basic request unit, where a node is almost always synonymous with a physical machine. Processor features, when attached to a node, are called node properties, and they are used to make special requests.
A node property is a label that the cluster administrators attach to a node in the cluster’s configuration. It can be used to specify that nodes should be selected from the set of nodes carrying that label. Node properties are specified by modifying the node request.
Requesting a minimum number of processors per node
The next more specific request after simply asking for a pool of processors is to request that each of the machines (nodes) assigned to your job have some minimum number of processors. This will ensure that processors are kept in groups of a certain size, which can help processing speed for many algorithms.
For example, if we have a job that performs best if there are at least four processors per machine, we would request four such nodes by replacing procs=16 with nodes=4:ppn=4, as in
#PBS -l nodes=4:ppn=4,pmem=2gb
Note that here ppn is a property or characteristic of the node, and those are attached to, but separated from, the node request by colons (:).
There are two ways to combine this with a request for a particular processor type. The first is exactly as was done above when requesting processors with a particular feature.
#PBS -l nodes=4:ppn=4,pmem=2gb
#PBS -l feature=sandybridge
An alternative, and possibly more common, way to specify processor type when using ppn is to simply add the processor type to the node property list, as in
#PBS -l nodes=4:ppn=4:sandybridge,pmem=2gb
The processor feature is a property of the node and is therefore attached to the node request with a colon. You could also order it as nodes=4:sandybridge:ppn=4,pmem=2gb if you wish.
Flux’s scheduler will try to “pack” nodes; i.e., fit as many blocks of processors from a job onto a single node as possible. Thus, if you request
#PBS -l nodes=4:ppn=4
you might well end up with all of them on a single 16- or 20-core node. That may not be desirable under all circumstances, so the next section explains requesting exactly N processors per node.
Requesting exactly N processors per node
Sometimes you may need to request a specific number of processors per node. One common scenario is a program that distributes itself to multiple computers, but then on each computer wants to be able to run using more than one processor. That is more easily accomplished if you can have the job simulate, say, four quad-core machines.
To specify that you want an exact number of processors per node, the node property is removed, and instead a job requirement is specified. For example, to specify exactly 4 processors on 4 nodes, you specify the total number of processors needed and how many should be on each of the nodes with the following syntax.
#PBS -l procs=16,tpn=4,pmem=2gb
where the number of procs divided by the tasks per node (tpn) should have no remainder. That is, procs = nodes * tpn.
One task is, in this context, synonymous with one processor. Note that the tpn is separated from the procs by a comma, not a colon. Using the wrong separator is the most common error with this syntax.
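The divisibility rule above can be checked before submitting a job. The following is a small sketch in plain shell arithmetic; the helper name nodes_for is hypothetical and is not part of PBS or the Flux tools.

```shell
# nodes_for PROCS TPN
# Print the number of nodes implied by a procs/tpn pair, or report an
# error and return non-zero if PROCS is not an exact multiple of TPN.
nodes_for() {
    if [ "$2" -le 0 ] || [ $(( $1 % $2 )) -ne 0 ]; then
        echo "procs=$1 is not a multiple of tpn=$2" >&2
        return 1
    fi
    echo $(( $1 / $2 ))
}

nodes_for 16 4    # prints 4
```

A request such as procs=10,tpn=4 would fail this check, because 10 processors cannot be divided evenly into groups of 4.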
This specification brings us back around to the beginning, and if we want to specify which processor type, we do that by specifying the feature.
#PBS -l procs=16,tpn=4,pmem=2gb
#PBS -l feature=sandybridge
Requesting exclusive use of a whole node
If you are writing code, developing software, or testing performance, you may want to exercise much tighter control over your operating environment for reporting purposes. One common scenario is reporting how an algorithm or program scales; that is, how performance changes as the number of processors increases. To do this, you will want to control as many variables as possible. You might want to report the following information, for example.
- Performance on a Nehalem node with 1, 2, 4, 6, 8, and 12 processors
- Performance on a Sandybridge node with 1, 2, 4, 6, 8, 12, and 16 processors
- Performance on an Ivybridge node with 1, 2, 4, 6, 8, 12, 16, and 20 processors
- Performance on a Haswell node with 1, 2, 4, 6, 8, 12, 16, 20, and 24 processors
It is easiest
To take the first as an example, and assuming that the program binary is run_sim, you might run a job with
#PBS -l nodes=4:ppn=12:nehalem
Then, assuming your job uses MPI, in your PBS run script you could run, in sequence,
mpirun -npernode=2 run_sim
mpirun -npernode=4 run_sim
mpirun -npernode=6 run_sim
mpirun -npernode=8 run_sim
mpirun -npernode=12 run_sim
Those would run the job with the specified number of processors per node, but because the job has requested all the processors on each node, your timing will not be affected by other jobs running on the same node.
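The sequence of runs can also be written as a loop that records the elapsed time of each run. This is only a sketch under several assumptions: it assumes Open MPI's mpirun, whose documentation gives the option as -npernode followed by the count as a separate argument (adjust the option to match your MPI installation), it uses the run_sim binary named above, and the function name run_scaling and the DRY_RUN switch are hypothetical conveniences.

```shell
#!/bin/bash
#PBS -l nodes=4:ppn=12:nehalem

# Processors-per-node counts to measure, from the Nehalem list above.
counts="1 2 4 6 8 12"

# Run run_sim once per count and report wall-clock seconds.
# Set DRY_RUN=1 to print the commands instead of executing them.
run_scaling() {
    for n in $counts; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "mpirun -npernode $n run_sim"
        else
            start=$(date +%s)
            mpirun -npernode "$n" run_sim
            end=$(date +%s)
            echo "npernode=$n elapsed=$((end - start))s"
        fi
    done
}

# Only start the measurements when running inside a batch job.
if [ -n "${PBS_JOBID:-}" ]; then
    cd "$PBS_O_WORKDIR" || exit 1
    run_scaling
fi
```

Running the script by hand with DRY_RUN=1 and calling run_scaling lists the commands that would be executed, which is a cheap way to check the loop before spending allocation on it.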
There is also a PBS directive, -n, that specifies that you want exclusive use of a node. If you use that, your account will be charged for the total number of processors in the node, regardless of how many you actually request or use. In most cases, it will be more straightforward to simply ask for all the processors for that node type and use what you need from that number.
One case where this might not hold is when you are testing performance for a family of processors whose CPUs come with several different core counts and your code is hybrid MPI and threaded. You would then want your processes to be the only ones running on the nodes, but with only a subset of the processors available, so that threads cannot spread across more processors than the simulated CPU would have. To accomplish this, you could combine node properties and the node-exclusive directive, as in
#PBS -l nodes=4:ppn=12:ivybridge
#PBS -n
By using this, Flux’s use of cpusets will assign only 12 processors on each node to the job, and you will have ensured that no other jobs can run on those nodes. Do note that your account would be charged for 80 processors (4 nodes of 20 processors each) rather than 48, because your job will have made the other 32 processors unavailable for other jobs to use.
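The charge for a node-exclusive job follows directly from the node count and the node size. The arithmetic can be sketched as below; the function names charged_procs and idle_procs are hypothetical, and the figures are the ones from the Ivybridge example.

```shell
# charged_procs NODES CORES_PER_NODE
# Processors charged when a job holds whole nodes exclusively.
charged_procs() {
    echo $(( $1 * $2 ))
}

# idle_procs NODES CORES_PER_NODE PPN
# Processors that are charged for but not assigned to the job.
idle_procs() {
    echo $(( $1 * ($2 - $3) ))
}

charged_procs 4 20     # prints 80
idle_procs 4 20 12     # prints 32
```

The difference between the two results, 48, is the number of processors the job can actually use.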