Partition QoS vs User QoS

Partition QoS

Every partition has a Quality of Service (QoS) that defines limits such as MaxJobs and MaxSubmitJobs. These limits apply to the jobs that users submit to that partition. To see which QoS is attached to the normal partition, run the following scontrol command.

scontrol show partition normal

Output:

PartitionName=normal
   AllowGroups=discovery-users_normal,pkgmgr AllowAccounts=ALL AllowQos=ALL
   AllocNodes=ALL Default=YES QoS=p-normal
   DefaultTime=00:01:00 DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=7-01:00:00 MinNodes=0 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=discovery-c[1-15]
   PriorityJobFactor=1 PriorityTier=25 RootOnly=NO ReqResv=NO OverSubscribe=FORCE:1
   OverTimeLimit=NONE PreemptMode=SUSPEND
   State=UP TotalCPUs=624 TotalNodes=15 SelectTypeParameters=NONE
   JobDefaults=DefCpuPerGPU=4
   DefMemPerCPU=512 MaxMemPerNode=UNLIMITED

The above output shows that the QoS for the normal partition is `QoS=p-normal`. Run the following command to display the limits defined for the p-normal QoS.

Syntax: sacctmgr show qos where name=<qos-name> format=<headername1,headername2,...,headernameN>

sacctmgr show qos where name=p-normal format=name,maxJobs,maxSubmit

Output:

      Name MaxJobs MaxSubmit
---------- ------- ---------
  p-normal      10        20

The output shows the limits that the p-normal QoS defines for the normal partition. The MaxJobs limit is 10, which means you can have at most 10 jobs running at the same time. The MaxSubmit limit is 20, which means you can have at most 20 jobs submitted to the normal partition at once. With 20 jobs submitted, up to 10 will be in the running state and the remaining jobs will wait in the queue.
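The two lookups above can be chained in a small script. The following is a minimal sketch, not a site-provided tool: the function names partition_qos and describe_limits are hypothetical, and it assumes the sacctmgr options -n (no header) and -P (fields separated by "|") so the record is easy to split. The Slurm commands only run when the client tools are installed.

```shell
#!/bin/sh
# Sketch: find the QoS attached to a partition, then describe its job limits.

# Read `scontrol show partition` output on stdin and print the QoS name.
# The leading space in the pattern keeps the match away from AllowQos=.
partition_qos() {
    sed -n 's/.* QoS=\([^ ]*\).*/\1/p'
}

# Read one "name|maxjobs|maxsubmit" record on stdin and describe it in words.
describe_limits() {
    IFS='|' read -r name max_jobs max_submit
    if [ -z "$max_jobs" ] || [ -z "$max_submit" ]; then
        # An empty field means the QoS does not set that limit.
        echo "$name: at least one limit is unset"
        return
    fi
    echo "$name: $max_jobs running at most, $((max_submit - max_jobs)) more pending, $max_submit submitted in total"
}

# Only query Slurm when its client tools are available on this machine.
if command -v scontrol >/dev/null 2>&1 && command -v sacctmgr >/dev/null 2>&1; then
    qos=$(scontrol show partition normal | partition_qos)
    sacctmgr -n -P show qos where name="$qos" format=name,maxJobs,maxSubmit | describe_limits
fi
```

The parsing is kept in functions that read from stdin so it can be tried on saved command output without touching the scheduler.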

Similarly, a different QoS is defined for every partition on the HPC cluster. The table below shows the QoS information for the partitions.

QoS            MaxJobs  MaxSubmit  Flags
-------------  -------  ---------  -----------
p-normal            10         20  DenyOnLimit
p-gpu                2          4  DenyOnLimit
p-interactive        3          3  DenyOnLimit
p-backfill          10         20  DenyOnLimit
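As an illustration of how the MaxSubmit values in the table can be used, the sketch below estimates how many more jobs you could submit to a partition. It is an assumption-laden example: the function names max_submit_for and submit_headroom are hypothetical, the limits are hard-coded from the table rather than read from Slurm, and the squeue count (one line per job, with -r expanding array elements) is only an approximation of how the scheduler counts jobs against the limit.

```shell
#!/bin/sh
# Sketch: how many more jobs fit under a partition QoS MaxSubmit limit?

# Print the MaxSubmit value from the table for a QoS name (hard-coded copy).
max_submit_for() {
    case "$1" in
        p-normal|p-backfill) echo 20 ;;
        p-gpu)               echo 4 ;;
        p-interactive)       echo 3 ;;
        *)                   return 1 ;;
    esac
}

# Print the number of submissions left, never below zero.
# $1 = QoS name, $2 = jobs already running or pending under it.
submit_headroom() {
    limit=$(max_submit_for "$1") || return 1
    left=$((limit - $2))
    if [ "$left" -lt 0 ]; then
        left=0
    fi
    echo "$left"
}

# Only query Slurm when squeue is available on this machine.
if command -v squeue >/dev/null 2>&1; then
    # Count your own pending and running jobs in the normal partition.
    mine=$(squeue -h -r -u "$USER" -p normal -t PENDING,RUNNING | wc -l)
    echo "normal: $(submit_headroom p-normal "$mine") more submissions allowed"
fi
```

If the site changes a limit, the hard-coded values go stale; querying sacctmgr for the current value would be the more robust choice.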

User QoS

Slurm also supports a user QoS, which is attached to a user rather than to a partition. On Discovery, however, no user QoS is defined. For more information about QoS, see https://slurm.schedmd.com/qos.html
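To see which QoS names, if any, are attached to your own account associations, something like the following sketch may help. The function name user_qos_names is hypothetical, and the example assumes that sacctmgr show assoc accepts the user, account, and qos format fields together with the -n and -P options; what it prints depends entirely on how the site has configured its associations.

```shell
#!/bin/sh
# Sketch: list the QoS names attached to a user's associations.

# Read "user|account|qos" records on stdin and print the distinct,
# non-empty QoS names, one per line. The qos field may be a comma list.
user_qos_names() {
    cut -d'|' -f3 | tr ',' '\n' | sed '/^$/d' | sort -u
}

# Only query Slurm when sacctmgr is available on this machine.
if command -v sacctmgr >/dev/null 2>&1; then
    sacctmgr -n -P show assoc where user="$USER" format=user,account,qos | user_qos_names
fi
```

Empty output would mean that no QoS is listed on any of the user's associations.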