diff --git a/doc.zih.tu-dresden.de/docs/archive/HardwareAtlas.md b/doc.zih.tu-dresden.de/docs/archive/HardwareAtlas.md
index ec8061e72e58e2910510521fbc0dd4ef6086b..184f395bcd3d8ed952ca9dcf26c64a66ed13210b 100644
--- a/doc.zih.tu-dresden.de/docs/archive/HardwareAtlas.md
+++ b/doc.zih.tu-dresden.de/docs/archive/HardwareAtlas.md
@@ -1,5 +1,3 @@
-
-
 # MEGWARE PC-Farm Atlas
 
 The PC farm `Atlas` is a heterogenous cluster based on multicore chips
@@ -7,41 +5,42 @@ AMD Opteron 6274 ("Bulldozer"). The nodes are operated by the Linux
 operating system SuSE SLES 11 with a 2.6 kernel. Currently, the
 following hardware is installed:
 
-\|CPUs \|AMD Opteron 6274 \| \|number of cores \|5120 \| \|th. peak
-performance\| 45 TFlops\| \|compute nodes \| 4-way nodes *Saxonid* with
-64 cores\| \|nodes with 64 GB RAM \| 48 \| \|nodes with 128 GB RAM \| 12
-\| \|nodes with 512 GB RAM \| 8 \|
-
-\<P>
+| CPUs                  | AMD Opteron 6274                    |
+|-----------------------|-------------------------------------|
+| number of cores       | 5120                                |
+| th. peak performance  | 45 TFlops                           |
+| compute nodes         | 4-way nodes *Saxonid* with 64 cores |
+| nodes with 64 GB RAM  | 48                                  |
+| nodes with 128 GB RAM | 12                                  |
+| nodes with 512 GB RAM | 8                                   |
 
-Mars and Deimos users: Please read the [migration
-hints](MigrateToAtlas).
+Mars and Deimos users: Please read the [migration hints](MigrateToAtlas.md).
 
-All nodes share the HOME and `/fastfs/` [file system](FileSystems) with
-our other HPC systems. Each node has 180 GB local disk space for scratch
-mounted on `/tmp` . The jobs for the compute nodes are scheduled by the
-[Platform LSF](Platform LSF) batch system from the login nodes
+All nodes share the `/home` and `/fastfs` file systems with our other HPC systems. Each
+node has 180 GB local disk space for scratch mounted on `/tmp`. The jobs for the compute nodes are
+scheduled by the [Platform LSF](PlatformLSF.md) batch system from the login nodes
 `atlas.hrsk.tu-dresden.de` .
 
-A QDR Infiniband interconnect provides the communication and I/O
-infrastructure for low latency / high throughput data traffic.
+A QDR Infiniband interconnect provides the communication and I/O infrastructure for low latency /
+high throughput data traffic.
 
-Users with a login on the [SGI Altix](HardwareAltix) can access their
-home directory via NFS below the mount point `/hpc_work`.
+Users with a login on the [SGI Altix](HardwareAltix.md) can access their home directory via NFS
+below the mount point `/hpc_work`.
 
 ## CPU AMD Opteron 6274
 
-\| Clock rate \| 2.2 GHz\| \| cores \| 16 \| \| L1 data cache \| 16 KB
-per core \| \| L1 instruction cache \| 64 KB shared in a *module* (i.e.
-2 cores) \| \| L2 cache \| 2 MB per module\| \| L3 cache \| 12 MB total,
-6 MB shared between 4 modules = 8 cores\| \| FP units \| 1 per module
-(supports fused multiply-add)\| \| th. peak performance\| 8.8 GFlops per
-core (w/o turbo) \|
+| Clock rate           | 2.2 GHz                                              |
+|----------------------|------------------------------------------------------|
+| cores                | 16                                                   |
+| L1 data cache        | 16 KB per core                                       |
+| L1 instruction cache | 64 KB shared in a *module* (i.e. 2 cores)            |
+| L2 cache             | 2 MB per module                                      |
+| L3 cache             | 12 MB total, 6 MB shared between 4 modules = 8 cores |
+| FP units             | 1 per module (supports fused multiply-add)           |
+| th. peak performance | 8.8 GFlops per core (w/o turbo)                      |
 
 The CPU belongs to the x86_64 family. Since it is fully capable of
 running x86-code, one should compare the performances of the 32 and 64
 bit versions of the same code.
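+
+A minimal sketch of such a comparison, assuming GCC with 32-bit support installed; the source file
+and binary names are placeholders:
+
+```Bash
+# Build the same source once as a 32-bit and once as a 64-bit binary
+gcc -m32 -O2 example.c -o example_32
+gcc -m64 -O2 example.c -o example_64
+
+# Compare the runtimes of both builds
+time ./example_32
+time ./example_64
+```
+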
-For more architectural details, see the [AMD Bulldozer block
-diagram](http://upload.wikimedia.org/wikipedia/commons/e/ec/AMD_Bulldozer_block_diagram_%288_core_CPU%29.PNG)
-and [topology of Atlas compute nodes](%ATTACHURL%/Atlas_Knoten.pdf).
+For more architectural details, see the
+[AMD Bulldozer block diagram](http://upload.wikimedia.org/wikipedia/commons/e/ec/AMD_Bulldozer_block_diagram_%288_core_CPU%29.PNG)
+and [topology of Atlas compute nodes] **todo** %ATTACHURL%/Atlas_Knoten.pdf.
diff --git a/doc.zih.tu-dresden.de/docs/archive/SystemAtlas.md b/doc.zih.tu-dresden.de/docs/archive/SystemAtlas.md
index 13ce0ace2b6d91e07df39d6a83c0ad86fdb7ef9a..59fe0111fbe1052a6b45923128369f703462ea15 100644
--- a/doc.zih.tu-dresden.de/docs/archive/SystemAtlas.md
+++ b/doc.zih.tu-dresden.de/docs/archive/SystemAtlas.md
@@ -1,27 +1,22 @@
 # Atlas
 
-**`%RED%This page is deprecated! Atlas is a former system!%ENDCOLOR%`**
-( [Current hardware](Compendium.Hardware))
+**This page is deprecated! Atlas is a former system!**
 
-Atlas is a general purpose HPC cluster for jobs using 1 to 128 cores in
-parallel ( [Information on the hardware](HardwareAtlas)).
+Atlas is a general purpose HPC cluster for jobs using 1 to 128 cores in parallel
+([Information on the hardware](HardwareAtlas.md)).
 
 ## Compiling Parallel Applications
 
-When loading a compiler module on Atlas, the module for the MPI
-implementation OpenMPI is also loaded in most cases. If not, you should
-explicitly load the OpenMPI module with `module load openmpi`. This also
-applies when you use the system's (old) GNU compiler. ( [read more about
-Modules](Compendium.RuntimeEnvironment), [read more about
-Compilers](Compendium.Compilers))
+When loading a compiler module on Atlas, the module for the MPI implementation OpenMPI is also
+loaded in most cases. If not, you should explicitly load the OpenMPI module with
+`module load openmpi`. This also applies when you use the system's (old) GNU compiler.
 
-Use the wrapper commands `mpicc` , `mpiCC` , `mpif77` , or `mpif90` to
-compile MPI source code. They use the currently loaded compiler. To
-reveal the command lines behind the wrappers, use the option `-show`.
+Use the wrapper commands `mpicc`, `mpiCC`, `mpif77`, or `mpif90` to compile MPI source code. They
+use the currently loaded compiler. To reveal the command lines behind the wrappers, use the option
+`-show`.
 
-For running your code, you have to load the same compiler and MPI module
-as for compiling the program. Please follow te following guiedlines to
-run your parallel program using the batch system.
+For running your code, you have to load the same compiler and MPI module as for compiling the
+program. Please follow the guidelines below to run your parallel program using the batch system.
 
 ## Batch System
 
@@ -29,45 +24,43 @@ Applications on an HPC system can not be run on the login node. They
 have to be submitted to compute nodes with dedicated resources for the
 user's job. Normally a job can be submitted with these data:
 
-- number of CPU cores,
-- requested CPU cores have to belong on one node (OpenMP programs) or
-  can distributed (MPI),
-- memory per process,
-- maximum wall clock time (after reaching this limit the process is
-  killed automatically),
-- files for redirection of output and error messages,
-- executable and command line parameters.
+- number of CPU cores,
+- requested CPU cores have to belong to one node (OpenMP programs) or
+  can be distributed (MPI),
+- memory per process,
+- maximum wall clock time (after reaching this limit the process is
+  killed automatically),
+- files for redirection of output and error messages,
+- executable and command line parameters.
 
 ### LSF
 
-The batch sytem on Atlas is LSF. For general information on LSF, please
-follow [this link](PlatformLSF).
+The batch system on Atlas is LSF. For general information on LSF, please follow
+[this link](PlatformLSF.md).
 
 ### Submission of Parallel Jobs
 
-To run MPI jobs ensure that the same MPI module is loaded as during
-compile-time. In doubt, check you loaded modules with `module list`. If
-you code has been compiled with the standard OpenMPI installation, you
-can load the OpenMPI module via `module load openmpi`.
+To run MPI jobs, ensure that the same MPI module is loaded as during compile time. If in doubt,
+check your loaded modules with `module list`. If your code has been compiled with the standard
+OpenMPI installation, you can load the OpenMPI module via `module load openmpi`.
 
-Please pay attention to the messages you get loading the module. They
-are more up-to-date than this manual. To submit a job the user has to
-use a script or a command-line like this:
+Please pay attention to the messages you get when loading the module. They are more up-to-date
+than this manual. To submit a job, the user has to use a script or a command line like this:
 
-    <span class='WYSIWYG_HIDDENWHITESPACE'> </span>bsub -n <N> mpirun <program name><span class='WYSIWYG_HIDDENWHITESPACE'> </span>
+```Bash
+bsub -n <N> mpirun <program name>
+```
 
 ### Memory Limits
 
-**Memory limits are enforced.** This means that jobs which exceed their
-per-node memory limit **may be killed** automatically by the batch
-system.
+**Memory limits are enforced.** This means that jobs which exceed their per-node memory limit **may
+be killed** automatically by the batch system.
 
-The **default limit** is **300 MB** *per job slot* (bsub -n).
+The **default limit** is **300 MB** *per job slot* (`bsub -n`).
 
-Atlas has sets of nodes with different amount of installed memory which
-affect where your job may be run. To achieve the shortest possible
-waiting time for your jobs, you should be aware of the limits shown in
-the following table and read through the explanation below.
+Atlas has sets of nodes with different amounts of installed memory, which affect where your job may
+be run. To achieve the shortest possible waiting time for your jobs, you should be aware of the
+limits shown in the following table and read through the explanation below.
 
 | Nodes | No. of Cores | Avail. Memory per Job Slot | Max. Memory per Job Slot for Oversubscription |
 |:-------------|:-------------|:---------------------------|:----------------------------------------------|
@@ -77,14 +70,12 @@ the following table and read through the explanation below.
 
 #### Explanation
 
-The amount of memory that you request for your job (-M ) restricts to
-which nodes it will be scheduled. Usually, the column **"Avail. Memory
-per Job Slot"** shows the maximum that will be allowed on the respective
-nodes.
+The amount of memory that you request for your job (`-M`) restricts to which nodes it will be
+scheduled. Usually, the column **Avail. Memory per Job Slot** shows the maximum that will be
+allowed on the respective nodes.
 
-However, we allow for **oversubscribing of job slot memory**. This means
-that jobs which use **-n32 or less** may be scheduled to smaller memory
-nodes.
+However, we allow for **oversubscribing of job slot memory**. This means that jobs which use **-n32
+or less** may be scheduled to smaller memory nodes.
 
 Have a look at the **examples below**.
 
@@ -98,17 +89,17 @@ available for longer running jobs (>10 min).
 
 | Job Spec. | Nodes Allowed | Remark |
 |:--------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------------------|
-| `bsub %GREEN%-n 1 -M 500%ENDCOLOR%` | `All nodes` | \<= 940 Fits everywhere |
-| `bsub %GREEN%-n 64 -M 700%ENDCOLOR%` | `All nodes` | \<= 940 Fits everywhere |
-| `bsub %GREEN%-n 4 -M 1800%ENDCOLOR%` | `All nodes` | Is allowed to oversubscribe on small nodes n\[001-047\] |
-| `bsub %GREEN%-n 64 -M 1800%ENDCOLOR%` | `n[049-092]` | 64\*1800 will not fit onto a single small node and is therefore restricted to running on medium and large nodes |
-| \<span>bsub \<span style`"color: #eecc22;">-n 4 -M 2000</span></span> | =n[049-092]` | Over limit for oversubscribing on small nodes n\[001-047\], but may still go to medium nodes | |
-| \<span>bsub \<span style`"color: #eecc22;">-n 32 -M 2000</span></span> | =n[049-092]` | Same as above | |
-| `bsub %GREEN%-n 32 -M 1880%ENDCOLOR%` | `All nodes` | Using max. 1880 MB, the job is eligible for running on any node |
-| \<span>bsub \<span style`"color: #eecc22;">-n 64 -M 2000</span></span> | =n[085-092]` | Maximum for medium nodes is 1950 per slot - does the job **really** need **2000 MB** per process? | |
-| `bsub %GREEN%-n 64 -M 1950%ENDCOLOR%` | `n[049-092]` | When using 1950 as maximum, it will fit to the medium nodes |
-| `bsub -n 32 -M 16000` | `n[085-092]` | Wait time might be **very long** |
-| `bsub %RED%-n 64 -M 16000%ENDCOLOR%` | `n[085-092]` | Memory request cannot be satisfied (64\*16 MB = 1024 GB), **%RED%cannot schedule job%ENDCOLOR%** |
+| `bsub -n 1 -M 500` | All nodes | <= 940 MB, fits everywhere |
+| `bsub -n 64 -M 700` | All nodes | <= 940 MB, fits everywhere |
+| `bsub -n 4 -M 1800` | All nodes | Is allowed to oversubscribe on small nodes `n[001-047]` |
+| `bsub -n 64 -M 1800` | `n[049-092]` | 64\*1800 will not fit onto a single small node and is therefore restricted to running on medium and large nodes |
+| `bsub -n 4 -M 2000` | `n[049-092]` | Over limit for oversubscribing on small nodes `n[001-047]`, but may still go to medium nodes |
+| `bsub -n 32 -M 2000` | `n[049-092]` | Same as above |
+| `bsub -n 32 -M 1880` | All nodes | Using max. 1880 MB, the job is eligible for running on any node |
+| `bsub -n 64 -M 2000` | `n[085-092]` | Maximum for medium nodes is 1950 MB per slot - does the job **really** need **2000 MB** per process? |
+| `bsub -n 64 -M 1950` | `n[049-092]` | When using 1950 as maximum, it will fit onto the medium nodes |
+| `bsub -n 32 -M 16000` | `n[085-092]` | Wait time might be **very long** |
+| `bsub -n 64 -M 16000` | `n[085-092]` | Memory request cannot be satisfied (64\*16 GB = 1024 GB), **cannot schedule job** |
 
 ### Batch Queues
 
@@ -117,10 +108,10 @@ scheduling policy prefers short running jobs over long running ones.
 This means that **short jobs get higher priorities** and are usually
 started earlier than long running jobs.
 
-| Batch Queue | Admitted Users | Max. Cores | Default Runtime | \<div style="text-align: right;">Max. Runtime\</div> |
-|:--------------|:---------------|:---------------------------------------------|:--------------------------------------------------|:-----------------------------------------------------|
-| `interactive` | `all` | \<div style="text-align: right;">n/a\</div> | \<div style="text-align: right;">12h 00min\</div> | \<div style="text-align: right;">12h 00min\</div> |
-| `short` | `all` | \<div style="text-align: right;">1024\</div> | \<div style="text-align: right;">1h 00min\</div> | \<div style="text-align: right;">24h 00min\</div> |
-| `medium` | `all` | \<div style="text-align: right;">1024\</div> | \<div style="text-align: right;">24h 01min\</div> | \<div style="text-align: right;">72h 00min\</div> |
-| `long` | `all` | \<div style="text-align: right;">1024\</div> | \<div style="text-align: right;">72h 01min\</div> | \<div style="text-align: right;">120h 00min\</div> |
-| `rtc` | `on request` | \<div style="text-align: right;">4\</div> | \<div style="text-align: right;">12h 00min\</div> | \<div style="text-align: right;">300h 00min\</div> |
+| Batch Queue   | Admitted Users | Max. Cores | Default Runtime | Max. Runtime |
+|:--------------|:---------------|:-----------|:----------------|:-------------|
+| `interactive` | `all`          | n/a        | 12h 00min       | 12h 00min    |
+| `short`       | `all`          | 1024       | 1h 00min        | 24h 00min    |
+| `medium`      | `all`          | 1024       | 24h 01min       | 72h 00min    |
+| `long`        | `all`          | 1024       | 72h 01min       | 120h 00min   |
+| `rtc`         | `on request`   | 4          | 12h 00min       | 300h 00min   |
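+
+Putting these options together, a job submission could look like the following sketch. The slot
+count, memory limit, runtime, queue, and program name are only placeholders; `-W`, `-q`, `-o`, and
+`-e` are standard LSF options for the runtime limit, the queue, and the output/error files (`%J`
+expands to the job ID).
+
+```Bash
+# Request 16 slots with 1800 MB per slot and an 8 h wall clock limit in the short queue;
+# stdout and stderr are written to files that contain the job ID.
+bsub -n 16 -M 1800 -W 08:00 -q short -o myjob.%J.out -e myjob.%J.err mpirun ./my_program
+```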