diff --git a/doc.zih.tu-dresden.de/docs/archive/load_leveler.md b/doc.zih.tu-dresden.de/docs/archive/load_leveler.md index 07daea3dbcef9d375a57f47dbec1d0d8a27d0491..7a96f4945d08650ca8c20592e140ab0f43bcfe16 100644 --- a/doc.zih.tu-dresden.de/docs/archive/load_leveler.md +++ b/doc.zih.tu-dresden.de/docs/archive/load_leveler.md @@ -287,7 +287,7 @@ This command will give you detailed job information. ### Job Status States -| | | | +| State | Short | Description | |------------------|-----|----------------| | Canceled | CA | The job has been canceled as by the `llcancel` command. | | Completed | C | The job has completed. | diff --git a/doc.zih.tu-dresden.de/docs/archive/migrate_to_atlas.md b/doc.zih.tu-dresden.de/docs/archive/migrate_to_atlas.md index 051b65694ad03ebc248a813b3e46b400d4af286e..905f59a880cfafefa6633cdfdca3320feccd5b8f 100644 --- a/doc.zih.tu-dresden.de/docs/archive/migrate_to_atlas.md +++ b/doc.zih.tu-dresden.de/docs/archive/migrate_to_atlas.md @@ -7,13 +7,12 @@ [Atlas](system_atlas.md) is a different machine than [Deimos](system_deimos.md), please have a look at the table: -| | | | -|---------------------------------------------------|------------|-----------| -| | **Deimos** | **Atlas** | -| **number of hosts** | 584 | 92 | -| **cores per host** | 2...8 | 64 | -| **memory \[GB\] per host** | 8...64 | 64..512 | -| **example benchmark: SciMark (higher is better)** | 655 | 584 | +| | Deimos | Atlas | +|---------------------------------------------------|-----------:|----------:| +| number of hosts | 584 | 92 | +| cores per host | 2...8 | 64 | +| memory \[GB\] per host | 8...64 | 64..512 | +| example benchmark: SciMark (higher is better) | 655 | 584 | A single thread on Atlas runs with a very poor performance in comparison with the 6 year old Deimos. The reason for this is that the AMD CPU @@ -37,13 +36,12 @@ The most important changes are: `-M <memory per process in MByte>`, the default is 300 MB, e.g. `-M 2000`. -| | | | -|-----------------------|--------|------------------------------------------------------| -| Hosts on Atlas | number | per process/core user memory limit in MB (-M option) | -| nodes with 64 GB RAM | 48 | 940 | -| nodes with 128 GB RAM | 24 | 1950 | -| nodes with 256 GB RAM | 12 | 4000 | -| nodes with 512 GB RAM | 8 | 8050 | +| Hosts on Atlas | Count | Per Process/Core User Memory Limit in MB (`-M` option) | +|-----------------------|-------:|-------------------------------------------------------:| +| nodes with 64 GB RAM | 48 | 940 | +| nodes with 128 GB RAM | 24 | 1950 | +| nodes with 256 GB RAM | 12 | 4000 | +| nodes with 512 GB RAM | 8 | 8050 | - Jobs with a job runtime greater than 72 hours (jobs that will run in the queue `long`) will be collected over the day and scheduled in a @@ -98,7 +96,7 @@ compiler comes from the Open64 suite. For convenience, other compilers are insta shows good results as well. Please check the best compiler flags at [this overview] developer.amd.com/Assets/CompilerOptQuickRef-62004200.pdf. -### MPI parallel applications +### MPI Parallel Applications Please note the more convenient syntax on Atlas. 
Therefore, please use a command like diff --git a/doc.zih.tu-dresden.de/docs/archive/system_altix.md b/doc.zih.tu-dresden.de/docs/archive/system_altix.md index d3208237453cbbaf685e6fd4d9d4e1b28575b0c1..08cc3a3e8f739780205ef7de75bcd618ee111dd9 100644 --- a/doc.zih.tu-dresden.de/docs/archive/system_altix.md +++ b/doc.zih.tu-dresden.de/docs/archive/system_altix.md @@ -74,14 +74,14 @@ properties: | Component | Count | |-------------------------------------|----------------------------| -| clock rate | 1.6 GHz | -| integer units | 6 | -| floating point units (multiply-add) | 2 | -| peak performance | 6.4 GFLOPS | +| Clock rate | 1.6 GHz | +| Integer units | 6 | +| Floating point units (multiply-add) | 2 | +| Peak performance | 6.4 GFLOPS | | L1 cache | 2 x 16 kB, 1 clock latency | | L2 cache | 256 kB, 5 clock latency | | L3 cache | 9 MB, 12 clock latency | -| front side bus | 128 bit x 200 MHz | +| Front side bus | 128 bit x 200 MHz | The theoretical peak performance of all Altix partitions is hence about 13.1 TFLOPS. diff --git a/doc.zih.tu-dresden.de/docs/archive/vampirtrace.md b/doc.zih.tu-dresden.de/docs/archive/vampirtrace.md index 15746b60035e4ec7999159693dcaa56ca5f54f9f..edfa2a80f1b70902a5622f84d33efded97ee4dc3 100644 --- a/doc.zih.tu-dresden.de/docs/archive/vampirtrace.md +++ b/doc.zih.tu-dresden.de/docs/archive/vampirtrace.md @@ -47,7 +47,7 @@ The following sections show some examples depending on the parallelization type Compiling serial code is the default behavior of the wrappers. Simply replace the compiler by VampirTrace's wrapper: -| | | +| | Compile Command | |----------------------|-------------------------------| | original | `ifort a.f90 b.f90 -o myprog` | | with instrumentation | `vtf90 a.f90 b.f90 -o myprog` | @@ -59,7 +59,7 @@ This will instrument user functions (if supported by compiler) and link the Vamp If your MPI implementation uses MPI compilers (this is the case on [Deimos](system_deimos.md)), you need to tell VampirTrace's wrapper to use this compiler instead of the serial one: -| | | +| | Compile Command | |----------------------|--------------------------------------| | original | `mpicc hello.c -o hello` | | with instrumentation | `vtcc -vt:cc mpicc hello.c -o hello` | @@ -68,7 +68,7 @@ MPI implementations without own compilers (as on the [Altix](system_altix.md) re link the MPI library manually. In this case, you simply replace the compiler by VampirTrace's compiler wrapper: -| | | +| | Compile Command | |----------------------|-------------------------------| | original | `icc hello.c -o hello -lmpi` | | with instrumentation | `vtcc hello.c -o hello -lmpi` | @@ -81,7 +81,7 @@ option `-vt:inst manual` to disable automatic instrumentation of user functions. 
When VampirTrace detects OpenMP flags on the command line, OPARI is invoked for automatic source code instrumentation of OpenMP events: -| | | +| | Compile Command | |----------------------|----------------------------| | original | `ifort -openmp pi.f -o pi` | | with instrumentation | `vtf77 -openmp pi.f -o pi` | @@ -90,7 +90,7 @@ code instrumentation of OpenMP events: With a combination of the above mentioned approaches, hybrid applications can be instrumented: -| | | +| | Compile Command | |----------------------|-----------------------------------------------------| | original | `mpif90 -openmp hybrid.F90 -o hybrid` | | with instrumentation | `vtf90 -vt:f90 mpif90 -openmp hybrid.F90 -o hybrid` | diff --git a/doc.zih.tu-dresden.de/docs/software/perf_tools.md b/doc.zih.tu-dresden.de/docs/software/perf_tools.md index 8afac08a87eb3925b6b6ece7c2fa7732b6ec827a..897c2dbc05d30275015552e86ae5876d2c20844d 100644 --- a/doc.zih.tu-dresden.de/docs/software/perf_tools.md +++ b/doc.zih.tu-dresden.de/docs/software/perf_tools.md @@ -11,12 +11,11 @@ support for sampling applications and reading performance counters. Admins can change the behaviour of the perf tools kernel part via the following interfaces -| | | -|---------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------| | File Name | Description | -| `/proc/sys/kernel/perf_event_max_sample_rate` | describes the maximal sample rate for perf record and native access. This is used to limit the performance influence of sampling. | -| `/proc/sys/kernel/perf_event_mlock_kb` | defines the number of pages that can be used for sampling via perf record or the native interface | -| `/proc/sys/kernel/perf_event_paranoid` | defines access rights: | +|---------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------| +| `/proc/sys/kernel/perf_event_max_sample_rate` | Describes the maximal sample rate for perf record and native access. This is used to limit the performance influence of sampling. | +| `/proc/sys/kernel/perf_event_mlock_kb` | Defines the number of pages that can be used for sampling via perf record or the native interface | +| `/proc/sys/kernel/perf_event_paranoid` | Defines access rights: | | | -1 - Not paranoid at all | | | 0 - Disallow raw tracepoint access for unpriv | | | 1 - Disallow cpu events for unpriv | diff --git a/doc.zih.tu-dresden.de/docs/software/scs5_software.md b/doc.zih.tu-dresden.de/docs/software/scs5_software.md index b5a1bef60d20cdc9989c8db82f766d31a96d3cdc..b017af11ab9d3beda0c7c88436d29d716db9ac39 100644 --- a/doc.zih.tu-dresden.de/docs/software/scs5_software.md +++ b/doc.zih.tu-dresden.de/docs/software/scs5_software.md @@ -3,7 +3,7 @@ Bull's new cluster software is called SCS 5 (*Super Computing Suite*). Here are the major changes from the user's perspective: -| software | old | new | +| Software | Old | New | |:--------------------------------|:-------|:---------| | Red Hat Enterprise Linux (RHEL) | 6.x | 7.x | | Linux kernel | 2.26 | 3.10 | @@ -35,11 +35,11 @@ ml av There is a special module that is always loaded (sticky) called **modenv**. It determines the module environment you can see. 
-| | | |
-|----------------|-------------------------------------------------|---------|
-| modenv/scs5 | SCS5 software | default |
-| modenv/ml | software for data analytics (partition ml) | |
-| modenv/classic | Manually built pre-SCS5 (AE4.0) software | hidden |
+| Module Environment | Description | Status |
+|--------------------|---------------------------------------------|---------|
+| `modenv/scs5` | SCS5 software | default |
+| `modenv/ml` | Software for data analytics (partition ml) | |
+| `modenv/classic` | Manually built pre-SCS5 (AE4.0) software | hidden |

The old modules (pre-SCS5) are still available after loading the
corresponding **modenv** version (**classic**), however, due to changes
@@ -90,31 +90,28 @@ than you will be used to, coming from modenv/classic. A full toolchain, like "in

For instance, the "intel" toolchain has the following structure:

-| | |
+| Toolchain | `intel` |
|--------------|------------|
-| toolchain | intel |
-| compilers | icc, ifort |
-| mpi library | impi |
-| math library | imkl |
+| Compilers | icc, ifort |
+| MPI library | impi |
+| Math library | imkl |

On the other hand, the "foss" toolchain looks like this:

-| | |
+| Toolchain | `foss` |
|----------------|---------------------|
-| toolchain | foss |
-| compilers | GCC (gcc, gfortran) |
-| mpi library | OpenMPI |
-| math libraries | OpenBLAS, FFTW |
+| Compilers | GCC (gcc, gfortran) |
+| MPI library | OpenMPI |
+| Math libraries | OpenBLAS, FFTW |

If you want to combine the Intel compilers and MKL with OpenMPI, you'd
have to use the "iomkl" toolchain:

-| | |
+| Toolchain | `iomkl` |
|--------------|------------|
-| toolchain | iomkl |
-| compilers | icc, ifort |
-| mpi library | OpenMPI |
-| math library | imkl |
+| Compilers | icc, ifort |
+| MPI library | OpenMPI |
+| Math library | imkl |

There are also subtoolchains that skip a layer or two, e.g. "iccifort" only consists of the respective compilers, same as "GCC". Then there is "iompi" that includes Intel compilers+OpenMPI but
@@ -145,7 +142,7 @@ Since "intel" is only a toolchain module now, it does not include the entire Par
anymore. Tools like the Intel Advisor, Inspector, Trace Analyzer or
VTune Amplifier are available as separate modules now:

-| product | module |
+| Product | Module |
|:----------------------|:----------|
| Intel Advisor | Advisor |
| Intel Inspector | Inspector |
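
A minimal usage sketch of the new module layout, assuming the module names from the tables above (`modenv/classic`, `modenv/scs5`, `iomkl`, `Advisor`) are available as shown; version suffixes and defaults on the system may differ:

```bash
# switch to the old (pre-SCS5) module environment and back to the SCS5 default
ml modenv/classic
ml modenv/scs5

# load the iomkl toolchain (icc/ifort + OpenMPI + MKL) and one of the separate Intel tool modules
ml iomkl
ml Advisor
```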