# PIKA - Center-Wide and Job-Aware Cluster Monitoring

PIKA is an infrastructure for continuous monitoring and analysis of HPC systems. 
It uses collectd as the collection daemon, InfluxDB to store time-series data, and MariaDB to store job metadata.
Furthermore, it provides a powerful [web frontend](https://gitlab.hrz.tu-chemnitz.de/pika/visualization) for the visualization of job data.

- [Installation](#installation)
- [Configuration](#header-configuration)
- [Data Collection](#header-data-collection)
- [Job Control](#header-job-control)
- [Post-Processing](#header-post-processing)
- [How Components Are Connected](#markdown-header-how-components-are-connected)
- [Evaluation Test](#header-evaluation-test)

***
## Installation
The software stack consists of several components and tools. 
To simplify the installation, appropriate install scripts are available. 
For detailed install instructions see the [README.md](install/README.md) in the install directory.

## Configuration
The following files are used to configure the software stack:

* *pika.conf* 
contains the global, version-independent configuration variables. It also sets some environment variables that are used in the job prolog and epilog. It uses `source` to read the environment variables from *.pika_access*.
* *.pika_access* 
exports the environment variables containing the access credentials for the databases; therefore, this file should have restricted read permissions. You can use [pika_access_template](pika_access_template) to create this file.
* *pika-VERSION.conf* 
is used for versioning of the PIKA package. It sets the PIKA package version along with the versions of collectd, LIKWID, and Python to use. Finally, it uses `source` to read the environment variables from *pika.conf*.
* *pika_utils.conf* 
provides utility functions for prolog, epilog and other bash scripts. 
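
Because *.pika_access* holds credentials, it should only be readable by its owner. A minimal sketch, assuming placeholder variable names (take the real ones from *pika_access_template*):

```shell
# Sketch: create .pika_access readable only by its owner.
# The exported variable names are placeholders, NOT the actual
# names from pika_access_template.
umask 077                        # new files get owner-only permissions
cat > .pika_access <<'EOF'
export PIKA_DB_USER=pika         # placeholder
export PIKA_DB_PASSWORD=secret   # placeholder
EOF
chmod 600 .pika_access           # owner read/write only
```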

Edit *pika.conf* and change the variables *LOCAL_STORE*, *PIKA_LOGPATH* and *PIKA_INSTALL_PATH* according to your needs or system setup. 
*LOCAL_STORE* specifies the path where temporary files are placed during prolog and read by the epilog script. It is also used for locking of the install and collectd start procedure. 
*PIKA_LOGPATH* specifies the path where the collectd log file *pika_collectd.log* will be written to.
*PIKA_INSTALL_PATH* specifies the path where the PIKA software (binaries, libraries, etc.) is installed to.
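
For orientation, the corresponding section of *pika.conf* might look like this (the paths are placeholders, not shipped defaults):

```shell
# Excerpt sketch of pika.conf -- paths are placeholders:
LOCAL_STORE=/tmp/pika            # prolog/epilog scratch files and locks
PIKA_LOGPATH=/var/log/pika       # pika_collectd.log is written here
PIKA_INSTALL_PATH=/opt/pika      # binaries, libraries, etc.
```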

Edit *pika-VERSION.conf* and set the variable *PIKA_ROOT* to the path where the PIKA sources (and also the *.conf files) are located. 
This file also specifies the collectd batch size (the number of metric values that are collected before being sent to the database) with the variable *PIKA_COLLECTD_BATCH_SIZE*. 
Furthermore, it does some exception handling for different types of nodes.
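
Likewise, a *pika-VERSION.conf* might contain something along these lines (values are placeholders; the per-node-type handling is only hinted at):

```shell
# Excerpt sketch of pika-VERSION.conf -- values are placeholders:
PIKA_ROOT=/sw/pika                 # PIKA sources and *.conf files
PIKA_COLLECTD_BATCH_SIZE=200       # metric values buffered per DB write
# per-node-type exceptions would go here, e.g. a case on $(hostname)
source "${PIKA_ROOT}/pika.conf"    # pull in the global configuration
```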

Finally, a symbolic link named *pika-current.conf* that points to a *pika-VERSION.conf* file has to be created. For example:

    ln -s pika-1.2.conf pika-current.conf

To create a new PIKA software package, copy a *pika-VERSION.conf* file with a new version number and change the variables *PIKA_VERSION*, *COLLECTD_VERSION*, *LIKWID_VERSION* and, if necessary, *LIKWID_VERSION_SHA*.
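
The steps above can be sketched as follows. The version numbers and the fabricated minimal *pika-1.2.conf* are for illustration only; in practice you would copy the real file and also update the other version variables:

```shell
# Fabricate a minimal pika-1.2.conf so this sketch is self-contained;
# normally the file already exists in the PIKA source directory.
printf 'PIKA_VERSION=1.2\nCOLLECTD_VERSION=5.10.0\n' > pika-1.2.conf

cp pika-1.2.conf pika-1.3.conf
sed -i 's/^PIKA_VERSION=.*/PIKA_VERSION=1.3/' pika-1.3.conf
ln -sfn pika-1.3.conf pika-current.conf   # repoint the version symlink

grep '^PIKA_VERSION=' pika-current.conf   # -> PIKA_VERSION=1.3
```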

## Data Collection
Files that are required for the execution of the monitoring daemon (collectd) are located in the daemon folder. This includes the collectd configuration file and the LIKWID event group files, as well as scripts that are periodically triggered to perform log rotation and error detection. For detailed instructions see the [README.md](daemon/README.md) in the daemon directory.

## Job Control
Prolog and epilog scripts ensure that the PIKA package is installed and the daemon is running. Corresponding files are located in the folder job_control. For detailed instructions on Taurus see the [README.md](job_control/slurm/taurus/README.md) in the job_control directory.

## Post-Processing
Post-processing includes the backup and analysis of the recorded job data. For detailed instructions see the [README.md](post-processing/README.md) in the post-processing directory.

## How Components Are Connected
![Flow Graph](./flow_graph.svg)

### What is written/read or sent/received?
1) JOB_RECORD__COMMENT, JOB_RECORD__WHOLE_NODE, JOB_RECORD__CPU_CNT, JOB_RECORD__CPU_IDS, JOB_RECORD__NODE_NAMES, JOB_RECORD__TIME_LIMIT, !!JOB_NAME!!, !!JOB_ACCOUNT!!
(The environment variables SLURM_JOB_ID, SLURM_NODELIST, SLURM_JOB_USER and SLURM_JOB_PARTITION are available within the prolog.)

2) SLURM_JOB_ID, SLURM_JOB_USER, JOB_ACCOUNT, STATUS='running', NUM_NODES, SLURM_NODELIST, CPULIST, JOB_RECORD__CPU_CNT, START=`date +%s`, JOB_NAME, JOB_RECORD__TIME_LIMIT, SLURM_JOB_PARTITION, JOB_RECORD__WHOLE_NODE, ARRAY_ID<br>
NUM_NODES ... via `$(nodeset -c ${SLURM_NODELIST})`<br>
CPULIST ... generated from JOB_RECORD__CPU_IDS and JOB_RECORD__NODE_NAMES<br>
ARRAY_ID ... for non-array jobs 0, otherwise ... (currently not available)
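
`nodeset` is part of ClusterShell. As a rough illustration, a pure-bash approximation for the two simplest nodelist forms could look like this (it handles only a single bracket range; the real prolog relies on `nodeset`, which supports the full Slurm syntax):

```shell
# Rough approximation of `nodeset -c` for simple Slurm nodelists,
# e.g. "taurusi[6001-6004]" or "node1,node2,node3" (illustration only).
count_nodes() {
  local list=$1
  if [[ $list == *\[*-*\]* ]]; then
    local range=${list#*\[}        # strip everything up to the bracket
    range=${range%%\]*}            # keep "6001-6004"
    local lo=${range%-*} hi=${range#*-}
    echo $(( 10#$hi - 10#$lo + 1 ))  # 10# avoids octal on leading zeros
  else
    echo "$list" | tr ',' '\n' | grep -c .   # plain comma-separated names
  fi
}

count_nodes "taurusi[6001-6004]"   # -> 4
count_nodes "node1,node2,node3"    # -> 3
```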

3) Update STATUS='completed|timeout', JOB_END=`date +%s`, PROPERTY_ID for a SLURM_JOB_ID and START<br>
PROPERTY_ID ... bit field which defines several properties, e.g. monitoring was disabled, incomplete Slurm data<br>
(Delete jobs shorter than one minute.)
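
A bit field like PROPERTY_ID can be composed and queried with bitwise operators. The flag values below are illustrative placeholders, not the actual PIKA bit assignments:

```shell
# Illustrative bit assignments (NOT the actual PIKA values):
FLAG_MONITORING_DISABLED=1     # bit 0
FLAG_INCOMPLETE_SLURM_DATA=2   # bit 1

# Combine properties with bitwise OR:
PROPERTY_ID=$(( FLAG_MONITORING_DISABLED | FLAG_INCOMPLETE_SLURM_DATA ))
echo "$PROPERTY_ID"            # -> 3

# Check a single property with bitwise AND:
if (( PROPERTY_ID & FLAG_INCOMPLETE_SLURM_DATA )); then
  echo "incomplete Slurm data"
fi
```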

4) Chunks/batches of time-series data. For a complete list of metrics see [daemon/collectd](daemon/collectd).

<!--Update job metadata according to the SLURM backup database, see [revise_mariadb.py](post_processing/revise_mariadb.py)

|Job_Data (PIKA)|taurus_job_table (SLURM backup)|
|---|---|
|PROJECT|account |
|STATUS|state (convert id to string)|
|NUM_CORES|cpus_req |
|NAME|job_name |
|SUBMIT| time_submit|
|P_PARTITION|partition |
|EXCLUSIVE| nodes_alloc*(core number per partition) |
|ARRAY_ID|id_array_job |-->

5) From "taurus_job_table": account, state (convert id to string), cpus_req, job_name, time_submit, partition, nodes_alloc, id_array_job

## Evaluation Test

Scripts to determine the scalability and overhead of the monitoring, as well as regression tests, are located in the test folder.