# PIKA - Center-Wide and Job-Aware Cluster Monitoring

PIKA is an infrastructure for continuous monitoring and analysis of HPC systems.
It uses the collection daemon collectd to gather metrics, InfluxDB to store time-series data, and MariaDB to store job metadata.
It also provides a web frontend for visualizing job data.

Files required to run the monitoring daemon (collectd) are located in the daemon folder. This includes the collectd configuration file and the LIKWID event group files, as well as scripts that are triggered periodically to perform log rotation and error detection.

Prolog and epilog scripts ensure that the PIKA package is installed and the daemon is running. The corresponding files are located in the job_control folder.
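The "start the daemon only if it is not already running" check can be sketched as follows. This is a minimal illustration, not the actual PIKA prolog: the function names `daemon_is_running` and `ensure_daemon`, and the use of a PID file, are assumptions made for this example.

```bash
#!/bin/bash
# Hypothetical prolog helpers; names and the PID-file approach are
# illustrative assumptions, not taken from the PIKA scripts.

# Return 0 if the PID file exists and names a live process.
daemon_is_running() {
    local pidfile="$1" pid
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only checks that the process can be signalled.
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Run the start command only if the daemon is not already running.
# $1: PID file, remaining arguments: command that starts the daemon.
ensure_daemon() {
    local pidfile="$1"
    shift
    if daemon_is_running "$pidfile"; then
        echo "already running"
    else
        "$@" && echo "started"
    fi
}
```

A prolog could then call `ensure_daemon` with the PID file and start command that match the local collectd setup. Making the check idempotent keeps repeated job starts on the same node from launching a second daemon.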

Scripts for post-processing, such as the generation of footprints, are located in the post-processing folder.
Scripts to determine the scalability and overhead of the monitoring, as well as regression tests, are located in the test folder.

## Installation
The software stack consists of several components and tools. 
To simplify the installation, appropriate install scripts are available. 
For detailed installation instructions, see the install directory.

## Configuration
Five files are used to configure the software stack. 

contains the global, version-independent configuration variables. It also sets some environment variables that are used in the job prolog and epilog. It sources *.pika_access*.

exports the environment variables with the access parameters for the databases. Thus, this file should have restricted read access. 
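A file of this kind could look roughly like the sketch below, which also shows one way to restrict read access to the owner. The file path, variable names, and values are placeholders invented for this example; the real access file defines its own.

```bash
# Hypothetical example: path, variable names, and values are placeholders.
access_file="$(mktemp -d)/access_example"

# Create the file under a restrictive umask so it is never world-readable.
( umask 077
  cat > "$access_file" <<'EOF'
export EXAMPLE_DB_USER="monitor"
export EXAMPLE_DB_PASSWORD="change-me"
EOF
)

# Set the mode explicitly as well, in case the file already existed.
chmod 600 "$access_file"

# Load the variables into the current shell.
. "$access_file"
```

Because the credentials live in a separate file, the other configuration files can stay readable to all users while only this one is locked down.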

is used for versioning of the PIKA package. It sets the PIKA package version along with the versions of collectd, LIKWID, and Python in use. Finally, it sources *pika.conf*.
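The sourcing chain described above (version file, then global configuration, then access parameters) can be illustrated with a small self-contained sketch. All file names, variable names, and version numbers below are placeholders chosen for this example and do not reflect the actual PIKA files.

```bash
# Hypothetical layout: file names, variables, and values are placeholders.
conf_dir=$(mktemp -d)

# Access parameters for the databases.
cat > "$conf_dir/access_example" <<'EOF'
export EXAMPLE_DB_HOST="db.example.org"
EOF

# Global, version-independent settings; sources the access file.
cat > "$conf_dir/global_example.conf" <<EOF
export EXAMPLE_ROOT="/opt/example"
. "$conf_dir/access_example"
EOF

# Version file: sets the versions, then sources the global configuration.
cat > "$conf_dir/version_example.conf" <<EOF
export EXAMPLE_PACKAGE_VERSION="1.0"
export EXAMPLE_COLLECTD_VERSION="5.10.0"
. "$conf_dir/global_example.conf"
EOF

# A script only has to source the version file to obtain all settings.
. "$conf_dir/version_example.conf"
echo "$EXAMPLE_PACKAGE_VERSION $EXAMPLE_ROOT $EXAMPLE_DB_HOST"
```

With this arrangement, switching to a new package version means pointing scripts at a different version file, while the global settings and credentials stay in one place.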

provides utility functions for prolog, epilog and other bash scripts.