Commit d750b72f authored by Robert Dietrich

Merge branch 'master' of gitlab.hrz.tu-chemnitz.de:pika/monitoring

parents a0d7503a c8f1a206
@@ -3,7 +3,7 @@
PIKA uses the collection daemon collectd to read metrics and send them to an instance of InfluxDB. To capture all metrics relevant to PIKA, we have developed some additional plugins (see [collectd/collectd-plugins/](collectd/collectd-plugins/)).
Since collectd does not provide log rotation, we added this feature based on *logrotate*.
Furthermore, we use this mechanism to regularly check the log files for errors, summarize them in a shared directory, and send a report once per day to registered email addresses (only if errors occurred). For more information, see the [README.md](logrotate/README.md) in the logrotate directory.
## PIKA Perfgroups for LIKWID
......
# Error Detection Using Logrotate
To detect general errors in any part of the monitoring process, we write and analyze log files.
There are different log files for the monitoring daemon, the job prolog and epilog as well as the post-processing of the stored data.
Currently, we simply grep these log files for keywords such as `error`, `failure`, and `outlier`.
If a keyword is found, the respective log file is saved under the name of the compute node to a shared file system and an email is sent to the administrators.
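
The keyword check described above could be sketched as follows. This is only an illustration, not the script shipped in the repository: the function name `scan_log`, the naming scheme for the copied file, and the directory arguments are assumptions, and sending the email is left out.

```shell
#!/bin/bash
# Sketch only: scan one log file for error keywords.
# scan_log and the "<node>_<logfile>" naming scheme are illustrative assumptions.

# Usage: scan_log <logfile> <shared_dir> [node_name]
# Returns 0 and copies the log to <shared_dir> if a keyword is found, 1 otherwise.
scan_log() {
    local logfile="$1" shared_dir="$2" node="${3:-$(hostname -s)}"
    # -q: no output, -i: case-insensitive, -E: extended regular expression
    if grep -qiE 'error|failure|outlier' "$logfile"; then
        mkdir -p "$shared_dir"
        cp "$logfile" "$shared_dir/${node}_$(basename "$logfile")"
        return 0
    fi
    return 1
}
```

A caller would pass the log file and the shared error directory, both of which are site-specific, and could trigger the notification only when `scan_log` returns 0.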
## TODO:
- Specify location for error logfiles
- Register email addresses
- Create cronjob...
\ No newline at end of file
@@ -32,6 +32,10 @@ The [collectd daemon patch](compute_node/patches/collectd-5.11.0_daemon.patch) a
The [LIKWID Set-Counters-Patch](compute_node/patches/pika_likwid-5.0.0_src.patch) adds an API function to (re)set the active counters.
All other patches are deprecated with the most recent versions of collectd and LIKWID.
**Note**: Each compute node needs the MySQL client for the prolog/epilog scripts. If it is not available, you can install it via:

    yum install mysql
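
Before installing, it may help to check whether the client is already present on a node. The following is a minimal sketch; the helper name `have_cmd` is an assumption made for illustration.

```shell
#!/bin/bash
# Sketch: report whether a command is available on the PATH.
# have_cmd is a hypothetical helper, not part of PIKA.
have_cmd() { command -v "$1" >/dev/null 2>&1; }

if have_cmd mysql; then
    echo "mysql client found: $(command -v mysql)"
else
    echo "mysql client missing; install it as described above" >&2
fi
```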
## Databases for Time-Series Data and Metadata
It is recommended to install the databases on different systems or virtual machines, each having access to fast storage.
@@ -97,6 +101,28 @@ Add MariaDB access port to your firewall policy, e.g. for RHEL with

    firewall-cmd --zone=public --permanent --add-port=3306/tcp
    systemctl restart firewalld
Modify MariaDB’s configuration file at the following location:

    vi /etc/my.cnf

    [mysqld]
    datadir=/var/lib/mysql
    socket=/var/lib/mysql/mysql.sock
    ignore-db-dir=lost+found
    max-connections=500

    [mysqld_safe]
    log-error=/var/log/mariadb/mariadb.log
    pid-file=/var/run/mariadb/mariadb.pid

**Note**: It is recommended to allow at least 500 connections.

Start the MariaDB server:

    systemctl restart mariadb
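
To confirm that the connection limit is actually set in the configuration file, a small helper along these lines could be used. It is a sketch under assumptions: the function name `get_max_connections` is hypothetical, and it only reads the file given to it, not the value of the running server.

```shell
#!/bin/bash
# Sketch: print the max-connections value configured in a my.cnf-style file.
# get_max_connections is a hypothetical helper, not part of PIKA or MariaDB.

# Usage: get_max_connections <config_file>
get_max_connections() {
    # MariaDB accepts both "max-connections" and "max_connections";
    # if the option appears more than once, print the last occurrence.
    sed -n -E 's/^[[:space:]]*max[-_]connections[[:space:]]*=[[:space:]]*([0-9]+).*/\1/p' "$1" | tail -n 1
}
```

For example, `get_max_connections /etc/my.cnf` prints the configured number, or nothing if the option is absent. The value in effect on the running server can be queried with `mysql -e "SHOW VARIABLES LIKE 'max_connections';"`, given suitable credentials.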
Create a database and an admin user using the mysql shell:
......
frank.winkler@tu-dresden.de # list email addresses to get notifications from logrotate
robert.dietrich@tu-dresden.de #example@mail.de