# FROOM

The Framework of Operators for OTF2 Modification (FROOM) helps when you have one or more trace archives and want to modify them rather than only view them in Vampir. There are several possible reasons for this:

- the measurement infrastructure did not support a unified measurement, so creating one trace archive from many is required
- the trace archive should be made smaller because it should be shared with someone
- the trace archive contains irrelevant or private information which should be removed
- some external data should be incorporated into a trace
- some data should be extracted

Other tools provide only some of the functionality needed for the tasks mentioned above, whereas FROOM lets you express a complex pipeline of trace archive modifications that is then used to create a new trace archive. Because the output is again a trace archive, it can be analysed and viewed with the same tools as before. The FROOM domain-specific language (DSL) is easy to learn and apply to your set of trace archives.

## Getting started

```
git clone git@gitlab.hrz.tu-chemnitz.de:s2817051--tu-dresden.de/froom.git
```

### Installation

Install the dependencies first:

```bash
sudo apt-get install flex bison libcsv-dev libjansson-dev
```

Then download the latest OTF2 library from https://perftools.pages.jsc.fz-juelich.de/cicd/otf2

For the installation of OTF2, you need the Python modules `six` and `jinja2`, which can be installed using:

```bash
sudo apt-get install python3-six python3-jinja2
```

After that, configure the OTF2 installation. Use the option `--verbose` to see whether `Python for generator` support is enabled. If it is, you should see a line similar to this:

```
    Python for generator:       yes, using /usr/bin/python3 with 'jinja2' module in version 3.1.2
```

Proceed using `make` as usual.

Once OTF2 is installed, use the following command to build the `froom-interpreter`:

```bash
make OTF2_BASE= # put the path to the root folder of the OTF2 installation here
```

From now on, it should be sufficient to update to the most recent version by using:

```bash
git pull
```

### Operator overview

The list of operators in alphabetical order:

| Name | Description | Link | Command line program |
| ---- | ----------- | ---- | -------------------- |
| ChromeTraceSource | Transform a trace collection that was recorded for viewing in Chrome web browser into OTF2 | [Chrome trace example](#transforming-a-trace-recorded-for-viewing-in-chrome-web-browser-into-otf2) | froom-from-chrome-trace |
| CSVMetricSource | Transform metric data contained in a CSV file into OTF2 trace data | [CSV example](#add-data-from-a-csv-file-to-a-trace) | froom-from-csv-metric |
| LocationRemover | Remove one or more locations based on the number of events they have | [Location removing example](#remove-locations-from-a-trace-archive-based-on-the-number-of-events) | froom-remove-location |
| MPICommAdaptor | Add message events based on enter events of a specific format | [Messaging example](#merging-artificially-generated-mpi-events) | froom-merge-messages |
| OTF2Source | Read trace data from a trace archive | (used by most examples) | (part of most command line programs) |
| OTF2Sink | Write trace data to a trace archive | (used by all examples) | (part of all command line programs) |
| RegionRemover | Remove regions based on their names | [Region removing example](#remove-regions-from-a-trace-archive-based-on-their-name) | froom-remove-region |
| Renamer | Change strings, e.g. region names | [Renaming example](#mergingunification-and-renaming-of-trace-data) | froom-rename |
| TimeSlicer | Create a new trace archive containing only events from a particular time interval | [Time Slicing example](#cut-out-a-time-slice-from-an-existing-trace-archive) | froom-slice |
| Unifier | Merge/unify traces of multiple archives into one | [Merging example](#mergingunification-and-renaming-of-trace-data) | froom-unify |

### Usage examples

#### Merging/Unification (and Renaming) of Trace Data

In this example, it is assumed that one application with two parallel processes was running, but the measurement environment lacked the support to collect the performance data in one trace archive. Thus, two trace archives exist with similar timestamp ranges. Without loss of generality, it is assumed that the application used a master/worker approach with one of the processes being the master and the other being the worker. The goal is to create a single trace archive containing the performance data from both trace archives. To make the two measured processes clearly distinguishable, the "Master thread" in each process should be renamed to "Master" or "Worker", respectively.

The task can be expressed in a file `unify-master-worker.froom`:

```
OTF2Source(master) -> Renamer("Master thread" -> "Master") -> newMaster;
OTF2Source(worker) -> Renamer("Master thread" -> "Worker") -> Unifier(newMaster) -> OTF2Sink(unified);
```

Now, the task can be applied to some files, e.g. `./traces/master/traces.otf2` and `./traces/worker/traces.otf2`, from the command line:

```bash
$ froom-interpreter unify-master-worker.froom master=traces/master/traces.otf2 \
 worker=traces/worker/traces.otf2 unified=traces/unified
```

The first argument to the interpreter specifies the task to be solved. Further arguments of the form `name=value` resolve the variables used in the task description.
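
When many trace archives need the same treatment, the call can be assembled programmatically. The following Python sketch only builds the argument list in the form shown above. The helper name `build_froom_command` is hypothetical and not part of FROOM.

```python
from typing import List


def build_froom_command(script: str, **variables: str) -> List[str]:
    """Build the argument list for a froom-interpreter call.

    Hypothetical helper, not part of FROOM: the script comes first,
    followed by one name=value argument per variable used in the script.
    """
    command = ["froom-interpreter", script]
    for name, value in variables.items():
        command.append(f"{name}={value}")
    return command


command = build_froom_command(
    "unify-master-worker.froom",
    master="traces/master/traces.otf2",
    worker="traces/worker/traces.otf2",
    unified="traces/unified",
)
print(" ".join(command))
# To actually run it, pass the list to subprocess.run(command, check=True).
```

Building a list instead of a shell string avoids quoting problems when paths contain spaces.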

#### Cut out a Time Slice from an existing trace archive

The task to retain only the trace data in the interval from 2 seconds to 10.1 seconds is described in `timeslice.froom`:

```
OTF2Source(in) -> TimeSlicer(from=2, to=10.1) -> OTF2Sink(out);
```

The new trace archive can be created like this:

```bash
$ froom-interpreter timeslice.froom in=traces/to-be-cut/traces.otf2 out=traces/final
```

#### Add data from a CSV file to a trace

The task is described in `csv-adder.froom`:

```
OTF2Source("traces/input/traces.otf2") -> trace;
// The time column contains relative timestamps (offset from some starting point) and we only need one metric:
CSVMetricSource("metrics.csv", itemseparator="\n", propertyseparator=",", time=Column(1),
 tickspersecond=10000, referencetime="2023-02-10T15:48:54.123", metrics=(value=Column(2),
 unit="B")) -> metric;

// Alternatively, if the time column contains ISO-formatted timestamps and two metrics are needed:
CSVMetricSource("metrics.csv", itemseparator="\n", propertyseparator=",", isotime=Column(1),
 metrics=(value=Column(2),unit="B"),(value=Column(3),unit="ms")) -> metric;
metric -> Unifier(trace) -> OTF2Sink("traces/final");
```

This time, the paths are contained in the script, so the new trace archive can be created using:

```bash
$ froom-interpreter csv-adder.froom
```
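
To try the variant with ISO-formatted timestamps, a matching `metrics.csv` is needed. The following Python sketch generates one under some assumptions that the text above does not confirm: columns are counted from 1, the file has no header row, and timestamps look like the `referencetime` value in the script. The sample values are made up.

```python
import csv
import io
from datetime import datetime, timedelta

# Hypothetical sample data: (memory in bytes, latency in ms), one row per second.
samples = [(1024, 3.5), (2048, 4.0), (4096, 2.25)]
start = datetime(2023, 2, 10, 15, 48, 54, 123000)

buffer = io.StringIO()
# "," separates properties and "\n" separates items, as in the script above.
writer = csv.writer(buffer, delimiter=",", lineterminator="\n")
for i, (size_bytes, latency_ms) in enumerate(samples):
    timestamp = start + timedelta(seconds=i)
    # Column 1: ISO timestamp, column 2: bytes, column 3: milliseconds.
    writer.writerow(
        [timestamp.isoformat(timespec="milliseconds"), size_bytes, latency_ms]
    )

csv_text = buffer.getvalue()
print(csv_text, end="")
# Write csv_text to "metrics.csv" to use it with the script above.
```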

#### Remove locations from a trace archive based on the number of events

To remove locations that have only very few events and are therefore meaningless to the analysis, put a script similar to this into `remove-locations.froom`:

```
OTF2Source("input/traces.otf2") -> LocationRemover(
  eventCount <= /* any number that you find suitable: */ 42
) -> OTF2Sink("final");
```

The operation can be applied like this:

```bash
$ froom-interpreter remove-locations.froom
```
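
To pick a suitable threshold, it can help to count the events per location first. The following Python sketch does this for text lines in the style of `otf2-print` event output. It assumes that each event line starts with the event type, followed by the numeric location ID and the timestamp. The function name and the sample lines are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def count_events_per_location(lines: Iterable[str]) -> Dict[int, int]:
    """Count event lines per location ID.

    Assumes that each event line has the event type in the first column,
    the numeric location ID in the second and the timestamp in the third.
    Other lines (headers, separators) are skipped.
    """
    counts: Counter = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 3 and fields[1].isdigit() and fields[2].isdigit():
            counts[int(fields[1])] += 1
    return dict(counts)


# Hypothetical excerpt of event output.
sample = [
    'ENTER   42     1683289351077941  Region: "main" <0>',
    'LEAVE   42     1683289351078100  Region: "main" <0>',
    'ENTER   7      1683289351077950  Region: "idle" <1>',
]
counts = count_events_per_location(sample)
threshold = 1
# Locations with at most `threshold` events are candidates for removal.
print([location for location, n in counts.items() if n <= threshold])
```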

#### Remove regions from a trace archive based on their name

To remove regions that share a common name pattern, put a script similar to this into `remove-regions.froom` (you can use extended regular expressions to specify the pattern):

```
OTF2Source("input/traces.otf2")
-> RegionRemover("^Log:.+broadcast") -> OTF2Sink("final");
```

The operation can be applied like this:

```bash
$ froom-interpreter remove-regions.froom
```
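
Before removing regions, it can be useful to preview which names a pattern would hit. The following Python sketch does this with Python's `re` module, which agrees with extended regular expressions for simple patterns like the one above but not in every detail (for example, `[[:alpha:]]` is not supported). It assumes that a pattern may match anywhere in the name unless it is anchored. The function name and the region names are hypothetical.

```python
import re
from typing import Iterable, List


def preview_removed_regions(pattern: str, region_names: Iterable[str]) -> List[str]:
    """Return the region names in which the pattern is found.

    Assumption: unanchored patterns may match anywhere in the name.
    """
    regex = re.compile(pattern)
    return [name for name in region_names if regex.search(name)]


# Hypothetical region names.
names = ["Log: start broadcast", "Log:broadcast", "compute", "MyLog: x broadcast"]
matched = preview_removed_regions("^Log:.+broadcast", names)
print(matched)
```

Note that `.+` requires at least one character between `Log:` and `broadcast`, so `Log:broadcast` is not matched.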

#### Merging artificially generated MPI events

In some situations, communication between processes should be recorded, but the communication is not directly supported by OTF2. In that case, the communication can often be recorded as normal region enter and leave events and mapped to MPI events with FROOM in a second step. FROOM takes the enter events and transforms them into `MPI_SEND` or `MPI_RECV` events, while removing the corresponding leave events. The input trace archives for FROOM can use arbitrary numbers as identifiers for communication partners (e.g. hashes built using IP addresses and port numbers). The only condition that FROOM puts on such input traces is that these identifiers stay constant between two communication partners. The following format is required for the region names:

```
FAKE_<operation>;<sender>;<receiver>;<messageSize>;<tag>
```

- `<operation>` is either `SEND` or `RECV`
- `<sender>`/`<receiver>` is a string not containing `;`
- `<messageSize>` is an integer representing the size of the message sent in the range `[0-18446744073709551615]` (`uint64_t`)
- `<tag>` is an integer representing the tag of the message in the range `[0-4294967295]` (`uint32_t`)

`otf2-print -A traces.otf2` should give lines such as the following:

```
ENTER   42     1683289351077941  Region: "FAKE_RECV;localhost:4711;myserver:7077;1407;9" <131>
```
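
On the measurement side, such region names have to be generated in exactly this format. The following Python sketch builds and parses them, checking the constraints listed above. The names `FakeMessage`, `format_region_name` and `parse_region_name` are hypothetical and not part of FROOM.

```python
from typing import NamedTuple

UINT64_MAX = 2**64 - 1  # 18446744073709551615
UINT32_MAX = 2**32 - 1  # 4294967295


class FakeMessage(NamedTuple):
    operation: str
    sender: str
    receiver: str
    message_size: int
    tag: int


def format_region_name(msg: FakeMessage) -> str:
    """Build a region name in the FAKE_... format, validating every field."""
    if msg.operation not in ("SEND", "RECV"):
        raise ValueError("operation must be SEND or RECV")
    if ";" in msg.sender or ";" in msg.receiver:
        raise ValueError("sender and receiver must not contain ';'")
    if not 0 <= msg.message_size <= UINT64_MAX:
        raise ValueError("messageSize must fit into uint64_t")
    if not 0 <= msg.tag <= UINT32_MAX:
        raise ValueError("tag must fit into uint32_t")
    return (
        f"FAKE_{msg.operation};{msg.sender};{msg.receiver};"
        f"{msg.message_size};{msg.tag}"
    )


def parse_region_name(name: str) -> FakeMessage:
    """Split a FAKE_... region name back into its five fields."""
    prefix = "FAKE_"
    if not name.startswith(prefix):
        raise ValueError("not a FAKE_ region name")
    operation, sender, receiver, size, tag = name[len(prefix):].split(";")
    msg = FakeMessage(operation, sender, receiver, int(size), int(tag))
    format_region_name(msg)  # reuse the validation
    return msg


name = format_region_name(
    FakeMessage("RECV", "localhost:4711", "myserver:7077", 1407, 9)
)
print(name)
```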

The FROOM script to merge trace archives and modify MPI ranks is given in `mpi-adaptor.froom`:

```
OTF2Source("spark-org.apache.spark.deploy.master.Master/traces.otf2")
-> Renamer("Master thread" -> "Master")   -> newmaster;
OTF2Source("spark-org.apache.spark.deploy.worker.Worker/traces.otf2")
-> Renamer("Master thread" -> "Worker")   -> newworker;
OTF2Source("spark-org.apache.spark.executor.CoarseGrainedExecutorBackend/traces.otf2")
-> Renamer("Master thread" -> "Executor") -> newexecutor;
OTF2Source("spark-org.apache.spark.deploy.SparkSubmit/traces.otf2")
-> Renamer("Master thread" -> "Client")   -> newclient;

newclient -> Unifier(newexecutor, newworker, newmaster)
-> MPICommAdaptor -> OTF2Sink("unified-mpi");
```

The operation can be applied like this:

```bash
$ froom-interpreter mpi-adaptor.froom
```

#### Transforming a trace recorded for viewing in Chrome web browser into OTF2

When you have recorded a trace in Chrome trace format (e.g. using TensorFlow), you can transform it into OTF2 using the following `chrome.froom` script:

```
ChromeTraceSource(chrometrace) -> OTF2Sink("transformed");
```

The operation can be applied like this:

```bash
$ froom-interpreter chrome.froom chrometrace=mychrometrace.json
```

## Acknowledgment

This work was supported by the German Federal Ministry of Education and Research (BMBF, SCADS22B) and the Saxon State Ministry for Science, Culture and Tourism (SMWK) by funding the competence center for Big Data and AI "ScaDS.AI Dresden/Leipzig". The authors gratefully acknowledge the GWK support for funding this project by providing computing time through the Center for Information Services and HPC (ZIH) at TU Dresden.

## References

If you use FROOM for your work, we would be happy if you cite the following paper:

Jan Frenzel, Apurv Deepak Kulkarni, Sebastian Döbel, Bert Wesarg, Maximilian Knespel, and Holger Brunst. 2023. FROOM: A Framework of Operators for OTF2 Modification. In Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis (SC-W 2023), November 12-17, 2023, Denver, CO, USA. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3624062.3624209

## Support

The hope is that this README.md is self-explanatory. Please open issues if you feel that an improvement is necessary.

## Contributing

Contributions are very welcome! Feel free to share ideas by opening issues or contributing code.

## License

Please see [the License](LICENSE).

## Project status

This project is active, but the maintainers have a lot of other things to do. Thus, improvements (bug fixes, new features) only appear from time to time. So, please contribute!