From b4f0a39274659af6d0844861c4eec1d42d3248fb Mon Sep 17 00:00:00 2001
From: Kostis Sagonas
Date: Fri, 29 Dec 2023 09:37:10 +0100
Subject: [PATCH] Changes in the README files

---
 AEC-README.md | 69 +++++++++++++++++++++++++++-------------------
 README.md     | 55 ++++++++++++++++++++++++----------------
 2 files changed, 69 insertions(+), 55 deletions(-)

diff --git a/AEC-README.md b/AEC-README.md
index 82b02a7..c03b8a7 100644
--- a/AEC-README.md
+++ b/AEC-README.md
@@ -15,14 +15,14 @@ For both early light and full review, setup must be performed.
 All dependencies are included with this artifact, i.e., no
 network connection is required.
 First, the system dependencies need to be installed.
-When prompted, enter the sudo password (same as username). This
-will take some minutes.
+When prompted, enter the sudo password (same as username).
+This step will complete in a few minutes.
 ```
 ./install-deps.sh
 ```
 
 Next, build the RaLib package. No administrative privileges are
-requires. This will also take some minutes, especially during unit testing.
+required. This step also takes a few minutes.
 ```
 ./build.sh
 ```
@@ -30,43 +30,46 @@ requires. This will also take some minutes, especially during unit testing.
 ## Reproducing Experimental Data
 
 Three scripts are provided that regenerate the experimental
-output data in `results/`. Initially, this directory will
-contain data produced by us which will be overwritten. If some
-of the scripts are not run, our data will be used in the respective
-parts of the table and figures.
+output data in `results/`.
 
-The first command uses counterexample search and should
-complete in less than ten minutes:
+The first command uses counterexample search and, depending
+on the speed of your machine, should complete in 5 to 20 minutes:
 ```
 ./run-experiments-with-ce-search-tacas2024.sh
 ```
-It generates the data presented
-in lines 7 and 8 of Table 1. The data for the other lines and figures
-is not recreated.
-If you are performing an **early light evaluation**,
-you may want to skip the next two commands and proceed to reproducing
-table and figures.
+It generates the data for all but the last row of Table 1.
+As you will notice, it runs each benchmark for 20 iterations
+and reports averages and standard deviations from the data collected.
+It also reports how many of these 20 iterations were successful.
+
+If you are performing an **early light evaluation**, you probably
+want to generate a table similar to Table 1 at this point.
+You can do this with the following command:
+```
+./generate-table-tacas2024.sh
+```
+which will leave the table in `plots/tacas2024-table-results.pdf`.
+
+You can now view this table or proceed directly to generating
+the data for the two figures and data for the fifo7 benchmark
+(the missing last row of Table 1).
 
 The second command uses model checking to find counterexamples
-and should complete in ca. three hours:
+and should complete in a few (3-7) hours:
 ```
 ./run-experiments-model-checker-tacas2024.sh
 ```
-It generates the data presented
-in the top six lines of Table 1 as well as both Figures. Lines 7 to 9 of
-the table are not recreated.
+It generates the data for the two figures (Figs. 4 and 5).
 
-The third command uses counterexample search on larger models
-and may require more that one day on mid-end machines:
+The third command generates the data for the last row of Table 1.
+Be warned that it may require more than a day on mid-end machines.
 ```
 ./run-big-experiments-with-ce-search-tacas2024.sh
 ```
-It generates
-the data presented in line 9 of Table 1. The data for the other lines
-and the figures is not recreated.
 
 ## Reproducing Table and Figures
 
-The plots can be re-rendered using two scripts
+The plots can be created using two scripts:
 ```
 ./generate-table-tacas2024.sh
 ./generate-plots-tacas2024.sh
@@ -106,7 +109,7 @@ as follows:
 1. **benchmark problems** are located in ```benchmarks/```.
    Problems are collected from papers on RaLib, from the
    [automata wiki](https://automata.cs.ru.nl), and some
-   are generated or manually created for this project.
+   are generated or manually created.
 2. **configurations** of RaLib are located in ```configs/```.
    The different configurations are used in different
    series of experiments and specify aspects of the
@@ -125,7 +128,7 @@ as follows:
    in ```plots/```.
 
 We ship experiment results and plots generated by us in
 these locations.
 
-## Running experiments
+## Running Experiments
 
 Generally, experiments are organized in series and
 configurations do not specify which learner should
@@ -136,25 +139,25 @@ series, and complete evaluations.
 ### Experiments for the evaluation of CT-based learners (TACAS'24)
 
 This section describes how one can run the experiments
-that are reported in the TACAS 2024 submission.
+that are reported in the TACAS 2024 paper.
 Concrete results may vary a little
 bit for every execution due to randomization but trends
 should be stable.
 
-1. Running experiments from TACAS 2024 submission
+1. Running experiments from TACAS 2024 paper
 
    ```
-  ./run-experiments-model-checker-tacas2024.sh
   ./run-experiments-with-ce-search-tacas2024.sh
+  ./run-experiments-model-checker-tacas2024.sh
   ./run-big-experiments-with-ce-search-tacas2024.sh
   ```
   results (logs and models) can be found in ```results/```.
-  The expected running times are ca. 3 hours, less than ten
-  minutes, and (on mid-end machines) over a day, respectively.
-
+  The expected running times are between 5 and 20 minutes,
+  a few hours, and (on mid-end machines) over a day,
+  respectively.
-2. Generating table in TACAS 2024 submission
+2. Generating table in TACAS 2024 paper
 
    this step requires latexmk and a latex distribution
    to be installed
 
diff --git a/README.md b/README.md
index c298bf8..f0d912d 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,13 @@
-# RaLib Benchmarking Setup
+# RaLib Benchmarking Repository
 
-This project contains benchmark examples and scripts to
-run RaLib on these benchmarks. The project is organized
-as follows:
+This repository contains benchmark examples and scripts to run
+[RaLib](https://github.com/LearnLib/ralib) on these benchmarks.
+The repository is organized as follows:
 
 1. **benchmark problems** are located in ```benchmarks/```.
    Problems are collected from papers on RaLib, from the
    [automata wiki](https://automata.cs.ru.nl), and some
-   are generated or manually created for this project.
+   are generated or manually created.
 2. **configurations** of RaLib are located in ```configs/```.
    The different configurations are used in different
    series of experiments and specify aspects of the
@@ -16,7 +16,7 @@ as follows:
    teachers/theories for types, etc.).
 3. **experiments** are organized in series in ```experiments/```.
    Series combine benchmark problems with a configuration.
-   The different series in this project originate from
+   The different series in this repository originate from
    multiple papers on RaLib / register automata learning.
 4. **results** from running experiments will be stored in
    ```results/```. Results include logs and models from
@@ -30,44 +30,56 @@ as follows:
 
 ## RaLib Installation
 
-To install RaLib, you have to initialize the submodule
-and build RaLib, which requires a JDK and Maven to be
-installed.
+To install RaLib on a machine with internet access, you have
+to initialize the submodule with the following commands.
+Note that the TACAS 2024 artifact already contains RaLib's
+code, so you can skip this initial step and simply execute
+the build script.
+
+```
+git submodule init
+git submodule update
+```
+
+You can then build RaLib with the following command,
+which requires a JDK and Maven to be installed.
 
 ```
 ./build.sh
 ```
 
-## Running experiments
+## Running Experiments
 
-Generally, experiments are organized in series and
+Generally, experiments are organized in series, and
 configurations do not specify which learner should
 be used or how many times experiments should be run.
 Shell scripts exist for running individual experiments,
 series, and complete evaluations.
 
+Note that most of RaLib's algorithms use randomization,
+so concrete results typically vary for every execution.
+Thus, some of the scripts below run a number of
+iterations and report averages and standard deviations.
+
 ### Experiments for the evaluation of CT-based learners (TACAS'24)
 
 This section describes how one can run the experiments
-that are reported in the TACAS 2024 submission.
-Concrete results may vary a little bit for every
-execution due to randomization but trends should be
-stable.
+that are reported in the TACAS 2024 paper.
 
-1. Running experiments from TACAS 2024 submission
+1. Running experiments from TACAS 2024 paper
 
   ```
-  ./run-experiments-model-checker-tacas2024.sh
   ./run-experiments-with-ce-search-tacas2024.sh
+  ./run-experiments-model-checker-tacas2024.sh
   ./run-big-experiments-with-ce-search-tacas2024.sh
   ```
   results (logs and models) can be found in ```results/```.
-  The expected running times are ca. 3 hours, less than ten
-  minutes, and (on mid-end machines) over a day, respectively.
+  The expected running times are between 5 and 20 minutes,
+  a few hours, and (on mid-end machines) over a day,
+  respectively.
-
-2. Generating table in TACAS 2024 submission
+2. Generating table in TACAS 2024 paper
 
    this step requires latexmk and a latex distribution
    to be installed
 
@@ -147,4 +159,3 @@ all experiments that are to be run as part of the evaluation.
 ```
 Searches for `search_term` in the logs from all iterations.
-
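Both READMEs touched by this patch state that the benchmark scripts run each experiment for 20 iterations and report averages and standard deviations over the collected data. As a standalone illustration of that aggregation (a minimal sketch, not code from the artifact; the timing values below are made up), the computation amounts to:

```python
import statistics

# Hypothetical per-iteration running times in seconds (made-up values,
# standing in for the iterations a benchmark script performs).
timings = [12.1, 11.8, 12.4, 13.0, 11.9]

mean = statistics.mean(timings)    # average over iterations
stdev = statistics.stdev(timings)  # sample standard deviation

print(f"avg = {mean:.2f}s, stdev = {stdev:.2f}s")
```

A successful-iteration count, as mentioned for the counterexample-search script, would simply be the number of iterations whose log reports a learned model.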