This directory contains the Neo4j/Cypher implementation of the Interactive workload of the LDBC SNB benchmark.
In the recommended setup, the benchmark scripts (Bash) and the LDBC driver (Java 8) run on the host machine, while the Neo4j database runs in a Docker container. The requirements are therefore as follows:
- Bash
- Java 8
- Docker 19+
- enough free space in the directory ${NEO4J_CONTAINER_ROOT} (its default value is specified in scripts/vars.sh)
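The requirements above can be checked up front. The following is a minimal sketch, not part of the repository: check_tool and major_version are hypothetical helpers, and the fallback to the current directory when NEO4J_CONTAINER_ROOT is unset is an assumption (the real default is defined in scripts/vars.sh).

```shell
#!/usr/bin/env bash
# Sketch of a prerequisite check; not part of the repository.

# Print whether a tool is on the PATH; return non-zero if it is missing.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Extract the major version from a version string such as "19.03.8".
major_version() {
  echo "${1%%.*}"
}

# Report on each required tool without aborting on the first missing one.
for tool in bash java docker; do
  check_tool "$tool" || true
done

# Assumption: fall back to the current directory if NEO4J_CONTAINER_ROOT is unset
# (its real default is defined in scripts/vars.sh).
root="${NEO4J_CONTAINER_ROOT:-.}"
if [ -d "$root" ]; then
  df -h "$root"
else
  echo "directory not found: $root"
fi
```

To compare against the Docker 19+ requirement, major_version could be given the output of `docker version --format '{{.Server.Version}}'`.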
The default environment variables (e.g. Neo4j version and container name) are stored in scripts/vars.sh. Adjust these as you see fit.
From the pre-generated data sets in the SURF/CWI data repository, use the ones named social_network-csv_composite-longdateformatter-sf*.
The data sets need to be generated and preprocessed before loading them into the database. To generate such data sets, use the Hadoop-based Datagen's CsvComposite serializer classes with the LongDateFormatter date formatter:
ldbc.snb.datagen.serializer.dateFormatter:ldbc.snb.datagen.util.formatter.LongDateFormatter
ldbc.snb.datagen.serializer.dynamicActivitySerializer:ldbc.snb.datagen.serializer.snb.csv.dynamicserializer.activity.CsvCompositeDynamicActivitySerializer
ldbc.snb.datagen.serializer.dynamicPersonSerializer:ldbc.snb.datagen.serializer.snb.csv.dynamicserializer.person.CsvCompositeDynamicPersonSerializer
ldbc.snb.datagen.serializer.staticSerializer:ldbc.snb.datagen.serializer.snb.csv.staticserializer.CsvCompositeStaticSerializer
An example configuration for scale factor 1 is given in the params-csv-composite-longdateformatter.ini file of the Datagen repository.
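Before starting a long generation run, it may be worth confirming that the Datagen parameter file really selects the expected classes. A minimal sketch, assuming the key:value line format of the settings listed above and that the file sits in the current directory; uses_class is a hypothetical helper, not part of Datagen.

```shell
#!/usr/bin/env bash
# Return 0 if the params file sets the given key to a class whose name ends
# in the given simple class name. Assumes "key:value" lines.
uses_class() {
  # $1 = params file, $2 = key, $3 = simple class name
  grep -F "$2:" "$1" | grep -Eq "[.]$3[[:space:]]*\$"
}

# Assumption: the example file from the Datagen repository is in the current directory.
params="params-csv-composite-longdateformatter.ini"
if [ -f "$params" ]; then
  if uses_class "$params" ldbc.snb.datagen.serializer.dateFormatter LongDateFormatter; then
    echo "date formatter: ok"
  else
    echo "date formatter: not LongDateFormatter"
  fi
fi
```

The same helper can be called for the three serializer keys with their CsvComposite class names.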
Set the following environment variables based on your data source and where you would like to store the converted CSVs:
export NEO4J_VANILLA_CSV_DIR=`pwd`/test-data/vanilla
export NEO4J_CONVERTED_CSV_DIR=`pwd`/test-data/converted
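A quick sanity check of these two variables can catch a wrong path before any preprocessing starts. A minimal sketch, assuming the input files carry a .csv extension and may sit in subdirectories; has_csvs is a hypothetical helper, and the fallback paths simply mirror the example values above.

```shell
#!/usr/bin/env bash
# Return 0 if the directory exists and contains at least one .csv file
# (searched recursively).
has_csvs() {
  [ -d "$1" ] && [ -n "$(find "$1" -type f -name '*.csv' | head -n 1)" ]
}

# Fall back to the example paths if the variables are not exported.
vanilla="${NEO4J_VANILLA_CSV_DIR:-$(pwd)/test-data/vanilla}"
converted="${NEO4J_CONVERTED_CSV_DIR:-$(pwd)/test-data/converted}"

if has_csvs "$vanilla"; then
  echo "input CSVs found in $vanilla"
else
  echo "no CSV files found in $vanilla"
fi
echo "converted CSVs will be written to $converted"
```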
To load the data set, run the following script:
scripts/load-in-one-step.sh
This script preprocesses the CSVs in ${NEO4J_VANILLA_CSV_DIR} and places the resulting CSVs in ${NEO4J_CONVERTED_CSV_DIR}, stops any running Neo4j database instances, loads the database, and starts it.
The instructions below explain how to run the benchmark driver in one of the three modes (create validation parameters, validate, benchmark). For more details on the driver modes, check the "Driver modes" section of the main README.
To create validation parameters:
- Edit the driver/benchmark.properties file. Make sure that the ldbc.snb.interactive.scale_factor, ldbc.snb.interactive.updates_dir, and ldbc.snb.interactive.parameters_dir properties are set correctly and are in sync.
- Run the script:
  driver/create-validation-parameters.sh
To validate:
- Edit the driver/validate.properties file. Make sure that the validate_database property points to the file you would like to validate against.
- Run the script:
  driver/validate.sh
To run the benchmark:
- Edit the driver/benchmark.properties file. Make sure that the ldbc.snb.interactive.scale_factor, ldbc.snb.interactive.updates_dir, and ldbc.snb.interactive.parameters_dir properties are set correctly and are in sync.
- Run the script:
  driver/benchmark.sh
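To review whether the three ldbc.snb.interactive.* properties are in sync, it can help to print them side by side before starting the driver. A minimal sketch, assuming plain key=value lines in the properties file; get_prop is a hypothetical helper, not part of the driver.

```shell
#!/usr/bin/env bash
# Print the value of a key from a Java-style properties file.
# If the key occurs more than once, the last occurrence wins.
get_prop() {
  # $1 = properties file, $2 = key
  awk -F= -v key="$2" '
    { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k) }
    k == key {
      v = substr($0, index($0, "=") + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      val = v
    }
    END { print val }
  ' "$1"
}

props="driver/benchmark.properties"
if [ -f "$props" ]; then
  for key in scale_factor updates_dir parameters_dir; do
    echo "$key = $(get_prop "$props" "ldbc.snb.interactive.$key")"
  done
fi
```

The output lists the scale factor next to both directories, so a mismatch (e.g. a directory generated for a different scale factor) is easy to spot by eye.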
Use the scripts/backup-database.sh and scripts/restore-database.sh scripts to back up and restore the database. Alternatively, e.g. if you lack sudo rights, use Neo4j's built-in dump and load features through the scripts/backup-neo4j.sh and scripts/restore-neo4j.sh scripts.
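The restore script can be combined with the benchmark script, for instance to start every run from the same database state. A minimal sketch, assuming that a backup was created earlier with scripts/backup-database.sh and that restoring before each run is the desired workflow; run and benchmark_runs are hypothetical helpers, not part of the repository.

```shell
#!/usr/bin/env bash
# Print the command in dry-run mode (DRY_RUN=1), otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Restore the backup and then run the benchmark, the given number of times.
benchmark_runs() {
  n="$1"
  i=1
  while [ "$i" -le "$n" ]; do
    run scripts/restore-database.sh
    run driver/benchmark.sh
    i=$((i + 1))
  done
}

# Example (commented out so that loading this file has no side effects):
# DRY_RUN=1 benchmark_runs 2
```

With DRY_RUN=1 the helper only prints the commands, which is a cheap way to review the sequence before committing to a long run. To use the Neo4j dump-based variant, swap in scripts/restore-neo4j.sh.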