Merkle Tree Filter Module

Miguel Guimarães edited this page Mar 25, 2020 · 4 revisions

In cryptography and computer science, a hash tree or Merkle tree is a tree in which every leaf node is labelled with the cryptographic hash of a data block, and every non-leaf node is labelled with the hash of the labels of its child nodes. Hash trees allow efficient and secure verification of the contents of large data structures. Hash trees are a generalization of hash lists and hash chains. (Source: Wikipedia.)
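The definition above can be illustrated with a minimal Python sketch. It is not the module's actual implementation: the function name `merkle_root`, the way data blocks are supplied, and the rule of duplicating the last node when a level has an odd number of nodes are all assumptions made for illustration.

```python
import hashlib


def merkle_root(blocks, algorithm="sha256"):
    """Return the top hash (uppercase hex) of a list of byte blocks.

    Hypothetical helper: how dbptk builds its leaves and handles odd
    levels is not described here, so both choices below are assumptions.
    """
    if not blocks:
        raise ValueError("at least one data block is required")

    # Leaf nodes: the cryptographic hash of each data block.
    level = [hashlib.new(algorithm, block).digest() for block in blocks]

    # Non-leaf nodes: the hash of the concatenated child labels.
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: pair a trailing odd node with a copy of itself.
            level.append(level[-1])
        level = [
            hashlib.new(algorithm, level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]

    return level[0].hex().upper()
```

Because every parent depends on its children, changing any single block changes the top hash, which is what makes one value sufficient to check a whole data set.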

Use Case

This filter module generates a single hash that represents the content of a large data structure. Its purpose is to help fulfill the completeness and correctness requirements of the archival process, i.e. to ensure that no message is lost (not archived, or archived incorrectly).

The module produces a JSON file with the following information:

  • The cryptographic algorithm used to digest the data;
  • Information about the schemas, tables, and columns used for the calculation;
  • The top-hash value.

Optionally, the file can also contain every leaf node used for the calculation, for debugging purposes.

Below is an example of the file generated by this filter module.

{
    "merkle": {
        "algorithm": "SHA-256",
        "schemas": [
            {
                "sakila": {
                    "tables": [
                        {
                            "actor": {
                                "columns": [
                                    "actor_id",
                                    "first_name",
                                    "last_name",
                                    "last_update"
                                ]
                            }
                        }
                    ]
                }
            }
        ],
        "topHash": "EC2DF58B4EEF15E3CF23E5415F75F69AFCB2D566230AB8AF1F2412E56C97549D"
    }
}
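As a rough illustration of how such a file might be consumed, the Python sketch below reads a report with the structure shown in the example and compares two top hashes. The helper names `summarize_report` and `same_top_hash` are hypothetical and not part of dbptk. The case-insensitive comparison is an assumption, made because the hash can be written in either letter case.

```python
import json

# Sample report that follows the structure of the example file above.
SAMPLE_REPORT = """
{
    "merkle": {
        "algorithm": "SHA-256",
        "schemas": [{"sakila": {"tables": [{"actor": {"columns": [
            "actor_id", "first_name", "last_name", "last_update"
        ]}}]}}],
        "topHash": "EC2DF58B4EEF15E3CF23E5415F75F69AFCB2D566230AB8AF1F2412E56C97549D"
    }
}
"""


def summarize_report(text):
    """Return (algorithm, top hash, list of 'schema.table.column' names)."""
    report = json.loads(text)["merkle"]
    columns = []
    for schema_entry in report["schemas"]:
        for schema_name, schema in schema_entry.items():
            for table_entry in schema["tables"]:
                for table_name, table in table_entry.items():
                    for column in table["columns"]:
                        columns.append(f"{schema_name}.{table_name}.{column}")
    return report["algorithm"], report["topHash"], columns


def same_top_hash(hash_a, hash_b):
    """Compare two hex digests, ignoring letter case (an assumption)."""
    return hash_a.strip().lower() == hash_b.strip().lower()
```

A typical check would run the filter once on the source database and once on the archived copy, then pass the two `topHash` values to `same_top_hash`. A mismatch indicates that the selected columns differ between the two runs.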

How to use

dbptk migrate -i import-config -if <path> -e siard-2 -ef <path> -f merkle-tree -f1f <path>

Advanced configurations

Change the cryptographic algorithm

dbptk migrate -i import-config -if <path> -e siard-2 -ef <path> -f merkle-tree -f1f <path> -f1d SHA-256

Change the letter case

dbptk migrate -i import-config -if <path> -e siard-2 -ef <path> -f merkle-tree -f1f <path> -f1fc lowercase

Activate the debug mode

dbptk migrate -i import-config -if <path> -e siard-2 -ef <path> -f merkle-tree -f1f <path> -f1e

Integration with the import-config module

By default, every column is used for the Merkle tree calculation. It is also possible to explicitly choose which columns to include. This is done with the import-config module; see that module's documentation for details.