diff --git a/docs/api/dataset.md b/docs/api/dataset.md index 71ed9f76..714b3a79 100644 --- a/docs/api/dataset.md +++ b/docs/api/dataset.md @@ -8,7 +8,7 @@ This documentation doesn't provide full API reference for all members of `dataset` package. Instead, it concentrates on the Dataset that are immediately exposed to the users. Namely, we focus on `CCDataset`, `FIPSDataset` and their abstract base class `Dataset`. ```{tip} -The examples related to this package can be found at [common criteria notebook](./../notebooks/examples/cc.ipynb) and [fips notebook](./../notebooks/examples/fips.ipynb). +The examples related to this package can be found in the [common criteria notebook](./../notebooks/examples/cc.ipynb) and the [fips notebook](./../notebooks/examples/fips.ipynb). ``` ## CCDataset diff --git a/docs/api/model.md b/docs/api/model.md index ba8ca387..dc040ab9 100644 --- a/docs/api/model.md +++ b/docs/api/model.md @@ -6,7 +6,7 @@ ``` ```{tip} -The examples related to this package can be found at [model notebook](./../notebooks/examples/model.ipynb). +The examples related to this package can be found in the [model notebook](./../notebooks/examples/model.ipynb). ``` ## CPEClassifier diff --git a/docs/api/sample.md b/docs/api/sample.md index fe46a4a1..e404fd47 100644 --- a/docs/api/sample.md +++ b/docs/api/sample.md @@ -6,7 +6,7 @@ ``` ```{tip} -The examples related to this package can be found at [common criteria notebook](./../notebooks/examples/cc.ipynb) and [fips notebook](./../notebooks/examples/fips.ipynb). +The examples related to this package can be found in the [common criteria notebook](./../notebooks/examples/cc.ipynb) and the [fips notebook](./../notebooks/examples/fips.ipynb). 
``` ## CCCertificate diff --git a/docs/conf.py b/docs/conf.py index f21ee733..29484b85 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -17,9 +17,9 @@ # -- Project information ----------------------------------------------------- +author = "CRoCS MUNI" project = "sec-certs" -copyright = "Anonymized | 2020-2023" -# copyright = "CRoCS MUNI | 2020-2023" +copyright = "CRoCS MUNI | 2020-2024" # Note thas this inference won't work from Docker: https://github.com/pypa/setuptools_scm/#usage-from-docker release = ".".join(get_version("sec-certs").split(".")[:3]) diff --git a/docs/index.md b/docs/index.md index cb204514..a04eb2e2 100644 --- a/docs/index.md +++ b/docs/index.md @@ -2,7 +2,7 @@ Welcome to the technical documentation of *sec-certs* tool for the data analysis of products certified with Common Criteria or FIPS 140 frameworks. If you're looking for general description of the tool, its use cases and capabilites, we refer you to [sec-certs homepage](https://sec-certs.org/). If you are looking for more advanced knowledge, e.g. how to mine your own data, how to extend the tool, and so forth, this is the right place. -There are three main parts of this documentation. *User's guide* describes high-level use of our tool. Driven by this knowledge, you can progress to *Notebook examples* that showcase some of the API that we use in the form of Jupyter notebooks. The documentation also contains some of the modules documented with `autodoc`, see *API reference*. Still, some dark corners of our codebase are not documented. To inspect the code directly, see the [sec_certs](https://github.com/crocs-muni/sec-certs/tree/main/src/sec_certs) module. If you want, you can run the notebooks as they are stored in the [project repository](https://github.com/crocs-muni/sec-certs/tree/main/notebooks). If you are interested in contributing to our project or in other aspects of our development, you can consult the relevant *GitHub artifacts* +There are three main parts of this documentation. 
*Quickstart* describes high-level use of our tool. Driven by this knowledge, you can progress to *Notebook examples* that showcase some of the API that we use in the form of Jupyter notebooks. The documentation also contains some of the modules documented with `autodoc`, see *API reference*. Still, some dark corners of our codebase are not documented. To inspect the code directly, see the [sec_certs](https://github.com/crocs-muni/sec-certs/tree/main/src/sec_certs) module. If you want, you can run the notebooks as they are stored in the [project repository](https://github.com/crocs-muni/sec-certs/tree/main/notebooks). If you are interested in contributing to our project or in other aspects of our development, you can consult the relevant *GitHub artifacts*. ```{button-ref} quickstart :align: center @@ -21,7 +21,7 @@ Each of the notebooks can be launched interactively in MyBinder by clicking on :maxdepth: 1 Sec-certs homepage Sec-certs docs -GitHub repo +GitHub repo ``` ```{toctree} @@ -38,12 +38,12 @@ user_guide.md :caption: Notebook examples :hidden: True :maxdepth: 1 -notebooks/examples/est_solution.ipynb -notebooks/examples/cc.ipynb -notebooks/examples/fips.ipynb -notebooks/examples/model.ipynb -notebooks/examples/fips_iut.ipynb -notebooks/examples/fips_mip.ipynb +Demo +Common Criteria +FIPS-140 +FIPS-140 IUT +FIPS-140 MIP +Model ``` ```{toctree} @@ -59,7 +59,7 @@ api/model.md :maxdepth: 1 :hidden: True :caption: GitHub artifacts -readme.md +README contributing.md code_of_conduct.md license.md diff --git a/docs/user_guide.md b/docs/user_guide.md index 8b137891..9d330c6c 100644 --- a/docs/user_guide.md +++ b/docs/user_guide.md @@ -1 +1,34 @@ +# Advanced user's guide +```{important} +This guide is in the making. +``` + +## NVD datasets + +Our tool matches certificates to their possible CVEs using datasets downloaded from [National Vulnerability Database (NVD)](https://nvd.nist.gov). 
If you're fully processing the `CCDataset` or `FIPSDataset` by yourself, you must somehow obtain the NVD datasets. + +Our tool can seamlessly download the required NVD datasets when needed. We support two download mechanisms: + +1. Fetching datasets with the [NVD API](https://nvd.nist.gov/developers/start-here) (preferred way). +1. Fetching snapshots from sec-certs.org. + +The following two keys control the behaviour: + +```yaml +preferred_source_nvd_datasets: "api" # set to "sec-certs" to fetch them from sec-certs.org +nvd_api_key: null # or the actual key value +``` + +If you aim to fetch the sources from NVD, we advise you to get an [NVD API key](https://nvd.nist.gov/developers/request-an-api-key) and set the `nvd_api_key` setting accordingly. The download from NVD will work even without an API key; it will just be slow. No API key is needed when `preferred_source_nvd_datasets: "sec-certs"`. + + +## Inferring inter-certificate reference context + +```{important} +This is an experimental feature. +``` + +We provide a model that can predict the context of inter-certificate references based on the text embedded in the artifacts. The model output is not incorporated into the `CCCertificate` instances, but can be dumped into a `.csv` file from where it can be correlated with a DataFrame of certificate features. + +To train and deploy the model, it should be sufficient to change some paths and run the [prediction notebook](https://github.com/crocs-muni/sec-certs/blob/main/notebooks/cc/reference_annotations/prediction.ipynb). The output of this notebook is a `prediction.csv` file that can be loaded into the [references notebook](https://github.com/crocs-muni/sec-certs/blob/main/notebooks/cc/references.ipynb). This notebook documents the full analysis of references conducted on the Common Criteria certificates. 
Among others, the notebook generates some further `.csv` files that can subsequently be plotted via [plotting notebook](https://github.com/crocs-muni/sec-certs/blob/main/notebooks/cc/paper2_plots.ipynb). diff --git a/notebooks/cc/chain_of_trust_plots.ipynb b/notebooks/cc/chain_of_trust_plots.ipynb index 1a3662af..708502a8 100644 --- a/notebooks/cc/chain_of_trust_plots.ipynb +++ b/notebooks/cc/chain_of_trust_plots.ipynb @@ -1,5 +1,10 @@ { "cells": [ + { + "metadata": {}, + "cell_type": "markdown", + "source": "# Plots from the \"Chain of Trust\" paper" + }, { "cell_type": "code", "execution_count": 6, diff --git a/notebooks/cc/temporal_trends.ipynb b/notebooks/cc/temporal_trends.ipynb index 5e377572..16cbc162 100644 --- a/notebooks/cc/temporal_trends.ipynb +++ b/notebooks/cc/temporal_trends.ipynb @@ -1,5 +1,10 @@ { "cells": [ + { + "metadata": {}, + "cell_type": "markdown", + "source": "# Temporal trends in the CC ecosystem" + }, { "cell_type": "code", "execution_count": 1, diff --git a/notebooks/examples/cc.ipynb b/notebooks/examples/cc.ipynb index fe53be27..84308596 100644 --- a/notebooks/examples/cc.ipynb +++ b/notebooks/examples/cc.ipynb @@ -9,7 +9,9 @@ "\n", "This notebook illustrates basic functionality with the `CCDataset` class that holds Common Criteria dataset and of its sample `CCCertificate`.\n", "\n", - "Note that there exists a front end to this functionality at [seccerts.org/cc](https://seccerts.org/cc/). Before reinventing the wheel, it's good idea to check our web. Maybe you don't even need to run the code, but just use our web instead. " + "Note that there exists a front end to this functionality at [sec-certs.org/cc](https://sec-certs.org/cc/). Before reinventing the wheel, it's good idea to check our web. Maybe you don't even need to run the code, but just use our web instead. \n", + "\n", + "For full API documentation of the `CCDataset` class go to the [dataset](../../api/dataset) docs." 
] }, { @@ -129,7 +131,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## Dissect single certificate\n", + "## Dissect a single certificate\n", "\n", "The `CCCertificate` is basically a data structure that holds all the data we keep about a certificate. Other classes (`CCDataset` or `model` package members) are used to transform and process the certificates. You can see all its attributes at [API docs](https://seccerts.org/docs/api/sample.html)." ] @@ -207,9 +209,11 @@ "source": [ "## Create new dataset and fully process it\n", "\n", - "The following piece of code roughly corresponds to `$ cc-certs all` CLI command -- it fully processes the CC pipeline. This will create a folder in current working directory where the outputs will be stored. \n", + "The following piece of code roughly corresponds to `$ sec-certs cc all` CLI command -- it fully processes the CC pipeline. This will create a folder in current working directory where the outputs will be stored. \n", "\n", - "*Warning*: It's not good idea to run this from notebook. It may take several hours to finnish. We recommend using `from_web_latest()` or turning this into a Python script." + "```{warning}\n", + "It's not good idea to run this from notebook. It may take several hours to finish. 
We recommend using `from_web_latest()` or turning this into a Python script.\n", + "```" ] }, { @@ -220,11 +224,34 @@ "source": [ "dset = CCDataset()\n", "dset.get_certs_from_web()\n", - "dset.process_auxillary_datasets()\n", + "dset.process_auxiliary_datasets()\n", "dset.download_all_artifacts()\n", "dset.convert_all_pdfs()\n", "dset.analyze_certificates()" ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Advanced usage\n", + "There are more notebooks available showcasing more advanced usage of the tool.\n", + "\n", + "```{toctree}\n", + ":caption: Other\n", + ":hidden: True\n", + ":maxdepth: 1\n", + "Temporal trends <../cc/temporal_trends.ipynb>\n", + "Vulnerabilities <../cc/vulnerabilities.ipynb>\n", + "References <../cc/references.ipynb>\n", + "Chain of Trust paper <../cc/chain_of_trust_plots.ipynb>\n", + "```\n", + "\n", + " - Examine [temporal trends](../cc/temporal_trends.ipynb) in the CC ecosystem.\n", + " - Analyze [vulnerabilities](../cc/vulnerabilities.ipynb) of CC certified items.\n", + " - Study [references](../cc/references.ipynb) between CC certificates.\n", + " - Reproduce the plots from our [Chain of Trust](../cc/chain_of_trust_plots.ipynb) paper." + ] } ], "metadata": { diff --git a/notebooks/examples/est_solution.ipynb b/notebooks/examples/est_solution.ipynb index d7f507f8..fd0f98fd 100644 --- a/notebooks/examples/est_solution.ipynb +++ b/notebooks/examples/est_solution.ipynb @@ -5,11 +5,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## sec-certs Python API demo\n", + "# sec-certs Python API demo\n", "\n", "In this demo, we will:\n", "1. Fetch the fully processed dataset from our web\n", - "2. Turn the dataset into a [pandas](pandas.pydata.org/) dataframe -- a data structure suitable for further data analysis.\n", + "2. Turn the dataset into a [pandas](https://pandas.pydata.org/) dataframe -- a data structure suitable for further data analysis.\n", "3. 
Filter the dataset to certificates of our interest\n", "4. Explore various attrributes of a dataset and its individual certificate\n", "5. Learn how to go from a single vulnerability to all certificates that *may suffer* from the vulnerability\n", @@ -80,9 +80,7 @@ "attachments": {}, "cell_type": "markdown", "metadata": {}, - "source": [ - "## 2. Turn the dataset into a [pandas](pandas.pydata.org/) dataframe -- a data structure suitable for further data analysis." - ] + "source": "## 2. Turn the dataset into a [pandas](https://pandas.pydata.org/) dataframe -- a data structure suitable for further data analysis." }, { "cell_type": "code", diff --git a/notebooks/examples/fips.ipynb b/notebooks/examples/fips.ipynb index 6d9edee4..54849f83 100644 --- a/notebooks/examples/fips.ipynb +++ b/notebooks/examples/fips.ipynb @@ -4,9 +4,15 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# FIPS Dataset class\n", + "# FIPS-140 example\n", "\n", - "This notebook illustrates basic functionality with the `FIPSDataset` class that holds FIPS 140 dataset" + "This notebook illustrates basic functionality with the `FIPSDataset` class that holds FIPS 140 dataset.\n", + "\n", + "Note that there exists a front end to this functionality at [sec-certs.org/fips](https://sec-certs.org/fips/). Before reinventing the wheel, it's good idea to check our web. Maybe you don't even need to run the code, but just use our web instead. \n", + "\n", + "For full API documentation of the `FIPSDataset` class go to the [dataset](../../api/dataset) docs.\n", + "\n", + "If you would like to examine the FIPS-140 \"Implementations Under Test\" or \"Modules In Process\" queues, check out the [FIPS IUT](fips_iut.ipynb) and [FIPS MIP](fips_mip.ipynb) example notebooks." 
] }, { @@ -81,9 +87,7 @@ { "cell_type": "markdown", "metadata": {}, - "source": [ - "## Dissect single certificate" - ] + "source": "## Dissect a single certificate" }, { "cell_type": "code", @@ -123,7 +127,9 @@ "source": [ "## Create new dataset and fully process it\n", "\n", - "*Warning*: It's not good idea to run this from notebook. It may take several hours to finnish. We recommend using `from_web_latest()` or turning this into a Python script." + "```{warning}\n", + "It's not good idea to run this from notebook. It may take several hours to finish. We recommend using `from_web_latest()` or turning this into a Python script.\n", + "```" ] }, { @@ -134,11 +140,34 @@ "source": [ "dset = FIPSDataset()\n", "dset.get_certs_from_web()\n", - "dset.process_auxillary_datasets()\n", + "dset.process_auxiliary_datasets()\n", "dset.download_all_artifacts()\n", "dset.convert_all_pdfs()\n", "dset.analyze_certificates()" ] + }, + { + "metadata": {}, + "cell_type": "markdown", + "source": [ + "## Advanced usage\n", + "There are more notebooks available showcasing more advanced usage of the tool.\n", + "\n", + "```{toctree}\n", + ":caption: Other\n", + ":hidden: True\n", + ":maxdepth: 1\n", + "Temporal trends <../fips/temporal_trends.ipynb>\n", + "Vulnerabilities <../fips/vulnerabilities.ipynb>\n", + "References <../fips/references.ipynb>\n", + "IUT and MIP <../fips/in_process.ipynb>\n", + "```\n", + "\n", + " - Examine [temporal trends](../fips/temporal_trends.ipynb) in the FIPS-140 ecosystem.\n", + " - Analyze [vulnerabilities](../fips/vulnerabilities.ipynb) of FIPS-140 certified items.\n", + " - Study [references](../fips/references.ipynb) between FIPS-140 certificates.\n", + " - Analyze the FIPS-140 [IUT and MIP](../fips/in_process.ipynb) queues." 
+ ] } ], "metadata": { diff --git a/notebooks/examples/fips_iut.ipynb b/notebooks/examples/fips_iut.ipynb index 506c114f..738020bc 100644 --- a/notebooks/examples/fips_iut.ipynb +++ b/notebooks/examples/fips_iut.ipynb @@ -4,7 +4,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## FIPS 'Implementation Under Test'\n", + "# FIPS-140 'Implementation Under Test' example\n", "\n", "Functionality to parse FIPS 'Implementation Under Test' webpage. To use, download pages from the URL: [https://csrc.nist.gov/Projects/cryptographic-module-validation-program/modules-in-process/IUT-List](https://csrc.nist.gov/Projects/cryptographic-module-validation-program/modules-in-process/IUT-List)\n", "into a `directory` and name them `fips_iut_.html`.\n", diff --git a/notebooks/examples/fips_mip.ipynb b/notebooks/examples/fips_mip.ipynb index f46ba23a..131a5d98 100644 --- a/notebooks/examples/fips_mip.ipynb +++ b/notebooks/examples/fips_mip.ipynb @@ -4,7 +4,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## FIPS 'Modules In Process'\n", + "# FIPS-140 'Modules In Process' example\n", "\n", "Functionality to parse FIPS 'Modules In Process' webpage. To use, download pages from the URL: [https://csrc.nist.gov/Projects/cryptographic-module-validation-program/modules-in-process/Modules-In-Process-List](https://csrc.nist.gov/Projects/cryptographic-module-validation-program/modules-in-process/Modules-In-Process-List) into a `directory` and name them `fips_mip_.html`.\n", "\n", diff --git a/notebooks/examples/model.ipynb b/notebooks/examples/model.ipynb index 1714364e..e5e8966d 100644 --- a/notebooks/examples/model.ipynb +++ b/notebooks/examples/model.ipynb @@ -4,11 +4,13 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "# Model\n", + "# Model example\n", "\n", "This notebook illustrates basic functionality with the `model` package that apply complex transformations on certificates.\n", "\n", - "*Note*: You probably don't need to use this. 
Instead, you should use `CCDataset` or `FIPSDataset` classes to handle the transformations for yourself." + "```{note}\n", + "You probably don't need to use this. Instead, you should use `CCDataset` or `FIPSDataset` classes to handle the transformations for yourself.\n", + "```" ] }, { diff --git a/notebooks/fips/in_process.ipynb b/notebooks/fips/in_process.ipynb index 4e0be8cb..fb04ea79 100644 --- a/notebooks/fips/in_process.ipynb +++ b/notebooks/fips/in_process.ipynb @@ -1,5 +1,11 @@ { "cells": [ + { + "metadata": {}, + "cell_type": "markdown", + "source": "# FIPS IUT and MIP queues", + "id": "6212ee2f4518283e" + }, { "cell_type": "code", "execution_count": null, diff --git a/notebooks/fips/temporal_trends.ipynb b/notebooks/fips/temporal_trends.ipynb index 10b28460..2e91a976 100644 --- a/notebooks/fips/temporal_trends.ipynb +++ b/notebooks/fips/temporal_trends.ipynb @@ -1,5 +1,10 @@ { "cells": [ + { + "metadata": {}, + "cell_type": "markdown", + "source": "# Temporal trends in the FIPS-140 ecosystem" + }, { "cell_type": "code", "execution_count": null, diff --git a/notebooks/fips/vulnerabilities.ipynb b/notebooks/fips/vulnerabilities.ipynb index ffd19548..3b3d21e3 100644 --- a/notebooks/fips/vulnerabilities.ipynb +++ b/notebooks/fips/vulnerabilities.ipynb @@ -1,5 +1,11 @@ { "cells": [ + { + "metadata": {}, + "cell_type": "markdown", + "source": "# Vulnerability analysis", + "id": "3a0981d008383c12" + }, { "cell_type": "code", "execution_count": null, diff --git a/src/sec_certs/dataset/cc.py b/src/sec_certs/dataset/cc.py index 4c1114f3..ba2ffb58 100644 --- a/src/sec_certs/dataset/cc.py +++ b/src/sec_certs/dataset/cc.py @@ -269,9 +269,8 @@ def from_web_latest( Optionally stores it at the given path (a directory) and also downloads auxiliary datasets and artifacts (PDFs). - :::{note} - Note that including the auxiliary datasets adds several gigabytes and including artifacts adds tens of gigabytes. - ::: + .. 
note:: + Note that including the auxiliary datasets adds several gigabytes and including artifacts adds tens of gigabytes. :param path: Path to a directory where to store the dataset, or `None` if it should not be stored. :param auxiliary_datasets: Whether to also download auxiliary datasets (CVE, CPE, CPEMatch datasets). diff --git a/src/sec_certs/dataset/dataset.py b/src/sec_certs/dataset/dataset.py index 44b390c8..c906fd59 100644 --- a/src/sec_certs/dataset/dataset.py +++ b/src/sec_certs/dataset/dataset.py @@ -193,9 +193,8 @@ def from_web( # noqa Optionally stores it at the given path (a directory) and also downloads auxiliary datasets and artifacts (PDFs). - :::{note} - Note that including the auxiliary datasets adds several gigabytes and including artifacts adds tens of gigabytes. - ::: + .. note:: + Note that including the auxiliary datasets adds several gigabytes and including artifacts adds tens of gigabytes. :param archive_url: The URL of the full dataset archive. :param snapshot_url: The URL of the full dataset snapshot. diff --git a/src/sec_certs/dataset/fips.py b/src/sec_certs/dataset/fips.py index eeaec0a0..0feb920f 100644 --- a/src/sec_certs/dataset/fips.py +++ b/src/sec_certs/dataset/fips.py @@ -228,9 +228,8 @@ def from_web_latest( Optionally stores it at the given path (a directory) and also downloads auxiliary datasets and artifacts (PDFs). - :::{note} - Note that including the auxiliary datasets adds several gigabytes and including artifacts adds tens of gigabytes. - ::: + .. note:: + Note that including the auxiliary datasets adds several gigabytes and including artifacts adds tens of gigabytes. :param path: Path to a directory where to store the dataset, or `None` if it should not be stored. :param auxiliary_datasets: Whether to also download auxiliary datasets (CVE, CPE, CPEMatch datasets). 
diff --git a/src/sec_certs/sample/cc.py b/src/sec_certs/sample/cc.py index aa145706..63c2cca5 100644 --- a/src/sec_certs/sample/cc.py +++ b/src/sec_certs/sample/cc.py @@ -484,7 +484,8 @@ def eal(self) -> str | None: @property def actual_sars(self) -> set[SAR] | None: """ - Computes actual SARs. First, SARs implied by EAL are computed. Then, these are augmented with heuristically extracted SARs + Computes actual SARs. First, SARs implied by EAL are computed. Then, these are augmented with heuristically extracted SARs. + :return Optional[Set[SAR]]: Set of actual SARs of a certificate, None if empty """ sars = {} @@ -543,7 +544,7 @@ def __str__(self) -> str: def merge(self, other: CCCertificate, other_source: str | None = None) -> None: """ Merges with other CC sample. Assuming they come from different sources, e.g., csv and html. - Assuming that html source has better protection profiles, they overwrite CSV info + Assuming that html source has better protection profiles, they overwrite CSV info. On other values the sanity checks are made. """ if self != other: