High-level view of the Healthcheck system

Introduction

Healthcheck is a system used at the Swedish Internet Infrastructure Foundation to gather information about the .se zone, with an eye to estimating the general health of the zone. It consists of two major logical parts: an engine that gathers data and extracts (some) information from it, and a web application through which the user can control which domain names will be scanned and view the results. All the data is stored in a CouchDB database instance.

Logical structure

The two major conceptual units the Healthcheck system works with are domainsets and testruns. A domainset is a list of domain names, and a testrun is the data that results from running the gathering process on that list. So a domainset can have zero or more testruns, while a testrun belongs to exactly one domainset. For selection purposes, the web interface presents testruns grouped by their domainsets.
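
The relationship between the two units can be sketched in a few lines of Python. This is only an illustration of the one-to-many structure described above, not the actual Healthcheck data model or its CouchDB document layout; the class names, field names, and the `group_by_domainset` helper are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DomainSet:
    # Hypothetical fields: a human-readable name and the list of domain names.
    name: str
    domains: list


@dataclass
class Testrun:
    # Hypothetical fields: every testrun refers to exactly one domainset.
    testrun_id: int
    domainset: DomainSet
    results: dict = field(default_factory=dict)


def group_by_domainset(testruns):
    """Group testrun ids under the name of the domainset they belong to."""
    grouped = defaultdict(list)
    for run in testruns:
        grouped[run.domainset.name].append(run.testrun_id)
    return dict(grouped)


sample = DomainSet("sample", ["example.se", "iis.se"])
runs = [Testrun(1, sample), Testrun(2, sample)]
grouped = group_by_domainset(runs)
print(grouped)
```

A domainset with no testruns simply never appears in the grouped result, which matches the "zero or more" side of the relationship.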

Intended work cycle

Someone gathers a list of domain names, and enters it into the system (giving it a name in the process).
They hit "Start Gathering" in the web interface, whereupon a new testrun id is allocated and the domain names are added to the queue marked with that testrun id.
The dispatcher daemon picks names off the queue and spawns child processes that gather data. It runs many of these in parallel, for efficiency. The ceiling on how many can run concurrently is usually set by the amount of RAM in the server or by the disk I/O needed to store the results.
CouchDB runs its map/reduce process over the results.
The web interface, on request, shows the results to the user.
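
The first three steps of the cycle above can be sketched as follows. This is a simplified model under stated assumptions, not the real dispatcher: the queue is a plain in-memory list rather than anything stored in the database, a thread pool stands in for the child processes, and the function names (`start_gathering`, `gather`, `dispatch`) and the `max_workers` limit are invented for the example. The `gather` function is a placeholder that does no actual DNS work.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

# Hypothetical id allocator; the real system allocates testrun ids itself.
_testrun_ids = itertools.count(1)


def start_gathering(domains, queue):
    """Allocate a new testrun id and enqueue every name marked with it."""
    testrun_id = next(_testrun_ids)
    for name in domains:
        queue.append((testrun_id, name))
    return testrun_id


def gather(item):
    """Placeholder for the real data gathering; returns a dummy record."""
    testrun_id, name = item
    return {"testrun": testrun_id, "domain": name, "labels": name.count(".") + 1}


def dispatch(queue, max_workers=4):
    """Drain the queue, running at most max_workers gatherers at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(gather, queue))
    queue.clear()
    return results


queue = []
run_id = start_gathering(["example.se", "iis.se", "www.example.se"], queue)
results = dispatch(queue, max_workers=2)
for record in results:
    print(record)
```

The `max_workers` argument plays the role of the concurrency ceiling mentioned above; in a real deployment it would be tuned to the available RAM and disk I/O rather than fixed in code.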
