engine: each experiment calls the input-fetching API it needs? #2381
Labels
funder/drl2022-2024
methodology
issues related to the testing methodology
needs investigation
This issue needs extra data and investigation
ooni/probe-engine
priority/high
refactoring
research prototype
This (currently work-in-progress) issue describes an alternative way of passing richer input to experiments, one that may reduce complexity inside the core of ooniprobe and allow more flexibility in terms of data formats.
Let's kick off our discussion by observing that we already have N possible input formats (Web Connectivity, Psiphon, DNSCheck, Tor) and M experiments consuming them (the Web Connectivity format is also used by urlgetter and would be used by websteps, while the Psiphon, DNSCheck, and Tor formats are each used only by the experiment of the same name). Additionally, in a run-by-command-line or OONI Run v2 scenario, some orthogonal input is provided by either command-line settings or the OONI Run v2 descriptor.
This situation has led us to (1) converge on the lowest common denominator for passing inputs to experiments (i.e., strings containing URLs) and (2) use additional mechanisms for providing inputs to experiments whose input does not fit this model (consider, e.g., how Psiphon and Tor download their own input and have no string-based input).
What this situation is telling us, though, is that we actually have a single kind of experiment: one that fetches its own input, formatted according to the input type the experiment understands, and processes it accordingly. Obviously, even if we did that, other places in the codebase would still assume string input (e.g., the database format). Yet, if this were possible, we could replace the fairly complex way in which experiments run with or without input with a model where each experiment does the right thing for itself (which, in cases such as Telegram, is to have no input at all).
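To make the idea more concrete, here is a minimal sketch in Go of what "each experiment fetches its own input" could look like. Everything in it is an assumption made for illustration: the `Experiment` interface, the `URLInput` type and its `CategoryCode` field, the `urlExperiment` and `noInputExperiment` types, and the `runAll` and `demo` helpers are hypothetical names, not the current probe-engine API. The fetch step is a stub that returns a fixed list instead of calling a backend.

```go
package main

import (
	"context"
	"fmt"
)

// Experiment is a hypothetical interface (not the probe-engine API).
// The core only asks an experiment to run; it never hands it input.
type Experiment interface {
	Name() string
	Run(ctx context.Context) ([]string, error)
}

// URLInput is a hypothetical richer input type owned by one experiment.
type URLInput struct {
	URL          string
	CategoryCode string
}

// urlExperiment fetches typed input by itself through an injected
// fetch function, which stands in for an input-fetching API call.
type urlExperiment struct {
	fetch func(ctx context.Context) ([]URLInput, error)
}

func (e urlExperiment) Name() string { return "url_experiment" }

func (e urlExperiment) Run(ctx context.Context) ([]string, error) {
	inputs, err := e.fetch(ctx)
	if err != nil {
		return nil, fmt.Errorf("%s: fetching input: %w", e.Name(), err)
	}
	results := make([]string, 0, len(inputs))
	for _, in := range inputs {
		// A real experiment would measure here; we only record the input.
		results = append(results, fmt.Sprintf("measured %s [%s]", in.URL, in.CategoryCode))
	}
	return results, nil
}

// noInputExperiment models an experiment that needs no input at all.
type noInputExperiment struct{}

func (noInputExperiment) Name() string { return "no_input_experiment" }

func (noInputExperiment) Run(ctx context.Context) ([]string, error) {
	return []string{"measured fixed endpoints"}, nil
}

// runAll treats every experiment the same way: it has no notion of
// input formats, which is the simplification under discussion.
func runAll(ctx context.Context, exps []Experiment) (map[string][]string, error) {
	out := make(map[string][]string, len(exps))
	for _, exp := range exps {
		res, err := exp.Run(ctx)
		if err != nil {
			return nil, err
		}
		out[exp.Name()] = res
	}
	return out, nil
}

// demo wires a stubbed fetcher to show how the pieces fit together.
func demo() (map[string][]string, error) {
	fetch := func(ctx context.Context) ([]URLInput, error) {
		return []URLInput{{URL: "https://example.com/", CategoryCode: "MISC"}}, nil
	}
	exps := []Experiment{urlExperiment{fetch: fetch}, noInputExperiment{}}
	return runAll(context.Background(), exps)
}
```

Calling `demo()` from a `main` function or a test runs both experiments through the same loop. The point of the sketch is only that `runAll` never sees an input type; whether the real engine can be restructured this way is exactly what this issue wants to find out.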
Now, this discussion makes sense conceptually, but changing the code to behave as described may turn out to be quite difficult; I am not sure. Hence this issue. We want to explore the design space and work on small prototypes to understand whether this (in my opinion desirable) design change is doable or too hard given the current codebase.