A small change to how metadata is represented in the transport graph #878

cjao · 2022-07-19T21:22:28Z

cjao
Jul 19, 2022
Maintainer

This post documents some recent (minor) changes to how electron and lattice metadata are represented in the transport graph.

Users can currently specify the following constraints on each electron:

executor -- either a string description of an executor or an executor object
deps -- instances of DepsBash and DepsPip
call_before -- a list of DepsCall objects
call_after -- a list of DepsCall objects

Technically, the Electron decorator also accepts a file_transfer argument, but that is internally converted into a DepsCall.

Previously, whenever object instances were passed as metadata, the transport graph stored the actual object instances. These were pickled together with the transport graph by the client and unpickled by the server.

As of the recent work to allow Covalent to process workflows without needing their dependencies, Covalent no longer unpickles any user-submitted data. As part of this change, the transport graph always holds a JSON representation of each metadata object, never object instances. The objects are rehydrated from their JSON representations at runtime whenever they are needed.

One consequence is that every new metadata class we support needs to know how to serialize itself to JSON; the existing metadata types have been taught to do this (the from_dict() and to_dict() methods).
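As a rough illustration of that requirement, here is a minimal sketch of a metadata class that round-trips through JSON. DepsExample and its "example" short name are hypothetical and not part of Covalent; the only things borrowed from the post are the to_dict()/from_dict() method names and the type / short_name / attributes layout that appears in the executor example.

```python
import json


class DepsExample:
    """Hypothetical metadata class, used only for illustration (not a Covalent class)."""

    def __init__(self, commands=None):
        self.commands = list(commands or [])

    def to_dict(self):
        # Emit only JSON-friendly types: strings, lists, dicts.
        return {
            "type": type(self).__name__,
            "short_name": "example",  # assumed short name
            "attributes": {"commands": self.commands},
        }

    @classmethod
    def from_dict(cls, data):
        # Rebuild an instance from the "attributes" section of the JSON form.
        return cls(**data["attributes"])


original = DepsExample(["echo hello"])

# What would be stored in the graph: a plain JSON string, no pickled instance.
payload = json.dumps(original.to_dict())

# What would happen at runtime: rehydrate the object from its JSON form.
restored = DepsExample.from_dict(json.loads(payload))
```

The point of the design is that the party holding the JSON never needs the class (or its dependencies) to be importable until the object is actually rehydrated.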

Some elaboration is needed for executors. During workflow submission, the transport graph stores the essential data for reconstructing the executor in a field called executor_data. The executor field holds the short-string representation of the executor.

For example, suppose the client specifies an executor instance like @ct.electron(executor=dask_exec), where dask_exec=DaskExecutor(scheduler_address) is an instance of DaskExecutor. The corresponding graph node then contains the following fields:

executor = "dask"
executor_data = {"type": "DaskExecutor", "short_name": "dask", "attributes": {"address": [scheduler_address]}}

This is what the SDK actually submits to the dispatcher, and you will notice that this is completely JSON-serializable. When the dispatcher dispatches the task, it creates a DaskExecutor instance from this JSON description (see here for the gory details).
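The rehydration step can be pictured roughly as follows. This is only a sketch under stated assumptions, not the dispatcher's actual code: FakeDaskExecutor, FakeLocalExecutor, EXECUTOR_REGISTRY, and rehydrate_executor are all hypothetical names, and the lookup-by-short-name strategy is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FakeDaskExecutor:
    """Stand-in for a real executor class (hypothetical)."""
    scheduler_address: str


@dataclass
class FakeLocalExecutor:
    """Stand-in for an executor that needs no constructor arguments (hypothetical)."""


# Assumed lookup table from short names to executor classes.
EXECUTOR_REGISTRY = {
    "dask": FakeDaskExecutor,
    "local": FakeLocalExecutor,
}


def rehydrate_executor(short_name, executor_data):
    """Build an executor instance from the JSON-serializable node fields."""
    cls = EXECUTOR_REGISTRY[short_name]
    # executor_data is empty when only a short name was given,
    # in which case the executor is built with its defaults.
    attributes = executor_data.get("attributes", {})
    return cls(**attributes)


node_metadata = {
    "executor": "dask",
    "executor_data": {
        "type": "FakeDaskExecutor",
        "short_name": "dask",
        "attributes": {"scheduler_address": "tcp://127.0.0.1:8786"},
    },
}

dask_exec = rehydrate_executor(
    node_metadata["executor"], node_metadata["executor_data"]
)
local_exec = rehydrate_executor("local", {})
```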

A similar treatment is applied to the Deps objects.

To better understand the current mechanism it may be helpful to construct a simple workflow, run

workflow.build_graph()
received_lattice = Lattice.deserialize_from_json(workflow.serialize_to_json())

and then inspect received_lattice, which is what the server would receive from the client.

mshkanth · 2022-07-26T13:35:45Z

mshkanth
Jul 26, 2022

Hi @cjao & @FyzHsn - Will this affect what gets stored in the executor.pkl file? Our current understanding is that all executor attributes (including the name) are stored as a pickled object in this file.

7 replies

cjao Jul 26, 2022
Maintainer Author

Hi @mshkanth, you are right that the executor_data_filename also contains a short_name attribute, but only if the user specifies an executor instance instead of a short name during workflow construction. The following example may help clarify the metadata:

import covalent as ct
from covalent.executor import DaskExecutor
from covalent._workflow.lattice import Lattice
from dask.distributed import LocalCluster

lc = LocalCluster()
dask_exec = DaskExecutor(lc.scheduler_address)

@ct.electron(executor="local")
def square(x):
    return x**2

@ct.electron(executor=dask_exec)
def cube(x):
    return x**3


@ct.lattice
def workflow(x):
    res1 = square(x)
    res2 = cube(x)

workflow.build_graph(2)
received_lattice = Lattice.deserialize_from_json(workflow.serialize_to_json())

tg = received_lattice.transport_graph._graph

Here, received_lattice is the lattice that the Covalent server receives from the client. The two tasks in the workflow have node IDs 0 and 2. Here is their metadata:

>>> tg.nodes[0]["metadata"]

{'executor': 'local',
 'deps': {'bash': {'type': 'DepsBash',
   'short_name': 'covalent',
   'attributes': {'commands': [],
    'apply_fn': {'type': 'TransportableObject',
     'attributes': {'_object': 'gAWVNwAAAAAAAACMG2NvdmFsZW50Ll93b3JrZmxvdy5kZXBzYmFzaJSME2FwcGx5X2Jhc2hfY29tbWFuZHOUk5Qu',
      'python_version': '3.8.13',
      'object_string': '<function apply_bash_commands at 0x7fc1a44f1b80>',
      '_json': '',
      'attrs': {'doc': '', 'name': ''}}},
    'apply_args': {'type': 'TransportableObject',
     'attributes': {'_object': 'gAWVBgAAAAAAAABdlF2UYS4=',
      'python_version': '3.8.13',
      'object_string': '[[]]',
      '_json': '[[]]',
      'attrs': {'doc': '', 'name': ''}}},
    'apply_kwargs': {'type': 'TransportableObject',
     'attributes': {'_object': 'gAV9lC4=',
      'python_version': '3.8.13',
      'object_string': '{}',
      '_json': '{}',
      'attrs': {'doc': '', 'name': ''}}},
    'retval_keyword': ''}}},
 'call_before': [],
 'call_after': [],
 'executor_data': {}}

>>> tg.nodes[2]["metadata"]

{'executor': 'dask',
 'deps': {'bash': {'type': 'DepsBash',
   'short_name': 'covalent',
   'attributes': {'commands': [],
    'apply_fn': {'type': 'TransportableObject',
     'attributes': {'_object': 'gAWVNwAAAAAAAACMG2NvdmFsZW50Ll93b3JrZmxvdy5kZXBzYmFzaJSME2FwcGx5X2Jhc2hfY29tbWFuZHOUk5Qu',
      'python_version': '3.8.13',
      'object_string': '<function apply_bash_commands at 0x7fc1a44f1b80>',
      '_json': '',
      'attrs': {'doc': '', 'name': ''}}},
    'apply_args': {'type': 'TransportableObject',
     'attributes': {'_object': 'gAWVBgAAAAAAAABdlF2UYS4=',
      'python_version': '3.8.13',
      'object_string': '[[]]',
      '_json': '[[]]',
      'attrs': {'doc': '', 'name': ''}}},
    'apply_kwargs': {'type': 'TransportableObject',
     'attributes': {'_object': 'gAV9lC4=',
      'python_version': '3.8.13',
      'object_string': '{}',
      '_json': '{}',
      'attrs': {'doc': '', 'name': ''}}},
    'retval_keyword': ''}}},
 'call_before': [],
 'call_after': [],
 'executor_data': {'type': "<class '/var/home/casey/Agnostiq/code/covalent/covalent/executor/executor_plugins/dask.DaskExecutor'>",
  'short_name': 'dask',
  'attributes': {'log_stdout': 'stdout.log',
   'log_stderr': 'stderr.log',
   'conda_env': '',
   'cache_dir': '/var/home/casey/.cache/covalent',
   'current_env_on_conda_fail': False,
   'current_env': '',
   'scheduler_address': 'tcp://127.0.0.1:46359'}}}
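A side note on reading the dumps above: the '_object' fields look like base64-encoded pickle payloads. The sketch below decodes the apply_kwargs payload from the output above, assuming that encoding; decode_transportable is a hypothetical helper written for this note, not a Covalent function. Only unpickle payloads you produced yourself, since unpickling untrusted data can execute arbitrary code.

```python
import base64
import pickle


def decode_transportable(encoded):
    """Decode a base64 string and unpickle the resulting bytes.

    Hypothetical helper; assumes '_object' is a base64-encoded pickle.
    """
    return pickle.loads(base64.b64decode(encoded))


# '_object' value of apply_kwargs, copied from the metadata shown above.
# The payload uses pickle protocol 5, so Python 3.8 or newer is needed.
apply_kwargs = decode_transportable("gAV9lC4=")
print(apply_kwargs)  # {}
```

The result matches the 'object_string' and '_json' fields stored next to the payload, which both read '{}'.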

mshkanth Jul 26, 2022

@cjao - Your example helps a lot. It is safer for the web app to refer to executor_name for the name and the executor_data_filename for executor attributes.

Cc @Prasy12 @mpvgithub @Aravind-psiog @ArunPsiog

mshkanth Jul 26, 2022

@cjao - In v10, the executor.pkl file will sometimes contain an executor object (if the executor is dask) and sometimes only a name (if the executor is local). Correct?

FyzHsn Jul 26, 2022
Maintainer

~~Yes @mshkanth that's correct! There was an issue to only pickle the executors and not the strings but in light of the upcoming changes in V11, it's counterproductive to make these changes~~

EDIT: See below

cjao Jul 26, 2022
Maintainer Author

@mshkanth The executor.pkl in V10 always contains just the "short_name". The metadata received by the server (the above examples are representative) no longer stores any executor object instances.
