Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

missing compulsory argument in nbadapter call to NB2WProductQuery from dispatcher-plugin-nb2workflow #199

Open
ferrigno opened this issue Sep 3, 2024 · 41 comments
Assignees
Labels
bug Something isn't working

Comments

@ferrigno
Copy link
Contributor

ferrigno commented Sep 3, 2024

File ~/Soft/nb2workflow/nb2workflow/nbadapter.py:1096, in validate_oda_dispatcher(nba, optional, machine_readable)
   1094         raise
   1095 else:
-> 1096     nbpq = NB2WProductQuery('testname', 
   1097                     'testproduct', 
   1098                     nba.extract_parameters(),
   1099                     nba.extract_output_declarations())
   1100                     
   1102     output = nba.extract_output()

TypeError: NB2WProductQuery.__init__() missing 1 required positional argument: 'ontology_path'

Introducing the oda_ontology_path dedfined in the file gives another error

File ~/Soft/nb2workflow/nb2workflow/nbadapter.py:1096, in validate_oda_dispatcher(nba, optional, machine_readable)
   1094         raise
   1095 else:
-> 1096     nbpq = NB2WProductQuery('testname', 
   1097                     'testproduct', 
   1098                     nba.extract_parameters(),
   1099                     nba.extract_output_declarations(),
   1100                     oda_ontology_path)
   1102     output = nba.extract_output()
   1104     logger.debug(json.dumps(output, indent=4))

File ~/Soft/dispatcher-plugin-nb2workflow/dispatcher_plugin_nb2workflow/queries.py:102, in NB2WProductQuery.__init__(self, name, backend_product_name, backend_param_dict, backend_output_dict, ontology_path)
    100 self.backend_output_dict = backend_output_dict
    101 self.backend_param_dict = backend_param_dict
--> 102 parameter_lists = construct_parameter_lists(backend_param_dict, ontology_path)
    103 self.par_name_substitution = parameter_lists['par_name_substitution']
    104 plist = deepcopy(parameter_lists['prod_plist'])

File ~/Soft/dispatcher-plugin-nb2workflow/dispatcher_plugin_nb2workflow/queries.py:24, in with_hashable_dict.<locals>.wrapper(backend_param_dict, ontology_path)
     22 @wraps(func)
     23 def wrapper(backend_param_dict, ontology_path):
---> 24     return func(HashableDict(backend_param_dict), ontology_path)

File ~/Soft/dispatcher-plugin-nb2workflow/dispatcher_plugin_nb2workflow/queries.py:19, in HashableDict.__hash__(self)
     18 def __hash__(self):
---> 19     return hash(dumps(self, sort_keys=True))

File ~/.conda/envs/py10/lib/python3.10/json/__init__.py:238, in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    232 if cls is None:
    233     cls = JSONEncoder
    234 return cls(
    235     skipkeys=skipkeys, ensure_ascii=ensure_ascii,
    236     check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237     separators=separators, default=default, sort_keys=sort_keys,
--> 238     **kw).encode(obj)

File ~/.conda/envs/py10/lib/python3.10/json/encoder.py:199, in JSONEncoder.encode(self, o)
    195         return encode_basestring(o)
    196 # This doesn't pass the iterator directly to ''.join() because the
    197 # exceptions aren't as detailed.  The list call should be roughly
    198 # equivalent to the PySequence_Fast that ''.join() would do.
--> 199 chunks = self.iterencode(o, _one_shot=True)
    200 if not isinstance(chunks, (list, tuple)):
    201     chunks = list(chunks)

File ~/.conda/envs/py10/lib/python3.10/json/encoder.py:257, in JSONEncoder.iterencode(self, o, _one_shot)
    252 else:
    253     _iterencode = _make_iterencode(
    254         markers, self.default, _encoder, self.indent, floatstr,
    255         self.key_separator, self.item_separator, self.sort_keys,
    256         self.skipkeys, _one_shot)
--> 257 return _iterencode(o, 0)

File ~/.conda/envs/py10/lib/python3.10/json/encoder.py:179, in JSONEncoder.default(self, o)
    160 def default(self, o):
    161     """Implement this method in a subclass such that it returns
    162     a serializable object for ``o``, or calls the base implementation
    163     (to raise a ``TypeError``).
   (...)
    177 
    178     """
--> 179     raise TypeError(f'Object of type {o.__class__.__name__} '
    180                     f'is not JSON serializable')

TypeError: Object of type type is not JSON serializable

this goes beyond my understanding.

[I used the master branch of all repositories with python 3.10

@ferrigno ferrigno added the bug Something isn't working label Sep 3, 2024
@dsavchenko
Copy link
Member

We haven't used this functionality anywhere for quite a while, so this function became incompatible with current dispatcher/plugin. And it stays untested as long as there is no more dispatcher/plugin in dependencies.

@ferrigno in which scenario do you need it?

@volodymyrss do we really want it like this, so that dispatcher/plugin is an optional dependency of nb2workflow?

@ferrigno
Copy link
Contributor Author

ferrigno commented Sep 4, 2024

@dsavchenko , I was trying to run a test notebook for my workflows following the suggested example.

See:
https://gitlab.renkulab.io/astronomy/mmoda/jemx-expert/-/blob/main/test_products.ipynb?ref_type=heads

I run it from my local virtual environment where I installed nb2worflow and dispatcher-plugin-nb2workflow from the git repository as suggested by @volodymyrss (I had to install also the dispatcher and I did it from the git repository)

@dsavchenko
Copy link
Member

You don't really need the dispatcher/plugin in this context. You won't encounter the exception if you remove them from the virtual environment.
We still need to adapt this part, thought, so let's keep this issue open

@volodymyrss
Copy link
Member

volodymyrss commented Sep 4, 2024

I think the plugin was used to construct products from the test output, also verifying that the output annotation works.

That is, it should be used. It's not great to use a dispatcher plugin for this and it would be better to use product classes from oda-api. But I do not suppose its possible now, @dsavchenko ?

@ferrigno
Copy link
Contributor Author

ferrigno commented Sep 4, 2024

I can just confirm that I had to introduce the plugin because I encountered an error and could not obtain any result from the test. Removing it gives
unable to import dispatcher_plugin_nb2workflow.queries.NB2WProductQuery: No module named 'dispatcher_plugin_nb2workflow'

@dsavchenko
Copy link
Member

I can just confirm that I had to introduce the plugin because I encountered an error and could not obtain any result from the test. Removing it gives
unable to import dispatcher_plugin_nb2workflow.queries.NB2WProductQuery: No module named 'dispatcher_plugin_nb2workflow'

Unsurprisingly. You explicitly import it in test nb, but never use it after in fact

@ferrigno
Copy link
Contributor Author

ferrigno commented Sep 4, 2024

The explicit import was just left from a test, the error appears also without import.

@ferrigno
Copy link
Contributor Author

ferrigno commented Sep 5, 2024

@dsavchenko
Copy link
Member

There are both dispatcher and plugin in requirements. If you don't use them explicitly for some specific reason, you can remove both, then you won't encounter the exception.

@dsavchenko
Copy link
Member

I think the plugin was used to construct products from the test output

One can try to do it, but I don't think we ever did this explicitly.

It was used to test, how dispatcher "understands" annotations. But this information then isn't reflected anywhere as far as I remember (if available, it stays as variable in the container).
Despite general problems with complicated requirements, this also increases the container size, which is rather big in any case.

@volodymyrss
Copy link
Member

volodymyrss commented Sep 6, 2024

I think the plugin was used to construct products from the test output

One can try to do it, but I don't think we ever did this explicitly.

It was used to test, how dispatcher "understands" annotations. But this information then isn't reflected anywhere as far as I remember (if available, it stays as variable in the container).

This was done and even sent by email, extracted from the container in here .

Would be nice to recover this functionality along with bot tests, else it is progressively harder to make sure the workflows remain valid.

Despite general problems with complicated requirements, this also increases the container size, which is rather big in any case.

It's true that it's better to do it from outside of the container, within oda-bot.

@ferrigno
Copy link
Contributor Author

ferrigno commented Sep 7, 2024

The import of nb2workflow is as suggested in the development guide ... at least when I started this project. The dispatcher plug-in is commented out. I let you discuss and solve this issue, as it goes beyond my field of competence. For me, it is just inconvenient that I cannot run a test as suggested.

@volodymyrss
Copy link
Member

The import of nb2workflow is as suggested in the development guide ... at least when I started this project. The dispatcher plug-in is commented out. I let you discuss and solve this issue, as it goes beyond my field of competence. For me, it is just inconvenient that I cannot run a test as suggested.

We are discussing with @dsavchenko what is the best next step for sustainable implementation, in relation to oda-hub/oda-bot#38 .

If you @ferrigno want to help, you could maybe propose to make a change to the example or manual? To reflect your experience. Would be useful!

@dsavchenko
Copy link
Member

The import of nb2workflow is as suggested in the development guide ...

The nb2workflow is obviously needed, as it's directly used in tests.
But one normally don't need any dispatcher-related dependencies in the session.
Some of them may be eventually automatically used in the deployment process, thought

@volodymyrss
Copy link
Member

@anawas could you please try to run a test in your repository, run it, and see if you have the same issue. If yes (probably you will), try to fix it.

If not, please help @ferrigno with his repository, it's all the same mechanism.

@anawas
Copy link

anawas commented Sep 12, 2024

I can reproduce the error on my machine. Additionally, I get the following warnings/error:

T1="2022-10-22T05:18:22.0" #http://odahub.io/ontology#StartTime --> Unknown datatype for owl_uri http://odahub.io/ontology#StartTime
T2='2022-10-22T23:10:16.0' #http://odahub.io/ontology#EndTime --> Unknown datatype for owl_uri http://odahub.io/ontology#EndTime
...

And there are complaints about unknown owl type uri such as

Unknown owl type uri http://odahub.io/ontology#Float. Creating basic Parameter object.
Unknown owl type uri http://odahub.io/ontology#String. Creating basic Parameter object.
Unknown owl type uri http://odahub.io/ontology#Boolean. Creating basic Parameter object.
Unknown owl type uri http://odahub.io/ontology#String. Creating basic Parameter object.

I am not sure if this is somehow related.

@ferrigno
Copy link
Contributor Author

I corrected some wrong annotations in the parameter cell for both isgri and jemx expert repositories, but not T1 and T2, as they are valid ontology terms, also suggested in the development guide.

If I change the annotations of T1 and T2 into StartTimeISOT, the warning disappears, but the error in running with nb2workflow remains, so there is no relation.

@ferrigno
Copy link
Contributor Author

ferrigno commented Sep 13, 2024

This should be related oda-hub/dispatcher-plugin-nb2workflow#111 . @anawas you can correct that class and it will help this issue.

@anawas
Copy link

anawas commented Sep 19, 2024

I found the line in the code where the exception is raised. The class NB2WProductQuery requires the argument ontology_path (in project dispatcher-plugin-nb2workflow, file queries.py).

  1. What is this ontology_path? I cannot find either a documentation nor a type hint.
  2. And where should this parameter be set?

@volodymyrss
Copy link
Member

I found the line in the code where the exception is raised. The class NB2WProductQuery requires the argument ontology_path (in project dispatcher-plugin-nb2workflow, file queries.py).

1. What is this ontology_path? I cannot find either a documentation nor a type hint.
2. And where should this parameter be set?

It's a path to ontology file to be loaded.
You can see in this case and in other similar cases how this is done in the tests.

You should be able to set a default published ontology path.

Just could you please cross-check that it is used as intended?

@anawas
Copy link

anawas commented Sep 26, 2024

The follwing url return 404 when called. It's in file ontology.py. https://raw.githubusercontent.com/FnOio/fnoio.github.io/master/ontology/0.4.1/function.ttl
What is it for? I wanted to use this url to get a default ontology.

In addition I wonder if this is the cause for the frequent not a turtle: None messages.

@dsavchenko
Copy link
Member

https://raw.githubusercontent.com/FnOio/fnoio.github.io/master/ontology/0.4.1/function.ttl

That's not our default ontology. It was used for somehow unrelated purposes. @volodymyrss may have more insight
Anyway, it's requested in try-except, so it can't raise issue.

Our ontology is served from https://odahub.io/ontology/ontology.ttl and this path is hardcoded as default in the code where applicable.

@volodymyrss
Copy link
Member

Yes, the fno is an old one. They have an update somewhere. But we should use our own (it has the references to upstream where suitable).

@anawas
Copy link

anawas commented Oct 22, 2024

I can now run locally some of the tests here: astronomy/mmoda/mmoda-nb2workflow-example. However, when I run Carlo's notebook (https://gitlab.renkulab.io/astronomy/mmoda/jemx-expert/-/blob/main/test_products.ipynb?ref_type=heads) I get a timeout for the following url, which seems not to be online:
dispatcher-staging.obsuks1.unige.ch

What is it used for? What is it supposed to return? I may need to write a mock for it.

@ferrigno
Copy link
Contributor Author

that URL is available only in the virtual network of the University of Geneva. You can use the public endpoint https://www.astro.unige.ch/mmoda/dispatch-data instead

@ferrigno
Copy link
Contributor Author

or use
'host_type' : 'production'
instead of
'host_type' : 'staging'

The current version of the test NB at
https://gitlab.renkulab.io/astronomy/mmoda/isgri-expert/-/blob/main/test_products.ipynb?ref_type=heads
is updated. I am running it again so that you can get the products from the backend (for independent reasons, they are not anymore cached).

@dsavchenko
Copy link
Member

dsavchenko commented Nov 13, 2024

The PR fixes this #212

@anawas
Copy link

anawas commented Dec 2, 2024

Update

Thanks to the team at EPFL the above mentioned error (TypeError: Object of type type is not JSON serializable) has disappeared. The serialization issue seems to be solved. I am now tracing another bug. The following error pops up when running test_products.ipynb:

FAILED test_products.py::test_workflow - KeyError: 'osa_mosaic'

Log output

KeyError                                  Traceback (most recent call last)
Cell In[12], line 4
      2 instruments = ['jemx%d' % jemx_unit]
      3 pr.report_progress(stage='Cleaning', progress=80)
----> 4 common_functions.clean_mosaics(instruments, revs, isdc_sources, logger, parameters)

File ~/mmoda/jemx-debug/oda_api_wrapper/oda_integral_wrapper/common_nb_tools.py:385, in clean_mosaics(instruments, observations, isdc_sources, logger, parameters)
    382 for instr in instruments:
    383     for source in observations:
--> 385         data = source['%s_raw_mosaic' % instr]
    387         if type(data) is str or data is None:
    388             continue

KeyError: 'jemx1_raw_mosaic'

@ferrigno
Copy link
Contributor Author

ferrigno commented Dec 6, 2024

@burnout87
Copy link
Contributor

burnout87 commented Dec 17, 2024

With the last re-build, and after the renku incident was resolved, we now get this error when requesting a lilghtcurve for isgri-expert

image

https://cdci.sentry.io/issues/6152756783/?project=1467382&query=is%3Aunresolved&referrer=issue-stream&statsPeriod=24h&stream_index=3&utc=true

@volodymyrss
Copy link
Member

Yeah, any idea what it is?

@burnout87
Copy link
Contributor

burnout87 commented Dec 18, 2024

Unfortunately Sentry is not helping much. I noticed the same error on the french staging too

@dsavchenko
Copy link
Member

So it's an issue in the dispatcher? Notebook is executed from top to bottom?

I haven't yet reproduced on staging (because of some unrelated issue)

@burnout87
Copy link
Contributor

burnout87 commented Dec 18, 2024

So it's an issue in the dispatcher?

Probably yes

Notebook is executed from top to bottom?

Yes, I also got the done email, and I could verify the notebook on the mmoda frontend

@dsavchenko
Copy link
Member

I confirm on staging. Logs also don't give much insight

@volodymyrss
Copy link
Member

So we need to add some code to extract better insight, both in sentry and also in the logs.

@ferrigno
Copy link
Contributor Author

It might be related, so I write here this information.

When I run the test notebook https://gitlab.renkulab.io/astronomy/mmoda/isgri-expert/-/blob/main/test_products.ipynb?ref_type=heads
from a renkulab session
https://renkulab.io/projects/astronomy/mmoda/isgri-expert/sessions/new?autostart=1
of isgri-expert

I have the error:
unable to import dispatcher_plugin_nb2workflow.queries.NB2WProductQuery: failed to find libmagic. Check your installation
after the notebook has run completely without errors.

However, libmagic is present and with the latest version
libmagic==1.0

@burnout87
Copy link
Contributor

burnout87 commented Dec 19, 2024

With the last re-build, and after the renku incident was resolved, we now get this error when requesting a lilghtcurve for isgri-expert

image

https://cdci.sentry.io/issues/6152756783/?project=1467382&query=is%3Aunresolved&referrer=issue-stream&statsPeriod=24h&stream_index=3&utc=true

I managed to reproduce the issue locally.

In particular, it happens inside the plugin, here:

https://github.com/oda-hub/dispatcher-plugin-nb2workflow/blob/bfeddaeebd18e88694da4ca7bbaf6cd1f32a74fd/dispatcher_plugin_nb2workflow/products.py#L340

Because the units var is None, given that du.units_dict is also None. Can @ferrigno confirm if this can happen ?

In any case, I will provide a fix.

@ferrigno
Copy link
Contributor Author

the units_dict is an attribute of NumpyDataUnit which is initialized at None. Normally, in the light curve, it should be initialized to something else, but in principle it can be None.

@ferrigno
Copy link
Contributor Author

do you have a PR?

@burnout87
Copy link
Contributor

do you have a PR?

oda-hub/dispatcher-plugin-nb2workflow#121

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants