FEATURE: Incompatible queries for linking artifact tracking and runs #1096
Comments
Hi @ljstrnadiii Thanks for the feedback! I definitely see your point. This behaviour is strange. I've passed this feedback to the engineering team so they can investigate it for you, and I'll keep you updated.
Hey @ljstrnadiii I just heard from the product team, and they mentioned that it's a known issue and that it's in the backlog.

Workaround

✅ Make sure that all runs in the project and the project-level metadata use the same namespace for artifacts. For example:

Project to run:

```python
import neptune.new as neptune

project = neptune.init_project()
project["dataset"].track_files("path_to_files")

run = neptune.init_run()
run["dataset"] = project["dataset"].fetch()
```

Run to run:

```python
import neptune.new as neptune

run = neptune.init_run()
run["dataset/v1.0"].track_files("path_to_files")

run_2 = neptune.init_run()
run_2["dataset/v1.0"] = run["dataset/v1.0"].fetch()
```
@ljstrnadiii To let you know, in the coming months we will be working on improvements to artifacts that should solve this issue, so that it behaves the way you would expect it to. If you have any more comments, input, or questions about artifacts, feel free to contact me at [email protected]. The scope of the incoming changes is still being clarified, so anything you share could help us improve the experience for users.
Describe the bug
This is not necessarily a bug, but an issue that prevents me from taking advantage of the links between runs and datasets in project metadata, and vice versa.
When you log an artifact to project metadata, it displays the number of runs using that artifact. You can also go to the dataset registered within a run, and it shows the number of runs using that dataset. The issue is that the two views make two possibly incompatible queries to populate these counts.
Reproduction
Log an artifact to project metadata:

```python
project_meta['datasets/some/sub/dir/dataset'].track_files(...)
```

and then link the dataset to a run:

```python
run['dataset'] = project_meta['datasets/some/sub/dir/dataset'].fetch()
```
When you navigate to the project metadata, look at "runs used", and click it, runs are queried by `datasets/some/sub/dir/dataset`, which means a run has to have the key `datasets/some/sub/dir/dataset`. But we just call it `dataset` in the run so that we don't have to click through the nested structure that is desired in our project metadata tracking. So it is not possible to track which runs use the artifact from the "runs used" link in the metadata. I can always make a manual query, but the link to the query is misleading.
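Putting the reproduction steps together, a minimal end-to-end sketch of the mismatch (the key paths are the ones from this report; the s3 path is a placeholder):

```python
import neptune.new as neptune

# Track the artifact under a nested key in project-level metadata.
project_meta = neptune.init_project()
project_meta["datasets/some/sub/dir/dataset"].track_files("s3://...")

# Link the same artifact to a run, but under a short, flat key.
run = neptune.init_run()
run["dataset"] = project_meta["datasets/some/sub/dir/dataset"].fetch()

# The "runs used" link on the project-level artifact queries runs by the key
# "datasets/some/sub/dir/dataset", so this run is never counted: its copy of
# the artifact lives under "dataset" instead.
```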
Expected behavior
I suppose it would be hard for the query to know which key you link the dataset to in the run, which would make generating the query behind the "runs used" link pretty challenging. Nonetheless, I would expect the "runs used" link on a dataset tracked in project metadata to avoid telling me it is used by 0 runs, because that is incorrect in this case.
This is somewhat user error, but there is also an implicit assumption, not obvious to the user, about which keys a run must use to link to a tracked artifact. I can always create a query to find all runs after copy-pasting the hash, and make sure I am consistent with what I call "dataset" in the run.
To solve this, can we simply specify which key we promise to use when linking tracked artifacts in runs? Something like:

```python
project_meta['datasets/some/sub/dir/dataset'].track_files("s3://...", run_key='training_dataset')
```

and then update the hyperlink to query with `training_dataset = <hash>`? An example of linking the artifact to a run could then just be:
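A minimal sketch of that linking step (`run_key` is the hypothetical parameter proposed above, not part of the current API; `project_meta` is the project handle from earlier):

```python
# Hypothetical: track_files(..., run_key='training_dataset') has promised the
# key that runs will use, so the "runs used" hyperlink can safely query
# training_dataset = <hash>.
run = neptune.init_run()
run['training_dataset'] = project_meta['datasets/some/sub/dir/dataset'].fetch()
```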
By "hyperlink to query" I mean the "0 runs" hyperlink you see in the screenshot below.