Reproduce graphs from original data #8

Open
krlmlr opened this issue Jun 16, 2023 · 14 comments · Fixed by #9

@krlmlr
Contributor

krlmlr commented Jun 16, 2023

This is done by inst/getdata.R, but I have trouble recreating yeast and USairports.

  • The Nature publications required for yeast are not found: cannot open URL 'https://www.nature.com/nature/journal/v417/n6887/extref/nature750-s1.doc': HTTP status was '404 Not Found'. The new location is linked at https://www.nature.com/articles/nature750#Sec8
  • USairports relies on ~/Downloads/1067890998_T_T100D_SEGMENT_ALL_CARRIER.csv

@gaborcsardi: Can you help with the .csv file?

Moving forward, I propose to separate the downloads from the creation of the graphs so that we're less reliant on external data storage here.
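
A minimal sketch of that separation, written in Python for illustration rather than in the R of inst/getdata.R. The names `ensure_raw_file`, `fetch_url`, and `raw_dir` are hypothetical and not part of this repository. The idea is that the download step only fills a local directory of raw files, and the graph-building step reads from that directory without touching the network.

```python
from pathlib import Path
from typing import Callable
from urllib.request import urlopen


def fetch_url(url: str) -> bytes:
    """Download a URL and return its body (requires network access)."""
    with urlopen(url) as response:
        return response.read()


def ensure_raw_file(
    raw_dir: Path,
    name: str,
    url: str,
    fetch: Callable[[str], bytes] = fetch_url,
) -> Path:
    """Return the local path of a raw file, downloading it only if missing.

    `raw_dir` is a hypothetical directory for raw data kept in the repo.
    """
    raw_dir.mkdir(parents=True, exist_ok=True)
    target = raw_dir / name
    if not target.exists():
        # Only reached the first time; later runs reuse the stored copy.
        target.write_bytes(fetch(url))
    return target
```

With the raw files committed, a dead link like the one above would no longer block rebuilding a graph, because the build scripts would only ever call something like `ensure_raw_file` and find the file already present.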

@gaborcsardi
Contributor

I don't have that CSV any more, sorry.

@krlmlr
Contributor Author

krlmlr commented Jun 16, 2023

Thanks. We could create that CSV from the graph, I suppose.
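
As a rough illustration of what creating a CSV from the graph could look like, here is a sketch in Python (not the package's R code). It assumes the edges are available as `(source, target, attributes)` tuples. The function name `edges_to_csv` and the `from`/`to` column names are hypothetical. Such a file would only hold what the graph retained, so it would not necessarily match the columns of the original download.

```python
import csv
import io


def edges_to_csv(edges, attribute_names):
    """Serialize (source, target, attributes) edges as CSV text.

    The "from"/"to" header names are an assumption for illustration;
    they are not the column names of the original file.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["from", "to", *attribute_names])
    for source, target, attributes in edges:
        row = [source, target]
        row.extend(attributes[name] for name in attribute_names)
        writer.writerow(row)
    return buffer.getvalue()
```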

I'm thrilled that the other graphs seem to be recreatable without problems. We'll need to compare the results, though.

@szhorvat
Member

Just to understand, is it necessary to recreate the networks from the original source or are you just testing the scripts that do so?

@krlmlr
Contributor Author

krlmlr commented Jun 17, 2023

No, it's not necessary, just good practice.

@maelle

maelle commented Jun 19, 2023

@krlmlr

> Moving forward, I propose to separate the downloads from the creation of the graphs so that we're less reliant on external data storage here.

Do you mean the package should contain a copy of the datasets? Or we store them somewhere else?

@krlmlr
Contributor Author

krlmlr commented Jun 19, 2023

I don't mind a copy of the raw data on GitHub.

@maelle

maelle commented Jun 19, 2023

So the task is to try to locate the yeast and airport data?

How do we compare the obtained graphs?
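
One possible way to compare, sketched in Python rather than R: reduce each graph to an order-independent summary (the set of vertex names plus a multiset of edges with their attributes) and check the summaries for equality. This assumes vertex names are stable identifiers and attribute values are hashable. `graph_signature` and `same_graph` are hypothetical helper names, not existing functions in this repository.

```python
from collections import Counter


def graph_signature(vertices, edges, directed=True):
    """Build an order-independent summary of a graph.

    `edges` is an iterable of (source, target, attributes) tuples,
    where `attributes` is a dict of hashable values.
    """
    def edge_key(source, target, attributes):
        # For undirected graphs, treat (a, b) and (b, a) as the same edge.
        ends = (source, target) if directed else tuple(sorted((source, target)))
        return ends, tuple(sorted(attributes.items()))

    # Counter keeps multiplicities, so parallel edges are compared too.
    return frozenset(vertices), Counter(edge_key(*edge) for edge in edges)


def same_graph(old, new, directed=True):
    """Compare two (vertices, edges) pairs regardless of row order."""
    return graph_signature(*old, directed=directed) == graph_signature(
        *new, directed=directed
    )
```

A check like this ignores the order in which vertices and edges were created, which may differ between the stored graph and a freshly rebuilt one even when the data is the same.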

@maelle

maelle commented Jun 19, 2023

Regarding the yeast data, it is probably in https://www.nature.com/articles/nature750 but I don't have access (and even if we get access, I suppose the data isn't really free to share 😅)

@krlmlr
Contributor Author

krlmlr commented Jun 19, 2023

Yes, yeast seems solved. The only tricky challenge is to reverse-engineer the USairports CSV file, but I'm not sure it's worth it.

@maelle

maelle commented Jun 19, 2023

So what are the TODOs?

@krlmlr
Contributor Author

krlmlr commented Jun 19, 2023

  • Find a place for the raw files in this repo
  • Split the getdata.R script, perhaps into one unified download script and one script per dataset?
  • Make sure that the generated data is unchanged

krlmlr closed this as completed in #9 Jun 19, 2023
@maelle

maelle commented Jun 19, 2023

Wait, so we need 3 new issues? This issue is not really closed then?

krlmlr reopened this Jun 19, 2023
@krlmlr
Contributor Author

krlmlr commented Jun 19, 2023

One issue is fine.
