Endpoint to delete an account #194

Open
PGijsbers opened this issue Sep 23, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@PGijsbers
Contributor

Deleting an account is currently a manual process.
@joaquinvanschoren can you let me know what "deleting an account" currently involves?

@PGijsbers PGijsbers added the enhancement New feature or request label Sep 23, 2024
@joaquinvanschoren
Contributor

It means:

  • Removing the account from the user database and other stores (e.g. elasticsearch)
  • If the user has uploaded datasets, flows, or runs, these should either be 'removed' or the uploader should be anonymized. Right now we need to ask authors what to do (luckily, none of the people who asked so far had created any resources, so we could simply delete their accounts).

How this can be implemented in practice:

  • A first step could be that authors get the option to remove their accounts only when they have no resources. That covers most use cases. If they do have resources, they'll get an error message that they need to remove those resources first.
  • Next, we can make this an option that users can select. The default would be that resources remain and the uploader becomes anonymous. If they opt for removal, we can remove the resources for them.
  • According to FAIR principles, it must be possible for people to remove data, but a record (meta-data) of the dataset should remain available. This is akin to archiving. E.g. for datasets, that means that the dataset still exists in the database, but the data itself is deleted. For flows and runs it is more vague since they are all meta-data. But e.g. models attached to a run could still be removed.
  • I believe this means that we should probably update the dataset delete option to allow removing a dataset while maintaining the meta-data. At the moment, I believe that the API will only actually delete a dataset if there are no resources dependent on it.
  • I think we should update the terms and conditions to make this clearer.
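
The first two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `InMemoryStore`, `ResourcePolicy`, `AccountHasResourcesError`, and `delete_account` are hypothetical names, not part of the existing API, and the in-memory store stands in for the user database and the other stores (e.g. elasticsearch).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Set


class ResourcePolicy(Enum):
    REFUSE = "refuse"        # step 1: only delete accounts that own no resources
    ANONYMIZE = "anonymize"  # step 2 default: keep resources, drop the uploader
    REMOVE = "remove"        # step 2 opt-in: delete the resources as well


class AccountHasResourcesError(Exception):
    """Raised when the account still owns resources and the policy is REFUSE."""


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for the user database and resource tables."""
    users: Set[int] = field(default_factory=set)
    # resource id -> uploader id (None once the uploader has been anonymized)
    uploader_of: Dict[int, Optional[int]] = field(default_factory=dict)


def delete_account(
    store: InMemoryStore,
    user_id: int,
    policy: ResourcePolicy = ResourcePolicy.REFUSE,
) -> List[int]:
    """Delete the account and return the ids of the resources that were affected."""
    if user_id not in store.users:
        raise KeyError(f"unknown user {user_id}")

    owned = [rid for rid, uid in store.uploader_of.items() if uid == user_id]
    if owned:
        if policy is ResourcePolicy.REFUSE:
            # The user has to remove their resources before the account can go.
            raise AccountHasResourcesError(
                f"user {user_id} still owns {len(owned)} resource(s)"
            )
        for rid in owned:
            if policy is ResourcePolicy.ANONYMIZE:
                store.uploader_of[rid] = None  # the resource stays, uploader is dropped
            else:
                del store.uploader_of[rid]     # the resource is removed as well

    store.users.remove(user_id)
    return owned
```

With `REFUSE` as the default, the function covers the "no resources" case on its own; the other two policies could be enabled later without changing the call signature.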

We should definitely discuss more. Maybe we should split it up into two issues, one for removing users without resources (which we can do fast), and one where we also take care of properly removing/archiving or anonymizing their resources.

@PGijsbers
Contributor Author

PGijsbers commented Sep 23, 2024

Thanks for the elaboration. The deletion of datasets themselves is interesting. On the one hand, people should be able to delete their data. On the other hand, entities owned by other users (tasks, benchmarking suites) may depend on it, so deletion goes against the promise of availability. On top of that, other users may have forked the data, which makes deletion ineffective to some degree. This is definitely worth discussing, and we should also explore the legal obligations.
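
One possible way to reconcile deletion with availability, sketched under assumptions: archive the dataset (drop the data, keep the meta-data record) whenever other entities depend on it, and only hard-delete it otherwise. The names `Dataset`, `Catalogue`, and `remove_dataset` are hypothetical, and whether a hard delete should ever be allowed is exactly the open policy question.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Dataset:
    """Hypothetical dataset record: meta-data plus a pointer to the data file."""
    dataset_id: int
    name: str
    data_path: Optional[str]  # None once the data itself has been removed
    archived: bool = False


@dataclass
class Catalogue:
    """Hypothetical stand-in for the dataset table and its dependency links."""
    datasets: Dict[int, Dataset] = field(default_factory=dict)
    # dataset id -> ids of tasks or suites that reference the dataset
    dependents: Dict[int, List[int]] = field(default_factory=dict)


def remove_dataset(catalogue: Catalogue, dataset_id: int) -> str:
    """Archive the dataset if anything depends on it, otherwise delete it."""
    dataset = catalogue.datasets[dataset_id]
    if catalogue.dependents.get(dataset_id):
        # Tasks or suites still reference this dataset: keep the record,
        # drop only the data itself.
        dataset.data_path = None
        dataset.archived = True
        return "archived"
    del catalogue.datasets[dataset_id]
    return "deleted"
```

This does not address forked copies of the data, which would have to be tracked separately.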


2 participants