Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raises error when dataset is an empty list in NanoBEIREvaluator #3122

Merged
merged 2 commits into from
Dec 10, 2024

Conversation

JINO-ROHIT
Copy link
Contributor

Raises an error when dataset names are a empty list in NanoBEIREvaluator.

Fixes #3103

Who can review?
@tomaarsen

@tomaarsen tomaarsen requested a review from Copilot December 9, 2024 09:31

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 1 out of 1 changed files in this pull request and generated no suggestions.

@@ -420,6 +420,8 @@ def _load_dataset(self, dataset_name: DatasetNameType, **ir_evaluator_kwargs) ->
)

def _validate_dataset_names(self):
if self.dataset_names == []:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if self.dataset_names == []:
if len(self.dataset_names) == 0:

This is a personal preference, but I prefer checking lengths like this :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on second thought, theres also this case when datasets = [''], an empty string,

currently it recognizes as an invalid dataset

ValueError: Dataset(s) [''] not found in the NanoBEIR collection.Valid dataset names are: ['climatefever', 'dbpedia', 'fever', 'fiqa2018', 'hotpotqa', 'msmarco', 'nfcorpus', 'nq', 'quoraretrieval', 'scidocs', 'arguana', 'scifact', 'touche2020']

would it be nicer to consider it as an empty dataset and use this check instead?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we have to worry about this case, I don't think many people would use [""] and if they do, the error is already pretty descriptive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool then

@tomaarsen tomaarsen merged commit 58d68ac into UKPLab:master Dec 10, 2024
9 checks passed
@tomaarsen
Copy link
Collaborator

Thanks again!

  • Tom Aarsen

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NanoBeirEvaluator takes in empty dataset
2 participants