The project currently contains evaluation code for two datasets of hard examples for vision-language models (VLMs).
The two benchmarks are:
- SugarCrepe (available here)
- MMBench (available at the OpenCompass project page)
"...the go to statement should be abolished..." [1].
Based on a few executions, the prompt
"Question: The following is a multiple choice question. Choose an answer by its number
...: \n 1. There is a tower in the image\n 2. There is a castle in the image.\n Answer:"
always returns A, even though the options are numbered.
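A minimal sketch of how such a numbered multiple-choice prompt can be built and the model's reply checked. The function names and template here are assumptions for illustration, not the project's actual code; the parser simply reports when a reply (such as a bare "A") cannot be mapped back to an option number:

```python
def build_mc_prompt(question, options):
    """Format a numbered multiple-choice prompt (hypothetical template)."""
    lines = [
        "Question: The following is a multiple choice question. "
        "Choose an answer by its number",
        question + ":",
    ]
    for i, opt in enumerate(options, start=1):
        lines.append(f" {i}. {opt}")
    lines.append(" Answer:")
    return "\n".join(lines)


def parse_answer(reply, n_options):
    """Return the chosen option number, or None if the reply contains
    no valid number (e.g. the model answered with the letter 'A')."""
    for token in reply.split():
        digits = token.rstrip(".")
        if digits.isdigit():
            k = int(digits)
            if 1 <= k <= n_options:
                return k
    return None


prompt = build_mc_prompt(
    "What is shown",
    ["There is a tower in the image", "There is a castle in the image."],
)
print(prompt)
print(parse_answer("A", 2))  # a literal 'A' reply maps to no option number
```

A parser like this makes the failure mode above explicit: a letter reply is flagged as unparseable instead of being silently scored.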
To install and run LLaMA 2 with the `transformers`
Python package (by Hugging Face), you will need to create a read access token for the LLaMA resource. More information can be found here.
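As a sketch, the token is passed to `from_pretrained` when loading the model. The model id below (`meta-llama/Llama-2-7b-hf`) and the helper name are assumptions; pick the variant your setup uses, and note that the gated LLaMA 2 repositories also require an approved access request on Hugging Face:

```python
# Requires: pip install transformers
# (plus an approved access request for the LLaMA 2 weights on Hugging Face)

MODEL_ID = "meta-llama/Llama-2-7b-hf"  # assumed model id; choose the variant you need


def load_llama(token):
    """Load LLaMA 2 with a Hugging Face read token.

    The import lives inside the function so the sketch can be read
    without transformers installed; calling it downloads the weights.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, token=token)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, token=token)
    return tokenizer, model
```

Alternatively, running `huggingface-cli login` once stores the token locally, after which the `token=` argument can be omitted.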