Masader Bot

Overview

A Python script that searches arXiv for research papers, extracts dataset metadata using AI models, and validates the extracted information using Masader Examples.

Features

Search arXiv papers by keywords, month, and year
Extract detailed metadata using AI models (Claude, ChatGPT, Gemini)
Validate extracted metadata against a reference dataset
Clean and process LaTeX/PDF papers
Compute extraction costs and filling ratio
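
The README does not define the filling ratio, so the following is only a minimal sketch under one plausible reading: the fraction of metadata fields that hold a non-empty value. The function name filling_ratio and the field names in the example are hypothetical, not taken from the project.

```python
def is_filled(value) -> bool:
    """Treat None, blank strings, and empty containers as unfilled."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    if isinstance(value, (list, dict, tuple, set)):
        return len(value) > 0
    return True


def filling_ratio(metadata: dict) -> float:
    """Fraction of fields with a non-empty value (assumed definition)."""
    if not metadata:
        return 0.0
    filled = sum(1 for value in metadata.values() if is_filled(value))
    return filled / len(metadata)


# Hypothetical metadata record; the real schema may differ.
example = {"Name": "CIDAR", "License": "", "Year": 2024, "Tasks": []}
print(filling_ratio(example))  # 0.5
```

Two of the four example fields are filled, so the ratio is 0.5.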

Requirements

Python 3.8+
Dependencies:
- anthropic
- openai
- google-generativeai
- arxiv
- pdfplumber
- python-dotenv

Installation

git clone https://github.com/arbml/masader-bot.git
cd masader-bot
pip install -r requirements.txt

Usage

python prompt.py -k "arabic dataset" -m 3 -y 2024 -n claude-3-5-sonnet-latest
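
How the script turns the keyword, month, and year into an arXiv query is not shown here. One way it could be done, sketched with the standard library only, is to build a query string that uses the arXiv API's submittedDate range filter. The helper name month_query and the choice of the all: field prefix are assumptions for illustration.

```python
import calendar


def month_query(keywords: str, month: int, year: int) -> str:
    """Build an arXiv API search query limited to one calendar month."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    # monthrange returns (weekday of first day, number of days in month).
    last_day = calendar.monthrange(year, month)[1]
    start = f"{year:04d}{month:02d}010000"
    end = f"{year:04d}{month:02d}{last_day:02d}2359"
    return f'all:"{keywords}" AND submittedDate:[{start} TO {end}]'


print(month_query("arabic dataset", 3, 2024))
# all:"arabic dataset" AND submittedDate:[202403010000 TO 202403312359]
```

The resulting string could then be passed as the search query to the arXiv API or to a client library such as arxiv.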

Benchmarking

The following script evaluates several models on multiple datasets. Both the dataset names (-k) and the model names (-n) are passed as comma-separated lists:

python3 compare.py -k "ArabicMMLU,CIDAR" -m 2 -y 2024 -n gemini-1.5-flash,claude-3-5-sonnet-latest

Arguments

-k, --keywords: Comma-separated research keywords
-m, --month: Search month (1-12)
-y, --year: Search year
-n, --model_name: AI model to use (optional)
-c, --check_abstract: Whether to validate the abstract (optional)
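
The argument list above maps naturally onto argparse. The sketch below mirrors the documented flags; which arguments are required, the lack of a default model, and treating --check_abstract as a boolean switch are assumptions, not confirmed details of prompt.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented flags (types and defaults assumed)."""
    parser = argparse.ArgumentParser(description="Search arXiv and extract dataset metadata")
    parser.add_argument("-k", "--keywords", type=str, required=True,
                        help="Comma-separated research keywords")
    parser.add_argument("-m", "--month", type=int, choices=range(1, 13), required=True,
                        metavar="MONTH", help="Search month (1-12)")
    parser.add_argument("-y", "--year", type=int, required=True, help="Search year")
    parser.add_argument("-n", "--model_name", type=str, default=None,
                        help="AI model to use (optional)")
    parser.add_argument("-c", "--check_abstract", action="store_true",
                        help="Whether to validate the abstract (optional)")
    return parser


# Parse the example command from the Usage section.
args = build_parser().parse_args(
    ["-k", "arabic dataset", "-m", "3", "-y", "2024", "-n", "claude-3-5-sonnet-latest"]
)
keywords = [k.strip() for k in args.keywords.split(",")]
print(keywords, args.month, args.year, args.model_name, args.check_abstract)
```

Restricting --month with choices makes argparse reject values outside 1-12 before any search is attempted.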

Output

Generates a results.json file containing:

Extracted metadata
Extraction cost
Validation score
Configuration details
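
The exact schema of results.json is not documented here, so the sketch below avoids assuming key names. It loads the file and reports each top-level entry with its type, which is a quick way to inspect what a run produced. The helper name summarize_results is hypothetical.

```python
import json
from pathlib import Path


def summarize_results(path) -> dict:
    """Map each top-level key of a JSON file to the type name of its value."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    # Non-object roots (for example a list of records) get a single entry.
    return {"<root>": type(data).__name__}


# Only inspect the file if a previous run has produced it.
if Path("results.json").exists():
    for key, type_name in summarize_results("results.json").items():
        print(f"{key}: {type_name}")
```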

License

[Add your license here]
