ARBML is a group of researchers working on democratizing Arabic NLP research and development:

  • 🙋‍♀️ All about Arabic NLP and ML, open source for the win!
  • 🏵️ Contribution guidelines - open an issue and, once given the go-ahead, submit a PR.
  • 👩‍💻 Some repos have specific contribution guidelines.
  • 📝 Remember to cite if you use one of our resources.

Pinned

  1. ARBML Public

    Implementations of many Arabic NLP and CV projects, offering a real-time experience through several interfaces such as web, command line, and notebooks.

    JavaScript 395 45

  2. klaam Public

    Arabic speech recognition, classification and text-to-speech.

    Jupyter Notebook 376 74

  3. masader Public

    The largest public catalogue for Arabic NLP and speech datasets. It lists more than 500 datasets, each annotated with more than 25 attributes.

    JavaScript 153 25

  4. Calliar Public

    A dataset for online Arabic calligraphy. A collection of 2500 annotated calligraphic styles.

    Jupyter Notebook 146 16

  5. tkseem Public

    Arabic Tokenization Library. It provides many tokenization algorithms.

    Jupyter Notebook 95 18

  6. CIDAR Public

    Instruction dataset for Arabic with 10,000 instruction and output pairs. CIDAR can be used to fine-tune LLMs to follow instructions.

    Jupyter Notebook 33 3

Repositories

Showing 10 of 32 repositories
  • masader_bot Public
    Python 0 0 1 0 Updated Dec 25, 2024
  • masader_form Public
    Python 0 0 0 0 Updated Dec 23, 2024
  • masader Public

    The largest public catalogue for Arabic NLP and speech datasets. It lists more than 500 datasets, each annotated with more than 25 attributes.

    JavaScript 153 GPL-3.0 25 3 0 Updated Dec 22, 2024
  • masader-webservice
    Python 5 MIT 5 2 1 Updated Dec 21, 2024
  • Calliar Public

    A dataset for online Arabic calligraphy. A collection of 2500 annotated calligraphic styles.

    Jupyter Notebook 146 MIT 16 2 0 Updated Jun 24, 2024
  • dar Public

    A simple semi-supervised approach for creating Hugging Face dataset loading scripts and uploading them to the Hub.

    Python 11 Apache-2.0 1 1 0 Updated Jun 23, 2024
  • arbml.github.io
    HTML 0 2 1 0 Updated May 10, 2024
  • .github Public
    1 1 0 0 Updated Apr 13, 2024
  • CIDAR-v2 Public
    Jupyter Notebook 5 1 2 0 Updated Mar 30, 2024
  • cidar_human_eval
    Python 1 1 1 0 Updated Mar 3, 2024