Welcome to the LLM Projects Archive! This repository serves as a curated collection of projects related to Large Language Models (LLMs). If you are interested in exploring and contributing to projects that leverage LLMs for various applications, you're in the right place.
Large Language Models (LLMs), such as GPT-3, BERT, and others, have revolutionized natural language processing and understanding. This repository aims to bring together a diverse set of projects that utilize LLMs for different purposes. Whether it's text generation, sentiment analysis, summarization, or any other application, this archive aims to showcase the versatility and creativity of LLM-based projects.
-
Introduction to Huggingface, Spacy, and PyTorch:
- Description: This repository provides an introduction to Hugging Face, PyTorch, and spaCy, focusing on building NLP projects. It includes links to several other repositories, each containing notebooks to help you get familiar with these libraries.
- Repository: Introduction to Huggingface, Spacy, and PyTorch
-
Introduction to LLMs:
- Description: A GitHub project offering a beginner-friendly guide and hands-on examples for understanding and working with Large Language Models (LLMs) in the field of Natural Language Processing (NLP).
- Repository: Introduction to LLMs
-
Customize Word Embeddings for LLMs:
- Description: A GitHub project focused on tailoring word embeddings for Large Language Models, so that the fine-tuned representations improve performance in domain-specific applications.
- Repository: Customize Word Embeddings for LLMs
-
Create Word Embeddings using Word2Vec and GloVe:
- Description: This repository focuses on creating word embeddings using Word2Vec and GloVe techniques, utilizing the IMDB dataset for experimentation.
- Repository: Create Word Embeddings using Word2Vec and GloVe
-
Finetune Open-source LLMs on Custom Data:
- Description: A GitHub project focused on fine-tuning open-source LLMs on custom datasets.
- Repository: Finetune Open-source LLMs on custom data
-
Build & evaluate Retrieval-Augmented-Generation pipelines:
- Description: A GitHub project focused on building and evaluating Retrieval-Augmented Generation (RAG) pipelines.
- Repository: Build & evaluate Retrieval-Augmented-Generation pipelines
-
Finetune Llama using QLoRA Method:
- Description: A GitHub project for fine-tuning the Llama language model using the QLoRA (Quantized Low-Rank Adaptation) method for enhanced natural language understanding.
- Repository: Finetune Llama using QLoRA Method
-
Conversational AI system using LLMs on E-commerce Data:
- Description: A conversational AI system that uses Large Language Models (LLMs) for customer interaction and support, tailored to e-commerce data.
- Repository: Conversationl AI system using LLMs on E-commerce Data
-
Text Classification using Transformer Encoder Model:
- Description: This repository focuses on text classification using a Transformer Encoder model, utilizing the IMDB dataset for experimentation.
- Repository: Text Classification using Transformer Encoder Model
-
Language Translation using Transformer Decoder Model:
- Description: This repository contains code for language translation using the Transformer Decoder Model. You'll learn about the Transformer architecture and apply it to a machine translation problem.
- Repository: Language Translation using Transformer Model
-
Text Classification using BERT Model:
- Description: This repository contains code for building a text classification model using the BERT (Bidirectional Encoder Representations from Transformers) Model. The IMDB dataset has been utilized for this experiment.
- Repository: Text Classification using BERT Model
-
Build RAG-pipelines using Llama-Index:
- Description: Build a RAG pipeline using Llama-Index.
- Repository: Build RAG-pipelines using Llama-Index
-
Finetune GPT2 Model on downstream tasks:
- Description: This repository demonstrates how to fine-tune the GPT-2 model on a custom dataset.
- Repository: Finetune GPT2 Model on downstream tasks
-
PEFT for Text summarization:
- Description: This repository demonstrates how to apply parameter-efficient fine-tuning (PEFT) to a text summarization task.
- Repository: PEFT for Text summarization
-
Finetune T5-Model_for Text Summary:
- Description: This repository demonstrates how to fine-tune the T5 model on a custom dataset for a text summarization task.
- Repository: Finetune T5-Model_for Text Summary
-
Finetune Llama2 and Mistral7B using Langchain:
- Description: This repository demonstrates how to fine-tune the Llama 2 and Mistral 7B models on a custom dataset using LangChain.
- Repository: Finetune Llama2 and Mistral7B using Langchain
-
Text Classification using Naive Bayes Classifier:
- Description: This repository demonstrates text classification using a multinomial Naive Bayes classifier. The IMDB dataset is used for this experiment.
- Repository: Text Classification using Naive Bayes Classifier
-
Build a Custom NER Model using Spacy:
- Description: This repository contains code for building a custom Named Entity Recognition (NER) model using the spaCy library. The Medical NER dataset has been utilized for this experiment.
- Repository: Build a Custom NER Model using Spacy
-
Sentiment Analysis using LSTM Model:
- Description: This repository contains a Python notebook for Sentiment Analysis using the LSTM (Long Short-Term Memory) model. The IMDB dataset has been utilized for this experiment.
- Repository: Sentiment Analysis using LSTM Model
-
Build a Forecasting Model using RNN:
- Description: This repository contains code for building a forecasting model using Recurrent Neural Networks (RNNs). A climate-related dataset has been utilized for this experiment.
- Repository: Build a Forecasting Model using RNN
We encourage you to contribute to this archive by adding your own LLM-related projects or discovering new ones. Follow these steps to contribute:
- Fork the repository.
- Add your project information to the `projects.md` file. Include the project name, a brief description, and the GitHub repository link.
- If your project falls into a specific category (e.g., sentiment analysis, chatbots, translation), please categorize it accordingly.
- Submit a pull request.
Please adhere to the contributing guidelines for a smooth collaboration.
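As a rough illustration of the steps above, a new entry in `projects.md` might look like the sketch below. The layout simply mirrors the entries in this README; the project name, description, category label, and URL are all placeholders rather than a real project, and the file itself may follow a different format.

```markdown
<!-- Hypothetical entry: replace every placeholder before submitting -->
Your Project Name:
- Description: One or two sentences on what the project does and which LLM or library it uses.
- Category: e.g., sentiment analysis
- Repository: https://github.com/<your-username>/<your-repo>
```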
This repository and its contents are open-sourced under the MIT License. Feel free to use, modify, and distribute these projects in accordance with the terms specified in the license.
If you encounter any issues or have suggestions for improvement, please open an issue in the Issues section of this repository.
The code has been tested on Windows. It should work on other operating systems but has not yet been tested there. If you run into any issue with installation or otherwise, please contact me on LinkedIn.
I’m a seasoned Data Scientist and founder of TowardsMachineLearning.Org. I've worked on various Machine Learning, NLP, and cutting-edge deep learning frameworks to solve numerous business problems.