
Logo

PowerText

Automatic moderation for text in social media through PowerText: Terms of service violation and AI content detection
View the Demo Application for Content Regulators »
View the Video Demo for Common Users »

This project was created using publicly available APIs for educational purposes. Due to the nature of this project, this repository may include harmful or inappropriate content. The contents of this project should ONLY be used for NON-COMMERCIAL purposes.

Table of Contents

  1. Project Overview
  2. Authors
  3. Codes and Resources Used
  4. Getting Started
  5. Source Layer
  6. Data Processing Layer
  7. Development Layer
  8. Deployment Layer
  9. Feedback Layer
  10. Final Deliverables
  11. Acknowledgements

PowerText Implementation

Project Overview:

The objective of this project is to develop an automated content moderator for text content on social media platforms. Our proposed system is designed to accurately identify and flag content that violates terms of service across categories such as hate speech, cyberbullying, and advertisements, while also distinguishing between human-generated and AI-generated content. To achieve this, we leverage several commonly used Natural Language Processing (NLP) approaches, namely NaiveBayes, PassiveAggressive, XGBoost, CNN, LSTM, GRU, and Transformers.

For our models, we utilise a self-tagged dataset scraped from Reddit (a global social media platform with diverse user-generated content) and a high-quality dataset of AI-generated content produced with GPT-3.5 and GPT-4. This combined dataset provides comprehensive training on a variety of real-world posts, supporting accuracy, effectiveness, and applicability across social media domains.

The system offers two distinct end products: an automated content collection and screening service for social media platforms, and a user-side plug-in that checks posts and comments. Successful implementation will benefit social media platforms, content creators, and users, fostering a safe and healthy online environment.

Solution Architecture:

Solution Architecture

Keywords:

Data Pipeline, Sentiment Analysis, MNB, PassiveAggressive, XGBoost, Transformers, RoBERTa, hateBERT, CNN, LSTM, Hugging Face, Natural Language Processing, TOS Violation Analysis, Web Scraping, Data Visualisation

(back to top)

Authors:

  • Bikramjit Dasgupta
  • Lee Leonard
  • Lin Yongqian
  • Loh Hong Tak Edmund
  • Tang Hanyang
  • Tay Zhi Sheng
  • Wong Deshun

(back to top)

Codes and Resources Used

Python Version: 3.9.10

Built with: Microsoft Visual Studio Code, Google Colab, Streamlit, Django, Git

Notable Packages: praw, pandas, numpy, scikit-learn, xgboost, transformers, pytorch, torchvision, tqdm, streamlit, django (view requirements.txt for full list)

(back to top)

Getting Started

Prerequisites

Make sure you have installed all of the following on your development machine:

  • Python 3.8.X or 3.9.X

(back to top)

Installation

We recommend setting up a virtual environment to run this project.

1. Python Virtual Environment

Installing and Creating a Virtual Environment

pip install virtualenv
virtualenv <your_env_name>
source <your_env_name>/bin/activate

The requirements.txt file lists the Python libraries that the notebooks depend on; install them with:

pip install -r requirements.txt

2. Google Colab Environment

Our deep learning models were primarily trained on Google Colab Pro for access to the high-performance GPUs required to train complex neural networks.

You can set up the Google Colab environment to run our model training codes by executing the following:

from google.colab import drive
drive.mount('/content/drive')
content_path = "insert/path/to/your/data"

You will be prompted to log in with your Google Account. Simply replace content_path with the path to the directory where you have uploaded the data.

Alternatively, if you do not wish to authenticate with your Google Account, you may simply run the following code to retrieve the data from a permanent link:

import pandas as pd

train_path = 'https://drive.google.com/uc?export=download&id=1ZTfYOXeZLW57mLR7IegIFovi7FW1chS6'
val_path = 'https://drive.google.com/uc?export=download&id=1ZMJI7DyKMLHpHp-HBUO64kWbP6A-qj9k'

train_df = pd.read_csv(train_path)
val_df = pd.read_csv(val_path)

The additional modules required by each .ipynb notebook run on Google Colab are listed in the notebook itself and can be installed using pip.

(back to top)

Source Layer

📦WebScrapper
 ┗ 📜reddit_scrapper.py

Data Ingestion Sources

Our team extracted both structured and unstructured data from the following sources:

| Source | Description | Size |
| --- | --- | --- |
| Reddit | Extracted using the PRAW API endpoint | 16828 |
| ChatGPT | Generated using ChatGPT 3.5 and 4 | 6043 |
| Confounding Dataset | Manually created to include confounding and additional hate & ads | 26676 |

Reddit Scraper Agent

The Reddit scraper agent can be located at WebScrapper/reddit_scrapper.py

This script defines methods that scrape posts and comments from user-specified subreddits, up to given post and comment counts.

To initiate a reddit_scraper instance, execute the following Python code:

import praw
from reddit_scrapper import reddit_scrapper

reddit_agent = praw.Reddit(client_id='my_client_id', client_secret='my_client_secret', user_agent='my_user_agent')
reddit_scrapper(reddit_agent, ['MachineLearning', 'learnmachinelearning', 'GPT'], limit=10, comment_limit=10, topic="machinemind")

🔍To generate and specify the client_id, client_secret and user_agent, please follow the steps detailed here

Storing of Raw Data

Our scraped data are compiled as xlsx and csv files and stored within the Dataset/ folder, categorised into:

  • Reddit Content: Dataset/Reddit Tagged Content
  • AI Content: Dataset/AI Content
  • Confounding Content: Dataset/Additional Data

(back to top)

Data Processing Layer

📦data_processing
 ┣ 📜data_eda.ipynb
 ┗ 📜text_preprocessing.ipynb

Data Preprocessing Notebook

Following the compilation of data from the multiple sources as detailed in the previous section, we proceeded to process the data to prepare it for model training and exploratory data analysis.

The preprocessing pipeline consists of the following steps:

Data Concatenation

  1. Concatenating all datasets into a single DataFrame.
  2. Setting up target columns for each data entry.

Text Preprocessing

  1. Remove Punctuations
  2. Convert to lowercase
  3. Remove non-alphanumeric characters
  4. Remove stopwords
  5. Remove extra spaces, new lines, tabs
  6. Lemmatize and stem text
  7. Remove words with length < 2

To run the Data Preprocessing step, simply run the text_preprocessing.ipynb notebook in the data_processing/ folder.
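For reference, below is a minimal sketch of these preprocessing steps using NLTK; it is an illustration under assumed choices (English stopwords, WordNet lemmatizer, Porter stemmer) and may not match text_preprocessing.ipynb exactly.

import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer, PorterStemmer

nltk.download("stopwords")
nltk.download("wordnet")

STOPWORDS = set(stopwords.words("english"))
lemmatizer = WordNetLemmatizer()
stemmer = PorterStemmer()

def preprocess(text):
    # Lowercase, then strip punctuation and other non-alphanumeric characters
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    # Splitting on whitespace also collapses extra spaces, new lines and tabs
    tokens = text.split()
    # Remove stopwords, lemmatize and stem, then drop words shorter than 2 characters
    tokens = [stemmer.stem(lemmatizer.lemmatize(t)) for t in tokens if t not in STOPWORDS]
    return " ".join(t for t in tokens if len(t) >= 2)

print(preprocess("PowerText flags posts that violate the Terms of Service!"))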

Exploratory Data Analysis

With the processed data, we conducted a comprehensive Exploratory Data Analysis of the combined dataset. The analyses undertaken include:

  1. Class distribution
  2. N-gram analysis
  3. Cluster analysis (Document and word)
  4. Polarity analysis
  5. Perplexity analysis
  6. Burstiness analysis

To view the Exploratory Data Analysis, simply run the data_eda.ipynb notebook in the data_processing/ folder.

⚠️ Do remember to adjust the paths to where you store your dataset on your local machine!
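As an illustration of the n-gram analysis, the sketch below counts the most frequent bigrams in the processed text with scikit-learn; the file path and text column name are assumptions and should be adjusted to your local dataset.

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Assumed schema: a 'text' column holding the preprocessed posts (adjust path/column as needed)
df = pd.read_csv("path/to/your/processed_data.csv")

# Count the 20 most frequent bigrams across the corpus
vectorizer = CountVectorizer(ngram_range=(2, 2), max_features=20)
counts = vectorizer.fit_transform(df["text"].fillna(""))

bigram_freq = pd.Series(counts.sum(axis=0).A1, index=vectorizer.get_feature_names_out())
print(bigram_freq.sort_values(ascending=False))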

(back to top)

Development Layer

📦model_building
 ┣ 📂baselines
 ┃ ┣ 📂models
 ┃ ┃ ┣ 📜MultinomialNB_model.pkl
 ┃ ┃ ┣ 📜MultinomialNB_model_cv_dict.pkl
 ┃ ┃ ┣ 📜PassiveAggressiveClassifier_model.pkl
 ┃ ┃ ┣ 📜PassiveAggressiveClassifier_model_cv_dict.pkl
 ┃ ┃ ┣ 📜XGBClassifier_model.pkl
 ┃ ┃ ┗ 📜XGBClassifier_model_cv_dict.pkl
 ┃ ┣ 📂roc_curves
 ┃ ┃ ┣ 📜LGBMClassifier_roc_curve.png
 ┃ ┃ ┣ 📜MultinomialNB_roc_curve.png
 ┃ ┃ ┣ 📜PassiveAggressiveClassifier_roc_curve.png
 ┃ ┃ ┣ 📜SGDClassifier_roc_curve.png
 ┃ ┃ ┗ 📜XGBClassifier_roc_curve.png
 ┃ ┣ 📜baseline_models.ipynb
 ┃ ┗ 📜baseline_model_performance_cv.txt
 ┣ 📂CNN
 ┃ ┗ 📜CNN.ipynb
 ┣ 📂GRU
 ┃ ┗ 📜GRU.ipynb
 ┣ 📂LSTM
 ┃ ┗ 📜Single_LSTM_Learning_model.ipynb
 ┣ 📂Transfer_Learning
 ┃ ┣ 📂data
 ┃ ┃ ┗ 📜processed.csv
 ┃ ┣ 📜BERT.ipynb
 ┃ ┣ 📜hateBERT-variant-1.ipynb
 ┃ ┣ 📜hateBERT-variant-2.ipynb
 ┃ ┣ 📜hateBERT-variant-3.ipynb
 ┃ ┗ 📜RoBERTa.ipynb
 ┗ 📂Transformer
 ┃ ┗ 📜Transformer.ipynb

We proceeded to build and train a suite of models to predict potential TOS violations within the text input. The following models were built, trained, and evaluated:

Baseline Models

  1. Multinomial NaiveBayes Classifier
  2. PassiveAggressive Classifier
  3. XGBoost Classifier

The baseline models training notebook can be found at model_building/baselines/baseline_models.ipynb.
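For orientation, here is a minimal sketch of training one baseline (Multinomial Naive Bayes on TF-IDF features); the actual notebook may use different features, splits and hyperparameters, and the file path and column names below are assumptions.

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import classification_report

# Assumed schema: 'text' (preprocessed post) and 'target' (violation label)
df = pd.read_csv("path/to/your/processed_data.csv")

X_train, X_test, y_train, y_test = train_test_split(
    df["text"].fillna(""), df["target"], test_size=0.2, random_state=42)

# TF-IDF features feed the classifier; fit on training data only
vectorizer = TfidfVectorizer(max_features=20000)
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

model = MultinomialNB()
model.fit(X_train_vec, y_train)
print(classification_report(y_test, model.predict(X_test_vec)))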

Deep Learning Models

  1. Convolutional Neural Network (CNN)
  2. Gated Recurrent Unit
  3. Long Short-term Memory
  4. Basic Transformer
  5. Bidirectional Encoder Representations from Transformers (BERT)
  6. RoBERTa
  7. hateBERT (multiple variants)

The respective deep learning models training notebooks can be found at model_building/xxx/xxx.ipynb.

| Model | Precision | Recall | F1 |
| --- | --- | --- | --- |
| Multinomial Naive Bayes | 0.27 | 0.17 | 0.20 |
| PassiveAggressive | 0.67 | 0.55 | 0.60 |
| XGBoost | 0.79 | 0.42 | 0.53 |
| CNN | 0.33 | 0.31 | 0.32 |
| GRU | 0.33 | 0.34 | 0.33 |
| LSTM | 0.32 | 0.32 | 0.32 |
| Basic Transformer | 0.68 | 0.52 | 0.55 |
| BERT | 0.85 | 0.65 | 0.70 |
| RoBERTa | 0.80 | 0.63 | 0.68 |
| hateBERT variant 1 | 0.79 | 0.79 | 0.79 |
| hateBERT variant 2 | 0.89 | 0.77 | 0.82 |
| hateBERT variant 3 | 0.81 | 0.69 | 0.72 |

From initial testing, we observed that hateBERT performs the best for our task and given dataset. To increase robustness in evaluation, we further conducted a 10-fold Cross Validation on individual targets for the hateBERT model using variant 2.

| Target | Precision | Recall | F1 | n |
| --- | --- | --- | --- | --- |
| Hate Speech | 0.94 | 0.95 | 0.94 | 1958 |
| Privacy | 0.66 | 0.73 | 0.69 | 26 |
| Sexual | 0.67 | 0.57 | 0.61 | 46 |
| Impersonation | 0.60 | 0.46 | 0.52 | 26 |
| Illegal | 0.50 | 0.57 | 0.53 | 28 |
| Advertisement | 0.78 | 0.83 | 0.80 | 47 |
| AI Content | 0.98 | 0.98 | 0.98 | 684 |
| Neutral | 0.94 | 0.93 | 0.94 | 2175 |

The model weights of the hateBERT model have been saved for future deployment and are stored at analysis_system/models/final_model_weights_full.pth
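A hedged sketch of loading the saved weights for inference with PyTorch and Hugging Face transformers follows; it assumes the checkpoint is a full serialized model object with a sequence-classification head and that the tokenizer matches the hateBERT base model, so the actual label order and model head are defined by the training notebooks.

import torch
from transformers import AutoTokenizer

# Assumptions: full model object saved with torch.save, tokenizer compatible with the hateBERT base
tokenizer = AutoTokenizer.from_pretrained("GroNLP/hateBERT")
model = torch.load("analysis_system/models/final_model_weights_full.pth", map_location="cpu")
model.eval()

inputs = tokenizer("Buy my product now!!!", return_tensors="pt", truncation=True, padding=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(torch.softmax(logits, dim=-1))  # per-class probabilities; label order depends on training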

(back to top)

Deployment Layer

Product 1: Automatic Post Analysis Dashboard for Content Moderators

The first product was built using Streamlit, an open-source framework that allows developers to rapidly develop web applications for data science purposes.

To run the Automatic Post Analysis Dashboard, execute the following command in the CLI:

streamlit run analysis_system/Home.py

⚠️ Ensure that you have the Streamlit package installed! Do install the requirements via the requirements.txt if you have not done so!

The application should run on http://localhost:8501/.

Alternatively, you may access the deployed version here.
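For orientation, the sketch below shows a hypothetical minimal Streamlit page in the spirit of the Content Classification page; the real implementation lives in pages/2__Content_Classification.py and functions/model_inference.py and differs in detail.

# hypothetical_classifier_page.py -- illustrative only, not the repository's code
import streamlit as st
import torch
from transformers import AutoTokenizer

@st.cache_resource  # load the heavy model once per server process
def load_model():
    tokenizer = AutoTokenizer.from_pretrained("GroNLP/hateBERT")
    model = torch.load("analysis_system/models/final_model_weights_full.pth", map_location="cpu")
    model.eval()
    return tokenizer, model

st.title("PowerText: Content Classification")
post = st.text_area("Paste a post or comment to screen:")

if st.button("Classify") and post:
    tokenizer, model = load_model()
    inputs = tokenizer(post, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1)
    st.write(probs)  # label names are defined in the training notebooks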

The structure of the Automatic Post Analysis Dashboard for Content Moderators:

📦analysis_system
 ┣ 📂assets
 ┃ ┣ 📜capy.png
 ┃ ┣ 📜logo-colour.png
 ┃ ┗ 📜logo-white.png
 ┣ 📂cache_files
 ┃ ┗ 📜stopwords.txt
 ┣ 📂data_store
 ┃ ┣ 📂feedback
 ┃ ┃ ┗ 📜feedback.csv
 ┃ ┗ 📂scraped
 ┃ ┃ ┗ 📜reddit_content.csv
 ┣ 📂functions
 ┃ ┣ 📂__pycache__
 ┃ ┃ ┣ 📜content_explorer.cpython-39.pyc
 ┃ ┃ ┣ 📜DS.cpython-39.pyc
 ┃ ┃ ┣ 📜model_inference.cpython-39.pyc
 ┃ ┃ ┣ 📜SingleClassifier.cpython-39.pyc
 ┃ ┃ ┗ 📜text_preprocessing.cpython-39.pyc
 ┃ ┣ 📜content_explorer.py
 ┃ ┣ 📜DS.py
 ┃ ┣ 📜model_inference.py
 ┃ ┣ 📜polarity_analysis.py
 ┃ ┣ 📜reddit_scraper.py
 ┃ ┣ 📜SingleClassifier.py
 ┃ ┗ 📜text_preprocessing.py
 ┣ 📂models
 ┃ ┣ 📜final_model_weights_full.pth
 ┃ ┗ 📜PassiveAggressiveClassifier_model_dict.pkl
 ┣ 📂pages
 ┃ ┣ 📜1__Content_Explorer.py
 ┃ ┣ 📜2__Content_Classification.py
 ┃ ┗ 📜3__Sentiment_Polarity_Analysis.py
 ┣ 📂utils
 ┃ ┣ 📂__pycache__
 ┃ ┃ ┣ 📜design_format.cpython-39.pyc
 ┃ ┃ ┗ 📜utility.cpython-39.pyc
 ┃ ┣ 📜design_format.py
 ┃ ┗ 📜utility.py
 ┣ 📜Home.py
 ┗ 📜requirements.txt

Product 2: Text Classification for Common Users

The second product was built using Django, another framework that enables developers to build web apps rapidly.

To run the Text Classification for Common Users application, execute the following command in the CLI:

python Platform/manage.py runserver

⚠️ Ensure that you have Django installed! Do install the requirements via the requirements.txt if you have not done so!

The application should run on http://localhost:8000/.

The structure of the Text Classification for Common Users application:

📦Platform
 ┣ 📂core
 ┃ ┣ 📂migrations
 ┃ ┃ ┗ 📜__init__.py
 ┃ ┣ 📜admin.py
 ┃ ┣ 📜apps.py
 ┃ ┣ 📜handler_bert.py
 ┃ ┣ 📜models.py
 ┃ ┣ 📜tests.py
 ┃ ┣ 📜urls.py
 ┃ ┣ 📜views.py
 ┃ ┗ 📜__init__.py
 ┣ 📂Platform
 ┃ ┣ 📜asgi.py
 ┃ ┣ 📜settings.py
 ┃ ┣ 📜urls.py
 ┃ ┣ 📜wsgi.py
 ┃ ┗ 📜__init__.py
 ┣ 📂static
 ┃ ┣ 📂assets
 ┃ ┃ ┣ 📜.DS_Store
 ┃ ┃ ┗ 📜icon.png
 ┃ ┣ 📂css
 ┃ ┃ ┣ 📂webfonts
 ┃ ┃ ┃ ┣ 📜fa-brands-400.ttf
 ┃ ┃ ┃ ┣ 📜fa-brands-400.woff2
 ┃ ┃ ┃ ┣ 📜fa-regular-400.ttf
 ┃ ┃ ┃ ┣ 📜fa-regular-400.woff2
 ┃ ┃ ┃ ┣ 📜fa-solid-900.ttf
 ┃ ┃ ┃ ┣ 📜fa-solid-900.woff2
 ┃ ┃ ┃ ┣ 📜fa-v4compatibility.ttf
 ┃ ┃ ┃ ┗ 📜fa-v4compatibility.woff2
 ┃ ┃ ┣ 📜.DS_Store
 ┃ ┃ ┣ 📜all.css
 ┃ ┃ ┣ 📜all.min.css
 ┃ ┃ ┣ 📜emoji.css
 ┃ ┃ ┣ 📜styles.css
 ┃ ┃ ┗ 📜tailwind.css
 ┃ ┣ 📂js
 ┃ ┃ ┗ 📜app.js
 ┃ ┣ 📜.DS_Store
 ┃ ┗ 📜favicon.ico
 ┣ 📂templates
 ┃ ┣ 📜index.html
 ┃ ┣ 📜user_demo_1.html
 ┃ ┗ 📜user_demo_2.html
 ┣ 📜favicon.ico
 ┣ 📜final_model_weights_full.pth
 ┗ 📜manage.py

(back to top)

Feedback Layer

In this last layer, we intend to collect feedback from content moderators regarding the predicted target violations for each Reddit post. The feedback is collected via the Automatic Post Analysis Dashboard for Content Moderators.

The feedback is collected and saved into a csv file located at: analysis_system/data_store/feedback/feedback.csv.

This feedback will be incorporated into the next model retraining cycle.
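A minimal sketch of appending one feedback record to this file with pandas is shown below; the column names are assumptions for illustration, not the dashboard's actual schema.

import os
import pandas as pd

FEEDBACK_PATH = "analysis_system/data_store/feedback/feedback.csv"

# Hypothetical feedback record; column names are assumed, not taken from the dashboard code
record = pd.DataFrame([{
    "post_id": "abc123",
    "predicted_label": "advertisement",
    "moderator_label": "neutral",
}])

# Append to the CSV, writing the header only if the file does not exist yet
record.to_csv(FEEDBACK_PATH, mode="a", header=not os.path.exists(FEEDBACK_PATH), index=False)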

(back to top)

Final Deliverables

A copy of the read-only final deliverables (project report and project presentation slides) is provided in the Deliverables/ folder.

(back to top)

Acknowledgements

We would like to express our gratitude to Prof Sun Chenshuo and Ms Phoebe Chua for their invaluable guidance, support, and insights throughout the development of this project. Their expertise and encouragement have been instrumental in shaping our work.

(back to top)



Copyright (C) 2023. This project was created using publicly available APIs and was created for educational purposes. Any parts of this project should ONLY be used for NON-COMMERCIAL reasons. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses.
