Cyberbullying

Detecting instances of cyberbullying in text-based communications.

ML
Python+Django

Introduction

Cyberbullying is a pervasive and harmful problem that affects people of all ages across online platforms. This project is my attempt to help combat it by combining machine learning with web development.

To that end, I set out to build a tool that identifies likely instances of cyberbullying in online communication so that they can be flagged and addressed.

Project Overview

My aim is to develop a model that detects cyberbullying in text-based communications such as social media posts, comments, and messages. The model relies on machine learning algorithms trained on labeled datasets of cyberbullying content, and I hope it can contribute to curbing this harmful behavior online.

Key Features

1. Machine Learning Model Integration: Using Logistic Regression, a linear classification algorithm, I trained a model that analyzes text and predicts whether it contains cyberbullying behavior.

2. Natural Language Processing (NLP): Utilizing NLP techniques, I implemented various preprocessing steps to enhance the effectiveness of the model. These steps include:

  • Removing non-alphabetic characters: Eliminating irrelevant symbols and characters from the text data.
  • Tokenizing words: Breaking down sentences into individual words or tokens for analysis.
  • Removing stop words: Filtering out common words that do not carry significant meaning.
  • Lemmatizing words: Reducing words to their dictionary base form (lemma) to standardize vocabulary.
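The four preprocessing steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the project's actual code: the stop-word set and the lemma lookup table are small hypothetical stand-ins for the fuller resources an NLP library would provide, and the function name preprocess is my own.

```python
import re

# Small illustrative stop-word list (assumption); a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "you", "and", "so", "to", "of"}

# Toy lemma lookup standing in for a dictionary-backed lemmatizer (assumption).
LEMMAS = {"losers": "loser", "hates": "hate", "hated": "hate", "was": "be", "were": "be"}


def preprocess(text):
    """Clean a raw message and return a list of normalized tokens."""
    # 1. Replace every non-alphabetic character with a space.
    letters_only = re.sub(r"[^a-zA-Z]", " ", text)
    # 2. Lowercase and tokenize on whitespace.
    tokens = letters_only.lower().split()
    # 3. Drop stop words.
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # 4. Map each token to its lemma when the lookup table knows it.
    return [LEMMAS.get(t, t) for t in tokens]


print(preprocess("You are SUCH losers!!! Everyone hates you :("))
```

Replacing symbols with spaces rather than deleting them keeps words that were joined by punctuation from being merged into one token.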

3. Web Application Development: Leveraging the Django framework, I developed a user-friendly web application interface that allows users to:

  • Input text data for analysis.
  • Receive real-time feedback on the likelihood of cyberbullying behavior in the input text.

Dataset Information

The dataset used for training and testing the model consists of 24,784 text instances, each labeled as either offensive or non-offensive.
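To illustrate how a logistic regression classifier learns from labeled text of this kind, here is a minimal standard-library sketch that trains on bag-of-words counts with stochastic gradient descent. It is not the project's training code: the six example messages are made up for illustration and are not drawn from the dataset, and the function names, learning rate, and epoch count are all assumptions. In practice, a library implementation such as scikit-learn's LogisticRegression would normally be used instead.

```python
import math
from collections import Counter

# Tiny made-up training set (1 = offensive, 0 = non-offensive); illustration only.
TRAIN = [
    ("you are so stupid and ugly", 1),
    ("nobody likes you loser", 1),
    ("shut up you worthless idiot", 1),
    ("thanks for the help today", 0),
    ("great game last night", 0),
    ("see you at lunch tomorrow", 0),
]


def featurize(text):
    """Bag-of-words counts for a lowercased, whitespace-split message."""
    return Counter(text.lower().split())


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, text):
    """Probability that the text is offensive under the current parameters."""
    z = bias + sum(weights.get(w, 0.0) * c for w, c in featurize(text).items())
    return sigmoid(z)


def train(examples, epochs=200, lr=0.5):
    """Fit per-word weights and a bias by stochastic gradient descent on log loss."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for text, label in examples:
            # Gradient of the log loss with respect to z is (prediction - label).
            error = predict_proba(weights, bias, text) - label
            for word, count in featurize(text).items():
                weights[word] = weights.get(word, 0.0) - lr * error * count
            bias -= lr * error
    return weights, bias


weights, bias = train(TRAIN)
print(predict_proba(weights, bias, "you are a stupid loser") > 0.5)
print(predict_proba(weights, bias, "thanks for the great game") > 0.5)
```

Each word ends up with a weight that pushes the prediction toward one class or the other, which is why consistent preprocessing of the input text matters for this kind of model.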

Conclusion

As online communication channels proliferate, effective cyberbullying detection becomes increasingly important. This project combines a Logistic Regression classifier, NLP preprocessing, and a Django web interface into a tool that lets users check text for likely cyberbullying, and it reflects my commitment to using technology for social good.

Written by

Amrutha Aneesh