Twitter Sentiment Analysis
CPE342 Mini Project

Impact
Automated emotional tone classification of social media text with 85%+ precision.
Overview
The Challenge: Listening to the Digital Noise
Social media is a firehose of raw human emotion. For brands and researchers, manually reading through thousands of tweets to gauge public sentiment is impossible. We needed a scalable, automated way to "listen" to the digital noise and categorize it into meaningful emotional buckets: Positive, Negative, or Neutral.
The Solution: An NLP Classification Pipeline
We built a robust sentiment analysis system that transforms unstructured text into structured emotional insights. Using Scikit-learn as our engine and Streamlit as our interactive dashboard, we created a tool that provides real-time analysis for any user input.

Figure: Twitter Sentiment Analysis Hero
Technical Architecture
The system follows a classic machine learning pipeline, optimized for short-form text processing.


Figure: System Architecture
1. Text Preprocessing Pipeline
Tweets are messy. Our pipeline cleans the "digital noise" before it hits the model:
- Tokenization: Splitting sentences into individual words.
- Stopword Removal: Filtering out common words (the, is, at) that don't carry emotional weight.
- Stemming: Reducing words to their root form (e.g., "loving" -> "love") to decrease vocabulary dimensionality.
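The three steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the project's actual code: the stopword list, the lowercasing step, and the `toy_stem` suffix rule are all assumptions made for the example. A full stemmer such as Porter's handles far more cases than this toy rule does.

```python
import re

# Tiny illustrative stopword list (assumption); real lists are much longer.
STOPWORDS = {"the", "is", "at", "a", "an", "and", "of", "to", "in", "this", "i"}

# Suffixes removed by the toy stemmer, checked in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, remove stopwords, then stem each remaining token."""
    return [toy_stem(t) for t in remove_stopwords(tokenize(text))]
```

With this rule, "loving", "loved", and "loves" all collapse to the same token, which is the vocabulary reduction that stemming is meant to achieve. The stem itself does not have to be a dictionary word.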
2. Multi-Model Benchmarking
We evaluated three classifiers on accuracy, training speed, and deployment efficiency, using a shared comparison framework to find the best performer for the task.
- Naive Bayes: Fast and surprisingly effective for text classification.
- Logistic Regression: Provides clear probabilistic interpretations of sentiment.
- Random Forest: Captures non-linear relationships between word combinations.
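A comparison framework for the three models listed above might look like the sketch below. It assumes TF-IDF features (`TfidfVectorizer`), which the write-up does not specify, and the hyperparameters and toy tweets are invented for illustration. The names `build_models` and `benchmark` are hypothetical helpers, not the project's own.

```python
import time

from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented toy data; the real project trains on a labeled Kaggle dataset.
TRAIN = [
    ("I love this phone, it is amazing", "Positive"),
    ("What a great day, feeling happy", "Positive"),
    ("Best service I have ever had", "Positive"),
    ("I hate waiting, this is terrible", "Negative"),
    ("Worst update ever, so annoying", "Negative"),
    ("The app keeps crashing, awful experience", "Negative"),
    ("The meeting starts at noon", "Neutral"),
    ("Just landed at the airport", "Neutral"),
    ("New version was released today", "Neutral"),
]
TEST = [
    ("I love this great service", "Positive"),
    ("This is terrible and annoying", "Negative"),
    ("The flight landed today", "Neutral"),
]


def build_models():
    """Return one vectorizer + classifier pipeline per candidate model."""
    return {
        "NB": make_pipeline(TfidfVectorizer(), MultinomialNB()),
        "LR": make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000)),
        "RF": make_pipeline(
            TfidfVectorizer(),
            RandomForestClassifier(n_estimators=100, random_state=0),
        ),
    }


def benchmark(models, train, test):
    """Fit each model and record its test accuracy and training time."""
    x_train, y_train = zip(*train)
    x_test, y_test = zip(*test)
    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        model.fit(list(x_train), list(y_train))
        elapsed = time.perf_counter() - start
        results[name] = {
            "accuracy": model.score(list(x_test), list(y_test)),
            "train_seconds": elapsed,
        }
    return results
```

Calling `benchmark(build_models(), TRAIN, TEST)` returns one entry per model, so accuracy and training time can be compared side by side. With data this small the numbers say nothing about real performance.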
Dataset
We used the Kaggle Sentiment Analysis Dataset, which provides a rich diversity of labeled tweets, including metadata like time of posting and user demographics.
Interactive Web Application
Built with Streamlit, the application provides:
- Real-time Prediction: Type any tweet and get an instant sentiment classification.
- Confidence Scores: Visualize how "sure" the model is about its prediction.
- Model Selection: Toggle between models (NB, LR, RF) to see how they interpret the same text differently.
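The features above could be wired together roughly as follows. This is a hedged sketch rather than the deployed app: `train_demo_model`, `predict_with_confidence`, and `render_app` are hypothetical names, the demo model is trained on a handful of invented tweets, and confidence is taken to mean the class probabilities from `predict_proba`.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def train_demo_model():
    """Fit a tiny stand-in model on invented tweets (illustration only)."""
    texts = [
        "I love this phone", "What a great day",
        "I hate waiting", "Worst update ever",
        "The meeting starts at noon", "Just landed at the airport",
    ]
    labels = ["Positive", "Positive", "Negative", "Negative", "Neutral", "Neutral"]
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    return model.fit(texts, labels)


def predict_with_confidence(model, text):
    """Return the predicted label and a {class: probability} mapping."""
    probabilities = model.predict_proba([text])[0]
    scores = {str(c): float(p) for c, p in zip(model.classes_, probabilities)}
    label = max(scores, key=scores.get)
    return label, scores


def render_app(models):
    """Draw the UI; `models` maps a display name to a fitted model."""
    # Imported lazily so the helpers above work without Streamlit installed.
    import streamlit as st

    st.title("Twitter Sentiment Analysis")
    model_name = st.selectbox("Model", list(models))
    tweet = st.text_area("Tweet text")
    if st.button("Analyze") and tweet.strip():
        label, scores = predict_with_confidence(models[model_name], tweet)
        st.write(f"Predicted sentiment: {label}")
        for name, probability in scores.items():
            st.write(f"{name}: {probability:.0%}")
            st.progress(min(probability, 1.0))
```

In a script launched with `streamlit run`, ending the file with `render_app({"LR": train_demo_model()})` would show the form. Passing several fitted models in the dictionary gives the model toggle described above.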

Figure: Application Interface
Why This Matters
This project serves as a foundation for understanding how NLP can be used to monitor brand health, public opinion, and social trends. By combining powerful ML models with an intuitive UI, we bridge the gap between complex data science and accessible user interfaces.
Tags: NLP, Machine Learning, Streamlit, Scikit-learn



Siwarat Laoprom © 2026