An interactive NLP-powered application that performs sentiment and readability analysis on raw text or article URLs. Built with Python and Streamlit, it allows users to upload datasets, supply custom stopword and dictionary files, and download detailed analysis results in Excel format.
- Accepts `.csv` or `.xlsx` files with either text or URLs
- Uses NLP techniques to analyze sentiment and structure of the text
- Calculates metrics like Polarity Score, Subjectivity, FOG Index, Complex Word Count, etc.
- Allows users to upload multiple stopword files and separate positive/negative dictionaries
- Offers real-time previews and Excel export of results
- Positive Score / Negative Score
- Polarity Score: `(pos - neg) / (pos + neg)`
- Subjectivity Score: `(pos + neg) / total_words`
- Avg Sentence Length
- Percentage of Complex Words (words with > 2 syllables)
- FOG Index
- Syllables Per Word
- Personal Pronouns Count
- Avg Word Length
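
As a rough illustration of the formulas above, the sketch below computes the polarity, subjectivity, and FOG scores from plain counts. The function names and the zero-division guards are assumptions made for this example and are not taken from `analyzer.py`. The FOG function follows the standard Gunning formula, with the complex-word share expressed on a 0-100 scale; the project may scale it differently.

```python
def polarity_score(pos: int, neg: int) -> float:
    """(pos - neg) / (pos + neg); ranges from -1 (negative) to 1 (positive)."""
    total = pos + neg
    # Assumption: return 0.0 when no sentiment words were found,
    # rather than dividing by zero.
    return (pos - neg) / total if total else 0.0


def subjectivity_score(pos: int, neg: int, total_words: int) -> float:
    """(pos + neg) / total_words; share of words that carry sentiment."""
    # Assumption: empty text scores 0.0.
    return (pos + neg) / total_words if total_words else 0.0


def fog_index(total_words: int, total_sentences: int, complex_words: int) -> float:
    """Standard Gunning fog: 0.4 * (avg sentence length + % complex words)."""
    if not total_words or not total_sentences:
        return 0.0
    avg_sentence_length = total_words / total_sentences
    pct_complex = 100 * complex_words / total_words
    return 0.4 * (avg_sentence_length + pct_complex)


if __name__ == "__main__":
    print(polarity_score(pos=3, neg=1))
    print(subjectivity_score(pos=3, neg=1, total_words=20))
    print(fog_index(total_words=100, total_sentences=5, complex_words=10))
```

A text with 100 words, 5 sentences, and 10 complex words has an average sentence length of 20 and a complex-word share of 10%, which gives a FOG index of about 12.
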
- Tokenization (NLTK) to split text into words and sentences
- Stopword Removal using custom files to clean text
- Lexicon-based Sentiment Analysis with user-defined dictionaries
- Regex-based Parsing for pronouns and syllables
- Structural Readability Metrics using psycholinguistic heuristics (e.g., FOG index)
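
The regex-based parsing mentioned above could look something like the following sketch. The pronoun list, the handling of the all-caps "US", and the syllable heuristic (count vowel groups, discount a trailing "es" or "ed") are assumptions chosen for illustration; the actual rules in `analyzer.py` may differ.

```python
import re

# Assumption: a small, illustrative set of personal pronouns.
PRONOUN_RE = re.compile(r"\b(I|we|my|ours|us)\b", re.IGNORECASE)

# A run of consecutive vowels is treated as one syllable.
VOWEL_GROUP_RE = re.compile(r"[aeiouy]+")


def count_personal_pronouns(text: str) -> int:
    """Count pronoun matches, skipping the all-caps 'US' (likely the country)."""
    return sum(1 for match in PRONOUN_RE.findall(text) if match != "US")


def count_syllables(word: str) -> int:
    """Approximate the syllable count of a single word."""
    word = word.lower()
    count = len(VOWEL_GROUP_RE.findall(word))
    # Heuristic: a trailing "es" or "ed" is usually silent.
    if word.endswith(("es", "ed")) and count > 1:
        count -= 1
    # Every non-empty word is given at least one syllable.
    return max(count, 1) if word else 0


def is_complex(word: str) -> bool:
    """A complex word has more than two syllables."""
    return count_syllables(word) > 2


if __name__ == "__main__":
    print(count_personal_pronouns("We think the US economy helps us and my team."))
    for w in ("readability", "jumped", "text"):
        print(w, count_syllables(w), is_complex(w))
```

Heuristics like this miscount some words (silent vowels, for example), so the results are approximations rather than dictionary-accurate syllable counts.
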
git clone https://github.com/yourusername/semantic-readability-analyzer.git
cd semantic-readability-analyzer
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install -r requirements.txt
streamlit run streamlit_app.py

semantic-readability-analyzer/
├── streamlit_app.py # Main UI logic
├── analyzer.py # All NLP logic and metric computations
├── scraper.py # Article extraction from URLs
├── requirements.txt # Python dependencies
- `.csv` or `.xlsx` file with a text or URL column
- One or more `.txt` stopword files
- One `.txt` positive dictionary file
- One `.txt` negative dictionary file
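
One way the uploaded word lists might be turned into lookup sets and applied to a tokenized text is sketched below. It assumes each `.txt` file holds one term per line and that the file contents have already been decoded to strings; the helper names are hypothetical and do not come from the project.

```python
from typing import Iterable


def parse_word_list(content: str) -> set[str]:
    """Parse one word-list file (assumed: one term per line) into a lowercase set."""
    return {line.strip().lower() for line in content.splitlines() if line.strip()}


def merge_stopwords(contents: Iterable[str]) -> set[str]:
    """Combine several stopword files into a single set."""
    merged: set[str] = set()
    for content in contents:
        merged |= parse_word_list(content)
    return merged


def score_tokens(
    tokens: list[str],
    stopwords: set[str],
    positive: set[str],
    negative: set[str],
) -> tuple[int, int, int]:
    """Return (positive count, negative count, word count after stopword removal)."""
    cleaned = [t.lower() for t in tokens if t.lower() not in stopwords]
    pos = sum(t in positive for t in cleaned)
    neg = sum(t in negative for t in cleaned)
    return pos, neg, len(cleaned)


if __name__ == "__main__":
    stopwords = merge_stopwords(["the\nand\n", "a\nof\n\n"])
    positive = parse_word_list("good\ngreat")
    negative = parse_word_list("bad")
    tokens = ["The", "results", "were", "good", "and", "great", "not", "bad"]
    print(score_tokens(tokens, stopwords, positive, negative))
```

Using sets keeps each membership check constant-time, which matters when the dictionaries contain thousands of terms and the dataset has many rows.
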
- Preview of top 10 analyzed rows
- Downloadable `.xlsx` file with all computed scores
- Python 3.10+
- Streamlit
- Pandas
- BeautifulSoup
- NLTK
- Financial sentiment analysis on scraped articles
- Readability scoring for educational or legal text
- Preprocessing engine for larger NLP pipelines
Chirag Dahiya
This project reflects my interest in building practical NLP applications that combine rule-based linguistics with usable, deployable tools.
MIT License. Feel free to use, modify, and build on top of this project.