A multilingual sentiment analysis web application that classifies input text as Positive, Neutral, or Negative using a transformer-based NLP model.
Built with Hugging Face Transformers and Gradio, and deployed on Hugging Face Spaces.
K. Siddhartha — AI / NLP Developer
🔗 GitHub: https://github.com/k-siddhartha-ai 🤗 Hugging Face: https://huggingface.co/Siddhartha001
This project demonstrates real-time sentiment analysis across multiple languages using a pretrained multilingual transformer model. Users can enter text in different languages and instantly receive:
- Sentiment label
- Confidence score
The goal of this project is to showcase practical deployment of multilingual NLP models using a lightweight web interface.
Model: cardiffnlp/twitter-xlm-roberta-base-sentiment
Architecture: XLM-RoBERTa (Transformer Encoder)
- Supports sentiment analysis across ~100 languages
- Strong multilingual generalization capability
- Pretrained on large-scale Twitter datasets
- Balanced trade-off between accuracy and inference speed
The model predicts one of three sentiment classes:
- Positive
- Neutral
- Negative
- Frontend: Gradio interface
- Inference layer: Hugging Face Transformers pipeline
- Model: XLM-RoBERTa multilingual transformer
- Deployment: Hugging Face Spaces (CPU)
Flow:
User Input → Tokenizer → Transformer Model → Sentiment Prediction → UI Output
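The last step of this flow, turning the model's raw class scores into a label plus confidence, is a softmax over the three class logits. A stdlib-only sketch with hypothetical logit values (the real app gets these scores from the Transformers pipeline; the class order shown is the one in the model's config):

```python
import math

# Assumed class order from the model's id2label config: 0=negative, 1=neutral, 2=positive
LABELS = ["Negative", "Neutral", "Positive"]

def to_prediction(logits):
    """Convert raw class logits into (label, confidence) via a softmax."""
    shifted = [x - max(logits) for x in logits]   # subtract max for numerical stability
    exps = [math.exp(x) for x in shifted]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

label, score = to_prediction([-1.2, 0.3, 2.5])    # hypothetical logits
# "Positive" with ~0.88 confidence
```

The confidence score shown in the UI is exactly this winning softmax probability.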
- 🌐 Multilingual sentiment detection (~100 languages)
- 📊 Confidence score for predictions
- ⚡ Real-time inference
- 🖥️ Interactive web interface
- ☁️ Cloud deployment via Hugging Face Spaces
- Python
- Hugging Face Transformers
- Gradio
- PyTorch (via Transformers)
Clone the repository and install dependencies:

    pip install -r requirements.txt

Run the app:

    python app.py

Then open the local Gradio URL shown in the terminal.
Hugging Face Space:
https://huggingface.co/spaces/Siddhartha001/multilingual-sentiment-analysis
- Model size: ~1 GB
- CPU inference latency: ~1–3 seconds per request
- Maximum input length: 512 tokens
- Model is trained primarily on Twitter data; performance may vary on long formal text.
- Very long inputs are truncated to 512 tokens.
- Sarcasm, slang, or mixed-language content may reduce accuracy.
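If silent truncation is a concern, over-long inputs can also be trimmed before they reach the model. A hypothetical stdlib guard that cuts on whitespace words as a rough proxy for tokens (subword tokenization usually yields *more* tokens than words, so the real cap still comes from the tokenizer's 512-token limit):

```python
MAX_TOKENS = 512

def rough_truncate(text: str, budget: int = MAX_TOKENS) -> str:
    """Crude pre-truncation on whitespace words.

    Only a first cut: the XLM-RoBERTa tokenizer splits words into
    subword pieces, so the model may still truncate further.
    """
    words = text.split()
    if len(words) <= budget:
        return text
    return " ".join(words[:budget])
```

This keeps the UI honest about what part of a long input was actually classified.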
- GPU acceleration for faster inference
- Language detection before prediction
- Batch input support
- Sentiment visualization UI
This project uses a pretrained model provided by CardiffNLP under its respective license.
