A machine learning-powered translation system for converting English public notices and official documents to Hindi using IndicTrans2, deployed as an interactive web application.
This project addresses the critical need for accessible multilingual public information in India by providing accurate English-to-Hindi translation specifically optimized for:
- Government notices and announcements
- Official documents and circulars
- Public service information
- Legal and administrative text
Live Demo: Click here for live demo.
- High-Quality Translation: Powered by AI4Bharat's IndicTrans2 model
- Domain-Specific Optimization: Fine-tuned on public notice terminology
- Interactive Web UI: Built with Gradio for easy access
- Real-time Processing: Instant translation with user-friendly interface
- Production Deployment: Hosted on Hugging Face Spaces with 99%+ uptime
Visit the live deployment - no installation required!
- Clone the repository: `git clone https://github.com/UtkarshSingh31/english-to-hindi-translation-software.git`, then `cd english-to-hindi-translation-software`
- Create and activate a virtual environment: `uv venv`, then `source .venv/bin/activate` (on Windows: `.venv\Scripts\activate`)
- Install dependencies: `uv pip install -r requirements.txt`
- Run the application: `uv run app.py`
| Component | Technology |
|---|---|
| ML Framework | PyTorch, Transformers (Hugging Face) |
| Translation Model | Helsinki-NLP/opus-mt-en-hi (fine-tuned) |
| Web Interface | Gradio 4.44.1 |
| Deployment | Hugging Face Spaces |
| Language | Python 3.10+ |
| Data Processing | Pandas, NumPy |
- Base Model: `Helsinki-NLP/opus-mt-en-hi`
- Architecture: Transformer-based neural machine translation
- Training Data: Custom dataset of 4000+ English-Hindi public notice pairs
- Performance: Optimized for formal and administrative language
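As a rough illustration of how the base model listed above can be called, here is a minimal sketch using the Hugging Face Transformers Marian classes. It loads the public `Helsinki-NLP/opus-mt-en-hi` checkpoint, not this project's fine-tuned weights, and the helper names `load_translator` and `translate_notice` are hypothetical, not taken from `app.py`.

```python
from typing import Callable, List

# Public base checkpoint named in this README; the fine-tuned weights are not assumed here.
MODEL_NAME = "Helsinki-NLP/opus-mt-en-hi"


def load_translator(model_name: str = MODEL_NAME) -> Callable[[List[str]], List[str]]:
    """Return a function that translates a batch of English strings to Hindi."""
    # Imported lazily so this module can be imported without transformers/torch installed.
    from transformers import MarianMTModel, MarianTokenizer

    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    def translate(texts: List[str]) -> List[str]:
        batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
        generated = model.generate(**batch, max_new_tokens=256)
        return tokenizer.batch_decode(generated, skip_special_tokens=True)

    return translate


def translate_notice(text: str, translate: Callable[[List[str]], List[str]]) -> str:
    """Translate a notice line by line, keeping blank lines where they were."""
    lines = text.splitlines()
    non_empty = [ln.strip() for ln in lines if ln.strip()]
    translated = iter(translate(non_empty)) if non_empty else iter(())
    return "\n".join(next(translated) if ln.strip() else "" for ln in lines)
```

A typical call would be `translate_notice(notice_text, load_translator())`. Translating line by line is a design assumption made here to preserve the layout of notices; it is not necessarily how the deployed app segments its input.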
- Collection: Scraped and curated public notices from government sources
- Cleaning: Removed duplicates, fixed encoding issues, standardized formatting
- Preprocessing: Tokenization, normalization, quality filtering
- Training: Fine-tuning on domain-specific corpus
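The cleaning and quality-filtering steps above could look roughly like the following sketch, which uses only the Python standard library. The specific rules (NFC normalization, stripping byte-order marks and zero-width spaces, requiring Devanagari characters on the Hindi side, a length-ratio cutoff of 3.0) are illustrative assumptions, not the project's actual pipeline.

```python
import re
import unicodedata
from typing import Iterable, List, Tuple

# Matches any character in the Devanagari Unicode block.
DEVANAGARI = re.compile(r"[\u0900-\u097F]")


def normalize(text: str) -> str:
    """Normalize Unicode form, drop BOM/zero-width spaces, collapse whitespace."""
    text = unicodedata.normalize("NFC", text)
    text = text.replace("\ufeff", "").replace("\u200b", "")
    return re.sub(r"\s+", " ", text).strip()


def clean_pairs(
    pairs: Iterable[Tuple[str, str]], max_len_ratio: float = 3.0
) -> List[Tuple[str, str]]:
    """Deduplicate and filter (English, Hindi) sentence pairs."""
    seen = set()
    cleaned = []
    for en, hi in pairs:
        en, hi = normalize(en), normalize(hi)
        if not en or not hi:
            continue  # drop pairs with an empty side
        if not DEVANAGARI.search(hi):
            continue  # Hindi side contains no Devanagari text
        if max(len(en), len(hi)) / min(len(en), len(hi)) > max_len_ratio:
            continue  # lengths differ too much; likely misaligned
        key = (en.lower(), hi)
        if key in seen:
            continue  # duplicate after normalization
        seen.add(key)
        cleaned.append((en, hi))
    return cleaned
```

Zero-width joiners and non-joiners are deliberately left in place because they can affect how Devanagari text renders.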
- Navigate to the live demo
- Enter English text in the input box
- Click "Submit" or press Enter
- View Hindi translation in real-time
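The flow above maps naturally onto a Gradio `Interface`, whose default button is labeled "Submit". The sketch below shows one way such a UI could be wired up; `make_handler`, `build_demo`, and the 2,000-character limit are hypothetical choices for illustration, and the project's real `app.py` may differ.

```python
from typing import Callable, Optional


def make_handler(translate_fn: Callable[[str], str], max_chars: int = 2000) -> Callable[[Optional[str]], str]:
    """Wrap a translation function with basic input checks for the UI."""

    def handler(text: Optional[str]) -> str:
        text = (text or "").strip()
        if not text:
            return ""  # nothing to translate
        if len(text) > max_chars:
            return f"Input is too long ({len(text)} characters; limit is {max_chars})."
        return translate_fn(text)

    return handler


def build_demo(translate_fn: Callable[[str], str]):
    """Build a Gradio interface with one input box and one output box."""
    # Imported lazily so the handler logic can be used without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=make_handler(translate_fn),
        inputs=gr.Textbox(lines=6, label="English text"),
        outputs=gr.Textbox(lines=6, label="Hindi translation"),
        title="English to Hindi Translation",
    )
```

Calling `build_demo(my_translate).launch()` would start a local server, where `my_translate` stands for any function that maps an English string to a Hindi string.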
| Resource | URL |
|---|---|
| Live Demo | Hugging Face Space |
| GitHub Repository | link |
| Base Model | Helsinki-NLP/opus-mt-en-hi on Hugging Face |
| Dataset | Custom Public Notices Dataset |
| Developer | Utkarsh Singh |
- Government Agencies: Translate official notices for bilingual publication
- Educational Institutions: Disseminate announcements to diverse audiences
- Legal Professionals: Convert administrative documents
- Public Services: Improve accessibility of citizen-facing information
- Research: Multilingual NLP and translation studies
- Support for additional Indian languages (Tamil, Telugu, Bengali)
- Batch translation for large documents
- API endpoint for programmatic access
- Mobile app integration
- Translation quality metrics and user feedback
- Docker containerization for portable deployment
- CI/CD pipeline with automated testing
Contributions are welcome! Please follow these steps:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit changes (`git commit -m 'Add amazing feature'`)
- Push to branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- AI4Bharat for the IndicTrans2 model
- Hugging Face for hosting and the Transformers library
- Gradio for the intuitive web interface framework
- Government of India for public domain training data
Utkarsh Singh
- GitHub: @utkarshsingh0013
- Hugging Face: @utkarshsingh0013
- LinkedIn: LinkedIn
- Email: singhutkarsh.1013@gmail.com
For questions, suggestions, or collaborations:
- Open an issue
- Email: singhutkarsh.1013@gmail.com
Made with ❤️ for multilingual India