Farsi Transcriber - Quick Start Guide
You now have TWO complete applications for Farsi transcription:
🖥️ Option 1: Desktop App (PyQt6)
Location: /home/user/whisper/farsi_transcriber/
Setup
cd farsi_transcriber
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python main.py
Features:
- ✅ Standalone desktop application
- ✅ Works completely offline (after the initial model download)
- ✅ Direct access to file system
- ✅ Lightweight and fast
- ⚠️ Simpler UI (green theme)
Good for:
- Local-only transcription
- Users who prefer desktop apps
- Offline processing
🌐 Option 2: Web App (React + Flask)
Location: /home/user/whisper/farsi_transcriber_web/
Setup
Backend (Flask):
cd farsi_transcriber_web/backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python app.py
# API runs on http://localhost:5000
Frontend (React):
cd farsi_transcriber_web
npm install
npm run dev
# App runs on http://localhost:3000
Features:
- ✅ Modern web-based UI (matches your Figma design exactly)
- ✅ File queue management
- ✅ Dark/Light theme toggle
- ✅ Search with text highlighting
- ✅ Copy segments to clipboard
- ✅ Resizable window
- ✅ RTL support for Farsi
- ✅ Multiple export formats
- ✅ Professional styling
Good for:
- Modern web experience
- Team collaboration (can be deployed online)
- More features and polish
- Professional appearance
📊 Comparison
| Feature | Desktop (PyQt6) | Web (React) |
|---|---|---|
| Interface | Simple, green | Modern, professional |
| Dark Mode | ❌ | ✅ |
| File Queue | ❌ | ✅ |
| Search | ❌ | ✅ |
| Copy Segments | ❌ | ✅ |
| Resizable Window | ❌ | ✅ |
| Export Formats | SRT, TXT, VTT, JSON, TSV | TXT, SRT, VTT, JSON |
| Offline | ✅ | Requires backend |
| Easy Setup | ✅✅ | ✅ (2 terminals) |
| Deployment | Desktop only | Can host online |
| Code Size | ~25KB | ~200KB |
🚀 Which Should You Use?
Use Desktop App if:
- You want simple, quick setup
- You don't need to share transcriptions with others
- You prefer offline processing
- You don't need advanced features
Use Web App if:
- You like modern interfaces
- You want dark/light themes
- You need file queue management
- You may want to host the app online and share it
- You want professional appearance
📁 Project Structure
whisper/
├── farsi_transcriber/ (Desktop PyQt6 App)
│ ├── ui/
│ ├── models/
│ ├── utils/
│ ├── config.py
│ ├── main.py
│ └── requirements.txt
│
└── farsi_transcriber_web/ (Web React App)
├── src/
│ ├── App.tsx
│ ├── components/
│ └── main.tsx
├── backend/
│ ├── app.py
│ └── requirements.txt
├── package.json
└── vite.config.ts
🔧 System Requirements
Desktop App
- Python 3.8+
- ffmpeg
- 4GB RAM
Web App
- Python 3.8+ (backend)
- Node.js 16+ (frontend)
- ffmpeg
- 4GB RAM
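The requirements above can be checked before setup with a short script. This is a minimal sketch, not part of either app: the function name `check_prerequisites` and its parameters are made up for illustration, and it only checks the Python version and whether `ffmpeg` and `node` are on the PATH. It does not verify the Node.js version or the available RAM.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version from the requirements above


def check_prerequisites(need_node=False, which=shutil.which):
    """Return a list of problems; an empty list means every check passed.

    `which` is injectable only so the function can be exercised in tests.
    """
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append("Python %d.%d+ is required" % MIN_PYTHON)
    if which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    if need_node and which("node") is None:
        problems.append("node was not found on PATH (web app only)")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites(need_node=True)
    for issue in issues:
        print("MISSING:", issue)
    if not issues:
        print("All prerequisites found.")
```

Pass `need_node=False` (the default) when you only plan to run the desktop app.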
📝 Setup Checklist
Initial Setup (One-time)
- Install ffmpeg
# Ubuntu/Debian
sudo apt install ffmpeg
# macOS
brew install ffmpeg
# Windows
choco install ffmpeg
- Verify Python 3.8+
python3 --version
- Verify Node.js 16+ (for web app only)
node --version
Desktop App Setup
- Create virtual environment
- Install requirements
- Run app
Web App Setup
Backend:
- Create virtual environment
- Install requirements
- Run Flask server
Frontend:
- Install Node dependencies
- Run dev server
🎯 Quick Start (Fastest)
Desktop (30 seconds)
cd whisper/farsi_transcriber
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt && python main.py
Web (2 minutes)
Terminal 1:
cd whisper/farsi_transcriber_web/backend
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt && python app.py
Terminal 2:
cd whisper/farsi_transcriber_web
npm install && npm run dev
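Once both terminals are running, a quick way to confirm that the two servers are up is to test whether their ports accept connections. The sketch below assumes the ports mentioned in the setup section (5000 for the Flask API, 3000 for the dev server); the helper name `is_listening` is hypothetical, and the check only proves that something is listening, not that it is the right application.

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


if __name__ == "__main__":
    # Ports assumed from the setup notes: Flask on 5000, dev server on 3000.
    for name, port in (("backend", 5000), ("frontend", 3000)):
        state = "up" if is_listening("localhost", port) else "not reachable"
        print(f"{name} (localhost:{port}): {state}")
```

If the backend shows as not reachable, check the output in Terminal 1 for a Python error before looking at the frontend.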
🐛 Troubleshooting
"ffmpeg not found"
Install ffmpeg (see the Setup Checklist above) and make sure it is in your PATH
"ModuleNotFoundError" (Python)
# Ensure virtual environment is activated
source venv/bin/activate # Linux/Mac
# or
venv\Scripts\activate # Windows
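If you are unsure whether the virtual environment is actually active, Python can report it. This small sketch relies on the standard `sys.prefix` and `sys.base_prefix` attributes, which differ inside an environment created with `python3 -m venv`; the function name `in_virtualenv` is chosen here for illustration only.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtualenv())
```

If this reports `False`, activate the environment and run `pip install -r requirements.txt` again so the packages land in the right place.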
"npm: command not found"
Install Node.js from https://nodejs.org
App runs slow
- Use a GPU: install CUDA and a CUDA-enabled PyTorch build (NVIDIA GPUs only)
- Reduce model size: change to 'small' or 'tiny'
- Close other applications
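The first two suggestions can be sketched in code. This assumes the apps use the `openai-whisper` package (and therefore PyTorch), which this guide does not confirm, and the selection policy in `pick_model_name` is a hypothetical example rather than what either app does. The `whisper` import is deferred so the helper functions work even when the package is not installed.

```python
def pick_device():
    """Return "cuda" if PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # normally installed alongside openai-whisper
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def pick_model_name(device, prefer_speed=False):
    """Hypothetical policy: smaller models on CPU or when speed matters."""
    if prefer_speed:
        return "tiny"
    return "medium" if device == "cuda" else "small"


def transcribe(path, prefer_speed=False):
    """Transcribe a Farsi audio file (assumes openai-whisper is installed)."""
    import whisper  # imported lazily; see the note above

    device = pick_device()
    model = whisper.load_model(pick_model_name(device, prefer_speed), device=device)
    # "fa" is the language code for Persian (Farsi).
    return model.transcribe(path, language="fa")
```

Calling `transcribe("sample.mp3", prefer_speed=True)` would trade accuracy for speed by loading the smallest model; the file name is a placeholder.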
📚 Full Documentation
- Desktop App: farsi_transcriber/README.md
- Web App: farsi_transcriber_web/README.md
- API Docs: farsi_transcriber_web/README.md (Endpoints section)
🎓 What Was Built
Desktop Application (PyQt6)
- ✅ File picker for audio/video
- ✅ Whisper integration with word-level timestamps
- ✅ 5 export formats (TXT, SRT, VTT, JSON, TSV)
- ✅ Professional styling
- ✅ Progress indicators
- ✅ Threading to prevent UI freezing
Web Application (React + Flask)
- ✅ Complete Figma design implementation
- ✅ File queue management
- ✅ Dark/light theme
- ✅ Search with highlighting
- ✅ Segment management
- ✅ Resizable window
- ✅ RTL support
- ✅ Flask backend with Whisper integration
- ✅ 4 export formats
- ✅ Real file upload handling
🚀 Next Steps
- Choose your app (Desktop or Web)
- Install ffmpeg if not already installed
- Follow the setup instructions above
- Test with a Farsi audio file
- Export in your preferred format
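To give a sense of what an SRT export contains, here is a minimal sketch that turns timed segments into SRT text. It is not the apps' own export code: it assumes each segment is a dict with `start` and `end` in seconds plus `text`, which is the shape `openai-whisper` returns, and the demo segments are made up.

```python
def srt_timestamp(seconds):
    """Format seconds as HH:MM:SS,mmm, the timestamp layout used by SRT."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Render segments (dicts with start, end, text) as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Made-up demo segments, not real transcription output.
    demo = [
        {"start": 0.0, "end": 2.5, "text": " سلام"},
        {"start": 2.5, "end": 5.0, "text": " خوش آمدید"},
    ]
    print(segments_to_srt(demo))
```

VTT follows the same idea with a `WEBVTT` header line and a dot instead of a comma before the milliseconds.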
💡 Tips
- First transcription is slow because the model is downloaded first (Whisper's medium model, at 769M parameters, is about 1.5 GB)
- Use larger models (medium/large) for better accuracy
- Use smaller models (tiny/base) for speed
- GPU significantly speeds up transcription
- Both apps work offline (after initial model download)
📧 Need Help?
- Check the full README in each app's directory
- Verify all requirements are installed
- Check browser console (web app) or Python output (desktop)
- Ensure ffmpeg is in your PATH
Enjoy your Farsi transcription apps! 🎉