# Farsi Transcriber - Quick Start Guide

You now have **TWO** complete applications for Farsi transcription:

## 🖥️ Option 1: Desktop App (PyQt6)

**Location:** `/home/user/whisper/farsi_transcriber/`

### Setup

```bash
cd farsi_transcriber
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python main.py
```

**Features:**
- ✅ Standalone desktop application
- ✅ Works completely offline
- ✅ Direct access to file system
- ✅ Lightweight and fast
- ⚠️ Simpler UI (green theme)

**Good for:**
- Local-only transcription
- Users who prefer desktop apps
- Offline processing

---

## 🌐 Option 2: Web App (React + Flask)

**Location:** `/home/user/whisper/farsi_transcriber_web/`

### Setup

**Backend (Flask):**

```bash
cd farsi_transcriber_web/backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python app.py
# API runs on http://localhost:5000
```

**Frontend (React):**

```bash
cd farsi_transcriber_web
npm install
npm run dev
# App runs on http://localhost:3000
```

**Features:**
- ✅ Modern web-based UI (matches your Figma design exactly)
- ✅ File queue management
- ✅ Dark/Light theme toggle
- ✅ Search with text highlighting
- ✅ Copy segments to clipboard
- ✅ Resizable window
- ✅ RTL support for Farsi
- ✅ Multiple export formats
- ✅ Professional styling

**Good for:**
- Modern web experience
- Team collaboration (can be deployed online)
- More features and polish
- Professional appearance

---

## 📊 Comparison

| Feature | Desktop (PyQt6) | Web (React) |
|---------|-----------------|-------------|
| **Interface** | Simple, green | Modern, professional |
| **Dark Mode** | ❌ | ✅ |
| **File Queue** | ❌ | ✅ |
| **Search** | ❌ | ✅ |
| **Copy Segments** | ❌ | ✅ |
| **Resizable Window** | ❌ | ✅ |
| **Export Formats** | TXT, SRT, VTT, JSON, TSV | TXT, SRT, VTT, JSON |
| **Offline** | ✅ | ✅ (requires the backend to be running) |
| **Easy Setup** | ✅✅ | ✅ (2 terminals) |
| **Deployment** | Desktop only | Can host online |
| **Code Size** | ~25KB | ~200KB |

---

## 🚀 Which Should You Use?

### Use **Desktop App** if:
- You want a simple, quick setup
- You don't need to share transcriptions
- You prefer offline processing
- You don't need advanced features

### Use **Web App** if:
- You like modern interfaces
- You want dark/light themes
- You need file queue management
- You may want to host it online and share it
- You want a professional appearance

---

## 📁 Project Structure

```
whisper/
├── farsi_transcriber/              (Desktop PyQt6 App)
│   ├── ui/
│   ├── models/
│   ├── utils/
│   ├── config.py
│   ├── main.py
│   └── requirements.txt
│
└── farsi_transcriber_web/          (Web React App)
    ├── src/
    │   ├── App.tsx
    │   ├── components/
    │   └── main.tsx
    ├── backend/
    │   ├── app.py
    │   └── requirements.txt
    ├── package.json
    └── vite.config.ts
```

---

## 🔧 System Requirements

### Desktop App
- Python 3.8+
- ffmpeg
- 4GB RAM

### Web App
- Python 3.8+ (backend)
- Node.js 16+ (frontend)
- ffmpeg
- 4GB RAM

---

## 📝 Setup Checklist

### Initial Setup (One-time)

- [ ] Install ffmpeg
  ```bash
  # Ubuntu/Debian
  sudo apt install ffmpeg

  # macOS
  brew install ffmpeg

  # Windows
  choco install ffmpeg
  ```

- [ ] Verify Python 3.8+
  ```bash
  python3 --version
  ```

- [ ] Verify Node.js 16+ (for web app only)
  ```bash
  node --version
  ```

### Desktop App Setup

- [ ] Create virtual environment
- [ ] Install requirements
- [ ] Run app

### Web App Setup

**Backend:**
- [ ] Create virtual environment
- [ ] Install requirements
- [ ] Run Flask server

**Frontend:**
- [ ] Install Node dependencies
- [ ] Run dev server

---

## 🎯 Quick Start (Fastest)

### Desktop (30 seconds)

```bash
cd whisper/farsi_transcriber
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt && python main.py
```

### Web (2 minutes)

Terminal 1:

```bash
cd whisper/farsi_transcriber_web/backend
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt && python app.py
```

Terminal 2:

```bash
cd whisper/farsi_transcriber_web
npm install && npm run dev
```

---

## 🐛 Troubleshooting

### "ffmpeg not found"
Install ffmpeg (see the Setup Checklist above) and make sure it is in your PATH.

### "ModuleNotFoundError" (Python)

```bash
# Ensure virtual environment is activated
source venv/bin/activate  # Linux/Mac
# or
venv\Scripts\activate  # Windows
```

### "npm: command not found"
Install Node.js from https://nodejs.org

### App runs slowly
- Use a GPU: install CUDA
- Reduce model size: change to 'small' or 'tiny'
- Close other applications

---

## 📚 Full Documentation

- **Desktop App:** `farsi_transcriber/README.md`
- **Web App:** `farsi_transcriber_web/README.md`
- **API Docs:** `farsi_transcriber_web/README.md` (Endpoints section)

---

## 🎓 What Was Built

### Desktop Application (PyQt6)
- ✅ File picker for audio/video
- ✅ Whisper integration with word-level timestamps
- ✅ 5 export formats (TXT, SRT, VTT, JSON, TSV)
- ✅ Professional styling
- ✅ Progress indicators
- ✅ Threading to prevent UI freezing

### Web Application (React + Flask)
- ✅ Complete Figma design implementation
- ✅ File queue management
- ✅ Dark/light theme
- ✅ Search with highlighting
- ✅ Segment management
- ✅ Resizable window
- ✅ RTL support
- ✅ Flask backend with Whisper integration
- ✅ 4 export formats
- ✅ Real file upload handling

---

## 🚀 Next Steps

1. **Choose your app** (Desktop or Web)
2. **Install ffmpeg** if not already installed
3. **Follow the setup instructions** above
4. **Test with a Farsi audio file**
5. **Export in your preferred format**

---

## 💡 Tips

- **First transcription is slow** (the Whisper model is downloaded on first use; the `medium` model, at 769M parameters, is roughly a 1.5 GB download)
- **Use larger models** (medium/large) for better accuracy
- **Use smaller models** (tiny/base) for speed
- **GPU significantly speeds up** transcription
- **Both apps work offline** (after initial model download; the web app still needs its local backend running)

---

## 📧 Need Help?

- Check the full README in each app's directory
- Verify all requirements are installed
- Check browser console (web app) or Python output (desktop)
- Ensure ffmpeg is in your PATH

---

**Enjoy your Farsi transcription apps!** 🎉
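
### Optional: preflight check

The one-time checks in the Setup Checklist (Python 3.8+, ffmpeg in your PATH, Node.js 16+ for the web app) can also be run in one go. The script below is a minimal sketch and is not part of either app; the function names (`parse_version`, `meets_minimum`, `node_is_recent`, `check_prerequisites`) are made up for illustration, and it only uses the Python standard library.

```python
import shutil
import subprocess
import sys


def parse_version(text):
    """Turn 'v18.17.1' or '3.10.12' into a tuple of ints, e.g. (18, 17, 1)."""
    parts = text.strip().lstrip("v").split(".")
    # Non-numeric parts (such as '0-nightly') are skipped rather than guessed at.
    return tuple(int(p) for p in parts if p.isdigit())


def meets_minimum(version_text, minimum):
    """Return True if the version string is at least `minimum` (a tuple of ints)."""
    return parse_version(version_text) >= tuple(minimum)


def node_is_recent(minimum=(16,)):
    """Return True if `node` is on PATH and reports at least the minimum version."""
    node = shutil.which("node")
    if node is None:
        return False
    out = subprocess.run([node, "--version"], capture_output=True, text=True)
    return out.returncode == 0 and meets_minimum(out.stdout, minimum)


def check_prerequisites(web_app=False):
    """Map each prerequisite named in this guide to True (found) or False (missing)."""
    results = {
        "Python 3.8+": sys.version_info >= (3, 8),
        "ffmpeg in PATH": shutil.which("ffmpeg") is not None,
    }
    if web_app:
        results["Node.js 16+"] = node_is_recent()
        results["npm in PATH"] = shutil.which("npm") is not None
    return results


if __name__ == "__main__":
    # Pass web_app=False if you only plan to run the desktop app.
    for name, ok in check_prerequisites(web_app=True).items():
        print(f"{'OK     ' if ok else 'MISSING'}  {name}")
```

Save it under any name (for example `preflight.py`, a hypothetical filename) and run it with `python3 preflight.py` before creating the virtual environments. It only reports what is missing; it does not install anything.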