Farsi Transcriber
A desktop application for transcribing Farsi audio and video files using OpenAI's Whisper model.
Features
- 🎙️ Transcribe audio files (MP3, WAV, M4A, FLAC, OGG, etc.)
- 🎬 Extract audio from video files (MP4, MKV, MOV, WebM, AVI, etc.)
- 🇮🇷 High-accuracy Farsi transcription
- ⏱️ Word-level timestamps
- 📤 Export to multiple formats (TXT, SRT, JSON)
- 💻 Clean PyQt6-based GUI
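As a rough illustration of how timestamps and SRT export fit together, the sketch below turns a list of timed segments into SRT text. The segment shape (dicts with `start`, `end`, and `text`, times in seconds) follows what Whisper returns, but the function names `format_srt_timestamp` and `segments_to_srt` are hypothetical and not taken from this project's code.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Render segments (dicts with 'start', 'end', 'text') as SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_timestamp(seg["start"])
        end = format_srt_timestamp(seg["end"])
        # Each SRT cue: running number, time range, then the text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

For example, `format_srt_timestamp(3661.5)` gives `01:01:01,500`.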
System Requirements
- Python 3.8+
- ffmpeg (for audio/video processing)
- 8 GB+ RAM recommended (for the high-accuracy model)
Install ffmpeg
Ubuntu/Debian:
sudo apt update && sudo apt install ffmpeg
macOS (Homebrew):
brew install ffmpeg
Windows (Chocolatey):
choco install ffmpeg
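To confirm that ffmpeg is reachable after installing it, a small check like the one below can help. This is only a sketch using the standard library; the helper name `ffmpeg_available` is hypothetical and the application may perform its own check differently.

```python
import shutil


def ffmpeg_available() -> bool:
    """Return True if an `ffmpeg` executable is found on the PATH."""
    return shutil.which("ffmpeg") is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found")
    else:
        print("ffmpeg not found: install it using one of the commands above")
```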
Installation
- Clone the repository
- Create a virtual environment:
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Run the application:
python main.py
Usage
GUI Application
python main.py
Then:
- Click "Select File" to choose an audio or video file
- Click "Transcribe" and wait for processing
- View results with timestamps
- Export to your preferred format
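The file-handling step behind "Select File" might look roughly like the sketch below: classify the file by extension and, for video, build an ffmpeg command that extracts a mono 16 kHz WAV track. The extension sets only cover the formats named under Features (the README's "etc." means the real lists may be longer), and the names `classify_media` and `build_extract_command` are assumptions for illustration, not this project's API.

```python
from pathlib import Path

# Only the formats listed in this README; the application may accept more.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".mov", ".webm", ".avi"}


def classify_media(path: str) -> str:
    """Return 'audio', 'video', or 'unsupported' based on the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return "unsupported"


def build_extract_command(video_path: str, wav_path: str) -> list:
    """Build an ffmpeg argument list that writes the audio track as mono 16 kHz WAV."""
    return [
        "ffmpeg",
        "-y",            # overwrite the output file if it exists
        "-i", video_path,
        "-vn",           # drop the video stream
        "-ac", "1",      # one audio channel
        "-ar", "16000",  # 16 kHz sample rate
        wav_path,
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` once ffmpeg is installed.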
Command Line (Coming Soon)
python -m farsi_transcriber --input audio.mp3 --output transcription.srt
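Since the command-line interface is not available yet, the following is only a sketch of how the two flags shown above could be parsed with `argparse`, with the export format inferred from the output file's extension. The helper names and the format-inference rule are assumptions, not a description of the eventual implementation.

```python
import argparse
from pathlib import Path

# Export formats listed under Features.
SUPPORTED_FORMATS = {".txt": "txt", ".srt": "srt", ".json": "json"}


def parse_args(argv=None):
    """Parse the --input and --output flags shown in the README."""
    parser = argparse.ArgumentParser(
        prog="farsi_transcriber",
        description="Transcribe a Farsi audio or video file.",
    )
    parser.add_argument("--input", required=True, help="audio or video file")
    parser.add_argument("--output", required=True, help="destination file")
    return parser.parse_args(argv)


def output_format(output_path: str) -> str:
    """Infer the export format from the output file extension."""
    suffix = Path(output_path).suffix.lower()
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported output format: {suffix or '(none)'}")
    return SUPPORTED_FORMATS[suffix]
```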
Model Information
This application uses OpenAI's Whisper model, configured for Farsi:
- Model: medium or large (configurable)
- Accuracy: Optimized for the Persian language
- Processing: Runs locally (no cloud service required)
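For orientation, a local Farsi transcription with the `openai-whisper` package can be sketched as below. `whisper.load_model` and `model.transcribe` with `language` and `word_timestamps` are part of that package; the wrapper `transcribe_farsi` and the restriction to the two model sizes named above are assumptions for illustration, and the application's own model management may differ.

```python
ALLOWED_MODEL_SIZES = ("medium", "large")  # the sizes named in this README


def transcribe_farsi(audio_path: str, model_size: str = "medium") -> dict:
    """Transcribe a file locally with Whisper, forcing Persian as the language."""
    if model_size not in ALLOWED_MODEL_SIZES:
        raise ValueError(f"model_size must be one of {ALLOWED_MODEL_SIZES}")

    # Imported lazily so the module loads even where Whisper is not installed.
    import whisper

    model = whisper.load_model(model_size)  # downloads weights on first use
    # language="fa" skips language detection; word_timestamps=True adds
    # per-word timing to each segment in the result.
    return model.transcribe(audio_path, language="fa", word_timestamps=True)
```

The returned dictionary contains the full `text` and a list of `segments`, each with `start` and `end` times in seconds.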
Project Structure
farsi_transcriber/
├── ui/ # PyQt6 UI components
├── models/ # Whisper model management
├── utils/ # Utility functions
├── main.py # Application entry point
├── requirements.txt # Python dependencies
└── README.md # This file
Development
Running Tests
pytest tests/
Code Style
black .
flake8 .
isort .
License
MIT License - See LICENSE file for details
Contributing
This is a personal project, but feel free to fork and modify for your needs!