feat: Initialize Farsi Transcriber application structure

- Create project directories (ui, models, utils)
- Add PyQt6 environment setup with requirements.txt
- Create main entry point (main.py)
- Add comprehensive README with setup instructions
- Add .gitignore for Python, PyTorch, and ML artifacts
- Phase 1 complete: project structure and environment ready
Claude 2025-11-12 05:09:15 +00:00
parent c0d2f624c0
commit 86b2a93dee
8 changed files with 211 additions and 0 deletions

farsi_transcriber/.gitignore

@ -0,0 +1,52 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
# Virtual Environment
venv/
ENV/
env/
# IDE
.vscode/
.idea/
*.swp
*.swo
*~
.DS_Store
# PyTorch/ML Models
*.pt
*.pth
models/downloaded/
# Whisper's model cache (~/.cache/whisper/) lives outside the repository;
# git does not expand "~", so no ignore rule is needed for it
# Application outputs
transcriptions/
exports/
*.log
# Testing
.pytest_cache/
.coverage
htmlcov/

farsi_transcriber/README.md

@ -0,0 +1,113 @@
# Farsi Transcriber
A desktop application for transcribing Farsi audio and video files using OpenAI's Whisper model.
## Features
- 🎙️ Transcribe audio files (MP3, WAV, M4A, FLAC, OGG, etc.)
- 🎬 Extract audio from video files (MP4, MKV, MOV, WebM, AVI, etc.)
- 🇮🇷 High-accuracy Farsi transcription
- ⏱️ Word-level timestamps
- 📤 Export to multiple formats (TXT, SRT, JSON)
- 💻 Clean PyQt6-based GUI
## System Requirements
- Python 3.8+
- ffmpeg (for audio/video processing)
- 8GB+ RAM recommended (for high-accuracy model)
### Install ffmpeg
**Ubuntu/Debian:**
```bash
sudo apt update && sudo apt install ffmpeg
```
**macOS (Homebrew):**
```bash
brew install ffmpeg
```
**Windows (Chocolatey):**
```bash
choco install ffmpeg
```
## Installation
1. Clone the repository and change into the `farsi_transcriber/` directory
2. Create a virtual environment:
   ```bash
   python3 -m venv venv
   source venv/bin/activate # On Windows: venv\Scripts\activate
   ```
3. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
4. Run the application:
   ```bash
   python main.py
   ```
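Before launching, it can help to confirm that the environment is complete. The following is a minimal sketch and not part of the project: `missing_modules` and `check_environment` are hypothetical helper names, and the module list assumes the import names `PyQt6`, `torch`, `numpy`, `whisper`, and `tqdm` for the packages in `requirements.txt`.

```python
import importlib.util
import shutil

# Import names assumed for the packages listed in requirements.txt
# (openai-whisper is imported as "whisper").
REQUIRED_MODULES = ["PyQt6", "torch", "numpy", "whisper", "tqdm"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment():
    """Collect human-readable problems; an empty list means all checks passed."""
    problems = [
        f"missing Python package: {name}"
        for name in missing_modules(REQUIRED_MODULES)
    ]
    # ffmpeg is an external program, so look for it on PATH instead.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

If the script prints nothing, every package was found and `ffmpeg` is on `PATH`.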
## Usage
### GUI Application
```bash
python main.py
```
Then:
1. Click "Select File" to choose an audio or video file
2. Click "Transcribe" and wait for processing
3. View results with timestamps
4. Export to your preferred format
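As an illustration of the SRT export step, here is a minimal sketch of how timestamped segments could be rendered as subtitles. It is not the application's export code: `format_srt_timestamp` and `segments_to_srt` are hypothetical names, and the input is assumed to be a list of dicts with `start` and `end` (in seconds) and `text`, which is the shape of the segments Whisper returns.

```python
def format_srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Render segments (dicts with "start", "end", "text") as SRT text."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        start = format_srt_timestamp(segment["start"])
        end = format_srt_timestamp(segment["end"])
        text = segment["text"].strip()
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)
```

When writing the result to disk, open the file with `encoding="utf-8"` so the Persian text is stored correctly.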
### Command Line (Coming Soon)
```bash
python -m farsi_transcriber --input audio.mp3 --output transcription.srt
```
## Model Information
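This application uses OpenAI's multilingual Whisper model, which supports Persian (Farsi):
- **Model**: medium or large (configurable)
- **Accuracy**: Larger models are generally more accurate on Persian but need more memory and time
- **Processing**: Runs locally (no cloud service required); Whisper downloads the model weights on first use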
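For orientation, a local transcription call with the `openai-whisper` package might look roughly like the sketch below. It is an assumption about how the model could be invoked, not the application's actual code: `build_transcribe_options` and `transcribe_farsi` are hypothetical names, and `word_timestamps` requires a reasonably recent `openai-whisper` release.

```python
def build_transcribe_options(word_timestamps=True):
    """Keyword arguments for Whisper's transcribe(); "fa" is the code for Persian."""
    return {"language": "fa", "word_timestamps": word_timestamps}


def transcribe_farsi(audio_path, model_name="medium"):
    """Transcribe one file locally and return Whisper's result dict."""
    import whisper  # imported lazily because it pulls in PyTorch

    model = whisper.load_model(model_name)  # downloads the weights on first use
    return model.transcribe(audio_path, **build_transcribe_options())
```

Calling `transcribe_farsi("audio.mp3")` would return a dict whose `"text"` key holds the full transcript and whose `"segments"` key holds the timestamped pieces.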
## Project Structure
```
farsi_transcriber/
├── ui/               # PyQt6 UI components
├── models/           # Whisper model management
├── utils/            # Utility functions
├── main.py           # Application entry point
├── requirements.txt  # Python dependencies
└── README.md         # This file
```
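## Development
The tools below (pytest, black, flake8, isort) are not listed in `requirements.txt` and must be installed separately.
### Running Tests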
```bash
pytest tests/
```
### Code Style
```bash
black .
flake8 .
isort .
```
## License
MIT License - See LICENSE file for details
## Contributing
This is a personal project, but feel free to fork and modify for your needs!


@ -0,0 +1,8 @@
"""
Farsi Transcriber Application
A desktop application for transcribing Farsi audio and video files using OpenAI's Whisper.
"""
__version__ = "0.1.0"
__author__ = "Personal Project"

farsi_transcriber/main.py

@ -0,0 +1,28 @@
#!/usr/bin/env python3
"""
Farsi Transcriber - Main Entry Point

A PyQt6-based desktop application for transcribing Farsi audio and video files.
"""

import sys

from PyQt6.QtWidgets import QApplication


def main():
    """Main entry point for the application"""
    app = QApplication(sys.argv)

    # TODO: Import and create main window
    # from ui.main_window import MainWindow
    # window = MainWindow()
    # window.show()

    print("Farsi Transcriber App initialized (setup phase)")
    print("✓ PyQt6 environment ready")

    # Until the main window is created, the event loop runs with nothing on screen.
    sys.exit(app.exec())


if __name__ == "__main__":
    main()


@ -0,0 +1 @@
"""Model management for Farsi Transcriber"""


@ -0,0 +1,7 @@
PyQt6==6.6.1
PyQt6-Qt6==6.6.1
PyQt6-sip==13.6.0
torch>=1.10.1
numpy
openai-whisper
tqdm


@ -0,0 +1 @@
"""UI components for Farsi Transcriber"""


@ -0,0 +1 @@
"""Utility functions for Farsi Transcriber"""