185 Commits

Author SHA1 Message Date
Claude
54ffdb7e33
fix: Update dependencies for Railway compatibility
- Change openai-whisper to flexible version constraint (>=20230314)
- Add explicit numpy dependency for better compatibility
- Remove exact version pins that cause build failures on Railway

This fixes the KeyError: '__version__' error during pip install on Railway.
2025-11-16 07:43:01 +00:00
Arya vaghayenegar
d93554a9ec
Merge pull request #3 from ariavn-byte/copilot/review-repo-011cv3pvca7zsctw2yquumb8
Fix critical flake8 violations: unused imports, f-string placeholders, and slice spacing
2025-11-16 02:39:36 -05:00
copilot-swe-agent[bot]
6c5c3b3d56 Fix critical code quality issues: remove unused imports, fix f-strings and whitespace
Co-authored-by: ariavn-byte <151469489+ariavn-byte@users.noreply.github.com>
2025-11-15 23:48:38 +00:00
copilot-swe-agent[bot]
6217f530f5 Initial plan 2025-11-15 23:42:10 +00:00
Arya vaghayenegar
980886bef4
Merge pull request #1 from ariavn-byte/copilot/prepare-for-railway-deployment
Prepare repository for Railway deployment
2025-11-15 18:38:46 -05:00
copilot-swe-agent[bot]
538917daa8 Update openai-whisper to compatible version for deployment
Co-authored-by: ariavn-byte <151469489+ariavn-byte@users.noreply.github.com>
2025-11-15 23:20:13 +00:00
copilot-swe-agent[bot]
f4abb3abef Fix build configuration and TypeScript errors for Railway deployment
Co-authored-by: ariavn-byte <151469489+ariavn-byte@users.noreply.github.com>
2025-11-15 23:16:54 +00:00
copilot-swe-agent[bot]
245f1c3941 Ready for Railway deployment
Co-authored-by: ariavn-byte <151469489+ariavn-byte@users.noreply.github.com>
2025-11-15 23:11:28 +00:00
copilot-swe-agent[bot]
22b810a7e7 Initial plan 2025-11-15 23:06:16 +00:00
Claude
826659d896
docs: Add Railway quick start deployment guide 2025-11-15 21:49:50 +00:00
Claude
8c76e1b518
feat: Prepare web app for Railway deployment
Backend Updates:
- Add lazy loading for Whisper model (faster startup)
- Use environment variables for port and config
- Add root endpoint for health checking
- Configure CORS for production
- Add tempfile support for uploads
- Update to support gunicorn production server
- Add Procfile for Heroku/Railway compatibility

Frontend Updates:
- Optimize Vite build configuration
- Add production build optimizations
- Enable minification and code splitting
- Configure preview server for production

Configuration:
- Add .env.example files for both frontend and backend
- Create railway.toml for Railway deployment
- Add Procfile for process management
- Setup environment variable templates

Documentation:
- Create comprehensive RAILWAY_DEPLOYMENT.md guide
- Include step-by-step deployment instructions
- Add troubleshooting section
- Include cost breakdown
- Add monitoring and maintenance guide

Dependencies:
- Add gunicorn for production WSGI server

Ready for Railway deployment with:
- Free $5/month credit
- Automatic scaling
- 24/7 uptime
- Custom domain support (optional)
2025-11-15 21:49:26 +00:00
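The lazy-loading and environment-variable items in commit 8c76e1b518 could look roughly like the sketch below. This is not the code from the commit: `get_model`, `load_model_stub`, `get_port`, the `MODEL_NAME` variable, and the default port are hypothetical names chosen for illustration, and a stub stands in for the real Whisper loader so the example runs without the model installed.

```python
import os
from functools import lru_cache


def load_model_stub(name: str) -> dict:
    """Stand-in for an expensive model load (hypothetical, not the Whisper API)."""
    return {"name": name}


@lru_cache(maxsize=1)
def get_model() -> dict:
    """Load the model on first use and reuse the cached instance afterwards."""
    name = os.environ.get("MODEL_NAME", "base")  # assumed variable name
    return load_model_stub(name)


def get_port(default: int = 5000) -> int:
    """Read the listening port from the PORT environment variable, if set."""
    return int(os.environ.get("PORT", default))
```

Deferring the load to the first call keeps process startup fast, at the cost of a slower first request.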
Claude
7238568b42
docs: Add quick start guide for both desktop and web apps 2025-11-13 08:03:44 +00:00
Claude
22ddbf4796
feat: Create React web application with Figma design and Flask backend
Frontend:
- Initialize React 18 + TypeScript project with Vite
- Implement complete App.tsx matching Figma design
- Add dark/light theme toggle support
- Create file queue management UI
- Implement search with text highlighting
- Add segment copy functionality
- Create reusable UI components (Button, Progress, Input, Select)
- Configure Tailwind CSS v4.0 for styling
- Setup window resizing functionality
- Implement RTL support for Farsi text

Backend:
- Create Flask API server with CORS support
- Implement /transcribe endpoint for audio/video processing
- Add /models endpoint for available models info
- Implement /export endpoint for multiple formats (TXT, SRT, VTT, JSON)
- Setup Whisper model integration
- Handle file uploads with validation
- Format transcription results with timestamps

Configuration:
- Setup Vite dev server with API proxy
- Configure Tailwind CSS with custom colors
- Setup TypeScript strict mode
- Add PostCSS with autoprefixer
- Configure Flask for development

Documentation:
- Write comprehensive README with setup instructions
- Include API endpoint documentation
- Add troubleshooting guide
- Include performance tips

Everything is ready to run with: npm install && npm run dev (frontend) and python backend/app.py (backend)
2025-11-13 08:03:09 +00:00
Claude
efdcf42ffd
feat: Add comprehensive configuration and documentation
- Create config.py with model, device, and format settings
- Add model descriptions and performance information
- Expand README with detailed installation instructions
- Add troubleshooting section for common issues
- Include advanced usage examples
- Document all export formats and features
- Add performance tips and recommendations
- Phase 6 complete: Full configuration and documentation ready
2025-11-12 05:13:35 +00:00
Claude
72ab2e3fa9
feat: Add professional styling and theming
- Create styles.py module with comprehensive stylesheet
- Implement color palette and typography configuration
- Apply consistent styling across all UI elements
- Improve button, text input, and progress bar appearance
- Use monospace font for transcription results display
- Add hover and active states for interactive elements
- Phase 5 complete: Professional UI styling applied
2025-11-12 05:12:38 +00:00
Claude
dd57adab18
feat: Implement comprehensive export functionality
- Create TranscriptionExporter utility supporting TXT, SRT, VTT, JSON, TSV formats
- Implement proper timestamp formatting for subtitle formats
- Update GUI export dialog with all supported formats
- Integrate exporter with main window
- Add robust error handling for export operations
- Phase 4 complete: Full export capabilities ready
2025-11-12 05:12:06 +00:00
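The "proper timestamp formatting for subtitle formats" item in commit dd57adab18 refers to the fact that SRT and WebVTT differ in the millisecond separator (comma versus period). A minimal sketch, assuming segment times are given in seconds; `format_timestamp` and `srt_cue` are hypothetical helpers, not the names used in `TranscriptionExporter`:

```python
def format_timestamp(seconds: float, fmt: str = "srt") -> str:
    """Format seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (WebVTT)."""
    if seconds < 0:
        raise ValueError("timestamp must be non-negative")
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    sep = "," if fmt == "srt" else "."
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{millis:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one numbered SRT cue from a segment's start, end, and text."""
    return f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
```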
Claude
3fa194fa1f
feat: Implement Whisper integration for Farsi transcription
- Create FarsiTranscriber class wrapping OpenAI's Whisper model
- Support both audio and video file formats
- Implement word-level timestamp extraction
- Add device detection (CUDA/CPU) for optimal performance
- Format results for display with timestamps
- Integrate transcriber with PyQt6 worker thread
- Add error handling and progress updates
- Phase 3 complete: Core transcription engine ready
2025-11-12 05:11:31 +00:00
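The CUDA/CPU device detection mentioned in commit 3fa194fa1f is commonly done with `torch.cuda.is_available()`. A minimal sketch under that assumption; `pick_device` is a hypothetical name, and the fallback to "cpu" when PyTorch is not installed is added here only so the example runs anywhere:

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA-capable GPU is usable, otherwise "cpu"."""
    try:
        import torch  # optional dependency in this sketch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```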
Claude
0cc07b98e3
feat: Create PyQt6 GUI with file picker and results display
- Implement MainWindow class with professional layout
- Add file picker for audio and video formats
- Create transcription button with threading support
- Add progress bar and status indicators
- Implement TranscriptionWorker thread to prevent UI freezing
- Add results display with timestamps support
- Create export button (placeholder for Phase 4)
- Add error handling and user feedback
- Phase 2 complete: Full GUI scaffolding ready
2025-11-12 05:10:53 +00:00
Claude
86b2a93dee
feat: Initialize Farsi Transcriber application structure
- Create project directories (ui, models, utils)
- Add PyQt6 environment setup with requirements.txt
- Create main entry point (main.py)
- Add comprehensive README with setup instructions
- Add .gitignore for Python, PyTorch, and ML artifacts
- Phase 1 complete: project structure and environment ready
2025-11-12 05:09:15 +00:00
Jong Wook Kim
c0d2f624c0 Release 20250625 2025-06-25 18:05:47 -07:00
Jong Wook Kim
db7fbc75fe Release 20250625 2025-06-25 18:03:25 -07:00
Jong Wook Kim
31243bad24 Release 20250625 v20250625 2025-06-25 18:00:48 -07:00
Dridi Yassin
1f8fc975d3
Fix: Update torch.load to use weights_only=True to prevent security w… (#2451)
* Fix: Update torch.load to use weights_only=True to prevent security warning

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2025-06-25 17:54:30 -07:00
Nathan Harmon
679ae1d141
Fix: Ensure DTW cost tensor is on the same device as input tensor (#2561)
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2025-06-25 17:42:09 -07:00
Nicholas Nadeau, Ph.D., P.Eng.
f50c4f264e
docs: updated README to specify translation model limitation (#2547)
Updated README given info from https://github.com/openai/whisper/discussions/2483
2025-06-25 17:03:47 -07:00
ExtReMLapin
86899243e9
Fixed triton kernel update to support latest triton versions (#2588)
* Update triton kernel using _unsafe_update_src

* support old triton versions

* refactored changes to update triton kernel only once

* Update triton_ops.py

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
Co-authored-by: Jong Wook Kim <ilikekjw@gmail.com>
2025-06-25 17:02:54 -07:00
Learpcs
5dff4db81a
Fix: GitHub display errors for Jupyter notebooks (#2589)
* Update LibriSpeech.ipynb

* Update Multilingual_ASR.ipynb
2025-06-25 16:55:15 -07:00
dependabot[bot]
dd985ac4b9
Bump the github-actions group with 3 updates (#2592)
Bumps the github-actions group with 3 updates: [actions/checkout](https://github.com/actions/checkout), [actions/setup-python](https://github.com/actions/setup-python) and [softprops/action-gh-release](https://github.com/softprops/action-gh-release).


Updates `actions/checkout` from 3 to 4
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v3...v4)

Updates `actions/setup-python` from 4 to 5
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](https://github.com/actions/setup-python/compare/v4...v5)

Updates `softprops/action-gh-release` from 1 to 2
- [Release notes](https://github.com/softprops/action-gh-release/releases)
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md)
- [Commits](https://github.com/softprops/action-gh-release/compare/v1...v2)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: actions/setup-python
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: softprops/action-gh-release
  dependency-version: '2'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-13 11:22:31 -07:00
Christian Clauss
e1e6aa60ff
Keep GitHub Actions up to date with GitHub's Dependabot (#2486)
Automates the creation of pull requests like
* #2430 

* [Keeping your actions up to date with Dependabot](https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot)
* [Configuration options for the dependabot.yml file - package-ecosystem](https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#package-ecosystem)
2025-05-13 11:10:43 -07:00
Christian Clauss
e6a5fc0ff0
pre-commit: Upgrade black v25.1.0 and isort v6.0.0 (#2514) 2025-05-13 09:43:34 -07:00
Christian Clauss
13907bed90
GitHub Actions: Add Python 3.13 to the testing (#2487)
* GitHub Actions: Add Python 3.13 to the testing

* GitHub Actions: Add Python 3.13 to the testing

* numba==0.61.0rc2; python_version=='3.13'

* triton>=2; python_version<'3.13'

* fail-fast: false

* Numba v0.61.0 is released

https://github.com/numba/numba/releases

* Update pyproject.toml
2025-05-12 21:10:40 -07:00
Jong Wook Kim
517a43ecd1
Update python-publish.yml
using `-m build --sdist` instead of `setup.py sdist`
2025-01-04 12:56:16 -08:00
Christian Clauss
dd4d010d2c
PEP 621: Migrate from setup.py to pyproject.toml (#2435) 2025-01-04 01:38:35 -08:00
Christian Clauss
26a7cacc83
pre-commit autoupdate && pre-commit run --all-files (#2484)
* pre-commit autoupdate && pre-commit run --all-files

* Black formatter needs a current version of Python
2025-01-04 01:02:18 -08:00
Christian Clauss
6c1d8f1ea1
Upgrade GitHub Actions (#2430) 2025-01-04 00:47:12 -08:00
Purfview
90db0de189
Bugfix: Illogical "Avoid computing higher temperatures on no_speech" (#1903)
* Bugfix: Illogical "Avoid computing higher temperatures on no_speech"

Bugfix for https://github.com/openai/whisper/pull/1279

The segment is also treated as "silence" when decoding has failed due to `compression_ratio_threshold`, even though further down the code it is no longer "silence".

"Silence" should apply only when decoding has failed due to `logprob_threshold`.

Like described there:
8bc8860694/whisper/transcribe.py (L421)

And in code there:
8bc8860694/whisper/transcribe.py (L243-L251)

* Fix if "logprob_threshold=None"

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2024-11-30 21:47:01 -08:00
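The rule described in commit 90db0de189 can be sketched as a standalone function: a segment counts as "silence" (and skips the higher-temperature retries) only when the no-speech probability is high and the low average log probability is what triggered the fallback. This is an illustrative reconstruction from the commit message, not the code in `whisper/transcribe.py`; the function name, argument names, and default values are assumptions.

```python
from typing import Optional


def needs_fallback(
    compression_ratio: float,
    avg_logprob: float,
    no_speech_prob: float,
    compression_ratio_threshold: Optional[float] = 2.4,
    logprob_threshold: Optional[float] = -1.0,
    no_speech_threshold: Optional[float] = 0.6,
) -> bool:
    """Decide whether to retry decoding at a higher temperature."""
    fallback = False
    if (
        compression_ratio_threshold is not None
        and compression_ratio > compression_ratio_threshold
    ):
        fallback = True  # output is too repetitive
    # With logprob_threshold=None this check is disabled entirely.
    low_logprob = logprob_threshold is not None and avg_logprob < logprob_threshold
    if low_logprob:
        fallback = True  # average log probability is too low
    if (
        no_speech_threshold is not None
        and no_speech_prob > no_speech_threshold
        and low_logprob
    ):
        fallback = False  # treated as silence: do not retry
    return fallback
```

With these rules, a repetitive segment with a high no-speech probability but an acceptable log probability still triggers a retry, which is the case the commit message calls out.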
Lowell Vaughn
fc5ded7d90
Updating README and doc strings to reflect that n_mels can now be 128 (#2049) 2024-11-26 09:37:01 -08:00
f1sh
173ff7dd1d
fix typo data/README.md (#2433) 2024-11-12 16:35:54 -08:00
BotMaster3000
271445b2f2
Update README.md (#2379)
Default now uses Turbo instead of Small
2024-11-03 23:00:30 -08:00
kittsil
5979f03701
Add option to carry initial_prompt with the sliding window (#2343)
* Add option to carry initial_prompt with the sliding window

Add an option `carry_initial_prompt = False` to `whisper.transcribe()`.
When set to `True`, `initial_prompt` is prepended to each internal `decode()` call's `prompt`.
If there is not enough context space at the start of the prompt, the prompt is left-sliced to make space.

* Prevent redundant initial_prompt_tokens

* Revert unnecessary .gitignore change

---------

Co-authored-by: Kittsil <kittsil@gmail.com>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2024-10-26 07:17:31 -07:00
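The left-slicing behaviour described for `carry_initial_prompt` in commit 5979f03701 can be illustrated on plain token lists. A minimal sketch, assuming a fixed prompt budget; `build_prompt` and `max_prompt_len` are hypothetical names, and keeping only the tail of an over-long initial prompt is an assumption of this sketch rather than documented behaviour:

```python
from typing import List


def build_prompt(
    initial_prompt: List[int], history: List[int], max_prompt_len: int
) -> List[int]:
    """Prepend the initial prompt and keep only the newest history tokens that fit."""
    if max_prompt_len <= 0:
        raise ValueError("max_prompt_len must be positive")
    remaining = max_prompt_len - len(initial_prompt)
    if remaining <= 0:
        # Assumption: an over-long initial prompt is itself cut to the budget.
        return initial_prompt[-max_prompt_len:]
    # Left-slice the history: drop the oldest tokens, keep the most recent ones.
    return initial_prompt + history[-remaining:]
```

For example, `build_prompt([1, 2], [10, 11, 12, 13], 4)` keeps the two initial tokens and the two most recent history tokens.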
Jong Wook Kim
cdb8147962
more pytorch versions in tests (#2408) 2024-10-25 17:30:02 -07:00
Jong Wook Kim
25639fc17d Release 20240930 v20240930 2024-09-30 11:20:53 -07:00
Jong Wook Kim
260bbcfcb3
allowing numpy 2 in tests (#2362)
* allowing numpy 2 in tests

* allowing numpy 2 in tests
2024-09-30 11:18:17 -07:00
Jong Wook Kim
25e5c364e0
large-v3-turbo model (#2361) 2024-09-30 10:59:51 -07:00
Jong Wook Kim
b66b46f32d
test on python/pytorch versions up to 3.12 and 2.4.1 (#2360) 2024-09-30 10:33:56 -07:00
Jong Wook Kim
27f971320a
using sdpa if available (#2359)
* using sdpa if available

* Update model.py
2024-09-30 10:27:14 -07:00
Jong Wook Kim
423492dda7 Release 20240927 v20240927 2024-09-27 16:43:58 -07:00
Jong Wook Kim
279133e310
pinning numpy<2 in tests (#2332)
* pinning numpy<2 in tests

* pip install together

* pip install together
2024-09-10 10:43:21 -07:00
Jianan Xing
32d55d5d76
Relax triton requirements for compatibility with pytorch 2.4 and newer (#2307)
* Relax triton requirements for compatibility with pytorch 2.4 and newer

Similar to https://github.com/openai/whisper/pull/1802, but now that pytorch has upgraded to 2.4, it requires triton==3.0.0. I am not sure whether it makes sense to remove the upper-bound version constraints.

* Update requirements.txt
2024-09-10 09:53:08 -07:00
ryanheise
ba3f3cd54b
Skip silence around hallucinations (#1838)
* Add clip_timestamps option

* Add hallucination_silence_threshold option

* Fix typing for python < 3.9

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-12-18 12:11:16 -08:00