169 Commits

Author SHA1 Message Date
Louis Brulé Naudet
e3a432ca64
Merge 67f5b1d2317f1b7223b0d39c1b7fd2f3ed0c6272 into c0d2f624c09dc18e709e37c2ad90c039a4eb72a2 2025-06-27 02:27:48 +00:00
Jong Wook Kim
c0d2f624c0 Release 20250625 2025-06-25 18:05:47 -07:00
Jong Wook Kim
db7fbc75fe Release 20250625 2025-06-25 18:03:25 -07:00
Jong Wook Kim
31243bad24 Release 20250625 v20250625 2025-06-25 18:00:48 -07:00
Dridi Yassin
1f8fc975d3
Fix: Update torch.load to use weights_only=True to prevent security w… (#2451)
* Fix: Update torch.load to use weights_only=True to prevent security warning

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2025-06-25 17:54:30 -07:00
Nathan Harmon
679ae1d141
Fix: Ensure DTW cost tensor is on the same device as input tensor (#2561)
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2025-06-25 17:42:09 -07:00
Nicholas Nadeau, Ph.D., P.Eng.
f50c4f264e
docs: updated README to specify translation model limitation (#2547)
Updated README given info from https://github.com/openai/whisper/discussions/2483
2025-06-25 17:03:47 -07:00
ExtReMLapin
86899243e9
Fixed triton kernel update to support latest triton versions (#2588)
* Update triton kernel using _unsafe_update_src

* support old triton versions

* refactored changes to update triton kernel only once

* Update triton_ops.py

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
Co-authored-by: Jong Wook Kim <ilikekjw@gmail.com>
2025-06-25 17:02:54 -07:00
Learpcs
5dff4db81a
Fix: GitHub display errors for Jupyter notebooks (#2589)
* Update LibriSpeech.ipynb

Update LibriSpeech.ipynb

* Update Multilingual_ASR.ipynb
2025-06-25 16:55:15 -07:00
dependabot[bot]
dd985ac4b9
Bump the github-actions group with 3 updates (#2592)
Bumps the github-actions group with 3 updates: [actions/checkout](https://github.com/actions/checkout), [actions/setup-python](https://github.com/actions/setup-python) and [softprops/action-gh-release](https://github.com/softprops/action-gh-release).


Updates `actions/checkout` from 3 to 4
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v3...v4)

Updates `actions/setup-python` from 4 to 5
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](https://github.com/actions/setup-python/compare/v4...v5)

Updates `softprops/action-gh-release` from 1 to 2
- [Release notes](https://github.com/softprops/action-gh-release/releases)
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md)
- [Commits](https://github.com/softprops/action-gh-release/compare/v1...v2)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: actions/setup-python
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: softprops/action-gh-release
  dependency-version: '2'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-13 11:22:31 -07:00
Christian Clauss
e1e6aa60ff
Keep GitHub Actions up to date with GitHub's Dependabot (#2486)
Automates the creation of pull requests like
* #2430 

* [Keeping your actions up to date with Dependabot](https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot)
* [Configuration options for the dependabot.yml file - package-ecosystem](https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#package-ecosystem)
2025-05-13 11:10:43 -07:00
Christian Clauss
e6a5fc0ff0
pre-commit: Upgrade black v25.1.0 and isort v6.0.0 (#2514) 2025-05-13 09:43:34 -07:00
Christian Clauss
13907bed90
GitHub Actions: Add Python 3.13 to the testing (#2487)
* GitHub Actions: Add Python 3.13 to the testing

* GitHub Actions: Add Python 3.13 to the testing

* numba==0.61.0rc2; python_version=='3.13'

* triton>=2; python_version<'3.13'

* fail-fast: false

* Numba v0.61.0 is released

https://github.com/numba/numba/releases

* Update pyproject.toml
2025-05-12 21:10:40 -07:00
Jong Wook Kim
517a43ecd1
Update python-publish.yml
using `-m build --sdist` instead of `setup.py sdist`
2025-01-04 12:56:16 -08:00
Christian Clauss
dd4d010d2c
PEP 621: Migrate from setup.py to pyproject.toml (#2435) 2025-01-04 01:38:35 -08:00
Christian Clauss
26a7cacc83
pre-commit autoupdate && pre-commit run --all-files (#2484)
* pre-commit autoupdate && pre-commit run --all-files

* Black formatter needs a current version of Python
2025-01-04 01:02:18 -08:00
Christian Clauss
6c1d8f1ea1
Upgrade GitHub Actions (#2430) 2025-01-04 00:47:12 -08:00
Purfview
90db0de189
Bugfix: Illogical "Avoid computing higher temperatures on no_speech" (#1903)
* Bugfix: Illogical "Avoid computing higher temperatures on no_speech"

Bugfix for https://github.com/openai/whisper/pull/1279

The segment is also treated as "silence" when decoding has failed due to `compression_ratio_threshold`, even though further down the code it is no longer considered "silence".

A segment should count as "silence" only when decoding has failed due to `logprob_threshold`.

As described here:
8bc8860694/whisper/transcribe.py (L421)

And in the code here:
8bc8860694/whisper/transcribe.py (L243-L251)

* Fix if "logprob_threshold=None"

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2024-11-30 21:47:01 -08:00
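
The fallback rule described in #1903 above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from `whisper/transcribe.py`: the `DecodeResult` dataclass, the `needs_fallback` function and the default values used here are assumptions made for the example. Only the threshold names follow the commit message and the existing `transcribe()` options.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecodeResult:
    # Hypothetical stand-in holding only the fields this sketch needs.
    avg_logprob: float
    no_speech_prob: float
    compression_ratio: float


def needs_fallback(
    result: DecodeResult,
    compression_ratio_threshold: Optional[float] = 2.4,  # illustrative default
    logprob_threshold: Optional[float] = -1.0,  # illustrative default
    no_speech_threshold: Optional[float] = 0.6,  # illustrative default
) -> bool:
    """Return True if decoding should be retried at a higher temperature."""
    fallback = False

    # Failure 1: the output is too repetitive.
    if (
        compression_ratio_threshold is not None
        and result.compression_ratio > compression_ratio_threshold
    ):
        fallback = True

    # Failure 2: the average log probability is too low.
    # With logprob_threshold=None this check is disabled entirely.
    low_logprob = (
        logprob_threshold is not None
        and result.avg_logprob < logprob_threshold
    )
    if low_logprob:
        fallback = True

    # "Silence" cancels the fallback only when the log-probability check
    # failed as well; a high no-speech probability alone is not enough.
    if (
        no_speech_threshold is not None
        and result.no_speech_prob > no_speech_threshold
        and low_logprob
    ):
        fallback = False

    return fallback
```

In this sketch, a repetitive segment with a high no-speech probability but an acceptable log probability still triggers the fallback, which is the behaviour the bugfix asks for.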
Lowell Vaughn
fc5ded7d90
Updating README and doc strings to reflect that n_mels can now be 128 (#2049) 2024-11-26 09:37:01 -08:00
f1sh
173ff7dd1d
fix typo data/README.md (#2433) 2024-11-12 16:35:54 -08:00
BotMaster3000
271445b2f2
Update README.md (#2379)
Default now uses Turbo instead of Small
2024-11-03 23:00:30 -08:00
Louis Brulé Naudet
67f5b1d231
Merge branch 'main' into main 2024-11-03 18:03:57 +01:00
kittsil
5979f03701
Add option to carry initial_prompt with the sliding window (#2343)
* Add option to carry initial_prompt with the sliding window

Add an option `carry_initial_prompt = False` to `whisper.transcribe()`.
When set to `True`, `initial_prompt` is prepended to each internal `decode()` call's `prompt`.
If there is not enough context space at the start of the prompt, the prompt is left-sliced to make space.

* Prevent redundant initial_prompt_tokens

* Revert unnecessary .gitignore change

---------

Co-authored-by: Kittsil <kittsil@gmail.com>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2024-10-26 07:17:31 -07:00
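
The `carry_initial_prompt` behaviour described in #2343 above can be illustrated with a small sketch. It is not the implementation in `whisper.transcribe()`: the function name `build_prompt`, its parameters and the integer token lists are hypothetical, and it assumes the initial prompt alone fits within the prompt budget.

```python
from typing import List


def build_prompt(
    initial_prompt_tokens: List[int],
    previous_tokens: List[int],
    max_prompt_len: int,
) -> List[int]:
    """Prepend the initial prompt and left-slice the history to fit.

    Hypothetical helper: `max_prompt_len` stands for the number of prompt
    tokens the decoder context can hold.
    """
    # Space left for previously decoded tokens after the initial prompt.
    room = max_prompt_len - len(initial_prompt_tokens)

    # Keep only the most recent `room` tokens (drop the oldest ones).
    # The explicit check avoids `previous_tokens[-0:]`, which would
    # return the whole list instead of an empty one.
    tail = previous_tokens[-room:] if room > 0 else []

    return list(initial_prompt_tokens) + list(tail)


# Usage: with a budget of 5 tokens and a 2-token initial prompt,
# only the 3 most recent history tokens are kept.
prompt = build_prompt([1, 2], [10, 11, 12, 13], max_prompt_len=5)
```

The design point is that the initial prompt always survives in front, while the sliding-window history gives up its oldest tokens first.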
Jong Wook Kim
cdb8147962
more pytorch versions in tests (#2408) 2024-10-25 17:30:02 -07:00
Jong Wook Kim
25639fc17d Release 20240930 v20240930 2024-09-30 11:20:53 -07:00
Jong Wook Kim
260bbcfcb3
allowing numpy 2 in tests (#2362)
* allowing numpy 2 in tests

* allowing numpy 2 in tests
2024-09-30 11:18:17 -07:00
Jong Wook Kim
25e5c364e0
large-v3-turbo model (#2361) 2024-09-30 10:59:51 -07:00
Jong Wook Kim
b66b46f32d
test on python/pytorch versions up to 3.12 and 2.4.1 (#2360) 2024-09-30 10:33:56 -07:00
Jong Wook Kim
27f971320a
using sdpa if available (#2359)
* using sdpa if available

* Update model.py
2024-09-30 10:27:14 -07:00
Jong Wook Kim
423492dda7 Release 20240927 v20240927 2024-09-27 16:43:58 -07:00
Jong Wook Kim
279133e310
pinning numpy<2 in tests (#2332)
* pinning numpy<2 in tests

* pip install together

* pip install together
2024-09-10 10:43:21 -07:00
Jianan Xing
32d55d5d76
Relax triton requirements for compatibility with pytorch 2.4 and newer (#2307)
* Relax triton requirements for compatibility with pytorch 2.4 and newer

Similar to https://github.com/openai/whisper/pull/1802, but PyTorch 2.4 now requires triton==3.0.0. I am not sure whether it makes sense to remove the upper-bound version constraints.

* Update requirements.txt
2024-09-10 09:53:08 -07:00
Louis Brulé Naudet
492c05c5f3 Update utils.py
Dear Developers,

I'm pleased to inform you that I have completed the documentation update for the utils.py file.

The updated documentation provides clear explanations of function parameters, return types, and expected behavior. Additionally, it adheres to consistent formatting and organization, ensuring ease of understanding for both current and future developers.

Please review the updated documentation at your earliest convenience. If you have any feedback or suggestions for further improvements, please don't hesitate to let me know.

Thank you for your attention to this matter.

Best regards,
Louis Brulé Naudet
2024-02-19 20:12:26 +01:00
ryanheise
ba3f3cd54b
Skip silence around hallucinations (#1838)
* Add clip_timestamps option

* Add hallucination_silence_threshold option

* Fix typing for python < 3.9

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-12-18 12:11:16 -08:00
Bob Lin
8bc8860694
Fix triton env marker (#1887) 2023-12-11 10:39:08 -05:00
Jong Wook Kim
e58f288045 Release 20231117 v20231117 2023-11-17 11:59:28 -08:00
Eugene Indenbom
1cea435768
Relax triton requirements for compatibility with pytorch 2.1 and newer (#1802) 2023-11-13 09:43:42 -08:00
Jong Wook Kim
fcfeaf1b61 Release 20231106 v20231106 2023-11-06 10:14:04 -08:00
Jong Wook Kim
c5d4256076
large-v3 (#1761)
* mel_filters() loads 128 mel bins

* can load 100-language models

* large-v3 checkpoint and evals

* add mandarin alias

* remove unused path

* flake8 fix

* formatting fix
2023-11-06 10:10:30 -08:00
Jong Wook Kim
f6f01c561c Release 20231105 v20231105 2023-11-06 03:08:56 -08:00
Jong Wook Kim
746aaaeafa
remove tiktoken pin (#1759) 2023-11-06 03:05:21 -08:00
Philippe Hebert
b9f17e1f2d
docs: Disambiguation of the term "relative speed" in the README (#1751)
* docs: defines relative speed in README

* combined paragraphs

---------

Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
2023-11-06 02:43:07 -08:00
Mohamad Zamini
7dfcd56304
allow_pickle=False while loading of mel matrix IN audio.py (#1511)
* Update audio.py

The `mel_filters` function is using a `np.load` function to load a pre-computed mel filterbank matrix. This function is not thread-safe, which means that if it is called from multiple threads at the same time, it may corrupt the data.

To fix this, you can use the `torch.load` function instead. This function is thread-safe, so it will not corrupt the data if it is called from multiple threads at the same time.

* Update audio.py

updated the docstring

* allow_pickle=False

* newline

---------

Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-11-06 02:28:51 -08:00
Marco Zucconelli
b7d277acd5
handling transcribe exceptions. (#1682)
* handling transcribe() exceptions.

* printing stacktrace

---------

Co-authored-by: invalid <invalid@email.com>
Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-11-06 02:06:19 -08:00
amosal
6ed314fe41
Add new option to generate subtitles by a specific number of words (#1729)
* ADD parser for new argument --max_words_count

* ADD max_words_count in words_options
ADD warning for max_line_width compatibility

* ADD logic for max_words_count

* rename to max_words_per_line

* make them kwargs

* allow specifying file path by --model

* black formatting

---------

Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
2023-11-06 01:49:33 -08:00
Jordi Mas
b38a1f20f4
Fix exception when an audio file with no speech is provided (#1396)
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-10-10 10:01:01 -07:00
Jong Wook Kim
0a60fcaa9b Release 20230918 v20230918 2023-09-18 17:13:19 -07:00
Jong Wook Kim
5f957da5ca
Update test.yml 2023-09-18 16:38:17 -07:00
Arthur Kim
8b330df096
Add .pre-commit-config.yaml (#1528)
* Add .pre-commit-config.yaml

Co-authored-by: arthur <arthur@rtzr.ai>

* flake8 E741

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-09-18 16:15:33 -07:00
sqhao
21010ef454
fix doc of TextDecoder (#1526)
Signed-off-by: haoshengqiang <haoshengqiang@xiaohongshu.com>
Co-authored-by: haoshengqiang <haoshengqiang@xiaohongshu.com>
2023-09-18 16:09:59 -07:00