166 Commits

Author SHA1 Message Date
Jong Wook Kim
c0d2f624c0 Release 20250625 2025-06-25 18:05:47 -07:00
Jong Wook Kim
db7fbc75fe Release 20250625 2025-06-25 18:03:25 -07:00
Jong Wook Kim
31243bad24 Release 20250625 v20250625 2025-06-25 18:00:48 -07:00
Dridi Yassin
1f8fc975d3
Fix: Update torch.load to use weights_only=True to prevent security w… (#2451)
* Fix: Update torch.load to use weights_only=True to prevent security warning

* Update __init__.py

* Update __init__.py

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2025-06-25 17:54:30 -07:00
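As a rough illustration of why restricting deserialization matters, the sketch below uses only the standard-library `pickle` module, not PyTorch. The allow-list `_ALLOWED` and the names `RestrictedUnpickler`, `restricted_loads`, and `Exploit` are hypothetical; the set of types that `weights_only=True` actually permits inside `torch.load` is different and is not reproduced here.

```python
import io
import pickle

# Hypothetical allow-list for this sketch only; torch's real list differs.
_ALLOWED = {
    ("builtins", "dict"),
    ("builtins", "list"),
    ("collections", "OrderedDict"),
}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global not on the allow-list."""

    def find_class(self, module, name):
        if (module, name) in _ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Deserialize bytes while rejecting globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Exploit:
    """Object whose pickle asks the loader to call a function on load."""

    def __reduce__(self):
        return (print, ("arbitrary code ran during unpickling",))


safe_payload = pickle.dumps({"weights": [0.1, 0.2]})
unsafe_payload = pickle.dumps(Exploit())
```

With an unrestricted `pickle.loads`, the second payload would call `print` while loading; the restricted loader raises `pickle.UnpicklingError` before anything runs.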
Nathan Harmon
679ae1d141
Fix: Ensure DTW cost tensor is on the same device as input tensor (#2561)
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2025-06-25 17:42:09 -07:00
Nicholas Nadeau, Ph.D., P.Eng.
f50c4f264e
docs: updated README to specify translation model limitation (#2547)
Updated README given info from https://github.com/openai/whisper/discussions/2483
2025-06-25 17:03:47 -07:00
ExtReMLapin
86899243e9
Fixed triton kernel update to support latest triton versions (#2588)
* Update triton kernel using _unsafe_update_src

* support old triton versions

* refactored changes to update triton kernel only once

* Update triton_ops.py

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
Co-authored-by: Jong Wook Kim <ilikekjw@gmail.com>
2025-06-25 17:02:54 -07:00
Learpcs
5dff4db81a
Fix: GitHub display errors for Jupyter notebooks (#2589)
* Update LibriSpeech.ipynb

Update LibriSpeech.ipynb

* Update Multilingual_ASR.ipynb
2025-06-25 16:55:15 -07:00
dependabot[bot]
dd985ac4b9
Bump the github-actions group with 3 updates (#2592)
Bumps the github-actions group with 3 updates: [actions/checkout](https://github.com/actions/checkout), [actions/setup-python](https://github.com/actions/setup-python) and [softprops/action-gh-release](https://github.com/softprops/action-gh-release).


Updates `actions/checkout` from 3 to 4
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v3...v4)

Updates `actions/setup-python` from 4 to 5
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](https://github.com/actions/setup-python/compare/v4...v5)

Updates `softprops/action-gh-release` from 1 to 2
- [Release notes](https://github.com/softprops/action-gh-release/releases)
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md)
- [Commits](https://github.com/softprops/action-gh-release/compare/v1...v2)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: actions/setup-python
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: softprops/action-gh-release
  dependency-version: '2'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-13 11:22:31 -07:00
Christian Clauss
e1e6aa60ff
Keep GitHub Actions up to date with GitHub's Dependabot (#2486)
Automates the creation of pull requests like
* #2430 

* [Keeping your actions up to date with Dependabot](https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot)
* [Configuration options for the dependabot.yml file - package-ecosystem](https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#package-ecosystem)
2025-05-13 11:10:43 -07:00
Christian Clauss
e6a5fc0ff0
pre-commit: Upgrade black v25.1.0 and isort v6.0.0 (#2514) 2025-05-13 09:43:34 -07:00
Christian Clauss
13907bed90
GitHub Actions: Add Python 3.13 to the testing (#2487)
* GitHub Actions: Add Python 3.13 to the testing

* GitHub Actions: Add Python 3.13 to the testing

* numba==0.61.0rc2; python_version=='3.13'

* triton>=2; python_version<'3.13'

* fail-fast: false

* Numba v0.61.0 is released

https://github.com/numba/numba/releases

* Update pyproject.toml
2025-05-12 21:10:40 -07:00
Jong Wook Kim
517a43ecd1
Update python-publish.yml
using `-m build --sdist` instead of `setup.py sdist`
2025-01-04 12:56:16 -08:00
Christian Clauss
dd4d010d2c
PEP 621: Migrate from setup.py to pyproject.toml (#2435) 2025-01-04 01:38:35 -08:00
Christian Clauss
26a7cacc83
pre-commit autoupdate && pre-commit run --all-files (#2484)
* pre-commit autoupdate && pre-commit run --all-files

* Black formatter needs a current version of Python
2025-01-04 01:02:18 -08:00
Christian Clauss
6c1d8f1ea1
Upgrade GitHub Actions (#2430) 2025-01-04 00:47:12 -08:00
Purfview
90db0de189
Bugfix: Illogical "Avoid computing higher temperatures on no_speech" (#1903)
* Bugfix: Illogical "Avoid computing higher temperatures on no_speech"

Bugfix for https://github.com/openai/whisper/pull/1279

A segment is treated as "silence" even when decoding has failed due to `compression_ratio_threshold`, although further down in the code that case is no longer considered "silence".

"Silence" should apply only when decoding has failed due to `logprob_threshold`.

As described here:
8bc8860694/whisper/transcribe.py (L421)

And in the code here:
8bc8860694/whisper/transcribe.py (L243-L251)

* Fix if "logprob_threshold=None"

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2024-11-30 21:47:01 -08:00
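One way to read the rule described in this commit message is sketched below. The `DecodeOutcome` class, the `needs_fallback` function, and the default threshold values are assumptions made for illustration; this is not code copied from `whisper/transcribe.py`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecodeOutcome:
    """Hypothetical stand-in for the statistics of one decoding attempt."""

    compression_ratio: float
    avg_logprob: float
    no_speech_prob: float


def needs_fallback(
    result: DecodeOutcome,
    compression_ratio_threshold: Optional[float] = 2.4,  # illustrative default
    logprob_threshold: Optional[float] = -1.0,  # illustrative default
    no_speech_threshold: Optional[float] = 0.6,  # illustrative default
) -> bool:
    """Return True if decoding should be retried at a higher temperature."""
    too_repetitive = (
        compression_ratio_threshold is not None
        and result.compression_ratio > compression_ratio_threshold
    )
    low_confidence = (
        logprob_threshold is not None
        and result.avg_logprob < logprob_threshold
    )
    looks_silent = (
        no_speech_threshold is not None
        and result.no_speech_prob > no_speech_threshold
    )
    # Assumption: "silence" cancels the retry only when the log-probability
    # check failed; a compression-ratio failure alone still triggers a retry.
    # The None guard on logprob_threshold covers the "logprob_threshold=None"
    # case mentioned in the second bullet.
    if looks_silent and low_confidence:
        return False
    return too_repetitive or low_confidence
```

Under these assumptions, a repetitive result with a high no-speech probability is still retried, while a low-confidence result with a high no-speech probability is accepted as silence.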
Lowell Vaughn
fc5ded7d90
Updating README and doc strings to reflect that n_mels can now be 128 (#2049) 2024-11-26 09:37:01 -08:00
f1sh
173ff7dd1d
fix typo data/README.md (#2433) 2024-11-12 16:35:54 -08:00
BotMaster3000
271445b2f2
Update README.md (#2379)
Default now uses Turbo instead of Small
2024-11-03 23:00:30 -08:00
kittsil
5979f03701
Add option to carry initial_prompt with the sliding window (#2343)
* Add option to carry initial_prompt with the sliding window

Add an option `carry_initial_prompt = False` to `whisper.transcribe()`.
When set to `True`, `initial_prompt` is prepended to each internal `decode()` call's `prompt`.
If there is not enough context space at the start of the prompt, the prompt is left-sliced to make space.

* Prevent redundant initial_prompt_tokens

* Revert unnecessary .gitignore change

---------

Co-authored-by: Kittsil <kittsil@gmail.com>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2024-10-26 07:17:31 -07:00
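The prompt handling described in this commit message could be sketched roughly as follows. The function `build_prompt`, its parameters, and the choice to truncate an over-long initial prompt are assumptions for illustration; the real code works on tokenizer output inside `whisper.transcribe()` and may differ in detail.

```python
from typing import List


def build_prompt(
    initial_prompt_tokens: List[int],
    previous_tokens: List[int],
    max_prompt_len: int,
) -> List[int]:
    """Prepend the initial prompt, then keep the newest context that fits."""
    remaining = max_prompt_len - len(initial_prompt_tokens)
    if remaining <= 0:
        # Assumption: with no room left for rolling context, keep only the
        # start of the initial prompt, truncated to the budget.
        return initial_prompt_tokens[:max_prompt_len]
    # Left-slice the rolling context: drop its oldest tokens so the
    # initial prompt plus the most recent tokens fit the budget.
    return initial_prompt_tokens + previous_tokens[-remaining:]
```

For example, with a budget of 5 tokens, an initial prompt `[1, 2]`, and rolling context `[10, 11, 12, 13]`, the oldest context token is dropped and the result is `[1, 2, 11, 12, 13]`.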
Jong Wook Kim
cdb8147962
more pytorch versions in tests (#2408) 2024-10-25 17:30:02 -07:00
Jong Wook Kim
25639fc17d Release 20240930 v20240930 2024-09-30 11:20:53 -07:00
Jong Wook Kim
260bbcfcb3
allowing numpy 2 in tests (#2362)
* allowing numpy 2 in tests

* allowing numpy 2 in tests
2024-09-30 11:18:17 -07:00
Jong Wook Kim
25e5c364e0
large-v3-turbo model (#2361) 2024-09-30 10:59:51 -07:00
Jong Wook Kim
b66b46f32d
test on python/pytorch versions up to 3.12 and 2.4.1 (#2360) 2024-09-30 10:33:56 -07:00
Jong Wook Kim
27f971320a
using sdpa if available (#2359)
* using sdpa if available

* Update model.py
2024-09-30 10:27:14 -07:00
Jong Wook Kim
423492dda7 Release 20240927 v20240927 2024-09-27 16:43:58 -07:00
Jong Wook Kim
279133e310
pinning numpy<2 in tests (#2332)
* pinning numpy<2 in tests

* pip install together

* pip install together
2024-09-10 10:43:21 -07:00
Jianan Xing
32d55d5d76
Relax triton requirements for compatibility with pytorch 2.4 and newer (#2307)
* Relax triton requirements for compatibility with pytorch 2.4 and newer

Similar to https://github.com/openai/whisper/pull/1802, but PyTorch 2.4 now requires triton==3.0.0. I am not sure whether it makes sense to remove the upper-bound version constraint.

* Update requirements.txt
2024-09-10 09:53:08 -07:00
ryanheise
ba3f3cd54b
Skip silence around hallucinations (#1838)
* Add clip_timestamps option

* Add hallucination_silence_threshold option

* Fix typing for python < 3.9

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-12-18 12:11:16 -08:00
Bob Lin
8bc8860694
Fix triton env marker (#1887) 2023-12-11 10:39:08 -05:00
Jong Wook Kim
e58f288045 Release 20231117 v20231117 2023-11-17 11:59:28 -08:00
Eugene Indenbom
1cea435768
Relax triton requirements for compatibility with pytorch 2.1 and newer (#1802) 2023-11-13 09:43:42 -08:00
Jong Wook Kim
fcfeaf1b61 Release 20231106 v20231106 2023-11-06 10:14:04 -08:00
Jong Wook Kim
c5d4256076
large-v3 (#1761)
* mel_filters() loads 128 mel bins

* can load 100-language models

* large-v3 checkpoint and evals

* add mandarin alias

* remove unused path

* flake8 fix

* formatting fix
2023-11-06 10:10:30 -08:00
Jong Wook Kim
f6f01c561c Release 20231105 v20231105 2023-11-06 03:08:56 -08:00
Jong Wook Kim
746aaaeafa
remove tiktoken pin (#1759) 2023-11-06 03:05:21 -08:00
Philippe Hebert
b9f17e1f2d
docs: Disambiguation of the term "relative speed" in the README (#1751)
* docs: defines relative speed in README

* combined paragraphs

---------

Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
2023-11-06 02:43:07 -08:00
Mohamad Zamini
7dfcd56304
allow_pickle=False while loading of mel matrix IN audio.py (#1511)
* Update audio.py

The `mel_filters` function uses `np.load` to load a precomputed mel filterbank matrix. This function is not thread-safe, so calling it from multiple threads at the same time may corrupt the data.

To fix this, use `torch.load` instead. It is thread-safe, so concurrent calls from multiple threads will not corrupt the data.

* Update audio.py

updated the docstring

* allow_pickle=False

* newline

---------

Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-11-06 02:28:51 -08:00
Marco Zucconelli
b7d277acd5
handling transcribe exceptions. (#1682)
* handling transcribe() exceptions.

* printing stacktrace

---------

Co-authored-by: invalid <invalid@email.com>
Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-11-06 02:06:19 -08:00
amosal
6ed314fe41
Add new option to generate subtitles by a specific number of words (#1729)
* ADD parser for new argument --max_words_count

* ADD max_words_count in words_options
ADD warning for max_line_width compatibility

* ADD logic for max_words_count

* rename to max_words_per_line

* make them kwargs

* allow specifying file path by --model

* black formatting

---------

Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
2023-11-06 01:49:33 -08:00
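The idea behind a per-line word limit can be sketched as below. The helper `group_words` is hypothetical and works on plain strings; the actual subtitle writer operates on timestamped words and has to account for `max_line_width` as well, which this sketch ignores.

```python
from typing import Iterator, List


def group_words(words: List[str], max_words_per_line: int) -> Iterator[str]:
    """Yield subtitle lines containing at most max_words_per_line words."""
    if max_words_per_line < 1:
        raise ValueError("max_words_per_line must be at least 1")
    for start in range(0, len(words), max_words_per_line):
        # Each slice holds up to max_words_per_line consecutive words.
        yield " ".join(words[start:start + max_words_per_line])
```

Calling `list(group_words("the quick brown fox jumps".split(), 2))` gives three lines, with the last line holding the single leftover word.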
Jordi Mas
b38a1f20f4
Fix exception when an audio file with no speech is provided (#1396)
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-10-10 10:01:01 -07:00
Jong Wook Kim
0a60fcaa9b Release 20230918 v20230918 2023-09-18 17:13:19 -07:00
Jong Wook Kim
5f957da5ca
Update test.yml 2023-09-18 16:38:17 -07:00
Arthur Kim
8b330df096
Add .pre-commit-config.yaml (#1528)
* Add .pre-commit-config.yaml

Co-authored-by: arthur <arthur@rtzr.ai>

* flake8 E741

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-09-18 16:15:33 -07:00
sqhao
21010ef454
fix doc of TextDecoder (#1526)
Signed-off-by: haoshengqiang <haoshengqiang@xiaohongshu.com>
Co-authored-by: haoshengqiang <haoshengqiang@xiaohongshu.com>
2023-09-18 16:09:59 -07:00
Nino Risteski
29b7df6231
Update model-card.md (#1643)
fixed a few typos
2023-09-18 15:59:49 -07:00
taylorchu
e8622f9afc
word timing tweaks (#1559)
* word timing tweaks

* comment on eot

* clearer comments
2023-08-08 06:48:56 +09:00
WangChou Lu
b91c907694
Avoid rearranging all caches (#1483)
* avoid rearranging all kv_caches

* avoid calculating the same kv_cache from cross attn

* Update decoding.py

* linter fix

---------

Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-07-06 12:48:08 -07:00