Jong Wook Kim
c0d2f624c0
Release 20250625
2025-06-25 18:05:47 -07:00
Jong Wook Kim
db7fbc75fe
Release 20250625
2025-06-25 18:03:25 -07:00
Jong Wook Kim
31243bad24
Release 20250625
v20250625
2025-06-25 18:00:48 -07:00
Dridi Yassin
1f8fc975d3
Fix: Update torch.load to use weights_only=True to prevent security w… ( #2451 )
...
* Fix: Update torch.load to use weights_only=True to prevent security warning
* Update __init__.py
---------
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2025-06-25 17:54:30 -07:00
Nathan Harmon
679ae1d141
Fix: Ensure DTW cost tensor is on the same device as input tensor ( #2561 )
...
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2025-06-25 17:42:09 -07:00
Nicholas Nadeau, Ph.D., P.Eng.
f50c4f264e
docs: updated README to specify translation model limitation ( #2547 )
...
Updated README given info from https://github.com/openai/whisper/discussions/2483
2025-06-25 17:03:47 -07:00
ExtReMLapin
86899243e9
Fixed triton kernel update to support latest triton versions ( #2588 )
...
* Update triton kernel using _unsafe_update_src
* support old triton versions
* refactored changes to update triton kernel only once
* Update triton_ops.py
---------
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
Co-authored-by: Jong Wook Kim <ilikekjw@gmail.com>
2025-06-25 17:02:54 -07:00
Learpcs
5dff4db81a
Fix: GitHub display errors for Jupyter notebooks ( #2589 )
...
* Update LibriSpeech.ipynb
* Update Multilingual_ASR.ipynb
2025-06-25 16:55:15 -07:00
dependabot[bot]
dd985ac4b9
Bump the github-actions group with 3 updates ( #2592 )
...
Bumps the github-actions group with 3 updates: [actions/checkout](https://github.com/actions/checkout), [actions/setup-python](https://github.com/actions/setup-python) and [softprops/action-gh-release](https://github.com/softprops/action-gh-release).
Updates `actions/checkout` from 3 to 4
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v3...v4)
Updates `actions/setup-python` from 4 to 5
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](https://github.com/actions/setup-python/compare/v4...v5)
Updates `softprops/action-gh-release` from 1 to 2
- [Release notes](https://github.com/softprops/action-gh-release/releases)
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md)
- [Commits](https://github.com/softprops/action-gh-release/compare/v1...v2)
---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: actions/setup-python
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
- dependency-name: softprops/action-gh-release
  dependency-version: '2'
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: github-actions
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-13 11:22:31 -07:00
Christian Clauss
e1e6aa60ff
Keep GitHub Actions up to date with GitHub's Dependabot ( #2486 )
...
Automates the creation of pull requests like
* #2430
* [Keeping your actions up to date with Dependabot](https://docs.github.com/en/code-security/dependabot/working-with-dependabot/keeping-your-actions-up-to-date-with-dependabot)
* [Configuration options for the dependabot.yml file - package-ecosystem](https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#package-ecosystem)
2025-05-13 11:10:43 -07:00
Christian Clauss
e6a5fc0ff0
pre-commit: Upgrade black v25.1.0 and isort v6.0.0 ( #2514 )
2025-05-13 09:43:34 -07:00
Christian Clauss
13907bed90
GitHub Actions: Add Python 3.13 to the testing ( #2487 )
...
* GitHub Actions: Add Python 3.13 to the testing
* numba==0.61.0rc2; python_version=='3.13'
* triton>=2; python_version<'3.13'
* fail-fast: false
* Numba v0.61.0 is released
https://github.com/numba/numba/releases
* Update pyproject.toml
2025-05-12 21:10:40 -07:00
Jong Wook Kim
517a43ecd1
Update python-publish.yml
...
using `-m build --sdist` instead of `setup.py sdist`
2025-01-04 12:56:16 -08:00
Christian Clauss
dd4d010d2c
PEP 621: Migrate from setup.py to pyproject.toml ( #2435 )
2025-01-04 01:38:35 -08:00
Christian Clauss
26a7cacc83
pre-commit autoupdate && pre-commit run --all-files ( #2484 )
...
* pre-commit autoupdate && pre-commit run --all-files
* Black formatter needs a current version of Python
2025-01-04 01:02:18 -08:00
Christian Clauss
6c1d8f1ea1
Upgrade GitHub Actions ( #2430 )
2025-01-04 00:47:12 -08:00
Purfview
90db0de189
Bugfix: Illogical "Avoid computing higher temperatures on no_speech" ( #1903 )
...
* Bugfix: Illogical "Avoid computing higher temperatures on no_speech"
Bugfix for https://github.com/openai/whisper/pull/1279
Currently a segment is also treated as "silence" when decoding has failed due to `compression_ratio_threshold`, even though further down the code it is no longer considered "silence".
A segment should count as "silence" only when decoding has failed due to `logprob_threshold`.
As described here:
8bc8860694/whisper/transcribe.py (L421)
And in the code here:
8bc8860694/whisper/transcribe.py (L243-L251)
* Fix if "logprob_threshold=None"
---------
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2024-11-30 21:47:01 -08:00
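The rule described in #1903 can be illustrated with a minimal sketch. This is not the code from `whisper/transcribe.py`: the `DecodeStats` class, the `needs_fallback` function, and the default threshold values are assumptions made for illustration. It shows one way to skip the higher-temperature retry only when a segment looks like silence and its average log probability is also below `logprob_threshold`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DecodeStats:
    """Hypothetical stand-in for the fields of a decoding result used below."""
    compression_ratio: float
    avg_logprob: float
    no_speech_prob: float


def needs_fallback(
    stats: DecodeStats,
    compression_ratio_threshold: Optional[float] = 2.4,  # assumed default
    logprob_threshold: Optional[float] = -1.0,  # assumed default
    no_speech_threshold: Optional[float] = 0.6,  # assumed default
) -> bool:
    """Return True if decoding should be retried at a higher temperature."""
    too_repetitive = (
        compression_ratio_threshold is not None
        and stats.compression_ratio > compression_ratio_threshold
    )
    # A threshold of None disables the check, so it can never signal failure.
    low_logprob = (
        logprob_threshold is not None
        and stats.avg_logprob < logprob_threshold
    )
    # "Silence" requires a high no-speech probability AND a logprob failure;
    # a compression-ratio failure alone does not count as silence.
    is_silence = (
        no_speech_threshold is not None
        and stats.no_speech_prob > no_speech_threshold
        and low_logprob
    )
    if is_silence:
        return False
    return too_repetitive or low_logprob
```

With this shape, a repetitive segment with a high no-speech probability but an acceptable log probability is still retried, which is the behavior the bugfix asks for.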
Lowell Vaughn
fc5ded7d90
Updating README and doc strings to reflect that n_mels can now be 128 ( #2049 )
2024-11-26 09:37:01 -08:00
f1sh
173ff7dd1d
fix typo data/README.md ( #2433 )
2024-11-12 16:35:54 -08:00
BotMaster3000
271445b2f2
Update README.md ( #2379 )
...
Default now uses Turbo instead of Small
2024-11-03 23:00:30 -08:00
kittsil
5979f03701
Add option to carry initial_prompt with the sliding window ( #2343 )
...
* Add option to carry initial_prompt with the sliding window
Add an option `carry_initial_prompt = False` to `whisper.transcribe()`.
When set to `True`, `initial_prompt` is prepended to each internal `decode()` call's `prompt`.
If there is not enough context space at the start of the prompt, the prompt is left-sliced to make space.
* Prevent redundant initial_prompt_tokens
* Revert unnecessary .gitignore change
---------
Co-authored-by: Kittsil <kittsil@gmail.com>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2024-10-26 07:17:31 -07:00
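The prompt handling described in #2343 can be sketched as follows. The `build_prompt` helper and its `max_prompt_len` parameter are hypothetical names, not part of the whisper API, and the handling of an over-long initial prompt is an assumption. The sketch only illustrates prepending the initial prompt and left-slicing the rolling context so the result fits the available space.

```python
from typing import List


def build_prompt(
    initial_prompt_tokens: List[int],
    previous_tokens: List[int],
    max_prompt_len: int,
) -> List[int]:
    """Prepend the initial prompt and keep only the newest context that fits."""
    if max_prompt_len <= 0:
        return []
    # Assumption: an initial prompt longer than the budget keeps its last tokens.
    initial = initial_prompt_tokens[-max_prompt_len:]
    room = max_prompt_len - len(initial)
    # Left-slice the rolling context: drop the oldest tokens first.
    carried = previous_tokens[-room:] if room > 0 else []
    return initial + carried
```

For example, with a budget of four tokens, an initial prompt of `[1, 2]` and a context of `[10, 11, 12, 13]`, the sketch keeps the initial prompt and the two most recent context tokens.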
Jong Wook Kim
cdb8147962
more pytorch versions in tests ( #2408 )
2024-10-25 17:30:02 -07:00
Jong Wook Kim
25639fc17d
Release 20240930
v20240930
2024-09-30 11:20:53 -07:00
Jong Wook Kim
260bbcfcb3
allowing numpy 2 in tests ( #2362 )
...
* allowing numpy 2 in tests
2024-09-30 11:18:17 -07:00
Jong Wook Kim
25e5c364e0
large-v3-turbo model ( #2361 )
2024-09-30 10:59:51 -07:00
Jong Wook Kim
b66b46f32d
test on python/pytorch versions up to 3.12 and 2.4.1 ( #2360 )
2024-09-30 10:33:56 -07:00
Jong Wook Kim
27f971320a
using sdpa if available ( #2359 )
...
* using sdpa if available
* Update model.py
2024-09-30 10:27:14 -07:00
Jong Wook Kim
423492dda7
Release 20240927
v20240927
2024-09-27 16:43:58 -07:00
Jong Wook Kim
279133e310
pinning numpy<2 in tests ( #2332 )
...
* pinning numpy<2 in tests
* pip install together
2024-09-10 10:43:21 -07:00
Jianan Xing
32d55d5d76
Relax triton requirements for compatibility with pytorch 2.4 and newer ( #2307 )
...
* Relax triton requirements for compatibility with pytorch 2.4 and newer
Similar to https://github.com/openai/whisper/pull/1802, but now that pytorch has upgraded to 2.4, it requires triton==3.0.0. I am not sure whether it makes sense to remove the upper-bound version constraint.
* Update requirements.txt
2024-09-10 09:53:08 -07:00
ryanheise
ba3f3cd54b
Skip silence around hallucinations ( #1838 )
...
* Add clip_timestamps option
* Add hallucination_silence_threshold option
* Fix typing for python < 3.9
---------
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-12-18 12:11:16 -08:00
Bob Lin
8bc8860694
Fix triton env marker ( #1887 )
2023-12-11 10:39:08 -05:00
Jong Wook Kim
e58f288045
Release 20231117
v20231117
2023-11-17 11:59:28 -08:00
Eugene Indenbom
1cea435768
Relax triton requirements for compatibility with pytorch 2.1 and newer ( #1802 )
2023-11-13 09:43:42 -08:00
Jong Wook Kim
fcfeaf1b61
Release 20231106
v20231106
2023-11-06 10:14:04 -08:00
Jong Wook Kim
c5d4256076
large-v3 ( #1761 )
...
* mel_filters() loads 128 mel bins
* can load 100-language models
* large-v3 checkpoint and evals
* add mandarin alias
* remove unused path
* flake8 fix
* formatting fix
2023-11-06 10:10:30 -08:00
Jong Wook Kim
f6f01c561c
Release 20231105
v20231105
2023-11-06 03:08:56 -08:00
Jong Wook Kim
746aaaeafa
remove tiktoken pin ( #1759 )
2023-11-06 03:05:21 -08:00
Philippe Hebert
b9f17e1f2d
docs: Disambiguation of the term "relative speed" in the README ( #1751 )
...
* docs: defines relative speed in README
* combined paragraphs
---------
Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
2023-11-06 02:43:07 -08:00
Mohamad Zamini
7dfcd56304
Use allow_pickle=False when loading the mel matrix in audio.py ( #1511 )
...
* Update audio.py
The `mel_filters` function is using a `np.load` function to load a pre-computed mel filterbank matrix. This function is not thread-safe, which means that if it is called from multiple threads at the same time, it may corrupt the data.
To fix this, you can use the `torch.load` function instead. This function is thread-safe, so it will not corrupt the data if it is called from multiple threads at the same time.
* Update audio.py
updated the docstring
* allow_pickle=False
* newline
---------
Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-11-06 02:28:51 -08:00
Marco Zucconelli
b7d277acd5
handling transcribe exceptions. ( #1682 )
...
* handling transcribe() exceptions.
* printing stacktrace
---------
Co-authored-by: invalid <invalid@email.com>
Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-11-06 02:06:19 -08:00
amosal
6ed314fe41
Add new option to generate subtitles by a specific number of words ( #1729 )
...
* ADD parser for new argument --max_words_count
* ADD max_words_count in words_options
ADD warning for max_line_width compatibility
* ADD logic for max_words_count
* rename to max_words_per_line
* make them kwargs
* allow specifying file path by --model
* black formatting
---------
Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
2023-11-06 01:49:33 -08:00
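The idea behind `max_words_per_line` in #1729 can be sketched as a simple chunking step. The `split_by_word_count` function is a hypothetical helper, not the subtitle writer from the repository, and it ignores word timestamps and the interaction with `max_line_width` that the commit mentions.

```python
from typing import Iterator, List


def split_by_word_count(
    words: List[str], max_words_per_line: int
) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `max_words_per_line` words."""
    if max_words_per_line < 1:
        raise ValueError("max_words_per_line must be at least 1")
    for start in range(0, len(words), max_words_per_line):
        yield words[start:start + max_words_per_line]
```

Each yielded group would correspond to one subtitle line; the last group may be shorter than the limit.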
Jordi Mas
b38a1f20f4
Fix exception when an audio file with no speech is provided ( #1396 )
...
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-10-10 10:01:01 -07:00
Jong Wook Kim
0a60fcaa9b
Release 20230918
v20230918
2023-09-18 17:13:19 -07:00
Jong Wook Kim
5f957da5ca
Update test.yml
2023-09-18 16:38:17 -07:00
Arthur Kim
8b330df096
Add .pre-commit-config.yaml ( #1528 )
...
* Add .pre-commit-config.yaml
Co-authored-by: arthur <arthur@rtzr.ai>
* flake8 E741
---------
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-09-18 16:15:33 -07:00
sqhao
21010ef454
fix doc of TextDecoder ( #1526 )
...
Signed-off-by: haoshengqiang <haoshengqiang@xiaohongshu.com>
Co-authored-by: haoshengqiang <haoshengqiang@xiaohongshu.com>
2023-09-18 16:09:59 -07:00
Nino Risteski
29b7df6231
Update model-card.md ( #1643 )
...
fixed a few typos
2023-09-18 15:59:49 -07:00
taylorchu
e8622f9afc
word timing tweaks ( #1559 )
...
* word timing tweaks
* comment on eot
* clearer comments
2023-08-08 06:48:56 +09:00
WangChou Lu
b91c907694
Avoid rearranging all caches ( #1483 )
...
* avoid rearranging all kv_caches
* avoid calculating the same kv_cache from cross attn
* Update decoding.py
* linter fix
---------
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
2023-07-06 12:48:08 -07:00