* Fix: Update torch.load to use weights_only=True to prevent security warning
* Update __init__.py
* Update __init__.py
---------
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
* Update triton kernel using _unsafe_update_src
* support old triton versions
* refactored changes to update triton kernel only once
* Update triton_ops.py
---------
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
Co-authored-by: Jong Wook Kim <ilikekjw@gmail.com>
* Bugfix: Illogical "Avoid computing higher temperatures on no_speech"
Bugfix for https://github.com/openai/whisper/pull/1279
The check currently treats a segment as "silence" even when decoding has failed due to `compression_ratio_threshold`, although further down the code that case is no longer treated as "silence".
"Silence" should apply only when decoding has failed due to `logprob_threshold`.
As described here:
8bc8860694/whisper/transcribe.py (L421)
And in the code here:
8bc8860694/whisper/transcribe.py (L243-L251)
* Fix if "logprob_threshold=None"
---------
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
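The intended behavior described in the bugfix above can be sketched as follows. This is a minimal, self-contained illustration and not the code in `whisper/transcribe.py`: the function name `needs_fallback`, its plain-float parameters, and the default threshold values are assumptions made for the example.

```python
from typing import Optional


def needs_fallback(
    compression_ratio: float,
    avg_logprob: float,
    no_speech_prob: float,
    compression_ratio_threshold: Optional[float] = 2.4,  # illustrative default
    logprob_threshold: Optional[float] = -1.0,  # illustrative default
    no_speech_threshold: Optional[float] = 0.6,  # illustrative default
) -> bool:
    """Return True if decoding should be retried at a higher temperature."""
    too_repetitive = (
        compression_ratio_threshold is not None
        and compression_ratio > compression_ratio_threshold
    )
    low_confidence = (
        logprob_threshold is not None and avg_logprob < logprob_threshold
    )
    likely_no_speech = (
        no_speech_threshold is not None and no_speech_prob > no_speech_threshold
    )

    # "Silence" applies only when the failure came from logprob_threshold:
    # a low-confidence result on a probable no-speech segment is not retried.
    # With logprob_threshold=None, low_confidence is always False, so the
    # silence shortcut can never suppress a compression-ratio failure.
    if low_confidence and likely_no_speech:
        return False
    return too_repetitive or low_confidence
```

With this shape, a result that fails only the compression-ratio check is still retried even when `no_speech_prob` is high, which is the case the original check handled illogically.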
* Add option to carry initial_prompt with the sliding window
Add an option `carry_initial_prompt = False` to `whisper.transcribe()`.
When set to `True`, `initial_prompt` is prepended to each internal `decode()` call's `prompt`.
If there is not enough context space at the start of the prompt, the prompt is left-sliced to make space.
* Prevent redundant initial_prompt_tokens
* Revert unnecessary .gitignore change
---------
Co-authored-by: Kittsil <kittsil@gmail.com>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
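The left-slicing described for `carry_initial_prompt` can be illustrated with a small sketch. It is not the implementation in `whisper.transcribe()`: the helper name `build_prompt`, the `max_prompt_len` parameter, and the use of plain integer lists for tokens are all assumptions made for the example.

```python
from typing import List


def build_prompt(
    initial_prompt_tokens: List[int],
    previous_tokens: List[int],
    max_prompt_len: int,
) -> List[int]:
    """Prepend the initial prompt and keep only the newest context that fits."""
    # Space left for rolling context once the initial prompt is reserved.
    remaining = max(max_prompt_len - len(initial_prompt_tokens), 0)
    # Left-slice: drop the oldest context tokens and keep the most recent ones.
    # previous_tokens[-0:] would return the whole list, hence the guard.
    context = previous_tokens[-remaining:] if remaining > 0 else []
    # This sketch does not cap an initial prompt longer than max_prompt_len.
    return initial_prompt_tokens + context
```

For example, `build_prompt([1, 2], [10, 11, 12, 13], 5)` keeps the two initial-prompt tokens and only the three most recent context tokens.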
* Relax triton requirements for compatibility with pytorch 2.4 and newer
Similar to https://github.com/openai/whisper/pull/1802, but PyTorch 2.4 now requires triton==3.0.0. I am not sure whether it makes sense to remove the upper-bound version constraints entirely.
* Update requirements.txt
Dear Developers,
I'm pleased to inform you that I have completed the documentation update for the utils.py file.
The updated documentation provides clear explanations of function parameters, return types, and expected behavior. Additionally, it adheres to consistent formatting and organization, ensuring ease of understanding for both current and future developers.
Please review the updated documentation at your earliest convenience. If you have any feedback or suggestions for further improvements, please don't hesitate to let me know.
Thank you for your attention to this matter.
Best regards,
Louis Brulé Naudet
* Update audio.py
The `mel_filters` function uses `np.load` to load a pre-computed mel filterbank matrix. `np.load` is not thread-safe, so calling it from multiple threads at the same time may corrupt the data.
To fix this, use `torch.load` instead. It is thread-safe, so it will not corrupt the data when called from multiple threads at the same time.
* Update audio.py
updated the docstring
* allow_pickle=False
* newline
---------
Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
Co-authored-by: Jong Wook Kim <jongwook@openai.com>
* ADD parser for new argument --max_words_count
* ADD max_words_count in words_options
ADD warning for max_line_width compatibility
* ADD logic for max_words_count
* rename to max_words_per_line
* make them kwargs
* allow specifying file path by --model
* black formatting
---------
Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>
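As a rough illustration of what a `max_words_per_line` limit does, the sketch below groups words into lines of at most that many words. It is not the subtitle-writer logic from this change: the helper name `split_into_lines` and the plain list-of-strings input are assumptions made for the example, and the interaction with `max_line_width` is not modeled.

```python
from typing import List


def split_into_lines(words: List[str], max_words_per_line: int) -> List[str]:
    """Join words into lines containing at most max_words_per_line words."""
    if max_words_per_line < 1:
        raise ValueError("max_words_per_line must be at least 1")
    # Step through the word list in fixed-size chunks; the last chunk may be shorter.
    return [
        " ".join(words[i : i + max_words_per_line])
        for i in range(0, len(words), max_words_per_line)
    ]
```

Calling `split_into_lines("the quick brown fox jumps".split(), 2)` yields three lines, the last holding the single leftover word.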