whisper

mirror of https://github.com/openai/whisper.git synced 2025-03-30 14:28:27 +00:00

Author	SHA1	Message	Date
Vicki Anand	9f70a352f9	Fix attention caching to make it actually work (#370)	2022-10-19 16:44:03 -07:00
Sumana Harihareswara	7f3e408e09	Add package metadata to setup.py (#315) Add project summary, license, etc. for display with "pip show" and similar Python package distribution tools.	2022-10-17 13:51:16 -07:00
Michael Monashev	f680570016	Fix bug (#305) Fix bug: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)	2022-10-17 11:38:20 -07:00
Jong Wook Kim	d18e9ea5dd	transcribe() on English-only model won't complain when language="en" is not given	2022-10-09 02:40:12 -07:00
David Marx	82725cea9c	infer download_root from XDG_CACHE_HOME if avail (#257)	2022-10-09 02:14:03 -07:00
eudoxos	35713c66e0	Add --threads option to transcribe (#278) * Add --threads option to transcribe Torch on CPU uses by default number_of_cores/2. This option allows to override this default. * Update transcribe.py Co-authored-by: Jong Wook Kim <ilikekjw@gmail.com>	2022-10-09 02:11:15 -07:00
Corentin Jemine	9e653bd0ea	Fixed CoW RuntimeError in DecodingTask.run() (#240)	2022-10-04 08:49:31 -07:00
Tom Stuart	02b74308ff	Fix timestamps and strip extraneous whitespace in WebVTT output (#219) * Use two-digit hours in WebVTT timestamps Per the WebVTT specification [0]: > A WebVTT timestamp consists of the following components, in the given > order: > > 1. Optionally (required if hours is non-zero): > 1. Two or more ASCII digits, representing the hours as a base ten > integer. > 2. A U+003A COLON character (:) YouTube won’t accept timestamps containing single-digit hours. [0] https://www.w3.org/TR/webvtt1/#webvtt-timestamp * Strip segment text in WebVTT output We already do this for plain text and SubRip output, so we should do it for WebVTT too.	2022-10-03 14:51:07 -07:00
Jibin Mathew	0b1ba3d46e	Add model_dir to arguments (#202) * Add model_dir to arguments * minor formatting change Co-authored-by: Jong Wook Kim <jongwook@openai.com>	2022-09-30 14:45:51 -07:00
Caleb McQuillin	60132ade70	Use , character instead of . for SRT output. (#197) The SRT format uses the decimal comma character as the fractional separator rather than the decimal point character. Adjust format_timestamp and write_srt to specify the separator character. See https://en.wikipedia.org/wiki/SubRip#:~:text=the%20fractional%20separator%20used%20is%20the%20comma%2C%20since%20the%20program%20was%20written%20in%20france.	2022-09-29 20:44:12 -07:00
Jong Wook Kim	7cb4cc21bf	allowing nonzero initial temperature	2022-09-29 18:05:12 -07:00
Jong Wook Kim	30dc5c581b	pointer to the show and tell section	2022-09-29 14:57:49 -07:00
Szabolcs Pasztor	5905e503b8	Update README.md (#161) * Update README.md * merging paragraphs Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>	2022-09-29 14:18:54 -07:00
Fabiano	0457aac342	Adds missing command for install (mac) (#90) * Adds missing command for install (mac) Required for users who didn't previously have Rust installed. * minor wording change Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>	2022-09-29 14:08:58 -07:00
sawadata	deafef05f3	Update audio.py (#178) add '-nostdin' argument	2022-09-29 12:34:04 -07:00
Vicki Anand	2b0c2971af	Don't update duration if last timestamp is same as begin (#191)	2022-09-29 12:27:48 -07:00
Jong Wook Kim	62fe7f1009	patience definition to match the paper	2022-09-27 19:00:41 -07:00
Nick Konovalchuk	b4308c4782	fix: transcribe verbosity (#140)	2022-09-26 11:46:21 -07:00
Michael Goin	9c8183a179	Use PyTorch as logits transpose for ONNX support (#141)	2022-09-26 10:54:26 -07:00
VulumeCode	2037b65f3f	Context prompt (#128) Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>	2022-09-26 05:22:33 -07:00
EliEron	fc0f40981d	Write each sentence as a separate line for the txt output (#101) * Write each sentence as a separate line for the txt output Write each sentence as a separate line for the txt output * Update utils.py Co-authored-by: EliEron <example@example.com> Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>	2022-09-26 04:52:28 -07:00
VulumeCode	520796a34c	fix token suppression (#123)	2022-09-26 04:35:21 -07:00
fatih	ead77fab97	add srt subtitle export utility (#102) * add srt subtitle export utility * simplifying Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>	2022-09-26 03:50:26 -07:00
Ashutosh Tripathi	5485428c81	arch linux ffmpeg install (#93)	2022-09-26 03:24:47 -07:00
fatih	9e7e418ff1	add progress bar for transcribe loop (#100) * add progress bar to transcribe loop * improved warning message for English-only models * add --condition_on_previous_text * progressbar renames Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>	2022-09-26 03:24:13 -07:00
Jong Wook Kim	5d8d3e75a4	add --condition_on_previous_text	2022-09-25 05:16:08 -07:00
Jong Wook Kim	2d3032de01	improved warning message for English-only models	2022-09-25 02:10:36 -07:00
Jong Wook Kim	8cf36f3508	allow hyphens and single quotes between words	2022-09-23 20:11:27 +09:00
Jong Wook Kim	15ab548263	nocaptions -> nospeech to match the paper figure	2022-09-23 15:45:32 +09:00
mj-kh	61989529b7	Fix possible mistake when loading model to device (#57) Before this change, the model is loaded into GPU regardless of the value of "device" argument in CLI. (e.g. whisper "test.wav" --device cpu loads into GPU anyway)	2022-09-23 15:21:47 +09:00
Niklas K	f296bcd3fa	Avoid keeping redundant copies of model weights in memory during load (#42) * don't keep copies of model weights in host memory * adding type annotation Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>	2022-09-23 12:57:39 +09:00
Sidney Radcliffe	a4fe05aa71	Add conda environment.yml (and fix requirements.txt) (#8) * fix: more-itertools name in requirements.txt * feature: minimal environment.yml for conda * Revert "feature: minimal environment.yml for conda" This reverts commit 8fd7438b368b0eb5df85f667fea911f293fa5e6d. Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>	2022-09-23 12:30:45 +09:00
Giovanni Lanzani	957ffc77de	Add rust as a dependency (#30) * Add rust as a dependency * Update README.md Co-authored-by: Jong Wook Kim <ilikekjw@gmail.com>	2022-09-23 12:26:38 +09:00
Ram Rachum	59f543e218	Fix exception cause in audio.py (#33)	2022-09-23 12:12:37 +09:00
hanacchi	c85eaaae29	Use UTF-8 encoding to save the txt and vtt files (#37) Explicitly set the text encoding to UTF-8 in order to avoid UnicodeEncodeErrors Co-authored-by: Jong Wook Kim <jongwook@nyu.edu>	2022-09-23 12:10:55 +09:00
EliEron	759e8d47a8	Fix output_dir argument when audio file is a path (#45)	2022-09-23 11:38:37 +09:00
Micheal Taylor	c0607e8d22	Add scoop install for windows (#48) Adding scoop install to setup for windows for ffmpeg	2022-09-23 11:37:57 +09:00
Jong Wook Kim	e90b8fa7e8	Merge pull request #14 from bquast/patch-1 make LICENSE a link instead of code-formatted text	2022-09-22 11:51:05 +09:00
Jong Wook Kim	f83cb83a42	Merge pull request #24 from ldanilov/patch-1 fixes the link to the model paper	2022-09-22 11:48:57 +09:00
Lev Danilov	45fc3d43c1	fixes the link to the model paper	2022-09-21 21:25:17 -04:00
Bastiaan Quast	08a739ad79	make LICENSE a link instead of code-formatted text	2022-09-21 23:17:02 +02:00
Jong Wook Kim	49a3ffc997	add section Available models and languages	2022-09-22 05:36:25 +09:00
Jong Wook Kim	cfd6bdda21	a note on speed-accuracy tradeoffs	2022-09-22 02:58:56 +09:00
Jong Wook Kim	834f00a0ea	making small model the default	2022-09-22 02:45:12 +09:00
Jong Wook Kim	6e3be77e1a	initial commit	2022-09-22 01:09:43 +09:00


95 Commits
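
The timestamp rules described in commits 02b74308ff (two-digit hours in WebVTT) and 60132ade70 (comma as the fractional separator in SRT) can be illustrated with a minimal sketch. This is not the repository's code at either commit: the function name format_timestamp comes from the commit message, but the parameter names always_include_hours and decimal_marker and the function body are assumptions made for illustration.

```python
def format_timestamp(
    seconds: float,
    always_include_hours: bool = False,  # assumed parameter name (illustrative)
    decimal_marker: str = ".",           # assumed parameter name (illustrative)
) -> str:
    """Format a time offset in seconds as [HH:]MM:SS<marker>mmm."""
    if seconds < 0:
        raise ValueError("timestamp must be non-negative")

    # Work in integer milliseconds to avoid float drift in the fields.
    milliseconds = round(seconds * 1000.0)
    hours, milliseconds = divmod(milliseconds, 3_600_000)
    minutes, milliseconds = divmod(milliseconds, 60_000)
    secs, milliseconds = divmod(milliseconds, 1_000)

    # WebVTT: hours are optional, but when present need at least two digits.
    hours_part = f"{hours:02d}:" if always_include_hours or hours > 0 else ""
    return f"{hours_part}{minutes:02d}:{secs:02d}{decimal_marker}{milliseconds:03d}"


# WebVTT style: "." separator, hours only when non-zero.
print(format_timestamp(3661.5))
# SRT style: "," separator, hours always present.
print(format_timestamp(61.25, always_include_hours=True, decimal_marker=","))
```

Under these assumptions, a WebVTT writer would call the function with its defaults, while an SRT writer would pass a comma as the separator and always include the hours field.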