Commit Graph

  • 8a37460636 Add model to a new job_details key John Pariseau 2023-04-12 06:42:18 -04:00
  • a248046ec8 fix condition_on_previous_text Valentin Berkes 2023-04-11 17:40:50 +02:00
  • c09a7ae299 Update decoding.py (#1219) Jong Wook Kim 2023-04-11 18:13:13 -04:00
  • f17b158af5 Update decoding.py jongwook-patch-2 Jong Wook Kim 2023-04-11 15:08:19 -07:00
  • b0022b3283 Update decoding.py (#1155) Fernando O. Gallego 2023-04-12 00:06:03 +02:00
  • e2e65dbdcf Merge branch 'main' into main Jong Wook Kim 2023-04-11 17:32:39 -04:00
  • a5c80e6cb9 Suggested changes according to the linter Fernando O. Gallego 2023-04-11 08:36:42 +02:00
  • 76c901ab8d Update README.md to reference tiktoken (#1105) Arseniy Bushyn 2023-04-11 03:39:17 +03:00
  • 15d30c3aeb Merge branch 'main' into patch-1 Jong Wook Kim 2023-04-10 20:33:15 -04:00
  • 5b60c867a5 Merge 169d2428ba6a8606824698321e7e136ef98bec0a into 43940fc9780cb91c4f94899755b4648e19d7b977 SFARPak 2023-04-10 17:32:00 -07:00
  • 43940fc978 Implement max line width and max line count, and make word highlighting optional (#1184) ryanheise 2023-04-11 10:28:35 +10:00
  • f49254538a Merge branch 'main' into line-char-limits Jong Wook Kim 2023-04-10 20:25:30 -04:00
  • 255887f219 Squash long words at window and sentence boundaries. (#1114) ryanheise 2023-04-11 10:23:53 +10:00
  • 339ac46da2 Merge branch 'main' into line-char-limits Jong Wook Kim 2023-04-10 20:22:58 -04:00
  • 4a6a3ead86 Merge branch 'main' into truncate-long-words Jong Wook Kim 2023-04-10 20:21:30 -04:00
  • a151816b6b python-publish.yml: bump actions version to fix node warning (#1211) K.B.Dharun Krishna 2023-04-11 02:24:09 +05:30
  • 56402123d4 python-publish.yml: bump actions version to fix node warning K.B.Dharun Krishna 2023-04-08 19:44:13 +05:30
  • 169d2428ba Update README.md SFARPak 2023-04-05 01:19:23 +05:00
  • 34124a9fce Merge bda69505dbb55a1958cff0b5b150c44cf25f9353 into b5851c6c40e753606765ac45b85b298e3ae9e00d helumanxa 2023-04-02 13:22:12 +02:00
  • bda69505db Bug bounty test - please ignore.... (ahkakk) helumanxa 2023-04-02 11:21:46 +00:00
  • 663b2ee54e Refactor subtitle generator Ryan Heise 2023-04-02 21:03:18 +10:00
  • 62241c0f71 Add highlight_words, max_line_width, max_line_count Ryan Heise 2023-04-02 11:52:24 +10:00
  • 0fa2ab25e3 added cantonese to the language list Keith Hon 2023-04-01 13:18:23 +08:00
  • 185f5a05c0 Merge 0a7446cdd11897376e14e4fa528eb0983f00f245 into b5851c6c40e753606765ac45b85b298e3ae9e00d Keith Hon 2023-04-01 03:19:09 +00:00
  • 0a7446cdd1 yue also does not use spaces Keith Hon 2023-04-01 11:18:05 +08:00
  • be71d42b14 added cantonese to the language list Keith Hon 2023-04-01 11:16:10 +08:00
  • e2cfaff2f7 Update requirements.txt Johnny 2023-03-31 00:48:41 +02:00
  • 87fc1090b8 revert changes johnnynunez 2023-03-31 00:36:24 +02:00
  • 06dc24a5ad fix johnnynunez 2023-03-31 00:26:18 +02:00
  • 4dbb5476af fix johnnynunez 2023-03-31 00:24:10 +02:00
  • a124b581b8 fix johnnynunez 2023-03-31 00:22:27 +02:00
  • 3a4a95a0de python 3.11 johnnynunez 2023-03-31 00:20:57 +02:00
  • 6ec4d54716 python 3.11 johnnynunez 2023-03-31 00:06:42 +02:00
  • 2cd1fc862d Removed blank line and whitespaces in empty lines. FernanOrtega 2023-03-30 09:03:30 +02:00
  • 42a8d6f80b Merge branch 'openai:main' into main Fernando O. Gallego 2023-03-30 08:38:48 +02:00
  • b5851c6c40 Update tokenizer.py (#1163) Jong Wook Kim 2023-03-29 16:12:36 -04:00
  • 4eb8932c06 Update tokenizer.py pat-str-fix Jong Wook Kim 2023-03-29 13:10:10 -07:00
  • fc181e9fa9 Update decoding.py Fernando O. Gallego 2023-03-27 11:15:37 +02:00
  • cfce274f07 IndexError: arrays used as indices must be of integer type Your Name 2023-03-27 11:02:49 +02:00
  • ef14efdc54 feat: improve language detection petrosvav 2023-03-24 16:02:48 +02:00
  • b589d3b467 Fixed perf bug with color Sinan 2023-03-23 14:59:00 +01:00
  • 6750a98bdd Fixed token_prob length! :) SinanAkkoyun 2023-03-23 02:42:25 +01:00
  • 2ff7dbb41a committed SinanAkkoyun 2023-03-23 02:25:21 +01:00
  • 5e6714ef11 committed SinanAkkoyun 2023-03-23 02:20:01 +01:00
  • 3ea3ae15e5 Merge 85ba58c077ffe351addf54f758555251461bcc81 into 6dea21fd7f7253bfe450f1e2512a0fe47ee2d258 doublex 2023-03-22 10:06:06 +00:00
  • 85ba58c077 Better fix Your Name 2023-03-22 11:04:04 +01:00
  • 4df11b6587 IndexError: arrays used as indices must be of integer type Your Name 2023-03-21 19:17:22 +01:00
  • ca4cadd3d6 Merge 7311f8b30d3b89b01cc766d4520d9af988609cfb into 6dea21fd7f7253bfe450f1e2512a0fe47ee2d258 doublex 2023-03-21 18:15:16 +00:00
  • 7311f8b30d IndexError: arrays used as indices must be of integer type Your Name 2023-03-21 19:06:14 +01:00
  • d425c30226 Create maison8 bibouuu 2023-03-21 17:16:19 +01:00
  • 63e2c6be73 Fix squashing logic to point to correct words. Ryan Heise 2023-03-21 23:22:07 +11:00
  • 0f9ee4794b docs(readme): remove instructions for installing huggingface transformers::tokenizer Soumya Deb 2023-03-21 01:50:42 +05:30
  • b24d29355d committed SinanAkkoyun 2023-03-19 13:39:10 +01:00
  • db30d12efb committed SinanAkkoyun 2023-03-19 13:28:07 +01:00
  • ab3f38d52e Formatting requirements. Ryan Heise 2023-03-18 19:14:54 +11:00
  • 6771ef9fe8 Squash long words at window and sentence boundaries. Ryan Heise 2023-03-18 17:28:48 +11:00
  • 8eee85d4b3 Update README.md to reference tiktoken Arseniy Bushyn 2023-03-16 23:01:28 +03:00
  • 395db62ccd Update README.md Sinan 2023-03-16 12:46:17 +01:00
  • 6dea21fd7f Release 20230314 v20230314 Jong Wook Kim 2023-03-15 00:39:05 -07:00
  • 79c43e4859 abort find_alignment on empty input (#1090) Jong Wook Kim 2023-03-14 15:47:58 -04:00
  • 24ba319e70 abort find_alignment on empty input jongwook-patch-1 Jong Wook Kim 2023-03-14 12:44:54 -07:00
  • 5f9ac653b7 Fix truncated words list when the replacement character is decoded (#1089) Guillaume Klein 2023-03-14 17:32:41 +01:00
  • e564a27fc9 Fix truncated words list when the replacement character is decoded Guillaume Klein 2023-03-14 10:00:54 +01:00
  • ba88b8e1b3 fix github language stats getting dominated by jupyter notebook (#1076) Akash Mahajan 2023-03-14 00:07:09 -07:00
  • 854851a2fd Merge branch 'main' into fix-github-nb-stats Jong Wook Kim 2023-03-14 02:56:24 -04:00
  • 671ac5a4ce Fix alignment between the segments and the list of words (#1087) Guillaume Klein 2023-03-14 00:34:09 +01:00
  • be8e726c90 Ensure the word index does not overflow Guillaume Klein 2023-03-13 18:06:22 +01:00
  • 17f30c3ea7 Fix alignment between the segments and the list of words Guillaume Klein 2023-03-13 17:18:47 +01:00
  • 839639a223 Use tiktoken (#1044) Jong Wook Kim 2023-03-13 05:34:16 -04:00
  • a0bd014f13 bypassing load_tiktoken_bpe to avoid blobfile dep Jong Wook Kim 2023-03-13 02:30:42 -07:00
  • 72e5e6746e cleanup Jong Wook Kim 2023-03-13 02:20:08 -07:00
  • 6869cd8284 reflecting suggestions Jong Wook Kim 2023-03-13 02:11:26 -07:00
  • 2a14e808cc use tiktoken 0.3.1 Jong Wook Kim 2023-03-13 01:43:49 -07:00
  • 06e59be0ec Merge branch 'main' into use-tiktoken Jong Wook Kim 2023-03-13 04:43:16 -04:00
  • 117ed3edc1 Update whisper/tokenizer.py Jong Wook Kim 2023-03-13 01:18:59 -07:00
  • a05529da57 fix github language stats getting dominated by jupyter notebook Akash Mahajan 2023-02-14 12:35:41 -08:00
  • ad3250a846 Release 20230308 v20230308 Jong Wook Kim 2023-03-08 15:48:57 -08:00
  • c4b50c0824 kwargs in decode() for convenience (#1061) Jong Wook Kim 2023-03-08 18:46:38 -05:00
  • 90c23cd46c formatting fix Jong Wook Kim 2023-03-08 15:37:24 -08:00
  • 8711e5f6c3 kwargs in decode() for convenience Jong Wook Kim 2023-03-08 15:36:47 -08:00
  • 38f2f4d99d fix all_tokens handling that caused more repetitions and discrepancy in JSON (#1060) Jong Wook Kim 2023-03-08 18:34:07 -05:00
  • 17b3e50d73 fix all_tokens handling that caused more repetitions and discrepancy in JSON Jong Wook Kim 2023-03-08 15:30:16 -08:00
  • aac47c9834 fix typo Jong Wook Kim 2023-03-07 20:43:49 -08:00
  • 26807ec6d3 Release 20230307 v20230307 Jong Wook Kim 2023-03-07 20:36:29 -08:00
  • 919a713499 attempt to fix the repetition/hallucination issue identified in #1046 (#1052) Jong Wook Kim 2023-03-07 23:08:45 -05:00
  • ea5ef5051b delete debug print Jong Wook Kim 2023-03-07 18:21:33 -08:00
  • 1c6a3b47ea Merge branch 'main' into fix-decoding-repetition-degradation Jong Wook Kim 2023-03-07 21:08:56 -05:00
  • 41410b761f formatting fix Jong Wook Kim 2023-03-07 17:53:37 -08:00
  • 477f0befc7 zero-pad the audio instead of spectrogram Jong Wook Kim 2023-03-07 17:50:43 -08:00
  • 38e990d853 Use triton==2.0.0 (#1053) Jong Wook Kim 2023-03-07 19:56:31 -05:00
  • 789e49aa39 Use triton==2.0.0 Jong Wook Kim 2023-03-07 16:45:35 -08:00
  • f9cfde996b attempt to fix the repetition/hallucination issue identified in #1046 Jong Wook Kim 2023-03-07 13:48:09 -08:00
  • 924e1f8e06 Try installing triton only if linux & x86_64 (#1051) Jong Wook Kim 2023-03-07 14:31:40 -05:00
  • 178fbe682a Try installing triton only if linux & x86_64 Jong Wook Kim 2023-03-07 11:28:25 -08:00
  • 4b0d5e58d0 Update setup.py Jong Wook Kim 2023-03-07 04:47:46 -08:00
  • 67e8805f24 tuple should be safer Jong Wook Kim 2023-03-07 03:51:04 -08:00
  • 39237a3531 formatting Jong Wook Kim 2023-03-07 02:36:13 -08:00
  • 5e35893afc use tiktoken==0.3.0 Jong Wook Kim 2023-03-07 02:24:38 -08:00
  • 8180fde939 Release 20230306 v20230306 Jong Wook Kim 2023-03-06 18:50:41 -08:00
  • c6e4e5efb3 remove auxiliary audio extension (#1021) Local State 2023-03-06 20:48:14 -05:00