mirror of
https://github.com/openai/whisper.git
synced 2025-11-26 15:35:57 +00:00
Merge pull request #2 from KomunikacjaTechnicznaVistula/main
Improve structure
commit
6d2c1f1ab3
43
README.md
@ -8,11 +8,11 @@
|
|||||||
|
|
||||||
## Contents <!-- omit in toc -->
|
## Contents <!-- omit in toc -->
|
||||||
|
|
||||||
- [Introduction](#introduction)
|
- [What is Whisper](#what-is-whisper)
|
||||||
- [Approach](#approach)
|
- [Setup](#setup)
|
||||||
- [Prerequisites](#prerequisites)
|
- [Prerequisites](#prerequisites)
|
||||||
- [Installation](#installation)
|
- [Installation](#installation)
|
||||||
- [Installation troubleshooting](#installation-troubleshooting)
|
- [Installation troubleshooting](#installation-troubleshooting)
|
||||||
- [Available models and languages](#available-models-and-languages)
|
- [Available models and languages](#available-models-and-languages)
|
||||||
- [Performance](#performance)
|
- [Performance](#performance)
|
||||||
- [Command-line usage](#command-line-usage)
|
- [Command-line usage](#command-line-usage)
|
||||||
@ -20,42 +20,50 @@
|
|||||||
- [More examples](#more-examples)
|
- [More examples](#more-examples)
|
||||||
- [License](#license)
|
- [License](#license)
|
||||||
|
|
||||||
|
## What is Whisper
|
||||||
## Introduction
|
|
||||||
|
|
||||||
Whisper is a multilingual speech recognition model for general purposes, including speech translation and language identification. Whisper is trained on a large dataset of diverse audio.
|
Whisper is a multilingual speech recognition model for general purposes, including speech translation and language identification. Whisper is trained on a large dataset of diverse audio.
|
||||||
|
|
||||||
## Approach
|
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
A Transformer sequence-to-sequence model is trained on various speech processing tasks. The tasks include multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder. As a result, a single model replaces many steps in traditional speech processing. The multitask training format uses a set of special tokens that serve as task specifiers or classification targets.
|
A Transformer sequence-to-sequence model is trained on various speech processing tasks. The tasks include multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder. As a result, a single model replaces many steps in traditional speech processing. The multitask training format uses a set of special tokens that serve as task specifiers or classification targets.
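The multitask format described above can be illustrated with a minimal sketch. The helper `build_task_prefix` is hypothetical and is not part of the Whisper API. The token spellings follow the special tokens shown in the Whisper approach diagram, but the real tokenizer works with integer IDs, so treat the strings as illustrative only.

```python
from typing import List


def build_task_prefix(language: str, task: str, timestamps: bool = False) -> List[str]:
    """Return an illustrative list of special tokens that specify a task.

    Hypothetical helper for explanation only; not part of the Whisper package.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")

    # The sequence opens with a start marker, then a language token,
    # then a task token that tells the decoder what to produce.
    tokens = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]

    # A final token signals that no timestamp tokens are predicted.
    if not timestamps:
        tokens.append("<|notimestamps|>")
    return tokens


if __name__ == "__main__":
    print(build_task_prefix("en", "transcribe"))
```

Because the task is encoded in the token prefix, switching from transcription to translation only changes one token and does not require a different model.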
|
||||||
|
|
||||||
We used Python 3.9.9 and [PyTorch](https://pytorch.org/) 1.10.1 to train and test our models. The codebase should be compatible with Python 3.8-3.11 and recent PyTorch versions. The codebase also depends on a few Python packages, most notably [OpenAI's tiktoken](https://github.com/openai/tiktoken) for their fast tokenizer implementation.
|
We used Python 3.9.9 and [PyTorch](https://pytorch.org/) 1.10.1 to train and test our models. The codebase should be compatible with Python 3.8-3.11 and recent PyTorch versions. The codebase also depends on a few Python packages, most notably [OpenAI's tiktoken](https://github.com/openai/tiktoken) for their fast tokenizer implementation.
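As a quick check before installing, the following sketch tests whether an interpreter version falls in the 3.8-3.11 range mentioned above. The function name `python_in_expected_range` and the two constants are assumptions made for this example; the Whisper package does not provide them.

```python
import sys
from typing import Tuple

# Range taken from the compatibility statement above (assumed inclusive).
EXPECTED_MIN: Tuple[int, int] = (3, 8)
EXPECTED_MAX: Tuple[int, int] = (3, 11)


def python_in_expected_range(version: Tuple[int, int] = sys.version_info[:2]) -> bool:
    """Return True if (major, minor) lies within the expected range."""
    return EXPECTED_MIN <= tuple(version) <= EXPECTED_MAX


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:2])
    if python_in_expected_range():
        print(f"Python {current} is within the expected range.")
    else:
        print(f"Python {current} is outside 3.8-3.11; Whisper may still work, but this is untested here.")
```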
|
||||||
|
|
||||||
## Prerequisites
|
## Setup
|
||||||
|
|
||||||
|
### Prerequisites
|
||||||
|
|
||||||
* Whisper requires the command-line tool [`ffmpeg`](https://ffmpeg.org/) to be installed on your system. The command-line tool is available from most package managers. To install [`ffmpeg`](https://ffmpeg.org/), use one of the following commands for your operating system:
|
* Whisper requires the command-line tool [`ffmpeg`](https://ffmpeg.org/) to be installed on your system. The command-line tool is available from most package managers. To install [`ffmpeg`](https://ffmpeg.org/), use one of the following commands for your operating system:
|
||||||
|
|
||||||
|
**on Ubuntu or Debian**
|
||||||
```bash
|
```bash
|
||||||
# on Ubuntu or Debian
|
|
||||||
sudo apt update && sudo apt install ffmpeg
|
sudo apt update && sudo apt install ffmpeg
|
||||||
|
```
|
||||||
|
|
||||||
# on Arch Linux
|
**on Arch Linux**
|
||||||
|
```bash
|
||||||
sudo pacman -S ffmpeg
|
sudo pacman -S ffmpeg
|
||||||
|
```
|
||||||
|
|
||||||
# on MacOS using Homebrew (https://brew.sh/)
|
**on MacOS using Homebrew (https://brew.sh/)**
|
||||||
|
```bash
|
||||||
brew install ffmpeg
|
brew install ffmpeg
|
||||||
|
```
|
||||||
|
|
||||||
# on Windows using Chocolatey (https://chocolatey.org/)
|
**on Windows using Chocolatey (https://chocolatey.org/)**
|
||||||
|
```bash
|
||||||
choco install ffmpeg
|
choco install ffmpeg
|
||||||
|
```
|
||||||
|
|
||||||
# on Windows using Scoop (https://scoop.sh/)
|
**on Windows using Scoop (https://scoop.sh/)**
|
||||||
|
```bash
|
||||||
scoop install ffmpeg
|
scoop install ffmpeg
|
||||||
```
|
```
|
||||||
|
|
||||||
* If [tiktoken](https://github.com/openai/tiktoken) does not provide a pre-built wheel for your platform, install [`rust`](http://rust-lang.org). Follow the [Getting started page](https://www.rust-lang.org/learn/get-started) to install the Rust development environment.
|
* If [tiktoken](https://github.com/openai/tiktoken) does not provide a pre-built wheel for your platform, install [`rust`](http://rust-lang.org). Follow the [Getting started page](https://www.rust-lang.org/learn/get-started) to install the Rust development environment.
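To confirm that the prerequisites above are visible on your `PATH`, a small sketch like the one below may help. The helper `find_tools` is hypothetical. It assumes that the Rust toolchain exposes the `cargo` executable, which is only relevant when tiktoken has no pre-built wheel for your platform.

```python
import shutil
from typing import Dict, Iterable, Optional


def find_tools(names: Iterable[str] = ("ffmpeg", "cargo")) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        status = path if path is not None else "not found on PATH"
        print(f"{name}: {status}")
```

A missing `cargo` is not necessarily a problem: it matters only if `pip` has to build tiktoken from source.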
|
||||||
|
|
||||||
## Installation
|
### Installation
|
||||||
|
|
||||||
* You can download and install (or update to) the latest release of Whisper with the following command:
|
* You can download and install (or update to) the latest release of Whisper with the following command:
|
||||||
|
|
||||||
@ -74,7 +82,8 @@ pip install git+https://github.com/openai/whisper.git
|
|||||||
```bash
|
```bash
|
||||||
pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
|
pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
|
||||||
```
|
```
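After running one of the commands above, you may want to confirm which version ended up in your environment. The sketch below uses the standard library only. The function `installed_version` is a hypothetical helper, and it assumes the distribution is registered under the name `openai-whisper`.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "openai-whisper") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("openai-whisper does not appear to be installed in this environment.")
    else:
        print(f"openai-whisper {version} is installed.")
```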
|
||||||
## Installation troubleshooting
|
|
||||||
|
### Installation troubleshooting
|
||||||
|
|
||||||
If you see installation errors during the installation of Whisper, follow these steps:
|
If you see installation errors during the installation of Whisper, follow these steps:
|
||||||
* Check if you have [`rust`](http://rust-lang.org) installed on your system. If not, follow the [Getting started page](https://www.rust-lang.org/learn/get-started) to install the Rust development environment.
|
* Check if you have [`rust`](http://rust-lang.org) installed on your system. If not, follow the [Getting started page](https://www.rust-lang.org/learn/get-started) to install the Rust development environment.
|
||||||
|
|||||||