mirror of https://github.com/openai/whisper.git (synced 2025-11-26 15:35:57 +00:00)
Merge pull request #2 from KomunikacjaTechnicznaVistula/main
Improve structure
commit 6d2c1f1ab3
45 README.md
@@ -8,11 +8,11 @@
## Contents <!-- omit in toc -->

- [Introduction](#introduction)
- [Approach](#approach)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Installation troubleshooting](#installation-troubleshooting)
- [What is Whisper](#what-is-whisper)
- [Setup](#setup)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Installation troubleshooting](#installation-troubleshooting)
- [Available models and languages](#available-models-and-languages)
- [Performance](#performance)
- [Command-line usage](#command-line-usage)
@@ -20,42 +20,50 @@
- [More examples](#more-examples)
- [License](#license)

## Introduction
## What is Whisper
Whisper is a general-purpose, multilingual speech recognition model that also performs speech translation and language identification. It is trained on a large dataset of diverse audio.

## Approach

A Transformer sequence-to-sequence model is trained on various speech processing tasks. The tasks include multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder. As a result, a single model replaces many steps in traditional speech processing. The multitask training format uses a set of special tokens that serve as task specifiers or classification targets.
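To make the multitask format more concrete, the following Python sketch builds the kind of special-token prefix that tells the decoder which task to perform. The token spellings (`<|startoftranscript|>`, `<|transcribe|>`, `<|translate|>`, `<|notimestamps|>`) follow the Whisper tokenizer as far as we know. The helper `build_task_prefix` is hypothetical and is not part of the `whisper` package.

```python
from typing import List

# Tasks assumed for this illustration; the helper is not a Whisper API.
TASKS = ("transcribe", "translate")


def build_task_prefix(language: str, task: str = "transcribe",
                      timestamps: bool = False) -> List[str]:
    """Return the special tokens that would precede the predicted text."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    # Start marker, then the language tag, then the task specifier.
    prefix = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        # Signals that no timestamp tokens will be predicted.
        prefix.append("<|notimestamps|>")
    return prefix


print(" ".join(build_task_prefix("en")))
print(" ".join(build_task_prefix("pl", task="translate", timestamps=True)))
```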

We used Python 3.9.9 and [PyTorch](https://pytorch.org/) 1.10.1 to train and test our models. The codebase should be compatible with Python 3.8-3.11 and recent PyTorch versions. The codebase also depends on a few Python packages, most notably [OpenAI's tiktoken](https://github.com/openai/tiktoken) for its fast tokenizer implementation.
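A quick way to compare your interpreter with the version range stated above is sketched below. The bounds mirror the 3.8-3.11 range from the text. The function name is illustrative, and versions outside that range may still work.

```python
import sys
from typing import Optional, Sequence

# Range taken from the compatibility statement above (inclusive bounds).
SUPPORTED_MIN = (3, 8)
SUPPORTED_MAX = (3, 11)


def python_version_supported(version: Optional[Sequence[int]] = None) -> bool:
    """Return True if (major, minor) falls inside the stated range."""
    if version is None:
        version = sys.version_info
    major_minor = (version[0], version[1])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if not python_version_supported():
    print("Warning: this Python version is outside the 3.8-3.11 range")
```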
## Prerequisites
## Setup

### Prerequisites
* Whisper requires the command-line tool [`ffmpeg`](https://ffmpeg.org/), which is available from most package managers. To install it, use the command for your operating system:
**on Ubuntu or Debian**
```bash
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg
```
# on Arch Linux
**on Arch Linux**
```bash
sudo pacman -S ffmpeg
```
# on MacOS using Homebrew (https://brew.sh/)
**on MacOS using Homebrew (https://brew.sh/)**
```bash
brew install ffmpeg
```
# on Windows using Chocolatey (https://chocolatey.org/)
**on Windows using Chocolatey (https://chocolatey.org/)**
```bash
choco install ffmpeg
```
# on Windows using Scoop (https://scoop.sh/)
**on Windows using Scoop (https://scoop.sh/)**
```bash
scoop install ffmpeg
```
* If [tiktoken](https://github.com/openai/tiktoken) does not provide a pre-built wheel for your platform, install [`rust`](http://rust-lang.org). Follow the [Getting started page](https://www.rust-lang.org/learn/get-started) to install the Rust development environment.
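Before installing Whisper, it can help to confirm that the tools named above are on your `PATH`. The sketch below uses only the Python standard library. The helper name `missing_tools` and the choice of `rustc` and `cargo` as the Rust executables to look for are assumptions made for illustration.

```python
import shutil
from typing import Callable, Iterable, List, Optional


def missing_tools(tools: Iterable[str],
                  which: Callable[[str], Optional[str]] = shutil.which) -> List[str]:
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


# ffmpeg is always required; the Rust tools matter only when tiktoken
# has no pre-built wheel for your platform.
for name in missing_tools(["ffmpeg", "rustc", "cargo"]):
    print(f"not found on PATH: {name}")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.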
## Installation
### Installation

* You can download and install (or update to) the latest release of Whisper with the following command:
@@ -74,7 +82,8 @@ pip install git+https://github.com/openai/whisper.git
```bash
pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git
```
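After the command finishes, you can confirm which version ended up installed. The sketch below assumes the distribution is registered under the name `openai-whisper`, which is the name we believe the project uses on PyPI. `installed_version` is a hypothetical helper, and `importlib.metadata` requires Python 3.8 or later.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "openai-whisper") -> Optional[str]:
    """Return the installed version of `dist_name`, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version:
    print(f"openai-whisper {version} is installed")
else:
    print("openai-whisper is not installed")
```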
## Installation troubleshooting
### Installation troubleshooting

If you see errors while installing Whisper, follow these steps:
* Check if you have [`rust`](http://rust-lang.org) installed on your system. If not, follow the [Getting started page](https://www.rust-lang.org/learn/get-started) to install the Rust development environment.
@@ -178,4 +187,4 @@ Use the [🙌 Show and tell](https://github.com/openai/whisper/discussions/categ
## License
Whisper's code and model weights are released under the Massachusetts Institute of Technology (MIT) License. See [LICENSE](https://github.com/openai/whisper/blob/main/LICENSE) for further details.