Mirror of https://github.com/openai/whisper.git (synced 2025-11-24 22:45:52 +00:00)
Update Dockerfile.hpu and README.md files:

- rename `requirements_hpu.txt`
- make the `docker run` volume mapping optional
- document running the HPU tests
parent 3e826a2aff
commit 6770610528
Dockerfile.hpu

```diff
@@ -23,13 +23,12 @@ RUN mkdir -p /usr/local/bin/ffmpeg && \
     cp -a ffmpeg-*-static/ffprobe /usr/bin/ffprobe && \
     rm -rf /usr/local/bin/ffmpeg
 
 # Add Whisper repo contents
-ADD . /root/whisper
-WORKDIR /root/whisper
+COPY . /workspace/whisper
+WORKDIR /workspace/whisper
 
 # Copy HPU requirements
-COPY requirements_hpu.txt /root/whisper/requirements.txt
+COPY requirements_hpu.txt /workspace/requirements_hpu.txt
 
 # Install Python packages
 RUN pip install --upgrade pip \
-    && pip install -r requirements.txt
+    && pip install -r requirements_hpu.txt
```
README.md (31 changes)
```diff
@@ -93,6 +93,10 @@ Adding `--task translate` will translate the speech into English:
 
     whisper japanese.wav --language Japanese --task translate
 
+The following command will transcribe speech in audio files, using the Intel® Gaudi® HPU (`--device hpu` option):
+
+    whisper audio.flac audio.mp3 audio.wav --model turbo --device hpu
+
 Run the following to view all available options:
 
     whisper --help
```
````diff
@@ -148,23 +152,40 @@ print(result.text)
 docker build -t whisper_hpu:latest -f Dockerfile.hpu .
 ```
 
-In the `Dockerfile.hpu`, we use the `vault.habana.ai/gaudi-docker/1.17.0/ubuntu22.04/habanalabs/pytorch-installer-2.3.1:latest` base image; make sure to replace it with the appropriate version for your environment if needed.
+In the `Dockerfile.hpu`, we use the `vault.habana.ai/gaudi-docker/1.18.0/ubuntu22.04/habanalabs/pytorch-installer-2.3.1:latest` base image; make sure to replace it with the appropriate version for your environment if needed.
 See the [PyTorch Docker Images for the Intel® Gaudi® Accelerator](https://developer.habana.ai/catalog/pytorch-container/) for more information.
 
 ### Run the Container
 
 ```bash
-docker run -it --runtime=habana -v /path/to/your/whisper:/root/whisper whisper_hpu:latest /bin/bash
+docker run -it --runtime=habana whisper_hpu:latest
 ```
 
-Make sure to replace `/path/to/your/whisper` with the path to the Whisper repository on your local machine.
+Mounting a volume (`-v`) is optional, but it lets you access the Whisper repository from within the container.
+You can do this by adding `-v /path/to/your/whisper:/workspace/whisper` to the `docker run` command.
+If you use the mapping, make sure to replace `/path/to/your/whisper` with the path to the Whisper repository on your local machine.
 
 ### Command-line usage with Intel® Gaudi® hpu
 
-To run the `whisper` command with Intel® Gaudi® hpu, you can use the `--device hpu` option:
+To run the `transcribe` process with Intel® Gaudi® HPU, you can use the `--device hpu` option:
 
-    python3 -m whisper.transcribe audio.flac audio.mp3 audio.wav --model turbo --device hpu
+```bash
+python3 -m whisper.transcribe audio_file.wav --model turbo --device hpu
+```
+
+* Note: Change `audio_file.wav` to the path of the audio file you want to transcribe. (Example file: https://www.kaggle.com/datasets/pavanelisetty/sample-audio-files-for-speech-recognition?resource=download)
+
+To run the `transcribe` tests with Intel® Gaudi® HPU, make sure to install the `pytest` package:
+
+```bash
+pip install pytest
+```
+
+and run the following command:
+
+```bash
+PYTHONPATH=. pytest -s tests/test_transcribe.py::test_transcribe_hpu
+```
 
 ### Python usage with Intel® Gaudi® hpu
````
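The `--device hpu` flag in the commands above hands the device name to PyTorch. As a hedged illustration (not Whisper's own logic), one way a script could pick the device at runtime is to probe for the Habana PyTorch bridge, which is assumed here to ship as the `habana_frameworks` package:

```python
import importlib.util


def pick_device() -> str:
    """Return "hpu" when the Habana PyTorch bridge looks installed, else "cpu".

    Hypothetical helper for illustration only: probing for the
    `habana_frameworks` package is a heuristic, not part of Whisper.
    """
    if importlib.util.find_spec("habana_frameworks") is not None:
        return "hpu"
    return "cpu"


print(pick_device())
```

On a machine without the Gaudi software stack this falls back to `cpu`.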
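The README addition above points at `tests/test_transcribe.py::test_transcribe_hpu`, whose body is not shown in this diff. A sketch of the general shape such a test could take, using stdlib `unittest` for the skip guard; the availability probe via `habana_frameworks` and the test body are assumptions, not the repository's actual test:

```python
import importlib.util
import unittest


def hpu_available() -> bool:
    # Assumption: the Habana PyTorch bridge ships as `habana_frameworks`
    return importlib.util.find_spec("habana_frameworks") is not None


class TranscribeHPUTest(unittest.TestCase):
    @unittest.skipUnless(hpu_available(), "Intel Gaudi HPU stack not installed")
    def test_transcribe_hpu(self):
        import whisper  # deferred so the module imports without whisper present

        model = whisper.load_model("turbo", device="hpu")
        result = model.transcribe("audio_file.wav")
        self.assertTrue(result["text"].strip())
```

Without the Gaudi stack installed, the test is collected but skipped rather than failing.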
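The `### Python usage with Intel® Gaudi® hpu` section is cut off in this capture. Going by the CPU Python-usage section earlier in the README, its content plausibly mirrors the sketch below; the `device="hpu"` argument to `whisper.load_model` is an assumption based on the CLI flag above, and running it requires both `whisper` and the Habana stack to be installed:

```python
def transcribe_on_hpu(path: str) -> str:
    # Hypothetical sketch: mirrors the README's CPU example with device="hpu".
    # Requires `openai-whisper` plus the Habana PyTorch bridge at runtime.
    import whisper

    model = whisper.load_model("turbo", device="hpu")
    result = model.transcribe(path)
    return result["text"]
```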