# Use SCALE with Docker
SCALE offers Docker images with a tag layout similar to that of the images offered by NVIDIA.
These serve two main purposes:

- Deploying the compiler in your CI to benefit from its improved diagnostics, among other things.
- Serving as a base image for applications that depend on the SCALE runtime library.
The images are available here:
- `spectralcompute/scale` on Docker Hub
- `spectral-compute/scale` on Quay.io
For example:
```shell
# Downloads the latest version of SCALE from Docker Hub that:
# - Imitates CUDA 13.0.2
# - Includes the full SCALE developer toolkit
# - Is based on Ubuntu 24.04
docker pull docker.io/spectralcompute/scale:13.0.2-devel-ubuntu24.04

# Downloads SCALE 1.5.1 from Quay.io that:
# - Imitates CUDA 12.1.0
# - Includes the SCALE runtime, but not the compiler or other development tools
# - Is based on Ubuntu 22.04
docker pull quay.io/spectral-compute/scale:12.1.0-runtime-ubuntu22.04-1.5.1
```
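The tags in the examples above follow a consistent layout. As a sketch (inferred from these two tags, not an exhaustive specification of the published tags), an image reference can be composed from its parts like this:

```shell
# Compose a SCALE image reference from its parts. The layout is inferred from
# the examples above; check the registry for the tags that actually exist.
registry="quay.io/spectral-compute" # or docker.io/spectralcompute
cuda_version="12.1.0"               # CUDA version the image imitates
flavor="runtime"                    # "runtime" or "devel"
distro="ubuntu22.04"                # base distribution
scale_version="1.5.1"               # a pinned SCALE release
image="${registry}/scale:${cuda_version}-${flavor}-${distro}-${scale_version}"
echo "${image}"
```

Note that `latest`-style tags such as `13.0.2-devel-ubuntu24.04` omit the trailing SCALE version.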
## Using SCALE with Docker
To use the container, you need to accept the SCALE License.
You can do this by setting the environment variable `SCALE_LICENSE_ACCEPT=1` in the container.
If you are using `docker run`, starting bash in the container looks like this:
```shell
docker run -it -e SCALE_LICENSE_ACCEPT=1 docker.io/spectralcompute/scale:latest
```
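If you manage containers with Docker Compose instead, the same variable can be set in the service definition. This is a sketch: the service name is illustrative, and the device paths match the `docker run` example later in this page.

```yaml
services:
  scale:
    image: docker.io/spectralcompute/scale:latest
    environment:
      - SCALE_LICENSE_ACCEPT=1
    devices:
      # Expose the GPU, as with `docker run --device`.
      - /dev/dri:/dev/dri
      - /dev/kfd:/dev/kfd
```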
## Example: whisper.cpp
Let's see how you can build and run whisper.cpp using the SCALE Docker image.
This example will use the `docker.io/spectralcompute/scale:latest` image for simplicity, and will build the most recent version of whisper.cpp at the time of writing: `21411d8`.
You can find the results of automated testing of SCALE against whisper.cpp on GitHub: spectral-compute/scale-validation.
```shell
# 1. Clone the whisper.cpp repository.
git clone https://github.com/ggml-org/whisper.cpp

# 2. Start the container and mount the whisper.cpp repository inside it.
#    The `--device` flags allow accessing the GPU from the container.
#    See `docker run --help` to learn more about the other flags.
docker run -it \
    --mount type=bind,src=$(pwd)/whisper.cpp,dst=/root/whisper.cpp \
    --env SCALE_LICENSE_ACCEPT=1 \
    --device /dev/dri \
    --device /dev/kfd \
    docker.io/spectralcompute/scale:latest

# 3. Inside the container, activate scaleenv.
#    Replace `gfx1201` with your GPU architecture.
source /opt/scale/bin/scaleenv gfx1201

# 4. Configure the whisper.cpp build tree.
cd /root/whisper.cpp
cmake \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_CUDA_ARCHITECTURES="86" \
    -DGGML_CCACHE=OFF \
    -DGGML_CUDA=ON \
    -DGGML_CUDA_NO_PEER_COPY=ON \
    -B"build" \
    .

# 5. Build whisper.cpp.
cmake \
    --build "build" \
    -j $(nproc)

# 6. Download the base model for whisper.cpp.
sh ./models/download-ggml-model.sh base.en

# 7. Transcribe an example audio file.
./build/bin/whisper-cli -m ./models/ggml-base.en.bin -f ./samples/jfk.wav
```
You should then see whisper.cpp logs followed by the transcription result:

```
And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country.
```
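Before building a larger project, you can sanity-check the toolchain from inside the container with a minimal CUDA program. This is a sketch: the kernel and file names are illustrative, and the compile step only does anything inside the container, where `nvcc` is on `PATH` after `scaleenv` has been sourced.

```shell
# Write a minimal CUDA program (illustrative).
cat > hello.cu <<'EOF'
#include <cstdio>

__global__ void hello() {
    printf("Hello from the GPU!\n");
}

int main() {
    hello<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
EOF

# Compile and run it. The guard skips the step when nvcc is not on PATH,
# i.e. outside the container or before sourcing scaleenv.
if command -v nvcc >/dev/null 2>&1; then
    nvcc hello.cu -o hello && ./hello
fi
```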