@remotion/install-whisper-cpp

Available from v4.0.115

With Whisper.cpp, you can transcribe audio locally on your machine.
This package provides easy-to-use, cross-platform functions for installing Whisper.cpp and downloading a model.

npm i --save-exact @remotion/install-whisper-cpp@4.0.141
This assumes you are currently using v4.0.141 of Remotion.
Also update remotion and all `@remotion/*` packages to the same version.
Remove any `^` character in front of the version numbers, as it can lead to a version conflict.
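The pinning rule above can be checked mechanically. The following is a minimal sketch, not part of Remotion: `findVersionProblems` is a hypothetical helper that takes the `dependencies` object from a `package.json` and reports every `remotion` or `@remotion/*` entry that carries a range prefix or differs from the expected version. Treating `~` like `^` is an assumption of this sketch.

```typescript
// Hypothetical helper for illustration; not an API of any Remotion package.
type Dependencies = Record<string, string>;

type VersionProblem = {
  name: string;
  found: string;
  reason: "range-prefix" | "version-mismatch";
};

// Matches "remotion" itself and every scoped "@remotion/*" package.
const isRemotionPackage = (name: string): boolean =>
  name === "remotion" || name.startsWith("@remotion/");

const findVersionProblems = (
  dependencies: Dependencies,
  expectedVersion: string,
): VersionProblem[] => {
  const problems: VersionProblem[] = [];
  for (const [name, found] of Object.entries(dependencies)) {
    if (!isRemotionPackage(name)) {
      continue; // Other packages may use any version range.
    }
    if (/^[\^~]/.test(found)) {
      // A leading ^ (or ~, by assumption) lets the installer pick a different version.
      problems.push({ name, found, reason: "range-prefix" });
    } else if (found !== expectedVersion) {
      problems.push({ name, found, reason: "version-mismatch" });
    }
  }
  return problems;
};
```

Calling `findVersionProblems(pkg.dependencies, "4.0.141")` on a parsed `package.json` returns an empty array when all Remotion packages are pinned to the same version.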

Example usage

Install Whisper.cpp at commit 48a145 (a ref that we find works well and that supports token-level timestamps) and the medium.en model into the whisper.cpp folder.

install-whisper.cpp
tsx
import path from "path";
import {
  downloadWhisperModel,
  installWhisperCpp,
  transcribe,
  convertToCaptions,
} from "@remotion/install-whisper-cpp";

const to = path.join(process.cwd(), "whisper.cpp");

await installWhisperCpp({
  to,
  version: "48a145", // Commit after 1.5.4 that supports token-level timestamps
});

await downloadWhisperModel({
  model: "medium.en",
  folder: to,
});

const { transcription } = await transcribe({
  model: "medium.en",
  whisperPath: to,
  inputPath: "/path/to/audio.wav",
  tokenLevelTimestamps: true,
});

for (const token of transcription) {
  console.log(token.timestamps.from, token.timestamps.to, token.text);
}

// Optional: Apply our recommended postprocessing
const { captions } = convertToCaptions({
  transcription,
  combineTokensWithinMilliseconds: 200,
});

for (const line of captions) {
  console.log(line.text, line.startInSeconds);
}
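
The captions logged above expose `text` and `startInSeconds`, which is enough to write a simple subtitle file. The sketch below is not part of this package: `captionsToSrt` and `toSrtTimestamp` are hypothetical helpers, and `CaptionLike` only mirrors the two fields used in the example. Since no end time is read here, the sketch assumes each caption ends where the next one starts and gives the last caption a fixed duration.

```typescript
// Hypothetical helpers for illustration; not an API of @remotion/install-whisper-cpp.
// CaptionLike mirrors only the two caption fields used in the example above.
type CaptionLike = { text: string; startInSeconds: number };

// Formats seconds as an SRT timestamp: HH:MM:SS,mmm
const toSrtTimestamp = (seconds: number): string => {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  const pad = (n: number, width: number): string =>
    String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
};

const captionsToSrt = (
  captions: CaptionLike[],
  lastCaptionDurationInSeconds = 2, // Assumption: fixed length for the final cue
): string =>
  captions
    .map((caption, i) => {
      const next = captions[i + 1];
      // Assumption: a cue ends when the next one begins.
      const end = next
        ? next.startInSeconds
        : caption.startInSeconds + lastCaptionDurationInSeconds;
      const range = `${toSrtTimestamp(caption.startInSeconds)} --> ${toSrtTimestamp(end)}`;
      return `${i + 1}\n${range}\n${caption.text.trim()}`;
    })
    .join("\n\n");
```

Passing the `captions` array from the example to `captionsToSrt(captions)` yields a string that can be written to disk with `fs.writeFileSync`.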

Functions

License

MIT