@remotion/install-whisper-cpp

Available from v4.0.115

With Whisper.cpp, you can transcribe audio locally on your machine.
This package provides easy-to-use, cross-platform functions to install Whisper.cpp and a model.

npm i --save-exact @remotion/install-whisper-cpp@4.0.141

pnpm i @remotion/install-whisper-cpp@4.0.141

bun i @remotion/install-whisper-cpp@4.0.141

yarn --exact add @remotion/install-whisper-cpp@4.0.141

This assumes you are currently using v4.0.141 of Remotion.
Also update `remotion` and all `@remotion/*` packages to the same version.
Remove any `^` characters in front of the version numbers, as they can lead to version conflicts.
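
The version-pinning advice above can be checked mechanically. The following is a minimal sketch, not part of Remotion: `findVersionIssues` and `isRemotionPackage` are hypothetical helpers, and the dependency map is made-up sample data standing in for the `dependencies` field of a `package.json`. It reports every `remotion` or `@remotion/*` entry whose version string is not exactly the expected one, which also catches a leading `^`.

```typescript
// Hypothetical helper (not a Remotion API): flags Remotion packages whose
// version string differs from the exact version you expect.
type Dependencies = Record<string, string>;

const isRemotionPackage = (name: string): boolean =>
  name === "remotion" || name.startsWith("@remotion/");

const findVersionIssues = (
  dependencies: Dependencies,
  expectedVersion: string,
): string[] => {
  const found: string[] = [];
  for (const [name, version] of Object.entries(dependencies)) {
    if (!isRemotionPackage(name)) {
      continue; // Unrelated packages may use any version range.
    }
    // An exact comparison rejects both "^4.0.141" and a mismatched "4.0.140".
    if (version !== expectedVersion) {
      found.push(`${name}: found "${version}", expected "${expectedVersion}"`);
    }
  }
  return found;
};

// Made-up sample data for illustration only.
const exampleDependencies: Dependencies = {
  remotion: "^4.0.141",
  "@remotion/cli": "4.0.141",
  "@remotion/install-whisper-cpp": "4.0.140",
  react: "^18.2.0",
};

for (const issue of findVersionIssues(exampleDependencies, "4.0.141")) {
  console.log(issue);
}
```

In a real project you would pass in the parsed `dependencies` (and possibly `devDependencies`) of your own `package.json` instead of the sample object.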

Example usage

Install Whisper.cpp at commit 48a145 (a ref that we find works well and supports token-level timestamps) and the medium.en model into the whisper.cpp folder, then transcribe an audio file.

install-whisper.cpp
tsx
import path from "path";
import {
  downloadWhisperModel,
  installWhisperCpp,
  transcribe,
  convertToCaptions,
} from "@remotion/install-whisper-cpp";
 
const to = path.join(process.cwd(), "whisper.cpp");
 
await installWhisperCpp({
  to,
  version: "48a145", // Commit after 1.5.4 that supports token-level timestamps
});
 
await downloadWhisperModel({
  model: "medium.en",
  folder: to,
});
 
const { transcription } = await transcribe({
  model: "medium.en",
  whisperPath: to,
  inputPath: "/path/to/audio.wav",
  tokenLevelTimestamps: true,
});
 
for (const token of transcription) {
  console.log(token.timestamps.from, token.timestamps.to, token.text);
}
 
// Optional: Apply our recommended postprocessing
const { captions } = convertToCaptions({
  transcription,
  combineTokensWithinMilliseconds: 200,
});
 
for (const line of captions) {
  console.log(line.text, line.startInSeconds);
}
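
As a follow-up to the example above, here is a minimal sketch of turning caption entries into readable, timestamped lines. It relies only on the two fields the example logs (`text` and `startInSeconds`). The `CaptionLike` type, `formatTimestamp` and `formatCaptionLines` are hypothetical names introduced for illustration and are not part of the package, and the sample captions are made up.

```typescript
// Assumed minimal shape: only the two fields used in the example above.
type CaptionLike = {
  text: string;
  startInSeconds: number;
};

// Hypothetical helper: formats seconds as MM:SS.mmm.
const formatTimestamp = (seconds: number): string => {
  const totalMs = Math.round(seconds * 1000);
  const minutes = Math.floor(totalMs / 60000);
  const secs = Math.floor((totalMs % 60000) / 1000);
  const ms = totalMs % 1000;
  const pad = (value: number, width: number) =>
    String(value).padStart(width, "0");
  return `${pad(minutes, 2)}:${pad(secs, 2)}.${pad(ms, 3)}`;
};

// Hypothetical helper: one "[timestamp] text" line per caption.
const formatCaptionLines = (captions: CaptionLike[]): string[] =>
  captions.map(
    // Trim surrounding whitespace in case the caption text carries any.
    (caption) =>
      `[${formatTimestamp(caption.startInSeconds)}] ${caption.text.trim()}`,
  );

// Made-up sample data for illustration only.
const sampleCaptions: CaptionLike[] = [
  { text: "Hello world", startInSeconds: 0 },
  { text: " This is a test", startInSeconds: 1.5 },
];

for (const line of formatCaptionLines(sampleCaptions)) {
  console.log(line);
}
// [00:00.000] Hello world
// [00:01.500] This is a test
```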
