Thanks for the great contribution @glifocat! This is a really well-structured skill — clean package, thorough docs, and solid test coverage. Hope to see more skills like this from you!
Intent: src/transcription.ts modifications
What changed
Replaced the OpenAI Whisper API backend with local whisper.cpp CLI execution. Audio is converted from ogg/opus to 16kHz mono WAV via ffmpeg, then transcribed locally using whisper-cpp. No API key or network required.
Key sections
Imports
- Removed: `readEnvFile` from `./env.js` (no API key needed)
- Added: `execFile` from `child_process`, `fs`, `os`, `path`, `promisify` from `util`
Configuration
- Removed: `TranscriptionConfig` interface and `DEFAULT_CONFIG` (no model/enabled/fallback config)
- Added: `WHISPER_BIN` constant (env `WHISPER_BIN` or `'whisper-cli'`)
- Added: `WHISPER_MODEL` constant (env `WHISPER_MODEL` or `data/models/ggml-base.bin`)
- Added: `FALLBACK_MESSAGE` constant
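The configuration change above might look roughly like the following. This is a minimal sketch, not the actual module: `resolveWhisperSettings` and `WhisperSettings` are hypothetical names introduced here so the defaults can be shown in one place, and the model path is kept as the relative string given above (the real code may resolve it against a base directory).

```typescript
// Fallback text returned when transcription is unavailable (string taken from the invariants).
const FALLBACK_MESSAGE = '[Voice Message - transcription unavailable]';

// Hypothetical shape, for illustration only.
interface WhisperSettings {
  bin: string;   // whisper.cpp CLI binary
  model: string; // path to the ggml model file
}

// Hypothetical helper: reads WHISPER_BIN / WHISPER_MODEL from an env-like map
// and falls back to the defaults when a variable is unset or empty.
function resolveWhisperSettings(
  env: Record<string, string | undefined>,
): WhisperSettings {
  return {
    bin: env.WHISPER_BIN || 'whisper-cli',
    model: env.WHISPER_MODEL || 'data/models/ggml-base.bin',
  };
}
```

In a Node process this would typically be called once at module load as `resolveWhisperSettings(process.env)`.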
transcribeWithWhisperCpp (replaces transcribeWithOpenAI)
- Writes audio buffer to temp .ogg file
- Converts to 16kHz mono WAV via ffmpeg
- Runs whisper-cpp CLI with `--no-timestamps -nt` flags
- Cleans up temp files in finally block
- Returns trimmed stdout or null on error
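The steps listed above can be sketched as follows. This is an illustration under stated assumptions rather than the code in the PR: the `RunCommand` type and the injectable `run` parameter are hypothetical, added so the flow can be exercised without ffmpeg or whisper.cpp installed, and the temp-directory layout and log message are guesses. The ffmpeg options (`-ar 16000`, `-ac 1`, `-f wav`) and the whisper flags (`-m`, `-f`, `--no-timestamps`, `-nt`) are the standard ones for those tools.

```typescript
import { execFile } from 'child_process';
import { promises as fs } from 'fs';
import * as os from 'os';
import * as path from 'path';
import { promisify } from 'util';

const execFileAsync = promisify(execFile);

// Hypothetical abstraction: runs a command and resolves with its stdout.
type RunCommand = (bin: string, args: string[]) => Promise<string>;

const runWithExecFile: RunCommand = async (bin, args) => {
  const { stdout } = await execFileAsync(bin, args);
  return stdout;
};

async function transcribeWithWhisperCpp(
  audio: Uint8Array,
  bin = 'whisper-cli',
  model = 'data/models/ggml-base.bin',
  run: RunCommand = runWithExecFile,
): Promise<string | null> {
  // Assumption: a private temp directory holds both intermediate files.
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'voice-'));
  const ogg = path.join(dir, 'input.ogg');
  const wav = path.join(dir, 'input.wav');
  try {
    await fs.writeFile(ogg, audio);
    // ogg/opus -> 16 kHz mono WAV, the input format whisper.cpp expects.
    await run('ffmpeg', ['-y', '-i', ogg, '-ar', '16000', '-ac', '1', '-f', 'wav', wav]);
    const stdout = await run(bin, ['-m', model, '-f', wav, '--no-timestamps', '-nt']);
    return stdout.trim();
  } catch (err) {
    console.error('whisper.cpp transcription failed:', err);
    return null;
  } finally {
    // Remove the temp files whether or not transcription succeeded.
    await fs.rm(dir, { recursive: true, force: true });
  }
}
```

Using `execFile` rather than `exec` passes the arguments as an array, so file paths are never interpreted by a shell.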
transcribeAudioMessage
- Same signature: `(msg: WAMessage, sock: WASocket) => Promise<string | null>`
- Same download logic via `downloadMediaMessage`
- Calls `transcribeWithWhisperCpp` instead of `transcribeWithOpenAI`
- Same fallback behavior on error/null
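The fallback behavior described above could be sketched like this. To keep the example self-contained, the download and transcription steps are passed in as functions; `transcribeOrFallback` and its parameters are hypothetical stand-ins, whereas the real `transcribeAudioMessage` takes `(msg, sock)` and calls `downloadMediaMessage` directly. The empty-buffer check and the log text are assumptions.

```typescript
const FALLBACK_MESSAGE = '[Voice Message - transcription unavailable]';

// Hypothetical wrapper showing only the control flow: download, transcribe,
// and fall back to a fixed message on any error or empty result.
async function transcribeOrFallback<M>(
  msg: M,
  download: (msg: M) => Promise<Uint8Array>,
  transcribe: (audio: Uint8Array) => Promise<string | null>,
): Promise<string | null> {
  try {
    const audio = await download(msg);
    if (audio.length === 0) return FALLBACK_MESSAGE; // nothing was downloaded
    const transcript = await transcribe(audio);
    return transcript ? transcript : FALLBACK_MESSAGE; // null or empty -> fallback
  } catch (err) {
    console.error('Transcription error:', err);
    return FALLBACK_MESSAGE;
  }
}
```

Because every failure path returns the same string, callers do not need to know which backend produced the transcript, which is what lets the backend swap leave the export signature unchanged.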
isVoiceMessage
- Unchanged: `msg.message?.audioMessage?.ptt === true`
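For reference, the predicate above behaves as in this small sketch. `MessageLike` is a hypothetical minimal type standing in for `WAMessage`, used only to keep the example free of external imports.

```typescript
// Hypothetical minimal stand-in for the WAMessage fields the check reads.
interface MessageLike {
  message?: { audioMessage?: { ptt?: boolean | null } | null } | null;
}

// True only for push-to-talk audio; regular audio attachments and
// non-audio messages are not treated as voice messages.
function isVoiceMessage(msg: MessageLike): boolean {
  return msg.message?.audioMessage?.ptt === true;
}
```

The strict `=== true` comparison means a missing or `null` `ptt` flag is treated as "not a voice message" rather than raising an error.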
Invariants (must-keep)
- `transcribeAudioMessage` export signature unchanged
- `isVoiceMessage` export unchanged
- Fallback message strings unchanged: `[Voice Message - transcription unavailable]`
- `downloadMediaMessage` call pattern unchanged
- Error logging pattern unchanged