Make whisper an optional extra with faster-whisper by default#1877

Open
Dreamsorcerer wants to merge 12 commits intodevfrom
sam/move-whisper

@Dreamsorcerer
Collaborator

@Dreamsorcerer Dreamsorcerer commented Apr 17, 2026

Problem

Whisper requires downloading a 150MB model and depends on torch (with GBs of CUDA downloads).

Solution

Provide faster-whisper by default (2MB) and use it as the fallback when whisper is not available.
This avoids the 150MB download, and means we are one step closer to not depending on torch for a base install.

Breaking Changes

Users now need to request the dimos[whisper] extra for the full whisper feature.

Test

python -c "
from dimos.stream.audio.pipelines import stt
from dimos.stream.audio.utils import keepalive
node = stt()
node.emit_text().subscribe(on_next=lambda t: print(f'USER: {t}'))
keepalive()
"

@Dreamsorcerer Dreamsorcerer marked this pull request as ready for review April 17, 2026 15:54
@Dreamsorcerer
Collaborator Author

STT seems to work pretty well with faster-whisper anyway.

@greptile-apps
Contributor

greptile-apps Bot commented Apr 17, 2026

Greptile Summary

This PR makes openai-whisper an optional extra (dimos[whisper]) and adds faster-whisper as the default audio transcription backend in dimos[agents], significantly reducing the default install footprint. The WhisperNode class now auto-detects which backend is available at import time, preferring openai-whisper if present and falling back to faster-whisper otherwise.

  • The UserWarning on lines 36–40 fires for every default install (faster-whisper is in agents), misleading users into thinking their setup is degraded when it is the intended configuration.
  • faster-whisper in pyproject.toml has no lower version bound, but device="auto" requires >=1.0.0, which can cause a TypeError at runtime on older installs.
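
One way to address the first finding is to select the backend silently and reserve a hard failure for the case where neither package is installed. The following is a minimal sketch, not the PR's actual code: `select_backend` and its `find_spec` parameter are hypothetical names introduced here so the logic can be exercised without either package present.

```python
import importlib.util
import logging

logger = logging.getLogger(__name__)


def select_backend(find_spec=importlib.util.find_spec):
    """Return "whisper" or "faster_whisper", preferring openai-whisper.

    `find_spec` is injectable (an assumption of this sketch) so the
    selection can be tested without installing either backend.
    """
    if find_spec("whisper") is not None:
        return "whisper"
    if find_spec("faster_whisper") is not None:
        # Default install path: log quietly instead of emitting a UserWarning.
        logger.debug("openai-whisper not installed; using faster-whisper")
        return "faster_whisper"
    raise ImportError(
        "No transcription backend found; install faster-whisper or dimos[whisper]."
    )
```

Logging at debug level keeps the information available for troubleshooting without suggesting to default users that their setup is degraded.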

Confidence Score: 4/5

Safe to merge after fixing the misleading UserWarning that fires for all default users.

One P1 finding: the UserWarning implies a degraded fallback state for every default install, which will confuse users. The missing version constraint on faster-whisper (P2) could also cause a runtime TypeError with older versions. Both are straightforward one-line fixes.

dimos/stream/audio/stt/node_whisper.py (misleading warning), pyproject.toml (missing version constraint)

Important Files Changed

| Filename | Overview |
| --- | --- |
| dimos/stream/audio/stt/node_whisper.py | Adds faster-whisper fallback when openai-whisper is absent; misleading UserWarning fires for all default installs, and caller's modelopts dict is mutated via pop(). |
| pyproject.toml | Moves openai-whisper to a new optional [whisper] extra and adds faster-whisper to [agents]; no version constraint on faster-whisper despite using device="auto" (requires >=1.0.0). |
| uv.lock | Lockfile updated to reflect new faster-whisper dependency; no manual review needed. |
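
The mutation of the caller's modelopts dict noted for node_whisper.py can be avoided by translating options on a copy. This is a hedged sketch: the helper name `to_faster_whisper_opts` is hypothetical, and the mapping from `fp16` to a `compute_type` of `"float16"` or `"float32"` is an assumption about what the PR intends.

```python
def to_faster_whisper_opts(modelopts):
    """Translate openai-whisper style options without touching the input dict."""
    opts = dict(modelopts or {})  # shallow copy; the caller's dict stays intact
    fp16 = opts.pop("fp16", None)  # pop from the copy only
    if fp16 is not None:
        # Assumed mapping: fp16=True -> "float16", fp16=False -> "float32".
        # An explicit compute_type supplied by the caller takes precedence.
        opts.setdefault("compute_type", "float16" if fp16 else "float32")
    return opts
```

Because the pop happens on the copy, a caller can reuse the same options dict to construct several nodes and still get the same behavior each time.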

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[import WhisperNode] --> B{try: import whisper}
    B -- success --> C[_USE_FASTER_WHISPER = False\nopenai-whisper backend]
    B -- ImportError --> D{try: from faster_whisper\nimport WhisperModel}
    D -- success --> E[UserWarning fired\n_USE_FASTER_WHISPER = True\nfaster-whisper backend]
    D -- ImportError --> F[Raise ImportError\nNo backend found]

    C --> G[WhisperNode.__init__]
    E --> G
    G --> H{_USE_FASTER_WHISPER?}
    H -- True --> I[pop fp16 → compute_type\nWhisperModel device=auto]
    H -- False --> J[whisper.load_model]

    I --> K[transcribe → segments iterator\njoin seg.text]
    J --> L[transcribe → dict\nresult text]
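
The last two nodes of the flowchart differ in return shape: openai-whisper's `transcribe` returns a dict with a `"text"` key, while faster-whisper's returns a `(segments, info)` pair whose segments each carry a `.text` attribute. A small sketch of normalizing both into one string follows; `extract_text` is a hypothetical helper, not a function from the PR.

```python
from types import SimpleNamespace


def extract_text(result):
    """Return the transcript as a single string for either backend's result."""
    if isinstance(result, dict):
        # openai-whisper style: {"text": "...", ...}
        return result["text"].strip()
    # faster-whisper style: (iterable of segments, info)
    segments, _info = result
    return "".join(seg.text for seg in segments).strip()


# Stand-in objects for illustration only; no model is loaded here.
fake_segments = iter([SimpleNamespace(text=" hello"), SimpleNamespace(text=" world")])
print(extract_text((fake_segments, None)))
print(extract_text({"text": " hello world"}))
```

Note that the faster-whisper segments are produced lazily, so the join is where the transcription work actually happens.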

Reviews (1): Last reviewed commit: "Add warning"

Comment thread dimos/stream/audio/stt/node_whisper.py Outdated
Comment thread dimos/stream/audio/stt/node_whisper.py
Comment thread pyproject.toml Outdated
Comment thread pyproject.toml Outdated
Comment thread pyproject.toml Outdated
Comment thread dimos/stream/audio/stt/node_whisper.py Outdated
Comment thread dimos/stream/audio/stt/node_whisper.py Outdated

2 participants