feat(p1): faster-whisper engine + audio ingest + transcribe (CPU verified)

- engine/: FasterWhisperEngine wrapper + model_registry (turbo→CT2 repo)
- audio/ingest.py: ffprobe duration/size probe + 413 upper-limit hook
- cli transcribe: device-auto, model override, 413 guard, model_used output
- 3 unit tests (resolve_model, probe_media); README updated

Verified (CPU): JFK 11s clip → accurate transcription, detected_lang=en. 10 tests pass, ruff clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 15:07:41 +09:00
parent d75d60671e
commit 73380bebf9
8 changed files with 202 additions and 8 deletions
@@ -0,0 +1,23 @@
"""Lightweight unit tests for engine.model_registry / audio.ingest (no model load required)."""
from __future__ import annotations

import pytest

from luke_scribe.audio.ingest import probe_media
from luke_scribe.engine.model_registry import resolve_model


def test_resolve_model_turbo_maps_to_ct2_repo():
    expected = "deepdml/faster-whisper-large-v3-turbo-ct2"
    assert resolve_model("large-v3-turbo") == expected
    assert resolve_model("turbo") == expected


def test_resolve_model_standard_passthrough():
    assert resolve_model("tiny") == "tiny"
    assert resolve_model("large-v3") == "large-v3"


def test_probe_media_missing_raises():
    with pytest.raises(FileNotFoundError):
        probe_media("/no/such/file.wav")
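
For reviewers who want to see what these tests pin down, here is a minimal sketch of functions that would satisfy them. It is not the code from this commit: the `_ALIASES` table, the `MediaInfo` dataclass, and its field names are hypothetical, and only the turbo→CT2 mapping, the passthrough behaviour, and the `FileNotFoundError` come from the tests above. The ffprobe invocation is one plausible way to read the duration and is assumed rather than taken from `audio/ingest.py`.

```python
from __future__ import annotations

import json
import os
import subprocess
from dataclasses import dataclass

# Hypothetical alias table; only these two mappings are confirmed by the tests.
_ALIASES = {
    "turbo": "deepdml/faster-whisper-large-v3-turbo-ct2",
    "large-v3-turbo": "deepdml/faster-whisper-large-v3-turbo-ct2",
}


def resolve_model(name: str) -> str:
    """Map a known alias to its CT2 repo id; pass any other name through."""
    return _ALIASES.get(name, name)


@dataclass(frozen=True)
class MediaInfo:
    """Hypothetical return type for the probe (field names are assumptions)."""

    duration_s: float
    size_bytes: int


def probe_media(path: str) -> MediaInfo:
    """Return duration (via ffprobe) and on-disk size for a media file."""
    if not os.path.isfile(path):
        # Fail before spawning ffprobe so a missing file gives a clear error.
        raise FileNotFoundError(path)
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-show_entries", "format=duration",
            "-of", "json", path,
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    duration = float(json.loads(result.stdout)["format"]["duration"])
    return MediaInfo(duration_s=duration, size_bytes=os.path.getsize(path))
```

Checking for the file before calling ffprobe keeps the missing-file test independent of whether ffprobe is installed, which fits the "no model load required" intent of the test module.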