8f6f8969fd
- api/: FastAPI app, X-API-Key authentication (falls back to a temporary key when none is configured), load-once engine pool (+ transcribe lock), POST /v1/transcribe (multipart, synchronous), /health, /v1/system, /v1/models. Uploaded temp files are deleted in a finally block (privacy).
- postprocess/: llm.correct (promoted from scripts/llm_correct.py; opt-in, allowlist, audit log, retries) + rules.normalize (normalization of terms such as EmbeddingGemma).
- results/formats.py: txt/srt/vtt. connectivity/tunnel.py: cloudflared quick tunnel (Colab).
- cli serve: single-worker uvicorn + --tunnel cloudflare; config llm_* fields; pyproject api/queue extras split (+python-multipart, dev httpx).
Verification: 22 unit tests (API TestClient, formats, postprocess) + live-server e2e (/health, auth 401, real transcription (JFK), SRT, temp-file deletion). Korean quality requires turbo/large-v3 (tiny produces degenerate output for Korean).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
26 lines
630 B
Python
"""results.formats — txt/srt/vtt."""
from __future__ import annotations

from luke_scribe.results import formats

SEGS = [
    {"start": 0.0, "end": 1.5, "text": "안녕 world"},
    {"start": 1.5, "end": 3.0, "text": "두번째"},
]


def test_txt():
    assert formats.to_txt(SEGS) == "안녕 world\n두번째"


def test_srt():
    out = formats.to_srt(SEGS)
    assert "1\n00:00:00,000 --> 00:00:01,500\n안녕 world" in out
    assert "2\n00:00:01,500 --> 00:00:03,000\n두번째" in out


def test_vtt():
    out = formats.to_vtt(SEGS)
    assert out.startswith("WEBVTT")
    assert "00:00:00.000 --> 00:00:01.500" in out