feat(api): sync test API (serve) + opt-in LLM correction + cloudflared tunnel

- api/: FastAPI app, X-API-Key auth (falls back to a temporary key when unset), load-once engine pool
  (+transcribe lock), POST /v1/transcribe (multipart, synchronous), /health, /v1/system,
  /v1/models. Uploaded temp files are deleted in a finally block (privacy).
- postprocess/: llm.correct (promoted from scripts/llm_correct.py; opt-in, allowlist, audit log, retries)
  + rules.normalize (normalization of terms such as EmbeddingGemma).
- results/formats.py: txt/srt/vtt. connectivity/tunnel.py: cloudflared quick tunnel (Colab).
- cli serve: single-worker uvicorn + --tunnel cloudflare; config llm_* fields;
  pyproject api/queue extras split (+python-multipart, dev httpx).
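
The key fallback and the finally-based temp-file cleanup described in the api/ bullet can be sketched with the standard library alone. This is a minimal illustration, not the code in this commit: resolve_api_key, is_authorized, and with_temp_upload are hypothetical names, and the real app presumably wires the equivalent logic into FastAPI.

```python
import os
import secrets
import tempfile
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def resolve_api_key(configured: Optional[str]) -> Tuple[str, bool]:
    """Return (key, is_temporary); use a random key when none is configured."""
    if configured:
        return configured, False
    # Assumption: the temporary key is random and lives only for this process.
    return secrets.token_urlsafe(32), True


def is_authorized(header_value: Optional[str], expected: str) -> bool:
    """Compare the X-API-Key header value to the expected key in constant time."""
    if header_value is None:
        return False
    return secrets.compare_digest(header_value.encode(), expected.encode())


def with_temp_upload(data: bytes, suffix: str, handler: Callable[[str], T]) -> T:
    """Write the upload to a temp file, run handler(path), and always delete it."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
        return handler(path)
    finally:
        # Runs on success and on error, so the audio does not outlive the request.
        if os.path.exists(path):
            os.remove(path)
```

A handler here would stand in for the transcription call; a failed check in is_authorized corresponds to the 401 response exercised in the e2e run.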

Verification: 22 unit tests (API TestClient, formats, postprocess) + live-server e2e
(/health, auth 401, real transcription (JFK), SRT, temp-file deletion). Korean quality requires turbo/large-v3 (tiny produces degenerate Korean output).
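
The SRT and VTT outputs mentioned above differ mainly in the millisecond separator and the WEBVTT header. A minimal sketch of that formatting follows, assuming segments arrive as (start_sec, end_sec, text) tuples; that shape and the function names are hypothetical, and results/formats.py may be organized differently.

```python
from typing import Iterable, Tuple

# Assumed segment shape: (start_sec, end_sec, text).
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float, sep: str) -> str:
    """Format seconds as HH:MM:SS<sep>mmm (sep is ',' for SRT, '.' for VTT)."""
    ms = max(0, round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{ms:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Render numbered SRT cues separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        span = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{index}\n{span}\n{text.strip()}\n")
    return "\n".join(blocks)


def to_vtt(segments: Iterable[Segment]) -> str:
    """Render WebVTT: a WEBVTT header, then unnumbered cues."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}")
        lines.append(text.strip())
        lines.append("")
    return "\n".join(lines)
```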

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 23:20:01 +09:00
parent 480a36edfe
commit 8f6f8969fd
22 changed files with 744 additions and 28 deletions
+5 -3
@@ -18,8 +18,10 @@ dependencies = [
 engine = ["faster-whisper>=1.0.3", "av>=11"]
 # GPU CUDA runtime (for faster-whisper GPU inference)
 gpu = ["nvidia-cublas-cu12", "nvidia-cudnn-cu12"]
-# P2 API + Queue
-api = ["fastapi>=0.110", "uvicorn[standard]>=0.29", "redis>=5.0", "rq>=1.16"]
+# Test API (sync): serve
+api = ["fastapi>=0.110", "uvicorn[standard]>=0.29", "python-multipart>=0.0.9"]
+# P2 async queue (on hold)
+queue = ["redis>=5.0", "rq>=1.16"]
 # P5 options
 diarize = ["pyannote.audio>=3.1"]
 llm = ["openai>=1.30"]
@@ -35,4 +37,4 @@ build-backend = "hatchling.build"
 packages = ["src/luke_scribe"]
 [dependency-groups]
-dev = ["pytest>=8.2", "ruff>=0.5"]
+dev = ["pytest>=8.2", "ruff>=0.5", "httpx>=0.27"]