feat(p1): scaffolding + Device Manager / VRAM probe + CLI detect

- pyproject (uv, src layout) + extras: engine/gpu/api/diarize/llm
- config.py (pydantic-settings, SCRIBE_ env)
- devices/: vram_probe (NVML/psutil/disk) + DeviceManager →
  capability tier T0–T3, precision by cc/VRAM, worker estimate (plan §3.6, AC-2/3)
- cli.py (typer): detect (implemented) + transcribe/bench/serve (stubs)
- run.sh, .env.example, README

Verified on GTX 1050/2GB: detect → T0_CPU (turbo doesn't fit → explicit
downgrade, fail-explicit). Overrides (--device/--workers) work. 7 unit tests
cover T0–T3 + overrides via synthetic VRAM. ruff clean.
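
For readers unfamiliar with the tiering idea, here is a minimal sketch of how a VRAM probe could map to a capability tier with an explicit CPU downgrade. It is not the project's DeviceManager: the `Probe` class, `pick_tier` function, the tier labels `T1` to `T3`, and every threshold are hypothetical and chosen only so that a 2 GB card lands on `T0_CPU`, as reported above.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds (MB) for illustration only; the real values are
# defined in the project's DeviceManager and are not stated in this commit.
TIER_MIN_VRAM_MB = [("T3", 16000), ("T2", 8000), ("T1", 4000)]


@dataclass(frozen=True)
class Probe:
    """Synthetic probe result, similar in spirit to what a VRAM probe returns."""
    vram_total_mb: Optional[int]          # None when no CUDA device is present
    compute_capability: Optional[float]   # e.g. 6.1 for a GTX 1050


def pick_tier(probe: Probe, forced_device: Optional[str] = None) -> str:
    """Return the highest tier whose VRAM floor is met, else T0_CPU."""
    # An explicit CPU override or a missing GPU always yields the CPU tier.
    if forced_device == "cpu" or probe.vram_total_mb is None:
        return "T0_CPU"
    for tier, min_mb in TIER_MIN_VRAM_MB:
        if probe.vram_total_mb >= min_mb:
            return tier
    # GPU present but too small: downgrade explicitly rather than fail later.
    return "T0_CPU"


if __name__ == "__main__":
    print(pick_tier(Probe(vram_total_mb=2048, compute_capability=6.1)))
```

Feeding synthetic `Probe` values like this is also how unit tests can cover every tier without the matching hardware.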

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 12:56:07 +09:00
parent 612b353105
commit 5d2604105b
13 changed files with 4389 additions and 0 deletions
@@ -0,0 +1,24 @@
# luke_scribe example settings. Copy with: cp .env.example .env (env prefix: SCRIBE_)
# Models (hybrid by default; may be unified to turbo only, depending on P1 bench results)
SCRIBE_MODEL_REALTIME=large-v3-turbo
SCRIBE_MODEL_BATCH=large-v3
# Device: auto|cpu|cuda|cuda:0 (auto-detected by default, can be forced)
SCRIBE_DEVICE=auto
# SCRIBE_COMPUTE_TYPE=int8 # leave unset to auto-select from cc/VRAM
# SCRIBE_WORKERS=1 # leave unset to auto-estimate
SCRIBE_LANGUAGE=ko
# Hard input limits (exceeding them returns 413)
SCRIBE_MAX_DURATION_S=14400 # 4h
SCRIBE_MAX_SIZE_BYTES=2147483648 # 2GB
# Retention (P2+)
SCRIBE_RETENTION_DAYS=7
# SCRIBE_REDIS_URL=redis://localhost:6379/0
# SCRIBE_API_KEYS=["key1","key2"]
# Tunnel (P5): none|cloudflare|ngrok
SCRIBE_TUNNEL=none
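
The hard input limits in the settings above could be enforced roughly as follows. This is a sketch under assumptions, not code from the commit: `check_limits` is a hypothetical helper, and the two constants simply mirror the example values of `SCRIBE_MAX_DURATION_S` and `SCRIBE_MAX_SIZE_BYTES`.

```python
from typing import Optional

# Mirror the example values from the settings file (assumed defaults).
MAX_DURATION_S = 14400        # SCRIBE_MAX_DURATION_S, 4 hours
MAX_SIZE_BYTES = 2147483648   # SCRIBE_MAX_SIZE_BYTES, 2 * 1024**3 bytes


def check_limits(duration_s: float, size_bytes: int) -> Optional[int]:
    """Return HTTP status 413 if either hard limit is exceeded, else None."""
    if duration_s > MAX_DURATION_S or size_bytes > MAX_SIZE_BYTES:
        return 413
    return None


if __name__ == "__main__":
    # A 5-hour recording exceeds the 4-hour duration limit.
    print(check_limits(duration_s=5 * 3600, size_bytes=100_000_000))
```

Inputs exactly at a limit are accepted here; whether the real service treats the boundary as inclusive is not stated in the source.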