GPU (T4) cells: ffmpeg + uv → anonymous clone → uv sync (engine+gpu) →
detect → upload audio → full transcription with large-v3-turbo →
download transcript.txt.
(Colab cannot reach the in-house gate, so it is transcription-only;
correction runs on-prem.)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
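Split long transcripts into chunks at sentence boundaries to fit the
in-house GPT-4o context (<30k), and pass the English terms from each
chunk's correction to the next chunk's system prompt as a 'running
glossary' → terminology stays consistent across the whole lecture without
a large window. config.llm_max_chars (default 3000;
~8k window → 1500 / ~16k → 3000 / ~30k → 6000). Safety net: an oversized
single sentence is force-split by character count.
23 tests pass (including chunk splitting and glossary injection), ruff clean.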
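A minimal sketch of the chunking and running-glossary flow described above,
not the committed implementation. The names split_into_chunks,
extract_english_terms, correct_all and the correct_chunk callback are
hypothetical, and the sentence regex and the Latin-script term heuristic
are assumptions made for illustration only.

```python
import re
from typing import Callable


def split_into_chunks(text: str, max_chars: int = 3000) -> list[str]:
    """Greedily pack sentences into chunks of at most max_chars characters."""
    # Assumed sentence boundary: whitespace after '.', '!' or '?'.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Safety net: hard-split any single sentence longer than max_chars.
        pieces = [sentence[i:i + max_chars]
                  for i in range(0, len(sentence), max_chars)]
        for piece in pieces:
            if current and len(current) + 1 + len(piece) > max_chars:
                chunks.append(current)
                current = piece
            else:
                current = f"{current} {piece}" if current else piece
    if current:
        chunks.append(current)
    return chunks


def extract_english_terms(text: str) -> list[str]:
    """Hypothetical heuristic: treat Latin-script tokens as English terms."""
    return re.findall(r"[A-Za-z][A-Za-z0-9-]+", text)


def correct_all(chunks: list[str],
                correct_chunk: Callable[[str, str], str],
                base_system: str) -> str:
    """Correct chunks in order, feeding accumulated terms to the next call."""
    glossary: list[str] = []
    corrected: list[str] = []
    for chunk in chunks:
        system = base_system
        if glossary:
            system += "\nUse these established terms: " + ", ".join(glossary)
        fixed = correct_chunk(system, chunk)  # stands in for the LLM call
        corrected.append(fixed)
        for term in extract_english_terms(fixed):
            if term not in glossary:
                glossary.append(term)
    return " ".join(corrected)
```

Here correct_chunk(system, chunk) stands in for whatever call sends one
chunk to the LLM; passing it in as a callback keeps the sketch testable
without network access.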
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Document the exact format for the KO+EN labeled clips that the bench gate
needs (manifest.jsonl + ground-truth text + optional entities). Ignore
audio/video under samples/** while keeping manifests tracked.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Populate the previously-empty .omc/project-memory.json so teammates and
future OMC sessions inherit context: 4 user directives (SoT location,
greenfield/next-step, locked design decisions, measurement-gated residual),
3 notes (architecture, tech stack, env), and the decided tech stack.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fold post-plan decisions into the spec and consensus plan:
- Q1 deploy HW: undecided/mixed → delegate to hardware-adaptive auto-sizing
- Q2 model strategy: collapse to single turbo model if P1 bench entity ≥95%
- Q3 cancellation: cooperative (segment-boundary) is sufficient; no hard-kill
- Q4 concurrency N: delegate to boot-time auto-sizing (AC-8 = ≤5s within auto N)
Recompute clarity with the deep-interview model (Goal 0.96 / Constraint 0.95
/ Success 0.95 → Total 0.954): ambiguity ~10% → ~5%. Residual is now entirely
measurement/code-gated (AC-4 R-WER baseline, hybrid→single confirmation,
CT2 GIL) — next lever is P1 bench, not further interview.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>