luke_scribe

8 Commits 2 Branches 0 Tags

Author	SHA1	Message	Date
lukehemmin	45690371c3	docs: add samples/ bench dataset spec (KO+EN) + broaden audio gitignore Document the exact format for the KO+EN labeled clips that the bench gate needs (manifest.jsonl + ground-truth text + optional entities). Ignore audio/video under samples/** while keeping manifests tracked. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 15:12:20 +09:00
lukehemmin	518c03174a	chore(omc): record P1 progress note (engine+transcribe) + hotpaths Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 15:08:07 +09:00
lukehemmin	73380bebf9	feat(p1): faster-whisper engine + audio ingest + transcribe (CPU verified) - engine/: FasterWhisperEngine wrapper + model_registry (turbo→CT2 repo) - audio/ingest.py: ffprobe duration/size probe + 413 limit hook - cli transcribe: device-auto, model override, 413 guard, model_used output - 3 unit tests (resolve_model, probe_media); README updated Verified (CPU): JFK 11s clip → accurate transcription, detected_lang=en. 10 tests pass, ruff clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 15:07:41 +09:00
lukehemmin	d75d60671e	chore(omc): seed build commands + hotpaths from P1 scaffolding Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 12:56:07 +09:00
lukehemmin	5d2604105b	feat(p1): scaffolding + Device Manager / VRAM probe + CLI detect - pyproject (uv, src layout) + extras: engine/gpu/api/diarize/llm - config.py (pydantic-settings, SCRIBE_ env) - devices/: vram_probe (NVML/psutil/disk) + DeviceManager → capability tier T0–T3, precision by cc/VRAM, worker estimate (plan §3.6, AC-2/3) - cli.py (typer): detect (implemented) + transcribe/bench/serve (stubs) - run.sh, .env.example, README Verified on GTX 1050/2GB: detect → T0_CPU (turbo doesn't fit → explicit downgrade, fail-explicit). Overrides (--device/--workers) work. 7 unit tests cover T0–T3 + overrides via synthetic VRAM. ruff clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 12:56:07 +09:00
lukehemmin	612b353105	docs(omc): seed project memory — directives, notes, tech stack Populate the previously-empty .omc/project-memory.json so teammates and future OMC sessions inherit context: 4 user directives (SoT location, greenfield/next-step, locked design decisions, measurement-gated residual), 3 notes (architecture, tech stack, env), and the decided tech stack. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 12:30:17 +09:00
lukehemmin	84faa121fe	docs: resolve open questions, recompute ambiguity ~10%→~5% (v2.3) Fold post-plan decisions into the spec and consensus plan: - Q1 deploy HW: undecided/mixed → delegate to hardware-adaptive auto-sizing - Q2 model strategy: collapse to single turbo model if P1 bench entity ≥95% - Q3 cancellation: cooperative (segment-boundary) is sufficient; no hard-kill - Q4 concurrency N: delegate to boot-time auto-sizing (AC-8 = ≤5s within auto N) Recompute clarity with the deep-interview model (Goal 0.96 / Constraint 0.95 / Success 0.95 → Total 0.954): ambiguity ~10% → ~5%. Residual is now entirely measurement/code-gated (AC-4 R-WER baseline, hybrid→single confirmation, CT2 GIL) — next lever is P1 bench, not further interview. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 11:07:36 +09:00
lukehemmin	fbe13dddcc	chore: initial commit — planning docs and omc project context Greenfield setup for luke_scribe (local STT transcription API). No source code yet; this captures the completed design phase so teammates can ramp through oh-my-claudecode. Includes: - .omc/plans/consensus-luke-scribe-stt-api.md — consensus impl plan v2.2 - .omc/specs/deep-interview-luke-scribe-stt-api.md — deep-interview spec - .omc/artifacts/ask/{codex,gemini}-*.md — external review (CCG) - .omc/project-memory.json — omc project memory - opencode.json, .claude/settings.json — shared tooling config - .gitignore — excludes ephemeral omc state/session logs and local settings Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-07 10:08:17 +09:00
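
Commit 73380bebf9 mentions an ffprobe duration/size probe with a 413 limit hook in audio/ingest.py. The repository code is not shown on this page, so the following is only a rough sketch of how such a probe might look. The function bodies, the `MediaInfo` and `PayloadTooLarge` types, `parse_ffprobe_json`, `enforce_limits`, and the limit parameters are all assumptions for illustration; only the name `probe_media` and the use of ffprobe come from the commit message.

```python
import json
import subprocess
from dataclasses import dataclass


class PayloadTooLarge(Exception):
    """Hypothetical error that an API layer could map to HTTP 413."""


@dataclass
class MediaInfo:
    duration_s: float
    size_bytes: int


def parse_ffprobe_json(raw: str) -> MediaInfo:
    # ffprobe reports duration and size as strings inside the "format" object.
    fmt = json.loads(raw)["format"]
    return MediaInfo(duration_s=float(fmt["duration"]), size_bytes=int(fmt["size"]))


def probe_media(path: str) -> MediaInfo:
    # Ask ffprobe for container-level duration and size only, as JSON.
    # Requires the ffprobe binary on PATH.
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-show_entries", "format=duration,size",
            "-of", "json",
            path,
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ffprobe_json(result.stdout)


def enforce_limits(info: MediaInfo, max_duration_s: float, max_size_bytes: int) -> None:
    # Assumed guard: reject input that exceeds either configured ceiling.
    if info.duration_s > max_duration_s or info.size_bytes > max_size_bytes:
        raise PayloadTooLarge(
            f"duration={info.duration_s}s size={info.size_bytes}B exceeds limits"
        )
```

Splitting the JSON parsing from the subprocess call keeps the parsing logic testable without ffprobe installed; the actual project may structure this differently.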