luke_scribe


lukehemmin b721ca6419 feat(api): chunk LLM correction for small context windows (+running glossary)

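To fit the in-house GPT-4o context window (<30k), long transcripts are split
into chunks at sentence boundaries, and the English terms found in each
corrected chunk are passed as a 'running glossary' into the system prompt of
the next chunk. This keeps terminology consistent across the whole lecture
without needing a large window. Chunk size is set by config.llm_max_chars
(default 3000; ~8k window→1500 / ~16k→3000 / ~30k→6000). As a safety net, a
single sentence that exceeds the limit is force-split by character count.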
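
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the names `split_into_chunks`, `extract_english_terms`, `correct_transcript`, `BASE_SYSTEM_PROMPT`, and the injected `correct_chunk` callable are all hypothetical, the sentence-boundary regex is a simplification, and only the 3000-character default is taken from the commit message.

```python
import re
from typing import Callable

# Hypothetical base prompt; the real prompt in the repository is not shown here.
BASE_SYSTEM_PROMPT = "Fix transcription errors. Do not change the meaning."


def split_into_chunks(text: str, max_chars: int = 3000) -> list[str]:
    """Pack whole sentences into chunks of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    # Simplified boundary rule: whitespace that follows '.', '!' or '?'.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if len(sentence) > max_chars:
            # Safety net: an oversized sentence is cut by character count.
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(
                sentence[i:i + max_chars]
                for i in range(0, len(sentence), max_chars)
            )
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def extract_english_terms(text: str) -> list[str]:
    """Crude stand-in: any run of two or more ASCII letters, digits or hyphens."""
    return re.findall(r"[A-Za-z][A-Za-z0-9-]+", text)


def correct_transcript(
    text: str,
    correct_chunk: Callable[[str, str], str],
    max_chars: int = 3000,
) -> str:
    """Correct each chunk, feeding terms seen so far into the next system prompt."""
    glossary: list[str] = []
    corrected: list[str] = []
    for chunk in split_into_chunks(text, max_chars):
        system = BASE_SYSTEM_PROMPT
        if glossary:
            system += "\nUse these spellings consistently: " + ", ".join(glossary)
        fixed = correct_chunk(system, chunk)  # assumed to wrap the LLM call
        corrected.append(fixed)
        for term in extract_english_terms(fixed):
            if term not in glossary:
                glossary.append(term)
    return " ".join(corrected)
```

Passing the LLM call in as `correct_chunk(system, chunk)` keeps the sketch testable without network access; a test can substitute a function that records the system prompts it receives and check that terms from an earlier chunk show up in a later one.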

23 tests pass (including chunk splitting and glossary injection); ruff clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-09 07:09:51 +09:00

test_api.py

feat(api): sync test API (serve) + opt-in LLM correction + cloudflared tunnel

2026-06-08 23:20:01 +09:00

test_device_manager.py

feat(p1): scaffolding + Device Manager / VRAM probe + CLI detect

2026-06-07 12:56:07 +09:00

test_engine_audio.py

feat(p1): faster-whisper engine + audio ingest + transcribe (CPU verified)

2026-06-07 15:07:41 +09:00

test_formats.py

feat(api): sync test API (serve) + opt-in LLM correction + cloudflared tunnel

2026-06-08 23:20:01 +09:00

test_postprocess.py

feat(api): chunk LLM correction for small context windows (+running glossary)

2026-06-09 07:09:51 +09:00