Compare commits

7 Commits

Author SHA1 Message Date
lukehemmin a5e6d56568 docs: add Colab notebook for full-talk transcription (notebooks/colab_full_transcribe.ipynb)
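GPU (T4) cells: ffmpeg + uv → anonymous clone → uv sync (engine + gpu) → detect →
audio upload → full transcription with large-v3-turbo → download transcript.txt.
(Colab cannot reach the internal gateway, so it is transcription-only; correction runs on-prem.)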

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 07:33:54 +09:00
lukehemmin cd2f807557 chore(omc): hotpaths (beam-size/correct/COLAB)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 07:29:37 +09:00
lukehemmin 7a8cc12cb3 feat(cli): --beam-size + --correct; add COLAB.md GPU full-transcribe guide
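- transcribe: --beam-size (CPU speed), --correct (chunked correction via the internal LLM, SCRIBE_LLM_*),
  config.beam_size (1~2 recommended on CPU). With correction, all segments are collected first and printed at once.
- COLAB.md: guide for Colab (transcription-only, gateway unreachable) and on-prem GPU (full transcription + correction pipeline).

23 tests pass, ruff clean. Verified that --correct fails gracefully when unconfigured.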

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 07:29:37 +09:00
lukehemmin 1a91060c43 chore(omc): hotpaths (chunked correction)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 07:09:51 +09:00
lukehemmin b721ca6419 feat(api): chunk LLM correction for small context windows (+running glossary)
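To fit the internal GPT-4o context (<30k), long transcripts are split into chunks at sentence boundaries,
and the English terms from each corrected chunk are passed as a 'running glossary' in the next chunk's system prompt →
term consistency across the whole talk without a large window. config.llm_max_chars (default 3000;
~8k window → 1500 / ~16k → 3000 / ~30k → 6000). Safety net: an oversized single sentence is force-split by character count.

23 tests pass (including chunk splitting and glossary injection), ruff clean.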

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-09 07:09:51 +09:00
lukehemmin 1ea96c36c8 chore(omc): record GPT-4o correction finding + P2 API progress (hotpaths)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 23:20:01 +09:00
lukehemmin 8f6f8969fd feat(api): sync test API (serve) + opt-in LLM correction + cloudflared tunnel
- api/: FastAPI app, X-API-Key auth (ephemeral key when unset), load-once engine pool
  (+transcribe lock), POST /v1/transcribe (multipart, synchronous), /health, /v1/system,
  /v1/models. Uploaded temp files are deleted in finally (privacy).
- postprocess/: llm.correct (promoted from scripts/llm_correct.py; opt-in, allowlist, audit log, retries)
  + rules.normalize (normalizes terms such as EmbeddingGemma).
- results/formats.py: txt/srt/vtt. connectivity/tunnel.py: cloudflared quick tunnel (Colab).
- cli serve: single-worker uvicorn + --tunnel cloudflare; config llm_* fields;
  pyproject api/queue extras split (+python-multipart, dev httpx).

Verification: 22 unit tests (API TestClient, formats, postprocess) + live-server e2e
(/health, auth 401, real transcription (JFK), SRT, temp-file deletion). KO quality needs turbo/large-v3 (tiny degenerates on Korean).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-08 23:20:01 +09:00
25 changed files with 1285 additions and 60 deletions
+200 -24
@@ -1,22 +1,38 @@
{ {
"version": "1.0.0", "version": "1.0.0",
"lastScanned": 1780794206309, "lastScanned": 1780919472386,
"projectRoot": "/root/luke_scribe", "projectRoot": "/root/luke_scribe",
"techStack": { "techStack": {
"languages": [ "languages": [
"Python" {
"name": "Python",
"version": null,
"confidence": "high",
"markers": [
"pyproject.toml"
]
}
], ],
"frameworks": [ "frameworks": [
"FastAPI · faster-whisper/CTranslate2 · Redis/RQ(no-fork) · pydantic v2 · ffmpeg · Silero VAD" {
"name": "fastapi",
"version": null,
"category": "backend"
},
{
"name": "pytest",
"version": null,
"category": "testing"
}
], ],
"packageManager": "uv", "packageManager": null,
"runtime": "Python 3.11+" "runtime": null
}, },
"build": { "build": {
"buildCommand": "uv sync", "buildCommand": null,
"testCommand": "export PATH=\"$HOME/.local/bin:$HOME/.cargo/bin:$PATH\"\nuv run pytest -q 2>&1 | tail -8\necho \"=== ruff ===\"; uv run ruff check src/ tests/ && echo \"clean\"", "testCommand": "export PATH=\"$HOME/.local/bin:$HOME/.cargo/bin:$PATH\"\necho \"=== ruff ===\"; uv run ruff check src/ tests/ && echo clean\necho \"=== pytest ===\"; uv run pytest -q 2>&1 | tail -2\necho \"=== --correct 경로(설정 없음 → 우아한 에러) ===\"\nuv run luke-scribe transcribe /tmp/jfk.flac --model tiny --language en --correct 2>&1 | tail -4; echo \"exit=${PIPESTATUS[0]}\"",
"lintCommand": "uv run ruff check src/ tests/", "lintCommand": "ruff check",
"devCommand": "uv run luke-scribe detect", "devCommand": null,
"scripts": {} "scripts": {}
}, },
"conventions": { "conventions": {
@@ -29,9 +45,10 @@
"isMonorepo": false, "isMonorepo": false,
"workspaces": [], "workspaces": [],
"mainDirectories": [ "mainDirectories": [
"src/luke_scribe (planned, not yet created)" "src",
"tests"
], ],
"gitBranches": "main" "gitBranches": null
}, },
"customNotes": [ "customNotes": [
{ {
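@@ -57,10 +74,67 @@
"source": "manual", "source": "manual",
"category": "status", "category": "status",
"content": "P1 progress (2026-06-07): ✅ detect (capability tiers T0~T3, 1050 explicitly downgraded to T0_CPU) · ✅ transcribe (faster-whisper verified on CPU: JFK 11s clip transcribed accurately, prints model_used) · 10 unit tests pass. Code exists (no longer 0%). Remaining: word-ts/format output options, making Silero VAD optional, measured VRAM probe (replacing the static estimate), bench (needs a labeled KO+EN sample set), Colab verification of higher tiers (T2/T3), P2 (API + Redis/RQ). Branch feat/p1-core." "content": "P1 progress (2026-06-07): ✅ detect (capability tiers T0~T3, 1050 explicitly downgraded to T0_CPU) · ✅ transcribe (faster-whisper verified on CPU: JFK 11s clip transcribed accurately, prints model_used) · 10 unit tests pass. Code exists (no longer 0%). Remaining: word-ts/format output options, making Silero VAD optional, measured VRAM probe (replacing the static estimate), bench (needs a labeled KO+EN sample set), Colab verification of higher tiers (T2/T3), P2 (API + Redis/RQ). Branch feat/p1-core."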
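},
{
"timestamp": 1780926195887,
"source": "manual",
"category": "finding",
"content": "Verified finding (2026-06-07): the open-vocabulary fix for the transliteration problem with mixed KO+EN terms = text post-processing correction by the internal GPT-4o. English terms that faster-whisper (turbo) mangled into transliterations are restored from context and knowledge, without registering hotwords. Evidence (90-second slice of the EmbeddingGemma talk): 인베딩 점마→Embedding Gemma, 재미나이→Gemini, 점마→Gemma, 랭기징→Language, 구글 포 디벨로퍼스→Google for Developers (5/5, ordinary Korean preserved). Gateway = OpenAI-compatible (baseURL http://192.168.0.123:8080/v1, model copilot-gpt-4o, API key required; the key is not stored in memory; localhost:8080 is a tunnel on the user's machine, so it is unreachable from the sandbox) → calls stay internal, so external egress is 0 (privacy OK). Implication: hotwords catch only what is registered and are insufficient, whereas LLM context correction also covers 'unknown terms'. Caveats: (1) 'Embedding Gemma' is spaced (official: EmbeddingGemma) → rules/glossary normalization is needed alongside, (2) only terms the LLM knows or can infer; brand-new coinages get a confidence flag → human review, (3) a single sample, so over-correction needs further verification, (4) the gateway path is unstable (401→timeout→reset) → retries are needed (reflected in the script). Small contexts are worked around with chunking + a running glossary. PoC = scripts/llm_correct.py → to be promoted to postprocess/llm.py (confidence-gated, chunked, backend=internal, audit log) + a transcribe --correct flag."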
} }
], ],
"directoryMap": {}, "directoryMap": {
"samples": {
"path": "samples",
"purpose": null,
"fileCount": 1,
"lastAccessed": 1780919472362,
"keyFiles": [
"README.md"
]
},
"src": {
"path": "src",
"purpose": "Source code",
"fileCount": 0,
"lastAccessed": 1780919472371,
"keyFiles": []
},
"tests": {
"path": "tests",
"purpose": "Test files",
"fileCount": 2,
"lastAccessed": 1780919472373,
"keyFiles": [
"test_device_manager.py",
"test_engine_audio.py"
]
}
},
"hotPaths": [ "hotPaths": [
{
"path": "src/luke_scribe/cli.py",
"accessCount": 8,
"lastAccessed": 1780957705972,
"type": "file"
},
{
"path": "src/luke_scribe/config.py",
"accessCount": 5,
"lastAccessed": 1780957473801,
"type": "file"
},
{
"path": "scripts/llm_correct.py",
"accessCount": 4,
"lastAccessed": 1780925584647,
"type": "file"
},
{
"path": "pyproject.toml",
"accessCount": 4,
"lastAccessed": 1780928043613,
"type": "file"
},
{ {
"path": "README.md", "path": "README.md",
"accessCount": 3, "accessCount": 3,
@@ -68,15 +142,21 @@
"type": "file" "type": "file"
}, },
{ {
"path": "src/luke_scribe/cli.py", "path": "src/luke_scribe/postprocess/llm.py",
"accessCount": 2, "accessCount": 3,
"lastAccessed": 1780812315014, "lastAccessed": 1780956524689,
"type": "file" "type": "file"
}, },
{ {
"path": "pyproject.toml", "path": "src/luke_scribe/api/routes/transcribe.py",
"accessCount": 1, "accessCount": 3,
"lastAccessed": 1780804235420, "lastAccessed": 1780956549345,
"type": "file"
},
{
"path": "tests/test_postprocess.py",
"accessCount": 2,
"lastAccessed": 1780956556589,
"type": "file" "type": "file"
}, },
{ {
@@ -85,12 +165,6 @@
"lastAccessed": 1780804261889, "lastAccessed": 1780804261889,
"type": "file" "type": "file"
}, },
{
"path": "src/luke_scribe/config.py",
"accessCount": 1,
"lastAccessed": 1780804262703,
"type": "file"
},
{ {
"path": "src/luke_scribe/devices/__init__.py", "path": "src/luke_scribe/devices/__init__.py",
"accessCount": 1, "accessCount": 1,
@@ -168,6 +242,108 @@
"accessCount": 1, "accessCount": 1,
"lastAccessed": 1780812413312, "lastAccessed": 1780812413312,
"type": "file" "type": "file"
},
{
"path": "samples/README.md",
"accessCount": 1,
"lastAccessed": 1780812722445,
"type": "file"
},
{
"path": "samples/ko_en/manifest.jsonl.example",
"accessCount": 1,
"lastAccessed": 1780812854083,
"type": "file"
},
{
"path": "src/luke_scribe/results/__init__.py",
"accessCount": 1,
"lastAccessed": 1780927886298,
"type": "file"
},
{
"path": "src/luke_scribe/results/formats.py",
"accessCount": 1,
"lastAccessed": 1780927892282,
"type": "file"
},
{
"path": "src/luke_scribe/postprocess/__init__.py",
"accessCount": 1,
"lastAccessed": 1780927894092,
"type": "file"
},
{
"path": "src/luke_scribe/postprocess/rules.py",
"accessCount": 1,
"lastAccessed": 1780927897308,
"type": "file"
},
{
"path": "src/luke_scribe/api/__init__.py",
"accessCount": 1,
"lastAccessed": 1780927952439,
"type": "file"
},
{
"path": "src/luke_scribe/api/schemas.py",
"accessCount": 1,
"lastAccessed": 1780927953308,
"type": "file"
},
{
"path": "src/luke_scribe/api/engine_pool.py",
"accessCount": 1,
"lastAccessed": 1780927954191,
"type": "file"
},
{
"path": "src/luke_scribe/api/deps.py",
"accessCount": 1,
"lastAccessed": 1780927955218,
"type": "file"
},
{
"path": "src/luke_scribe/api/app.py",
"accessCount": 1,
"lastAccessed": 1780927956175,
"type": "file"
},
{
"path": "src/luke_scribe/api/routes/__init__.py",
"accessCount": 1,
"lastAccessed": 1780927957095,
"type": "file"
},
{
"path": "src/luke_scribe/connectivity/__init__.py",
"accessCount": 1,
"lastAccessed": 1780927962648,
"type": "file"
},
{
"path": "src/luke_scribe/connectivity/tunnel.py",
"accessCount": 1,
"lastAccessed": 1780927971385,
"type": "file"
},
{
"path": "tests/test_formats.py",
"accessCount": 1,
"lastAccessed": 1780928016400,
"type": "file"
},
{
"path": "tests/test_api.py",
"accessCount": 1,
"lastAccessed": 1780928028187,
"type": "file"
},
{
"path": "COLAB.md",
"accessCount": 1,
"lastAccessed": 1780957731994,
"type": "file"
} }
], ],
"userDirectives": [ "userDirectives": [
+79
@@ -0,0 +1,79 @@
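# Colab / GPU full-transcription guide
In a GPU environment (Colab T4/A100 or an on-prem GPU), transcribe **a full talk quickly** (with optional correction).
On a CPU (dev box) a full talk is slow (turbo ~RTF 5×), so it is not recommended; run it here instead.
On a GPU (T4), turbo runs at roughly 0.1~0.3× real time, so **a 37-minute talk takes a few minutes**.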
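
The speed figures above translate into wall-clock time as sketched below. This is only back-of-the-envelope arithmetic: it takes the quoted real-time factors (RTF, processing time divided by audio duration) at face value, and the helper name `wall_clock_minutes` is an illustration, not part of luke_scribe.

```python
def wall_clock_minutes(audio_minutes: float, rtf: float) -> float:
    """Estimated processing time = audio duration x real-time factor (RTF)."""
    return audio_minutes * rtf


talk = 37.0  # minutes, the example talk length used in this guide
for label, rtf in [
    ("T4 turbo (fast end)", 0.1),
    ("T4 turbo (slow end)", 0.3),
    ("CPU turbo", 5.0),
]:
    print(f"{label}: ~{wall_clock_minutes(talk, rtf):.1f} min")
```

Under these assumptions the T4 lands somewhere between roughly 4 and 11 minutes, while the CPU figure works out to about three hours.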
---
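## A) Google Colab: transcription only
> Colab is an external cloud, so it **cannot reach the internal LLM gateway (192.168.0.123)** → `--correct` (correction) is unavailable, **transcription only**.
> Runtime → Change runtime type → select **GPU (T4)**.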
```python
# 1) System dependencies + uv
!apt-get -qq update && apt-get -qq install -y ffmpeg
!curl -LsSf https://astral.sh/uv/install.sh | sh
import os; os.environ["PATH"] = "/root/.local/bin:" + os.environ["PATH"]
# 2) Code (the repository allows anonymous read)
!git clone -b feat/p1-core https://git.lukehemmin.com/lukehemmin/luke_scribe.git
%cd luke_scribe
# 3) Dependencies (engine + GPU CUDA runtime)
!uv sync --extra engine --extra gpu
# 4) Check GPU detection (T3 keeps turbo and large-v3 resident together)
!uv run luke-scribe detect
# 5) Upload audio (or mount Drive)
from google.colab import files
AUDIO = list(files.upload().keys())[0]
# 6) Full transcription (large-v3-turbo); use --model large-v3 for higher accuracy
!uv run luke-scribe transcribe "$AUDIO" --model large-v3-turbo --language ko --timestamps | tee transcript.txt
```
### Exposing Colab externally as an API
```python
# Get a public cloudflared URL → curl it from outside
!uv sync --extra engine --extra gpu --extra api
import subprocess, os
os.environ["SCRIBE_API_KEYS"] = '["colab-test"]'
!nohup uv run luke-scribe serve --host 0.0.0.0 --port 8000 --tunnel cloudflare > serve.log 2>&1 &
import time; time.sleep(8); print(open("serve.log").read()) # check the public *.trycloudflare.com URL
```
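
Rather than reading the URL out of the log by eye, it can be extracted programmatically. The sketch below reuses the URL pattern that `connectivity/tunnel.py` in this change matches against cloudflared output; the function name `find_tunnel_url` and the sample log line are illustrative assumptions, and real cloudflared output differs in detail.

```python
from __future__ import annotations

import re

# Quick-tunnel hostnames look like https://<random-words>.trycloudflare.com
TUNNEL_URL_RE = re.compile(r"https://[-a-z0-9]+\.trycloudflare\.com")


def find_tunnel_url(log_text: str) -> str | None:
    """Return the first quick-tunnel URL found in the log text, or None."""
    match = TUNNEL_URL_RE.search(log_text)
    return match.group(0) if match else None


# Hypothetical log excerpt, for illustration only.
sample_log = "INF |  https://quiet-river-1234.trycloudflare.com  |\n"
print(find_tunnel_url(sample_log))
```

In the notebook this would be called as `find_tunnel_url(open("serve.log").read())`; a `None` result suggests the tunnel has not come up yet and the log is worth re-reading after a few more seconds.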
---
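## B) On-prem GPU: transcription + internal LLM correction (full pipeline)
On a GPU machine inside the internal network (gateway 192.168.0.123 reachable), go **all the way to restoring transliterated terms to English** in one pass: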
```bash
git clone -b feat/p1-core https://git.lukehemmin.com/lukehemmin/luke_scribe.git && cd luke_scribe
uv sync --extra engine --extra gpu
export SCRIBE_LLM_BASE_URL=http://192.168.0.123:8080/v1
export SCRIBE_LLM_API_KEY=<internal key> # mind your shell history
export SCRIBE_LLM_MODEL=copilot-gpt-4o
export SCRIBE_LLM_MAX_CHARS=3000 # match the internal LLM context window (~8k→1500/~16k→3000/~30k→6000)
# Transcription + chunked correction in one command
uv run luke-scribe transcribe talk.m4a --model large-v3-turbo --language ko --correct | tee transcript.txt
```
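
Before starting a long run with `--correct`, it can help to confirm that the LLM settings are actually present. The following is a minimal pre-flight sketch, assuming (per the `LLMNotConfigured` docstring in this change) that the base URL and API key are the two required values; `missing_llm_settings` is a hypothetical helper, not a luke_scribe function.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Assumed required variables; the model and chunk size have defaults in config.
REQUIRED = ("SCRIBE_LLM_BASE_URL", "SCRIBE_LLM_API_KEY")


def missing_llm_settings(env: Mapping[str, str] | None = None) -> list[str]:
    """Names of required SCRIBE_LLM_* variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


if __name__ == "__main__":
    missing = missing_llm_settings()
    if missing:
        print("--correct would fail; missing:", ", ".join(missing))
    else:
        print("LLM correction settings look complete")
```

The CLI itself reports an unconfigured LLM only after transcription has finished, so a check like this mainly saves the time of a wasted transcription pass.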
Via the API:
```bash
uv run luke-scribe serve # use the printed X-API-Key
curl -H "X-API-Key: <key>" -F file=@talk.m4a -F model=large-v3-turbo -F correct=true \
http://localhost:8000/v1/transcribe
```
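
The same request can be made from Python. The sketch below builds the multipart body with the standard library only; the endpoint path, the `X-API-Key` header, and the form field names come from the route in this change, while `build_multipart` and `transcribe` are illustrative helper names. It is not a tested client for this server.

```python
from __future__ import annotations

import json
import os
import urllib.request
import uuid


def build_multipart(fields: dict[str, str], filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode form fields plus one file part; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    parts: list[bytes] = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode()
        )
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    parts.append(head + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(url: str, api_key: str, path: str, **fields: str) -> dict:
    """POST an audio file to /v1/transcribe and return the parsed JSON response."""
    with open(path, "rb") as fh:
        body, content_type = build_multipart(fields, os.path.basename(path), fh.read())
    req = urllib.request.Request(
        url, data=body, headers={"X-API-Key": api_key, "Content-Type": content_type}
    )
    with urllib.request.urlopen(req, timeout=600) as resp:
        return json.loads(resp.read())
```

A call would look like `transcribe("http://localhost:8000/v1/transcribe", "<key>", "talk.m4a", model="large-v3-turbo", correct="true")`. The 600-second timeout is an arbitrary choice: the endpoint is synchronous, so the request stays open for the whole transcription.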
---
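## Notes
- Correction splits a long transcript into `SCRIBE_LLM_MAX_CHARS` chunks and carries a **running glossary** between them (to cope with small context windows).
- A weak GPU (1050/2GB) cannot fit even turbo and is automatically downgraded to **CPU (T0)**; check the tier with `detect`.
- Audio files are not in the repository (`.gitignore`); use a Colab upload/Drive or a local on-prem path.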
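
The chunking and running-glossary idea from the first note can be sketched as follows. This is a simplified illustration, not the code in `postprocess/llm.py`: it omits the forced character-level split and the glossary cap, and it replaces the LLM call with a stand-in function (`fake_llm`) so that it runs offline. All names here are hypothetical.

```python
from __future__ import annotations

import re
from collections.abc import Callable

SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?\n])\s+")
ENGLISH_TERM = re.compile(r"[A-Za-z][A-Za-z0-9.+/#-]{1,}")


def chunk(text: str, max_chars: int) -> list[str]:
    """Pack whole sentences into chunks of at most max_chars characters."""
    chunks: list[str] = []
    current = ""
    for sentence in SENTENCE_BOUNDARY.split(text):
        if not sentence:
            continue
        if current and len(current) + 1 + len(sentence) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}" if current else sentence
    if current:
        chunks.append(current)
    return chunks


def correct_with_glossary(
    text: str, max_chars: int, llm: Callable[[str, list[str]], str]
) -> str:
    """Correct chunk by chunk, handing the terms seen so far to each later call."""
    glossary: list[str] = []
    corrected: list[str] = []
    for piece in chunk(text, max_chars):
        fixed = llm(piece, glossary)
        corrected.append(fixed)
        for term in ENGLISH_TERM.findall(fixed):
            if term not in glossary:
                glossary.append(term)
    return " ".join(corrected)


# Stand-in for the real LLM call: records the glossary it was handed.
seen: list[list[str]] = []


def fake_llm(piece: str, glossary: list[str]) -> str:
    seen.append(list(glossary))
    return piece.replace("점마", "Gemma")


result = correct_with_glossary("점마 소개입니다. 점마는 오픈모델입니다.", 12, fake_llm)
print(result)
print(seen)
```

The second call receives `["Gemma"]` from the first chunk, which is how a later chunk can stay consistent with spellings settled earlier without the whole talk fitting in one context window.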
+130
@@ -0,0 +1,130 @@
{
"cells": [
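{
"cell_type": "markdown",
"metadata": {},
"source": [
"# luke_scribe: full-talk transcription on Colab\n",
"\n",
"Transcribes a full talk in **a few minutes** on a GPU (T4).\n",
"\n",
"**First:** Runtime → Change runtime type → select **GPU** as the hardware accelerator.\n",
"\n",
"> ⚠️ Colab is external, so it **cannot reach the internal LLM gateway (192.168.0.123)** → correction (`--correct`) is unavailable, **transcription only**. For correction, use a GPU on the internal network (repo `COLAB.md`, section B).\n"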
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# 0) Check the GPU (if missing, switch the runtime type to GPU)\n",
"!nvidia-smi -L || echo \"No GPU → switch the runtime type to GPU\"\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# 1) System dependencies + uv\n",
"!apt-get -qq update && apt-get -qq install -y ffmpeg\n",
"!curl -LsSf https://astral.sh/uv/install.sh | sh\n",
"import os\n",
"os.environ['PATH'] = '/root/.local/bin:' + os.environ['PATH']\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# 2) Fetch the code (the repository allows anonymous read)\n",
"!git clone -b feat/p1-core https://git.lukehemmin.com/lukehemmin/luke_scribe.git\n",
"%cd luke_scribe\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# 3) Dependencies (engine + GPU CUDA runtime); takes a few minutes\n",
"!uv sync --extra engine --extra gpu\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# 4) Check the hardware tier (T3 = turbo and large-v3 resident together)\n",
"!uv run luke-scribe detect\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# 5) Upload the talk audio (m4a/mp3/wav/mp4 …)\n",
"from google.colab import files\n",
"AUDIO = list(files.upload().keys())[0]\n",
"print('Uploaded:', AUDIO)\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# 6) Full transcription (large-v3-turbo; use --model large-v3 for higher accuracy)\n",
"!uv run luke-scribe transcribe \"$AUDIO\" --model large-v3-turbo --language ko --timestamps | tee transcript.txt\n"
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# 7) Download the transcript\n",
"from google.colab import files\n",
"files.download('transcript.txt')\n"
]
},
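{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Notes\n",
"- **Model**: `large-v3-turbo` (fast) ↔ `large-v3` (accurate). If `detect` reports T0 (CPU), the GPU is too weak (slow).\n",
"- **Correction (transliteration → English)**: not possible on Colab (gateway unreachable). On an internal-network GPU, use `--correct` + `SCRIBE_LLM_*` (`COLAB.md`, section B).\n",
"- **Speed**: T4 turbo ≈ 0.1~0.3× real time → a 37-minute talk in a few minutes.\n"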
]
}
],
"metadata": {
"accelerator": "GPU",
"colab": {
"provenance": [],
"gpuType": "T4"
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
+5 -3
@@ -18,8 +18,10 @@ dependencies = [
engine = ["faster-whisper>=1.0.3", "av>=11"] engine = ["faster-whisper>=1.0.3", "av>=11"]
# GPU CUDA runtime (for faster-whisper GPU inference) # GPU CUDA runtime (for faster-whisper GPU inference)
gpu = ["nvidia-cublas-cu12", "nvidia-cudnn-cu12"] gpu = ["nvidia-cublas-cu12", "nvidia-cudnn-cu12"]
# P2 API + Queue # Test API (synchronous), used by serve
api = ["fastapi>=0.110", "uvicorn[standard]>=0.29", "redis>=5.0", "rq>=1.16"] api = ["fastapi>=0.110", "uvicorn[standard]>=0.29", "python-multipart>=0.0.9"]
# P2 async queue (on hold)
queue = ["redis>=5.0", "rq>=1.16"]
# P5 options # P5 options
diarize = ["pyannote.audio>=3.1"] diarize = ["pyannote.audio>=3.1"]
llm = ["openai>=1.30"] llm = ["openai>=1.30"]
@@ -35,4 +37,4 @@ build-backend = "hatchling.build"
packages = ["src/luke_scribe"] packages = ["src/luke_scribe"]
[dependency-groups] [dependency-groups]
dev = ["pytest>=8.2", "ruff>=0.5"] dev = ["pytest>=8.2", "ruff>=0.5", "httpx>=0.27"]
+82
@@ -0,0 +1,82 @@
#!/usr/bin/env python3
"""STT post-processing PoC: restore transliterated English technical terms via the internal LLM (OpenAI-compatible).

Run where the gateway is reachable:
    export SCRIBE_LLM_BASE_URL=http://localhost:8080/v1
    export SCRIBE_LLM_API_KEY=<internal key>
    export SCRIBE_LLM_MODEL=copilot-gpt-4o
    python3 scripts/llm_correct.py            # demo with the built-in sample
    python3 scripts/llm_correct.py < my.txt   # correct an arbitrary transcript

No external dependencies (urllib). To evolve into postprocess/llm.py (confidence-gated, chunking/running glossary).
"""
from __future__ import annotations

import json
import os
import sys
import time
import urllib.error
import urllib.request

# The prompt stays in Korean because it instructs the model about Korean transcripts.
SYSTEM = (
    "너는 한국어 STT 전사 후처리기다. 한국어 음성에 섞여 나온 영어 기술용어·고유명사가 "
    "발음대로 한글로 음차되어 잘못 적힌 부분을 문맥과 지식으로 원래 영어 표기로 복원하라. "
    "일반 한국어는 그대로 두고, 확실하지 않으면 바꾸지 마라. 설명 없이 교정된 전사문만 출력하라."
)

# Real transcript mangled by turbo (EmbeddingGemma talk); built-in demo sample
SAMPLE = (
    "그래서 오늘 준비한 내용은 기본적으로 인베딩 점마에 대해서 설명을 드릴 텐데요. "
    "여러분들이 알고 계시는 랭기징 모델이 정말 사람이 생각하는 것처럼 하는데 "
    "그 다음에 구글에 런칭한 오픈모델입니다. 인베딩 점마 라는 것을 소개를 해드릴 예정입니다. "
    "그리고 어 재미나이 하고 이제 점마하고 두 가지가 있는데요. "
    "구글 포 디벨로퍼스 사이트에 가시면 제가 올린 포스트도 보실 수 있는데."
)


def correct(text: str) -> str:
    base = os.environ.get("SCRIBE_LLM_BASE_URL", "http://localhost:8080/v1").rstrip("/")
    key = os.environ.get("SCRIBE_LLM_API_KEY", "")
    model = os.environ.get("SCRIBE_LLM_MODEL", "copilot-gpt-4o")
    payload = {
        "model": model,
        "temperature": 0,
        "messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": text},
        ],
    }
    req = urllib.request.Request(
        base + "/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json", "Authorization": "Bearer " + key},
    )
    retries = 4
    for attempt in range(1, retries + 1):
        try:
            with urllib.request.urlopen(req, timeout=90) as resp:
                return json.loads(resp.read())["choices"][0]["message"]["content"]
        except urllib.error.HTTPError:
            raise  # a real HTTP response (401/400, etc.); retrying is pointless
        except (urllib.error.URLError, OSError) as exc:  # transient: connection reset/timeout
            if attempt == retries:
                raise
            print(f" [retry {attempt}/{retries - 1}] {type(exc).__name__} → retrying", file=sys.stderr)
            time.sleep(1.5 * attempt)
    raise RuntimeError("unreachable")


def main() -> None:
    src = (sys.stdin.read().strip() if not sys.stdin.isatty() else "") or SAMPLE
    print("=== Original ===\n" + src + "\n\n=== Corrected ===")
    try:
        print(correct(src))
    except urllib.error.HTTPError as exc:
        sys.exit(f"HTTP {exc.code}: {exc.read().decode()[:300]}")
    except Exception as exc:  # noqa: BLE001
        sys.exit(f"{type(exc).__name__}: {exc}")


if __name__ == "__main__":
    main()
+1
@@ -0,0 +1 @@
"""HTTP API (FastAPI): synchronous test API. Async queue and real-time are P2/P3."""
+24
@@ -0,0 +1,24 @@
"""FastAPI app factory."""
from __future__ import annotations

import contextlib
import logging
from collections.abc import AsyncIterator

from fastapi import FastAPI

from .deps import ensure_keys
from .routes.transcribe import router

logger = logging.getLogger("luke_scribe.api")


def create_app() -> FastAPI:
    @contextlib.asynccontextmanager
    async def lifespan(_app: FastAPI) -> AsyncIterator[None]:
        logger.info("luke_scribe API ready · X-API-Key=%s", ensure_keys()[0])
        yield

    app = FastAPI(title="luke_scribe", version="0.1.0", lifespan=lifespan)
    app.include_router(router)
    return app
+26
@@ -0,0 +1,26 @@
"""Auth: X-API-Key (spec §3.8). If no key is configured, one ephemeral key is generated at startup and enforced."""
from __future__ import annotations

import secrets

from fastapi import Header, HTTPException, status

from ..config import settings

_ephemeral_key: str | None = None


def ensure_keys() -> list[str]:
    """List of valid keys. If none are configured, generate an ephemeral key once and return it (the app prints it)."""
    global _ephemeral_key
    if settings.api_keys:
        return settings.api_keys
    if _ephemeral_key is None:
        _ephemeral_key = "sk-luke-" + secrets.token_urlsafe(24)
    return [_ephemeral_key]


def require_api_key(x_api_key: str | None = Header(default=None)) -> str:
    if x_api_key not in ensure_keys():
        raise HTTPException(status.HTTP_401_UNAUTHORIZED, "invalid or missing X-API-Key")
    return x_api_key
+27
@@ -0,0 +1,27 @@
"""Process-level engine cache: load each model once and reuse it (spec §3.5).

Transcription is serialized with `transcribe_lock` (single GPU/CPU, test grade). Assumes a single uvicorn worker.
"""
from __future__ import annotations

import threading

from ..engine.faster_whisper_engine import FasterWhisperEngine

_engines: dict[tuple[str, str, str], FasterWhisperEngine] = {}
_cache_lock = threading.Lock()
transcribe_lock = threading.Lock()


def get_engine(
    model: str, device: str, compute_type: str, cache_dir: str | None = None
) -> FasterWhisperEngine:
    key = (model, device, compute_type)
    eng = _engines.get(key)
    if eng is None:
        with _cache_lock:
            eng = _engines.get(key)  # re-check under the lock (double-checked locking)
            if eng is None:
                eng = FasterWhisperEngine(model, device, compute_type, cache_dir)
                _engines[key] = eng
    return eng
+124
@@ -0,0 +1,124 @@
"""Routes: /health, /v1/system, /v1/models, POST /v1/transcribe (synchronous)."""
from __future__ import annotations

import contextlib
import os
import tempfile

from fastapi import APIRouter, Depends, File, Form, HTTPException, UploadFile, status
from fastapi.responses import PlainTextResponse

from ...audio.ingest import probe_media
from ...config import settings
from ...devices import DeviceManager
from ...postprocess import llm as llm_correct
from ...postprocess import rules
from ...results import formats
from ..deps import require_api_key
from ..engine_pool import get_engine, transcribe_lock

router = APIRouter()


@router.get("/health")
def health() -> dict[str, str]:
    return {"status": "ok"}


@router.get("/v1/system")
def system():  # noqa: ANN201 - serializes DeviceProfile (pydantic)
    return DeviceManager.detect()


@router.get("/v1/models")
def models() -> dict:
    profile = DeviceManager.detect()
    return {
        "tier": profile.tier.value,
        "served": profile.served_models,
        "realtime": settings.model_realtime,
        "batch": settings.model_batch,
    }


@router.post("/v1/transcribe")
def transcribe_ep(  # noqa: PLR0913 - many request options (spec options schema)
    file: UploadFile = File(...),
    language: str | None = Form(None),
    model: str | None = Form(None),
    device: str = Form("auto"),
    vad: bool = Form(True),
    word_timestamps: bool = Form(False),
    correct: bool = Form(False),
    response_format: str = Form("json"),
    _api_key: str = Depends(require_api_key),
):
    suffix = os.path.splitext(file.filename or "")[1] or ".bin"
    fd, tmp = tempfile.mkstemp(prefix="luke_up_", suffix=suffix)
    try:
        # Stream the upload to disk in 1 MiB chunks.
        with os.fdopen(fd, "wb") as out:
            while chunk := file.file.read(1 << 20):
                out.write(chunk)

        info = probe_media(tmp)
        if info.duration_s > settings.max_duration_s or info.size_bytes > settings.max_size_bytes:
            raise HTTPException(
                status.HTTP_413_CONTENT_TOO_LARGE,
                f"{info.duration_s:.0f}s/{info.size_bytes}B "
                f"exceeds {settings.max_duration_s}s/{settings.max_size_bytes}B",
            )

        profile = DeviceManager.detect(force_device=(None if device == "auto" else device))
        dev = "cpu" if profile.kind == "cpu" else "cuda"
        model_name = model or settings.model_realtime
        lang = language or settings.language

        engine = get_engine(model_name, dev, profile.compute_type, settings.model_cache_dir)
        with transcribe_lock:
            segments, tinfo = engine.transcribe(
                tmp, language=lang, word_timestamps=word_timestamps, vad=vad
            )
            seg_list = [
                {"start": float(s.start), "end": float(s.end), "text": s.text.strip()}
                for s in segments
            ]
        text = " ".join(s["text"] for s in seg_list).strip()

        corrected = False
        if correct:
            try:
                text = rules.normalize(
                    llm_correct.correct(
                        text,
                        base_url=settings.llm_base_url,
                        api_key=settings.llm_api_key,
                        model=settings.llm_model,
                        max_chars=settings.llm_max_chars,
                    )
                )
                corrected = True
            except llm_correct.LLMNotConfigured as exc:
                raise HTTPException(status.HTTP_400_BAD_REQUEST, f"correct=true but {exc}") from exc
            except Exception as exc:  # noqa: BLE001
                raise HTTPException(
                    status.HTTP_502_BAD_GATEWAY, f"LLM correction failed: {exc}"
                ) from exc

        if response_format == "txt":
            return PlainTextResponse(text)
        if response_format == "srt":
            return PlainTextResponse(formats.to_srt(seg_list))
        if response_format == "vtt":
            return PlainTextResponse(formats.to_vtt(seg_list))
        return {
            "text": text,
            "segments": seg_list,
            "language": getattr(tinfo, "language", None),
            "model_used": model_name,
            "corrected": corrected,
            "duration_s": info.duration_s,
        }
    finally:
        with contextlib.suppress(OSError):
            os.remove(tmp)  # privacy: delete the temp file on every exit path
        file.file.close()
+19
@@ -0,0 +1,19 @@
"""API response schemas."""
from __future__ import annotations

from pydantic import BaseModel


class Segment(BaseModel):
    start: float
    end: float
    text: str


class TranscribeResult(BaseModel):
    text: str
    segments: list[Segment]
    language: str | None = None
    model_used: str
    corrected: bool = False
    duration_s: float = 0.0
+77 -7
@@ -55,6 +55,8 @@ def transcribe(
device: str = typer.Option("auto", help="auto|cpu|cuda"), device: str = typer.Option("auto", help="auto|cpu|cuda"),
word_timestamps: bool = typer.Option(False, "--word-timestamps"), word_timestamps: bool = typer.Option(False, "--word-timestamps"),
vad: bool = typer.Option(True, "--vad/--no-vad", help="무음 제거"), vad: bool = typer.Option(True, "--vad/--no-vad", help="무음 제거"),
beam_size: int = typer.Option(None, "--beam-size", help="디코딩 빔(CPU 1~2 권장=속도↑)"),
correct: bool = typer.Option(False, "--correct", help="사내 LLM 보정(SCRIBE_LLM_* 설정 필요)"),
timestamps: bool = typer.Option(False, "--timestamps", help="세그먼트 [startend] 표시"), timestamps: bool = typer.Option(False, "--timestamps", help="세그먼트 [startend] 표시"),
) -> None: ) -> None:
"""단발 파일 전사 (faster-whisper, CPU/GPU 자동, AC-4 일부).""" """단발 파일 전사 (faster-whisper, CPU/GPU 자동, AC-4 일부)."""
@@ -90,17 +92,45 @@ def transcribe(
) )
engine = FasterWhisperEngine(model_name, dev, profile.compute_type, cache_dir=settings.model_cache_dir) engine = FasterWhisperEngine(model_name, dev, profile.compute_type, cache_dir=settings.model_cache_dir)
segments, tinfo = engine.transcribe(file, language=lang, word_timestamps=word_timestamps, vad=vad) segments, tinfo = engine.transcribe(
file, language=lang, word_timestamps=word_timestamps, vad=vad,
beam_size=(beam_size or settings.beam_size),
)
count = 0 seg_list = []
for seg in segments: for seg in segments:
count += 1 seg_list.append({"start": seg.start, "end": seg.end, "text": seg.text.strip()})
if not correct: # 스트리밍 출력(보정 시엔 전체를 모은 뒤 한 번에)
if timestamps: if timestamps:
console.print(f"[cyan][{seg.start:6.2f}{seg.end:6.2f}][/] {seg.text.strip()}") console.print(f"[cyan][{seg.start:6.2f}{seg.end:6.2f}][/] {seg.text.strip()}")
else: else:
console.print(seg.text.strip()) console.print(seg.text.strip())
if correct:
from .postprocess import llm as llm_correct
from .postprocess import rules
text = " ".join(s["text"] for s in seg_list).strip()
try:
text = rules.normalize(
llm_correct.correct(
text,
base_url=settings.llm_base_url,
api_key=settings.llm_api_key,
model=settings.llm_model,
max_chars=settings.llm_max_chars,
)
)
except llm_correct.LLMNotConfigured as exc:
console.print(f"[red]--correct:[/] {exc}")
raise typer.Exit(code=1) from exc
console.print(text)
detected = getattr(tinfo, "language", None) detected = getattr(tinfo, "language", None)
console.print(f"[green]✓ {count} segments · detected_lang={detected} · model_used={model_name}[/]") console.print(
f"[green]✓ {len(seg_list)} segments · detected_lang={detected} · "
f"model_used={model_name} · corrected={correct}[/]"
)
@app.command() @app.command()
@@ -110,9 +140,49 @@ def bench(samples: str = typer.Option(None, help="라벨된 KO+EN 샘플 디렉
@app.command() @app.command()
def serve() -> None: def serve(
"""API 서버 (P2).""" host: str = typer.Option(None, help="bind host (기본 설정값)"),
_todo("serve", "→ P2 (FastAPI + Redis/RQ)") port: int = typer.Option(None, help="bind port (기본 설정값)"),
tunnel: str = typer.Option("none", help="none|cloudflare (Colab 외부 노출)"),
) -> None:
"""테스트 API 서버 (동기 transcribe + opt-in 보정). AC-1/11/12 일부."""
from .config import settings
try:
import uvicorn
from .api.app import create_app
from .api.deps import ensure_keys
except ImportError as exc:
console.print(f"[red]API 의존성 미설치:[/] {exc}\n→ `uv sync --extra api --extra engine`")
raise typer.Exit(code=1) from exc
bind_host = host or settings.host
bind_port = port or settings.port
key = ensure_keys()[0]
console.print(
f"[green]luke_scribe API[/] → http://{bind_host}:{bind_port} "
f"(X-API-Key: [bold]{key}[/])"
)
proc = None
if tunnel == "cloudflare":
try:
from .connectivity.tunnel import start_cloudflared
proc, public = start_cloudflared(bind_port)
console.print(
f"[green]public:[/] {public}" if public
else "[yellow]cloudflared URL 미수신(계속 진행).[/]"
)
except Exception as exc: # noqa: BLE001
console.print(f"[yellow]터널 실패(무시): {exc}[/]")
try:
uvicorn.run(create_app(), host=bind_host, port=bind_port, workers=1, log_level="info")
finally:
if proc is not None:
proc.terminate()
def main() -> None: def main() -> None:
+13
@@ -15,6 +15,7 @@ class Settings(BaseSettings):
device: str = "auto" device: str = "auto"
compute_type: str | None = None # None=자동(cc/VRAM 기반) compute_type: str | None = None # None=자동(cc/VRAM 기반)
workers: int | None = None # None=자동 산정 workers: int | None = None # None=자동 산정
beam_size: int = 5 # decoding beam (1~2 recommended on CPU for speed, 5 on GPU)
# 언어 (기본 ko, 요청별 override) # 언어 (기본 ko, 요청별 override)
language: str = "ko" language: str = "ko"
@@ -34,5 +35,17 @@ class Settings(BaseSettings):
# 모델 캐시 디렉터리 (None=HF 기본) # 모델 캐시 디렉터리 (None=HF 기본)
model_cache_dir: str | None = None model_cache_dir: str | None = None
# API server (synchronous test API)
host: str = "127.0.0.1"
port: int = 8000
# LLM correction (opt-in, internal/local OpenAI-compatible backend)
llm_enabled: bool = False
llm_base_url: str | None = None # e.g. http://192.168.0.123:8080/v1 (allowlist = this endpoint only)
llm_api_key: str | None = None # injected only via env SCRIBE_LLM_API_KEY
llm_model: str = "copilot-gpt-4o"
# Correction chunk size (characters); tune to the internal LLM context window (e.g. ~8k window → 1500, ~16k → 3000, ~30k → 6000)
llm_max_chars: int = 3000
settings = Settings() settings = Settings()
+1
@@ -0,0 +1 @@
"""External exposure for environments without a public IP, such as Colab (spec §8). MVP: cloudflared quick tunnel."""
+63
@@ -0,0 +1,63 @@
"""cloudflared quick tunnel (spec §8). Downloads the binary into a cache if it is missing. Best-effort.

`serve --tunnel cloudflare` calls this → issues a public https://<rand>.trycloudflare.com URL (no account needed).
"""
from __future__ import annotations

import os
import platform
import re
import shutil
import stat
import subprocess
import time
import urllib.request

_RELEASE = "https://github.com/cloudflare/cloudflared/releases/latest/download"
_ASSETS = {
    ("Linux", "x86_64"): "cloudflared-linux-amd64",
    ("Linux", "aarch64"): "cloudflared-linux-arm64",
}
_URL_RE = re.compile(r"https://[-a-z0-9]+\.trycloudflare\.com")


def ensure_cloudflared() -> str:
    found = shutil.which("cloudflared")
    if found:
        return found
    cache = os.path.expanduser("~/.cache/luke_scribe")
    os.makedirs(cache, exist_ok=True)
    path = os.path.join(cache, "cloudflared")
    if os.path.exists(path):
        return path
    asset = _ASSETS.get((platform.system(), platform.machine()))
    if not asset:
        raise RuntimeError(
            f"cloudflared auto-install unsupported: {platform.system()}/{platform.machine()}; "
            "install it manually and put it on PATH."
        )
    urllib.request.urlretrieve(f"{_RELEASE}/{asset}", path)  # noqa: S310
    os.chmod(path, os.stat(path).st_mode | stat.S_IEXEC)
    return path


def start_cloudflared(port: int, timeout: float = 30.0) -> tuple[subprocess.Popen, str | None]:
    """Start the tunnel process → (proc, public_url). If no URL arrives, url=None (the process is kept running)."""
    binp = ensure_cloudflared()
    proc = subprocess.Popen(  # noqa: S603
        [binp, "tunnel", "--no-autoupdate", "--url", f"http://localhost:{port}"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    deadline = time.time() + timeout
    while time.time() < deadline:
        line = proc.stdout.readline() if proc.stdout else ""
        if not line:
            if proc.poll() is not None:
                break
            continue
        m = _URL_RE.search(line)
        if m:
            return proc, m.group(0)
    return proc, None
+1
@@ -0,0 +1 @@
"""Post-processing: glossary/rules + (opt-in) LLM correction + confidence (spec §7)."""
+138
@@ -0,0 +1,138 @@
"""LLM correction (spec §7 stage 3 / §3.8): restore transliterated English terms using context and knowledge.

Small context windows (in-house GPT-4o, < 30k tokens): a long transcript is **split into chunks at
sentence boundaries**, each chunk is corrected in order, and the **English spellings already settled
(the running glossary)** are passed on to the next chunk, so terminology stays consistent across the
whole talk without a large window.

OpenAI-compatible backend (in-house/local). **opt-in** (request correct=true) · **allowlist** (only the
configured base_url) · **audit log** (one summary line per call). Transient errors (connection
reset/timeout) are retried.
"""
from __future__ import annotations

import json
import logging
import re
import time
import urllib.error
import urllib.request

logger = logging.getLogger("luke_scribe.postprocess.llm")

# Korean system prompt, kept verbatim because it is sent to the model. Gist: you post-process Korean STT
# transcripts; restore English technical terms and proper nouns that were transliterated into Hangul,
# leave ordinary Korean alone, change nothing when unsure, and output only the corrected transcript.
SYSTEM = (
    "너는 한국어 STT 전사 후처리기다. 한국어 음성에 섞여 나온 영어 기술용어·고유명사가 "
    "발음대로 한글로 음차되어 잘못 적힌 부분을 문맥과 지식으로 원래 영어 표기로 복원하라. "
    "일반 한국어는 그대로 두고, 확실하지 않으면 바꾸지 마라. 설명 없이 교정된 전사문만 출력하라."
)
_SENT_RE = re.compile(r"(?<=[.!?。…\n])\s+")  # sentence boundary
_TERM_RE = re.compile(r"[A-Za-z][A-Za-z0-9.+/#-]{1,}")  # English tokens for the running glossary
_GLOSSARY_CAP = 60


class LLMNotConfigured(RuntimeError):
    """llm_base_url / llm_api_key is not set."""


def _chunk(text: str, max_chars: int) -> list[str]:
    """Pack sentences into chunks of at most max_chars. A single oversized sentence is force-split by characters."""
    if len(text) <= max_chars:
        return [text]
    packed: list[str] = []
    cur = ""
    for part in _SENT_RE.split(text):
        if not part:
            continue
        if cur and len(cur) + len(part) + 1 > max_chars:
            packed.append(cur)
            cur = part
        else:
            cur = f"{cur} {part}" if cur else part
    if cur:
        packed.append(cur)
    out: list[str] = []
    for c in packed:  # safety net: force-split by characters when a single sentence is too long
        if len(c) > max_chars:
            out.extend(c[i : i + max_chars] for i in range(0, len(c), max_chars))
        else:
            out.append(c)
    return out


def _terms(text: str) -> list[str]:
    seen: dict[str, None] = {}
    for m in _TERM_RE.finditer(text):
        # The pattern allows '.', so drop a sentence-final period ("vLLM." and "vLLM" are one term).
        seen.setdefault(m.group(0).rstrip("."), None)
    return list(seen)


def _request(
    messages: list[dict],
    *,
    url: str,
    api_key: str,
    model: str,
    retries: int,
    timeout: float,
) -> str:
    payload = {"model": model, "temperature": 0, "messages": messages}
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json", "Authorization": "Bearer " + api_key},
    )
    for attempt in range(1, retries + 1):
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                return json.loads(resp.read())["choices"][0]["message"]["content"]
        except urllib.error.HTTPError:
            raise  # a real HTTP response (401/4xx): retrying is pointless
        except (urllib.error.URLError, OSError):  # transient
            if attempt == retries:
                raise
            time.sleep(1.0 * attempt)
    raise RuntimeError("unreachable")


def correct(
    text: str,
    *,
    base_url: str | None,
    api_key: str | None,
    model: str = "copilot-gpt-4o",
    max_chars: int = 3000,
    retries: int = 4,
    timeout: float = 90.0,
) -> str:
    """Restore transliterated English terms. Splits into chunks of max_chars (for small context windows)."""
    if not base_url or not api_key:
        raise LLMNotConfigured("llm_base_url/llm_api_key is not set; correct requires SCRIBE_LLM_*")
    url = base_url.rstrip("/") + "/chat/completions"
    chunks = _chunk(text, max_chars)
    logger.info(
        "llm-correct egress endpoint=%s model=%s chars=%d chunks=%d",
        url, model, len(text), len(chunks),
    )
    glossary: dict[str, None] = {}
    out: list[str] = []
    for chunk in chunks:
        system = SYSTEM
        if glossary:
            # Korean, sent to the model. Gist: "English spellings already settled in this transcript: ...
            # Use these spellings for the same or similar terms."
            system += (
                "\n이미 이 전사에서 확정된 영문 표기: "
                + ", ".join(glossary)
                + ". 같은/유사 용어는 이 표기로 통일하라."
            )
        corrected = _request(
            [{"role": "system", "content": system}, {"role": "user", "content": chunk}],
            url=url,
            api_key=api_key,
            model=model,
            retries=retries,
            timeout=timeout,
        )
        out.append(corrected)
        for term in _terms(corrected):
            glossary.setdefault(term, None)
        if len(glossary) > _GLOSSARY_CAP:  # keep only the most recently added terms
            glossary = dict(list(glossary.items())[-_GLOSSARY_CAP:])
    return " ".join(out).strip()
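The chunk-and-carry loop described in the module docstring can be exercised offline. This is a minimal sketch under stated assumptions, not the module's implementation: `pack`, `correct_with_glossary`, and the `fake_llm` stand-in are hypothetical names, the splitter only handles `.`, `!`, `?`, and no network call is made.

```python
from __future__ import annotations

import re
from collections.abc import Callable

SENT_RE = re.compile(r"(?<=[.!?])\s+")  # simplified sentence boundary
TERM_RE = re.compile(r"[A-Za-z][A-Za-z0-9.+/#-]+")  # English-looking tokens


def pack(text: str, max_chars: int) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars."""
    chunks: list[str] = []
    cur = ""
    for sent in SENT_RE.split(text):
        if cur and len(cur) + 1 + len(sent) > max_chars:
            chunks.append(cur)
            cur = sent
        else:
            cur = f"{cur} {sent}" if cur else sent
    if cur:
        chunks.append(cur)
    return chunks


def correct_with_glossary(
    text: str, llm: Callable[[str, list[str]], str], max_chars: int
) -> tuple[str, list[list[str]]]:
    """Correct chunk by chunk, passing the terms seen so far to the next call."""
    glossary: dict[str, None] = {}  # dict keeps insertion order and deduplicates
    seen_by_call: list[list[str]] = []  # glossary snapshot handed to each call
    out: list[str] = []
    for chunk in pack(text, max_chars):
        seen_by_call.append(list(glossary))
        fixed = llm(chunk, list(glossary))
        out.append(fixed)
        for m in TERM_RE.finditer(fixed):
            glossary.setdefault(m.group(0).rstrip("."), None)
    return " ".join(out), seen_by_call


def fake_llm(chunk: str, glossary: list[str]) -> str:
    # Stand-in for the real model: pretends to restore one transliterated term.
    return chunk.replace("브이엘엘엠", "vLLM")


text = "먼저 브이엘엘엠을 띄웁니다. 그다음 서빙합니다. 마지막으로 지연을 잽니다."
result, seen = correct_with_glossary(text, fake_llm, max_chars=20)
```

With `max_chars=20` each sentence lands in its own chunk, so the second and third calls receive `vLLM` as already-settled spelling while the first call receives nothing.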
+18
@@ -0,0 +1,18 @@
"""Deterministic normalization (spec §7 stage 2). Applied after LLM restoration to fix the exact spelling.

Finding: the LLM restores 'Embedding Gemma'; rules then normalize it to the official spelling 'EmbeddingGemma'.
"""
from __future__ import annotations

# Built-in default map (extendable via config/glossary)
DEFAULT_RULES: dict[str, str] = {
    "Embedding Gemma": "EmbeddingGemma",
    "embedding gemma": "EmbeddingGemma",
    "Google for developers": "Google for Developers",
}


def normalize(text: str, extra: dict[str, str] | None = None) -> str:
    for src, dst in {**DEFAULT_RULES, **(extra or {})}.items():
        text = text.replace(src, dst)
    return text
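Sequential `str.replace` calls are order-sensitive once rule keys overlap, which may matter when the map is extended from config. The sketch below shows one alternative, a single-pass regex alternation with longer keys tried first. It is a swapped-in technique for illustration, not what `normalize` does; `normalize_single_pass` and the sample rules are hypothetical.

```python
from __future__ import annotations

import re


def normalize_single_pass(text: str, rules: dict[str, str]) -> str:
    """Apply literal replacements in one pass, preferring longer source strings."""
    if not rules:
        return text
    # Longest keys first so "embedding gemma" wins over its prefix "embedding".
    keys = sorted(rules, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Each match is replaced once; replacement text is never rescanned.
    return pattern.sub(lambda m: rules[m.group(0)], text)


# Hypothetical overlapping rules.
rules = {"embedding": "Embedding", "embedding gemma": "EmbeddingGemma"}
fixed = normalize_single_pass("embedding gemma and embedding", rules)
```

Applying the same two rules with sequential `str.replace` in dict order would rewrite the shorter key first and leave `Embedding gemma` behind.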
+1
@@ -0,0 +1 @@
"""Result formatting and storage (spec §4). MVP: output formats (txt/srt/vtt)."""
+45
@@ -0,0 +1,45 @@
"""Convert segments to txt/srt/vtt (spec §4, AC-9). A segment is a dict {start, end, text}."""
from __future__ import annotations

from collections.abc import Sequence

Segment = dict  # {"start": float, "end": float, "text": str}


def _hms(t: float) -> tuple[int, int, int, int]:
    # Round to whole milliseconds first so a rounding carry propagates into seconds, minutes and hours
    # (the previous version could emit a seconds field of 60, e.g. for t=59.9996).
    total_ms = int(round(max(0.0, t) * 1000))
    s, ms = divmod(total_ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return h, m, s, ms


def _ts_srt(t: float) -> str:
    h, m, s, ms = _hms(t)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def _ts_vtt(t: float) -> str:
    h, m, s, ms = _hms(t)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


def to_txt(segments: Sequence[Segment]) -> str:
    return "\n".join(s["text"].strip() for s in segments)


def to_srt(segments: Sequence[Segment]) -> str:
    out: list[str] = []
    for i, s in enumerate(segments, 1):
        out += [str(i), f"{_ts_srt(s['start'])} --> {_ts_srt(s['end'])}", s["text"].strip(), ""]
    return "\n".join(out).strip() + "\n"


def to_vtt(segments: Sequence[Segment]) -> str:
    out: list[str] = ["WEBVTT", ""]
    for s in segments:
        out += [f"{_ts_vtt(s['start'])} --> {_ts_vtt(s['end'])}", s["text"].strip(), ""]
    return "\n".join(out).strip() + "\n"
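A minimal sketch of SRT timestamp formatting that rounds to integer milliseconds before splitting into fields, so a value just under a minute or hour boundary carries cleanly. `srt_timestamp` is a hypothetical standalone helper written for illustration, not a function in this module.

```python
def srt_timestamp(t: float) -> str:
    """Format seconds as HH:MM:SS,mmm, rounding to whole milliseconds first."""
    total_ms = int(round(max(0.0, t) * 1000))  # negative inputs clamp to zero
    s, ms = divmod(total_ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


plain = srt_timestamp(1.5)
minute_carry = srt_timestamp(59.9996)  # rounds up to exactly one minute
hour_carry = srt_timestamp(3599.9999)  # rounds up to exactly one hour
```

Working in integer milliseconds keeps every field in range, whereas splitting first and rounding the fraction afterwards needs a separate carry for each field.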
+86
@@ -0,0 +1,86 @@
"""API tests with FastAPI TestClient. The engine is monkeypatched with a fake to avoid loading a model."""
from __future__ import annotations

from types import SimpleNamespace

import pytest
from fastapi.testclient import TestClient

import luke_scribe.api.routes.transcribe as route
from luke_scribe.api.app import create_app
from luke_scribe.config import settings


class _FakeSeg:
    def __init__(self, start: float, end: float, text: str) -> None:
        self.start = start
        self.end = end
        self.text = text


class _FakeEngine:
    def transcribe(self, _audio, **_kw):
        return [_FakeSeg(0.0, 1.0, "안녕 vLLM"), _FakeSeg(1.0, 2.0, "두번째")], SimpleNamespace(
            language="ko"
        )


@pytest.fixture
def client(monkeypatch):
    monkeypatch.setattr(route, "get_engine", lambda *a, **k: _FakeEngine())
    monkeypatch.setattr(
        route, "probe_media", lambda p: SimpleNamespace(path=p, duration_s=2.0, size_bytes=1234)
    )
    monkeypatch.setattr(settings, "api_keys", ["testkey"])
    return TestClient(create_app())


def _files():
    return {"file": ("a.wav", b"RIFF0000WAVE", "audio/wav")}


def test_health(client):
    assert client.get("/health").json() == {"status": "ok"}


def test_requires_key(client):
    assert client.post("/v1/transcribe", files=_files()).status_code == 401


def test_transcribe_ok(client):
    r = client.post(
        "/v1/transcribe", files=_files(), headers={"X-API-Key": "testkey"}, data={"language": "ko"}
    )
    assert r.status_code == 200
    body = r.json()
    assert body["segments"][0]["text"] == "안녕 vLLM"
    assert body["model_used"]
    assert body["corrected"] is False


def test_413(client, monkeypatch):
    monkeypatch.setattr(
        route, "probe_media", lambda p: SimpleNamespace(path=p, duration_s=999999, size_bytes=1)
    )
    r = client.post("/v1/transcribe", files=_files(), headers={"X-API-Key": "testkey"})
    assert r.status_code == 413


def test_srt_format(client):
    r = client.post(
        "/v1/transcribe",
        files=_files(),
        headers={"X-API-Key": "testkey"},
        data={"response_format": "srt"},
    )
    assert r.status_code == 200
    assert "00:00:00,000 --> 00:00:01,000" in r.text


def test_correct_path(client, monkeypatch):
    monkeypatch.setattr(route.llm_correct, "correct", lambda text, **k: text + " [보정]")
    r = client.post(
        "/v1/transcribe", files=_files(), headers={"X-API-Key": "testkey"}, data={"correct": "true"}
    )
    assert r.status_code == 200
    assert r.json()["corrected"] is True
+25
@@ -0,0 +1,25 @@
"""results.formats: txt/srt/vtt."""
from __future__ import annotations

from luke_scribe.results import formats

SEGS = [
    {"start": 0.0, "end": 1.5, "text": "안녕 world"},
    {"start": 1.5, "end": 3.0, "text": "두번째"},
]


def test_txt():
    assert formats.to_txt(SEGS) == "안녕 world\n두번째"


def test_srt():
    out = formats.to_srt(SEGS)
    assert "1\n00:00:00,000 --> 00:00:01,500\n안녕 world" in out
    assert "2\n00:00:01,500 --> 00:00:03,000\n두번째" in out


def test_vtt():
    out = formats.to_vtt(SEGS)
    assert out.startswith("WEBVTT")
    assert "00:00:00.000 --> 00:00:01.500" in out
+59
@@ -0,0 +1,59 @@
"""postprocess.rules / postprocess.llm (urllib monkeypatch)."""
from __future__ import annotations

import json

import pytest

from luke_scribe.postprocess import llm, rules


def test_rules_normalize():
    assert rules.normalize("구글 Embedding Gemma 소개") == "구글 EmbeddingGemma 소개"
    assert rules.normalize("그대로") == "그대로"


def test_llm_not_configured():
    with pytest.raises(llm.LLMNotConfigured):
        llm.correct("x", base_url=None, api_key=None)


class _FakeResp:
    def __init__(self, payload: dict) -> None:
        self._p = payload

    def read(self) -> bytes:
        return json.dumps(self._p).encode()

    def __enter__(self):
        return self

    def __exit__(self, *_a):
        return False


def test_llm_correct_monkeypatched(monkeypatch):
    def fake_urlopen(_req, timeout=90):  # noqa: ARG001
        return _FakeResp({"choices": [{"message": {"content": "EmbeddingGemma 복원됨"}}]})

    monkeypatch.setattr(llm.urllib.request, "urlopen", fake_urlopen)
    out = llm.correct("인베딩 점마", base_url="http://x/v1", api_key="k", model="m")
    assert out == "EmbeddingGemma 복원됨"


def test_llm_chunking_and_glossary(monkeypatch):
    """Long input is split into chunks with a running glossary (for small context windows)."""
    calls: list[list[dict]] = []

    def fake_request(messages, **_kw):
        calls.append(messages)
        return messages[1]["content"]  # echo the chunk unchanged

    monkeypatch.setattr(llm, "_request", fake_request)
    long_text = ". ".join(f"문장{i} EmbeddingGemma 설명" for i in range(400))
    out = llm.correct(long_text, base_url="http://x/v1", api_key="k", max_chars=200)
    assert len(calls) > 1  # was split
    assert "EmbeddingGemma" in out  # was reassembled
    # From the 2nd chunk on, previously settled English spellings are injected into the system message
    assert any("확정된 영문 표기" in m[0]["content"] for m in calls[1:])
Generated
+37 -22
@@ -521,7 +521,7 @@ name = "cuda-bindings"
version = "13.3.1" version = "13.3.1"
source = { registry = "https://pypi.org/simple" } source = { registry = "https://pypi.org/simple" }
dependencies = [ dependencies = [
{ name = "cuda-pathfinder" }, { name = "cuda-pathfinder", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
] ]
wheels = [ wheels = [
{ url = "https://files.pythonhosted.org/packages/51/6b/457ca12dad3ee9bfcc9a545cfd6b64b359ba49de40f776f6e028e678f262/cuda_bindings-13.3.1-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c5879712accf6e14bb01aa5e67440eb84998b8d104b509cc7a6dc0b8f656a474", size = 6053539, upload-time = "2026-05-29T23:11:43.19Z" }, { url = "https://files.pythonhosted.org/packages/51/6b/457ca12dad3ee9bfcc9a545cfd6b64b359ba49de40f776f6e028e678f262/cuda_bindings-13.3.1-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c5879712accf6e14bb01aa5e67440eb84998b8d104b509cc7a6dc0b8f656a474", size = 6053539, upload-time = "2026-05-29T23:11:43.19Z" },
@@ -554,34 +554,34 @@ wheels = [
[package.optional-dependencies] [package.optional-dependencies]
cudart = [ cudart = [
{ name = "nvidia-cuda-runtime", marker = "sys_platform == 'linux' or sys_platform == 'win32'" }, { name = "nvidia-cuda-runtime", marker = "sys_platform == 'linux'" },
] ]
cufft = [ cufft = [
{ name = "nvidia-cufft", marker = "sys_platform == 'linux' or sys_platform == 'win32'" }, { name = "nvidia-cufft", marker = "sys_platform == 'linux'" },
] ]
cufile = [ cufile = [
{ name = "nvidia-cufile", marker = "sys_platform == 'linux'" }, { name = "nvidia-cufile", marker = "sys_platform == 'linux'" },
] ]
cupti = [ cupti = [
{ name = "nvidia-cuda-cupti", marker = "sys_platform == 'linux' or sys_platform == 'win32'" }, { name = "nvidia-cuda-cupti", marker = "sys_platform == 'linux'" },
] ]
curand = [ curand = [
{ name = "nvidia-curand", marker = "sys_platform == 'linux' or sys_platform == 'win32'" }, { name = "nvidia-curand", marker = "sys_platform == 'linux'" },
] ]
cusolver = [ cusolver = [
{ name = "nvidia-cusolver", marker = "sys_platform == 'linux' or sys_platform == 'win32'" }, { name = "nvidia-cusolver", marker = "sys_platform == 'linux'" },
] ]
cusparse = [ cusparse = [
{ name = "nvidia-cusparse", marker = "sys_platform == 'linux' or sys_platform == 'win32'" }, { name = "nvidia-cusparse", marker = "sys_platform == 'linux'" },
] ]
nvjitlink = [ nvjitlink = [
{ name = "nvidia-nvjitlink", marker = "sys_platform == 'linux' or sys_platform == 'win32'" }, { name = "nvidia-nvjitlink", marker = "sys_platform == 'linux'" },
] ]
nvrtc = [ nvrtc = [
{ name = "nvidia-cuda-nvrtc", marker = "sys_platform == 'linux' or sys_platform == 'win32'" }, { name = "nvidia-cuda-nvrtc", marker = "sys_platform == 'linux'" },
] ]
nvtx = [ nvtx = [
{ name = "nvidia-nvtx", marker = "sys_platform == 'linux' or sys_platform == 'win32'" }, { name = "nvidia-nvtx", marker = "sys_platform == 'linux'" },
] ]
[[package]] [[package]]
@@ -1384,8 +1384,7 @@ dependencies = [
[package.optional-dependencies] [package.optional-dependencies]
api = [ api = [
{ name = "fastapi" }, { name = "fastapi" },
{ name = "redis" }, { name = "python-multipart" },
{ name = "rq" },
{ name = "uvicorn", extra = ["standard"] }, { name = "uvicorn", extra = ["standard"] },
] ]
diarize = [ diarize = [
@@ -1402,9 +1401,14 @@ gpu = [
llm = [ llm = [
{ name = "openai" }, { name = "openai" },
] ]
queue = [
{ name = "redis" },
{ name = "rq" },
]
[package.dev-dependencies] [package.dev-dependencies]
dev = [ dev = [
{ name = "httpx" },
{ name = "pytest" }, { name = "pytest" },
{ name = "ruff" }, { name = "ruff" },
] ]
@@ -1423,16 +1427,18 @@ requires-dist = [
{ name = "pyannote-audio", marker = "extra == 'diarize'", specifier = ">=3.1" }, { name = "pyannote-audio", marker = "extra == 'diarize'", specifier = ">=3.1" },
{ name = "pydantic", specifier = ">=2.7" }, { name = "pydantic", specifier = ">=2.7" },
{ name = "pydantic-settings", specifier = ">=2.3" }, { name = "pydantic-settings", specifier = ">=2.3" },
{ name = "redis", marker = "extra == 'api'", specifier = ">=5.0" }, { name = "python-multipart", marker = "extra == 'api'", specifier = ">=0.0.9" },
{ name = "redis", marker = "extra == 'queue'", specifier = ">=5.0" },
{ name = "rich", specifier = ">=13.7" }, { name = "rich", specifier = ">=13.7" },
{ name = "rq", marker = "extra == 'api'", specifier = ">=1.16" }, { name = "rq", marker = "extra == 'queue'", specifier = ">=1.16" },
{ name = "typer", specifier = ">=0.12" }, { name = "typer", specifier = ">=0.12" },
{ name = "uvicorn", extras = ["standard"], marker = "extra == 'api'", specifier = ">=0.29" }, { name = "uvicorn", extras = ["standard"], marker = "extra == 'api'", specifier = ">=0.29" },
] ]
provides-extras = ["engine", "gpu", "api", "diarize", "llm"] provides-extras = ["engine", "gpu", "api", "queue", "diarize", "llm"]
[package.metadata.requires-dev] [package.metadata.requires-dev]
dev = [ dev = [
{ name = "httpx", specifier = ">=0.27" },
{ name = "pytest", specifier = ">=8.2" }, { name = "pytest", specifier = ">=8.2" },
{ name = "ruff", specifier = ">=0.5" }, { name = "ruff", specifier = ">=0.5" },
] ]
@@ -1836,7 +1842,7 @@ name = "nvidia-cublas"
version = "13.1.1.3" version = "13.1.1.3"
source = { registry = "https://pypi.org/simple" } source = { registry = "https://pypi.org/simple" }
dependencies = [ dependencies = [
{ name = "nvidia-cuda-nvrtc" }, { name = "nvidia-cuda-nvrtc", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
] ]
wheels = [ wheels = [
{ url = "https://files.pythonhosted.org/packages/a7/a1/0bd24ee8c8d03adac032fd2909426a00c88f8c57961b1277ded97f91119f/nvidia_cublas-13.1.1.3-py3-none-manylinux_2_27_aarch64.whl", hash = "sha256:b7a210458267ac818974c53038fbec2e969d5c99f305ab15c72522fa9f001dd5", size = 542848918, upload-time = "2026-04-08T18:46:22.985Z" }, { url = "https://files.pythonhosted.org/packages/a7/a1/0bd24ee8c8d03adac032fd2909426a00c88f8c57961b1277ded97f91119f/nvidia_cublas-13.1.1.3-py3-none-manylinux_2_27_aarch64.whl", hash = "sha256:b7a210458267ac818974c53038fbec2e969d5c99f305ab15c72522fa9f001dd5", size = 542848918, upload-time = "2026-04-08T18:46:22.985Z" },
@@ -1911,7 +1917,7 @@ name = "nvidia-cudnn-cu13"
version = "9.20.0.48" version = "9.20.0.48"
source = { registry = "https://pypi.org/simple" } source = { registry = "https://pypi.org/simple" }
dependencies = [ dependencies = [
{ name = "nvidia-cublas" }, { name = "nvidia-cublas", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
] ]
wheels = [ wheels = [
{ url = "https://files.pythonhosted.org/packages/56/c5/83384d846b2fd17c44bd499b36c75a45ed4f095fbbb2252294e89cea5c5c/nvidia_cudnn_cu13-9.20.0.48-py3-none-manylinux_2_27_aarch64.whl", hash = "sha256:e31454ae00094b0c55319d9d15b6fa2fc50a9e1c0f5c8c80fb75258234e731e1", size = 444574296, upload-time = "2026-03-09T19:28:27.751Z" }, { url = "https://files.pythonhosted.org/packages/56/c5/83384d846b2fd17c44bd499b36c75a45ed4f095fbbb2252294e89cea5c5c/nvidia_cudnn_cu13-9.20.0.48-py3-none-manylinux_2_27_aarch64.whl", hash = "sha256:e31454ae00094b0c55319d9d15b6fa2fc50a9e1c0f5c8c80fb75258234e731e1", size = 444574296, upload-time = "2026-03-09T19:28:27.751Z" },
@@ -1923,7 +1929,7 @@ name = "nvidia-cufft"
version = "12.0.0.61" version = "12.0.0.61"
source = { registry = "https://pypi.org/simple" } source = { registry = "https://pypi.org/simple" }
dependencies = [ dependencies = [
{ name = "nvidia-nvjitlink" }, { name = "nvidia-nvjitlink", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
] ]
wheels = [ wheels = [
{ url = "https://files.pythonhosted.org/packages/8b/ae/f417a75c0259e85c1d2f83ca4e960289a5f814ed0cea74d18c353d3e989d/nvidia_cufft-12.0.0.61-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:2708c852ef8cd89d1d2068bdbece0aa188813a0c934db3779b9b1faa8442e5f5", size = 214053554, upload-time = "2025-09-04T08:31:38.196Z" }, { url = "https://files.pythonhosted.org/packages/8b/ae/f417a75c0259e85c1d2f83ca4e960289a5f814ed0cea74d18c353d3e989d/nvidia_cufft-12.0.0.61-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:2708c852ef8cd89d1d2068bdbece0aa188813a0c934db3779b9b1faa8442e5f5", size = 214053554, upload-time = "2025-09-04T08:31:38.196Z" },
@@ -1953,9 +1959,9 @@ name = "nvidia-cusolver"
version = "12.0.4.66" version = "12.0.4.66"
source = { registry = "https://pypi.org/simple" } source = { registry = "https://pypi.org/simple" }
dependencies = [ dependencies = [
{ name = "nvidia-cublas" }, { name = "nvidia-cublas", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
{ name = "nvidia-cusparse" }, { name = "nvidia-cusparse", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
{ name = "nvidia-nvjitlink" }, { name = "nvidia-nvjitlink", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
] ]
wheels = [ wheels = [
{ url = "https://files.pythonhosted.org/packages/c8/c3/b30c9e935fc01e3da443ec0116ed1b2a009bb867f5324d3f2d7e533e776b/nvidia_cusolver-12.0.4.66-py3-none-manylinux_2_27_aarch64.whl", hash = "sha256:02c2457eaa9e39de20f880f4bd8820e6a1cfb9f9a34f820eb12a155aa5bc92d2", size = 223467760, upload-time = "2025-09-04T08:33:04.222Z" }, { url = "https://files.pythonhosted.org/packages/c8/c3/b30c9e935fc01e3da443ec0116ed1b2a009bb867f5324d3f2d7e533e776b/nvidia_cusolver-12.0.4.66-py3-none-manylinux_2_27_aarch64.whl", hash = "sha256:02c2457eaa9e39de20f880f4bd8820e6a1cfb9f9a34f820eb12a155aa5bc92d2", size = 223467760, upload-time = "2025-09-04T08:33:04.222Z" },
@@ -1967,7 +1973,7 @@ name = "nvidia-cusparse"
version = "12.6.3.3" version = "12.6.3.3"
source = { registry = "https://pypi.org/simple" } source = { registry = "https://pypi.org/simple" }
dependencies = [ dependencies = [
{ name = "nvidia-nvjitlink" }, { name = "nvidia-nvjitlink", marker = "sys_platform != 'emscripten' and sys_platform != 'win32'" },
] ]
wheels = [ wheels = [
{ url = "https://files.pythonhosted.org/packages/f8/94/5c26f33738ae35276672f12615a64bd008ed5be6d1ebcb23579285d960a9/nvidia_cusparse-12.6.3.3-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:80bcc4662f23f1054ee334a15c72b8940402975e0eab63178fc7e670aa59472c", size = 162155568, upload-time = "2025-09-04T08:33:42.864Z" }, { url = "https://files.pythonhosted.org/packages/f8/94/5c26f33738ae35276672f12615a64bd008ed5be6d1ebcb23579285d960a9/nvidia_cusparse-12.6.3.3-py3-none-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:80bcc4662f23f1054ee334a15c72b8940402975e0eab63178fc7e670aa59472c", size = 162155568, upload-time = "2025-09-04T08:33:42.864Z" },
@@ -2834,6 +2840,15 @@ wheels = [
{ url = "https://files.pythonhosted.org/packages/0b/d7/1959b9648791274998a9c3526f6d0ec8fd2233e4d4acce81bbae76b44b2a/python_dotenv-1.2.2-py3-none-any.whl", hash = "sha256:1d8214789a24de455a8b8bd8ae6fe3c6b69a5e3d64aa8a8e5d68e694bbcb285a", size = 22101, upload-time = "2026-03-01T16:00:25.09Z" }, { url = "https://files.pythonhosted.org/packages/0b/d7/1959b9648791274998a9c3526f6d0ec8fd2233e4d4acce81bbae76b44b2a/python_dotenv-1.2.2-py3-none-any.whl", hash = "sha256:1d8214789a24de455a8b8bd8ae6fe3c6b69a5e3d64aa8a8e5d68e694bbcb285a", size = 22101, upload-time = "2026-03-01T16:00:25.09Z" },
] ]
[[package]]
name = "python-multipart"
version = "0.0.32"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/5b/42/55c32bb9b12693c092ad250a0e82edb5b31ddeda6eb772de5f308b3804ad/python_multipart-0.0.32.tar.gz", hash = "sha256:be54b7f3fa167bb83e4fcd936b887b708f4e57fe75911c02aebf53efaf8d938e", size = 46881, upload-time = "2026-06-04T16:18:58.647Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/e1/04/e8135ebd1ad02c56ec633277529b2602ff99ff634be76cdba5744cf554fd/python_multipart-0.0.32-py3-none-any.whl", hash = "sha256:ff6d3f776f16878c894e52e107296ffc890e913c611b1a4ec6c44e2821fe2e23", size = 30042, upload-time = "2026-06-04T16:18:57.319Z" },
]
[[package]] [[package]]
name = "pytorch-lightning" name = "pytorch-lightning"
version = "2.6.5" version = "2.6.5"