Project: Audio Transcription Pipeline
Status Legend
pending— not startedin_progress— currently activecompleted— done and verifiedblocked— waiting on dependency
Tasks
T1 · Set up S3 bucket and IAM roles
- status:
completed - blocks: T2, T3
- description: Create S3 bucket for raw Zoom recordings, output transcripts, and configure IAM roles for read/write access.
- notes: Bucket name:
zoom-transcripts-prod. IAM role ARN saved to.env.example.
T2 · Build Zoom webhook receiver
- status:
completed - blocks: T5
- blockedBy: T1
- description: FastAPI endpoint that receives Zoom recording-complete webhooks, validates signatures, and enqueues the download job.
- notes: Deployed to Lambda. Webhook secret stored in AWS Secrets Manager.
T3 · Implement S3 download worker
- status:
completed - blocks: T5
- blockedBy: T1
- description: Worker that pulls raw
.m4afiles from Zoom cloud storage and stages them in S3 underraw/. - notes: Handles Zoom OAuth token refresh automatically.
T4 · Integrate Whisper transcription
- status:
in_progress - blocks: T6, T7
- blockedBy: none
- description: Wrap OpenAI Whisper API (or local model) to transcribe audio files. Output raw transcript JSON with word-level timestamps.
- notes: Currently testing
whisper-large-v3. Cost ~$0.006/min. Need to decide on local vs. API.
T5 · Wire webhook → download → transcription pipeline
- status:
pending - blocks: T8
- blockedBy: T2, T3, T4
- description: Connect the three components end-to-end. Webhook triggers download worker; on completion, enqueues transcription job.
- notes: Use SQS queues between stages for resilience.
T6 · Build diarization layer
- status:
pending - blocks: T7
- blockedBy: T4
- description: Add speaker diarization using pyannote.audio. Merge diarization output with Whisper timestamps to produce speaker-labeled transcript.
- notes: Requires HuggingFace token. pyannote model:
pyannote/speaker-diarization-3.1.
T7 · LLM post-processing (cleanup + summary)
- status:
pending - blocks: T8
- blockedBy: T4, T6
- description: Pass labeled transcript through Claude to: (1) clean filler words, (2) generate structured meeting summary with action items.
- notes: Use claude-sonnet-4-6. Prompt template in
prompts/summary.md.
T8 · Output delivery (email + Notion)
- status:
pending - blocks: none
- blockedBy: T5, T7
- description: Deliver final transcript and summary via: (1) email to meeting host, (2) Notion page in shared workspace.
- notes: Notion API token in Secrets Manager. Email via SES.
Dependency Graph
T1 (completed)
├── T2 (completed) ──────────────────────┐
└── T3 (completed) ──────────────────────┤
▼
T4 (in_progress) ──────────────────── T5 (pending) ──┐
├── T6 (pending) │
│ └── T7 (pending) ──────────────────────────────── T8 (pending)
└──────────────────┘
Session Notes
- 2026-03-01: T1, T2, T3 completed. T4 started — testing Whisper models.
- 2026-03-04: T4 still in progress. Blocked on local vs. API cost decision. T6, T7, T8 not yet started.
Hydration Instructions for Claude
When asked to hydrate this plan:
- Create a Task for each
Txentry - Set
statusfrom the status field - Wire
addBlockedByfrom each task’sblockedBylist - Skip creating tasks with status
completed— just note them as context - Begin work on tasks whose blockedBy tasks are all
completedor absent