Skill · JavaScript · v1.0.0

Douyin Video Transcribe

Extract audio from Douyin (抖音, TikTok China) videos and transcribe it to text using Whisper. Trigger when the user sends a Douyin link (v.douyin.com or www.douyin.com/video/) and asks to transcribe the video, extract its text, analyze its content, or summarize it.

don068589
Updated Mar 12, 2026

Extract speech from Douyin videos and convert it to text. Supports Chinese and English, and runs on Windows, macOS, and Linux.

Core Principle

Douyin enforces strict anti-scraping measures, so the workflow must:

  1. Load page in browser, wait for video stream
  2. Extract real CDN URL from DOM or network requests
  3. Download with Referer: https://www.douyin.com/ header (403 without it)
  4. Convert audio to 16kHz mono WAV for Whisper

Prerequisites

| Tool | Purpose | Install |
| --- | --- | --- |
| ffmpeg | Audio extraction | brew install ffmpeg / winget install ffmpeg / apt install ffmpeg |
| whisper | Speech-to-text | pip install openai-whisper |
| curl | Download video | Built-in (Windows: curl.exe) |

Workflow

1. Resolve Short URL

Douyin share links usually take the form v.douyin.com/xxx. Resolve them to the full URL:

# macOS/Linux
curl -sL -o /dev/null -w '%{url_effective}' "https://v.douyin.com/xxx/"

# Windows PowerShell
curl.exe -sL -o NUL -w "%{url_effective}" "https://v.douyin.com/xxx/"

Output: https://www.douyin.com/video/7616020798351871284

The video ID is the 19-digit number in the URL.
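A minimal sketch of pulling the video ID out of the resolved URL. The function name `extractVideoId` is hypothetical, and the sketch accepts any run of digits after /video/ rather than insisting on exactly 19, on the assumption that ID length could vary:

```javascript
// Hypothetical helper: return the numeric video ID from a resolved Douyin URL,
// or null when the URL is not a douyin.com video page.
function extractVideoId(url) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    return null; // not a parseable URL
  }
  const host = parsed.hostname;
  if (host !== 'douyin.com' && !host.endsWith('.douyin.com')) return null;
  // Assumption: the ID is the run of digits directly after /video/.
  const match = parsed.pathname.match(/^\/video\/(\d+)/);
  return match ? match[1] : null;
}

console.log(extractVideoId('https://www.douyin.com/video/7616020798351871284'));
```

A null result here corresponds to the "short URL fails" case: the redirect did not end on a douyin.com video page.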

2. Get Video URL

Open the video page in a browser, wait 3-5 seconds, then execute this JavaScript in the page:

(() => {
  const videos = document.querySelectorAll('video');
  for (const v of videos) {
    const src = v.currentSrc || v.src;
    if (src && src.startsWith('http') && !src.includes('uuu_265')) {
      return src;
    }
  }
  return null;
})()

Key points:

  • Returns null: Page not loaded, retry after waiting
  • Contains uuu_265: Placeholder video, retry after waiting
  • Starts with "blob:": Streaming source, wait for the real URL
  • CDN URLs expire (~2 hours), re-fetch if needed
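The key points above amount to a small classify-and-retry loop, which might be sketched as follows. `classifyVideoSrc`, `waitForVideoUrl`, and the `getSrc` callback are hypothetical names; `getSrc` stands in for whatever mechanism your browser tool uses to run the snippet in the page. The 3 attempts and 4-second delay are assumptions drawn from the 3-5 second wait and the "max 3 times" retry limit in the troubleshooting notes.

```javascript
// Classify a candidate video source according to the key points above.
function classifyVideoSrc(src) {
  if (!src) return 'not_loaded';                        // page not ready yet
  if (src.startsWith('blob:')) return 'streaming';      // no real URL exposed yet
  if (src.includes('uuu_265')) return 'placeholder';    // placeholder video
  if (src.startsWith('http')) return 'ok';              // usable CDN URL
  return 'not_loaded';
}

// Retry until a usable URL appears or the attempts run out.
// getSrc: hypothetical async callback that evaluates the snippet in the page.
async function waitForVideoUrl(getSrc, { attempts = 3, delayMs = 4000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    const src = await getSrc();
    if (classifyVideoSrc(src) === 'ok') return src;
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return null; // caller should report "video URL not found"
}
```

Because CDN URLs expire, it is probably safer to call `waitForVideoUrl` right before downloading than to cache its result.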

3. Download Video

# macOS/Linux
curl -L -H "Referer: https://www.douyin.com/" -o video.mp4 "<CDN_URL>"

# Windows
curl.exe -L -H "Referer: https://www.douyin.com/" -o video.mp4 "<CDN_URL>"

The Referer header is required; without it the CDN responds with 403.
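If the download is driven from Node.js rather than typed by hand, one possible wrapper around the same curl command is shown below. `buildDownloadArgs` and `downloadVideo` are hypothetical names; the arguments mirror the command above exactly.

```javascript
const { execFile } = require('node:child_process');

// Build the curl argument list. Passing an array to execFile avoids
// shell quoting problems with the long, query-heavy CDN URL.
function buildDownloadArgs(cdnUrl, outputPath) {
  return ['-L', '-H', 'Referer: https://www.douyin.com/', '-o', outputPath, cdnUrl];
}

// Hypothetical wrapper: resolves with the output path once curl exits cleanly.
function downloadVideo(cdnUrl, outputPath) {
  const curl = process.platform === 'win32' ? 'curl.exe' : 'curl';
  return new Promise((resolve, reject) => {
    execFile(curl, buildDownloadArgs(cdnUrl, outputPath), (error) => {
      if (error) reject(error);
      else resolve(outputPath);
    });
  });
}
```

Note that curl exits with status 0 on an HTTP 403 unless `--fail` is passed, so a resolved promise does not by itself prove the file is a valid video. Checking the file size or adding `--fail` to the argument list are two ways to catch that.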

4. Extract Audio

ffmpeg -i video.mp4 -ar 16000 -ac 1 -c:a pcm_s16le audio.wav -y

Parameters:

  • -ar 16000: 16 kHz sample rate (the rate Whisper's models operate on)
  • -ac 1: Mono channel
  • -c:a pcm_s16le: 16-bit PCM
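These three parameters also fix the size of the intermediate WAV file, which helps when deciding whether to keep it. A rough sketch of the arithmetic, counting raw audio data only and ignoring the small file header (`pcmPayloadBytes` is a hypothetical name):

```javascript
// Uncompressed PCM size = sample rate * channels * bytes per sample * seconds.
function pcmPayloadBytes(seconds, sampleRate = 16000, channels = 1, bitsPerSample = 16) {
  const bytesPerSecond = sampleRate * channels * (bitsPerSample / 8);
  return bytesPerSecond * seconds;
}

// A 5-minute video: 16000 * 1 * 2 * 300 bytes.
console.log(pcmPayloadBytes(300)); // 9600000
```

So a 5-minute clip yields roughly 9.6 MB of audio, or about 1.9 MB per minute.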

5. Transcribe

python -m whisper audio.wav --model small --language zh

Model selection:

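| Model | Size | 5-min video (CPU) | Accuracy | Use case |
| --- | --- | --- | --- | --- |
| tiny | 75MB | ~30s | Fair | Quick preview |
| base | 142MB | ~1min | Good | Daily use |
| small | 466MB | ~3min | Better | Recommended |
| medium | 1.5GB | ~8min | Best | High accuracy |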

Language:

  • Chinese: --language zh
  • English: --language en
  • Auto-detect: omit flag (slower)

Whisper writes its output files to the current directory, including audio.txt, audio.srt, and audio.json.
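When timestamps are needed for analysis or summarization, the JSON output can be read directly. The sketch below assumes the usual Whisper JSON layout, a top-level `segments` array whose entries carry `start` and `end` in seconds plus `text`; the function names are hypothetical.

```javascript
// Format a number of seconds as HH:MM:SS.mmm.
function formatTimestamp(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width = 2) => String(n).padStart(width, '0');
  return `${pad(h)}:${pad(m)}:${pad(s)}.${pad(ms, 3)}`;
}

// Turn a parsed Whisper result into one timestamped line per segment.
function segmentsToLines(result) {
  return result.segments.map(
    (seg) => `[${formatTimestamp(seg.start)} - ${formatTimestamp(seg.end)}] ${seg.text.trim()}`
  );
}

// Usage, assuming audio.json exists in the current directory:
// const fs = require('node:fs');
// const result = JSON.parse(fs.readFileSync('audio.json', 'utf8'));
// console.log(segmentsToLines(result).join('\n'));
```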

Troubleshooting

| Issue | Detection | Solution |
| --- | --- | --- |
| Short URL fails | Returns non-douyin.com | Check link completeness, remove share text noise |
| Video URL not found | JS returns null | Wait 3-5s and retry, max 3 times |
| Placeholder video | URL contains uuu_265 | Page not loaded, wait and retry |
| Download 403 | curl returns 403 | Check Referer header; URL may be expired |
| Whisper hangs | No output for long time | First run downloads model (~460MB for small) |
| Garbled output | Terminal shows gibberish | Normal, read .txt file directly |
| Out of memory | Process killed | Use smaller model (base/tiny) |
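For the "share text noise" case, the link usually has to be separated from the surrounding caption before it can be resolved. A minimal sketch, assuming the link is an http(s) URL on v.douyin.com or www.douyin.com made up of ordinary ASCII URL characters (`extractDouyinLink` is a hypothetical name, and the sample text is made up for illustration):

```javascript
// Hypothetical helper: find the first Douyin link inside pasted share text.
function extractDouyinLink(shareText) {
  // ASCII-only character class so adjacent Chinese text is never swallowed.
  const match = shareText.match(/https?:\/\/(?:v|www)\.douyin\.com\/[A-Za-z0-9\-._~\/?=&%]+/);
  return match ? match[0] : null;
}

console.log(extractDouyinLink('看看这个视频 https://v.douyin.com/xxx/ 复制此链接'));
```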

Output Convention

Name files by video ID, save to user-specified directory:

output/
├── 7616020798351871284.mp4   # Original video (optional)
├── 7616020798351871284.wav   # Audio (delete after)
├── 7616020798351871284.txt   # Transcript
└── 7616020798351871284.srt   # Subtitles (optional)
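The naming convention above can be captured in one small helper so every step agrees on file locations. `outputPaths` and its property names are hypothetical:

```javascript
const path = require('node:path');

// Derive all output file paths for one video from the directory and video ID.
function outputPaths(outputDir, videoId) {
  const base = path.join(outputDir, videoId);
  return {
    video: `${base}.mp4`,      // original video (optional)
    audio: `${base}.wav`,      // intermediate audio, delete after transcription
    transcript: `${base}.txt`, // transcript
    subtitles: `${base}.srt`,  // subtitles (optional)
  };
}

console.log(outputPaths('output', '7616020798351871284').transcript);
```

Whisper names its output files after the input audio file, so transcribing `<id>.wav` should produce `<id>.txt` and `<id>.srt` without a rename step. The `--output_dir` option can point it at the target directory.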

Scripts (Optional)

Helper scripts in skill directory:

  • scripts/get_video_url.js: Browser-side video URL extraction with multiple methods
  • scripts/transcribe.py: CLI one-click transcription (requires video URL)

The scripts are optional accelerators. Once you understand the workflow, you can implement the steps yourself.

Notes

  • Article links (/article/): Use browser snapshot directly, no transcription needed
  • Douyin AI summary: Some video pages have AI-generated chapter summaries, extract from snapshot as supplement
  • Other platforms: This skill is for Douyin only. Use yt-dlp for YouTube/Bilibili