mcptube
YouTube video knowledge engine with transcripts and vision
SYNOPSIS
mcptube <command> [options]
DESCRIPTION
mcptube is a YouTube video knowledge engine that extracts metadata, transcripts, and frames from YouTube videos, indexes them for semantic search, and exposes everything as both a CLI tool and an MCP (Model Context Protocol) server.
The tool builds a persistent wiki knowledge base that grows richer with each video ingested, rather than treating videos as isolated searchable chunks. It uses scene-change detection instead of fixed-interval sampling to capture high-information-density frames. Search combines FTS5 keyword matching with LLM-powered reasoning for hybrid retrieval.
mcptube operates in two modes: CLI mode, which uses your own API keys (Anthropic, OpenAI, or Google) for deterministic results, and MCP passthrough mode, where the connected AI assistant analyzes data using its own model, avoiding double-billing.
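To illustrate the difference between scene-change detection and fixed-interval sampling mentioned above, here is a minimal sketch in Python. It is not mcptube's implementation: the frame representation (a flat list of grayscale pixel values), the function names, and the threshold of 30 are all assumptions chosen for illustration. The idea is to keep a frame only when it differs enough from the last frame that was kept.

```python
from typing import Sequence

# Hypothetical frame type: flattened grayscale pixels in the range 0-255.
Frame = Sequence[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equal-sized frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def scene_change_indices(frames: Sequence[Frame], threshold: float = 30.0) -> list[int]:
    """Return the indices of frames that start a new scene.

    The first frame is always kept. A later frame is kept only when it
    differs from the most recently kept frame by more than `threshold`
    (an assumed value, not one taken from mcptube).
    """
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[kept[-1]], frames[i]) > threshold:
            kept.append(i)
    return kept


# Toy "video": three dark frames, three bright frames, then one dark frame.
dark, bright = [10] * 16, [200] * 16
video = [dark, dark, dark, bright, bright, bright, dark]

selected = scene_change_indices(video)          # frames where content changes
fixed_interval = list(range(0, len(video), 2))  # every second frame, for contrast

print(selected)
print(fixed_interval)
```

In this toy input the scene-based selection keeps one frame per visual change, while the fixed-interval selection keeps near-duplicate frames and can land on either side of a cut by chance.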
PARAMETERS
add url [--text-only]
Add a YouTube video to the library; use --text-only to skip frame extraction
remove query
Remove a video from the library
list
List all videos in the library
info query
Show detailed information about a video
search query
Search across video transcripts
ask question
Ask a natural language question about video content
frame query timestamp
Extract a frame at a specific timestamp
frame-query query description
Find frames matching a visual description
classify query
Classify video content
report query [--focus topic] [--format html] [-o file]
Generate a report about a video
report-query topic [--tag tag]
Generate a report across videos by topic
discover topic
Discover new videos related to a topic
wiki list [--type type] [--tag tag]
List wiki pages
wiki show slug
Display a wiki page
wiki search query
Search wiki content
wiki toc
Show wiki table of contents
wiki export [--format html] [--page slug]
Export wiki pages
serve [--stdio] [--host host] [--port port] [--reload]
Start the MCP server
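The keyword half of the search command is described as FTS5 matching over transcripts. The sketch below shows what such a lookup might look like with Python's built-in sqlite3 module. The table name, column names, sample rows, and the keyword_search helper are hypothetical and do not reflect mcptube's actual schema; the sketch also assumes the local SQLite build has FTS5 enabled, which is common but not guaranteed.

```python
import sqlite3

# In-memory database with a hypothetical transcript-segment table.
# video_id and start_seconds are stored but excluded from the full-text index.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE segments "
    "USING fts5(video_id UNINDEXED, start_seconds UNINDEXED, text)"
)

# Made-up sample rows for illustration only.
rows = [
    ("vid_a", 0, "Welcome to this introduction to gradient descent"),
    ("vid_a", 42, "The learning rate controls the step size"),
    ("vid_b", 10, "Transformers use attention instead of recurrence"),
]
conn.executemany("INSERT INTO segments VALUES (?, ?, ?)", rows)


def keyword_search(db: sqlite3.Connection, query: str, limit: int = 5) -> list[tuple]:
    """Return matching segments ordered by FTS5's built-in relevance rank."""
    cur = db.execute(
        "SELECT video_id, start_seconds, text FROM segments "
        "WHERE segments MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return cur.fetchall()


# Bare words in an FTS5 query are combined with an implicit AND.
hits = keyword_search(conn, "learning rate")
for video_id, start, text in hits:
    print(video_id, start, text)
```

A hybrid setup like the one described would presumably pass candidates such as these to an LLM for reasoning; that step is omitted here because the source does not specify how it works.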
CAVEATS
Requires Python 3.12 or 3.13 (ChromaDB is not compatible with Python 3.14) and ffmpeg for frame extraction. Full feature set requires API keys from at least one LLM provider (Anthropic, OpenAI, or Google). The MCP server is currently local-only. Text-only mode is available for cost reduction when vision features are not needed.
HISTORY
mcptube was created by 0xchamin and written in Python. It evolved from a simple transcript search tool into a full video knowledge engine with wiki capabilities, vision-based frame analysis, and MCP server integration for use with AI coding assistants.
SEE ALSO
yt-dlp(1), ffmpeg(1), youtube-dl(1)
