AI agents can be extended through skills. I recently built one that automates my YouTube content creation process, and the experience revealed something worth sharing.

In my existing workflow, the video description came last, after the blog post was finished. A Cursor command would combine the blog content with the transcript to produce a description with timestamps. But I wanted to flip the process: start with the transcript and work toward the blog post. When I record a video and then wait days before writing the post, the details become fuzzy. Having something refresh my memory and then interview me seemed more natural.
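To make the "description with timestamps" part concrete, here is a minimal Python sketch of how chapter lines could be rendered. It is an illustration, not the Cursor command itself: the function names `format_timestamp` and `chapter_lines`, and the `(start_seconds, title)` input shape, are assumptions of mine.

```python
def format_timestamp(seconds: float) -> str:
    """Render an offset in seconds as M:SS, or H:MM:SS once past the hour."""
    total = int(seconds)  # drop fractional seconds
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def chapter_lines(chapters: list[tuple[float, str]]) -> list[str]:
    """Turn hypothetical (start_seconds, title) pairs into description lines.

    The pairs are sorted by start time so the lines come out in video order.
    """
    return [f"{format_timestamp(start)} {title}" for start, title in sorted(chapters)]


if __name__ == "__main__":
    # Made-up chapter data, purely for demonstration.
    for line in chapter_lines([(75.4, "Demo"), (0, "Intro"), (3725, "Wrap-up")]):
        print(line)
```

In practice the agent would pick the chapter boundaries and titles from the transcript. The code only covers the mechanical formatting step.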

Given a URL, the skill fetches the YouTube transcript, presents a summary, and conducts a structured interview to shape the blog post. Anthropic has made their skills format open, though standardisation is still lacking. I had Cursor create the skill from a stream-of-consciousness description, and using plan mode proved effective: the agent read the AgentSkills.io website to understand the format and put together a plan. I added one modification: have the agent parrot back a summary before it starts asking questions, which helps when the video was recorded days ago and the details have faded.

The resulting skill consists of a few components. The skill definition is minimal, just enough for the agent to decide whether to load the rest. A description format document explains how to create consistent descriptions. There’s also a Python script that handles fetching the YouTube transcript. A test run demonstrated the workflow: the script fetches the transcript, then the agent presents a summary, asks me to confirm it is accurate, and proceeds with structured questions about the main message, target audience, and unique angle.
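The script itself is not reproduced here, but its first step is easy to sketch: turning the URL the agent receives into a video ID. The following is a minimal illustration using only the standard library. The function name `extract_video_id` and the set of URL shapes it accepts are my assumptions, and the transcript download (which would need a third-party library or API) is deliberately left out.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> str:
    """Pull the video ID out of a few common YouTube URL shapes.

    Hypothetical helper: handles youtu.be short links, /watch?v=... links,
    and /embed/, /shorts/ and /live/ paths. Anything else raises ValueError.
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    video_id = ""
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            # Standard watch links carry the ID in the "v" query parameter.
            video_id = parse_qs(parsed.query).get("v", [""])[0]
        else:
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2 and parts[0] in ("embed", "shorts", "live"):
                video_id = parts[1]

    if not video_id:
        raise ValueError(f"Could not find a video ID in {url!r}")
    return video_id


if __name__ == "__main__":
    print(extract_video_id("https://youtu.be/abc123XYZ_-?si=example"))
```

Failing loudly on an unrecognised URL seems preferable here, since the agent can relay the error and ask for a corrected link rather than summarising the wrong video.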

The interview pushed me to articulate things like why I’m making these videos in the first place and what distinguishes my perspective from professional YouTubers with something to sell. One question made me smile: “What makes your perspective unique or valuable?” The honest answer is that it’s not unique. What it offers is a grounded take from someone who values automation and sees something genuinely valuable in these tools, cutting through both the fear and the hype.

There’s something pleasantly self-referential about using the skill I built to create the blog post about building the skill. The content remains mine, transformed rather than generated, retaining my voice while removing the drudgery. There’s a fine line between AI slop and authenticity, but refusing to use available tools because every character must be typed manually makes no sense either. The goal is my content, expressed clearly, that actually says something worth reading.