AI-assisted production

Course 2 · Ch 9
Script generation with Claude, voiceover with ElevenLabs, AI avatars with HeyGen — using AI as a production partner, not a replacement

AI tools don't replace the creator — they compress the time between idea and finished video. A script that took two hours now takes twenty minutes to draft. A voiceover you'd have re-recorded four times can be cloned and adjusted in seconds. An avatar-based explainer that would have needed a studio can be produced at a desk. This chapter covers the practical toolkit: what each tool does, where it fits in a real production workflow, and where it falls short.

The AI Involvement Spectrum

AI can be involved at every stage of production — from none at all to completely generated. Where you sit on this spectrum is a creative and strategic choice, not a binary one.

AI involvement in your production workflow
Traditional → Hybrid → Automated
Human-only: You write, film, edit, publish. AI = none.
AI-assisted: AI helps research, scripts, thumbnails, captions. You film and present.
AI-heavy: AI script + AI voice + AI avatar or B-roll. Human edits and directs.
Fully AI: Everything generated. Covered in Chapter 10.

This chapter covers the AI-assisted and AI-heavy zones — tools that work alongside a real human creator to dramatically accelerate production without losing authenticity.

Script Generation — AI as Your Writing Partner

AI script generation is the highest-value use of AI in most creators' workflows. Not because AI writes better scripts than humans — it doesn't — but because it eliminates the blank-page problem, speeds up research synthesis, and handles the structural scaffolding so you can focus on making the content genuinely interesting.

The AI script trap
Publishing an unedited AI-generated script is immediately detectable — not by an algorithm, but by your audience. AI defaults to generic structures, hedged language ("it's important to note that..."), and a flat, encyclopaedic tone. The script AI gives you is a first draft, not a finished product. Your job is to rewrite it in your voice, cut the filler, add your opinions, and make it sound like something a human who cares about this topic would actually say.
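A quick scan of the draft can catch the stock phrasing described above before you start rewriting. The sketch below is a minimal illustration: the phrase list and the function name `flag_filler` are assumptions made for this example, not a canonical list, so extend it with whatever your own drafts overuse.

```python
# Hypothetical starter list of stock phrases; adjust it to your own drafts.
FILLER_PHRASES = [
    "it's important to note",
    "in today's video",
    "in conclusion",
    "let's dive in",
]


def flag_filler(script: str, phrases: list[str] = FILLER_PHRASES) -> list[tuple[int, str]]:
    """Return (line_number, phrase) pairs for each stock phrase found in the script."""
    hits = []
    for line_number, line in enumerate(script.splitlines(), start=1):
        # Lower-case the line and treat curly and straight apostrophes alike.
        normalised = line.lower().replace("\u2019", "'")
        for phrase in phrases:
            if phrase in normalised:
                hits.append((line_number, phrase))
    return hits


draft = (
    "Your mic is not the problem.\n"
    "It's important to note that room acoustics matter.\n"
    "In conclusion, treat the room first."
)
for line_number, phrase in flag_filler(draft):
    print(f"line {line_number}: {phrase}")
```

A scan like this only finds exact phrases. It will not catch a flat tone or a generic structure, so it supports the rewrite rather than replacing it.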

How to use Claude for script drafting

The quality of what you get back is entirely determined by the quality of what you put in. A vague prompt gets a generic script. A specific, well-structured prompt gets a workable first draft.

Weak prompt — don't do this
Write a YouTube script about microphones for creators.
Too vague. Result: a generic, listicle-style script with no voice, no hook, and nothing that couldn't be found on the first page of Google.
Strong prompt — structure like this
Write a YouTube script for a 7-minute video aimed at beginner creators who are frustrated that their audio sounds bad despite buying a "good" microphone. My channel tone is direct and slightly sarcastic — I don't coddle beginners but I'm not mean about it. The hook should call out the most common mistake (buying a condenser mic for an untreated room). Structure: hook (30 sec) → the real problem (room acoustics, 90 sec) → three fixes in order of cost (free, cheap, paid) → CTA to the next video on audio post-processing. Don't write outro filler. No bullet-point lists — flowing paragraphs only.
Specifies length, audience pain point, tone, structure, hook angle, section breakdown, and what to exclude. Result: a draft that actually sounds like a real video, not a blog post read aloud.
Refinement prompt — use after the first draft
The hook is too slow — get to the main point in the first 10 seconds, not 30. The section on acoustic foam is too promotional — make it more sceptical. Replace the phrase "it's important to note" wherever it appears. Add a specific real-world example in the free-fixes section — something that sounds like a personal anecdote, not a generalisation.
Treats the AI as an editor, not an oracle. Specific, actionable notes produce specific, actionable revisions. This is exactly how you'd brief a human copywriter.
Repurposing prompt — turn a script into other formats
Based on this script, write: (1) a YouTube title using the curiosity-gap formula, (2) three thumbnail text options of 4 words or fewer, (3) a 150-word video description with the primary keyword in the first sentence, and (4) five chapter timestamps assuming the video is 8 minutes long.
One script → multiple assets. This is where AI saves the most cumulative time — not in writing the script itself, but in spinning up all the supporting content around it.
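If you reuse the same brief across videos, it can help to keep the elements of a strong prompt (length, audience, tone, hook, structure, exclusions) as structured fields and assemble the prompt from them. This is a rough sketch, not a required workflow; the class name `ScriptBrief` and its field names are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ScriptBrief:
    """The elements a strong prompt specifies. All names here are illustrative."""
    topic: str
    minutes: int
    audience: str
    tone: str
    hook: str
    structure: list[str]
    exclusions: list[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Join the fields into a single prompt string."""
        parts = [
            f"Write a YouTube script for a {self.minutes}-minute video about "
            f"{self.topic}, aimed at {self.audience}.",
            f"My channel tone is {self.tone}.",
            f"The hook should {self.hook}.",
            "Structure: " + " → ".join(self.structure) + ".",
        ]
        parts.extend(self.exclusions)  # each exclusion is already a full sentence
        return " ".join(parts)


brief = ScriptBrief(
    topic="why a good microphone can still sound bad",
    minutes=7,
    audience="beginner creators frustrated by poor audio",
    tone="direct and slightly sarcastic",
    hook="call out the most common mistake (a condenser mic in an untreated room)",
    structure=[
        "hook (30 sec)",
        "the real problem (room acoustics, 90 sec)",
        "three fixes in order of cost (free, cheap, paid)",
        "CTA to the next video on audio post-processing",
    ],
    exclusions=["Don't write outro filler.", "No bullet-point lists."],
)
print(brief.to_prompt())
```

Because every field is required except the exclusions, a brief missing its audience or tone fails when it is created rather than quietly producing a vague prompt.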

Other AI script-assistance tools

✍️
Claude / ChatGPT / Gemini
LLM — script drafting, research, repurposing
Free / paid
General-purpose LLMs. Use for drafting, restructuring, rewriting in your voice, research synthesis, and generating supporting assets (titles, descriptions, chapters). Claude tends to produce more nuanced, less robotic prose for long-form content.
Best for: all script-adjacent tasks. The most versatile tool in this list.
📋
Descript
Text-based video editing + AI writing tools
Free / ~£12/mo
Transcribes your existing recordings, then lets you edit the video by editing the transcript. Also includes AI tools for removing filler words, generating summaries, and creating social clips from long-form content.
Best for: creators who prefer writing/editing in a document rather than a timeline.
🔍
Perplexity AI
AI-powered research with citations
Free / ~£17/mo
AI research tool that cites its sources. Far safer than asking an LLM for facts directly — Perplexity shows where each claim comes from, so you can verify before putting it in a script. Dramatically accelerates research-heavy scripts.
Best for: fact-heavy explainer content where accuracy matters and hallucination is a risk.
Still verify claims — sources cited aren't always what they appear to be.

AI Voiceover — ElevenLabs & the Competition

AI voiceover quality has improved sharply in recent years. The best outputs from ElevenLabs, Murf, and PlayHT can be hard to distinguish from human recordings in many use cases. This opens up several production workflows: creating content in languages you don't speak, cloning your own voice for faster re-recording, and generating narration for faceless channels without recording sessions.

🎙️
ElevenLabs
AI voice synthesis + voice cloning
Free / from ~£5/mo
The current quality leader. Realistic voices, accurate emotion and pacing, and instant voice cloning from as little as one minute of audio. The Starter plan (~£5/mo) includes 30,000 characters/month, roughly 30 minutes of narration. Professional voice cloning, which trains on a much longer sample, requires the Creator plan (~£17/mo).
Best for: faceless channels, dubbed content, re-recording corrections without re-shooting.
Voice cloning requires explicit consent if cloning someone else's voice. ElevenLabs enforces this — your own voice is fine.
🎤
Murf AI
AI voiceover studio
Free / from ~£19/mo
Studio-style interface with 120+ voices across 20+ languages. Built-in pitch, speed, and emphasis controls. Syncs voiceover to video in the browser — no separate NLE needed for simple projects. Slightly less natural than ElevenLabs at the top end but a polished workflow.
Best for: multi-language content, creators who want everything in one browser-based tool.
🔊
PlayHT / LOVO
AI voice generation
Free / from ~£19/mo
Strong alternatives with large voice libraries and API access. PlayHT's ultra-realistic voices approach ElevenLabs quality. LOVO's studio product, Genny, includes an in-browser video editor. Worth testing if ElevenLabs pricing doesn't fit your use case.
Best for: API-driven automation workflows, or if ElevenLabs character limits are too restrictive.
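The ElevenLabs Starter figure above (30,000 characters for roughly 30 minutes) implies about 1,000 characters per minute of narration. The sketch below uses that ratio to estimate how far a monthly quota stretches. The function names are made up for this example, and it assumes every character, including spaces and punctuation, counts toward the quota, so check how your plan actually meters usage.

```python
CHARS_PER_MINUTE = 1_000  # implied by "30,000 characters ≈ 30 minutes"; a rough ratio


def narration_minutes(script: str, chars_per_minute: int = CHARS_PER_MINUTE) -> float:
    """Estimate spoken length from the character count."""
    return len(script) / chars_per_minute


def renders_per_month(script: str, monthly_quota: int = 30_000) -> int:
    """How many full renders of this script fit in the quota (re-takes count too)."""
    if not script:
        raise ValueError("script is empty")
    return monthly_quota // len(script)


script = "x" * 7_000  # stand-in for a 7,000-character script
print(narration_minutes(script))  # 7.0
print(renders_per_month(script))  # 4
```

Every regeneration spends characters again, so a script that needs three or four takes to sound right uses the quota three or four times faster than this estimate suggests.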

Voice cloning your own voice — the workflow

  1. Record a clean voice sample. For an instant clone, ElevenLabs needs 1–5 minutes of clear, noise-free audio; your normal recording setup is fine. Speak naturally at your normal pace. Avoid music, background noise, and excessive editing (the AI needs natural breath patterns).
  2. Upload to ElevenLabs → Voices → Add an Instant Voice Clone. The system analyses your voice characteristics (tone, pacing, articulation, accent) and builds a synthesis model. Processing takes a few minutes. A Professional Voice Clone is a separate option on higher plans that needs a much longer sample and longer training.
  3. Test with a short script section. Paste 2–3 sentences and listen. Check that the clone captures your natural cadence — not just your pitch. Adjust the stability and similarity sliders if the output sounds too robotic (raise similarity) or too flat (lower stability slightly).
  4. Use for corrections and gap-fills. The highest-value use case: you filmed a video, edited it, then realise you mispronounced something at minute 4. Instead of re-recording, type the corrected sentence and drop the clone audio over the mistake. Saves a full re-shoot.
  5. Generate full narration for supplementary content. Scripts for community posts, short explainers, or translated versions of existing videos can all be narrated by your cloned voice without sitting at a microphone.
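The correction step in the workflow above can also be scripted instead of done in the web interface. The sketch below only builds the request and does not send it. The endpoint path, the `xi-api-key` header, and the `voice_settings` field names follow ElevenLabs' public REST API as commonly documented, but treat them, the model name, and the default slider values as assumptions and check the current API reference. `API_KEY`, `VOICE_ID`, and `build_correction_request` are placeholders for this example.

```python
import json
import urllib.request

# Placeholders: substitute your own key and the ID of your cloned voice.
API_KEY = "YOUR_API_KEY"
VOICE_ID = "YOUR_VOICE_ID"


def build_correction_request(
    text: str, stability: float = 0.5, similarity: float = 0.75
) -> urllib.request.Request:
    """Build (but do not send) a text-to-speech request for one corrected sentence."""
    payload = {
        "text": text,
        "model_id": "eleven_multilingual_v2",  # assumed model name; verify before use
        # These two values correspond to the stability and similarity sliders.
        "voice_settings": {"stability": stability, "similarity_boost": similarity},
    }
    return urllib.request.Request(
        url=f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )


request = build_correction_request("The cardioid pattern rejects sound from behind the mic.")
print(request.full_url)
print(json.loads(request.data)["voice_settings"])
```

Sending the request with `urllib.request.urlopen(request)` should return audio bytes that you can write to a file and drop over the mistake in your editor. Regenerating with slightly different slider values is usually quicker than trying to fix an odd take afterwards.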
The voice clone disclosure question
If your channel's content is presented as authentic and personal, using a cloned voice for full episodes without disclosure will eventually erode trust when viewers notice. Using it for corrections, dubbed languages, or explicitly faceless content is unambiguously fine. Using it to fake being present when you're not — particularly for news, opinion, or documentary content — is ethically murky. Transparency with your audience is always the safer long-term play.

AI Avatars — HeyGen & Synthesia

AI avatar tools generate a video of a realistic human presenter from a text script or audio file. You either use one of the platform's stock avatars, or create a digital twin of yourself from a short recorded video. The avatar lip-syncs to the voiceover, moves naturally, and can be placed in front of various backgrounds.

🤖
HeyGen
AI avatar video generation + video translation
Free trial / from ~£24/mo
Creates video of a realistic avatar (stock or your own digital twin) speaking a script. Also features video translation: upload an existing video and HeyGen re-lip-syncs your face to a translated audio track in 30+ languages. A serious production tool for multi-language channels.
Best for: translated content, explainer videos, product demos, training content — anywhere a "presenter" is needed without filming.
Instant avatar quality has improved dramatically but still shows subtle uncanny valley tells on close inspection. Works best at medium shot framing.
👤
Synthesia
AI avatar video — training & corporate
From ~£22/mo
Similar to HeyGen but with a stronger corporate/training focus. 230+ stock avatars, 140+ languages, PowerPoint-style slide integration. More polished workflow for structured explainer content. Less suited to entertainment-style YouTube than HeyGen.
Best for: educational/training content, corporate explainers, anyone primarily making structured slide-based presentations.
🌐
HeyGen Video Translation
Lip-sync translation of existing videos
Included in HeyGen paid plans
Upload a video of yourself speaking in English. HeyGen translates the audio (via AI), re-generates a lip-synced version of your face speaking the new language, and renders a complete translated video. Quality varies by language — Spanish, French, German, Portuguese perform best currently.
Best for: growing a multi-language audience without learning new languages or hiring dubbing actors.
Review every translated video manually — AI translation errors can be significant, especially for technical or cultural content.

When to use AI avatars vs filming yourself

📚
Structured explainer / tutorial
Script-driven, information-heavy content where the presenter is largely a visual anchor rather than a personality.
✓ AI avatar works well here
🌍
Translated versions of existing videos
Reaching Spanish, French, or Portuguese audiences with your existing English content — without re-filming.
✓ HeyGen translation is ideal
🎭
Personality / entertainment channel
Channels built on the creator's authentic presence, humour, reactions, and relationships with the audience.
✗ Avatars destroy the authenticity that makes these channels work
💼
Product demos / corporate content
Presenting features, processes, or procedures where the visual message is the screen or product, not the presenter's face.
✓ Excellent use case — saves significant filming time
🎙️
Opinion / commentary / reaction
Videos where your take, your face, and your spontaneous response are the entire point of the content.
✗ Can't be replicated authentically with an avatar
High-volume informational shorts
Rapid-fire Shorts or Reels covering facts, tips, or news where personality matters less than speed and quantity.
⚠ Works but saturation is high — differentiation gets harder

Other AI Production Tools Worth Knowing

Tool | What it does | Cost | Best use
Adobe Podcast Enhance | One-click AI noise removal and voice enhancement | Free (browser) | Cleaning up location audio or suboptimal recordings before editing
Opus Clip | AI finds the best moments in long-form video and cuts them to Shorts/Reels | Free / ~£13/mo | Repurposing long videos into multiple short clips automatically
Midjourney / Ideogram | AI image generation for thumbnails, overlays, and graphics | Free tier / ~£8/mo | Creating background scenes, stylised graphics, or concept images for thumbnails
Captions.ai | Auto-captions with animated styling optimised for short-form | Free / paid | Styled, animated captions for Shorts and Reels faster than manual
Riverside.fm | Remote recording studio with AI transcription, magic clips, and auto-editing | Free / ~£15/mo | Remote interviews and podcasts; records separate high-quality tracks per participant
Vidyo.ai | Repurposes long videos into captioned short clips with scene detection | Free / ~£16/mo | Multi-platform distribution from a single long-form recording

The Honest Assessment — Where AI Helps and Where It Doesn't

  • AI genuinely saves time on: research synthesis, first-draft scripting, repurposing content, generating metadata (titles, descriptions, chapters), correcting recorded audio without re-shooting, translating content for new markets, and removing silence in editing.
  • AI cannot replace: your specific perspective, your genuine reactions, your relationship with your audience, the trust built from showing up consistently as a real person, and the creative judgment that makes content worth watching rather than just technically competent.
  • The hidden cost: AI tools create a quality floor — anyone can produce passable content with them. That raises the bar for what makes content worth watching. The differentiator in an AI-saturated content landscape is everything AI can't provide: authenticity, specificity, earned trust, and a point of view.
The one rule that matters
Use AI to produce content faster without compromising your authentic voice — not to produce content you couldn't have made yourself. The moment your channel becomes indistinguishable from 10,000 other AI-assisted channels targeting the same niche, you've lost the only thing that made it worth building.

Chapter 9 Quick Reference

  • Best AI for scripting: Claude (nuanced long-form) · ChatGPT (fast iteration) · Perplexity (cited research)
  • Prompt rule: Specify length, audience, tone, structure, and what to exclude — vague prompts produce generic scripts
  • AI script = first draft. Rewrite in your voice before filming.
  • Best AI voiceover: ElevenLabs (quality + voice cloning) · Murf (multi-language workflow)
  • Voice clone best use: Correcting errors post-edit without re-filming
  • ElevenLabs Starter: ~£5/mo · 30,000 chars/month (~30 min narration)
  • Best AI avatar: HeyGen (YouTube content + video translation)
  • Avatar works best for: Explainers, translated content, product demos
  • Avatar fails for: Personality channels, opinion content, reaction videos
  • HeyGen video translation: Best for Spanish, French, German, Portuguese
  • Repurposing Shorts: Opus Clip or Vidyo.ai — auto-clip long content
  • Quick audio fix: Adobe Podcast Enhance (free, browser) before editing