What caption formats can I paste in?

SRT and VTT bodies work as-is: numbered cues, timestamp ranges, and WebVTT headers are all parsed. You can also paste plain caption lines with no timing at all. ngram restitches fragmented cues into full sentences so the script reads naturally instead of one slide per subtitle.

Does captions to video keep the original timestamps?

The timestamps guide how lines are grouped into scenes, but the output is re-paced for video rather than locked to the original cue timing. You set the final pacing in the storyboard, where you can merge two cues into one beat or split a long cue across scenes.

Will the captions appear on the finished video?

Yes. The caption text drives the script and is also burned back into every export, restyled in your Brand Kit's caption preset. Muted feeds on LinkedIn and approved social channels still read the same lines you pasted in. You can also export a separate .srt with the Add Subtitles to Video tool if you need a switchable track.

Does captions to video use stock footage or AI-generated visuals?

AI-generated visuals matched to the meaning of each caption beat, styled against your Brand Kit. ngram avoids keyword-matched stock because it makes every captioned explainer look the same. You can swap in your own footage or an image during storyboard review.

What if my captions are just a transcript with no timestamps?

That works. ngram reads untimed caption lines the same way, finding the structure and chunking the text into scenes. For a full transcript with speaker turns, the transcript-to-video converter adds extra parsing for who said what.

Can I narrate the caption lines with a voiceover?

Yes. A default ngram voice, your cloned voice, or a multilingual ElevenLabs voice reads the cue text aloud, kept in sync with the on-screen captions line for line. You approve the wording in the storyboard before the voiceover generates.

What aspect ratios and file formats can I export?

MP4, GIF, or WebM in three ratios from one render: 16:9 for YouTube and embeds, 1:1 for LinkedIn and X, 9:16 for Reels, Shorts, and approved social channels. Output is 1080p with burned-in captions by default; 4K and PPTX export are available on plans that include them.

Can I integrate captions to video into our tooling?

Yes. Zapier, n8n, and Make connectors trigger a render when a caption file lands in storage. There's an MCP server for agentic flows, a Chrome extension for converting highlighted caption lines, and a REST API with webhooks for custom pipelines.

How is ngram different from a subtitle tool for captions to video?

Subtitle tools like VEED and Kapwing start from a finished video and add captions on top. ngram starts from the caption text and builds the video around it: it plans a storyboard you approve before render, generates a visual per scene, and burns your captions back in. The caption file is the input, not a finishing layer.

Captions to Video: turn subtitle lines into a branded marketing video

Paste the body of an SRT or VTT file, or any block of caption lines, and ngram reads each subtitle as a beat, plans a scene per line, and renders a branded video with the captions burned back in.

4.8/5 · 15 reviews

Trusted by teams at

Amazon

Google

Microsoft

Nvidia

Apple

Walmart

Salesforce

CVS Health

PayPal

John Deere

Snap Inc.

Veeva Systems

DocuSign

DP World

Genpact

Parker Hannifin

Bio-Rad

Imperva

ITV

HubSpot

Rocket Mortgage

Tektronix

Diligent

Times Internet

Deel

Zapier

Delhivery

SafetyCulture

Demandbase

PingCAP

Quizizz

Apryse

Improvado

Taggbox

Matrixport

Glasswall

ContractSafe

How it works

Four steps from a caption file to a video you'd actually publish.

No timeline scrubbing, no manual scene-by-scene cutting, no pasting subtitles into a slide builder. The caption text you already have becomes the script, and the script becomes a storyboard you can edit before anything renders.

Paste the caption body or drop a file URL

Paste an SRT or VTT block straight from your editor, or link to a hosted caption file. ngram strips the timestamps and cue numbers, keeps the spoken lines, and stitches fragmented cues back into full sentences.
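The stitching step can be pictured with a small sketch. This is an illustration only, not ngram's actual implementation: the function name `stitch_sentences` and the rule "a cue ends a sentence when it ends in ., ! or ?" are assumptions made for the example.

```python
import re

# Assumption for this sketch: a cue closes a sentence when it ends in
# terminal punctuation, optionally followed by a closing quote or bracket.
SENTENCE_END = re.compile(r"[.!?][\"')\]]*$")


def stitch_sentences(cue_texts: list[str]) -> list[str]:
    """Join fragmented cue texts into full sentences (hypothetical helper)."""
    sentences: list[str] = []
    buffer: list[str] = []
    for text in cue_texts:
        text = text.strip()
        if not text:
            continue
        buffer.append(text)
        if SENTENCE_END.search(text):
            # The fragment closes a sentence, so flush the buffer.
            sentences.append(" ".join(buffer))
            buffer = []
    if buffer:
        # Keep a trailing fragment that never reached terminal punctuation.
        sentences.append(" ".join(buffer))
    return sentences
```

Three cues such as "Our new dashboard", "loads twice as fast," and "and it works offline." come back as one sentence, which is the difference between a paced script and one slide per subtitle.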

Caption lines become a paced video script

ngram reads the cue text as a running narration, finds the hook, the body, and the close, and groups adjacent lines into scenes instead of one slide per subtitle. Your wording survives where it carries the message.

Review the storyboard before render

Each scene shows the caption lines it covers, the visual direction, and the duration. Merge two cues into one beat, drop a filler line, or rewrite a hook in plain language, and the script and scene plan re-flow together.

Export with captions burned in

One render produces 16:9 for YouTube and embeds, 1:1 for LinkedIn, and 9:16 for Reels, Shorts, and approved social channels. Captions are restyled in your brand font and burned back into every cut.

Output controls

Smart defaults from the caption text. Real knobs when you need them.

Reads SRT and VTT bodies

Paste the raw file body and ngram parses the cue index, the timestamp range, and the text. Numbered cues, blank-line separators, and WebVTT headers are handled; plain caption lines with no timing convert too.
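A minimal sketch of that kind of parsing, under stated assumptions: the `Cue` class and `parse_captions` function are hypothetical names for this example, and the parser only covers the common cases named above (numbered cues, blank-line separators, a WebVTT header, and untimed lines). It is not a full SRT or WebVTT implementation.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches SRT ("00:00:01,000 --> 00:00:02,500") and VTT ("00:01.000 --> 00:03.000").
TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})[.,](\d{3})\s*-->\s*"
    r"(?:(\d+):)?(\d{2}):(\d{2})[.,](\d{3})"
)


@dataclass
class Cue:
    start: Optional[float]  # seconds; None for untimed caption lines
    end: Optional[float]
    text: str


def _seconds(h, m, s, ms) -> float:
    return int(h or 0) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_captions(body: str) -> list[Cue]:
    """Parse an SRT/VTT body or plain caption lines into cues (sketch)."""
    cues: list[Cue] = []
    # Blank lines separate cue blocks in both formats.
    for block in re.split(r"\n\s*\n", body.replace("\r\n", "\n").strip()):
        lines = [ln.strip() for ln in block.split("\n") if ln.strip()]
        if not lines or lines[0].startswith(("WEBVTT", "NOTE", "STYLE")):
            continue  # skip the WebVTT header and comment/style blocks
        idx = next((i for i, ln in enumerate(lines) if TIMING.search(ln)), None)
        if idx is None:
            # No timing line: treat every line as an untimed caption.
            cues.extend(Cue(None, None, ln) for ln in lines)
            continue
        groups = TIMING.search(lines[idx]).groups()
        text = " ".join(lines[idx + 1:])  # lines above the timing are the cue index
        if text:
            cues.append(Cue(_seconds(*groups[:4]), _seconds(*groups[4:]), text))
    return cues
```

The same function accepts all three inputs, so a pasted block does not need to declare its format up front.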

Cue-line-to-scene mapping

Short cues that belong to one thought get merged into a single scene; a long monologue cue gets split across beats. You see exactly which caption lines map to which scene in the storyboard before render.
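One way to picture the merge-and-split behavior is a length-based heuristic. This sketch is an assumption for illustration: `plan_scenes`, `min_chars`, and `max_chars` are hypothetical names, and the real planner presumably weighs meaning rather than character counts.

```python
import re


def plan_scenes(cues: list[str], min_chars: int = 40, max_chars: int = 120) -> list[list[str]]:
    """Group caption lines into scenes; each scene is a list of lines (sketch)."""
    scenes: list[list[str]] = []
    for cue in cues:
        cue = cue.strip()
        if len(cue) > max_chars:
            # A long monologue cue is split at sentence boundaries.
            parts = re.split(r"(?<=[.!?])\s+", cue)
        else:
            parts = [cue]
        for part in parts:
            if not part:
                continue
            current = sum(len(line) for line in scenes[-1]) if scenes else 0
            if scenes and current < min_chars and current + len(part) <= max_chars:
                # The open scene is still short, so this line joins it.
                scenes[-1].append(part)
            else:
                scenes.append([part])
    return scenes
```

Returning the lines per scene, rather than one merged string, mirrors the storyboard view described above: you can still see which caption lines landed in which scene.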

Script-first review

Read the full narration assembled from your captions before any visual generates. Cut a line, fix a typo a hand-typed caption left behind, or tighten the open. Each edit re-paces the scenes downstream.

AI Visuals per caption beat

Each scene gets a brand-matched image or short generative clip tied to what that caption line actually says. No keyword-matched stock footage that turns a captioned explainer generic.

Voiceover that reads the captions

A default ngram voice, your cloned founder voice, or a multilingual ElevenLabs voice reads the caption text aloud. The spoken track and the on-screen captions stay in sync line for line.

Caption styling from your Brand Kit

Logo, fonts, colors, motion, intro, and outro pulled from your saved Brand Kit. The burned-in captions use your brand's caption preset, so the 50th captions-to-video looks as on-brand as the first.

Re-render in another language

Translate the caption-driven script and re-burn the captions in the target language without rebuilding the storyboard. One caption source ships a localized cut per market.

Three ratios in one render

16:9 for YouTube and embeds, 1:1 for LinkedIn feed, 9:16 for Reels and approved social channels. Smart reframing keeps the caption text legible across every aspect ratio.
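For reference, the arithmetic behind the three cuts can be sketched as follows. The helper name `frame_size` is hypothetical, and the sketch assumes "1080p" means the shorter side of the frame is 1080 pixels, which is the usual convention for vertical and square video.

```python
def frame_size(ratio: str, short_side: int = 1080) -> tuple[int, int]:
    """Return (width, height) in pixels for a 'W:H' ratio (hypothetical helper)."""
    w, h = (int(part) for part in ratio.split(":"))
    if w >= h:
        # Landscape or square: the height is the short side.
        return (short_side * w // h, short_side)
    # Portrait: the width is the short side.
    return (short_side, short_side * h // w)


sizes = {ratio: frame_size(ratio) for ratio in ("16:9", "1:1", "9:16")}
```

Under that assumption the three cuts are 1920×1080, 1080×1080, and 1080×1920, which is why caption text has to be reframed rather than simply scaled.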

The rest of ngram

What ngram does to a caption file that a subtitle tool doesn't.

Explore all features

Captions

Your source captions are re-styled to your Brand Kit and burned back into every export, so muted feeds still read the same lines you pasted in. The caption text drives the video and rides on top of it.

Learn more

Script Generation

Reads your caption cues and assembles them into a hook-body-CTA narration. Not a slide-per-subtitle echo, but a paced script you'd recognize as the same lines, restitched for video.

Learn more

AI Visuals

Each scene of the captions-to-video gets a brand-matched image or generative clip tied to that cue's meaning. The visuals follow what the caption line says, not a stock-clip keyword table.

Learn more

AI Voiceover

Narrate the caption lines in an ElevenLabs voice, your cloned voice, or a supported language. The voiceover reads the exact cue text you approved in the storyboard, kept in step with the burned-in captions.

Learn more

Brand Kit

Logo, fonts, colors, motion, intro and outro applied to every scene built from your captions. The same kit drives the caption preset, so every future captions-to-video stays on-brand at scale.

Learn more

Multi-format Export

One caption body in, three ratios out. 16:9 for YouTube and embeds, 1:1 for LinkedIn, 9:16 for Reels, with the burned-in captions reframed for each surface in a single render.

Learn more

Use cases

Where a captions-to-video earns its spot in the funnel.

LinkedIn distribution

Turn a captioned clip's text into a LinkedIn video post

Already have the captions from a clip you ran? Paste the cue text, render the 1:1 cut with the captions burned back in, and ride the engagement LinkedIn gives native video.

See use case

Explainer video

Caption files become a clean explainer

An old explainer's subtitle export already has the script structure. ngram lifts the cue text into a fresh explainer the support team and the website can both reuse.

See use case

Help center

Support captions become a help video

The subtitle track from a walkthrough recording carries every step in order. Convert those caption lines into a short branded help video customers watch instead of skim.

See use case

Training

Lesson captions become a training video

Caption files from a recorded session hold the full lesson. Paste them in, get a paced training video with brand visuals, and reuse it without re-recording the instructor.

See use case

Social distribution

Repurpose caption lines into short social clips

Pull the strongest cues from a longer caption file and let ngram build a 30-second branded clip around them. Post it without re-editing the original footage.

See use case

Newsletters and email

Embed a caption-driven video in your newsletter

Turn the captions from a clip into an inline video summary so subscribers who never press play on the original still get the message, captions and all.

See use case

Customer onboarding

Onboarding captions become a welcome video

The subtitle export from a setup walkthrough already lists each step. Convert it into a branded onboarding video new customers finish, with the steps captioned on screen.

See use case

Feature announcements

Feature-demo captions become a feature video

Caption text from a quick feature demo carries the whole talk track. Paste it in, get a 60-second feature video with motion graphics and a brand intro from the same lines.

See use case

Other converters

Source is a file, a transcript, or plain text? Pick the converter that matches the input.

Captions to video is one node on ngram's script-and-storyboard pipeline. Every text-source converter here shares the same scene planner, Brand Kit, review step, and three-ratio export.

All converters

SRT → Video

When your captions are a clean .srt file. ngram parses the numbered cues and timestamp ranges, then runs the same line-to-scene flow you use here for any caption block.

Open converter

Transcript → Video

When the source is a full transcript instead of timed caption cues. Same scene planner, with extra parsing for speaker turns and long unbroken paragraphs.

Open converter

Text → Video

When you have raw script or copy with no caption timing at all. Paste any chunk of writing and the same engine turns it into a storyboarded branded video.

Open converter

Anything → Video

Other source-to-video converters that ride the same script-and-storyboard pipeline.

Script → Video · URL → Video · Blog → Video · PDF → Video · Docs → Video · Markdown → Video · Help Center → Video · Release Notes → Video · Audio → Video · Webinar → Clips

Tools that pair with this converter

Clean the caption text before. Edit the video after.

All ngram tools

Editing the video further

Take the captions-to-video output past the first cut

Video Editor

Re-cut the rendered captions-to-video, drop a scene, or swap a visual. The converted output opens in the timeline editor with the caption-driven script attached.

Open tool

Add Subtitles to Video

The captions are burned in by default; this tool also exports a separate .srt for YouTube SEO or for embeds that want a switchable caption track instead of a baked-in one.

Open tool
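For readers who have not opened a .srt file before, here is a short sketch of what a separate subtitle track looks like when written out. The helpers `format_timestamp` and `to_srt` are hypothetical names for this example; the layout they produce (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) is the standard SubRip structure.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[tuple[float, float, str]]) -> str:
    """Serialize (start, end, text) cues into an SRT body (sketch)."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}")
    # Cue blocks are separated by one blank line.
    return "\n\n".join(blocks) + "\n"
```

A file like this is what a player loads as a switchable track, in contrast to captions baked into the video frames.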

Video Translator

Translate the finished captions-to-video into another language, with the captions re-burned and the voiceover regenerated. One caption source, a localized cut per market.

Open tool

Video Cutter

Pull a 15-second teaser out of the full captions-to-video for an ad or pre-roll. Trim by the caption line you want to keep, not by dragging a timeline.

Open tool

Generating from scratch

If you don't have caption text yet

Video Script Generator

No captions to start from? Generate a tight video script from a short brief, then feed it through the same scene planner the caption flow uses.

Open tool

AI Video Generator

Brief the agent in a prompt and skip the paste step. The script is written on the way to the storyboard, with captions auto-generated on the final render.

Open tool

Text to Speech Video

A straight read-through of your caption lines over brand visuals, with no script rewrite. Useful when you want the cue text spoken exactly as written.

Open tool

Auto Subtitle Generator

Have footage but no caption file? Generate the subtitles automatically first, then paste that caption body in to build a fresh video from the same lines.

Open tool

Polishing the source first

Tidy the caption text before you convert it

Video to Text

Pull a clean transcript out of a clip when your caption file is messy or missing. Edit the text, then convert those lines into a new branded video.

Open tool

Audio to Text

Transcribe a voice memo or recording into caption-style lines, fix the wording, and paste the result in to drive the captions-to-video render.

Open tool

Video Caption Generator

Generate accurate captions for an existing clip, then reuse that exported text as the source for a brand-new video built around the same lines.

Open tool

AI Image Generator

Pre-generate the hero thumbnail for the captions-to-video on the same Brand Kit, so the social card and the video's first frame match.

Open tool

Built for teams

Teams who turn caption files into video every week.

All solutions

Support Teams

Caption exports from walkthrough recordings become short branded help videos. The steps stay captioned on screen, so customers watch the fix instead of reading two paragraphs.

See workflows

Product Marketing

Subtitle text from feature demos and launch clips becomes a fresh captioned video the same week. One caption body, three ratios, no editor in the loop.

See workflows

Customer Success

Onboarding and QBR caption files turn into short videos customers actually open, with each beat captioned so the message lands even on mute.

See workflows

Educators

Lecture and lesson caption files convert into paced training videos with brand visuals. Reuse the recorded session's text without putting the instructor back on camera.

See workflows

Growth Marketing

Lift the strongest cues from a longer caption file into 9:16 and 1:1 social creative. Test five caption-led hooks the same day you export the file.

See workflows

Content Creators

Repurpose a video's caption track into a new short for another channel. The cue text becomes the script, captioned and on-brand, without re-cutting the original.

See workflows

Developer Relations

Caption files from conference talks and SDK walkthroughs convert into clips for docs and social. The on-screen captions keep technical terms readable.

See workflows

HR & Internal Comms

All-hands and policy-update caption exports become short videos employees finish, with the key lines captioned instead of buried in an email thread.

See workflows

By size

Solopreneurs Startups SMB Enterprise Remote Teams

By industry

SaaS E-commerce Fintech Healthcare Real Estate

Integrations

Trigger captions to video where your caption files already land.

Wire the converter into the tools that produce or store your caption exports. Every integration ships with a working captions-to-video recipe you can fork.

Zapier

no-code

when: A new .srt or .vtt caption file lands in your Drive or Dropbox folder

then: Convert the caption body into a 16:9 and 9:16 video with the captions burned back in, and drop both into Drive

Integrate with Zapier

MCP Server

agentic

when: Claude or ChatGPT is handed a block of caption lines to turn into a video

then: Pass the cue text to ngram and return the rendered captioned video plus a /watch share link

Use MCP server

n8n

self-host

when: A self-hosted pipeline writes a finished caption file to your store

then: Convert those caption lines into a branded video without the subtitle text leaving your VPC

Integrate with n8n

Make.com

scenarios

when: A media record is updated with an approved caption export

then: Auto-convert that caption text into three social videos and attach them to the campaign record

Integrate with Make

Chrome Extension

browser

when: You highlight caption lines on a page and hit 'Convert to video'

then: Get a storyboard back in a new tab, built from the selected cues and ready to review

Install Chrome extension

LinkedIn

publish

when: A captions-to-video finishes rendering

then: Publish the 1:1 cut as a LinkedIn video post with the strongest caption line as the post copy

Publish to LinkedIn

YouTube

publish

when: The 16:9 cut of a captions-to-video is ready

then: Upload it to your channel and attach the source caption file as the YouTube subtitle track

Publish to YouTube

X (Twitter)

publish

when: A 9:16 captions-to-video clip finishes rendering

then: Post it to X with the opening caption line as the tweet copy

Publish to X

REST API · MCP server · Webhooks

Programmatic captions-to-video runs in roughly 20 lines against the REST API.
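To give a feel for what such a call might look like, here is a sketch that only builds the request and sends nothing. Everything specific in it is an assumption: the URL is a placeholder on example.com, and the field names (`source_type`, `source_text`, `ratios`, `webhook_url`) and the bearer-token header are invented for illustration. Check the actual API reference for the real endpoint and schema.

```python
import json
import urllib.request

# Hypothetical placeholder, NOT a real ngram endpoint.
API_URL = "https://api.example.com/v1/renders"


def build_render_request(caption_body: str, api_key: str, webhook_url: str) -> urllib.request.Request:
    """Build (but do not send) a render request; all field names are assumptions."""
    payload = {
        "source_type": "captions",         # assumed field
        "source_text": caption_body,       # raw SRT/VTT body or plain lines
        "ratios": ["16:9", "1:1", "9:16"], # the three cuts described on this page
        "webhook_url": webhook_url,        # assumed: called when the render finishes
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it would be a single `urllib.request.urlopen(request)` call once the real endpoint and credentials are in place; the webhook then replaces polling for the finished render.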

How it compares

If you've been using something else to go from captions to video.

VEED and Kapwing start from a finished video and add or restyle captions on top of it; Pictory starts from an article or script and matches stock clips to it. ngram works the other way: the caption text is the input. It reads each cue as a beat, plans a storyboard you can argue with before render, generates the visuals per scene, and burns the captions back in on the way out.

Feature	ngram	VEED	Kapwing	Pictory
Direction of the workflow	Caption text in, full video out	Existing video in, captions added on top	Existing video in, captions added on top	Article or script in, stock-clip video out
Reads an SRT or VTT body as the source	Yes, parses cues and builds scenes from them	Imports captions onto a clip	Imports captions onto a clip	No native caption-file input
How the caption text is read	Cues grouped into a hook-body-CTA script	Treated as overlay text	Treated as overlay text	Sentence-by-sentence over stock clips
Storyboard review before render	Full scene-by-scene plan, editable in plain language	Manual timeline editing	Manual timeline editing	Scene cards, limited script edits
Visual generation per scene	AI Visuals matched to each cue's meaning	Stock and uploaded media	Stock and uploaded media	Stock-library matching
Brand application	Brand Kit on every scene plus the caption preset	Template-based brand controls	Template-based brand controls	Brand presets, limited per-scene control
Aspect ratios per render	16:9, 1:1, 9:16 from one render	One ratio per export	One ratio per export	One ratio per export
Re-render in another language	Translate the script and re-burn captions, no rebuild	Manual re-caption	Manual re-caption	Manual rework
API and agentic access	REST, MCP server, Zapier, n8n, Make	API on paid plans	API on paid plans	Limited API

vs VEED in detail vs Kapwing in detail vs Pictory in detail

FAQ

Common questions about captions to video

How does ngram's captions to video work?

Paste the body of an SRT or VTT file, or any block of caption lines. ngram strips the cue numbers and timestamps, keeps the spoken text, and assembles it into a hook-body-CTA script. It builds a scene-by-scene storyboard you review and edit in plain language, then exports in 16:9, 1:1, and 9:16 with the captions burned back in.

Still curious?

Captions → Video

Ready to turn a caption file into a branded video?

Paste the caption lines, review the storyboard, export in three ratios with the captions burned in. Roughly five minutes from paste to publish.

Convert captions to video Book a demo
