Dubly.AI Support Center home

Do's and don'ts for your source video

A good dub starts with a good source. This article is the quick checklist for preparing videos that Dubly.AI can process cleanly.


File requirements

  • Accepted formats: MP4 and MOV only.
  • Max file size: 5 GB per upload.
  • Audio required: Your video must have at least one audio track with spoken content. Silent videos can't be dubbed.
  • Audio channels: Mono and stereo both work.

If your file is larger than 5 GB, export a compressed H.264 MP4 at around 10–15 Mbps before uploading. Quality stays high and the file shrinks dramatically.
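To see whether a re-export will fit under the limit, the size can be estimated from bitrate and duration. The sketch below is illustrative only: the function names are hypothetical, "5 GB" is assumed to mean decimal gigabytes, the 192 kbps audio bitrate is an assumed default, and ffmpeg is just one common tool for the export (this article does not prescribe one).

```python
def estimated_size_gb(duration_s: float, video_mbps: float, audio_kbps: float = 192.0) -> float:
    """Approximate output size in decimal GB, ignoring container overhead."""
    total_bits = duration_s * (video_mbps * 1_000_000 + audio_kbps * 1_000)
    return total_bits / 8 / 1_000_000_000


def ffmpeg_export_command(src: str, dst: str, video_mbps: int = 12) -> list[str]:
    """Build (but do not run) an ffmpeg command for an H.264 MP4 export."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "libx264",          # H.264 video encoder
        "-b:v", f"{video_mbps}M",   # target video bitrate
        "-c:a", "aac",              # AAC audio
        "-b:a", "192k",             # assumed audio bitrate
        dst,
    ]


if __name__ == "__main__":
    # One hour of video at 10 Mbps vs. 12 Mbps
    print(round(estimated_size_gb(3600, 10), 2))
    print(round(estimated_size_gb(3600, 12), 2))
    print(" ".join(ffmpeg_export_command("input.mov", "output.mp4")))
```

Under these assumptions, an hour of video comes to roughly 4.6 GB at 10 Mbps but about 5.5 GB at 12 Mbps, so long videos may need the lower end of the range.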


Do

  • Record clean voice audio. Quiet room, lavalier or shotgun microphone, minimal background noise.
  • Keep music and sound effects under the dialogue: at least 12 dB quieter than the speaker while they're talking.
  • Use a constant frame rate (CFR) when exporting — typically 24, 25, or 30 FPS. Variable frame rate (VFR) can cause timing issues during Lip-Sync.
  • Keep the speaker facing the camera if you plan to Lip-Sync, with a clearly visible mouth throughout.
  • One speaker per shot where possible. Cross-talk and overlapping dialogue are the biggest source of transcription errors.
  • Pick the right source language in the dub creation modal. Auto-detect is good, but it can misfire on very short clips or multilingual intros.
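For a sense of scale on the 12 dB guideline above, decibels convert to an amplitude ratio with the standard formula ratio = 10^(dB/20). The following is a minimal sketch, assuming levels are compared as RMS amplitudes (the article does not say how the gap is measured); the function names are hypothetical.

```python
import math


def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)


def level_gap_db(speech_rms: float, music_rms: float) -> float:
    """How many dB the speech sits above the music (both RMS amplitudes > 0)."""
    return 20 * math.log10(speech_rms / music_rms)


def music_is_quiet_enough(speech_rms: float, music_rms: float, min_gap_db: float = 12.0) -> bool:
    """True if the music is at least min_gap_db below the speech."""
    return level_gap_db(speech_rms, music_rms) >= min_gap_db


if __name__ == "__main__":
    print(round(db_to_amplitude_ratio(-12), 3))   # music amplitude relative to speech
    print(music_is_quiet_enough(1.0, 0.25))       # music at a quarter of speech amplitude
    print(music_is_quiet_enough(1.0, 0.5))        # music at half of speech amplitude
```

In other words, 12 dB quieter means the music's amplitude is about a quarter of the speaker's. Music at half the speaker's amplitude is only about 6 dB down and would not meet the guideline.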

Don't

  • Don't stack speakers talking over each other. Dub quality drops where voices overlap.
  • Don't rely on baked-in subtitles or text overlays. You can leave existing subtitles in your video, but they will not be translated. Make sure they don't overlap any faces, as this can interfere with the AI's processing. If you need translated captions, add them after dubbing.
  • Don't apply heavy effects to the voice track (deep reverb, echo, auto-tune, telephone filters). They interfere with both transcription and stem separation.
  • Don't mumble or rush. Clear pronunciation produces reliable transcripts; mumbled speech doesn't.
  • Don't upload low-resolution videos for Lip-Sync, or videos where faces are tiny in the frame. Either way the model has too little to work with. Crop tighter before uploading if needed.
  • Don't worry about pre-separating music and voice — Dubly does that automatically (see Music & Sound Effects Best Practices).

Free-trial specifics

If you're on the free trial:

  • Only the first 60 seconds of each uploaded video are dubbed, regardless of its actual length.
  • A note in the dub creation modal confirms this: "As part of your free trial, the first minute will be dubbed. Subscribe to a plan to dub the entire video."
  • Upload a short clip (under a minute) to get the full result without the trim.

Quick pre-upload check

Before clicking upload, confirm:

  1. File is MP4 or MOV, under 5 GB.
  2. There's clear spoken audio.
  3. Background music is at least 12 dB quieter than the speaker.
  4. Frame rate is constant.
  5. If you plan to Lip-Sync: the speaker's face is visible and reasonably large in frame.

If all five check out, your dub is likely to come out clean.
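The first item on the checklist is easy to automate before uploading. This is a rough sketch, not a Dubly.AI tool: the function names are hypothetical and "5 GB" is assumed to mean 5,000,000,000 bytes. The remaining items (clear speech, music level, frame rate, face size) still need a manual check or dedicated media tools.

```python
from pathlib import Path

ACCEPTED_EXTENSIONS = {".mp4", ".mov"}
MAX_SIZE_BYTES = 5 * 1000**3  # assumption: decimal gigabytes


def check_file(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems found; an empty list means the file passes."""
    problems = []
    if Path(name).suffix.lower() not in ACCEPTED_EXTENSIONS:
        problems.append("format must be MP4 or MOV")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 5 GB")
    return problems


def check_path(path: str) -> list[str]:
    """Same check, reading the size from a file on disk."""
    p = Path(path)
    return check_file(p.name, p.stat().st_size)


if __name__ == "__main__":
    print(check_file("interview.mp4", 1_200_000_000))
    print(check_file("interview.avi", 6_000_000_000))
```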