Understanding Innovid Video Specs

Last Updated: December 15, 2025 09:41

Purpose

This article explains the nuances of video and audio specifications to ensure your video files meet the necessary requirements to run campaigns seamlessly. It outlines Innovid’s video and audio specification requirements and relevant definitions.

Video specs

The following Innovid pre-roll video specs are designed to ensure that every ad is delivered at its highest possible quality. By requiring that assets meet the highest industry standards, Innovid can work with a hi-res base asset to seamlessly encode across every publisher in your plan.

Note: If we receive a video that does not meet the following minimum specifications, we will need the client's written approval to proceed with the video as is. This is not required for self-service clients, as consent is implied when the client moves forward with the trafficking process.

The following specifications are based on Hulu guidelines and are subject to change at Hulu’s discretion at any time. The latest specs are available directly on Hulu’s site.

Video
- 1920 x 1080
- 16:9 display aspect ratio
- No black bars or intro/outro slates
- Constant Bitrate (CBR) >15 Mbps*: This is waived when delivering the ProRes codec, as it is built to be variable
- Main Profile @ Main Level (MP@ML)
Frame Rate
- 23.98, 25 or 29.97
- Constant Frame Rate only
- Remove any pull-down added for the broadcast
- Innovid does not detect duplicate or blended frames:
  - To fix blended or duplicate frames, use a de-interlace filter called “auto-adaptive” or “motion adaptive” to remove interlacing
  - To fix interlacing, revisit the master file and encode a file with an auto-adaptive de-interlace filter
File Formats
- MPEG-4 (.mp4) format is preferable (especially for DCO)
- QuickTime movie (.mov) is also acceptable
- The video file extension must be lowercase; uppercase extensions are not supported. For example, ‘.mov’ instead of ‘.MOV’.
Max File Size
- Recommended: 200 MB
Note: Innovid can accept files under 1 GB. However, we recommend videos under 200 MB.
Audio
- PCM (preferred) or AAC codec
- 192 Kbps minimum
- 16 or 24-bit only
- 48 kHz sample rate
- 2 channels only
- -24 LKFS +/- 2*
- True Peak -6 to -9 dBTP*
- Audio silence max 1.5 sec (1500 msec)
- 1 audio stream max
*Full-episode players (FEP) are more sensitive to audio requirements. Confirm that asset audio complies with the following specs if you run video on FEP inventory. Only applicable to Roku inventory. Roku audio requirements are as follows:
- LUFS/LKFS: -23 LKFS +/- 2
- True Peak: Allowed true-peak maximum is -1 dBTP
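
The specs above can be turned into a simple pre-flight check. The following Python sketch is illustrative only and is not an Innovid tool: the AssetInfo record and check_asset function are hypothetical names, and the values are assumed to have been read beforehand with a media inspection tool such as ffprobe or MediaInfo. It treats 15 Mbps as an inclusive minimum, reads "True Peak -6 to -9 dBTP" as an allowed range, and uses the general audio targets rather than the Roku FEP ones; all three are assumptions.

```python
from dataclasses import dataclass


@dataclass
class AssetInfo:
    """Hypothetical record of values read from a media inspection tool."""
    filename: str
    size_mb: float
    width: int
    height: int
    frame_rate: float
    constant_frame_rate: bool
    video_bitrate_mbps: float
    sample_rate_hz: int
    channels: int
    bit_depth: int
    audio_bitrate_kbps: float
    loudness_lkfs: float
    true_peak_dbtp: float


ALLOWED_FRAME_RATES = (23.98, 25.0, 29.97)


def check_asset(info: AssetInfo, is_prores: bool = False) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    problems = []

    # The extension comparison is case-sensitive, so '.MOV' is rejected.
    extension = info.filename.rsplit(".", 1)[-1] if "." in info.filename else ""
    if extension not in ("mp4", "mov"):
        problems.append(f"extension '.{extension}' is not lowercase .mp4 or .mov")
    if info.size_mb > 200:
        problems.append("file is larger than the recommended 200 MB")
    if (info.width, info.height) != (1920, 1080):
        problems.append("dimensions are not 1920 x 1080")

    # Small tolerance so that 23.976 is accepted as 23.98 (assumption).
    if not any(abs(info.frame_rate - rate) < 0.01 for rate in ALLOWED_FRAME_RATES):
        problems.append("frame rate is not 23.98, 25 or 29.97")
    if not info.constant_frame_rate:
        problems.append("frame rate is not constant")

    # The bitrate floor is waived for ProRes; inclusive 15 is an assumption.
    if not is_prores and info.video_bitrate_mbps < 15:
        problems.append("video bitrate is below 15 Mbps")

    if info.sample_rate_hz != 48000:
        problems.append("audio sample rate is not 48 kHz")
    if info.channels != 2:
        problems.append("audio does not have exactly 2 channels")
    if info.bit_depth not in (16, 24):
        problems.append("audio bit depth is not 16 or 24")
    if info.audio_bitrate_kbps < 192:
        problems.append("audio bitrate is below 192 Kbps")
    if not -26 <= info.loudness_lkfs <= -22:  # -24 LKFS +/- 2
        problems.append("loudness is outside -24 LKFS +/- 2")
    if not -9 <= info.true_peak_dbtp <= -6:
        problems.append("true peak is outside -6 to -9 dBTP")
    return problems


good = AssetInfo(
    filename="spot.mp4", size_mb=56.0, width=1920, height=1080,
    frame_rate=29.97, constant_frame_rate=True, video_bitrate_mbps=15.0,
    sample_rate_hz=48000, channels=2, bit_depth=16,
    audio_bitrate_kbps=1536.0, loudness_lkfs=-24.0, true_peak_dbtp=-7.0,
)
print(check_asset(good))  # []
```

A check like this only flags obvious mismatches. It does not detect black bars, slates, blended frames, or audio silence, which still need a visual and audible review.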

Terminology

Video

Term		Definition
Aspect ratio		The ratio of a video’s width to the video’s height. For example, a video with dimensions of 1920x1080 has an aspect ratio of 16:9.
Bitrate		The number of bits used per second of playback time. High-definition video requires a minimum of 15 Mbps (15 megabits per second).
Bitrate mode		The method by which a video file is encoded: either constant bitrate (CBR) or variable bitrate (VBR). Constant bitrate encoding maintains the set data rate over the entire video file. Variable bitrate encoding adjusts the data rate based on the data required by the compressor and can result in portions of the video falling under the minimum required bitrate.
Black bars		Whether or not the video file contains black bars on the sides of the frame (pillarboxing) or on the top and bottom of the frame (letterboxing).
Dimensions		The width and height of a particular video, measured in pixels. Common high-definition dimensions include 1280x720 and 1920x1080.
File format		One of many standard ways to encode information for storage. Examples of video file formats include .mov and .mp4.
File size		The amount of space a file occupies on a storage medium such as a computer hard drive. File sizes can be measured in bytes (B), kilobytes (KB), megabytes (MB), gigabytes (GB), and beyond.
Frame rate		The number of frames or images that are projected or displayed per second. Common frame rates include 23.98, 25, and 29.97 fps.
Frame rate mode		The method by which a video file is rendered: either constant frame rate (CFR) or variable frame rate (VFR). Constant frame rate encoding maintains the set frame rate over the entire video file. Variable frame rate encoding adjusts the frame rate based on the perceived level of motion in a video and can result in a portion of the video falling under the minimum required frame rate.
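
Bitrate and file size are directly related, which is worth keeping in mind when balancing the 15 Mbps minimum against the 200 MB recommendation. The short Python sketch below shows the arithmetic under simplifying assumptions: it considers the video stream only, ignores audio and container overhead, and uses decimal megabytes (1 MB = 8 megabits). The function names are illustrative.

```python
def estimated_size_mb(bitrate_mbps: float, duration_s: float) -> float:
    """Megabits per second times seconds gives megabits; divide by 8 for megabytes."""
    return bitrate_mbps * duration_s / 8


def max_duration_s(bitrate_mbps: float, size_limit_mb: float) -> float:
    """Longest duration that stays within a size limit at a constant bitrate."""
    return size_limit_mb * 8 / bitrate_mbps


# A 30-second spot encoded at a constant 15 Mbps.
print(estimated_size_mb(15, 30))  # 56.25

# Roughly how long a 15 Mbps video can run before passing 200 MB.
print(round(max_duration_s(15, 200), 1))  # 106.7
```

Because the rate is expressed in bits, not bytes, a 15 Mbps stream writes a little under 2 MB of data per second of playback.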

Audio

Term		Definition
Bitrate		The number of bits used per second of playback time. High-definition audio requires a minimum of 192 Kbps (192 kilobits per second).
Bit depth		The number of bits of information in each sample. High-definition audio requires a bit depth of either 16 or 24 bits.
Channels		A single stream of recorded sound with a location in a sound field (“left speaker” vs. “right speaker”). Audio delivered to Innovid should have exactly 2 channels.
Codec		A device or program for encoding or decoding a digital data stream. Codec is a portmanteau of coder-decoder.
dBFS		Decibels relative to full scale (dBFS) is a unit of measurement for volume levels in digital systems with a defined maximum peak level. This term defines the optimal audio volume level relative to the systems processing the audio. The recommended dBFS is between -29 dB and -25 dB.
Max Peak dB		The loudest single point of an audio file.
Sample rate		The number of samples of a sound that are taken per second to represent the event digitally. High-definition audio requires a sample rate of exactly 48 kHz.
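LKFS / LUFS		The standard loudness measurement relative to full scale. LKFS (loudness, K-weighted, relative to full scale) and LUFS (loudness units relative to full scale) are two names for the same measurement. One unit of LKFS or LUFS is equal to one dB.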
Silence		The perceived absence of audio for longer than 1.5 sec (1500 msec).
Streams		Streams are used as the audio output source. Examples include music tracks, voiceovers, and sound effects.
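
The bitrate, bit depth, sample rate, and channel definitions above are linked for uncompressed PCM audio: the bitrate is the product of the sample rate, the bit depth, and the channel count. The Python sketch below works through that arithmetic for the two bit depths in the spec. The function name is illustrative, and the formula applies to uncompressed PCM only; AAC is compressed, so its bitrate is chosen at encode time.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Uncompressed PCM bitrate in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


# 48 kHz stereo at the two bit depths allowed by the spec.
print(pcm_bitrate_kbps(48000, 16, 2))  # 1536.0
print(pcm_bitrate_kbps(48000, 24, 2))  # 2304.0
```

Both results are well above 192 Kbps, so in practice the minimum bitrate is the figure to watch when delivering AAC.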
