Understanding Innovid Video Specs

Description: This article explains the nuances of video and audio specifications to ensure your video files meet the necessary requirements to run campaigns seamlessly. It outlines Innovid’s video and audio specification requirements and relevant definitions.

Video Specs

The following Innovid pre-roll video specs are designed to ensure that every ad is delivered at its highest possible quality. By requiring that assets meet the highest-level industry specs, Innovid can work from a high-resolution base asset and encode it for every publisher on your plan.

Note: If we receive a video that does not meet the following minimum spec requirements, we will need written approval from the client to proceed with the video as is. This is not required for self-service clients, as consent is implied when the client moves forward with the trafficking process.

The following specs are based on Hulu guidelines and are subject to change at any time at Hulu’s discretion. The most up-to-date specs can be found directly on Hulu’s site.

  • 1920 x 1080
  • 16:9 display aspect ratio
  • No black bars or intro/outro slates
  • Constant Bitrate (CBR) >15 Mbps. This requirement is waived when delivering the ProRes codec, as it is built to be variable
  • Main Profile @ Main Level (MP@ML)
Frame Rate
  • 23.98, 25 or 29.97
  • Constant Frame Rate only
  • Remove any pull-down added for the broadcast
  • Innovid does not detect duplicate or blended frames:
    • To fix blended or duplicate frames, apply an “auto-adaptive” or “motion adaptive” de-interlace filter to remove interlacing
    • To fix interlacing, return to the master file and encode a new file with an auto-adaptive de-interlace filter
File Formats
  • MPEG-4 (.mp4) format is preferable (especially for DCO)
  • QuickTime movie (.mov) is also acceptable
Max File Size
  • Recommended 200 MB
    Note: Innovid can accept files under 1 GB. However, we recommend videos under 200 MB.
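
As an illustration only, the video requirements above could be expressed as an automated pre-flight check. The following is a minimal Python sketch, not an Innovid tool: the function name check_video and all dictionary keys are hypothetical, and the metadata is assumed to have been extracted beforehand with a media inspection tool of your choice. It also assumes that 15 Mbps is the lowest passing bitrate and that 1 GB means 1000 MB.

```python
# Hypothetical pre-flight check for the video specs listed above.
ALLOWED_FRAME_RATES = (23.98, 25.0, 29.97)

def check_video(meta):
    """Return (errors, warnings) for a dict of video properties."""
    errors, warnings = [], []
    if (meta["width"], meta["height"]) != (1920, 1080):
        errors.append("dimensions must be 1920 x 1080")
    # Small tolerance so that 23.976 is accepted as 23.98.
    if not any(abs(meta["frame_rate"] - fps) < 0.01 for fps in ALLOWED_FRAME_RATES):
        errors.append("frame rate must be 23.98, 25 or 29.97")
    if meta["frame_rate_mode"] != "CFR":
        errors.append("frame rate must be constant")
    # The CBR / bitrate rule is waived for ProRes.
    if meta["codec"] != "prores":
        if meta["bitrate_mode"] != "CBR":
            errors.append("bitrate mode must be CBR")
        if meta["bitrate_mbps"] < 15:  # assumption: 15 itself passes
            errors.append("bitrate must be at least 15 Mbps")
    if meta["container"] not in ("mp4", "mov"):
        errors.append("file format must be .mp4 or .mov")
    if meta["size_mb"] >= 1000:  # assumption: 1 GB = 1000 MB
        errors.append("file must be under 1 GB")
    elif meta["size_mb"] > 200:
        warnings.append("file is larger than the recommended 200 MB")
    return errors, warnings

example = {
    "width": 1920, "height": 1080, "frame_rate": 23.976,
    "frame_rate_mode": "CFR", "codec": "h264", "bitrate_mode": "CBR",
    "bitrate_mbps": 20, "container": "mp4", "size_mb": 250,
}
errors, warnings = check_video(example)
print(errors)
print(warnings)
```

In this example the file passes every hard requirement but produces one warning, because 250 MB is over the recommended size while still under the 1 GB limit.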
Audio Specs

  • PCM (preferred) or AAC codec
  • 192 Kbps minimum
  • 16 or 24-bit only
  • 48 kHz sample rate
  • 2 channels only
  • -24 LKFS +/- 2*
  • True Peak -6 to -9 dBTP*
  • Audio silence max 1.5 sec (1500 msec)
  • 1 audio stream max

*Full-episode players (FEP) are more sensitive to audio requirements. If you run video on FEP inventory, confirm that the asset audio complies with the following specs. These requirements apply only to Roku inventory:
LUFS/LKFS: -23 LKFS +/- 2
True Peak: The maximum allowed true peak is -1 dBTP
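
The audio requirements, including the Roku FEP variant, can be sketched as a similar check. This is a minimal Python illustration under stated assumptions: the function name check_audio, the roku_fep flag, and the dictionary keys are all hypothetical, the measurements are assumed to come from a loudness meter or inspection tool you already use, and "-6 to -9 dBTP" is read as a range from -9 up to -6.

```python
# Hypothetical check for the audio specs listed above.
def check_audio(meta, roku_fep=False):
    """Return a list of problems found in a dict of audio properties."""
    problems = []
    if meta["codec"] not in ("pcm", "aac"):
        problems.append("codec must be PCM or AAC")
    if meta["bitrate_kbps"] < 192:
        problems.append("bitrate must be at least 192 Kbps")
    if meta["bit_depth"] not in (16, 24):
        problems.append("bit depth must be 16 or 24")
    if meta["sample_rate_hz"] != 48000:
        problems.append("sample rate must be 48 kHz")
    if meta["channels"] != 2:
        problems.append("audio must have exactly 2 channels")
    if meta["streams"] > 1:
        problems.append("only 1 audio stream is allowed")
    if meta["max_silence_sec"] > 1.5:
        problems.append("silence must not exceed 1.5 sec")
    # Loudness target is -24 LKFS in general, -23 LKFS for Roku FEP, both +/- 2.
    target = -23 if roku_fep else -24
    if abs(meta["loudness_lkfs"] - target) > 2:
        problems.append(f"loudness must be {target} LKFS +/- 2")
    if roku_fep:
        if meta["true_peak_dbtp"] > -1:
            problems.append("true peak must not exceed -1 dBTP")
    elif not -9 <= meta["true_peak_dbtp"] <= -6:  # assumption: inclusive range
        problems.append("true peak must be between -9 and -6 dBTP")
    return problems

audio = {
    "codec": "pcm", "bitrate_kbps": 1536, "bit_depth": 16,
    "sample_rate_hz": 48000, "channels": 2, "streams": 1,
    "max_silence_sec": 0.5, "loudness_lkfs": -24.0, "true_peak_dbtp": -7.0,
}
print(check_audio(audio))
print(check_audio(audio, roku_fep=True))
```

Passing roku_fep=True switches only the loudness target and the true-peak rule; the remaining checks are shared.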




Video Definitions

Term | Definition
Aspect ratio | The ratio of a video’s width to its height. For example, a video with dimensions of 1920x1080 has an aspect ratio of 16:9.
Bitrate | The number of bits used per second of playback time. High-definition video requires a minimum of 15 Mbps (15 megabits per second).
Bitrate mode | The method by which a video file is encoded; either constant bitrate (CBR) or variable bitrate (VBR). Constant bitrate encoding holds the set data rate across the entire video file. Variable bitrate encoding adjusts the data rate based on the data required by the compressor and can result in portions of the video falling under the minimum required bitrate.
Black bars | Whether or not the video file contains black bars on the sides of the frame (pillarboxing) or on the top and bottom of the frame (letterboxing).
Dimensions | The width and height of a particular video, measured in pixels. Common high-definition dimensions include 1280x720 and 1920x1080.
File format | A standard way of encoding information for storage. Examples of video file formats include .mov and .mp4.
File size | The amount of space a file occupies on a storage medium such as a computer hard drive. File sizes can be measured in bytes (B), kilobytes (KB), megabytes (MB), gigabytes (GB), and beyond.
Frame rate | The number of frames or images that are projected or displayed per second. Common frame rates are 23.98, 25, and 29.97 fps.
Frame rate mode | The method by which a video file is rendered; either constant frame rate (CFR) or variable frame rate (VFR). Constant frame rate encoding holds the set frame rate across the entire video file. Variable frame rate encoding adjusts the frame rate based on the perceived level of motion in a video and can result in a portion of the video falling under the minimum required frame rate.
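
Two of the definitions above lend themselves to quick arithmetic: aspect ratio is the width and height reduced by their greatest common divisor, and file size follows from bitrate multiplied by duration, divided by 8 to convert bits to bytes. The Python sketch below illustrates both. The function names are hypothetical, and the size estimate assumes 1 MB = 1,000,000 bytes and ignores audio and container overhead.

```python
from math import gcd

def aspect_ratio(width, height):
    """Reduce pixel dimensions to their simplest width:height ratio."""
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"

def estimated_size_mb(bitrate_mbps, duration_sec):
    """Megabits per second x seconds / 8 bits per byte = megabytes."""
    return bitrate_mbps * duration_sec / 8

print(aspect_ratio(1920, 1080))   # 16:9
print(aspect_ratio(1280, 720))    # 16:9
print(estimated_size_mb(15, 30))  # 56.25
```

By this estimate, a 30-second spot at 15 Mbps is roughly 56 MB of video data, comfortably under the recommended 200 MB.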



Audio Definitions

Term | Definition
Bitrate | The number of bits used per second of playback time. High-definition audio requires a minimum of 192 kbps (192 kilobits per second).
Bit depth | The number of bits of information in each sample. High-definition audio requires a bit depth of either 16 or 24 bits.
Channels | A single stream of recorded sound with a location in a sound field (“left speaker” vs. “right speaker”). Audio delivered under these specs should have exactly 2 channels.
Codec | A device or program for encoding or decoding a digital data stream. Codec is a portmanteau of coder-decoder.
dBFS | Decibels relative to full scale (dBFS) is a unit of measurement for volume levels in digital systems with a defined maximum peak level. This term defines the optimal audio volume level relative to the systems processing the audio. The recommended dBFS is between -29 dB and -25 dB.
Max Peak dB | The loudest single point of an audio file.
Sample rate | The number of samples of a sound that are taken per second to represent the event digitally. High-definition audio requires a sample rate of exactly 48 kHz.
LKFS / LUFS | The standard loudness measurement relative to full scale. One unit of LKFS or LUFS is equal to one dB.
Silence | The perceived absence of audio for longer than 1.5 sec (1500 msec).
Streams | Streams are used as the audio output source. Examples include music tracks, voiceovers, and sound effects.
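
The sample rate, bit depth, and channel definitions combine into the bitrate of uncompressed PCM audio: sample rate x bit depth x channels. The short Python sketch below works through that arithmetic for the values in the audio specs. The function name is hypothetical, and the formula applies to uncompressed PCM only, not to AAC, which is compressed.

```python
def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels):
    """Uncompressed PCM bitrate in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000

print(pcm_bitrate_kbps(48000, 16, 2))  # 1536.0
print(pcm_bitrate_kbps(48000, 24, 2))  # 2304.0
```

Both results are well above the 192 kbps minimum, so in practice that minimum matters mainly when delivering AAC.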