Media Tracks

A WebVTT file consists of a sequence of cues, where each cue is a text segment associated with a time interval. Beyond captioning and subtitling, WebVTT can carry time-aligned metadata, typically delivered as name-value pairs in cues. WebVTT can also deliver chapters, which help with contextual navigation around an audio or video file. Finally, WebVTT can deliver text video descriptions: text that describes the visual content of a time interval and can be synthesized to speech to help vision-impaired users understand context.

Example:

WEBVTT

00:11.000 --> 00:13.000
<v Roger Bingham>We are in New York City

00:13.000 --> 00:16.000
<v Roger Bingham>We’re actually at the Lucern Hotel, just down the street

00:30.500 --> 00:32.500 align:left size:50%
<v Neil deGrasse Tyson>Didn’t we talk about enough in that conversation?
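To make the structure of the example concrete, here is a minimal Python sketch that splits a WebVTT file into cues. It is an illustration only, not a full parser and not how MyMedia processes files: the names Cue, to_seconds, parse_cues, and speaker are hypothetical, and the sketch ignores NOTE, STYLE, and REGION blocks beyond skipping any block that has no timing line.

```python
import re
from dataclasses import dataclass, field

# Matches "mm:ss.ttt" or "hh:mm:ss.ttt" timestamps.
TIMESTAMP = re.compile(r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})")
# Matches a leading voice tag such as "<v Roger Bingham>".
VOICE = re.compile(r"<v(?:\.[^\s>]+)?\s+([^>]+)>")


@dataclass
class Cue:
    start: float   # seconds
    end: float     # seconds
    text: str      # cue payload, may contain tags such as <v Name>
    settings: dict = field(default_factory=dict)


def to_seconds(stamp: str) -> float:
    match = TIMESTAMP.fullmatch(stamp)
    if match is None:
        raise ValueError(f"bad timestamp: {stamp!r}")
    hours, minutes, seconds, millis = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_cues(vtt: str) -> list[Cue]:
    cues = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", vtt.strip()):
        lines = block.splitlines()
        # Blocks without a timing line (the WEBVTT header, notes) are skipped.
        timing = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if timing is None:
            continue
        start, _, rest = lines[timing].partition("-->")
        end, *tokens = rest.split()
        # Cue settings such as "align:left" follow the end timestamp.
        settings = dict(token.split(":", 1) for token in tokens if ":" in token)
        text = "\n".join(lines[timing + 1:])
        cues.append(Cue(to_seconds(start.strip()), to_seconds(end), text, settings))
    return cues


def speaker(cue: Cue):
    """Return the name in a leading <v ...> tag, or None if there is none."""
    match = VOICE.match(cue.text)
    return match.group(1) if match else None


SAMPLE = """WEBVTT

00:11.000 --> 00:13.000
<v Roger Bingham>We are in New York City

00:30.500 --> 00:32.500 align:left size:50%
<v Neil deGrasse Tyson>Didn’t we talk about enough in that conversation?
"""

for cue in parse_cues(SAMPLE):
    print(cue.start, cue.end, speaker(cue), cue.settings)
```

Each cue ends up with a start and end time in seconds, its text payload, and any cue settings (such as align and size) as a dictionary.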

The following video provides an overview of the supported text track types that MyMedia offers at present.

Note: If no specific transcript is uploaded, we will fall back to using the captions (if present) as the transcript.
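The fallback rule in the note can be sketched in a few lines of Python. This is only an illustration of the rule as stated: the function name transcript_source and the dictionary of uploaded tracks (track kind mapped to a file name) are assumptions, not part of MyMedia.

```python
from typing import Optional


def transcript_source(tracks: dict[str, str]) -> Optional[str]:
    """Return the track kind to use as the transcript, or None.

    `tracks` maps a track kind (for example "captions") to an uploaded
    file name. Both the mapping and the kind names are hypothetical.
    """
    # Prefer a dedicated transcript; otherwise fall back to captions.
    for kind in ("transcript", "captions"):
        if tracks.get(kind):
            return kind
    return None


print(transcript_source({"captions": "talk.en.vtt"}))
```

With only a captions file uploaded, the captions track is chosen; with neither, there is no transcript to show.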

Subtitles

Translations of the dialogue in the video for when audio is available but not understood. Subtitles are shown over the video.

Captions

Transcription of the dialogue, sound effects, musical cues, and other audio information for viewers who are deaf or hard of hearing, or for when the video is muted. Captions are also shown over the video.

Transcript

Transcription of the dialogue, sound effects, musical cues, and other audio information for viewers who are deaf or hard of hearing, or for when the video is muted. Transcripts are shown beside the video with live interactivity: the current cue is highlighted as the video progresses.
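Highlighting the current cue comes down to finding which cue, if any, covers the current playback time. The sketch below shows one way to do that with a binary search. It assumes cues are (start, end, text) tuples sorted by start time and not overlapping; the function name active_cue_index is hypothetical and this is not necessarily how MyMedia does it.

```python
from bisect import bisect_right


def active_cue_index(cues, current_time):
    """Return the index of the cue covering current_time, or None.

    `cues` is a list of (start, end, text) tuples sorted by start time
    and assumed not to overlap. A cue is active for start <= t < end.
    """
    starts = [start for start, _, _ in cues]
    # Last cue whose start is at or before the current time.
    index = bisect_right(starts, current_time) - 1
    if index >= 0 and current_time < cues[index][1]:
        return index
    return None


cues = [
    (11.0, 13.0, "We are in New York City"),
    (13.0, 16.0, "We're actually at the Lucern Hotel, just down the street"),
    (30.5, 32.5, "Didn't we talk about enough in that conversation?"),
]
print(active_cue_index(cues, 12.0))
```

Between cues (for example at 20 seconds in this list) the function returns None, so nothing is highlighted.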

Chapters

Chapter titles that are used to create navigation within the video. Typically, these are in the form of a list of chapters that the viewer can use to navigate the video.
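A chapters track is an ordinary WebVTT file in which each cue's text is a chapter title. As a rough sketch, the Python below builds such a file from a list of start times and titles. The function names, the chapter titles, and the choice to end each chapter where the next one begins are all assumptions made for illustration.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, hh:mm:ss.ttt."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def build_chapters(chapters, duration):
    """Build WebVTT text from (start_seconds, title) pairs sorted by start.

    Each chapter ends where the next begins; the last one ends at
    `duration` (the media length in seconds).
    """
    lines = ["WEBVTT", ""]
    for i, (start, title) in enumerate(chapters):
        end = chapters[i + 1][0] if i + 1 < len(chapters) else duration
        lines += [f"{format_timestamp(start)} --> {format_timestamp(end)}", title, ""]
    return "\n".join(lines)


# Hypothetical chapter list for a five-minute video.
print(build_chapters([(0, "Introduction"), (90.5, "Interview")], duration=300))
```

The resulting text can be saved with a .vtt extension and uploaded as a chapters track.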

Descriptions

Text descriptions of the action in the content, for when the video portion isn’t available or when the viewer is blind or not using a screen. Descriptions are read by a screen reader or turned into a separate audio track.