Matroska


What is Matroska?

Matroska is an audiovisual container format that has been in use since 2002 and is widespread on the internet. It is the foundation of Google’s WebM format, a file format optimized for web streaming. Features such as subtitle management, chaptering, extensible structured metadata, file attachments, and broad support for audiovisual encodings have driven its adoption in a number of media communities. Matroska is also supported in many home media environments, such as Xbox and PlayStation, and works “out of the box” in the Windows 10 operating system.


Why Matroska?

  • Active use since 2002
  • Widespread adoption
  • Foundation of Google's WebM (web-streaming video)
  • Subtitle management
  • Chaptering abilities
  • Extensible structured metadata
  • File attachment capabilities
  • Broad support of audiovisual encodings

What about EBML?

Yeah! Matroska is built on, and depends on, the EBML specification.

Learn more about EBML here


What is EBML?

  • Extensible Binary Meta Language (EBML is a binary format modeled on XML, often described as "binary XML")
  • An EBML Schema defines an EBML Document like an XML Schema defines an XML Document
  • Matroska and WebM are EBML Document Types
  • Storage is based on a structure of Element ID, Element Data Size, and Element Data
  • Unlike XML, an EBML Document requires an EBML Schema to be interpreted semantically
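
The Element ID, Element Data Size, and Element Data layout can be sketched in a few lines of Python. This is a minimal illustration, not a full parser: the function names are made up for this example, the sample bytes are hand-built, and the reserved "unknown size" value is not handled. Both the ID and the size are variable-length integers whose byte length is signalled by the position of the first set bit in the first byte.

```python
def vint_length(first_byte):
    """Return the byte length (1 to 8) of a variable-length integer."""
    if first_byte == 0:
        raise ValueError("lengths above 8 bytes are not handled in this sketch")
    length = 1
    mask = 0x80
    while not first_byte & mask:  # count leading zero bits
        mask >>= 1
        length += 1
    return length


def read_element(data, pos=0):
    """Parse one element at `pos`; return (element_id, payload, next_pos)."""
    # Element ID: the length marker bit is kept as part of the ID.
    id_len = vint_length(data[pos])
    element_id = int.from_bytes(data[pos:pos + id_len], "big")
    pos += id_len

    # Element Data Size: the marker bit is cleared to get the value.
    size_len = vint_length(data[pos])
    size = int.from_bytes(data[pos:pos + size_len], "big")
    size &= (1 << (7 * size_len)) - 1
    pos += size_len

    # Element Data: the next `size` bytes.
    payload = data[pos:pos + size]
    return element_id, payload, pos + size


# Hand-built sample: an EBML header holding only a DocType element.
sample = bytes.fromhex("1a45dfa38b4282886d6174726f736b61")
header_id, header_payload, _ = read_element(sample)
child_id, child_payload, _ = read_element(header_payload)
print(hex(header_id), hex(child_id), child_payload)
```

Note that the bytes alone only yield IDs and raw payloads. Knowing that 0x4282 means DocType and holds a string is what the EBML Schema provides.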

Did you say Binary XML?

The benefits of binary XML are:

  • less verbose
  • quicker parsing
  • compact

Negatives:

  • can't use a regular text editor
  • must be decoded to understand

(Partial) Example

<EBML>
<EBMLVersion>1</EBMLVersion>
<EBMLReadVersion>1</EBMLReadVersion>
<EBMLMaxIDLength>4</EBMLMaxIDLength>
<EBMLMaxSizeLength>8</EBMLMaxSizeLength>
<DocType>matroska</DocType>
<DocTypeVersion>4</DocTypeVersion>
<DocTypeReadVersion>2</DocTypeReadVersion>
</EBML>
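
To show how compact the binary form is, here is a hedged Python sketch that serializes the header above into EBML bytes. The helper names (`encode_size`, `element`, `uint`) are invented for this example. The Element IDs are the ones assigned to these header elements in the EBML specification, and integers are written in the fewest bytes possible, which is an assumption about the writer rather than a requirement.

```python
def encode_size(size):
    """Encode a data size as the shortest variable-length integer."""
    for length in range(1, 9):
        # The all-ones value at each length is reserved for "unknown size".
        if size < (1 << (7 * length)) - 1:
            return ((1 << (7 * length)) | size).to_bytes(length, "big")
    raise ValueError("size does not fit in 8 bytes")


def element(element_id, payload):
    """Serialize one element as Element ID + Element Data Size + Element Data."""
    id_len = (element_id.bit_length() + 7) // 8
    return element_id.to_bytes(id_len, "big") + encode_size(len(payload)) + payload


def uint(value):
    """Encode an unsigned integer big-endian in as few bytes as possible."""
    return value.to_bytes(max(1, (value.bit_length() + 7) // 8), "big")


header = element(0x1A45DFA3, b"".join([      # EBML
    element(0x4286, uint(1)),                # EBMLVersion
    element(0x42F7, uint(1)),                # EBMLReadVersion
    element(0x42F2, uint(4)),                # EBMLMaxIDLength
    element(0x42F3, uint(8)),                # EBMLMaxSizeLength
    element(0x4282, b"matroska"),            # DocType
    element(0x4287, uint(4)),                # DocTypeVersion
    element(0x4285, uint(2)),                # DocTypeReadVersion
]))
print(len(header), header.hex())
```

Under these assumptions the whole header comes to 40 bytes, compared with well over 200 characters for the XML-style rendering.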

libebml

libebml is a C++ library for parsing EBML files

More information: https://matroska-org.github.io/libebml/
Codebase: https://github.com/Matroska-Org/libebml


Matroska Structure

The Matroska wrapper is organized into top-level sectional elements for the storage of attachments, chapter information, metadata and tags, indexes, track descriptions, and encoded audiovisual data.
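
As a rough map, the sections listed above correspond to the following top-level elements inside a Matroska Segment. The IDs and names come from the Matroska specification. The short role descriptions and the `describe` helper are illustrative wording for this sketch, not official definitions.

```python
# Top-level elements found inside a Matroska Segment (ID 0x18538067).
TOP_LEVEL_ELEMENTS = {
    0x114D9B74: ("SeekHead", "index pointing at the other top-level elements"),
    0x1549A966: ("Info", "general information about the segment"),
    0x1654AE6B: ("Tracks", "track descriptions"),
    0x1F43B675: ("Cluster", "encoded audiovisual data"),
    0x1C53BB6B: ("Cues", "seek index into the clusters"),
    0x1941A469: ("Attachments", "attached files"),
    0x1043A770: ("Chapters", "chapter information"),
    0x1254C367: ("Tags", "metadata and tags"),
}


def describe(element_id):
    """Return a readable label for a top-level Element ID."""
    name, role = TOP_LEVEL_ELEMENTS.get(element_id, ("Unknown", "not a listed element"))
    return f"{name} (0x{element_id:08X}): {role}"


for element_id in TOP_LEVEL_ELEMENTS:
    print(describe(element_id))
```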


Why Matroska?: Checksum Elements

The Matroska wrapper is organized into sectional elements, and each element may carry its own checksum. This is one of the main reasons the format is considered well suited to digital preservation: specific sections of a file can be checked for errors, so an error can be localized to one region rather than hunted for across the entire file. For example, a checksum mismatch in the descriptive metadata section can be assessed and corrected without performing quality control and analysis on the file’s content streams. Matroska also supports embedded technical and descriptive metadata, so contextual information about the file can travel inside the file itself rather than only alongside it in a separate document.
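
A per-section check can be sketched as follows. The sketch assumes the EBML convention that a CRC-32 element (ID 0xBF, four bytes of data stored little-endian) comes first inside the element it protects and covers the remaining bytes of that element. It also assumes Python's `zlib.crc32` matches the CRC variant EBML uses. The function names and the sample payload are hypothetical.

```python
import zlib

CRC32_ID = b"\xbf"    # Element ID of the EBML CRC-32 element
CRC32_SIZE = b"\x84"  # Element Data Size: 4 bytes


def add_crc32(children):
    """Prefix a section's child bytes with a CRC-32 element covering them."""
    crc = zlib.crc32(children).to_bytes(4, "little")
    return CRC32_ID + CRC32_SIZE + crc + children


def check_crc32(payload):
    """Return True or False for a protected section, None if it has no CRC-32."""
    if payload[:2] != CRC32_ID + CRC32_SIZE:
        return None
    stored = int.from_bytes(payload[2:6], "little")
    return stored == zlib.crc32(payload[6:])


# Stand-in bytes for the children of a metadata section.
section = add_crc32(b"descriptive metadata bytes")
damaged = section[:-1] + bytes([section[-1] ^ 0x01])  # flip one bit

print(check_crc32(section), check_crc32(damaged), check_crc32(b"no checksum"))
```

Because the check runs on one section's bytes, damage in the metadata is reported without reading or decoding the audiovisual streams.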


Why Matroska?: Metadata

In addition to the robust checksum features, Matroska can carry a significant amount of self-description.

Metadata is held in Matroska as tags. See the Tagging documentation for more details.


Why Matroska?: Chapters

Chapters can replicate the structure of a DVD or CD, or support more complex arrangements.


Why Matroska?: Subtitling

Matroska can have subtitles embedded into the file.


Why Matroska?: Attachments

Files can be added to Matroska as attachments. This capability is mostly used to bundle the fonts that subtitles need, but it is not limited to that purpose.


Why Matroska?: Format support

Matroska is a wrapper that accepts a wide variety of audio and video encoding formats.


WebM

WebM is a royalty-free media file format intended for use on the web. Development is sponsored by Google, and the format is open under a BSD-style license. Its file structure is based on a profile of Matroska.


MKVToolNix

MKVToolNix is a suite of software tools created to work with Matroska files. It was designed and is maintained by Moritz Bunkus, a core developer of Matroska and EBML. There is a GUI and the following command-line tools:

mkvmerge merges multimedia streams into a Matroska file.
mkvinfo lists all elements contained in a Matroska file.
mkvextract extracts specific parts from a Matroska file to other formats.
mkvpropedit analyzes and modifies some properties of an existing Matroska file.
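
A small Python wrapper gives a feel for how the four tools are invoked. This is a sketch under assumptions: the helper names and file names are hypothetical, the `mkvextract` form shown is the argument order used by recent MKVToolNix releases (file name before the mode), and `run` skips the call when the tool is not installed.

```python
import shutil
import subprocess


def mkvmerge_cmd(output, *inputs):
    """Merge one or more input files into a Matroska file."""
    return ["mkvmerge", "-o", output, *inputs]


def mkvinfo_cmd(path):
    """List the elements contained in a Matroska file."""
    return ["mkvinfo", path]


def mkvextract_tracks_cmd(path, track_id, dest):
    """Extract one track to a separate file."""
    return ["mkvextract", path, "tracks", f"{track_id}:{dest}"]


def mkvpropedit_title_cmd(path, title):
    """Set the segment title without remuxing the file."""
    return ["mkvpropedit", path, "--edit", "info", "--set", f"title={title}"]


def run(cmd):
    """Run a command if its tool is on PATH; return stdout, or None if absent."""
    if shutil.which(cmd[0]) is None:
        print("skipping, not installed:", cmd[0])
        return None
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


# Build (but do not run) a typical sequence on hypothetical files.
steps = [
    mkvmerge_cmd("out.mkv", "video.mp4", "subs.srt"),
    mkvinfo_cmd("out.mkv"),
    mkvextract_tracks_cmd("out.mkv", 0, "video.h264"),
    mkvpropedit_title_cmd("out.mkv", "Oral history, tape 1"),
]
for step in steps:
    print(" ".join(step))
```

Passing each command as a list rather than a shell string avoids quoting problems with titles and file names that contain spaces.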


Actively in development!

Thanks to the IETF CELLAR working group, EBML and Matroska are actively being standardized. The work is being done on the CELLAR listserv and on GitHub.
