This page intentionally left blank. ⬇️, ➡️, or spacebar 🛰 to start slidedeck. --- class: middle, center .center[![ffmpeg](/img/ffmpeg.png)] # FFmpeg for preservation --- # Using FFmpeg for preservation FFmpeg is a powerful audiovisual manipulation tool in many ways, and is also great for digital preservation. - gives a lot of support for fixity creation and checksum validation. - can generate a LOT of metadata - helps read and write closed captioning data --- # Checksums "By producing checksums on a more granular level, such as per frame, it is more feasible to assess the extent or location of digital change in the event of a checksum mismatch. In recalling the analogy of a whole-file checksum for audiovisual data as a neighborhood file alarm, the additional generation of more granular checksums enables tracking of digital change or alarms with greater precision." *Dave Rice, [Reconsidering the Checksum for Audiovisual Preservation](https://dericed.com/papers/reconsidering-the-checksum-for-audiovisual-preservation/)* --- # framehash A list of supported cryptographic hashing functions: MD5, murmur3, RIPEMD128, RIPEMD160, RIPEMD256, RIPEMD320, SHA160, SHA224, SHA256 (default), SHA512/224, SHA512/256, SHA384, SHA512, CRC32 and adler32 That's a lot! [documentation](https://ffmpeg.org/ffmpeg-formats.html#framehash-1) --- # framemd5 framemd5: A variant of the above `framehash` muxer, but defaults to md5. `-f framemd5 -` is the same as `-f framehash -hash md5` [framemd5 documentation](https://ffmpeg.org/ffmpeg-formats.html#framemd5-1) \* *Muxers are configured elements in FFmpeg which allow writing multimedia streams to a particular type of file.* --- # framecrc framecrc: "This muxer computes and prints the Adler-32 CRC for each audio and video packet. By default audio frames are converted to signed 16-bit raw audio and video frames to raw video before computing the CRC." [framecrc documentation](https://ffmpeg.org/ffmpeg-formats.html#framecrc-1) \* *Muxers are configured elements in FFmpeg which allow writing multimedia streams to a particular type of file.* --- # Checksums (video samples) `ffmpeg -i input_file -f framemd5 -an output_file` This will create an MD5 checksum per video frame. **`ffmpeg`** starts the command **`-i input_file`** path, name and extension of the input file **`-f framemd5`** library used to calculate the MD5 checksums **`-an`** ignores the audio stream (audio no) **`output_file`** path, name and extension of the output file --- # Checksums (audio samples) `ffmpeg -i input_file -af "asetnsamples=n=48000" -f framemd5 -vn output_file` This will create an MD5 checksum for each group of 48000 audio samples. The number of samples per group can be set arbitrarily, but it's good practice to match the sample rate of the media file (so you will get one checksum per second). Examples for other sample rates: * 44.1 kHz: "asetnsamples=n=44100" * 96 kHz: "asetnsamples=n=96000" --- # Checksums (audio samples) 2 Note: This filter transcodes audio to 16 bit PCM by default. The generated framemd5s will represent this value. Validating these framemd5s will require using the same default settings. **`ffmpeg`** starts the command **`-i input_file`** path, name and extension of the input file **`-f framemd5`** library used to calculate the MD5 checksums **`-vn`** ignores the video stream (video no) **`output_file`** path, name and extension of the output file --- # Check FFV1 Fixity `ffmpeg -report -i input_file -f null -` This decodes your video and displays any CRC checksum mismatches. These errors will display in your terminal like this: [ffv1 @ 0x1b04660] CRC mismatch 350FBD8A!at 0.272000 seconds Frame CRCs are enabled by default in FFV1 Version 3. **`ffmpeg`** starts the command **`-report`** Dump full command line and console output to a file named ffmpeg-YYYYMMDD-HHMMSS.log in the current directory. It also implies -loglevel verbose. **`-i input_file`** path, name and extension of the input file **`-f null`** Video is decoded with the null muxer. This allows video decoding without creating an output file. **`-`** FFmpeg syntax requires a specified output, and - is just a place holder. No file is actually created. --- # Metadata (reading)! Archivists love XML. `ffprobe` helps give you a lot. `ffprobe -i input_file -show_format -show_streams -show_data`... etc etc `-print_format xml` will export the data in XML when possible. But you can also get `json` or plain text (`flat`). See the [FFprobe](/presentations/ffprobe.html) presentation for a lot more details about metadata extraction with ffprobe. --- # Metadata (writing)! Adding metadata is also easy with FFmpeg when writing or copying a file. Just add this flag: `-metadata KEY="VALUE"` for each metadata tag. You can see these tags using `ffprobe`. Try it! --- # Read/Extract EIA-608 (Line 21) Closed Captioning `ffprobe -f lavfi -i movie=input_file,readeia608 -show_entries frame=pkt_pts_time:frame_tags=lavfi.readeia608.0.line,lavfi.readeia608.0.cc,lavfi.readeia608.1.line,lavfi.readeia608.1.cc -of csv > input_file.csv` --- # Additional Resources - [ffmprovisr](https://amiaopensource.github.io/ffmprovisr/) - [FFmpeg Cookbook for Archivists](https://avpres.net/FFmpeg/) - [framemd5 Intro and HowTo](https://trac.ffmpeg.org/wiki/framemd5%20Intro%20and%20HowTo) - [Reconsidering the Checksum for Audiovisual Preservation](https://dericed.com/papers/reconsidering-the-checksum-for-audiovisual-preservation/) --- # Learning more - [FFmpeg](/presentations/ffmpeg.html) - [FFmpeg & Art](/presentations/ffmpeg-art.html) - [FFmpeg & Preservation](/presentations/ffmpeg-preservation.html) - [FFplay](/presentations/ffplay.html) - [FFprobe](/presentations/ffprobe.html) [Home](/)