
geek stuff from archive.org — making Ogg Theora videos

Fast, reliable way to encode Theora Ogg videos using ffmpeg, libtheora, and liboggz

archive.org has started making Ogg Theora derivatives of movie files: for each movie file, we create an Ogg Theora video output. after trying a bunch of tools on a good corpus of wide-ranging videos, i found a neat way to make the Archive derivatives.

 High Level:

  • use ffmpeg to turn any video to “rawvideo”. 
  • pipe its output to *another* ffmpeg to turn the video to “yuv4mpegpipe”.
  • pipe its output to the libtheora tool. 
  • for videos with audio, use ffmpeg to create a vorbis audio .ogg file. 
  • add tasty metadata (with liboggz utils). 
  • combine the video and audio ogg files to an .ogv output!   
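
The "for videos with audio" step needs some way to tell whether a source has an audio track at all. Here is a minimal sketch of one way to check, assuming a newer ffmpeg build that ships the ffprobe tool (an older build may not have it). has_audio is a hypothetical helper name for illustration, not part of the archive.org deriver:

```shell
#!/bin/sh
# has_audio FILE: exit 0 if ffprobe lists at least one audio stream, else non-zero.
# Assumption: ffprobe (from a newer ffmpeg build) is on PATH.
# has_audio is a hypothetical helper name used only for this sketch.
has_audio() {
  # Print "audio" once per audio stream; print nothing if there are none
  # (or if ffprobe cannot read the file).
  streams=$(ffprobe -v error -select_streams a \
    -show_entries stream=codec_type -of csv=p=0 "$1" 2>/dev/null)
  [ -n "$streams" ]
}
```

Something like `has_audio CapeCodMarsh.avi && echo "make the vorbis .ogg"` could then gate the audio step.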

Detailed example:  

 
/usr/bin/ffmpeg -v 0 -an -deinterlace  -s 400x300 -r 20.00 -i CapeCodMarsh.avi -vcodec rawvideo -pix_fmt yuv420p -f rawvideo -  | \
/usr/bin/ffmpeg -v 0 -an -f rawvideo   -s 400x300 -r 20.00 -i - -f yuv4mpegpipe -  | \
/petabox/deriver/libtheora-1.0/lt-encoder_example --video-rate-target 512k - -o tmp.ogv;
 
/usr/bin/ffmpeg -y -i CapeCodMarsh.avi -vn -acodec vorbis -ac 2 -ab 128k -ar 44100 audio.ogg;
/petabox/sw/bin/oggz-comment audio.ogg -o audio2.ogg TITLE="Cape Cod Marsh" ARTIST="Tracey Jaquith" LICENSE="http://creativecommons.org/licenses/publicdomain/" DATE="2004" ORGANIZATION="Dumb Bunny Productions"  LOCATION=http://www.archive.org/details/CapeCodMarsh;
/petabox/sw/bin/oggzmerge tmp.ogv audio2.ogg -o CapeCodMarsh.ogv;
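
For reuse across many files, the same commands can be wrapped in a function. This is only a sketch under a few assumptions: derive_ogv and the variable names are hypothetical, the tool paths default to the ones in the example above, and the ffmpeg flags are copied as-is from that example (newer ffmpeg builds may want different ones). It also assumes the source has audio.

```shell
#!/bin/sh
# Tool locations default to the paths used in the example above; override as needed.
FFMPEG=${FFMPEG:-/usr/bin/ffmpeg}
THEORA_ENC=${THEORA_ENC:-/petabox/deriver/libtheora-1.0/lt-encoder_example}
OGGZ_COMMENT=${OGGZ_COMMENT:-/petabox/sw/bin/oggz-comment}
OGGZ_MERGE=${OGGZ_MERGE:-/petabox/sw/bin/oggzmerge}
ACODEC=${ACODEC:-vorbis}   # e.g. libvorbis, if your ffmpeg build supports it

# derive_ogv SOURCE SIZE RATE TITLE   (hypothetical helper)
# Writes tmp.ogv, audio.ogg and audio2.ogg in the current directory,
# and the final .ogv next to SOURCE (same name, extension swapped).
derive_ogv() {
  src=$1 size=$2 rate=$3 title=$4
  base=${src%.*}

  # video: source -> rawvideo (yuv420p) -> yuv4mpegpipe -> theora.
  # Only the exit status of the last stage (the encoder) is checked here.
  "$FFMPEG" -v 0 -an -deinterlace -s "$size" -r "$rate" -i "$src" \
      -vcodec rawvideo -pix_fmt yuv420p -f rawvideo - |
  "$FFMPEG" -v 0 -an -f rawvideo -s "$size" -r "$rate" -i - -f yuv4mpegpipe - |
  "$THEORA_ENC" --video-rate-target 512k - -o tmp.ogv || return 1

  # audio: separate vorbis .ogg, metadata added to it, then merged with the video.
  "$FFMPEG" -y -i "$src" -vn -acodec "$ACODEC" -ac 2 -ab 128k -ar 44100 audio.ogg || return 1
  "$OGGZ_COMMENT" audio.ogg -o audio2.ogg TITLE="$title" || return 1
  "$OGGZ_MERGE" tmp.ogv audio2.ogg -o "$base.ogv"
}
```

A call would look like `derive_ogv CapeCodMarsh.avi 400x300 20.00 "Cape Cod Marsh"`. Only TITLE is set here to keep the sketch short; the other metadata fields would be passed to oggz-comment the same way.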

WTFs:

  • Why the double pipe above? Some videos could not be converted directly to yuv4mpegpipe format in a way that libtheora (or ffmpeg2theora) would accept every time.
  • We do the vorbis audio outside of libtheora (or ffmpeg2theora) to avoid any issues with Audio/Video sync.
  • We convert to yuv420p in the rawvideo step because ffmpeg2theora has (i think) some known issues of not handling all yuv422 video inputs (i found at least a few videos that did this).
  • We add the metadata to the audio vorbis ogg because adding it to the video ogv file wound up making the first video frame not a keyframe (!)

The result plays in Firefox 3.1 and greater, using the new HTML “video” tag:

<video controls="true" autoplay="true" src="http://www.archive.org/download/commute/commute.ogv"> for firefox betans </video>

The technique above worked nicely across the 46 wide-ranging source and “trashy” videos that I use for QA before making a new way of deriving our videos live at archive.org ( http://www.archive.org/~MY-FIRST-NAME/_/stream.php  [sorry, i don't necessarily want all that crawled by non rajbot robots] )

-tracey jaquith   “don’t make me 3:2 pulldown you”

5 Responses to “geek stuff from archive.org — making Ogg Theora videos”

  1. November 7th, 2008 | 5:44 am

    I’ve always used ffmpeg2theora without any A/V sync problems — that would seem to be the much simpler option where it works. Are there certain conditions you’ve found where ffmpeg2theora fails?

  2. tracey jaquith
    November 7th, 2008 | 11:43 am

    absolutely i can find A/V sync issues quite easily, unfortunately.

    now, granted, there are likely some issues with the encoding of these videos as inputs to begin with, but these aren’t uncommon w/ the stuff that we get uploaded to archive.org.

    http://www.archive.org/~MY-FIRST-NAME/_/amoalaura/amoaLauraMTV.wmv

    both of these fail to sync A/V:
    ffmpeg2theora amoaLauraMTV.wmv -sync -o out.ogv
    ffmpeg2theora amoaLauraMTV.wmv -o out.ogv

    using my technique above, we sync properly.
    it could be just that ffmpeg is more forgiving when dealing with “trashier” encodings…

  3. shag
    December 18th, 2008 | 1:47 pm

    yow, 3:2 pulldown, sounds humiliating ;-)

  4. Gregory Maxwell
    February 7th, 2009 | 2:15 pm

    Please do not use the above instructions unless you want to be accused of intentionally making Vorbis look bad. The FFMPEG internal Vorbis encoder is not something anyone should actually use. The sound quality is terrible.

    I suspect most people (myself included) were unaware of FFMPEG’s internal Vorbis encoder because just about everything else uses the (BSD licensed) Xiph.Org reference encoder.

    The above commands should be changed to use “-acodec libvorbis” rather than “-acodec vorbis”:

    ffmpeg -y -i CapeCodMarsh.avi -vn -acodec libvorbis -ac 2 -ab 128k -ar 44100 audio.ogg

    This is not audio-geek nitpicking: The above “128kbit” FFMPEG produced audio sounds worse than 32kbit/sec output produced from a reasonable encoder.

    In order to make this point more clearly I have posted a couple of 11 second examples. First listen to a 64kbit/sec example produced by Xiph.org libVorbis. Then listen to the “128kbit/sec” FFMPEG output (which is really about 64kbit/sec for this input). As you can see, the FFMPEG output sounds very bad in both absolute and comparative terms. Even 32kbit/sec audio produced by a decent encoder sounds much better than the ffmpeg output.

  5. tracey jaquith
    February 19th, 2009 | 4:43 pm

    thanks for the info.

    couple quick things. the sound is not very good agreed — but i would personally not say “terrible”.

    we are looking into altering our technique to use libvorbis, but i thought i should point out that not everyone is using the most recent version of linux, as i suspect you may be? archive.org is still stuck on the “gutsy” version of ubuntu, which is from oct 2007. so even with “--enable-libvorbis” compiled into gutsy-era ubuntu, there is no known codec alternative other than “-acodec vorbis” that can do vorbis.

    we are more likely to do an OS upgrade and try to update it at that point.

    so i don’t disagree with you, i just think the severity of the warning is a tad higher than need be. likely we’ll disagree about that but that’s ok.

    thx for the pointer!
