A Comparison of Media Engines
Over the past few years, while developing the open-source music player pianod, I’ve worked with a number of different
media-handling libraries. Based on these experiences, I’m writing
this document to help other software authors choose the most
appropriate package for their needs.
1. AVFoundation (OS X)
- Language: Objective C or C++, with Swift bindings
- Platform: Apple stuff only
- Documentation: Good, but no source code
- Support: StackOverflow
- Reliability: Good
- API stability: Excellent
- Type safety: Mediocre (Objective C; Swift is probably better)
- Bizarreness: Inside an OS X app, none. Using it from a console app, some.
AVFoundation is Apple’s AV solution. It has decent documentation, including how-to cookbooks and some sample code. Most of it snapped together really easily; I had audio output in under 2 hours.
However, detecting end-of-track was another matter. If I’d been working with Apple’s AppKit or UIKit frameworks, I probably would have been fine—but I just wanted an AV solution and didn’t need any UI to go with it. But AVFoundation uses some services, including notifications, which are dispatched through the application frameworks.
The upshot is that you can’t just use AVFoundation in isolation and expect it to work. You will need to run the application under AppKit or UIKit, even if you just fork a thread off from an applicationDidFinishLaunching delegate.
2. ffmpeg
- Language: C
- Platform: Cross-platform
- Documentation: References for codecs and muxers, a good Doxygen, a few sample projects, and the source code
- Support: Mailing list, StackOverflow
- Reliability: Superb
- API stability: Ok
- Type safety: Very good
- Bizarreness: Extern “C”.
ffmpeg is a beast of an API to work with, with limited documentation, but it’s rock-solid. The API evolves gradually, and you will occasionally have to update or #ifdef code to deal with different versions. The changes seem pretty minor, and there’s a transition period.
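The version gating mentioned above can be sketched roughly as follows. LIBAVCODEC_VERSION_INT and AV_VERSION_INT are ffmpeg's own version macros; everything else is an assumption made for illustration: the PLAYER_CODEC_API macro, the codecApiFlavor() helper, and above all the 58.0.0 threshold, which is a placeholder and not the version at which any particular call changed. The __has_include guard is there only so the sketch compiles on a machine without ffmpeg's headers.

```cpp
#include <string>

// Guard so this sketch still compiles where ffmpeg's headers are absent.
#if __has_include(<libavcodec/version.h>)
#include <libavcodec/version.h>   // macros only: LIBAVCODEC_VERSION_INT
#include <libavutil/version.h>    // macros only: AV_VERSION_INT
#endif

// PLACEHOLDER threshold: substitute the version listed in ffmpeg's
// doc/APIchanges for the call your code actually depends on.
#if defined(LIBAVCODEC_VERSION_INT) && defined(AV_VERSION_INT)
#  if LIBAVCODEC_VERSION_INT >= AV_VERSION_INT(58, 0, 0)
#    define PLAYER_CODEC_API "newer"        // code for the newer API goes here
#  else
#    define PLAYER_CODEC_API "older"        // code for the older API goes here
#  endif
#else
#  define PLAYER_CODEC_API "unavailable"    // headers not found at compile time
#endif

// Hypothetical helper: reports which branch the preprocessor selected.
std::string codecApiFlavor() {
    return PLAYER_CODEC_API;
}
```

In a real project the two branches would hold the differing function calls themselves rather than a string, but the shape of the #if is the same.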
ffmpeg’s idea of learning is to cookbook off the sample project code. There is, unfortunately, no overview explaining what the hell any of these things do, how they work, or how they’re supposed to fit together. Such an overview would go a long way toward helping neophytes understand the sample code, and would prevent the misunderstandings that plague the project and annoy the developers, because don’t you know you’re not supposed to do it that way? Even though that way is a pretty reasonable conclusion to draw from the examples and the code.
A few things I’ve learned:
- The “codec” member of a stream is provided for reference when creating a codec. Allocate a new context, copy the stream’s context into it, and use that. You’re not supposed to use the member directly for your codec.
- Codecs are completely unassociated with streams. Your app code moving data between them is what connects them.
- Use the Doxygen, which links from extracted docucomments to file/line in the code and back. But, make sure you’re looking at the right version—they have Doxygen editions for releases going back to 0.6. After Googling a certain function, check the URL to make sure you’re looking at the expected version.
The biggest pitfall of this library is that its header files don’t contain the conditional extern "C" { / } bracketing needed for use with C++. If using C++, you’ll need to put these around your includes for this library, or your code won’t link.
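A minimal sketch of that bracketing, written as the contents of a small wrapper header. The header itself, the PLAYER_HAVE_FFMPEG macro and the ffmpegHeadersFound() helper are hypothetical names, not anything ffmpeg ships. The __has_include guard is only there so the sketch compiles on a machine without ffmpeg installed; a real project would include the headers unconditionally.

```cpp
// Contents of a hypothetical wrapper header that C++ sources include
// instead of including ffmpeg's headers directly.

#if __has_include(<libavcodec/avcodec.h>) && __has_include(<libavformat/avformat.h>)
extern "C" {   // declare everything inside with C linkage, so the C++
               // compiler does not mangle the library's function names
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
}
#define PLAYER_HAVE_FFMPEG 1
#else
#define PLAYER_HAVE_FFMPEG 0
#endif

// Hypothetical helper: true when the ffmpeg headers were found at compile time.
constexpr bool ffmpegHeadersFound() {
    return PLAYER_HAVE_FFMPEG == 1;
}
```

Without the extern "C" block, the C++ compiler looks for mangled C++ symbol names that the C library does not export, which is why the failure shows up at link time rather than at compile time.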
3. libav
- Language: C
- Platform: Cross-platform
- Documentation: Poor, a broken Doxygen, a few sample projects, and the source code
- Support: Low-activity mailing list
- Reliability: Mrrr
- API stability: No
- Type safety: Very good
- Bizarreness: Extern “C”.
This fork of ffmpeg was supposed to make things better, but I’m doubtful it has. Their Doxygen is screwed up, making it hard to understand their code and APIs. While initially very compatible with ffmpeg, the API has diverged significantly, creating problems where code compiles but behaves differently because run-time options are handled or requested differently. The apparent-but-not-actual similarity makes libav hazardous, and I got a poor response when asking for clarifications on their mailing lists.
Debian ditched libav in favor of ffmpeg in July 2015, for considered reasons.
4. gstreamer
- Language: C, with bindings to C++ and other languages
- Platform: Cross-platform
- Documentation: An excellent manual with explanation of the design, a “hello world” example, and some samples. Website has decent references (may/may not be Doxygen, but similar).
- Support: StackOverflow, didn’t try mailing lists
- Reliability: Occasional crash
- API stability: Yes
- Type safety: No
- Bizarreness: Reference counting, leaks in C, lots of packages
gstreamer’s excellent manual explains how it all fits together, and makes it look like it was all carefully planned out. Coupled with the modular design, gstreamer’s API has been very stable over the long term.
The downside is that it’s built on GObject, which to my eye looks like they were trying to replicate the design of NeXTSTEP/Cocoa without using Objective C. I think this comes out of the idea that if it’s written in C, it’ll run anywhere because C is ubiquitous. But this all comes with a price: lots and lots of ugly, unsafe pointer crap.
Object-oriented languages understand is-a relationships. So a pipeline is a bin, and a bin is an element. But C doesn’t know this; a pipeline is a pipeline, a bin is a bin, an element is an element. So when you need to perform an element action on a pipeline (which is valid, because a pipeline is a bin), you coerce types and lose the type safety provided by the compiler.
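The loss of compile-time checking can be illustrated with a toy model. This is not GStreamer code: ToyElement, ToyBin, ToyPipeline, toy_element_set_state() and setPipelineState() are hypothetical names that only mimic the common C convention of embedding the parent struct as the first member. Real GStreamer code would go through the library's own cast macros, such as GST_ELEMENT(), rather than a bare cast.

```cpp
// Toy stand-ins, NOT GStreamer's real types: each "subclass" embeds its
// parent as the first member, which is how C code usually models is-a.
struct ToyElement  { const char *name; int state; };
struct ToyBin      { ToyElement element; int child_count; };
struct ToyPipeline { ToyBin bin; int latency_ms; };

// An "element action": it only knows about ToyElement.
void toy_element_set_state(ToyElement *element, int state) {
    element->state = state;
}

// Applies the element action to a pipeline and returns the resulting state.
int setPipelineState(ToyPipeline &pipeline, int state) {
    // toy_element_set_state(&pipeline, state);
    // ^ rejected by the compiler: ToyPipeline* is not ToyElement*.

    // The cast below compiles for ANY pointer type, so the compiler can no
    // longer tell a pipeline from an unrelated object passed by mistake.
    toy_element_set_state(reinterpret_cast<ToyElement *>(&pipeline), state);
    return pipeline.bin.element.state;
}
```

With language-level inheritance (for example a C++ struct ToyPipeline derived from ToyBin) the first call would compile without a cast, and passing an unrelated type would remain a compile error.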
Properties have the same problem. To keep the API the same across different modules (“plug-ins” in gstreamer parlance), properties are set via key-value pairs. The key is a string, the value is… well, it could be an int, a double, a pointer. It depends on what you’re setting. Just be sure to pass the right thing, because it’s totally unchecked by the compiler and if you do it wrong the property won’t get set right.
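To see why the compiler cannot help here, consider a toy setter written in the same key-value style. It is only modelled on calls like g_object_set(), which takes a NULL-terminated list of name/value pairs; ToyVolume, toy_object_set() and makeQuietMuted() are hypothetical names, and the property names are assumptions chosen for illustration.

```cpp
#include <cstdarg>
#include <cstring>

// Toy stand-in, NOT a GStreamer type.
struct ToyVolume { double volume = 1.0; int mute = 0; };

// Hypothetical setter in the key/value varargs style: property names are
// strings, each followed by its value, and a null key ends the list.
void toy_object_set(ToyVolume *object, const char *key, ...) {
    va_list args;
    va_start(args, key);
    while (key != nullptr) {
        if (std::strcmp(key, "volume") == 0) {
            object->volume = va_arg(args, double);  // trusts the caller passed a double
        } else if (std::strcmp(key, "mute") == 0) {
            object->mute = va_arg(args, int);       // trusts the caller passed an int
        } else {
            break;  // unknown key: the value's type is unknowable, so stop
        }
        key = va_arg(args, const char *);
    }
    va_end(args);
}

// Correct usage: each literal has the type the setter will read.
ToyVolume makeQuietMuted() {
    ToyVolume v;
    toy_object_set(&v, "volume", 0.25, "mute", 1,
                   static_cast<const char *>(nullptr));
    // toy_object_set(&v, "volume", 1, static_cast<const char *>(nullptr));
    // ^ also compiles, but passes an int where a double is read: undefined
    //   behaviour, typically a garbage volume.
    return v;
}
```

The commented-out call is the trap: writing 1 instead of 1.0 changes the argument's type, and nothing in the function's signature lets the compiler notice.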
The C++ wrapper (gstreamermm) fixes a lot of this, but has troubles of its own: it’s not documented well; glibmm has its own string type; property values go in containers that are a hassle and proved to be buggy; it’s not always available via package managers; and installing it requires more dependencies (glibmm, giomm, and gstreamermm on top of the usual glib, gstreamer, gst-plugins-base, gst-plugins-good and possibly gst-plugins-ugly).
Another issue is that under the hood, gstreamer uses a special memory allocator, GSlice. Unfortunately, the shut-it-off mechanism just reverts to g_malloc, not the system-native malloc. So when I enable awesome memory troubleshooting tools (Guard Malloc, Leaks) they don’t work with gstreamer.