Jean-Marc Valin (primary author of Speex and part of the CELT team) has made me aware of a proposal for developing a royalty-free audio codec under the banner of the IETF. The debates on the mailing list address some persistent issues in codec development: patents and royalties. What follows is a post I made to the IETF codec mailing list:
My name is David Rowe, I have a PhD in the field of speech compression and 20 years of experience in the design and real-time implementation of speech codecs and other DSP algorithms. I helped found the Speex project and have been a minor contributor over the years. I have also run small businesses that use speech codecs and experienced first-hand the pain of trying to use codecs like g729 with $40k license fees.
My experience as both a codec designer and a business entity using codecs is that codec royalties are a useless tax on business and telecommunication. The license fees benefit no-one but the people receiving the royalties. Why can I get a first class operating system like Linux for free but have to pay to use a tiny bit of software on it like g729?
We have the skills in the open source community to build better codecs using open source techniques. The world will be a better place for it.
Patent free, competitive DSP algorithms can and should be developed, and there are precedents:
- Speex and the various other open codecs.
- A patent free, royalty free line echo canceller (Oslec) which is successfully replacing expensive royalty-based solutions. I recall much of the same FUD about patents when developing this algorithm. And yet it works, pushing out royalty-based code in many, many thousands of cases, and it’s now part of the Linux kernel (try doing that with closed code).
What has held Speex adoption back is lack of standardisation – people have been forced to use royalty based codecs like g729 for bit-exact interoperability reasons. Well, it looks like we have a chance to fix that.
Speex has shown it’s possible to build a patent free codec. A lot of the algorithms involved in codecs are just well known math, e.g. transforms, vector and scalar quantisation. There are usually alternative ways to perform a given operation if an annoying patent is in the way. It really is no big deal. Most of the underlying technology in modern speech codecs (DSP fundamentals, quantisation, LPC, pitch prediction etc) was published in the 1960s and 1970s.
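To give a feel for how much of this is textbook math, here is a minimal sketch of linear prediction (LPC) analysis using the autocorrelation method and the Levinson-Durbin recursion. It is not code from Speex or g729. The function names, the synthetic test frame, and the predictor order of 4 are all assumptions chosen for illustration.

```python
import math


def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], where r[k] = sum over n of frame[n] * frame[n + k]."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    The predictor is s_hat[n] = sum_{k=1..order} a[k] * s[n - k].
    Returns (a, error): a[0] is unused and left at 0.0, and error is
    the energy of the prediction residual.
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # silent or degenerate frame: nothing left to predict
        # Reflection coefficient for this stage of the recursion.
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error
        # Update the lower-order coefficients, then append the new one.
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
    return a, error


# Illustrative input only: a 160 sample frame made of two sinusoids.
frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(160)]
r = autocorrelation(frame, 4)
coeffs, residual_energy = levinson_durbin(r, 4)
print("LPC coefficients:", coeffs[1:])
print("residual energy / frame energy:", residual_energy / r[0])
```

A real codec would also window the frame, run in fixed point on many targets, and quantise the result (for example as Line Spectrum Pairs), but the core recursion is about twenty lines of decades-old math.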
Where the patents come in is that someone gets a good quality codec working using some clever technique in 1% of the algorithm, then they patent that 1%. I would imagine they explicitly search for a novel technique in order to lock up their codec. Then when that codec is standardised we are stuck paying them royalties for bit-exact interoperability reasons.
g729 does not hold patents on linear prediction, or vector quantisation, or CELP, or Line Spectrum Pairs, or pitch prediction. All these techniques are also in Speex which sounds about the same at 8 kbit/s.
It’s easy to replace that 1% early in the codec design process – we simply make royalty free a priority rather than maximising royalties. In other words design the codec to help as many people as possible, rather than designing it to make a small number of people wealthy. Isn’t that what the IETF is all about?
I think it’s a great idea to release source code from day 1. Open source development has been shown to be superior to closed, so I am sure we can develop a better codec faster with open source, peer reviewed code.
Regarding funding for an open development effort and travel to meetings, the $ involved are trivial compared to the cumulative costs of license fees down the track for the closed approach. So it’s a great business decision to support an open codec.