In this post I talk about some improvements to Codec 2, the pain of closed source, FDM modems, and compressing a SolderSmoke podcast.
Since the last Codec 2 post I have spent about 3 weeks full time working on the Codec. This is basic research: I follow up about 10 dud ideas for every idea that actually improves the Codec. Each idea takes hours of work, lots of deep thinking and analysis, and 3 cups (on average) of Italian coffee. I’ll talk about the DSP part of this work in a future post.
I have made some progress on the voicing and decimation/interpolation algorithms. The latest samples (V0.1A) are available at the top of the Codec 2 page, along with the 2000 bit/s AMBE samples from Mel’s DV Dongle.
Mel Whitten kindly sent me a DV Dongle. This is a USB device that includes the DVSI AMBE codec on a DSP chip. Until recently, it was the only way to use the DVSI codec on a PC (I think an x86 binary library is now available for license).
This got me thinking about the practical problems of closed source. While I admire the DV Dongle guys for building this hardware, I really feel for the hoops they had to jump through just to use the AMBE codec. For example, they had to design, debug, and put hardware into production, then write drivers, protocols, microcontroller firmware, and demo applications. It took me three days to develop this code just to talk to the DV Dongle. Oh, the pain, the pain of closed source.
If you want to use Codec 2, just link it with your program. The license is LGPL.
Modems for HF
The nice people from TAPR kindly sent me a shiny new Yaesu FT-817ND radio in recognition of my Codec 2 work. Thanks! I have been clambering all over the roof installing and optimising a 40M dipole to get back on HF radio after a break of 20-something years. I had a lot of problems with noise (S8 level) but some re-orientation of the antenna seemed to help and I can now pick up a lot of Amateur radio signals.
I hope to use this radio to make some experimental digital voice radio transmissions using Codec 2 in the next few months. I have had some great email conversations with Mel Whitten K0PFX and Peter Martinez G3PLX about digital HF radio. HF radio (3-30 MHz) presents severe challenges for digital voice. The HF radio channel is “non-stationary”, which means it changes all the time. For example, as the conditions of the ionosphere change, the delay between the transmitter and receiver varies. There are multiple paths the signal may take, smearing it in time at the receiver. Interference may be frequency selective, and it comes and goes.
The HF radio channel requires special FDM modems for HF Digital Voice.
Peter is the inventor of PSK31, and has a wealth of HF modem experience. Mel has been an advocate of Digital Voice over HF radio and has some excellent ideas based on his experience to date. The pooling of this sort of talent is what I love about open source. Such a breath of fresh air compared to the closed source, commercial world of technology development.
Here is a screenshot of an HF modem in actual operation (from the FDMDV software). This picture shows the spectrum of the received modem signal:
Rather than just one carrier, there are many modem carriers operating in parallel. Each carrier carries binary data using Phase Shift Keying (PSK) at a fairly low rate, e.g. 50 symbols/second. When you combine them you get a 1200-2400 bit/s bit stream. This sort of modem is known as Frequency Division Multiplexed (FDM), and is well suited to radio channels that have multipath interference. 802.11g uses a similar technique to overcome multipath interference for 2.4 GHz Wi-Fi.
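To make the parallel-carrier idea concrete, here is a minimal sketch of one FDM symbol period in Python. It is not the FDMDV modem. The sample rate, the four carrier frequencies, the use of BPSK (one bit per carrier per symbol), and the rectangular pulses are all assumptions chosen so that a simple correlator can separate the carriers. A real HF modem would add pulse shaping, synchronisation, and many more carriers.

```python
import math

FS = 8000                 # sample rate in Hz (assumed)
RS = 50                   # symbols/second per carrier, as in the text
N = FS // RS              # samples per symbol period (160)
# Hypothetical carrier frequencies in Hz. Each completes a whole number of
# cycles in one 20 ms symbol, so the carriers are orthogonal over that period.
CARRIERS = [1000, 1100, 1200, 1300]

def modulate_symbol(bits):
    """Sum one BPSK symbol per carrier: bit 0 -> phase 0, bit 1 -> phase pi."""
    samples = []
    for n in range(N):
        t = n / FS
        samples.append(sum((1.0 if b == 0 else -1.0) * math.cos(2 * math.pi * f * t)
                           for b, f in zip(bits, CARRIERS)))
    return samples

def demodulate_symbol(samples):
    """Correlate against each carrier; the sign of the result gives the bit."""
    bits = []
    for f in CARRIERS:
        corr = sum(x * math.cos(2 * math.pi * f * n / FS)
                   for n, x in enumerate(samples))
        bits.append(0 if corr > 0 else 1)
    return bits

tx_bits = [0, 1, 1, 0]
rx_bits = demodulate_symbol(modulate_symbol(tx_bits))
aggregate_rate = len(CARRIERS) * RS   # bit/s, with one bit per carrier per symbol
print(rx_bits, aggregate_rate)
```

With only four carriers this toy example carries 200 bit/s. The aggregate rate grows with the number of carriers and the number of bits each PSK symbol carries.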
Integrating Speech Codecs and Modems
I have been thinking about the potential of carefully integrating Codec 2 with an FDM modem. Codec 2 currently sends a frame of around 50 bits every 20 ms. Not every bit in a speech codec frame is equal. Some bits are more sensitive than others to errors. In data communications, we require zero bit errors. In digital voice, what we really care about is the speech quality. Errors in some bits may not be very noticeable, so it’s a waste of bits (or transmitted power) to protect them. Other bits (for example the frame energy), may be very sensitive to bit errors.
With an FDM modem, it is possible to give more power to one carrier than another. So we could adjust the power of each carrier based on the sensitivity of each bit. More sensitive bits would be carried by higher power carriers.
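One simple way to picture this is to split a fixed total transmit power across the carriers in proportion to a sensitivity weight. The sketch below does only that. The proportional rule, the function name `allocate_power`, and the sensitivity numbers are all hypothetical, chosen for illustration rather than taken from any measured Codec 2 data.

```python
import math

def allocate_power(sensitivities, total_power=1.0):
    """Split total_power across carriers in proportion to bit sensitivity."""
    if not sensitivities or any(s <= 0 for s in sensitivities):
        raise ValueError("sensitivities must be a non-empty list of positive numbers")
    total = sum(sensitivities)
    return [total_power * s / total for s in sensitivities]

# Hypothetical weights, one per carrier. A larger number means a bit error on
# that carrier is assumed to be more audible (e.g. a frame energy bit).
sensitivities = [4.0, 2.0, 1.0, 1.0]
powers = allocate_power(sensitivities)

# Carrier amplitude scales as the square root of carrier power.
amplitudes = [math.sqrt(p) for p in powers]
print(powers, amplitudes)
```

The total power stays the same as an equal split. It is only redistributed, so the less sensitive bits see a higher error rate in exchange for better protection of the sensitive ones.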
This sort of integration can only be done with open source codecs, where you have access to the codec design and bit stream. It’s a good argument against closed source and patents which stifle this sort of innovation.
Highly Compressed SolderSmoke
As well as short samples I have started listening to some longer segments of audio. SolderSmoke is a popular podcast on home brew electronics and Ham Radio. This sample is a 60 second sample of SolderSmoke, after encoding to 2500 bit/s using Codec 2. Here is the original MP3 for comparison. Some of the music and special effects at the start even made it through. Low bit rate codecs usually break down entirely on non-speech signals as they are highly optimised for human speech alone. Using Codec 2 the entire 58 minute, 46 Mbyte SolderSmoke 127 MP3 compresses down to 1.2 Mbyte.
The Line Spectrum Pair (LSP) parameters currently consume most of the bit rate (36 bit/frame or 1800 bit/s). This is where we can make some more bit rate savings. I have some experimental versions of the codec running at 2300 and 2050 bit/s using LSP differences and Vector Quantisation (VQ). However, it’s still early days: I am new to VQ and still have some unexplained problems to track down.
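The bit rate figures above follow directly from the frame size and the 20 ms frame period. The short calculation below spells out that arithmetic. The variable and function names are mine, and the 50 bit frame is the approximate figure quoted earlier, so treat the share as a rough number.

```python
FRAMES_PER_SECOND = 50    # one frame every 20 ms
FRAME_BITS = 50           # "around 50 bits" per frame (approximate)
LSP_BITS = 36             # bits per frame spent on the LSPs

def bit_rate(bits_per_frame, frames_per_second=FRAMES_PER_SECOND):
    """Bits per frame times frames per second gives bit/s."""
    return bits_per_frame * frames_per_second

total_rate = bit_rate(FRAME_BITS)     # whole codec, bit/s
lsp_rate = bit_rate(LSP_BITS)         # LSP parameters alone, bit/s
lsp_share = LSP_BITS / FRAME_BITS     # fraction of each frame used by the LSPs

print(total_rate, lsp_rate, lsp_share)
```

On these numbers the LSPs take a little under three quarters of every frame, which is why they are the obvious place to look for savings.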
Some goals for the next few months are a 2400 bit/s codec that includes 400 bit/s of FEC plus a 1200-1400 bit/s version specifically for the HF radio channel.