Eric, 4Z1UG, has kindly interviewed me for his fine QSO Today Podcast.
I’m spending a month or so improving the speech quality of a couple of Codec 2 modes. I have two aims:
- Make the 700 bit/s codec sound better, to improve speech quality on low SNR HF channels (beneath 0dB).
- Develop a higher quality mode in the 2000 to 3000 bit/s range, that can be used on HF channels with modest SNRs (around 10dB)
I ran some numbers on the new OFDM modem and LDPC codes, and turns out we can get 3000 bit/s of codec data through a 2000 Hz channel at down to 7dB SNR.
Now 3000 bit/s is broadband for me – I’ve spent years being very frugal with my bits while I play in low SNR HF land. However it’s still a bit low for Opus which kicks in at 6000 bit/s. I can’t squeeze 6000 bit/s through a 2000 Hz RF channel without higher order QAM constellations which means SNRs approaching 20dB.
So – what can I do with 3000 bit/s and Codec 2? I decided to try wideband(-ish) audio – the sort of audio bandwidth you get from Skype or AM broadcast radio. So I spent a few weeks modifying Codec 2 to work at 16 kHz sample rate, and Jean Marc gave me a few tips on using DCTs to code the bits.
It’s early days but here are a few samples:
|2||Codec 2 Model, orignal amplitudes and phases||Listen|
|3||Synthetic phase, one bit voicing, original amplitudes||Listen|
|4||Synthetic phase, one bit voicing, amplitudes at 1800 bit/s||Listen|
|5||Simulated analog SSB, 300-2600Hz BPF, 10dB SNR||Listen|
Couple of interesting points:
- Sample (2) is as good as Codec 2 can do, its the unquantised model parameters (harmonic phases and amplitudes). It’s all down hill from here as we quantise or toss away parameters.
- In (3) I’m using a one bit voicing model, this is very vocoder and shouldn’t work this well. MBE/MELP all say you need mixed excitation. Exploring that conundrum would be a good Masters degree topic.
- In (3) I can hear the pitch estimator making a few mistakes, e.g. around “sheet” on the female.
- The extra 4kHz of audio bandwidth doesn’t take many more bits to encode, as the ear has a log frequency response. It’s maybe 20% more bits than 4kHz audio.
- You can hear some words like “well” are muddy and indistinct in the 1800 bit/s sample (4). This usually means the formants (spectral) peaks are not well defined, so we might be tossing away a little too much information.
- The clipping on the SSB sample (5) around the words “depth” and “hours” is an artifact of the PathSim AGC. But dat noise. It gets really fatiguing after a while.
Wideband audio is a big paradigm shift for Push To Talk (PTT) radio. You can’t do this with analog radio: 2000 Hz of RF bandwidth, 8000 Hz of audio bandwidth. I’m not aware of any wideband PTT radio systems – they all work at best 4000 Hz audio bandwidth. DVSI has a wideband codec, but at a much higher bit rate (8000 bits/s).
Current wideband codecs shoot for artifact-free speech (and indeed general audio signals like music). Codec 2 wideband will still have noticeable artifacts, and probably won’t like music. Big question is will end users prefer this over SSB, or say analog FM – at the same SNR? What will 8kHz audio sound like on your HT?
We shall see. I need to spend some time cleaning up the algorithms, chasing down a few bugs, and getting it all into C, but I plan to be testing over the air later this year.
Let me know if you want to help.
Unquantised Codec 2 with 16 kHz sample rate:
$ ./c2sim ~/Desktop/c2_hd/speech_orig_16k.wav --Fs 16000 -o - | play -t raw -r 16000 -s -2 -
With “Phase 0” synthetic phase and 1 bit voicing:
$ ./c2sim ~/Desktop/c2_hd/speech_orig_16k.wav --Fs 16000 --phase0 --postfilter -o - | play -t raw -r 16000 -s -2 -
FreeDV 2017 Road Map – this work is part of the “Codec 2 Quality” work package.
Codec 2 page – has an explanation of the way Codec 2 models speech with harmonic amplitudes and phases.
On May 25th LilacSat-1 was launched from the ISS. The exiting news is that it contains an analog FM to Codec 2 repeater. I’ve been in touch with Wei Mingchuan, BG2BHC during the development phase, and it’s wonderful to see the satellite in orbit. He reports that some Hams have had preliminary contacts.
The LilacSat-1 team have developed their own waveform, that uses a convolutional code running over BPSK at 9600 bit/s. Wei reports a MDS of about -127 dBm on a USRP B210 SDR which is quite respectable and much better than analog FM. GNU radio modules are available to support reception. I think it’s great that Wei and team have used open source (including Codec 2) to develop their own novel systems, in this case a hybrid FM/digital system with custom FEC and modulation.
Now I need to get organised with some local hams and find out how to work this satellite myself!
Part 2 – Making a LilacSat-1 Contact
On Saturday 3 June 2017 Mark VK5QI, Andy VK5AKH and I just made our first LilacSat-1 contact at 12:36 local time on a lovely sunny winter day here in Adelaide! Mark did a fine job setting up a receive station in his car, and Andy put together the video below showing both ends of the conversation:
The VHF tx and UHF rx stations were only 20m apart but the path to LilacSat-1 was about 400km each way. Plenty of signal as you can see from the error free scatter diagram.
I’m fairly sure there is something wrong with the audio (perhaps levels into the codec), as the decoded Codec 2 1300 bit/s signal is quite distorted. I can also hear similar distortion on other LilicSat-1 contacts I have listened too.
There is a clue in this QSO – one end of the contact is much clearer than the other:
I’ll take a closer look at the Codec 2 bit stream from the satellite over the next few days to see if I can spot any issues.
Well done to LilacSat-1 team – quite a thrill for me to send my own voice through my own codec into space and back!
Part 3 – Level Analysis
Sunday morning 4 June after a cup of coffee! I added a little bit of code to codec2.c:codec2_decode_1300() to dump the energy quantister levels:
e_index = unpack_natural_or_gray(bits, &nbit, E_BITS, c2->gray); e = decode_energy(e_index, E_BITS); fprintf(stderr, "%d %f\n", e_index, e);
The energy of the current frame is encoded as a 5 bit binary number. It’s effectively the “AF gain” or “volume” of the current 40ms frame of speech. We unpack the bits and use a look up table to get the actual energy.
We can then run the Codec 2 command line decoder with the LilacSat-1 Codec 2 data Mark captured yesterday to extract a file of energies:
./c2dec 1300 ~/Desktop/LilacSat-1/lilacsat_dgr.c2 - 2>lilacsat1_energy.txt | play -t raw -r 8000 -s -2 - trim 30 6
The lilacsat1_energy.txt file contains the energy quantiser index and decoded energy in a table (matrix) that I can load into Octave and plot. I also ran the same text on the reference cq_freedv file used in Part 2 above:
So the top plot is the input speech “cq freedv ….”, and the middle plot the resulting energy quantiser index values. The energy bounces about with the level of the input speech. Now the bottom plot is from the LilacSat-1 sample. It is “red lined” – hard up against the upper limits of the quantiser. This could explain the audio distortion we are hearing.
Wei emailed me overnight and other Hams (e.g. Bob N6RFM) have discovered that reducing the Mic gain on the uplink FM radios indeed improves the audio quality. Wei is looking into in-flight adjustments of the gain between the FM rx and Codec 2 tx on LilacSat-1.
Note to self – I should look into quantiser ranges to make Codec 2 robust to people driving it with different levels.
Part 4 – Some Improvements
Sunday morning 4 June 11:36am pass: Mark set up his VHF tx in my car, and we played the cq_freedv canned wave file using a laptop and signalink so we could easily vary the tx drive:
Fortunately I have plenty of power available in my Electric Vehicle – we just tapped across 13.2V worth of Lithium cells in the rear pack:
We achieved better results, but not quite as good as using the source file directly without a journey through the VHF FM uplink:
There is still quite a lot of noise on the decoded audio, probably from the VHF uplink. Codec 2 performs poorly in the presence of high levels of background noise. As we are under-deviating, the SNR of the FM uplink will be reduced, further increasing noise. However Wei has just emailed me that his team is reducing the “AF gain” between the VHF rx and Codec 2 on LilacSat-1 so we should hear some improvements on the next few passes.
Note to self #2 – add some noise reduction inside of Codec 2 to make it more robust to different input signal conditions.
The LilacSat-1 page has links to GNU Radio modules that can be used to receive signals from the satellite.
Mark, VK5QI, describes he car’s exotic antennas system and how it was used on todays LilacSat-1 contact.
LilacSat-1 HowTo, Mark and I have documented the set up procedure for LilacSat-1, and written some scripts to help automate the process.
Half way through the year but I thought I better write down some plans anyway! Helps me organise my thoughts and minimise the tangential work. The main goal for 2017 is a FreeDV mode that is competitive with SSB at low SNRs on HF channels. But first, lets see what happened in 2016….
Achievements in 2016
Here is the 2016 Roadmap. Reviewing it, we actually made good progress on a bunch of the planned activities:
- Brady O’Brien (KC9TPA) worked with me to develop the FreeDV 2400A and 2400B modes.
- Fine progress on the SM2000 project, summarised nicely in my Gippstech 2016 SM2000 talk. Thanks in particular to Brady, Rick (KA8BMA) and Neil (VK5KA).
- The open telemetry work provided some key components for the Wenet system for high speed SSDV images from High Altitude Balloons. This work spun out of FSK modem development by Brady and myself, combined with powerful LDPC FEC codes from VK5DSP, with lots of work by Mark VK5QI, and AREG club members. It operates close to the limits of physics: with a 100mW signals we transmit HD images over 100km at 100 kbit/s using a $20 SDR and a good LNA. Many AREG members have set up Wenet receive stations using $100 roadkill laptops refurbished with Linux. It leaves commercial telemetry chips-sets in the dust, about 10dB behind us in terms of performance.
- Ongoing FreeDV outreach via AREG FreeDV broadcasts, attending conferences and Hamfests. Thank you to all those who promote and use FreeDV.
FreeDV 2017 Roadmap
Codec 2 700C is the breakthrough I have been waiting for. Communications quality, conversational speech at just 700 bit/s, and even on a rough first pass it outperforms MELPe at 600 bit/s. Having a viable codec at 700 bit/s lets us consider powerful LDPC FEC codes in the 2000Hz SSB type bandwidths I’m targeting, which has led to a new OFDM modem and the emerging FreeDV 700D mode. I now feel comfortable that I can reach the goal of sub zero dB SNR digital speech that exceeds SSB in quality.
So here is the 2017 roadmap. Partially shaded work packages are partially complete. The pink work packages are ongoing activities rather than project based:
Rather than push FreeDV 700D straight out, I have decided to have another iteration at Codec 2 quality, using the Codec 700C algorithms as a starting point. The FreeDV 700D work has suggested we can use latency to overcome the HF channel, which means frames of several hundred ms to several seconds. By exploring correlation over longer Codec 2 frames we can achieve lower bit rates (e.g. sub 400 bit/s) or get better voice quality at 700 bit/s and above.
I’ve been knocking myself out to get good results at low SNRs. However many HF and indeed VHF/UHF PTT radio conversions take place at SNRs of greater than 10dB. This allows us to support higher bit rate codecs, and achieve better speech quality. For example moving from 0dB to 10dB means 10 times the bit rate at the same Bit Error Rate (BER). The OFDM modem will allow us to pack up to 4000 bit/s into a 2000 Hz SSB channel.
The algorithms that work so well for Codec 2 700C can be used to increase the quality at higher bit rates. So the goal of the “Codec 2 Quality” work package is to (i) improve the quality at 700-ish bit/s, and (ii) come up with a Codec 2 mode that improves on the speech quality of Codec 2 1300 (as used in the FreeDV 1600 mode) at 2000-ish bit/s.
After the Codec 2 quality improvement I will port the new algorithms to C, release and tune on the FreeDV GUI program, then port to the SM1000. Fortunately the new OFDM modem is simpler in terms of memory and computation that the COHPSK modem used for FreeDV 700C. We have an option to use a short LDPC code (224 bits) which with a little work will run OK on the SM1000.
Putting it all Together
The outputs will be a low SNR mode competitive with SSB at low SNR, and (hopefully) a high SNR mode that sounds better than SSB at medium to high SNRs. It will be available as a free software download (FreeDV GUI program), an embedded stand-alone product (the SM1000), and as a gcc library (FreeDV API)
Then we can get back to VHF/UHF work and the SM2000 project.
When will this all happen? Much sooner if you help!
I’m a busy guy, making steady progress in the field of open source digital radio. While I appreciate your ideas, and enjoy brainstorming as much as the next person, what I really want are your patches, and consistent week in/week out effort. If you can code in C and/or have/are willing to learn a little GNU Octave, there is plenty of work to be done in SM1000 maintenance, a port of the OFDM modem to C, SM1000 hardware/software maintenance, and FreeDV GUI program refactoring and maintenance. Email me.
FreeDV 2016 Roadmap. Promises, promises……
FreeDV 2400A and 2400B modes
SM2000 – First post introducing the SM2000 project
Gippstech 2016 SM2000 talk – Good summary of SM2000 project to date
LowSNR site – from Bill (VK5DSP) modem guru and LDPC code-smith
Horus 39 – Fantastic High Speed SSDV Images – Good summary of Wenet blog posts and modem technology
SM1000 Digital Voice Adaptor
AREG FreeDV broadcasts
Over the past 30 years, HF radio noise in urban areas has steadily increased. S6-S9 noise levels are common, which makes it hard to listen to the signals we want to receive.
I’ve been wondering if we can attenuate this noise using knowledge of the properties of the noise, and some clever DSP. Even 6dB would be useful, that’s like the transmitting station increasing their power by a factor of 4. I’ve just spent 2 months working on a 4dB improvement in my FreeDV work. So this week I’ve been messing about with pen and paper and a few simulations, exploring the problem of man-made noise on HF radio.
One source of noise is switching power supplies, which have short, high current pulses flowing through them at a rate of a few hundred kHz. A series of short impulses in the time domain produces a series of spectral lines (i.e sinusoids or tones) in the frequency domain, so a 200kHz switcher produces tones at 200kHz, 400kHz, 600kHz etc. These tones are the “birdies” we hear as we tune our HF radios. The shorter the pulses are, the higher in frequency they will extend.
Short pulses lead to efficient switch mode power supplies, which is useful for energy efficiency, and especially desirable for high power devices like electric car chargers and solar panel inverters. So the trend is shorter switching times, higher currents and therefore more HF noise.
The power supplies adjust the PWM pulse-width back and forth as they adjust to varying conditions, which introduces a noise component. This is similar to phase noise in oscillators, and causes a continuous noise floor to appear in addition to the tones. The birdies we can tune around, but the noise floor sets a limit on urban HF operations.
The Octave script impulse_noise.m was used to generate the plots in this post. Here is a plot of some PWM impulse samples (top), and the HF spectrum.
I’ve injected a “wanted” signal at 1MHz for comparison. Given a switcher frequency of 255kHz, with 0.1V impulse amplitude, the noise floor is -90dBV down, or about 10uV. This is S5-S6 level noise, assuming 0.1V impulse amplitude induced onto our antenna by local switcher noise, e.g. nearby house wiring, or the neighbors TV. These numbers seem reasonable and match what we hear in our receivers.
Single, isolated pulses are an easier problem. Examples are lightning or man-made sources that produce pulses at a rate slower than the bandwidth of the signal we are interested in.
A single impulse produces a flat spectrum, so the noise at frequency f Hz is almost the same as the noise at frequency f+delta Hz, where delta is small. This means you can use the noise at frequencies next to the one you are interested in to estimate and remove the noise in your frequency of interest.
Here is an impulse that lasts two samples, the magnitude spectrum changes slowly, although the phase changes quickly due to the time offset of the impulse.
Turns out that if the impulse position is known, and most of the energy is confined to that impulse, we can make a reasonable estimate of the noise at one frequency, from the noise at adjacent frequencies. Below we estimate the phase and magnitude (green cross) of frequency bin H(k+1) (nearby blue cross) from bin H(k). I’ve actually plotted H(k-1), H(k), and H(k+1) for comparison. The error in the estimation is -44dB down, so that’s a lot of noise removed.
Unfortunately this gets harder when there are multiple impulses in the same time window, and I can’t work out how to remove noise is this case. However this idea might be useful for some classes of impulse noise.
Another idea I tried was “blanking” out the impulses, buy opening and closing a switch so that the impulses are not allowed into the receiver. This works OK when we have a wideband signal, but falls over when just a bandpass version is available. In the bandpass version the “pulse” is smeared over time and we are no longer able to gate it out.
There will also be problems dealing with multiple PWM signals, that have different timing and frequency.
I haven’t looked at samples of the RF received from any real world switcher signals yet. I anticipate the magnitude and phase of the switcher signal will be all over the place, due to some torturous transfer function between the switcher and the terminals of my receiver. Plus various other signals will be present. Possibly there is a wide spectrum (short noise pulses) that we can work with. However I’d much rather deal with narrow bandpass signals consisting of just our wanted signal plus the switcher noise floor.
I might get back to my FreeDV work now, and leave this work on the back burner. I do feel I’m getting my head around the problem, and developing a “bag of tricks” that will be useful when other pieces fall into place.
The urban noise appears to be localised, e.g. if you head out into the country the background noise level is much lower. This suggests it’s coupled into the HF antenna by some local effect like induction. So another approach is to estimate the noise using a separate receiver that just picks up the local noise, through a sense antenna that is inefficient for long distance HF signals.
The local noise sequence could then be subtracted from the HF signal. I am aware of analog boxes that do this, using a magnitude and phase network to match the differences in signals received by the sense and HF antennas.
However a DSP approach will allow a more complex relationship (like an impulse response that extends for several microseconds) between the two antenna signals, and allow automatic adjustment. The noise spectrum can change quickly, as PWM is modulated and multiple devices turn on and off in the neighborhood. However the relationship between the two antennas will change slowly if they are fixed in space. This problem reminds me of echo cancellation, something I have played with before. Given radio hardware is now very cheap ($20 SDR dongles), multiple receivers could also be used.
So my gut feel remains that HF urban noise can be reduced to some extent (e.g. 6 or 12dB suppression) using DSP. If those nasty PWM switchers are inducing RF voltages into our antennas, we can work out a way to subtract those voltages.
OK so after several attempts I finally managed to push a 700D signal from my QTH in Adelaide (PF95gc) 1170km to the Manly Warringah Radio Society WebSDR in Sydney (QF56oh). Bumped my power up a little, raised my antenna, and hunted around until I found a relatively birdie-free frequency, as even low level birdies are stronger than my very weak signal.
Have a listen:
|Analog SSB||700D modem||Decoded 700D DV|
Here is a spectrogram (i.e. a waterfall with the water falling from left to right) of the analog then digital signal:
Faint birdies (tones) can be seen as horizontal lines at 1000 and 2000 Hz. You can see the slow fading on the digital signal as it dips beneath the noise every few seconds.
The scatter diagram looks like bugs (bits?) splattered on a windscreen:
The slow fading causes the errors to bounce up and down over time (above). The packet error rate (measured on the 28 bit Codec 2 frames) is 26%. This is rather high, but I would argue we have intelligible speech here, and that the intelligibility is better than SSB.
I used 4 interleaver frames, which is about 640ms. Perhaps a longer interleaver would ride over the fades.
I’m impressed! Conditions were pretty bad on 40m, the band was “closed”. This is day 1 of FreeDV 700D. It will improve from here.
The Octave demodulator doing it’s thing:
octave:56> ofdm_rx("~/Desktop/700d_part2/manly5_4.wav",4, "manly5_4.err") Coded BER: 0.0206 Tbits: 12544 Terrs: 259 PER: 0.2612 Tpacketerrs: 117 Tpackets: 448 Raw BER..: 0.0381 Tbits: 26432 Terrs: 1007
Not sure if I’m working out raw and coded BER right as they are not usually this close. Will look into that. Maybe all the errors are in the fades, where both the demod and LDPC decoder fall in a heap.
The ofdm_tx/ofdm_rx system transmits test frames of known data, so we can work out the BER. By xor-ing the tx and rx bits we can generate an error pattern that can be used to insert errors into a Codec 2 700C bit stream, using this magic incantation:
~/codec2-dev/build_linux/src$ sox ~/Desktop/cq_freedv_8k.wav ~/Desktop/cq_freedv_8k.wav -t raw -r 8000 -s -2 - | ./c2enc 700C - - | ./insert_errors - - ../../octave/manly5_4.err 28 | ./c2dec 700C - - | sox -t raw -r 8000 -s -2 - ~/Desktop/manly5_4_ldpc224_4.wav
It’s just like the real thing. Trust me. And it gives me a feel for how the system is hanging together earlier rather than after months more development.
Lots of links on the Towards FreeDV 700D post earlier today.
For the last two months I have been beavering away at FreeDV 700D, as part my eternal quest to show SSB who’s house it is.
This work was inspired by Bill, VK5DSP, who kindly developed some short LDPC codes for me; and suggested I could improve on the synchronisation overhead of the cohpsk modem. As an aside – Bill is part if the communications payload team for the QB50 SUSat Cubesat – currently parked at the ISS awaiting launch! Very Kerbal.
Anyhoo – I’ve developed a new OFDM modem that has less syncronisation overhead, works better, and occupies less RF bandwidth (1000 Hz) than the cohpsk modem used for 700C. I have wrapped my head around such arcane mysteries as coding gain and now have LDPC codes playing nicely over that nasty old HF channel.
It looks like FreeDV 700D has a gain of 4dB over 700C. This means error free operation at -2dB SNR for AWGN, and 2dB SNR over a challenging fast fading HF channel (two paths, 1Hz Doppler, 1ms delay).
- An OFDM modem with with low overhead (small Eb/No penalty) synchronisation, even on fading channels.
- Use of LDPC codes.
- Long (several seconds) interleaver.
- Ruthlessly hunting down any dB’s leaking out of my performance curves.
One nasty surprise was that after a closer look at the short (224,112) LDPC codes, I discovered they don’t give any real improvement over the simple diversity scheme used for FreeDV 700C. However with long interleaving (several seconds) of the short codes, or a long (few thousand bit/several seconds) LDPC code we get an additional 3dB gain. The interleaver allows us to ride over the ups and downs of the fast fading channel.
Interleaving has a few downsides. One is delay, the other is when they fail you lose a big chunk of data.
I’ve avoided delay until now, using the argument that low delay is essential for PTT radio. However I’d like to test long delays and see what the trade off/end user experience is. Once someone is speaking – i.e in the middle of an “over” – I suspect we won’t notice the delay. However it could get confusing in fast handovers. This is experimental radio, designed for very low SNRs, so lets give it a try.
We could send the uncoded data without interleaving – allowing low delay decoding when the SNR is high. A switch could control LDPC decoding, allowing a user selection of coded-high-delay or uncoded-low-delay, like a noise banker. Mark, VK5QI, has suggested interleaver depth also be adjustable which I think is a good idea. The decoder could automagically determine interleaver depth by attempting decoding over a range of depths (1,2,4,8,16 frames etc) and noting when the LDPC code converges.
Or maybe we could use a small, low delay, interleaver, and just live with the fades (like we do on SSB) and get the vocoder to mute or interpolate over them, and enjoy low or modest latency.
I’m also interested to see how the LDPC code mops up errors like static bursts and other real-world HF rubbish that SSB subjects us to even on high SNR channels.
So, lots of room for experimentation. At this stage it’s all in GNU Octave simulation form, no C implementation or FreeDV GUI mode exists yet.
Lots more I could write about the engineering behind the modem, but lets leave it there for now and take a look at some results.
Here is a rather busy set of BER versus SNR curves (click for larger version, and here is an EPS file version):
The 10-2 line is where the codec gets easy to listen to.
Observe far-right green (700C) to black (700D candidate with lots of interleaving) HF curves, which are about 4dB apart. Also the far-left cyan shows 700D working at -3dB SNR on AWGN channels. One dB later (-2dB) LDPC magic stomps all errors.
Here are some speech/modem tone samples on simulated channels:
|AWGN -2dB SNR||Analog SSB||700D modem||700D DV|
|HF +0.8dB SNR||Analog SSB||700D modem||700D DV|
The analog samples have a 300 to 2600 Hz BPF applied at the tx and rx side, to model an analog SSB radio. The analog SSB and 700D modem signals have exactly the same RMS power and channel models applied to them. In the AWGN channel, it’s difficult to hear the 700D modem signal, however the SSB is audible as it has peaks 9dB above the average.
OK so the 700 bit/s vocoder (Codec 2 700C) speech quality is not great even with no errors, but we have found it supports conversations just fine, and there is plenty of room for improvement. The same techniques (OFDM modem, LDPC interleaving) can also be applied to high quality/high bit rate/high SNR voice modes. But first – I want to push this low SNR DV work through to completion.
This list summarises the GNU Octave code I’ve developed, as I’ll probably forget the details when I move onto the next project. Feel free to try any of these scripts and let me know what I’ve forgotten to check in. It’s all checked into codec2-dev/octave.
|ldpc.m||Wrapper functions for using the CML library LDPC functions with Octave|
|ldpcut.m||Unit test/demo for ldpc.m|
|ldpc_qpsk.m||Runs simulations for a bunch of codes for AWGN and HF channels using a simulated QPSK OFDM modem. Runs at the Rs (the symbol rate), assumes ideal modem|
|ldpc_short.m||Simulation used for initial short LDPC code investigation using an ideal rate Rs BPSK modem. Bunch of codes and interleaving schemes tested|
|ofdm_lib.m||Library of OFDM modem functions|
|ofdm_rs.m||Rate Rs OFDM modem simulation used to develop low overhead pilot symbol phase estimation scheme|
|ofmd_dev.m||Rate Fs OFDM modem simulation. This is the real deal, with timing and frequency offset estimation, LDPC integration, and tests for coarse timing and frequency offset estimation|
|ofdm_tx.m||Generates test frames of OFDM raw file samples to play over your HF radio|
|ofdm_rx.m||Receives raw file samples from your HF radio and 700D-demodulates-decodes, and measures BER and PER|
Just this morning I tried to radiate some FreeDV 700D from my home to some interstate SDRs on 40M, but alas conditions were against me. I did manage to radiate across my bench so I know the waveform does make it through real HF radios OK.
Please try sending these files through your radio:
|ssb_otx_224_32.wav||32 frame (5.12 second) interleaver|
|ssb_otx_224_4.wav||4 frame (0.64 second) interleaver|
Get someone (or a websdr) to sample the received signal (8000Hz sample rate, 16 bit mono), and email me the received file.
Or you can decode it yourself using:
The rx side is still a bit rough, I’ll refine it as I try the system with real off-air signals and flush out the bugs.
QB50 SUSat cubesat – Bill and team’s Cubesat currently parked at the ISS!
Codec 2 700C and Short LDPC Codes
Testing FreeDV 700C
Modems for HF Digital Voice Part 1
Modems for HF Digital Voice Part 2
FreeDV 700D – First Over The Air Tests
Today, like most mornings, I biked to a cafe to hack on my laptop while slurping on iced coffee. Exercise, fresh air, sugar, caffeine and R&D. On this lovely sunny Autumn day I’m tapping away on my lappy, teasing bugs out of my latest digital radio system.
Creating new knowledge is a slow, tedious, business.
I test each small change by running an experiment several hundred thousand times, using simulation software on my laptop. R&D – Science by another name – is hard. One in ten of my ideas actually work, despite being at the peak of my career, having a PhD in the field, and help from many very intelligent peers.
A different process is going on at the table next to me. An “Integrative Health Consultant” is going about her business, speaking to a young client.
In an earnest yet authoritative Doctor-voice the “consultant” revs up with ill-informed dietary advice, moves on to over-priced under-performing products that the consultant just happens to sell, and ends up with a thinly disguised invitation to join her Multi-Level-Marketing (MLM) organisation. With a few side journeys through anti-vaxer land, conspiracy theories, organic food, anti-carbo and anti-gluten, sprinkled with disparaging remarks on Science, evidence based medicine and an inspired stab at dissing oncology (“I know this guy who had chemo and still died!”). All heavily backed by n=1 anecdotes.
A hobby of mine is critical thinking, so I am aware that most of their conversation is bullshit. I know how new knowledge is found (see above) and it’s not from Facebook.
But this post is not about the arguments of alt-med versus evidence based medicine. Been there, done that.
Here is what bothers me. These were both good people, who more or less believe in what they say. They are not stupid, they are intelligent and want to help people get and stay healthy. I have friends and family that I love who believe this crap. But they are hurting society and making people sicker.
Steering people away from modern, evidence based medicine kills people. Someone who is persuaded to see a naturopath rather than an oncologist will find out too late the price of well-meaning ignorance. Anti-vaxers hurt, maim, and kill for their beliefs. I shudder to think of the wasted lives and billions of dollars that could be spent on far better outcomes than lining the pockets of snake oil salespeople.
There is some encouraging news. The Australian Government has started removing social security benefits from people who don’t vaccinate. The Nursing and Midwifery Board is also threatening to take action against Nurses who push an anti-vaccination stance.
But this is beating people with a stick, where is the carrot?
Doctors in the Dark Ages were good people. They really believed leaches, blood letting and prayer where helping the patients they loved. But those beliefs sustained untold human misery. The difference with today?
Science, Education, and Policy.
Yesterday I was chatting on the #freedv IRC channel, and a good question was asked: how close is Codec 2 to AMBE+2 ? Turns out – reasonably close. I also discovered, much to my surprise, that Codec 2 700C is better than MELPe 600!
|Original||AMBE+2 3000||AMBE+ 2400||Codec 2 3200||Codec 2 2400|
|Original||MELPe 600||Codec 2 700C|
Here are all the samples in one big tar ball.
I don’t have a AMBE or MELPe codec handy so I used the samples from the DVSI and DSP Innovations web sites. I passed the original “DAMA” speech samples found on these sites through Codec 2 (codec2-dev SVN revision 3053) at various bit rates. Turns out the DAMA samples were the same for the AMBE and MELPe samples which was handy.
These particular samples are “kind” to codecs – I consistently get good results with them when I test with Codec 2. I’m guessing they also allow other codecs to be favorably demonstrated. During Codec 2 development I make a point of using “pathological” samples such as hts1a, cg_ref, kristoff, mmt1 that tend to break Codec 2. Some samples of AMBE and MELP using my samples on the Codec 2 page.
I usually listen to samples through a laptop speaker, as I figure it’s close to the “use case” of a PTT radio. Small speakers do mask codec artifacts, making them sound better. I also tried a powered loud speaker with the samples above. Through the loudspeaker I can hear AMBE reproducing the pitch fundamental – a bass note that can be heard on some males (e.g. 7), whereas Codec 2 is filtering that out.
I feel AMBE is a little better, Codec 2 is a bit clicky or impulsive (e.g. on sample 1). However it’s not far behind. In a digital radio application, with a small speaker and some acoustic noise about – I feel the casual listener wouldn’t discern much difference. Try replaying these samples through your smart-phone’s browser at an airport and let me know if you can tell them apart!
On the other hand, I think Codec 2 700C sounds better than MELPe 600. Codec 2 700C is more natural. To my ear MELPe has very coarse quantisation of the pitch, hence the “Mr Roboto” sing-song pitch jumps. The 700C level is a bit low, an artifact/bug to do with the post filter. Must fix that some time. As a bonus Codec 2 700C also has lower algorithmic delay, around 40ms compared to MELPe 600’s 90ms.
Curiously, Codec 2 uses just 1 voicing bit which means either voiced or unvoiced excitation in each frame. xMBE’s claim to fame (and indeed MELP) over simpler vocoders is the use of mixed excitation. Some of the spectrum is voiced (regular pitch harmonics), some unvoiced (noise like). This suggests the benefits of mixed excitation need to be re-examined.
I haven’t finished developing Codec 2. In particular Codec 2 700C is very much a “first pass”. We’ve had a big breakthrough this year with 700C and development will continue, with benefits trickling up to other modes.
However the 1300, 2400, 3200 modes have been stable for years and will continue to be supported.
Here is the blog post that kicked off Codec 2 – way back in 2009. Here is a video of my linux.conf.au 2012 Codec 2 talk that explains the motivations, IP issues around codecs, and a little about how Codec 2 works (slides here).
What I spoke about then is still true. Codec patents and license fees are a useless tax on business and stifle innovation. Proprietary codecs borrow as much as 95% of their algorithms from the public domain – which are then sold back to you. I have shown that open source codecs can meet and even exceed the performance of closed source codecs.
Wikipedia suggests that AMBE license fees range from USD$100k to USD$1M. For “one license fee” we can improve Codec 2 so it matches AMBE+2 in quality at 2400 and 3000 bit/s. The results will be released under the LGPL for anyone to use, modify, improve, and inspect at zero cost. Forever.
Maybe we should crowd source such a project?
This is how I generated the Codec 2 wave files:
~/codec2-dev/build_linux//src/c2enc 3200 9.wav - | ~/codec2-dev/build_linux/src/c2dec 3200 - - | sox -t raw -r 8000 -s -2 - 9_codec2_3200.wav
DSP Innovations, MELPe samples. Can anyone provide me with TWELP samples from these guys? I couldn’t find any on the web that includes the input, uncoded source samples.
In the last blog post I evaluated FreeDV 700C over the air. This week I’ve been simulating the use of short LDPC FEC codes with Codec 2 700C over AWGN and HF channels.
In my HF Digital Voice work to date I have shied away from FEC:
- We didn’t have the bandwidth for the extra bits required for FEC.
- Modern, high performance codes tend to have large block sizes (1000’s of bits) which leads to large latency (several seconds) when applied to low bit rate speech.
- The error rates we are interested in (e.g. 10% raw, 1% after FEC decoder) are unusual – many codes don’t work well.
However with Codec 2 pushed down to 700 bit/s we now have enough bandwidth for a rate 1/2 code inside a standard 2kHz SSB channel. Over coffee a few weeks ago, Bill VK5DSP offered to develop some short LDPC codes for me specifically for this application. He sent me an Octave simulation of rate 1/2 and 2/3 codes of length 112 and 56 bits. Codec 2 700C has 28 bit frames so this corresponds to 4 or 2 Codec 2 700C frames, which would introduce a latencies of between 80 to 160ms – quite acceptable for Push To Talk (PTT) radio.
I re-factored Bill’s simulation code to produce ldpc_short.m. This measures BER and PER for Bill’s short LDPC codes, and also plots curves for theoretical, HF multipath channels, a Golay (24,12) code, and the current diversity scheme used in FreeDV 700C.
To check my results I compared the Golay BER and ideal HF multipath (Rayleigh Fading) channel curves to other peoples work. Always a good idea to spot check a few values and make sure they are sensible. I took a simple approach to get results in a reasonable amount of coding time (about 1 day of work in this case). This simulation runs at the symbol rate, and assumes ideal synchronisation. My other modem work (i.e experience) lets me move back and forth between this sort of simulation and real world modems, for example accounting for synchronisation losses.
Error Distribution and Packet Error Rate
I had an idea that Packet Error Rate (PER) might be important. Without FEC, bit errors are scattered randomly about. At our target 1% BER, many frames will have 1 or 2 bit errors. As discussed in the last post Codec 2 700C is sensitive to bit errors as “every bit counts”. For example one bit error in the Vector Quantiser (VQ) index (a big look up table) can throw the speech spectrum right off.
However a LDPC decoder will tend to correct all errors in a codeword, or “die trying” (i.e. fail badly). So an average output BER of say 1% will consist of a bunch of perfect frames, plus a completely trashed one every now and again. Digital voice works better with this style of error pattern than a few random errors in each codec packet. So for a given BER, a system that delivers a lower PER is better for our application. I’ve guesstimated a 10% PER target for intelligible low bit rate speech. Lets see how that works out…..
Here are the BER and PER curves for an AWGN channel:
Here are the same curves for HF (multipath fading) channel:
I’ve included a Golay (24,12) block code (hard decision) and uncoded PSK for comparison to the AWGN curves, and the diversity system on the HF curves. The HF channel is modelled as two paths with 1Hz Doppler spread and a 1ms delay.
The best LDPC code reaches the 1% BER/10% PER point at 2dB Eb/No (AWGN) and 6dB (HF multipath). Comparing BER, the coding gain is 2.5 and 3dB (AWGN and HF). Comparing PER, the coding gain is 3 and 5dB (AWGN and HF).
Here is a plot of the error pattern over time using the LDPC code on a HF channel at Eb/No of 6dB:
Note the errors are confined to short bursts – isolated packets where the decoder fails. Even though the average BER is 1%, most of the speech is error free. This is a very nice error distribution for digital speech.
Here are some speech samples, comparing the current diversity scheme used for FreeDV 700C to LDPC, for AWGN and LDPC channels. These were simulated by extracting the error pattern from the simulation then inserting these errors in a Codec 2 700C bit stream (see command lines section below).
|AWGN Eb/No 2dB||Diversity||LDPC|
|HF Eb/No 6dB||Diversity||LDPC|
These results are very encouraging and suggest a gain of 2 to 5dB over FreeDV 700C, and better error distribution (lower PER). Next step is to develop FreeDV 700D – a real world implementation using the 112 data-bit rate 1/2 LDPC code. This will require 4 frames of buffering, and some sort of synchronisation to determine the 112 bit frame boundaries. Fortunately much of the C code for these LDPC codes already exists, as it was developed for the Wenet High Altitude Balloon work.
If most frames at the decoder input are now error free, we can consider more efficient (but less robust) techniques for Codec 2, such as prediction (delta coding). This will decrease the codec bit rate for a given speech quality. We could then choose to reduce our bit rate (making the system more robust for a given channel SNR), or raise speech quality while maintaining the same bit rate.
Generating the decoded speech, first run the Octave ldpc_short simulation to generate “error pattern file”, then subject the Codec 2 700C bit stream to these error patterns.
octave:67> ldpc_short $ ./c2enc 700C ../../raw/ve9qrp_10s.raw - | ./insert_errors - - ../../octave/awgn_2dB_ldpc.err 28 | ./c2dec 700C - - | aplay -f S16_LE -
The simulation generate .eps files as direct generation of PNG leads to font size issues. Converting EPS to PNG without transparent background:
mogrify -resize 700x600 -density 300 -flatten -format png *.eps
However I still feel the images are a bit fuzzy, especially the text. Any ideas? Here’s the eps file if some one would like to try to get a nicer PNG conversion for me! The EPS file looks great at any scaling when I render it using the Ubuntu document viewer.
Update: A friend of mine (Erich) has suggested using GIMP for the conversion. This does seem to work well and has options for text and line anti-aliasing. It would be nice to be able to generate nice PNGs directly from Octave – my best approach so far is to capture screen shots.
LowSNR site Bill VK5DSP writes about his experiments in low SNR communications.
Wenet High Altitude Balloon SSDV System developed with Mark VK5QI and BIll VK5DSP that uses LDPC codes.