FreeDV Robustness Part 1

Here is the next installment in my adventure of making FreeDV work at least as well as analog SSB over HF multipath fading channels. I’ve included the command lines I am using for those who want to play along with me.

Incremental Improvements

Over the past few weeks I have made some incremental improvements:

  1. Bill Cowley pointed out the Gray code mapping was wrong. Fixing it is worth up to a 25% reduction in the Bit Error Rate (BER).
  2. Some SSB transmitters may have a non-flat frequency response. This can have a big effect on FreeDV performance. I encourage anyone using FreeDV to test their transmitter’s frequency response using this wave file and a power meter. This file sweeps between 1000 Hz and 2000 Hz over 10 seconds. Just play it through the same rig interface at the same levels as you would run FreeDV. The power of the tone is about the same as the FreeDV modem signal. Monitor the variation in power over the sweep. For example, if the power changes between 10W and 20W, that’s a 3dB slope.
  3. Improvements to the sync state machine algorithm to avoid sync dropping out when a fade temporarily wipes out the centre BPSK sync carrier.
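
The 3dB figure in item 2 is just the ratio of the two power meter readings expressed in dB. Here is a minimal sketch of that arithmetic (the function name is mine, and the readings are the example values above) for anyone who wants to plug in their own numbers:

```python
import math

def power_slope_db(p_low_watts, p_high_watts):
    """Change in power, in dB, between two power meter readings."""
    return 10.0 * math.log10(p_high_watts / p_low_watts)

# Example readings taken at the two extremes of the sweep
print(f"{power_slope_db(10.0, 20.0):.2f} dB")  # prints 3.01 dB
```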

Testing FreeDV over HF Channels

I need a repeatable way to test the system. The first step was to generate a 30 second file (1400 bit/s x 30 = 42000 bits) of modulated test bits (a 112 bit sequence that repeats every 80ms) using the command line tools:
~/codec2-dev/src$ ./fdmdv_get_test_bits - 42000 | ./fdmdv_mod - mod_test_v1.1.raw

This file was converted to a wave file using sox and sent to some friends for transmission over the air. This gives me samples of bit errors over real HF channels. I have also been passing it through the Pathsim HF channel simulator (which runs well on Linux under Wine).

After processing by Pathsim, I can use the Octave demod simulation to demodulate the signal and generate error patterns:
octave:12> fdmdv_demod("../src/mod_test_v1.1_moderate_10dB.wav",1400*30,"moderate_10dB_inter.err")

Here are the waterfall and bit error patterns for the CCIR 10dB Moderate channel:

Using my handy collection of command line tools I can then apply this bit error pattern to a Codec 2 bit stream and listen to the results:
/codec2-dev/src$ ./insert_errors ve9qrp.c2 - ../octave/moderate_10dB.err 0 55 | ./c2dec 1400 - - | play -t raw -r 8000 -s -2 -

Note ve9qrp.c2 is a 1400 bit/s Codec 2 file, and the “0 55” parameters specify the range of bits in each frame to apply bit errors to. This range lets me simulate the effect of errors on different parts of the Codec 2 frame. Not all bits are created equal. I happen to know that the excitation bits (voicing, pitch/energy VQ) are the most sensitive. They live in the first 20 bits of the 56 bit frame.
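
To make the idea of a bit range concrete, here is a rough sketch of what applying an error pattern to part of each frame looks like. This is not the insert_errors source. The function name is made up, and I am assuming one error flag per bit, lined up with the start of a frame:

```python
FRAME_BITS = 56  # bits per frame for 1400 bit/s Codec 2

def apply_errors(codec_bits, error_bits, start_bit, end_bit, frame_bits=FRAME_BITS):
    """XOR the error pattern into the codec bits, but only for bit positions
    start_bit..end_bit (inclusive) of each frame. Other bits pass through clean."""
    out = []
    for i, bit in enumerate(codec_bits):
        pos = i % frame_bits                    # position of this bit within its frame
        err = error_bits[i % len(error_bits)]   # wrap the pattern if it is shorter
        out.append(bit ^ err if start_bit <= pos <= end_bit else bit)
    return out

# Two all-zero frames hit by a worst-case pattern (every bit in error)
frames = [0] * (2 * FRAME_BITS)
pattern = [1] * (2 * FRAME_BITS)
protected = apply_errors(frames, pattern, 24, 55)
print(sum(protected[:24]), sum(protected[24:56]))  # prints 0 32
```

With the range set to 24 to 55, the first 24 bits of every frame come through untouched, no matter how bad the error pattern is.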

I am interested in using a (24,12) Golay code, which protects data in blocks of 12 bits. If we protect two blocks of 12 bits, that’s the first 24 bits protected. Let’s assume this code can correct all bit errors. To simulate the effect of perfect FEC on the first 24 bits (with no protection on the last 32 bits) we can use:
/codec2-dev/src$ ./insert_errors ve9qrp.c2 - ../octave/moderate_10dB.err 24 55 | ./c2dec 1400 - - | play -t raw -r 8000 -s -2 -

This seems to work pretty well, as you can see from the samples below:

Sample
Modem signal
Modem signal (CCIR moderate 10dB)
Codec 2 with errors on all bits 0 to 55
Codec 2 with errors on bits 24 to 55 (simulated FEC)
Codec 2 with no errors
Analog SSB (CCIR moderate 10dB)

This suggests some FEC protecting the first 24 bits will give good performance on HF multipath channels, with intelligibility similar to analog SSB – but without the noise.
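
For readers who have not met the (24,12) Golay code mentioned earlier, here is a small sketch of how it behaves. It uses the standard generator polynomial for the (23,12) code plus an overall parity bit. The brute-force decoder is only for illustration and is my own shortcut, not how a real modem would do it. A real Golay code also can’t correct all bit errors as assumed above; it handles up to 3 errors in each 24 bit block.

```python
GOLAY_POLY = 0xC75  # x^11 + x^10 + x^6 + x^5 + x^4 + x^2 + 1

def golay_encode(data12):
    """Encode 12 data bits as a 24 bit extended Golay codeword."""
    # Remainder of data * x^11 divided by the generator polynomial (GF(2) long division)
    reg = data12 << 11
    for shift in range(11, -1, -1):
        if reg & (1 << (shift + 11)):
            reg ^= GOLAY_POLY << shift
    cw23 = (data12 << 11) | reg          # systematic: data bits then 11 check bits
    parity = bin(cw23).count("1") & 1    # overall parity bit extends (23,12) to (24,12)
    return (cw23 << 1) | parity

# All 4096 codewords, indexed by data word
CODEBOOK = [golay_encode(d) for d in range(4096)]

def golay_decode(rx24):
    """Illustrative brute-force decoder: return the data word whose codeword
    is closest (in Hamming distance) to the received 24 bits."""
    return min(range(4096), key=lambda d: bin(CODEBOOK[d] ^ rx24).count("1"))

data = 0xABC
errors = (1 << 23) | (1 << 12) | (1 << 2)    # three bit errors in one block
print(golay_decode(golay_encode(data) ^ errors) == data)  # prints True
```

Each 12 bit block costs 12 extra bits on the channel, which is why protecting 24 bits needs 24 bits of overhead.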

Increasing the bit rate

We currently use all of the 1400 bit/s for the Codec 2 data. We now need some more bits to carry FEC information.

One strategy is to make Codec 2 operate at a lower bit rate (say 1000 bit/s), then use the balance (say 400 bit/s) for FEC. This is tricky, as Codec 2 already operates at a pretty low bit rate. It might be possible if we make wider use of prediction of the Codec parameters (i.e. sending differences), and/or vector quantisation (big tables to quantise a bunch of parameters at once). Prediction means the effect of an uncorrected bit error propagates into future frames. Vector quantisation means a single bit error can change many parameters all at once.

So lowering the Codec 2 bit rate is likely to make the Codec more sensitive to bit errors. This might still be OK if we use blanket FEC protection of all bits. Some people prefer this approach, but I am not going to work on it for now. Instead I will stick with the 1400 bit/s speech codec and look at increasing the bit rate over the channel to provide bits for FEC.

Let’s say we need to protect 24 bits/frame with a half rate code, which adds 24 parity bits per frame. That means we need a total of 56+24 = 80 bits/frame, or 2000 bit/s at 25 frames/s. Some options are:

  • Go from 14 to 20 QPSK carriers, this will make the bandwidth 20x75Hz/carrier = 1500Hz, and reduce the power per carrier by 14/20 or 1.54dB. The wider bandwidth may cause problems with the skirts of the Tx audio filtering. Twenty carriers will also make Peak/Average Power Ratio (PAPR) problems worse, perhaps reducing our effective power output further by requiring further tx drive back off.
  • We could increase the symbol rate from 50 to 100 baud. This would reduce the energy/symbol by 3dB, but we would need only 10 carriers to get 2000 bit/s which means a power increase per carrier of 14/10 or 1.46dB so once again we have a difference of 3-1.46 = 1.54dB. The bandwidth of the signal would be 10*150Hz/carrier = 1500Hz. PAPR would be improved as we have only 10 carriers.
  • We could go to 10 carriers using 8PSK. This would reduce our energy/bit by about 3dB (according to this graph), but we would gain 1.46dB by moving from 14 carriers to 10, giving a raw performance 1.54dB less than the current 1400 bit/s waveform. The bandwidth would be a very tight 10x75Hz/carrier = 750Hz, narrower than the current 1400 bit/s, 1100Hz wide waveform. PAPR will be better. The narrow bandwidth has pros and cons – good for channel density and interference, but bad for frequency diversity – a wider bandwidth is generally better for handling frequency selective fading.
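
The arithmetic behind these options is easy to get tangled in, so here is a sketch that reproduces it. The function names are mine. I am assuming the same total transmit power is shared equally among the carriers, and that carrier spacing is 1.5 times the symbol rate (which matches 75Hz at 50 baud and 150Hz at 100 baud). The 3dB penalty for 8PSK is simply the figure quoted above, not something the code derives.

```python
import math

CODEC_BITS = 56                      # Codec 2 bits per frame at 1400 bit/s
FRAMES_PER_S = 1400 / CODEC_BITS     # 25 frames/s, i.e. 40ms frames
SPACING_FACTOR = 1.5                 # assumed: carrier spacing = 1.5 x symbol rate

def db(ratio):
    """Power ratio expressed in dB."""
    return 10.0 * math.log10(ratio)

def channel_bit_rate(protected_bits, code_rate=0.5):
    """Bit rate over the channel when protected_bits per frame get FEC at code_rate."""
    parity_bits = protected_bits / code_rate - protected_bits
    return (CODEC_BITS + parity_bits) * FRAMES_PER_S

def per_carrier_change_db(old_carriers, new_carriers):
    """Change in power per carrier when the same total power is shared out differently."""
    return db(old_carriers / new_carriers)

def bandwidth_hz(carriers, baud):
    """Occupied bandwidth for equally spaced carriers."""
    return carriers * SPACING_FACTOR * baud

print(channel_bit_rate(24))   # 80 bits/frame at 25 frames/s
# Option 1: 20 QPSK carriers at 50 baud
print(round(per_carrier_change_db(14, 20), 2), bandwidth_hz(20, 50))
# Option 2: 10 QPSK carriers at 100 baud (doubling the symbol rate halves energy/symbol)
print(round(per_carrier_change_db(14, 10) + db(50 / 100), 2), bandwidth_hz(10, 100))
# Option 3: 10 8PSK carriers at 50 baud, with the quoted 3dB 8PSK penalty
print(round(per_carrier_change_db(14, 10) - 3.0, 2), bandwidth_hz(10, 50))
```

All three options come out roughly 1.5dB down on the current waveform, so the choice comes down to bandwidth and PAPR rather than raw power.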

Why does Analog SSB work so well?

I have been thinking about the differences between analog SSB and digital speech. I figure we might be able to learn from the strengths of Analog SSB:

  • Frequency selective fading causes frequency selective filtering of the analog signal, which is no big deal as speech is very redundant. With digital speech a frequency selective fade may wipe out a bit that carries information for the entire spectrum, for example the voicing, pitch or energy bits.
  • There is no memory in analog speech. An “error” in the speech spectrum only affects the signal at the time it occurs. A bit error in digital speech may affect future bits for a few hundred ms, if predictive coding of model parameters is used (e.g. the pitch/energy quantiser in Codec 2).
  • As the SNR drops the analog signal drops into the noise. The human ear is good at digging signals out of noise, although it can be fatiguing. Digital errors due to low SNR cause R2D2 sounds that the human ear finds harder to make sense of.
  • Analog speech automatically applies more power to the perceptually important low speech frequencies, as the audio signal entering the microphone from your mouth has higher low frequency energy. So as the SNR drops the perceptually important low frequency energy is the last to go, an ideal situation.
  • The FDM modem waveform is sensitive to clipping of the peaks. The analog signal processing of the SSB receiver and ear of the listener is not. So analog speech can be sent at a higher average power level.
  • Codec 2 produces intelligible speech up to a bit error rate of about 2%. Now 0.02 of a 56 bit frame is about 1 bit error. So if even one carrier is suppressed by a fade we are in trouble with digital speech.
  • The FDMDV modem signal is narrower than analog SSB, so a fade of the same width wipes out proportionally more of the digital signal.
  • A fade during silence (say between syllables or words) or in a low energy (inter-formant) area of the speech spectrum has no effect on analog signals. A fade in silence frames or in any part of the modem signal spectrum has an effect on the digital speech.
  • For comparison, here is the waterfall (spectrogram) for the analog SSB sample:

    The strong vertical lines are due to clipping by the channel simulator as its AGC adjusts. This can also be heard as clipping in the analog sample. This is probably a setup error on my part. The comb-like lines are the pitch harmonics of the speaker. Note the bandwidth is much wider than the FDMDV signal. Unlike the spectrogram of the FDMDV signal, it’s not obvious where the fades are. Most of the “area” of the spectrogram is low energy (blue) – silence in the time domain or low energy regions in the frequency domain.
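
To put a number on the “one carrier suppressed” point in the list above, here is a back-of-envelope sketch. It assumes the 56 bits of each frame are spread evenly over the 14 QPSK carriers, that a carrier wiped out by a fade delivers coin-flip bits (half of them wrong on average), and that the other carriers are error free. The function name is mine.

```python
FRAME_BITS = 56
CARRIERS = 14
BITS_PER_CARRIER = FRAME_BITS // CARRIERS   # 4 bits per carrier per frame

def frame_ber(faded_carriers, ber_in_fade=0.5):
    """Average bit error rate over a frame when some carriers are wiped out.
    ber_in_fade=0.5 is the coin-flip assumption for a fully faded carrier."""
    bad_bits = faded_carriers * BITS_PER_CARRIER * ber_in_fade
    return bad_bits / FRAME_BITS

print(f"{0.02 * FRAME_BITS:.2f} bit errors/frame at 2% BER")
print(f"{100 * frame_ber(1):.1f}% BER with one carrier faded")
print(f"{100 * frame_ber(2):.1f}% BER with two carriers faded")
```

Under these assumptions a single faded carrier already pushes the BER past the 2% mark.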

Next Steps

The next step is to modify the Octave modem simulation so it supports 8PSK as well as QPSK, and a run-time defined number of carriers. This can then be used to generate waveforms that can be played over the air or through the channel simulator. I’ll then build up some command line programs to send Codec frames that include various combinations of FEC.

2 thoughts on “FreeDV Robustness Part 1”

  1. David,

    This stuff about SSB vs Digital voice is a fascinating line of thought. I wonder if other digital voice systems that operate over the unreliable internet have come up with something.

    Have you noticed how Skype tends to make an “aaah” sound when packets have been missed? I wonder if they have a strategy of covering missing data with an extension of the kind of sound preceding it. Perhaps this happens more than we know?

    Also, you mention the gaps between speech, perhaps something else could be sent during those times – for example the text data. Often people pause during a contact for a few seconds.

    Great work!

    Peter

    1. Hi Peter,

      Thanks. Over IP links voice packets are sent using UDP, which sometimes results in packet loss. In this case, repeating the last packet is usually a good approximation, and the lost packet may not be noticed (especially if it’s a silence or unvoiced frame). One significant difference is the “burst error” duration: for VOIP it may be a few tens of ms, for HF radio it can be a second or so. When a Skype packet loss extends beyond a few tens of ms, you get the ahhh sound as the approximation of repeating the last packet no longer works.

      In one of the early versions of FreeDV I experimented with the idea of substituting a text packet during low energy frames (background noise or speech). It worked OK but the Achilles heel was the need for an extra bit to specify voice/data packet. This bit was very sensitive to bit errors so I chickened out and went to a 1 bit/frame text channel.

      – David
