Natural and Gray Coding

After writing up the Variable Power Quantiser work I added another function to my fuzzy_gray.m Octave simulation to compare natural and Gray coded binary.

Here are some results for 3, 4 and 5 bit quantisers over a range of errors:

Curiously, the natural binary results are a little better (about 1dB less Eb/No for the same SNR). Another surprise is that at low Eb/No (high BERs) the SNRs are about the same for each quantiser. For example, the SNR is around 9dB at Eb/No = -2dB for the 5, 4 and 3 bit quantisers.

Here is a plot of 2 to 7 bit natural binary quantisers over a wide Eb/No range. Up to about Eb/No of 4dB (a BER of 1%), the 3-7 bit quantisers all work about the same! At lower BER (higher Eb/No), the quantisation noise starts to dominate and the higher resolution quantisers work better. Each extra bit adds about 6dB of improved SNR.

Channel errors dominate the SNR at BER greater than 1% (Eb/No=4dB). In some sense the extra quantiser bits are “wasted”. This may not be true in terms of subjective decoded speech quality. The occasional large error tends to drag the SNR measure down, as large errors dominate the noise power. Subjectively, this might be a click, followed by several seconds of relatively clean speech. So more (subjective) testing is required to determine if natural or Gray coding is best for Codec 2 parameters. The SNR results suggest there is not much advantage either way.

Here is a plot of the error from the natural and Gray coded quantisers at Eb/No=-2dB. Occasionally, the Gray coded error is very large (around 1.0), compared to the natural coded error which has a maximum of around 0.5.

This example of a 3 bit quantiser helps us understand why. The natural binary and Gray codes are listed below the quantiser values:

Quantised Value      0.0  0.125  0.25  0.375  0.5  0.625  0.75  0.875
Natural Binary Code  000  001    010   011    100  101    110   111
Gray Code            000  001    011   010    110  111    101   100

Although Gray codes are robust to some bit errors (for example 000 and 001), they also have some large jumps, for example the 000 and 100 codes are only 1 bit error apart but jump the entire quantiser range. Natural binary has an exponentially declining error step for each bit.
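The jump described above can be checked with a few lines of Python. This is a minimal sketch, not the code in fuzzy_gray.m: it assumes the standard binary-reflected Gray code (which reproduces the table above) and the function names are made up for illustration. It flips each bit of each code word in turn and records the largest change in the decoded value.

```python
def gray_encode(index):
    """Binary-reflected Gray code of a quantiser index."""
    return index ^ (index >> 1)


def gray_decode(code):
    """Invert gray_encode by folding the shifted code back in."""
    index = code
    code >>= 1
    while code:
        index ^= code
        code >>= 1
    return index


def worst_single_bit_error(bits, use_gray):
    """Largest change in decoded value caused by flipping any one bit."""
    levels = 1 << bits
    step = 1.0 / levels
    worst = 0.0
    for index in range(levels):
        code = gray_encode(index) if use_gray else index
        for bit in range(bits):
            corrupted = code ^ (1 << bit)
            decoded = gray_decode(corrupted) if use_gray else corrupted
            worst = max(worst, abs(decoded - index) * step)
    return worst


if __name__ == "__main__":
    print("natural:", worst_single_bit_error(3, use_gray=False))  # 0.5
    print("Gray:   ", worst_single_bit_error(3, use_gray=True))   # 0.875
```

For the 3 bit quantiser the worst single bit error is 0.5 for natural binary and 0.875 for Gray code (the 000 to 100 jump). With 5 bits the Gray figure rises to 31/32, which lines up with the occasional errors of around 1.0 in the plot.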

Variable Power Quantisation

A common task in speech coding is to take a real (floating point) number and quantise it to a fixed number of bits for sending over the channel. For Codec 2 a good example is the energy of the speech signal. This is sampled at a rate of 25Hz (once every 40ms) and quantised to 5 bits.

Here is an example of a 3 bit quantiser that can be used to quantise a real number in the range 0 to 1.0:

Quantised Value  0.0  0.125  0.25  0.375  0.5  0.625  0.75  0.875
Binary Code      000  001    010   011    100  101    110   111

The quantiser has 8 levels and a step size of 0.125 between levels. This introduces some quantisation “noise”, as the quantiser can’t represent all input values exactly. The quantisation noise reduces as the number of bits, and hence number of quantiser levels, increases. Every additional bit doubles the number of levels, so halves the step size between each level. This means the signal to noise ratio of the quantiser increases by 6dB per bit.
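The 6dB per bit rule can be illustrated with a small Monte Carlo run. This is a sketch under stated assumptions, not the Octave simulation: the input is uniformly distributed over 0 to 1, the quantiser truncates to the level at or below the input (matching the table, where the levels start at 0.0), and the function name is hypothetical.

```python
import math
import random


def quantiser_snr_db(bits, samples=100000, seed=1):
    """Estimate the SNR of a uniform quantiser over the range 0 to 1."""
    rng = random.Random(seed)
    levels = 1 << bits
    step = 1.0 / levels
    signal_power = 0.0
    noise_power = 0.0
    for _ in range(samples):
        x = rng.random()                # uniform input in [0, 1)
        index = int(x / step)           # truncate to the level at or below x
        index = min(index, levels - 1)  # guard against rounding at the top edge
        error = x - index * step
        signal_power += x * x
        noise_power += error * error
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    for bits in (3, 4, 5):
        print(bits, "bits:", round(quantiser_snr_db(bits), 2), "dB")
```

Halving the step size quarters the noise power, and 10log10(4) is about 6dB, so each extra bit should lift the printed SNR by roughly that amount.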

We use a modem to send the bits over the channel. Each bit is usually allocated the same transmit power. In poor channels, we get bit errors when the noise overcomes the signal and a 1 turns into a 0 (or a 0 into a 1). These bit errors effectively increase the noise in the decoded value, and therefore reduce the SNR. We now have errors from the quantisation process and bit errors during transmission over the channel.

However not all bits are created equal. If the most significant bit is flipped due to an error (say 000 to 100), the decoded value will be changed by 0.5. If there is an error in the least significant bit, the change will be just 0.125. So I decided to see what would happen if I allocated a different transmit power to each bit. I chose the 5 bits used in Codec 2 to transmit the speech energy. I wrote some Octave code to simulate passing these 5 bits through a simple BPSK modem at different Eb/No values (Eb/No is proportional to the SNR of a radio channel, which is different to the SNR of the quantiser value).

I ran two simulations, first a baseline simulation where all bits are transmitted with the same power. The second simulation allocates more power to the more significant bits. Here are the amplitudes used for the BPSK symbol representing each bit. The power of each bit is the amplitude squared:

Bit             4     3     2     1     0
Baseline        1.0   1.0   1.0   1.0   1.0
Variable Power  1.61  1.20  0.80  0.40  0.40

Both simulations have the same total power for each 5 bit quantised value (e.g. 1*1 + 1*1 + 1*1 + 1*1 + 1*1 = 5W for the baseline). Here are some graphs from the simulation. The first graph shows the Bit Error Rate (BER) of the BPSK modem. We are interested in the region on the left, where the BER is higher than 10%.
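The power bookkeeping and the expected per-bit error rates can be sketched in Python. This is an illustration rather than the Octave simulation: it assumes the textbook result for coherent BPSK over AWGN, BER = 0.5*erfc(sqrt(Eb/No)), scaled by each bit's amplitude, with Eb/No defined for the equal power case. The function names are hypothetical.

```python
import math

# BPSK symbol amplitudes from the table above, bit 4 (most significant) first
BASELINE = [1.0, 1.0, 1.0, 1.0, 1.0]
VARIABLE = [1.61, 1.20, 0.80, 0.40, 0.40]


def total_power(amplitudes):
    """Power per quantised value: the sum of amplitude squared over all bits."""
    return sum(a * a for a in amplitudes)


def bpsk_ber(ebno_db, amplitude=1.0):
    """Theoretical BPSK bit error rate in AWGN for a symbol of given amplitude."""
    ebno = 10.0 ** (ebno_db / 10.0)
    return 0.5 * math.erfc(amplitude * math.sqrt(ebno))


def mean_ber(ebno_db, amplitudes):
    """Average BER over all bits of the quantised value."""
    return sum(bpsk_ber(ebno_db, a) for a in amplitudes) / len(amplitudes)


if __name__ == "__main__":
    for name, amps in (("baseline", BASELINE), ("variable", VARIABLE)):
        print(name, "total power:", round(total_power(amps), 3))
        print(name, "per-bit BER at -2dB:",
              [round(bpsk_ber(-2.0, a), 3) for a in amps])
        print(name, "mean BER at -2dB:", round(mean_ber(-2.0, amps), 3))
```

At Eb/No = -2dB this gives a mean BER of about 13% for the baseline and around 19% for the variable power scheme, close to the simulated figures quoted further down. The variable power scheme has the higher average BER, but the errors are pushed onto the least significant bits where they do less damage.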

The second graph shows the quantiser SNR performance for the baseline and variable power schemes. At high BER the variable power scheme is about 6dB better than the baseline.

The third figure shows the histograms of the quantiser errors for Eb/No = -2dB. The middle bar on both histograms is the quantisation noise, which is centred around zero. The baseline quantiser has lots of large errors (outliers) due to bit errors, however the variable power scheme has more small errors near the centre, where (hopefully) they have less impact on the decoded speech.

The final figure shows a time domain plot of the errors for the two schemes. The baseline quantiser has more large value errors, but a small amount of noise when there are no errors. The variable power scheme looks a lot nicer, but you can see the amplitude of the smaller errors is higher than the baseline.

I used the errors from the simulation to corrupt the 5 bit Codec 2 energy parameter. Listen to the results for the baseline and variable power schemes. The baseline sample seems to “flutter” up and down as the energy bounces around due to bit errors. I can hear some “roughness” in the variable transmit power sample, but none of the flutter. However both are quite understandable, even though the bit error rates are 13.1% (baseline) and 18.7% (variable power)! Of course – this is just the BER of the energy parameters, in practice with all of the Codec bits subjected to that BER the speech quality would be significantly worse.

The simple modem simulation used here was a BPSK modem over an AWGN channel. For FreeDV we use a DQPSK modem over a HF channel, which will give somewhat poorer results at the same channel Eb/No. However it’s the BER operating point that matters – we are aiming for intelligible speech over a channel with a BER between 10 and 20%. This is equivalent to a 1600 bit/s DQPSK modem on a “CCIR poor” HF channel at around 0dB average SNR.

Running Simulations
octave:6> fuzzy_gray
octave:7> compare_baseline_varpower_error_files

codec2-dev/src$ ./c2enc 1300 ../raw/ve9qrp.raw - | ./insert_errors - - ../octave/energy_errors_baseline.bin 56 | ./c2dec 1300 - - | play -t raw -r 8000 -s -2 -

codec2-dev/src$ ./c2enc 1300 ../raw/ve9qrp.raw - | ./insert_errors - - ../octave/energy_errors_varpower.bin 56 | ./c2dec 1300 - - | play -t raw -r 8000 -s -2 -

Note the 1300 bit/s mode actually uses 52 bits per frame, but c2enc/c2dec work with an integer number of bytes, so for the purposes of simulating bit errors we round up to 7 bytes/frame (56 bits).
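The frame size arithmetic in the note above is easy to reproduce. A small sketch, assuming the 40ms frame period mentioned earlier in this post; the helper names are made up for illustration.

```python
def frame_bits(bit_rate, frame_ms=40):
    """Codec bits in one frame, e.g. 1300 bit/s over 40 ms."""
    return bit_rate * frame_ms // 1000


def padded_frame_bits(bit_rate, frame_ms=40):
    """Frame size once rounded up to a whole number of bytes."""
    bits = frame_bits(bit_rate, frame_ms)
    return ((bits + 7) // 8) * 8


if __name__ == "__main__":
    print(frame_bits(1300), padded_frame_bits(1300))  # 52 56
```

The padded value is the number passed to insert_errors in the command lines above.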

As I wrote this post I realised the experiments above used natural binary code, however Codec 2 uses Gray code. The next post looks into the difference in SNR performance between natural binary and Gray code.

Not Activating Mt Remarkable

Last Saturday I had my first Summits on The Air (SOTA) attempt on top of Mount Remarkable here in South Australia.

As a first step on Friday I registered my SOTA attempt on the Sotawatch web site.

On Saturday morning I started by testing my FT-817 and Alexloop magnetic loop antenna at our camp. While tuning up I managed to talk to a VK2 (portable in VK5) who was a few hundred km away in the Flinders Ranges. Good test.

I then hiked for a few hours to get to the top of Mt Remarkable, set up my radio and antenna, and called CQ on 40m and 20m. Alas, I made no contacts. However it was so nice to experience S0 noise on 40 and 20m, so much different to my urban S9 hash experience on those bands. I couldn’t hear much activity on 40m but could hear many international stations on 20m. They just couldn’t hear me!

The members of the SOTA Australia Yahoo Group have been most helpful with many suggestions on how I can do better next time. In particular I can “self spot” using a smart phone app like sotagoat or a web site. I’ll certainly give it another go in future.

Some pictures of my little adventure:

In the last picture the magnetic loop is just behind my head, the FT-817 just visible above the white note book. I use a 1m dowel as the antenna mast which doubles as a walking stick for the hike. As you can see from the pictures a large part of this walk is over paths covered with large rubble. I was told this is from ancient volcanic activity. The rubble moves a bit under your feet, making for slow going. There is a light plane crash about 2/3 of the way up – the alloy remains of the plane still shiny after 30 years. The walk was about 6 hours return for me from the caravan park at the base of Mt Remarkable. However I am a slow walker, and had a sore knee from a bike crash a few days before!

FreeDV Robustness Part 3

Since the last post I have explored some improvements to PAPR, and tested the 1600/2000 modes introduced in the last post in real time. These tests have given me a little more insight into the problems with HF channels and led me to better understand the requirements. This has led to a new 1600 bit/s FreeDV mode specifically designed to handle these requirements.

Peak/Average Power Ratio (PAPR) Improvements

The FreeDV FDMDV modem waveforms have a PAPR of around 12dB. That means the peaks of the waveform are 12dB higher than the average. So the average power of the signal is limited to 12dB less than the peak power of the amplifier.

Now the average transmit power sets the Bit Error Rate (BER) of the received signal. So if we can reduce PAPR, we can raise the average power without clipping the amplifier, and improve our BER. Peter Martinez, G3PLX, suggested that some hard clipping of the modem waveform might reduce PAPR without adversely affecting performance. Here are the results from clipping, obtained using the fdmdv_ut simulation in an AWGN channel:

Test  Eb/No (dB)  SNR (dB)  PAPR (dB)  Clip  BER
(a)   6.3         3.0       12.6       1.0   0.0134
(b)   6.3         3.0       7.74       0.7   0.0175
(c)   9.3         6.0       7.71       0.7   0.0024
(d)   11.3        8.0       7.74       0.7   0.0

Test (a) is the baseline unclipped modem waveform with a BER of 0.0134. If we clip the waveform to 0.7 of the peaks (b) in the same channel we get only a slight increase in BER, however the PAPR is reduced by about 5dB. This is very significant, as it potentially allows us to increase the transmit power, for example by 3dB (c) or even up to 5dB (d), with significant reductions in the BER.
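To make the PAPR and clipping idea concrete, here is a toy Python sketch. It is not the fdmdv_ut simulation: the test signal is just a sum of unmodulated tones with random phases, and the first tone frequency, sample rate and function names are all assumptions made for illustration. PAPR is the peak power divided by the average power, in dB, and hard clipping limits every sample to a fraction of the original peak.

```python
import math
import random


def papr_db(samples):
    """Peak to average power ratio of a real signal, in dB."""
    peak = max(s * s for s in samples)
    mean = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(peak / mean)


def hard_clip(samples, fraction):
    """Limit every sample to +/- fraction of the largest magnitude."""
    limit = fraction * max(abs(s) for s in samples)
    return [max(-limit, min(limit, s)) for s in samples]


def multi_tone(carriers=14, spacing_hz=75.0, first_hz=500.0,
               fs=8000, n=8000, seed=1):
    """Sum of equal amplitude tones with random phases (a stand-in signal)."""
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(carriers)]
    out = []
    for i in range(n):
        t = i / fs
        out.append(sum(
            math.cos(2.0 * math.pi * (first_hz + k * spacing_hz) * t + phases[k])
            for k in range(carriers)))
    return out


if __name__ == "__main__":
    x = multi_tone()
    print("unclipped PAPR:", round(papr_db(x), 2), "dB")
    print("clipped at 0.7:", round(papr_db(hard_clip(x, 0.7)), 2), "dB")
```

The absolute numbers will not match the table, as the real modem waveform is modulated and filtered, but the clipped signal always shows the lower PAPR. What this sketch cannot show is the cost of clipping, the extra distortion that appears as the small rise in BER between tests (a) and (b).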

This got me thinking about what happens in a SSB radio power amplifier if we drive it into compression, a somewhat softer form of reducing the peak level than hard clipping. So we tested various power levels on an IC7000 owned by Mark, VK5QI. A nearby receiver and FreeDV were used to monitor the SNR of the received signal. In this case the SNR (as measured by FreeDV) represents distortion due to compression, as the tx and rx were so close that there was no significant channel noise affecting SNR.

Test  Av Tx Power (W)  SNR (dB)  BER
(a)   8                18        0
(b)   25               10.5      0

We found that at 25W average power the radio became quite hot. A higher average power would not be practical. Now FreeDV users typically drive their tx at the 10-20W average level, which is a backoff of 10 to 7dB from the peak 100W power. This is similar to the 7dB PAPR obtained from the hard clipping experiments above. This is well into compression, but as we can see above the SNR is still quite high, so the distortion due to this much compression won’t affect the BER much.
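The backoff figures above are just power ratios expressed in dB. A one-function sketch (the name is hypothetical):

```python
import math


def backoff_db(peak_w, average_w):
    """How far the average power sits below the peak power, in dB."""
    return 10.0 * math.log10(peak_w / average_w)


if __name__ == "__main__":
    for average in (10.0, 20.0, 25.0):
        print(average, "W average:", round(backoff_db(100.0, average), 1), "dB backoff")
```

For a 100W peak, 10W average is 10dB of backoff and 20W is about 7dB, so the 25W level that made the radio hot is only about 6dB below peak.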

So despite the PAPR reduction we found by experimenting with hard clipping above, it is not possible to get any further power benefits from PAPR reduction – we are already running the typical SSB power amplifier near its safe limits.

Codecs for HF DV

I spent some time watching the 1600 and 2000 bit/s modes introduced in the last post in action. I noticed they were still falling over on typical HF fading channels, especially in the 0-5dB SNR range. After some thought, I came up with some design ideas for HF DV modes:

  1. Intelligible speech at around 10% raw BER for QPSK (averaging all carriers over a few seconds).
  2. For a FEC code to work with a raw bit rate of 10% BER we require a low code rate (e.g. 0.3), which means lots of parity bits (a high bit rate), and large block sizes.
  3. But we are constrained by latency to short blocks, and the code rate is constrained by bit rate (e.g. it’s hard to get more than 2000 bit/s through this channel).
  4. So it is difficult to protect all bits in the Codec with FEC.

My previous tests show the excitation bits (pitch, voicing, energy) are the most sensitive. The excitation bits affect the entire spectrum, unlike LSPs where a bit error introduces distortion to a localised part of the spectrum.

So I dreamt up a new 1300 bit/s Codec 2 mode that has “less” sensitive bits. The 1300 bit/s Codec 2 mode only sends (scalar) pitch and energy once every 40ms, rather than twice for the previous 1600 bit/s Codec 2 mode. This reduces the 0 BER quality a little, but now there is “less to go wrong” (just 16 bits for the excitation) at high BER. Fewer excitation bits means they can be protected with just a few extra FEC bits. So I added a single Golay FEC word to protect 12 of the 16 excitation bits to get a total bit rate of 1600 bit/s over the channel. This is known as the new 1600 bit/s mode.

This table shows the difference between the 1300 and previous 1600 bit/s Codec 2 modes. You might be able to hear a small difference:

Sample
hts1a 1300 bit/s
hts1a 1600 bit/s
ve9qrp 1300 bit/s
ve9qrp 1600 bit/s

Here are some samples of the 1300 bit/s Codec 2 + 300 bit/s FEC (1600 bit/s FreeDV mode) over several simulated and real world channels:

Sample
FreeDV V0.91 1400 bit/s CCIR poor channel 4dB
1600 bit/s CCIR poor channel 4dB
1600 bit/s VK2MEV in Newcastle to Adelaide 20m
1600 bit/s K5WH to K0PFX with interfering SSB

The signal sampled from VK2MEV had a reasonably high SNR (above 5dB) but a high average BER due to the constant fading:

Note the number of frames in the 10 to 15% error range, and the near constant fading on at least one carrier. As the fading is so regular, the SNR is fairly steady. The last plot is the timing offset, which is slowly drifting downwards indicating a sample clock difference between the tx and rx sound cards.

The K5WH to K0PFX sample is an example of a SSB signal interfering with FreeDV:

You can hear the SSB and modem signals together in this sample of the off air signal. You can hear the FreeDV modem tones start up about 10 seconds in. The decoded speech (in the table above) holds up pretty well.

Command Line
octave:1> fdmdv_demod("/home/david/n4dvr.wav",1600*30,16,"mod_test_1600_n4dvr_001.err")
45952 bits 1321 errors BER: 0.0287 PAPR(rx): 22.53 dB
david@bear:~/codec2-dev/src$ ./c2enc 1300 ../raw/ve9qrp.raw - | ./fec_enc - - 1600 | ./insert_errors - - ../octave/mod_test_1600_n4dvr_001.err 64 | ./fec_dec - - 1600 | ./c2dec 1300 - - | play -t raw -r 8000 -s -2 -

FreeDV Robustness Part 2

Since the last post I have been working on two new FreeDV modes (1600 and 2000 bit/s), designed to improve the performance of FreeDV over HF multipath fading channels. This involved modifying the Octave and C modem code to support a variable number of carriers, a new 1600 bit/s Codec 2 mode, and some C code to implement FEC. Once again I’ve included the command lines I am using for those who want to repeat the simulations. It’s also useful for me to write the commands down so I can find them again!

The 1600 bit/s mode uses simplified scalar pitch and energy quantisers. This makes it more robust to bit errors, but at a cost of increased bit rate. The increase equates to a 10log10(1600/1400) = 0.6dB drop in SNR, which is pretty modest. The 2000 bit/s mode uses the 1400 bit/s codec, plus 600 bit/s of FEC to protect the first (and most sensitive) 24 bits of the Codec data.

Here are the results over a simulated “CCIR moderate” channel with 10dB SNR:

Sample
Baseline 1400 bit/s with no FEC
1600 bit/s Codec 2 with no FEC
2000 bit/s, 1400 bit/s Codec + 600 bit/s FEC
Analog SSB

Out of the digital modes, I think 2000 bit/s is best, with 1600 bit/s quite close. Compared to the analog, the digital has some annoying R2D2 sounds, but lacks the constant hiss. At 2000 and 1600 bit/s my goal of intelligible speech over a HF multipath channel is, I think, achieved. Here is a typical waterfall, bit error pattern, and SNR against time for this channel:

Note the sample files only play the decoded audio for the first 12 seconds, the plots cover 30 seconds.

Here are some results on a “CCIR poor” channel at 4dB SNR, followed by the waterfall, bit error pattern, and SNR plot against time. The poor channel has faster fading and with the low SNR we have a much higher BER (which averages 8% before FEC).

Sample
Baseline 1400 bit/s with no FEC
1600 bit/s Codec 2 with no FEC
2000 bit/s, 1400 bit/s Codec + 600 bit/s FEC
Analog SSB

Although they are all pretty bad, in this case I think the 1600 bit/s version performs best. A QSO might “just” be possible. Actually it’s quite remarkable that we can hear anything given an average BER of 8%! I think the poorer performance of 2000 bit/s on this channel is an example of FEC breaking down at low SNRs/high BERs. The FEC is actually making more bit errors than it corrects! Digital voice is unique in the low SNRs/high BERs it operates in compared to regular data.
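One way to see why block FEC runs out of steam at high BER is to ask how often a code word contains more errors than the code can correct. The sketch below assumes the FEC is built from 24 bit Golay words that each correct up to 3 errors (my assumption for illustration, not something stated for this mode) and that bit errors are independent, which a fading channel certainly violates. The function name is hypothetical.

```python
import math


def block_failure_probability(n, t, ber):
    """Probability that an n bit word has more than t errors (independent errors)."""
    correctable = sum(
        math.comb(n, k) * ber ** k * (1.0 - ber) ** (n - k)
        for k in range(t + 1))
    return 1.0 - correctable


if __name__ == "__main__":
    for ber in (0.01, 0.04, 0.08):
        print("BER", ber, "-> word failure probability",
              round(block_failure_probability(24, 3, ber), 4))
```

At 1% BER almost every word decodes correctly, but at 8% on the order of one word in ten has too many errors. When that happens the decoder may settle on the wrong code word and add errors rather than remove them. On a real fading channel the errors arrive in bursts, so the picture during a fade is likely to be worse than this independent error estimate suggests.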

The fluttering at the start of the 1600 bit/s sample might be caused by bit errors in the frame energy parameter causing the level to jump around, or possibly bit errors in the voicing bit. If we know the channel is bad (the modem can tell us), we could “lock down” a few bits to enhance speech on poor channels. For example force all frames to be voiced, or smooth rapid changes in the energy. This could even be a “DV noise reduction” checkbox the operator can enable on poor channels. Try doing that with your closed source Codec!

The SSB sample is still very intelligible, perhaps better than the digital if you don’t mind the hiss. We still have a long way to go (maybe 6dB?) to do better than SSB in poor channel conditions!

Command Lines

The samples above were generated by some magical command line incantations that run simulations of various parts of the system. Pathsim is used to generate modem files corrupted by the simulated HF channel. I then use a GNU Octave simulation of the demod to generate the bit error patterns for that mode/channel combination. For example the 1400 bit/s mode:
octave:18> fdmdv_demod("../src/mod_test_1400_poor_4dB.wav",1400*30,14, "mod_test_1400_poor_4dB.err")
38640 bits 2773 errors BER: 0.0718 PAPR(rx): 16.20 dB

This means take the mod_test_1400_poor_4dB.wav file as input, demodulate 30 seconds worth of bits at 1400 bit/s, using a 14 carrier modem, and save the error patterns to mod_test_1400_poor_4dB.err. Running the simulation also generates a bunch of nice plots to help me work out what’s going on, some of which are above.

Next step is to insert the bit errors into a Codec 2 bit stream and decode the results. I do this with a bunch of piped command line tools:
~/codec2-dev/src$ ./insert_errors ve9qrp_1400.c2 - ../octave/mod_test_1400_poor_4dB.err 56 | ./c2dec 1400 - - | play -t raw -r 8000 -s -2 -

The “56” above refers to the number of bits/frame for the 1400 bits/s mode. When testing the 2000 bit/s mode I use additional command line tools for the FEC, for example:
~/codec2-dev/src$ ./fec_enc ve9qrp_1400.c2 - | ./insert_errors - - ../octave/mod_test_2000_poor_4dB.err 80 | ./fec_dec - - | ./c2dec 1400 - - | play -t raw -r 8000 -s -2 -

And finally to generate a 12 second output wave file (this time for the 1600 bit/s mode):
~/codec2-dev/src$ ./insert_errors ve9qrp_1600.c2 - ../octave/mod_test_1600_poor_4dB.err 64 | ./c2dec 1600 - - | sox -t raw -r 8000 -s -2 - mod_test_1600_poor_4dB.wav trim 0 12

Simulating communications systems on the command line. Who needs a GUI?

Off Air Examples

Simulated channels are useful as the channel is the same for each run, so a direct comparison of the 3 modes on exactly the same channel can be made. However I wanted to back up these simulated results with some bit errors from real world, off air HF channels.

So last Saturday I hooked up with Mark, VK5QI, on 40m. It was a pleasant Summer (well early Autumn) evening for us. Mark and Andy VK5AKH set up a 40m dipole up the back of the Morialta Conservation Park in Adelaide, and were using a Codan 2110 Manpack for most of the testing.

I was operating “portable” in a backyard in Geelong, with an end fed dipole strung from an 8m squid pole. I was happy to find that my FT-817 with 2.5W SSB could reach 800km to Mark so we could coordinate the tests. At my end the analog SSB was noisy (beneath my urban S8 background noise level) but intelligible. Just the sort of channel I’d like FreeDV to work over.

Some pictures of Mark’s portable station are here.

I generated some modem files with test data sent using 14, 16 & 20 tones. Mark played these files to me over his tx at about 4W average power and I received and recorded them using the FreeDV “record from radio” feature on the Tools menu. We recorded several samples of each mode – and each sample was quite different due to the ever changing nature of HF radio channels. Later, I ran the recorded files through the simulations described above to produce samples of the bit error patterns and decoded voice over the three modes. Here are four examples of the 2000 bit/s mode, taken a few minutes apart, followed by the usual plots for one of them (serial 003). These are longer samples, about 30 seconds, to make sure we experience a few fades.

Sample              Average BER
2000 bit/s 001      0.010
2000 bit/s 002      0.013
2000 bit/s 003      0.020
2000 bit/s 004 11W  0.062

If you listen to sample 3, you can hear errors creep in at times corresponding to the low SNR (high BER) points in the plots above.

Sample 004 is an example of too much Tx drive. To illustrate this, Mark intentionally cranked the average power up from 4W to 11W, on a radio that is rated 25W PEP for SSB. This distorts the modem waveform, the BER goes up, and it sounds bad.

Unlike the simulations, the waterfall for off air samples shows the effect of the SSB rx AGC bringing the noise level up during fades, as it seeks to maintain the same average output level. The noise at the start and end of the waterfall is the channel noise before and after the 30 second test wave file.

This was a noisy but intelligible SSB channel for me, and FreeDV sounds OK over it. That’s pretty much what I am aiming at – for FreeDV to work about as well as SSB over HF multipath channels.

Next Steps

So now it’s time to try some real QSOs with these new modes and see if we have made any real progress. There is danger in getting too far ahead of oneself with simulations. A common mistake is getting a good result, declaring “victory”, then moving on when some bug actually means you are fooling yourself. To support the new modes in FreeDV means quite a bit of coding which will keep me busy for the next week or so!

Once I am ready for testing the first step will be to establish the flatness of the tx response using this swept sine wave. Then ensure SSB works OK over a given channel, and start trying the new modes, in particular looking for examples of fading that break the modes.

Further down the track it would be nice to try different FEC and modulation schemes, lower Codec bit rates with strong FEC, attenuating or masking the annoying R2D2 sounds when there are bit errors, and Peak/Average Power Ratio (PAPR) improvements to increase our average transmitted power.

How You Can Help

I am interested in finding some more examples of channels that break the new modes. If you and a friend are set up for digital modes, then you can play and record wave files over a HF channel. By making off-air recordings of my modem test files, you can help me gather examples of real world HF channels, and help improve FreeDV.

Here is how you can help:

  1. If you are using FreeDV and find it’s not decoding, switch to SSB and confirm you can still communicate OK.
  2. Download and play these files (mod_test_1400.wav, mod_test_1600.wav, mod_test_2000.wav) over the channel, while your partner records the received audio. Make a separate recording for each file. The files are 30 seconds long, so I set the record time for 40 seconds. The FreeDV Tools-Record From Radio feature is a convenient way to record. It doesn’t matter if there is some noise at either end of the recording.
  3. Name the recorded files mod_test_1400_yourcallsign.wav, mod_test_1600_yourcallsign.wav, mod_test_2000_yourcallsign.wav.
  4. Email them to me.

Thanks!

FreeDV Robustness Part 1

Here is the next installment in my adventure of making FreeDV work at least as well as analog SSB over HF multipath fading channels. I’ve included the command lines I am using for those who want to play along with me.

Incremental Improvements

Over the past few weeks I have made some incremental improvements:

  1. Bill Cowley pointed out the Gray code mapping was wrong. That’s worth up to 25% of the Bit Error Rate (BER).
  2. Some SSB transmitters may have a non-flat frequency response. This can have a big effect on FreeDV performance. I encourage anyone using FreeDV to test your transmitter’s frequency response using this wave file and a power meter. This file sweeps between 1000 Hz and 2000 Hz over 10 seconds. Just play it through the same rig interface at the same levels as you would run FreeDV. The power of the tone is about the same as the FreeDV modem signal. Monitor the variation in power over the sweep; for example if the power changes between 10W and 20W that’s a 3dB slope.
  3. Improvements to the sync state machine algorithm to avoid sync dropping out when a fade temporarily wipes out the centre BPSK sync carrier.
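For anyone who would like to build a similar sweep test file, here is a minimal Python sketch using only the standard library. It is not how the linked wave file was made: the 8000 Hz sample rate, the amplitude and the function names are assumptions for illustration. It generates a linear sweep from 1000 Hz to 2000 Hz over 10 seconds, as described in item 2 above.

```python
import math
import struct
import wave


def linear_sweep(f_start=1000.0, f_end=2000.0, seconds=10.0,
                 fs=8000, amplitude=0.5):
    """Samples of a sine wave whose frequency rises linearly with time."""
    n = int(seconds * fs)
    rate = (f_end - f_start) / seconds  # frequency change in Hz per second
    samples = []
    for i in range(n):
        t = i / fs
        # Phase is the integral of the instantaneous frequency
        phase = 2.0 * math.pi * (f_start * t + 0.5 * rate * t * t)
        samples.append(amplitude * math.sin(phase))
    return samples


def write_wav(path, samples, fs=8000):
    """Write floating point samples (-1 to 1) as a 16 bit mono wave file."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(fs)
        w.writeframes(b"".join(
            struct.pack("<h", int(s * 32767)) for s in samples))


if __name__ == "__main__":
    write_wav("sweep_1000_2000.wav", linear_sweep())
```

The amplitude would need adjusting so the tone power is about the same as the FreeDV modem signal, as the test requires.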

Testing FreeDV over HF Channels

I need a repeatable way to test the system. The first step was to generate a 30 second file (1400 bit/s x 30 = 42000 bits) of modulated test bits (a 112 bit sequence that repeats every 80ms) using the command line tools:
~/codec2-dev/src$ ./fdmdv_get_test_bits - 42000 | ./fdmdv_mod - mod_test_v1.1.raw

This file was converted to a wave file using sox and sent to some friends for transmission over the air. This gives me samples of bit errors over real HF channels. I have also been passing it through the Pathsim HF channel simulator (which runs well on Linux under Wine):

After processing by Pathsim, I can use the Octave demod simulation to demodulate the signal and generate error patterns:
octave:12> fdmdv_demod("../src/mod_test_v1.1_moderate_10dB.wav",1400*30,"moderate_10dB_inter.err")

Here is the waterfall and bit error patterns for the CCIR 10dB Moderate channel:

Using my handy collection of command line tools I can then apply this bit error pattern to a Codec 2 bit stream and listen to the results:
/codec2-dev/src$ ./insert_errors ve9qrp.c2 - ../octave/moderate_10dB.err 0 55 | ./c2dec 1400 - - | play -t raw -r 8000 -s -2 -

Note ve9qrp.c2 is a 1400 bit/s Codec 2 file, and the “0 55” parameters specify the range of bits to apply errors to. This range lets me simulate the effect of errors on different parts of the Codec 2 frame. Not all bits are created equal. I happen to know that the excitation bits (voicing, pitch/energy VQ) are the most sensitive. They live in the first 20 bits of the 56 bit frame.

I am interested in using a (24,12) Golay code which can protect multiples of 12 bits. If we protect two blocks of 12 bits that’s the first 24 bits protected. Let’s assume this code can correct all bit errors. To simulate the effect of perfect FEC on the first 24 bits (with no protection on the last 32 bits) we can use:
/codec2-dev/src$ ./insert_errors ve9qrp.c2 - ../octave/moderate_10dB.err 24 55 | ./c2dec 1400 - - | play -t raw -r 8000 -s -2 -

This seems to work pretty well, as you can see from the samples below:

Sample
Modem signal
Modem signal (CCIR moderate 10dB)
Codec 2 with errors on all bits 0 to 55
Codec 2 with errors on bits 24 to 55 (simulated FEC)
Codec 2 with no errors
Analog SSB (CCIR moderate 10dB)

This suggests some FEC protecting the first 24 bits will give good performance on HF multipath channels, with intelligibility similar to analog SSB – but without the noise.
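The bit range trick used above (errors applied to bits 24 to 55 only, to mimic perfect FEC on the first 24 bits) can be pictured with a short Python sketch. This is not the insert_errors source: the function name and the list-of-bits representation are made up for illustration.

```python
def apply_errors(frame_bits, error_bits, first_bit, last_bit):
    """XOR the error pattern onto a frame, but only for bits first_bit..last_bit."""
    out = list(frame_bits)
    for pos in range(first_bit, last_bit + 1):
        out[pos] ^= error_bits[pos]
    return out


if __name__ == "__main__":
    frame = [0] * 56    # one 56 bit Codec 2 frame, all zeros for clarity
    errors = [1] * 56   # worst case pattern: an error on every bit
    everything = apply_errors(frame, errors, 0, 55)
    protected = apply_errors(frame, errors, 24, 55)
    print(sum(everything), sum(protected))  # 56 32
```

With the range set to 24 to 55 the first 24 bits pass through untouched, which is what a FEC code that corrected every error would deliver.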

Increasing the bit rate

We currently use all of the 1400 bit/s for the Codec 2 data. We now need some more bits to carry FEC information.

One strategy is to make Codec 2 operate at a lower bit rate (say 1000 bit/s), then use the balance (say 400 bits/s) for FEC. This is tricky, as Codec 2 already operates at a pretty low bit rate. It might be possible if we make wider use of prediction of the Codec parameters (i.e. sending differences), and/or vector quantisation (big tables to quantise a bunch of parameters at once). Prediction means the effect of an uncorrected bit error propagates into future frames. Vector Quantisation means a single bit error can change many parameters all at once.

So lowering the Codec 2 bit rate is likely to make the Codec more sensitive to bit errors. This might still be OK if we use blanket FEC protection of all bits. Some people prefer this approach, but I am not going to work on it for now. Instead I will stick with the 1400 bit/s speech codec and look at increasing the bit rate over the channel to provide bits for FEC.

Let’s say we need to protect 24 bits/frame with a half rate code. That means we need a total of 56+24 = 80 bits/frame or 2000 bits/s. Some options are:

  • Go from 14 to 20 QPSK carriers, this will make the bandwidth 20x75Hz/carrier = 1500Hz, and reduce the power per carrier by 14/20 or 1.54dB. The wider bandwidth may cause problems with the skirts of the Tx audio filtering. Twenty carriers will also make Peak/Average Power Ratio (PAPR) problems worse, perhaps reducing our effective power output further by requiring further tx drive back off.
  • We could increase the symbol rate from 50 to 100 baud. This would reduce the energy/symbol by 3dB, but we would need only 10 carriers to get 2000 bit/s which means a power increase per carrier of 14/10 or 1.46dB so once again we have a difference of 3-1.46 = 1.54dB. The bandwidth of the signal would be 10*150Hz/carrier = 1500Hz. PAPR would be improved as we have only 10 carriers.
  • We could go to 10 carriers using 8PSK, this would reduce our energy/bit by about 3dB (according to this graph), but we would gain 1.46dB by moving from 14 carriers to 10, giving a raw performance 1.54dB less than the current 1400 bit/s waveform. The bandwidth would be a very tight 10x75Hz/carrier = 750Hz, narrower than the current 1400 bit/s, 1100Hz wide waveform. PAPR will be better. The narrow bandwidth has pros and cons – good for channel density and interference, but bad for frequency diversity – a wider bandwidth is generally better for handling frequency selective fading.
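The dB figures in the options above all come from the same idea: the total transmit power is fixed, so the power per carrier scales inversely with the number of carriers. A small sketch with hypothetical helper names:

```python
import math


def db(ratio):
    """Express a power ratio in dB."""
    return 10.0 * math.log10(ratio)


def per_carrier_power_change_db(old_carriers, new_carriers):
    """Change in power per carrier when a fixed total power is shared differently."""
    return db(old_carriers / new_carriers)


def bandwidth_hz(carriers, spacing_hz):
    """Approximate occupied bandwidth for evenly spaced carriers."""
    return carriers * spacing_hz


if __name__ == "__main__":
    print("14 -> 20 carriers:", round(per_carrier_power_change_db(14, 20), 2), "dB")
    print("14 -> 10 carriers:", round(per_carrier_power_change_db(14, 10), 2), "dB")
    print("doubling the symbol rate:", round(db(0.5), 2), "dB energy/symbol")
    print("20 carriers at 75Hz:", bandwidth_hz(20, 75), "Hz")
    print("10 carriers at 150Hz:", bandwidth_hz(10, 150), "Hz")
```

Going from 14 to 20 carriers costs about 1.55dB per carrier. Going from 14 to 10 gains about 1.46dB, which only partly offsets the roughly 3dB lost by doubling the symbol rate or moving to 8PSK.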

Why does Analog SSB work so well?

I have been thinking about the differences between analog SSB and digital speech. I figure we might be able to learn from the strengths of Analog SSB:

  • Frequency selective fading causes frequency selective filtering of the analog signal which is no big deal as speech is very redundant. With digital speech a frequency selective fade may wipe out a bit that carries information for the entire spectrum, for example the voicing, pitch or energy bits.
  • There is no memory in analog speech. An “error” in the speech spectrum only affects the signal at the time it occurs. A bit error in digital speech may affect future bits for a few hundred ms, if predictive coding of model parameters is used (e.g. the pitch/energy quantiser in Codec 2).
  • As the SNR drops the analog signal drops into the noise. The human ear is good at digging signals out of noise, although it can be fatiguing. Digital errors due to low SNR cause R2D2 sounds that the human ear finds harder to make sense of.
  • Analog speech automatically applies more power to the perceptually important low speech frequencies, as the audio signal entering the microphone from your mouth has higher low frequency energy. So as the SNR drops the perceptually important low frequency energy is the last to go, an ideal situation.
  • The FDM modem waveform is sensitive to clipping of the peaks. The analog signal processing of the SSB receiver and ear of the listener is not. So analog speech can be sent at a higher average power level.
  • Codec 2 produces intelligible speech up to a bit error rate of about 2%. Now 0.02 of a 56 bit frame is about 1 bit error. So if even one carrier is suppressed by a fade we are in trouble with digital speech.
  • The FDMDV modem signal is narrower than analog SSB, so a fade of the same width wipes out proportionally more of the digital signal.
  • A fade during silence (say between syllables or words) or in a low energy (inter-formant) area of the speech spectrum has no effect on analog signals. A fade in silence frames or in any part of the modem signal spectrum has an effect on the digital speech.
  • For comparison, here is the waterfall (spectrogram) for the analog SSB sample:

    The strong vertical lines are due to clipping by the channel simulator as its AGC adjusts, this can also be heard as clipping in the analog sample. This is probably a set up error on my part. The comb like lines are the pitch harmonics of the speaker. Note the bandwidth is much wider than the FDMDV signal. Unlike the spectrogram of the FDMDV signal, it’s not obvious where the fades are. Most of the “area” of the spectrogram is low energy (blue) – silence in the time domain or low energy regions in the frequency domain.
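Returning to the 2% bit error rate point in the list above, the numbers can be worked through in a few lines of Python. This is a rough sketch only: it assumes a fully faded carrier delivers random bits (a BER of 0.5) while the other 13 carriers are error free, which is an idealisation rather than a measured result.

```python
FRAME_BITS = 56      # bits per Codec 2 frame at 1400 bit/s
MAX_BER = 0.02       # approximate limit for intelligible speech
N_CARRIERS = 14      # FDMDV data carriers

# Bit errors per frame that Codec 2 can tolerate
errors_per_frame = MAX_BER * FRAME_BITS

# Assumption: one carrier is wiped out (BER 0.5), the rest are perfect
faded_carrier_ber = 0.5
overall_ber = faded_carrier_ber / N_CARRIERS

print(f"tolerable errors per frame: {errors_per_frame:.2f}")
print(f"BER with one faded carrier: {overall_ber:.3f}")
```

Under these assumptions a single lost carrier gives an overall BER of about 3.6%, already above the 2% limit.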

Next Steps

The next step is to modify the Octave modem simulation so it supports 8PSK as well as QPSK, and a run-time defined number of carriers. This can then be used to generate waveforms that can be played over the air or through the channel simulator. I’ll then build up some command line programs to send Codec frames that include various combinations of FEC.

HF Modem Bit Error Patterns

I am working on improving the performance of FreeDV on HF channels. As a first step I have been exploring the bit error patterns from the modem using some samples of a 1300 km HF radio path. These samples were kindly collected by Mark, VK5QI and Brenton, VK2MEV.

The FDMDV modem has 14 DQPSK data carriers. As I know the data that was transmitted, I can calculate the location of each bit error against time. Here are the bit error patterns for the 50W tx power sample, with the waterfall (spectrogram) of the same signal plotted below:

The red and blue lines indicate bit errors for the two bits modulated on each carrier. The number of each carrier (1 to 14) is on the LH axis. The x axis for both plots is time (500 bits/carrier and 10 seconds total).
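For readers who want to build a similar plot, here is a minimal Python sketch of how the error locations could be computed from the known transmitted bits. The bit ordering (symbol by symbol, with each carrier's two bits adjacent) is an assumption for illustration and may not match the layout used by the FDMDV modem. The function names are hypothetical.

```python
def error_pattern(tx_bits, rx_bits, n_carriers=14, bits_per_symbol=2):
    """Return errors[carrier][symbol] as a list of per-bit error flags (1 = error).

    Assumed layout: bits arrive one modem symbol period at a time, and
    within each period the bits for carrier 0 come first, then carrier 1, etc.
    """
    if len(tx_bits) != len(rx_bits):
        raise ValueError("tx and rx bit streams must be the same length")
    bits_per_period = n_carriers * bits_per_symbol
    n_symbols = len(tx_bits) // bits_per_period
    errors = [[None] * n_symbols for _ in range(n_carriers)]
    for s in range(n_symbols):
        for c in range(n_carriers):
            i = s * bits_per_period + c * bits_per_symbol
            errors[c][s] = [tx_bits[i + b] ^ rx_bits[i + b]
                            for b in range(bits_per_symbol)]
    return errors

def carrier_ber(errors):
    """Bit error rate of each carrier, averaged over time."""
    return [sum(sum(sym) for sym in carrier) / sum(len(sym) for sym in carrier)
            for carrier in errors]
```

Plotting `errors[c][s]` against `s` for each carrier `c` gives a picture like the one above, and `carrier_ber()` gives a quick way to spot a carrier that is permanently worse than the others.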

At around 4-6 seconds on the waterfall we can see a fade in the top few carriers, with a corresponding burst of errors in the top few carriers of the bit error plot. Good.

However there are also two strange bit error effects. Firstly, the lowest few carriers have a permanently high bit error rate. The waterfall always shows a blue colour for these carriers – indicating a low power level. They are attenuated all of the time relative to the other carriers, as well as experiencing some fades at 2-3 and 8-9 seconds. Normally for a HF channel the level (and hence SNR and Bit Error Rate) should go up and down as the channel evolves. This suggests something is attenuating the carriers at around 500Hz, possibly some analog (high pass?) filtering in the SSB transmitter. It can’t be filtering in the receiver, as that would affect the level of both the signal and noise, and hence not affect the SNR or BER.

FreeDV has the ability to replay recorded samples from the SSB receiver. This gives an animated display of the spectrum and waterfall, which can show more information than a fixed image. Here is a screen shot of FreeDV replaying the 50W sample, also showing the lower tones being constantly attenuated:

In this case the waterfall is rotated (time on the vertical axis) compared to the waterfall plot above.

This sample has a centre frequency of 1200Hz. This has since been changed to 1500Hz, which I am hoping will fix the problem by moving the lower tones into a flatter region of the SSB radio passband. It does illustrate how important station set up can be for digital modes – we really want all the carriers to have the same TX power. We need a way to detect this sort of problem – otherwise we are introducing bit errors for no reason. Perhaps a “test frame” mode for FreeDV, so a friend can monitor the BER of each of your carriers, while you adjust your station.

The second strange effect can be observed in the bit error pattern for carrier 8. Bit errors are occurring at regular intervals, rather than the random distribution we would expect. Between symbols 300 and 400 (2 seconds) I count 9 bit errors. Now the FDM modulator waveform has a spiky nature, for example here is 10 seconds of the modulator waveform:

This is because every now and again all of the carriers have the same phase, which sum to a big amplitude spike. The large spikes occur at a rate of 9 every 2 seconds. This suggests we are over driving the transmitter, causing distortion of the modem waveform and hence bit errors.
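The effect of phase alignment on the peaks can be illustrated with a small Python sketch. It is a simplified model, not the FDMDV modulator: it sums 14 unmodulated, unit-amplitude complex carriers spaced 75Hz apart at an assumed 8000Hz sample rate, and compares the Peak/Average Power Ratio (PAPR) when all carriers start in phase against an arbitrary spread of start phases. It does not attempt to reproduce the 9 spikes every 2 seconds seen in the real signal, which depend on the transmitted data.

```python
import cmath
import math

def fdm_sum(phases, n_samples=1280, fs=8000.0, spacing_hz=75.0):
    """Complex sum of unit-amplitude carriers with the given start phases."""
    return [
        sum(cmath.exp(1j * (2 * math.pi * k * spacing_hz * t / fs + ph))
            for k, ph in enumerate(phases))
        for t in range(n_samples)
    ]

def papr_db(x):
    """Peak to average power ratio of a sampled waveform, in dB."""
    power = [abs(v) ** 2 for v in x]
    return 10 * math.log10(max(power) / (sum(power) / len(power)))

N = 14
# All carriers in phase at t = 0: amplitudes add to N, power to N squared
aligned = papr_db(fdm_sum([0.0] * N))
# Quadratic start phases, chosen here only as an example of non-aligned phases
spread = papr_db(fdm_sum([math.pi * k * k / N for k in range(N)]))

print(f"aligned phases: {aligned:.1f} dB")
print(f"spread phases:  {spread:.1f} dB")
```

When the phases line up the peak power is N times the average, or 10log10(14), about 11.5dB for this model, which is the sort of spike a transmitter driven too hard will clip.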

Let’s look at a sample generated using 18W of transmit power (plotted below with corresponding waterfall). In this case there is less evidence of regularly spaced bit errors. This supports our theory that the TX was over driven in the 50W sample. However we can still see the effects of attenuation in the low frequency carriers (a high bit error rate). The bursts of errors sweeping through one carrier after another are also more obvious, corresponding to diagonal stripes on the spectrogram.

The next step is to gather some more off air samples using the 1500Hz centre frequency and see if the bit error pattern for the low frequency carriers improves. I am also coding up tests for interleaving and experimenting with unequal error protection schemes.

Coherent PSK Demodulation on HF

This post is rather technical, and assumes a knowledge of PSK demodulator design. I apologise if it is difficult to understand for the general reader. I have spent the last few weeks working on this part time so felt compelled to record the results somewhere. Thanks to Bill Cowley VK5DSP and Peter Martinez G3PLX for their email advice on this work.

I have a background in modems for satellite communications, which use non differential PSK and coherent demodulation. This has a 3dB advantage over DPSK, at the cost of additional complexity. The FDMDV modem used for FreeDV uses Differential Phase Shift Keying (DPSK). This is the usual choice for HF radio channels which have phase distortion due to multipath propagation. However 3dB is a big potential improvement, so I couldn’t help wondering if coherent demodulation would work for HF radio channels. So I wrote some Octave code to try it.

First I needed to develop a phase estimation algorithm that could be bolted onto the FDMDV demodulator, but using the same DPSK modulator and over the air specification. True coherent demodulation requires a unique word to resolve phase ambiguities. This isn’t possible with the current FDMDV modem specification as there are no spare bits. So I went for a pseudo coherent scheme where we coherently demodulate the PSK symbols, then pass them to a DPSK decoder. This has a performance hit compared to coherent PSK, but resolves the phase ambiguity without a unique word.

The function rx_est_phase() in fdmdv.m estimates the phase over a window of Nph symbols.
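To make the idea concrete, here is a minimal Python sketch of one common way to build such a scheme. It uses a fourth-power (modulation stripping) phase estimate over a sliding window of Nph symbols, followed by hard decisions and differential decoding. Whether rx_est_phase() works exactly this way is an assumption, and all of the names below are hypothetical. The constellation is also assumed to sit on the axes, and there is no noise or frequency offset.

```python
import cmath
import math

# Assumption: QPSK points on the axes, so the fourth power of any symbol
# depends only on the channel phase and not on the data.
QPSK = [1, 1j, -1, -1j]

def est_phase(window):
    """Fourth-power phase estimate over a window of received symbols."""
    acc = sum(abs(s) * cmath.exp(4j * cmath.phase(s)) for s in window)
    return cmath.phase(acc) / 4    # ambiguous to a multiple of pi/2

def decide(symbol):
    """Index (0..3) of the nearest QPSK constellation point."""
    return round(cmath.phase(symbol) / (math.pi / 2)) % 4

def pseudo_coherent_demod(rx_symbols, nph=9):
    """Coherent symbol decisions followed by differential decoding."""
    half = nph // 2
    decisions = []
    for n, sym in enumerate(rx_symbols):
        window = rx_symbols[max(0, n - half): n + half + 1]
        theta = est_phase(window)
        decisions.append(decide(sym * cmath.exp(-1j * theta)))
    # Differential decoding removes the unknown multiple of pi/2
    return [(b - a) % 4 for a, b in zip(decisions, decisions[1:])]

def dqpsk_mod(data, start=0):
    """Differentially encode data values 0..3 onto QPSK symbols."""
    idx, out = start, [QPSK[start]]
    for d in data:
        idx = (idx + d) % 4
        out.append(QPSK[idx])
    return out

data = [1, 3, 0, 2, 2, 1, 0, 3, 1, 1]
channel_phase = 0.2 + math.pi / 2     # includes a 90 degree ambiguity
rx = [s * cmath.exp(1j * channel_phase) for s in dqpsk_mod(data)]
print(pseudo_coherent_demod(rx, nph=5) == data)
```

The estimator only recovers the 0.2 radian part of the channel phase, leaving every decision rotated by 90 degrees. Because the rotation is the same for adjacent symbols, the differential decoder cancels it, which is why no unique word is needed.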

As a first step I plotted the scatter diagram of DPSK (top) versus pseudo coherent DPSK (bottom) for data from a real HF channel.

The pseudo coherent scatter plot looked a bit better to me so I decided to go a little further. I implemented a demodulator simulation that could measure Bit Error Rate (BER) for AWGN channels. During development I found that the phase estimator couldn’t be used to track frequency offsets larger than 0.5 Hz. I think this is because of the low 50 Hz symbol rate. So the existing DPSK demodulator was run in parallel to provide frequency offset tracking.

Here are the BER results for two Eb/No values (5 and 7 dB) for the two different algorithms.

Demod 7dB 5dB
Differential 0.0128 0.0487
Pseudo Coherent 0.0068 0.0252

The bit error rates are about half, which is a 1dB improvement. This is not very much, but I decided to “run it to ground” and test using some FDMDV modem data from real HF channels. Mark VK5QI kindly gathered these samples for me. Rather than Codec 2 data, a known test sequence is transmitted so BER can be determined at the receiver. The fdmdv_demod_coh.m script implements a demodulator that uses sample files as input.

Here are the results for 50W Tx power (VK2-VK5_20m_50W_TX.raw) using 1400*10 bits. The “Nph” parameter is the size of the window used to estimate the phase. More symbols means a smoother estimate but a slower response to phase variations.

Demod BER
Differential 0.0541
Pseudo Coherent Nph=5 0.0517
Pseudo Coherent Nph=7 0.0484
Pseudo Coherent Nph=9 0.0442
Pseudo Coherent Nph=13 0.0458

Not a very impressive improvement, at best a 20% improvement in BER. The results with 18W of Tx power (VK2-VK5_20m_18W_TX.raw) using 1400*10 bits are also marginal:

Demod BER
Differential 0.0991
Pseudo Coherent Nph=9 0.0926
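The relative improvements can be recomputed directly from the tabulated BERs. This short Python sketch uses only the figures in the tables above. The dictionary layout and function name are just for illustration.

```python
# (differential BER, pseudo coherent BER) taken from the tables above
awgn = {"Eb/No 7dB": (0.0128, 0.0068), "Eb/No 5dB": (0.0487, 0.0252)}
hf = {"50W, Nph=9": (0.0541, 0.0442), "18W, Nph=9": (0.0991, 0.0926)}

def improvement(diff_ber, coh_ber):
    """Fractional reduction in BER relative to the differential demod."""
    return (diff_ber - coh_ber) / diff_ber

for label, (diff_ber, coh_ber) in {**awgn, **hf}.items():
    print(f"{label}: {100 * improvement(diff_ber, coh_ber):.0f}% fewer bit errors")
```

On the AWGN channel the BER falls by nearly half, while on the real HF samples the reduction is under 20% at 50W and under 10% at 18W.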

On the HF channel we have multipath pushing FDM carriers down into the noise. My guess is that the HF channel can be modelled as either good, where the SNR is high, or bad, where a fade knocks out one of the FDM carriers entirely. A small improvement in demodulator performance only affects the transition between the two states.
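This guess can be illustrated with a toy calculation in Python. The sketch uses the textbook DBPSK curve, BER = 0.5exp(-Eb/No), purely as a stand-in for the modem's real BER curve. The three Eb/No values for the "good", "transition" and "bad" states are hypothetical numbers picked for illustration, not measurements.

```python
import math

def ber(ebno_db):
    """Stand-in BER curve (textbook DBPSK), not the FDMDV modem's curve."""
    return 0.5 * math.exp(-10 ** (ebno_db / 10))

def gain_from_1db(ebno_db):
    """Factor by which the BER falls if the demod improves by 1dB."""
    return ber(ebno_db) / ber(ebno_db + 1)

# Hypothetical carrier states
for label, ebno_db in [("good", 20), ("transition", 5), ("bad (faded)", -10)]:
    print(f"{label:12s} Eb/No {ebno_db:4d}dB  BER {ber(ebno_db):.2e}  "
          f"1dB gain cuts BER by x{gain_from_1db(ebno_db):.2f}")
```

In the transition region a 1dB gain roughly halves the BER. In the faded state it changes the BER by only a few percent, and in the good state there are almost no errors to remove, so a carrier that spends most of its time in either extreme sees little benefit.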

So, the next step in improving FreeDV performance is to explore Forward Error Correction (FEC).

My First FreeDV Contact

A very pleasant Ham Radio day. My friends Joel and Mark (VK5QI) visited my home to build Peter Parker’s (VK3YE) “Porta 40” DSB receiver (from the November 2012 issue of “Amateur Radio” magazine). Joel did the assembly work, with Mark and I helping test the receiver.

We started with the local oscillator, and checked it could be heard on a nearby SSB radio. We then built the RF Amp, mixer, and AF Amp. We tested the mixer with the use of a signal generator (an Arduino controlled DDS) and an oscilloscope and verified the mixer loss was just a few dB. An oscilloscope was used to verify each amplifier stage had gain. At the end of the day we connected the receiver to an outdoor antenna and were surprised to hear Ham radio signals on the 40m band! It felt very satisfying – I think we were all a bit surprised that it worked! I enjoyed working with Joel and Mark – we all brought different skills to the project and worked well as a team. I took care of catering – cooking a nice curry for lunch and keeping the coffee flowing.

First FreeDV Contact

For the last few days I have been trying to make some contacts on the HF bands using my FT817 5W radio and various antennas, including a long wire (with antenna tuner), a commercial end-fed trap dipole, and a magnetic loop. The long wire and dipole are propped up at one end with a 7m “squid pole” type fibreglass fishing rod:

I could hear quite a few signals, but no one could hear me. I think the problem was low power and low antenna height. This WebSDR receiver located about 800km away has been very useful. It lets me visualise big chunks of the band for signals. I could transmit CW and SSB to it to test various antennas, listening to the results through my laptop.

Mark brought along his 100W Icom HF SSB Radio to try a little more power. While connecting his radio to my antenna we noticed some activity on the 20m band that sounded like FreeDV. We connected his radio to my laptop and could visualise the BPSK sync part of the FreeDV signal, but it was too weak to decode. However we did learn that the band was open to VK2 (about 1500km away), so Mark IM-ed Brenton, VK2MEV, and we started a FreeDV contact, using about 10W of power at our end.

We experienced SNRs in the 5-10dB range with about 80% copy. Some speech was lost when the fading got really bad. The fading on the channel was changing all the time, fast to slow and back again. I think the other Hams running FreeDV just before us were Peter and friends, as described in Peter’s FreeDV blog post today.

Being a speech coding guy I was initially unhappy with the coded speech quality but after a while I seemed to adjust and it seemed just fine. The microphone EQ was useful in improving Brenton’s voice. FreeDV was easy to operate and the colourful GUIs make it interesting and fun.

The tx/rx switching was slower than Push To Talk (PTT) SSB, due to internal latencies and modem sync times, plus the VOX delay. It would be hard to do any real time break in, but didn’t affect our conversation. We switched to SSB to compare and were struck by the “hiss and crackle”, with fading on the analog audio also obvious.

Later in the day Mark and Brenton sent some test data frames over the same channel. I will use this test data to start optimising FreeDV for low SNR channels. Having known data means I can measure the bit error rate, and even extract patterns of bit errors that I can apply to off-line simulations of the system. I can then compare different demodulation and FEC schemes before returning with the best candidates for some real world testing.
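As a rough idea of what that off-line processing might look like, here is a minimal Python sketch. It measures the BER from known test data, then XORs a recorded error pattern onto simulated bits so that different schemes can be tried against the same channel errors. The function names and the wrap-around behaviour are assumptions for illustration, not part of FreeDV.

```python
def measure_ber(tx_bits, rx_bits):
    """Bit error rate between known transmitted bits and received bits."""
    errors = sum(t ^ r for t, r in zip(tx_bits, rx_bits))
    return errors / len(tx_bits)

def extract_error_pattern(tx_bits, rx_bits):
    """1 where the channel flipped a bit, 0 where it arrived intact."""
    return [t ^ r for t, r in zip(tx_bits, rx_bits)]

def apply_error_pattern(bits, pattern, offset=0):
    """Replay recorded channel errors onto simulated bits.

    The pattern wraps around if the simulated stream is longer than the
    recording (an assumption made to keep the sketch short).
    """
    n = len(pattern)
    return [b ^ pattern[(offset + i) % n] for i, b in enumerate(bits)]

# Tiny worked example with made-up bits
test_tx = [1, 0, 1, 1]
test_rx = [1, 0, 0, 1]
pattern = extract_error_pattern(test_tx, test_rx)
simulated = [1, 1, 1, 1, 0, 0, 0, 0]
corrupted = apply_error_pattern(simulated, pattern)
print(pattern, corrupted, measure_ber(simulated, corrupted))
```

A candidate FEC scheme could then be run on the corrupted bits, and its decoded BER compared against the raw BER for exactly the same fade.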

FreeDV

For the last 2 months I have been working with Dave Witten KD0EAG, coding a GUI application called FreeDV. It combines Codec 2 and the FDMDV modem into a single, user friendly application that runs on Linux and Windows. It enables anyone with a SSB radio to start using digital voice.

It works really well. FreeDV uses just 1100 Hz of bandwidth, much less than the 2400 Hz required for an analog SSB signal. Compared to SSB it provides a “noise free” audio experience, and continues to work during fades and multipath at quite low SNRs. Mel Whitten has experimented with many Digital Voice systems over the years. This practical experience has led to the current design – a fast sync, no FEC, low latency system that gives a “SSB” type feel for operators.

Here is a video showing FreeDV in action, with analog SSB for comparison:

It’s been a long time since I did any GUI programming and I found it a nice change from the command line signal processing work that I usually do. The programming problems I had to solve didn’t involve maths or complex signal processing algorithms. However bringing FreeDV to life has its own special problems, for example spending hours messing with wxWidgets “sizers” to get a check box positioned just right! It was also much larger than the usual program I work on, so there was a certain complexity navigating large files and keeping several balls in the air at once.

I have also really enjoyed working with a nice team of guys, including Dave Witten, Mel Whitten and Bruce Perens. Also involved were a wonderful group of alpha testers and kind people helping us document, support, and improve FreeDV. One example is this fantastic FreeDV Getting Started video produced by Tony, K2MO.

I also feel a sense of importance in our work – FreeDV is the only open source digital voice system for Amateur Radio. It’s an opportunity to prevent Ham Radio (and digital voice over radio in general) being “locked down” to proprietary codecs.

Over the next few months we will gradually improve FreeDV. In particular I would like to work on improvements to the low SNR performance. In the medium term I am interested in other applications for narrowband digital voice over radio, such as telephony in the developing world. Ham Radio is an ideal test bed for refining the algorithms and experimenting with integration of the various building blocks.