SmartMic SM1000 Part 1

So the parts and PCB for the SM1000 arrived last week, and yesterday I started loading the board. I use a stereo microscope and hand solder each part. It's actually fun, a nice change from software.

Here is the current (Rev B1) SM1000 schematic if you would like to follow along.

First I assembled the 5V switching supply. This worked OK, except I had the tiny surface mount LED R33 around the wrong way. Then came the 3V3 regulator, and this morning I spent a few hours soldering all the parts around the micro-controller (bypass caps, pull-ups), and finally the STM32F4 uC itself.

I connected power but to my dismay couldn't see any waveform when poking around the uC crystal. After a bit of head scratching I looked at the data sheet. The STM32F4 has many clock oscillator options, and it turns out that the default (factory) state is to have the external high speed crystal oscillator switched off.

So I connected an STM32F4 Discovery as an STLINK emulator pod, fired up the "st-util" GDB server program on my laptop, and it didn't work:
david@bear:~/stlink$ sudo ./st-util -f ~/codec2-dev/stm32/fft_test.elf
-f arg; /home/david/codec2-dev/stm32/fft_test.elf
2014-07-07T13:47:33 INFO src/stlink-usb.c: -- exit_dfu_mode
2014-07-07T13:47:33 INFO src/stlink-common.c: Loading device parameters....
2014-07-07T13:47:33 WARN src/stlink-common.c: unknown chip id! 0xe0042000
Chip ID is 00000000, Core ID is 00000000.

Hmm, that doesn't look good. I started worrying that there was some other trick needed to flash a bare-metal uC from the factory state. Then I checked the PCB. It turned out I had soldered R20 (an STLINK series termination resistor) in the wrong position. So I moved it, and it still didn't work. Then I looked again and found I had moved the wrong resistor! So I moved two resistors to fix my mistake and:
david@bear:~/stlink$ sudo ./st-util -f ~/codec2-dev/stm32/fft_test.elf
-f arg; /home/david/codec2-dev/stm32/fft_test.elf
2014-07-07T13:51:06 INFO src/stlink-usb.c: -- exit_dfu_mode
2014-07-07T13:51:06 INFO src/stlink-common.c: Loading device parameters....
2014-07-07T13:51:07 INFO src/stlink-common.c: Device connected is: F4 device, id 0x10016413
2014-07-07T13:51:07 INFO src/stlink-common.c: SRAM size: 0x30000 bytes (192 KiB), Flash: 0x100000 bytes (1024 KiB) in pages of 16384 bytes
Chip ID is 00000413, Core ID is 2ba01477.

Yayyyyyyyyy! I then flashed a test program (that does not use any I/O) and proved the uC was working just as well as the Discovery boards:
GDB connected.
kiss_fft 0.59 msecs
fft 0.56 msecs

Cool! That's an FFT speed test program, which shows kiss_fft runs slightly faster than an FFT routine "optimised" for the STM32F4. It's a test program I had lying around from last year, when I ported Codec 2 to the STM32F4.

OK, next step is to load one of the analog interfaces and see if I can output a modem signal.

Democratising HF Radio Part 1

I recently submitted a Shuttleworth Fellowship grant application. I had planned to use the funding to employ people and accelerate the roll out of the project described below. I just heard that my application was unsuccessful (they wanted something more experimental). Never mind, the idea lives on!

I’m exploring some novel ideas for messaging over 100 km ranges in unconnected parts of the developing world. Radios are migrating from hardware to software, making the remaining hardware component very simple. Software can be free, so radio communication can be built at very low cost, and possibly by local people in the developing world. Even from e-waste.

I have a theory that this can address the huge problem of "distribution". I've been involved in a few projects where well-meaning geeks have tried to help people using technology. However, we get wound up in our own technology: if you have a hammer, every problem is a nail. I think we have the technology – it's physically getting it into people's hands, at the right cost and in a way that they can control and maintain, that is the problem. I also hit this problem in my small business career – it's called "distribution", and it was really tough in that field as well.

Here is the video part of a Shuttleworth Fellowship grant application:

And here are the slides.

I’ll be moving this project forward in 2015. The world needs to get connected.

Reducing FDMDV Modem Memory

For the SM1000 (SmartMic) project I need to run the FDMDV HF modem on the STM32F4 micro-controller. However, the STM32F4 only has 192k of internal RAM, and the modem in its original form uses over 400k. I wrote a unit test to break down the memory usage:
david@bear:~/tmp/codec2-dev/build_dir$ ./unittest/fdmdv_mem
struct FDMDV..........: 409192
prev_tx_symbols.......: 168
tx_filter_memory......: 1008
phase_tx..............: 168
freq..................: 168
pilot_lut.............: 5120
pilot_baseband1.......: 1840
pilot_baseband2.......: 1840
pilot_lpf1............: 5120
pilot_lpf2............: 5120
S1....................: 2048
S2....................: 2048
phase_rx..............: 168
rx_filter_memory......: 161280
rx_filter_mem_timing..: 3360
rx_baseband_mem_timing: 215040
phase_difference......: 168
prev_rx_symbols.......: 168
fft_buf...............: 4096
kiss_fft_cfg..........: 4

Looks like the memory usage is dominated by just two arrays. This modem has a rather high “over sampling rate”, M. The symbol (or baud) rate of each carrier is just 50 Hz. However the sample rate at the output is 8000 Hz, which gives M = 8000/50=160. We tend to do our processing at the symbol rate. This means that for every symbol we process, we need 160 samples. If we are running at 1600 bit/s there are 17 carriers (one is used for sync and carries no data). For some operations (like filtering), we need a record of the last 5 symbols. Each sample is 8 bytes as it’s a complex number (cos and sin sample) with two floats. So it’s easy to chew up say (160)(17)(5)(8)=108,800 bytes in a single array.

However, there are many ways to implement a modem. By re-arranging the processing steps we can sometimes save memory and/or MIPs, or trade them off to get the combination we want for our target platform.

New Re-sampler

The first step was to change the re-sampling algorithm. The demodulator (figure below) has a timing estimation routine that works out the best place to sample a received symbol. This is important for modem performance, if you don’t sample at the right place you get noisy symbols and more errors. Often the position we wish to sample is half way between two existing samples. So we need a way to work out the value of a sample between two existing samples.

The original algorithm re-ran the 5 symbol filter at the optimal timing point. As we have oversampled by a large factor, we can choose any of the M=160 output samples per symbol. However this meant keeping the large rx_baseband_mem_timing buffer above. Instead, I implemented a linear re-sampler. This just fits a straight line between two existing filtered samples, then uses the year 8 equation of a straight line y = mx + c to find the new sample y at the optimum timing instant x.

The scatter diagram is a good way to evaluate the new, linear re-sampler. The points should be nice and tight. Here are the scatter diagrams for the original filter based re-sampler, followed by the linear re-sampler.

Any modem changes like this can have a big impact on Bit Error Rate (BER) performance, so we need to test very carefully. I used the Octave fdmdv_ut.m to measure the BER at the nominal operating point SNR of 4dB:
octave:3> fdmdv_ut
Bits/symbol.: 2
Num carriers: 14
Bit Rate....: 1400 bits/s
Eb/No (meas): 7.30 (8.20) dB
bits........: 2464
errors......: 16
BER.........: 0.0065
PAPR........: 13.02 dB
SNR...(meas): 3.99 (5.51) dB

The BER is about the same as the previous version so all good. Sixteen errors isn’t very many, so I also tested at a lower Eb/No (SNR) to get more than 100 errors.

Storing One Signal Instead of 17

The next array to tackle was rx_filter_memory. The demod takes the FDM modem signal centred on 1500 Hz, then downconverts it to 17 parallel baseband signals, one for each carrier. To make matters worse we need to store 5 symbols worth of each baseband signal. This uses a lot of storage. So I changed the processing steps to:

  1. Keep 5 symbols worth of demod input samples (the FDM signal centred on 1500 Hz). This is just one signal, rather than 17.
  2. For each carrier, downconvert all 5 symbols to baseband, filter, then throw away the baseband signal, freeing the memory.

This is wasteful of CPU, as we downconvert 5 symbols worth (5*160 samples), rather than just the 160 new samples. However it saves a lot of memory. The final memory breakdown uses 10% of the original:
david@bear:~/tmp/codec2-dev$ ./build_dir/unittest/fdmdv_mem
struct FDMDV..........: 41916
prev_tx_symbols.......: 168
tx_filter_memory......: 1008
phase_tx..............: 168
freq..................: 168
pilot_lut.............: 5120
pilot_baseband1.......: 1840
pilot_baseband2.......: 1840
pilot_lpf1............: 5120
pilot_lpf2............: 5120
S1....................: 2048
S2....................: 2048
phase_rx..............: 168
rx_fdm_mem............: 8960
rx_filter_mem_timing..: 3360
phase_difference......: 168
prev_rx_symbols.......: 168
fft_buf...............: 4096
kiss_fft_cfg..........: 4

Automated Testing

I use a set of automated tests to check the C and Octave versions are identical. First I run the C version, which processes 25 frames of modem data and logs all of the internal states as Octave-format vectors which are saved to a text file.
david@bear:~/tmp/codec2-dev/unittest$ pushd ../build_dir/ && make && popd && ../build_dir/unittest/tfdmdv
The command line is a bit tricky as the new Cmake build system builds outside of the source tree, and my program needs to run from the unittest directory in the source tree. Next we run the Octave version, which should be identical. Some automated tests make sure they are. Actually, as they are two different floating point implementations, I make sure the total error between each C and Octave vector is within 1 part in 1000:
octave:4> tfdmdv
tx_bits..................: OK
tx_symbols...............: OK
tx_baseband..............: OK
tx_fdm...................: OK
pilot_lut................: OK
pilot_coeff..............: OK
pilot lpf1...............: OK
pilot lpf2...............: OK
S1.......................: OK
S2.......................: OK
foff_coarse..............: OK
foff_fine................: OK
foff.....................: OK
rx filt..................: OK
env......................: OK
rx_timing................: OK
rx_symbols...............: OK
rx bits..................: OK
sync bit.................: OK
sync.....................: OK
nin......................: OK
sig_est..................: OK
noise_est................: OK

passes: 23 fails: 0
The Octave version also plots the C and Octave states (vectors) against each other, so you can work out what went wrong. This one is the output of the rx filter, which was the site of my recent rewiring:

This sort of testing may interest you if you care about software development "process", delegating software work to teams of people with varying skill levels, making a profit from software, or running a business that depends on software development not destroying said business. I have used similar bit exact schemes for fixed point DSP development, which I whimsically describe in this allegorical tale.

These automated tests give me a lot of confidence. So many things can go wrong with a complex system. So as painful as it is, it’s worthwhile to have some quality “gates” wherever you can. Now when I move to the STM32F4 micro-controller, I can be reasonably sure there aren’t any algorithm or C porting bugs. Reasonably sure.

Lets Do the Time Warp Again

Check out this little snippet of code:
/* now downconvert using current freq offset to get Nfilter+nin
   baseband samples.

     |<---- Nfilter ---->|<---- nin ---->|

   This means winding phase(c) back from this point to ensure
   phase continuity. */

windback_phase = -freq_pol[c]*NFILTER;
windback_phase_rect.real = cos(windback_phase);
windback_phase_rect.imag = sin(windback_phase);
phase_rx[c] = cmult(phase_rx[c], windback_phase_rect);

There are a bunch of local oscillators that are defined by their current phase phase_rx[c], and frequency freq[c]. To calculate the next oscillator output sample we increment the current phase by the frequency. That’s how you make an oscillator in software. It’s normally all done in rectangular coordinates (real and imaginary parts, cos and sin, or in-phase and quadrature depending on where you went to school) as it’s easy on the CPU.

Now, we need to down-convert the signal from 5 symbols (960 samples) in the past. So we need to “wind” the oscillator phase backwards to where it was 960 samples ago. Think of the phase like a hand on a clock. We normally increment it a few minutes for every sample but now we want to wind it back several “days”.

Now that I can send oscillators backwards in time I’ll get to work on that warp drive. Or breaking the Shannon limit. Now that would be useful for these HF channels.

Is There a Better Way?

This work took me about an hour of creative thinking (the fun bit) and several days of implementation pain, off by one errors, fighting to understand filter memories (again), and tracking down differences between the Octave and C versions.

I can’t help wondering if there is an easier way, like snapping something together in GNU radio, or some better way of expressing digital filters in software. A more domain-specific language, or some set of functions that take care of filter memories being shifted so my head doesn’t hurt any more. If the code is easier to understand it would be more hackable too, and maintainable, and be a better tool for teaching.

I have tried various “toolkit” ideas over the years. They were all going to make DSP software development painless and fast. These sorts of tools are great fun to implement but the idea of a near-zero effort implementation utopia tends to break down somewhere in the messy details. So here I am still grinding away with Octave and C, like I have been since 1990.

Or maybe it’s just hard, and it isn’t going to get any easier? Like hacking the Linux kernel? I mean, I am trying to replace SSB …..

HF Modem Frequency Offset Estimation

One of my goals for 2014 is to make FreeDV work as well as SSB on HF radio channels. Recently I have been working on improvements to the frame sync algorithms used in the FDMDV modem. I would like to move from a “hard” sync decision to a “soft” one, such that the demod can track through fades. Often during a fade our signal is still there, so it’s better to wait around a bit than go off trying to find a new one. There is a trade off between remaining “in sync” and tracking through a fade and syncing up quickly when a new FreeDV signal appears.

Frame sync relies on the coarse frequency offset estimation algorithm, which works out where the centre frequency of the modem signal is in the receiver's passband. The nominal centre frequency is 1500 Hz. An offset of 100 Hz would mean the centre frequency is actually 1600 Hz. We need to know the offset to within a few Hz for the demod to work properly. If we don't get the frequency offset right, the demodulator outputs garbage, so no decoded speech. If the frequency offset estimation jumps about, the decoded speech will stop and start.

Frequency Offset Estimation in Action

The frequency offset estimation algorithm works by multiplying (mixing) the incoming signal with a noise free copy of the expected BPSK pilot signal. We then take the FFT of the resulting signal, and peak pick. On a good day, this will have a peak corresponding to the frequency offset. We then usually apply some post processing logic to correct the inevitable errors.

Here it is in action for a 0dB SNR AWGN channel with a -50Hz frequency offset. It’s a weak signal. First a spectrogram of the signal at the input of the demodulator, then a spectrogram of the output of the mixer (note -50Hz line), then a plot of four frequency offset estimates. The x axis in all plots is time, one frame is 20ms, 50 frames 1 second.

The four frequency offset lines "foff_xxx" above show firstly the "raw" output from peak picking the FFT, then the output from three different post processing algorithms. While all but the foff_thresh line look OK here, in practice they all have their pros and cons; none are completely reliable.

Here are the same plots on the nasty HF fading channel. You can see gaps in the pilot just under 1500Hz, which leads to the “dotted” -50Hz line on the mixer output spectrogram. The raw frequency offset estimate is all over the place, although it does get cleaned up by the post processors. Note the foff_state and foff_thresh plots take a long time to lock up, e.g. foff_thresh doesn’t jump to -50Hz until quite late in the simulation, and foff_state takes more than 1 second (50 frames).

Automated Tests

I’ve written a Unit Test (UT) in Octave called fdmdv_ut_freq_est.m that can perform automated tests of various channel conditions:
Test 3: 30 Seconds in HF multipath channel at 0dB-ish SNR
Channel EbNo SNR(calc) SNR(meas) SD(Hz) Hits Hits(%) Result
AWGN 3.00 0.78 1.18 22.00 200 100.00 PASS

Test 3: 30 Seconds in HF multipath channel at 0dB-ish SNR
Channel EbNo SNR(calc) SNR(meas) SD(Hz) Hits Hits(%) Result
HF 3.00 0.78 1.71 87.36 188 94.00 FAIL

The UT also generates the plots above to help me debug the frequency offset estimation algorithm.

Mesh Plots

I discovered that mesh plots were an interesting alternative to spectrograms for plotting signals in 3 dimensions such as time, frequency, and amplitude. On Octave, the mouse can be used to rotate the view of the mesh plot to view it from different angles. Here are some animated videos generated by Octave that illustrate the effect.

The first video is the modem spectrum, with a SNR of 10dB, in an AWGN channel. You can see the central “fin” which is the high energy BPSK pilot. Looks like a toaster! Note the slope from 0 to about 10 frames as the filter memories in the modem “fill up” from the all-zero state at start up.

The second video is the output of the frequency offset estimation mixer. Note the “blade” along the -50Hz line. This is the peak we are looking for.

The third video shows the mess the HF fading channel makes of the modem signal. Deep notches appear, and also some peaks, higher than the signal in the AWGN channel. I wonder if these peaks (short regions of high SNR) can be used? As the mesh is rotated until it is flat, we get a form of the 2D colourmap spectrogram.

This command line was used to generate the animations from the PNGs generated by Octave:
david@bear:~/tmp/codec2-dev/octave$ mencoder mf://*.png -mf w=640:h=480:fps=5:type=png -ovc lavc -lavcopts vcodec=mpeg4:mbd=2:trell -oac copy -o freq_est.mp4


I’m not quite happy with the frequency offset estimation algorithm. It still occasionally fails when the BPSK pilot is wiped out completely by a fade. Not quite what I want for the HF channel. Now that I’ve written it up here, I will take a break while I work on the SM1000 code, and come back to frequency offset estimation later.

I'd also like to try FreeDV with no frequency offset estimation. In a way, it's just another algorithm that can go wrong in fading channels. Without it, the operator would need to tune the receiver to within say 10% of the symbol rate, e.g. 0.1(Rs) = 0.1(50) = 5Hz. But many SSB operators do that anyway. If a higher symbol rate was used, say Rs=200Hz, it's +/- 20Hz. If we disable frequency offset estimation, we could lose the pilot, saving 1.2dB of SNR, 150Hz of bandwidth, and reducing Peak to Average Power Ratio (PAPR). It could also be a switchable option, manually disabled by the operator for low SNR channels.

wxWidgets Checkbox Tooltips

I need to post this so that no one else experiences the same pain with wxWidgets (2.9.4). Tooltips weren’t working for me when I hovered over checkboxes. This has bothered me for about 2 years and Google doesn’t seem to throw up a solution.

We are using wxWidgets for FreeDV, as we needed cross platform support. That has actually worked out pretty well, with Linux, Win32, OSX and recently FreeBSD working just fine.

So, the Tooltip problem is something to do with the checkbox being inside a StaticBox. Here is what works and what doesn’t for me:

// This works: set the tooltip on the enclosing wxStaticBox
wxStaticBoxSizer* sbSizer_testFrames;
wxStaticBox *sb_testFrames = new wxStaticBox(this, wxID_ANY, _("Test Frames"));
sbSizer_testFrames = new wxStaticBoxSizer(sb_testFrames, wxVERTICAL);

m_ckboxTestFrame = new wxCheckBox(this, wxID_ANY, _("Enable"), wxDefaultPosition, wxDefaultSize, wxCHK_2STATE);
sb_testFrames->SetToolTip(_("Send frames of known bits instead of compressed voice"));
sbSizer_testFrames->Add(m_ckboxTestFrame, 0, wxALIGN_LEFT, 0);

// This doesn't: set the tooltip on the wxCheckBox inside the wxStaticBox
wxStaticBoxSizer* sbSizer_testFrames;
sbSizer_testFrames = new wxStaticBoxSizer(new wxStaticBox(this, wxID_ANY, _("Test Frames")), wxVERTICAL);

m_ckboxTestFrame = new wxCheckBox(this, wxID_ANY, _("Enable"), wxDefaultPosition, wxDefaultSize, wxCHK_2STATE);
m_ckboxTestFrame->SetToolTip(_("Send frames of known bits instead of compressed voice"));
sbSizer_testFrames->Add(m_ckboxTestFrame, 0, wxALIGN_LEFT, 0);

Microphone Amplifiers

I've been prototyping microphone amplifiers for SmartMic. Although I'm not much of an analog guy, I'm getting somewhere. Note the tiny SOT-23 op-amp soldered to a header!

We want to be able to handle electret and dynamic microphones, and have 0 to 40dB gain (trimmer adjustable). Here are the transistor and op-amp versions:

Note the DC coupling to the STM32F4 ADC; we want this to be at about half scale on the ADC. I'm using the internal 12 bit ADC and DAC to handle the audio signals rather than an external PCM Codec. This seems to be working out pretty well so far. I sample at 16 kHz and don't worry about anti-alias or reconstruction filtering – I figure the rig's xtal filter and audio filtering will handle that. Seems to work.

With the transistor version I am concerned about repeatability across manufacturing runs and temperature. For example, the emitter resistor is AC bypassed, so the spread of Betas will mean varying gain, or the DC offset might vary. The low Ve bothers me too. However, by tweaking the various C values we could tailor the frequency response, which might be useful. On balance I think we might run with the op-amp version.

To test the amplifiers I speak into the microphone and sample a phrase using the ADC on my STM32F4 Discovery board running a unit test program that uploads the sample to my laptop via USB. This software converts the 12 bit unsigned samples to signed 16 bit samples. I can then plot the signal using Octave and check the DC offset and peak-peak levels:

Then I apply some sox and Codec 2 command line magic:
david@bear:~/tmp/codec2-dev/src$ sox -r 16000 -s -2 ~/stlink/stm_out.raw -r 8000 -t raw - lowpass 3300 highpass 100 | ./c2enc 1300 - - | ./c2dec 1300 - - | sox -t raw -r 8000 -s -2 - smartmic_micamp_1300.wav
Likewise with the logitech USB headset I used as a comparison, which was set as the default input sound device:
david@bear:~/tmp/codec2-dev/src$ rec -t raw -r 8000 -s -2 -c 1 logitech.raw
david@bear:~/tmp/codec2-dev/src$ ./c2enc 1300 logitech.raw - | ./c2dec 1300 - - | sox -t raw -r 8000 -s -2 - logitech_1300.wav vol 2

… which lets me listen to the mic signals after encoding/decoding with Codec 2. It's amazing what Unix style "stdin/stdout" tools strung together can do. Very quick prototyping.

I generally use my little laptop speaker to listen as (i) it makes Codec 2 sound better, which strokes my ego, and (ii) it's close to the sort of speaker Codec 2 will be used with in the real world. Here are some samples:

Transistor, electret
Opamp, electret
Opamp, Yaesu MH31 dynamic
$15 Logitech USB headset

The levels aren't all exactly the same. To my ears the dynamic mic seems to have less low frequency response, not sure if that is good or bad. HF radio mics are generally tailored to maximise speech energy at those frequencies most important for intelligibility. This causes SSB modulation to allocate more transmit power to these regions of the speech spectrum, enhancing the received signal in the noisy, fading HF channel. Digital Voice doesn't have the same relationship between the source audio spectrum and transmitted power, so the frequency response is less important from a transmitted "punch" point of view. This may mean traditional HF radio microphones and audio filtering are not needed, and may even be harmful, in a digital voice world.

We have found however, that the microphone and/or frequency response of the source audio can dramatically affect the Codec 2 speech quality. We are trying to gather more information on that and work out why.

Introducing the SM1000 Smart Mic

For the last few months Rick Barnich KA8BMA and I have been working on the SM1000, an embedded hardware product that allows you to run FreeDV without a PC. Just plug it into your SSB or FM radio, and you have Digital Voice (DV). It's based on an STM32F4 micro-controller, and has a built in microphone, speaker amplifier, and transformer isolated interfaces to your radio. It's just 80 x 100mm, and can be held in your hand and used like a regular PTT microphone, or sit near your radio in a small box form factor. The SM1000 will be in production later in 2014 and will retail for US$195.

Here is a block diagram of the SM1000 (click for larger version):

The whole design is open hardware (TAPR license). It will run an embedded version of FreeDV which is also 100% open source. You can forget about proprietary, expensive DV modes from other vendors that force you to use their hardware and make it illegal for you to experiment. Open source is the future of Digital Voice.

The schematic, PCB and other design files can be found in SmartMic SVN. The software is taking shape in the stm32 part of Codec 2 SVN.

The SM1000 has RJ45 and 3.5mm sockets to interface to your radio, including a little patch panel to configure the RJ45 wiring. As it's hand held and operates from 6-12V, it could even run with low cost FM HTs, as an external mic.

We would welcome a review of the schematic and PCB design. Suggestions are welcome, but contributions much more so! Many hardware and software tasks are available for volunteers who are interested in contributing to the free and open future of Digital Voice. Rick Barnich KA8BMA is a recently retired engineer who is a veteran of over 100 PCB designs. He has done a fantastic job at schematic entry, sourcing parts, analog and digital design, and PCB layout – and he is having a great time in the process!

Rick and I are currently wrapping up the schematic and PCB design, and will be building up prototypes and various pieces of software over the next few months. Most of the analog design has been prototyped, Codec 2 is running on the micro-controller, but the modem still needs some work to reduce memory use. Lots of work required for the user interface, integration, and general micro-controller software.

Scatter Plots and FreeDV

I've been prototyping parts of SmartMic (more on that later), for example the interface between the micro-controller and the radio microphone input. I have been using FreeDV as a test tool to evaluate the quality of the signals passing through my prototype circuits. As I worked I recalled a conversation with Mel Whitten. He said that many people over-drive transmitters when using FreeDV, and that even he wasn't sure what Scatter Plots were used for – after years of DV work!

So here is a video that explains some of the mystery behind Scatter Plots, and how they can be used to measure the quality of a FreeDV signal, for example as you change the Tx drive and AF gain, or as noise and Tx pass-band filtering take effect. As well as FreeDV, it applies to any QPSK modem (they are used everywhere, from HF radio to WiFi to satcom).

Also from Mel on this topic:

“Gerry and I have tested the effects of overdrive both audio and RF showing the same results as you demonstrated. Its very easy to lose 10-15db and 20dB in the shoulders of the flat top signal with the “skirts” flying out spewing IMD into the adjacent channel. Nasty. I guess we need a spectrum mask for FreeDV like commerical DRM and DV broadcasters use. How may dB should the shoulders be down with FreeDV? 40+ dB is possible. All this is very easy to see with a Flex radio spectrum display. Spectrum re-growth from IMD is easily seen in the Flex waterfall also. I think very few hams understand all of this and as a result can be a very bad “neighbor” on the adjacent channel due to their over-driven rigs (especially amplifiers). There are times when EasyPal and its 2.4kHz of OFDM will “spread” 6-8kHz wide because of non linear amplification.”

I feel there is a lot of tuning we can do to match FreeDV to radios and optimise performance, perhaps by developing some simple tools so operators can test their FreeDV signal and make sure it is optimal. In the video above I think the last carrier is attenuated by the FT817, which has a -6dB point at 2600Hz. In low SNR channels, a weaker carrier will be wiped out first as the noise floor climbs, leading to a bit error rate of at least 1/16.

Natural and Gray Coding

After writing up the Variable Power Quantiser work I added another function to my fuzzy_gray.m Octave simulation to compare natural and Gray coded binary.

Here are some results for 3,4, and 5 bit quantisers over a range of errors:

Curiously, the natural binary results are a little better (about 1dB less Eb/No for the same SNR). Another surprise is that at low Eb/No (high BERs) the SNRs are about the same for each quantiser. For example around 9dB SNR at Eb/No = -2dB, for 5,4 and 3 bits.

Here is a plot of 2 to 7 bit natural binary quantisers over a wide Eb/No range. Up to about Eb/No of 4dB (a BER of 1%), the 3-7 bit quantisers all work about the same! At lower BER (higher Eb/No), the quantisation noise starts to dominate and the higher resolution quantisers work better. Each extra bit adds about 6dB of improved SNR.

Channel errors dominate the SNR at BER greater than 1% (Eb/No=4dB). In some sense the extra quantiser bits are “wasted”. This may not be true in terms of subjective decoded speech quality. The occasional large error tends to drag the SNR measure down, as large errors dominate the noise power. Subjectively, this might be a click, followed by several seconds of relatively clean speech. So more (subjective) testing is required to determine if natural or Gray coding is best for Codec 2 parameters. The SNR results suggest there is not much advantage either way.

Here is a plot of the error from the natural and Gray coded quantisers at Eb/No=-2dB. Occasionally, the Gray coded error is very large (around 1.0), compared to the natural coded error which has a maximum of around 0.5.

This example of a 3 bit quantiser helps us understand why. The natural binary and Gray coding is listed below the quantiser values:

Quantised Value      0.0   0.125  0.25   0.375  0.5   0.625  0.75   0.875
Natural Binary Code  000   001    010    011    100   101    110    111
Gray Code            000   001    011    010    110   111    101    100

Although Gray codes are robust to some bit errors (for example 000 and 001), they also have some large jumps, for example the 000 and 100 codes are only 1 bit error apart but jump the entire quantiser range. Natural binary has an exponentially declining error step for each bit.

Variable Power Quantisation

A common task in speech coding is to take a real (floating point) number and quantise it to a fixed number of bits for sending over the channel. For Codec 2 a good example is the energy of the speech signal. This is sampled at a rate of 25Hz (once every 40ms) and quantised to 5 bits.

Here is an example of a 3 bit quantiser that can be used to quantise a real number in the range 0 to 1.0:

Quantised Value  0.0   0.125  0.25   0.375  0.5   0.625  0.75   0.875
Binary Code      000   001    010    011    100   101    110    111

The quantiser has 8 levels and a step size of 0.125 between levels. This introduces some quantisation “noise”, as the quantiser can’t represent all input values exactly. The quantisation noise reduces as the number of bits, and hence number of quantiser levels, increases. Every additional bit doubles the number of levels, so halves the step size between each level. This means the signal to noise ratio of the quantiser increases by 6dB per bit.
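As a rough numerical check of the 6dB per bit rule (a Python sketch, not part of the Octave simulation), we can quantise uniform random samples and measure the SNR directly:

```python
import math
import random

def quantise(x, bits):
    """Truncating quantiser over [0, 1), matching the table above."""
    levels = 1 << bits
    return min(int(x * levels), levels - 1) / levels

random.seed(1)
samples = [random.random() for _ in range(50000)]
signal = sum(x * x for x in samples) / len(samples)

for bits in (3, 4, 5):
    noise = sum((x - quantise(x, bits)) ** 2 for x in samples) / len(samples)
    print(bits, "bits:", round(10 * math.log10(signal / noise), 1), "dB")
```

Each extra bit halves the step size, which quarters the quantisation noise power, hence the roughly 6dB jump between successive lines of output.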

We use a modem to send the bits over the channel. Each bit is usually allocated the same transmit power. In poor channels, we get bit errors when the noise overcomes the signal and a 1 turns into a 0 (or a 0 into a 1). These bit errors effectively increase the noise in the decoded value, and therefore reduce the SNR. We now have errors from the quantisation process and bit errors during transmission over the channel.

However not all bits are created equal. If the most significant bit is flipped due to an error (say 000 to 100), the decoded value will be changed by 0.5. If there is an error in the least significant bit, the change will be just 0.125. So I decided to see what would happen if I allocated a different transmit power to each bit. I chose the 5 bits used in Codec 2 to transmit the speech energy. I wrote some Octave code to simulate passing these 5 bits through a simple BPSK modem at different Eb/No values (Eb/No is proportional to the SNR of a radio channel, which is different from the SNR of the quantised value).
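For the 5 bit energy quantiser the cost of a single bit error grows by a factor of two per bit position (a quick illustrative check in Python):

```python
STEP = 1.0 / 32      # 5 bit quantiser over [0, 1)

# Decoded-value change caused by flipping each bit of a natural binary code,
# from LSB (bit 0) to MSB (bit 4)
print([(1 << b) * STEP for b in range(5)])   # [0.03125, 0.0625, 0.125, 0.25, 0.5]
```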

I ran two simulations, first a baseline simulation where all bits are transmitted with the same power. The second simulation allocates more power to the more significant bits. Here are the amplitudes used for the BPSK symbol representing each bit. The power of each bit is the amplitude squared:

Bit             4     3     2     1     0
Baseline        1.0   1.0   1.0   1.0   1.0
Variable Power  1.61  1.20  0.80  0.40  0.40

Both simulations have the same total power for each 5 bit quantised value (e.g. 1*1 + 1*1 + 1*1 + 1*1 + 1*1 = 5W for the baseline). Here are some graphs from the simulation. The first graph shows the Bit Error Rate (BER) of the BPSK modem. We are interested in the region on the left, where the BER is higher than 10%.
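A quick check (Python, illustrative) that the two amplitude sets really do carry the same total power:

```python
baseline = [1.0, 1.0, 1.0, 1.0, 1.0]
varpower = [1.61, 1.20, 0.80, 0.40, 0.40]    # bits 4..0

def total_power(amps):
    """Power is amplitude squared, summed over the 5 BPSK symbols."""
    return sum(a * a for a in amps)

print(total_power(baseline))    # 5.0
print(total_power(varpower))    # about 4.99, equal within rounding
```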

The second graph shows the quantiser SNR performance for the baseline and variable power schemes. At high BER the variable power scheme is about 6dB better than the baseline.
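The effect can be reproduced approximately with a toy Monte Carlo version of the experiment. This is my own minimal Python sketch, not the fuzzy_gray.m Octave code, and it draws quantiser indices uniformly rather than from real speech energy statistics:

```python
import math
import random

random.seed(2)
EBNO_DB = -2.0
# AWGN noise standard deviation, assuming unit average energy per bit
sigma = math.sqrt(1.0 / (2.0 * 10.0 ** (EBNO_DB / 10.0)))

def mean_sq_error(amps, trials=20000):
    """Send random 5 bit values as BPSK symbols with per-bit amplitudes."""
    total = 0.0
    for _ in range(trials):
        value = random.randrange(32)
        rx = 0
        for b in range(5):
            tx = amps[b] if (value >> b) & 1 else -amps[b]
            if tx + random.gauss(0.0, sigma) > 0.0:   # hard decision
                rx |= 1 << b
        total += ((rx - value) / 32.0) ** 2
    return total / trials

base = mean_sq_error([1.0] * 5)
var = mean_sq_error([0.40, 0.40, 0.80, 1.20, 1.61])   # LSB first
print(round(10 * math.log10(base / var), 1), "dB improvement")
```

The exact improvement depends on the operating point and source statistics (this sketch also omits quantisation noise), but the ordering matches the simulation: the variable power scheme has a noticeably lower mean square error at this Eb/No.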

The third figure shows the histograms of the quantiser errors for Eb/No = -2dB. The middle bar on both histograms is the quantisation noise, which is centred around zero. The baseline quantiser has lots of large errors (outliers) due to bit errors, however the variable power scheme has more, smaller errors near the centre, where (hopefully) they have less impact on the decoded speech.

The final figure shows a time domain plot of the errors for the two schemes. The baseline quantiser has more large errors, but less background noise when there are no bit errors. The variable power scheme looks a lot nicer, but you can see that the amplitude of the smaller errors is higher than the baseline.

I used the errors from the simulation to corrupt the 5 bit Codec 2 energy parameter. Listen to the results for the baseline and variable power schemes. The baseline sample seems to “flutter” up and down as the energy bounces around due to bit errors. I can hear some “roughness” in the variable transmit power sample, but none of the flutter. However both are quite understandable, even though the bit error rates are 13.1% (baseline) and 18.7% (variable power)! Of course, this is just the BER of the energy parameters; in practice, with all of the Codec 2 bits subjected to that BER, the speech quality would be significantly worse.

The simple modem simulation used here was a BPSK modem over an AWGN channel. For FreeDV we use a DQPSK modem over a HF channel, which will give somewhat poorer results at the same channel Eb/No. However it’s the BER operating point that matters; we are aiming for intelligible speech over a channel with a BER between 10 and 20%, which is equivalent to a 1600 bit/s DQPSK modem on a “CCIR poor” HF channel at around 0dB average SNR.

Running Simulations
octave:6> fuzzy_gray
octave:7> compare_baseline_varpower_error_files

codec2-dev/src$ ./c2enc 1300 ../raw/ve9qrp.raw - | ./insert_errors - - ../octave/energy_errors_baseline.bin 56 | ./c2dec 1300 - - | play -t raw -r 8000 -s -2 -

codec2-dev/src$ ./c2enc 1300 ../raw/ve9qrp.raw - | ./insert_errors - - ../octave/energy_errors_varpower.bin 56 | ./c2dec 1300 - - | play -t raw -r 8000 -s -2 -

Note the 1300 bit/s mode actually uses 52 bits per frame, but c2enc/c2dec work with an integer number of bytes, so for the purposes of simulating bit errors we round up to 7 bytes/frame (56 bits).

As I wrote this post I realised the experiments above used natural binary code; however, Codec 2 uses Gray code. The next post looks into the difference in SNR performance between natural binary and Gray code.