FreeDV 700C

Over the past month the FreeDV 700C mode has been developed, integrated into the FreeDV GUI program version 1.2, and tested. Windows versions (64 and 32 bit) of this program can be downloaded from freedv.org. Thanks Richard Shaw for all your hard work on the release and installers.

FreeDV 700C uses the Codec 2 700C vocoder with the COHPSK modem. Some early results:

  • The US test team report 700C contacts over 2500km at SNRs down to -2dB, in conditions where SSB cannot be heard.
  • My own experience: the 700C speech quality is not quite as good as FreeDV 1600, but usable for conversation. That’s OK – it’s very early days for the 700C codec, and hey, it’s half the bit rate of 1600. I’m actually quite excited that 700C can be used conversationally at this early stage! I experienced a low SNR channel where FreeDV 700C didn’t work but SSB did, however 700C certainly works at much lower SNRs than 1600.
  • Some testers in Europe report 700C falling over at relatively high SNRs (e.g. 8dB). I also experienced this on a 1500km contact. Suspect this is a bug or corner case we can fix, especially in light of the US teams results.

Tony, K2MO, has put together this fine video demonstrating the various FreeDV modes over a simulated HF channel:

It’s early days for 700C, and there are mixed reports. However it’s looking promising. My next steps are to further explore the real world operation of FreeDV 700C, and work on improving the low SNR performance further.

Modems for HF Digital Voice Part 2

In the previous post I argued that pushing bits through a HF channel involves much wailing and gnashing of teeth. Now we shall apply numbers and graphs to the problem, which is – in a nutshell – Engineering.

QPSK Modem Simulation

I have worked up a GNU Octave modem simulation called hf_modem_curves.m. This operates at 1 sample/symbol, i.e. the sample rate is the symbol rate. So we takes some random bits, map them to QPSK symbols, add noise, then turn the noisy symbols back into bits and count errors:

The simulation ignores a few real world details like timing and phase synchronisation, so is a best case model. That’s OK for now. QPSK uses symbols that each carry 2 bits of information, here is the symbol set or “constellation”:

Four different points, each representing a different 2 bit combination. For example the bits ’00’ would be the cross at 45 degrees, ’10’ at 135 degrees etc. The plot above shows all possible symbols, but we just send one at a time. However it’s useful to plot all of the received symbols like this, as an indication of received signal quality. If the channel is playing nice, we receive something like this:

Each cross is now a fuzzy dot, as noise has been added by the channel. No bit errors yet – a bit error happens when we get enough noise to move received symbols into another quadrant. This sort of channel is called Additive White Gaussian Noise (AWGN). Line of site UHF radio is a good example of a real world AWGN channel – all you have to worry about is additive noise.

With a fading or multipath channel like HF we end up with something like:

In a fading channel the received symbol amplitudes bounce up and down as the channel fades in and out. Sometimes the symbols dip down into the noise and we get lots of bit errors. Sometimes the signal is reinforced, and the symbol amplitude gets bigger.

The simulation used for the multipath or HF channel uses a two path model, with additive noise as per the AWGN simulation:

Graphs and Modem Performance

Turns out there are some surprisingly good models to help us work out the expected Bit Error Rate (BER) for a modem. By “model” I mean people have worked out the maths to describe the Bit Error Rate (BER) for a QPSK Modem. This graph shows us how to work out the BER for QPSK (and BPSK):

So the red line shows us the BER given Eb/No (E-B on N-naught), which is a normalised form of Signal to Noise Ratio (SNR). Think about Eb/No as a modem running at 1 bit per second, with the noise power measured in 1 Hz of bandwidth. It’s a useful scale for comparing modems and modulation schemes.

Looking at the black lines, we can see that for an Eb/No or 4dB, we can expect a BER of 1E-2 or 0.01 or 1% of our bits will be received in error over an AWGN channel. This curve is for QPSK or BPSK, different curves would be used for other modems like FSK.

Given Eb/No you can work out the SNR if you know the bit rate and noise bandwidth:

    SNR = S/N = EbRb/NoB

or in dB:

    SNR(dB) = Eb/No(dB) + 10log10(Rb/B)

For example at Rb = 1600 bit/s and a noise bandwidth B = 3000 Hz:

    SNR(dB) = 4 + 10log10(1600/3000) = 1.27 dB

OK, so that was for ideal QPSK. Lets add a few more curves to our graph:

We have added the experimental results for our QPSK simulation (green), and for Differential QPSK (DQPSK – blue). Our QPSK modem simulation (green) is right on top of the theoretical QPSK curve (red) – this is good and shows our simulation is working really well.

DQPSK was discussed in Part 1. Phase differences are sent, which helps with phase errors in the channels but costs us extra bit errors. This is evident on the curves – at the 1E-2 BER line, DQPSK requires 7dB Eb/No, 3dB more (double the power) of QPSK.

Now lets look at modem performance for HF (multipath) channels, on this rather busy graph (click for larger version):

Wow, HF sucks. Looking at the theoretical HF QPSK performance (straight red line) to achieve a BER of 1E-2, we need 14dB of Eb/No. That’s 10dB worse than QPSK on the AWGN channel. With DQPSK, we need about 16dB.

For HF, a lot of extra power is required to make a small difference in BER.

Some of the kinks in the HF curves (e.g. green QPSK HF simulated just under red QPSK HF theory) are due to not enough simulation points – it’s not actually possible to do better than theory!

Estimated Performance of FreeDV Modes

Now we have the tools to estimate the performance of FreeDV modes. FreeDV 1600 uses Codec 2 at 1300 bit/s, plus a little FEC at 300 bit/s to give a total of 1600 bit/s. With the FEC, lets say we can get reasonable voice quality at 4% BER. FreeDV 1600 uses a DQPSK modem.

On an AWGN channel, that’s an Eb/No of 4.4dB for DQPSK, and a SNR of:

    SNR(dB) = 4.4 + 10log10(1600/3000) = 1.7 dB

On a multipath channel, that’s an Eb/No of 11dB for DQPSK, and a SNR of:

    SNR(dB) = 11 + 10log10(1600/3000) = 8.3 dB

As discussed in Part 1, FreeDV 700C uses diversity and coherent QPSK, and has a multipath (HF) performance curve plotted in cyan above, and close to ideal QPSK on AWGN channels. The payload data rate is 700 bit/s, however we have an overhead of two pilot symbols for every 4 data symbols. This means we effectively need a bit rate of Rb = 700*(4+2)/4 = 1050 bit/s to pump 700 bits/s through the channel. It doesn’t have any FEC (yet, anyway), so we need a BER of a little lower than FreeDV 1600, about 2%. Running the numbers:

On an AWGN channel, for 2% BER we need an Eb/No of 3dB for QPSK, and a SNR of:

    SNR(dB) = 3 + 10log10(1050/3000) = -1.5 dB

On a multipath channel, diversity (cyan line) helps a lot, that’s an Eb/No of 8dB, and a SNR of:

    SNR(dB) = 8 + 10log10(1050/3000) = 3.4 dB

The diversity model in the simulation uses two carriers. The amplitudes of each carrier after passing through the multipath model are plotted below:

Often when one carrier is faded, the other is not faded, so when we recombine them at the receiver we get an average that is closer to AWGN performance. However diversity is not perfect, occasionally both carriers are wiped out at the same time by a fade.

So we can see FreeDV 700C is about 4 dB in front of FreeDV 1600, which matches the best reports from early adopters. I’ve had reports of FreeDV 700C operating at as low as -2dB , which is presumably on channels that don’t have heavy fading and are more like AWGN. Also some reports of 700C falling over at high SNRs (around like 8dB)! However that is probably a bug, e.g. a sync issue or something else we can track down in time.

Real world channels can vary. The multipath model above doesn’t take into account fast or slow fading, it just calculates the average bit errors rate. In practice, slow fading is hard to handle in digital voice applications, as the whole channel might be wiped out for a few seconds.

Now that we have a reasonable 700 bit/s codec – we can also consider other schemes, such as a more powerful FEC code rather than diversity. Like diversity, FEC codes provide “coding gain”, moving our operating point to the left. Really good codes operate at 10% BER, right over on the Eb/No = 2dB region of the curve. No free lunch of course – such codes may require long latency (seconds) or be expensive to decode.

Next Steps

I’d like to “instrument” FreeDV 700C and work with the 700C early adopters to find out how well it’s working, why and how it falls over, and work through any obvious bugs. Then start experimenting with ways to make it operate at lower SNRs, such as more powerful FEC codes or even non-redundant techniques like Trellis decoding.

Now we have shown Codec 700C has sufficient quality for conversations over the air, I’m planning another iteration of the Codec 2 700C vocoder design to see if we can improve speech quality.

Links

Modems for HF Digital Voice Part 1.

More Eb/No to SNR worked examples.

Similar modem calculations were used to develop a 100 kbit/s telemetry system to send HD images from High Altitude Balloons.

Modems for HF Digital Voice Part 1

The newly released FreeDV 700C mode uses the Coherent PSK (COHPSK) modem which I developed in 2015. This post describes the challenges of building HF modems for DV, and how the COHPSK modem evolved from the FDMDV modem used for FreeDV 1600.

HF channels are tough. You need a lot of SNR to push bits through them. There are several problems to contend with:

When the transmit signal is reflected off the ionosphere, two or more copies arrive at the receiver antenna a few ms apart. These echoes confuse the demodulator, just like a room with bad echo can confuse a listener.

Here is a plot of a BPSK baseband signal (top). Lets say we receive two copies of this signal, from two paths. The first is identical to what we sent (top), but the second is delayed a few samples and half the amplitude (middle). When you add them together at the receiver input (bottom), it’s a mess:

The multiple paths combining effectively form a comb filter, notching out chunks of the modem signal. Loosing chunks of the modem spectrum is bad. Here is the magnitude and phase frequency response of a channel with the two paths used for the time domain example above:

Note that comb filtering also means the phase of the channel is all over the place. As we are using Phase Shift Keying (PSK) to carry our precious bits, strange phase shifts are more bad news.

All of these impairments are time varying, so the echoes/notches, and phase shifts drift as the ionosphere wiggles about. As well as the multipath, it must deal with noise and operate at SNRs of around 0dB, and frequency offsets between the transmitter and receiver of say +/- 100 Hz.

If commodity sound cards are used for the ADC and DAC, the modem must also handle large sample clock offsets of +/-1000 ppm. For example the transmitter DAC sample clock might be 7996 Hz and the receiver ADC 8004 Hz, instead of the nominal 8000 Hz.

As the application is Push to Talk (PTT) Digital Voice, the modem must sync up quickly, in the order of 100ms, even with all the challenges above thrown at it. Processing delay should be around 100ms too. We can’t wait seconds for it to train like a data modem, or put up with several seconds of delay in the receive speech due to processing.

Using standard SSB radio sets we are limited to around 2000 Hz of RF bandwidth. This bandwidth puts a limit on the bit rate we can get through the channel. The amplitude and phase distortion caused by typical SSB radio crystal filters is another challenge.

Designing a modem for HF Digital Voice is not easy!

FDMDV Modem

In 2012, the FDMDV modem was developed as our first attempt at a modem for HF digital voice. This is more or less a direct copy of the FDMDV waveform which was developed by Francesco Lanza, HB9TLK and Peter Martinez G3PLX. The modem software was written in GNU Octave and C, carefully tested and tuned, and most importantly – is open source software.

This modem uses many parallel carriers or tones. We are using Differential QPSK, so every symbol contains 2 bits encoded as one of 4 phases.

Lets say we want to send 1600 bits/s over the channel. We could do this with a single QPSK carrier at Rs = 800 symbols a second. Eight hundred symbols/s times two bit/symbol for QPSK is 1600 bit/s. The symbol period Ts = 1/Rs = 1/800 = 1.25ms. Alternatively, we could use 16 carriers running at 50 symbols/s (symbol period Ts = 20ms). If the multipath channel has echoes 1ms apart it will make a big mess of the single carrier system but the parallel tone system will do much better, as 1ms of delay spread won’t upset a 20ms symbol much:

We handle the time-varying phase of the channel using Differential PSK (DPSK). We actually send and receive phase differences. Now the phase of the channel changes over time, but can be considered roughly constant over the duration of a few symbols. So when we take a difference between two successive symbols the unknown phase of the channel is removed.

Here is an example of DPSK for the BPSK case. The first figure shows the BPSK signal top, and the corresponding DBPSK signal (bottom). When the BPSK signal changes, we get a +1 DBPSK value, when it is the same, we get a -1 DBPSK value.

The next figure shows the received DBPSK signal (top). The phase shift of the channel is a constant 180 degrees, so the signal has been inverted. In the bottom subplot the recovered BPSK signal after differential decoding is shown. Despite the 180 degree phase shift of the channel it’s the same as the original Tx BPSK signal in the first plot above.

This is a trivial example, in practice the phase shift of the channel will vary slowly over time, and won’t be a nice neat number like 180 degrees.

DPSK is a neat trick, but has an impact on the modem Bit Error Rate (BER) – if you get one symbol wrong, the next one tends to be corrupted as well. It’s a two for one deal on bit errors, which means crappier performance for a given SNR than regular (coherent) PSK.

To combat frequency selective fading we use a little Forward Error Correction (FEC) on the FreeDV 1600 waveform. So if one carrier gets notched out, we can use bits in the other carriers to recover the missing bits. Unfortunately we don’t have the bandwidth available to protect all bits, and the PTT delay requirement means we have to use a short FEC code. Short FEC codes don’t work as well as long ones.

COHPSK Modem

Over the next few years I spent some time thinking about different modem designs and trying a bunch of different ideas, most of which failed. Research and disappointment. You just have to learn from your mistakes, talk to smart people, and keep trying. Then, towards the end of 2014, a few ideas started to come together, and the COHPSK modem was running in real time in mid 2015.

The major innovations of the COHPSK modem are:

  1. The use of diversity to help combat frequency selective fading. The baseline modem has 7 carriers. A copy of these are made, and sent at a higher frequency to make 14 tones in total. Turns out the HF channel giveth and taketh away. When one tone is notched out another is enhanced (an anti-fade). So we send each carrier twice and add them back together at the demodulator, averaging out the effect of frequency selective fades:
  2. To use diversity we need enough bandwidth to fit a copy of the baseline modem carriers. This implies the need for a vocoder bit rate of much less than 1600 bit/s – hence several iterations at a 700 bits/s speech codec – a completely different skill set – and another 18 months of my life to develop Codec 2 700C.
  3. Coherent QPSK detection is used instead of differential detection, which halves the number of bit errors compared to differential detection. This requires us to estimate the phase of the channel on the fly. Two known symbols are sent followed by 4 data symbols. These known, or Pilot symbols, allow us to measure and correct for the current phase of each carrier. As the pilot symbols are sent regularly, we can quickly acquire – then track – the phase of the channel as it evolves.

Here is a figure that shows how the pilot and data symbols are distributed across one frame of the COHPSK modem. More information of the frame design is available in the cohpsk frame design spreadsheet, including performance calculations which I’ll explain in the next blog post in this series.

Coming Next

In the next post I’ll show how reading a few graphs and adding a few dBs together can help us estimate the performance of the FDMDV and COHPSK modems on HF channels.

Links

Modems for HF Digital Voice Part 2

cohpsk_plots.m Octave script used to generate plots for this post.

FDMDV Modem Page

FreeDV Robustness Part 1

FreeDV Robustness Part 2

FreeDV Robustness Part 3

CMA Equalisation of FSK

We’ve just released a new experimental mode for Digital Voice called FreeDV 800XA. This uses the Codec 700C mode, 100 bit/s for synchronisation, and a 4FSK modem, actually the same modem that has been so successful for images from High Altitude Balloons.

FSK has the advantage of being a constant amplitude waveform, so efficient class C amplifiers can be used. However as it currently stands, 800XA has no real protection for the multipath common on HF channels, for example symbols that have an echo delayed by a few ms.

So I decided to start looking at equalisers. Some Googling suggested the Constant Modulus Algorithm (CMA) Equaliser might be a suitable choice for FSK, and turned up some sample code on DSP stack exchange.

I had a bit of trouble getting the algorithm to work for bandpass FSK signals, so posted this question on CMA equalisation for FSK. I received some kind help, and eventually made the equaliser work on a simulated HF channel. Here is the Octave simulation cma.m

How it works

The equaliser attempts to correct for the channel using the received signal, which is corrupted by noise.

There is a “gotcha” in using a FIR filter to equalise a channel response. Consider a channel H(z) with a simple 3 sample impulse response h(n). Now we could equalise this with the exact inverse 1/H(z). Here is a plot of our example channel frequency response and the ideal equaliser which is exactly the inverse:

Now here is a plot of the impulse responses of the channel h(n), and equaliser h'(n):

The ideal equaliser response h'(n) is much longer than the 3 samples of the channel impulse response h(n). The CMA algorithm requires our equaliser to be a FIR filter. Counter-intuitively, we need to use an FIR equaliser with a number of taps significantly larger than the expected channel impulse response we are trying to equalise.

One explanation for this – the channel response can be considered to be a Finite Impulse response (FIR) filter H(z). The exact inverse 1/H(z), when expressed in the time domain, is an Infinite Impulse Response (IIR) filter, which have, you know, an infinitely long impulse response!

Simulation

The figures below show the CMA equaliser doing it’s thing in a multipath channel with AWGN noise. In Figure 1 the error is reduced over time, and the lower plot shows the combined channel-equaliser impulse response. If the equaliser was perfect the combined channel-equaliser response would be 1.

Figure 2 below shows the CMA going to work on a FSK signal. The top subplot is the transmitted FSK signal, you can see the two different frequencies in the waveform. The middle plot shows the received signal, after it has been messed up by the multipath channel. It’s clear that the tone amplitudes are different. Looking carefully at the point where the tones transition (e.g. around sample 25 and 65) there is intersymbol interference due to multipath echoes, messing up the start of each FSK symbol.

However in the bottom subplot the equaliser has worked it’s magic and the waveform is looking quite nice. The tone levels are nearly equal and much of the ISI removed. Yayyyyyy.

Figure 4 shows the magnitude frequency response at several stages in the simulation. The top subplot is the channel response. It’s a comb filter, typical of multipath channels. The middle subplot is the equaliser response. Ideally, this should be the exact inverse of the channel. It’s pretty close at the low end but seems to lose it’s way at very low and high frequencies. The lower plot is the combined response, which is close to 0dB at the low frequencies. Cool.

Figure 4 is the transmit spectrum of the modem signal (top), and the spectrum after the channel has mangled it (lower). Note one tone is now lower than the other. Also note that the modem signal only has energy in the low-mid range of the spectrum. This might explain why the equaliser does a good job in that region of the spectrum – it’s where we have energy to drive the adaption.

Problems for HF Digital Voice

Unfortunately the CMA equaliser only works well at high SNRs, and takes seconds to converge. I am interested in low SNR (around 0dB in a 3000 Hz noise bandwidth) and it’s Push To Talk (PTT) radio so we a need fast initial training, around 100ms. Then it must follow the time varying HF channel, continually retraining on the fly.

For further work I really should measure BER versus Eb/No for a variety of SNRs and convergence times, and measure what BER improvement we are buying with equalisation. BER is King, much easier that squinting at time domain waveforms.

If the CMA cost function was used with known information (like pilot symbols or the Unique Word we have in 800XA) it might be able to work faster. This would involve deconvolution on the fly, rather than using iterative or adaptive techniques.

Codec 2 700C

My endeavor to produce a digital voice mode that competes with SSB continues. For a big chunk of 2016 I took a break from this work as I was gainfully employed on a commercial HF modem project. However since December I have once again been working on a 700 bit/s codec. The goal is voice quality roughly the same as the current 1300 bit/s mode. This can then be mated with the coherent PSK modem, and possibly the 4FSK modem for trials over HF channels.

I have diverged somewhat from the prototype I discussed in the last post in this saga. Lots of twists and turns in R&D, and sometimes you just have to forge ahead in one direction leaving other branches unexplored.

Samples

Sample 1300 700C
hts1a Listen Listen
hts2a Listen Listen
forig Listen Listen
ve9qrp_10s Listen Listen
mmt1 Listen Listen
vk5qi Listen Listen
vk5qi 1% BER Listen Listen
cq_ref Listen Listen

Note the 700C samples are a little lower level, an artifact of the post filtering as discussed below. What I listen for is intelligibility, how easy is the same to understand compared to the reference 1300 bit/s samples? Is it muffled? I feel that 700C is roughly the same as 1300. Some samples a little better (cq_ref), some (ve9qrp_10s, mmt1) a little worse. The artifacts and frequency response are different. But close enough for now, and worth testing over air. And hey – it’s half the bit rate!

I threw in a vk5qi sample with 1% random errors, and it’s still usable. No squealing or ear damage, but perhaps more sensitive that 1300 to the same BER. Guess that’s expected, every bit means more at a lower bit rate.

Some of the samples like vk5qi and cq_ref are strongly low pass filtered, others like ve9qrp are “flat” spectrally, with the high frequencies at about the same level as the low frequencies. The spectral flatness doesn’t affect intelligibility much but can upset speech codecs. Might be worth trying some high pass (vk5qi, cq_ref) or low pass (ve9qrp_10s) filtering before encoding.

Design

Below is a block diagram of the signal processing. The resampling step is the key, it converts the time varying number of harmonic amplitudes to fixed number (K=20) of samples. They are sampled using the “mel” scale, which means we take more finely spaced samples at low frequencies, with coarser steps at high frequencies. This matches the log frequency response of the ear. I arrived at K=20 by experiment.

The amplitudes and even the Vector Quantiser (VQ) entries are in dB, which is very nice to work in and matches the ears logarithmic amplitude response. The VQ was trained on just 120 seconds of data from a training database that doesn’t include any of the samples above. More work required on the VQ design and training, but I’m encouraged that it works so well already.

Here is a 3D plot of amplitude in dB against time (300 frames) and the K=20 frequency vectors for hts1a. You can see the signal evolving over time, and the low levels at the high frequency end.

The post filter is another key step. It raises the spectral peaks (formants) an lowers the valleys (anti-formants), greatly improving the speech quality. When the peak/valley ratio is low, the speech takes on a muffled quality. This is an important area for further investigation. Gain normalisation after post filtering is why the 700C samples are lower in level than the 1300 samples. Need some more work here.

The two stage VQ uses 18 bits, energy 4 bits, and pitch 6 bits for a total of 28 bits every 40ms frame. Unvoiced frames are signalled by a zero value in the pitch quantiser removing the need for a voicing bit. It doesn’t use differential in time encoding to make it more robust to bit errors.

Days and days of very careful coding and checks at each development step. It’s so easy to make a mistake or declare victory early. I continually compared the output speech to a few Codec 2 1300 samples to make sure I was in the ball park. This reduced the subjective testing to a manageable load. I used automated testing to compare the reference Octave code to the C code, porting and testing one signal processing module at a time. Sometimes I would just printf rows of vectors from two versions and compare the two, old school but quite effective and spotting the step where the bug crept in.

Command line

The Octave simulation code can be driven by the scripts newamp1_batch.m and newamp1_fby.m, in combination with c2sim.

To try the C version of the new mode:

codec2-dev/build_linux/src$ ./c2enc 700C ../../raw/hts1a.raw - | ./c2dec 700C - -| play -t raw -r 8000 -s -2 -

Next Steps

Some thoughts on FEC. A (23,12) Golay code could protect the most significant bits of 1st VQ index, pitch, and energy. The VQ could be organised to tolerate errors in a few of its bits by sorting to make an error jump to a ‘close’ entry. The extra 11 parity bits would cost 1.5dB in SNR, but might let us operate at significantly lower in SNR on a HF channel.

Over the next few weeks we’ll hook up 700C to the FreeDV API, and get it running over the air. Release early and often – lets find out if 700C works in the real world and provides a gain in performance on HF channels over FreeDV 1600. If it looks promising I’d like to do another lap around the 700C algorithm, investigating some of the issues mentioned above.

SM2000 – Part 8 – Gippstech 2016 Presentation

Justin, VK7TW, has published a video of my SM2000 presentation at Gippstech, which was held in July 2016.

Brady O’Brien, KC9TPA, visited me in June. Together we brought the SM2000 up to the point where it is decoding FreeDV 2400A waveforms at 10.7MHz IF, which we demonstrate in this video. I’m currently busy with another project but will get back to the SM2000 (and other FreeDV projects) later this year.

Thanks Justin and Brady!

FreeDV and this video was also mentioned on this interesting Reddit post/debate from Gary KN4AQ on VHF/UHF Digital Voice – a peek into the future

Codec 2 Masking Model Part 5

In the last post in this series I was getting close to a fully quantised 700 bit/s codec. However as I pushed through I discovered a bug in the post-filter. I was accidentally cheating and using some of the encoder information in the decoder. When I corrected the bug the quality dropped significantly. I’ve hit these sorts of bugs before – the simulation code is complex and it’s easy to “declare victory” prematurely.

So I have abandoned the AbyS approach for now. Oh well, that’s “research and disappointment” for you. Plenty of new ideas though….

For the last few months I have been working on another solution that vector quantises a “fixed rate” version of the spectrum. The masking functions are still used to smooth the spectrum before sampling at the fixed rate. Much like we low pass filter time domain samples before sampling, the masking functions reduce the “bandwidth” and hence sample “rate” we need to represent the spectrum. Here is a block diagram of the current “700C” candidate codec:

The bit allocation is pitch (Wo) 6 bits, 1 bit for voicing, 16 bits for the amplitude VQ, 4 bits for energy and 1 bit spare. All updated every 40ms. The new work is in the “Decimate in Frequency” block, expanded here:

As the pitch of the speech varies, the number of harmonics used to represent the speech, L, varies. The goal is take a vector of L amplitude samples, vector quantise, and send them over a channel. To vector quantise them we need fixed length vectors. So a Discrete Fourier Transform (DFT) is used to resample the L amplitude samples to fixed vectors of length 20 (I have chosen k=10).

BTW a DFT is the generic form of a Fast Fourier Transform (FFT). A FFT is a computationally efficient (fast) way of computing a DFT.

The steps are similar to sampling a time domain signal. The bandwidth of the signal is limited by using the masking function to smooth the variations in the amplitude envelope. The use of masking functions means the smoothing matches the response of the ear, and no perceptually important information is lost.

I’ve recently been playing with OFDM modems, so I used a “cyclic suffix” to further smooth the DFT coefficients. DFTs like cyclic signals. If you have a DFT of an 8kHz signal, the sample at 3900Hz is the “close” to the sample at 0 Hz. If there is a step jump in amplitude – you get a lot of high frequency information in the DFT coefficients which is harder to quantise. So I throw away the last 500Hz of the speech signal (3500-4000 Hz), and replace it with a curve that ensures a smooth match between 3500 Hz and 0 Hz.

Yeah, I don’t know how I dream this stuff up either …… do I use the Force? Too much red wine or espresso? Experience? A life mispent on computers? Subconscious innovation? Plagiarism?

In the past I’ve tried to resample and VQ the spectrum of sinusoidal codecs a few times, without much success. Jean Marc also suggested something similar a few posts back. Anyhoo, getting somewhere this time around.

Here are some plots that show the algorithm in action for a frame of female speech:

Here are the amplitude samples (red crosses). The blue line has the cyclic suffix, note how it meets the first amplitude sample near 0Hz.

This figure shows the difference in the DFT coefficients with (blue) and without (green) the cyclic suffix:

Here is the cumulative energy of DFT coefficients, note that with the cyclic suffix (blue) low frequency energy dominates:

This figure shows a typical 2k=20 length vector that we vector quantise. Note it has zero mean – we extract the DC coefficient and separately quantise this as the frame energy.

Samples

Sample 1300 700C Candidate
hts1a Listen Listen
hts2a Listen Listen
forig Listen Listen
ve9qrp_10s Listen Listen
mmt1 Listen Listen
vkqi Listen Listen
cq_ref Listen Listen

Through a couple of years of on-air operation we have established that the 1300 bit/s codec (as used in FreeDV 1600 with 300 bit/s of FEC) has acceptable speech quality for HF. So the goal of this work is similar quality at 700 bit/s.

For some samples above (e.g. hts1a and mmt1a), 1300 is superior to the current 700C candidate. For others (e.g. hts2a and vk5qi) 700 sounds a little better. So I think I’m in the ball park.

There’s a bit of clipping at the start of cq_ref, and some level variations between the two modes on some samples. The 700C candidate has a few problems with unvoiced sounds, e.g. the intake of breath on ve9qrp_10, and the “ch” sound at the start of chicken in hts2a. Not sure why.

The cq_ref_1300 sample is a bit poor as the LPC technique used for spectral amplitudes falls over when the spectral dynamic range is high. In this sample the LF energy has much higher energy than the HF, i.e. a strong “Low Pass Filter” effect or spectral slope.

Next step is some refactoring – the Octave code is an untidy mess of 6 months of dead ends and false starts. A mirror of real world R&D I guess. Creating something new is not a tidy process. At least in my head. So many aspects of this algorithm that I could explore but I’d rather get this on the air and see if we really have something here. Would love to have some help with a port from Octave to C. Contact me if you’d like to work in this area.

SM2000 Part 7 – Prototype Ready for Manufacture

Rick, KA8BMA, has been working steadily on the CAD work for the SM2000 VHF Radio and the Rev A (prototype) PCB layout is now complete and ready for manufacture. Neil, VK5KA, an experienced RF Engineer, has been working with Rick on the PCB layout to ensure RF integrity. Thank you both for your fine work!

Here is the top layer of Rev A, which is 160mm x 160mm:

It’s a modular design, if you zoom in you can see the names of each module.

Edwin at Dragino is kindly assembling some prototypes for us, and I hope to start bringing up the board in early June.

Links

SM2000 Part 1 – Introducing the SM2000 project

SM2000 SVN – CAD Files for the project

FreeDV 2400A and 2400B Demos

Brady O’Brien, KC9TPA has put together a couple of videos demonstrating the new FreeDV VHF modes.

Here is a video demonstrating FreeDV 2400A, check out how well it performs next to analog FM:

The sample transmitted was generated using freedv_tx, audacity, and gnuradio (for FM modulation). The transmitting software was a gnuradio pipeline, to convert the 16 bit 48k short samples up to 4M 8bit hackrf samples. The HackRF was hooked straight up to j-pole antenna, about 25 feet in air. The power output was about 10mW.

Brady was 2.7 km away from the transmit site. On the receive end, a rtlsdr was connected to a 5/8 wave 2m antenna on his car. Software used on the receive end was gqrx, piped into freedv_rx over UDP, also recording a wav from which the DV and FM were later extracted.

Here is FreeDV 2400B, DV over a $50 commodity HT! This mode will run on any legacy FM analog radio, with roughly the same performance:

The transmitted sample was generated by freedv_tx and audacity. The TX rig was a yaesu FT-100, connected to a PC using a USB rigcat cable and isolated audio cable. The FT-100 was controlled by rigctl and hamlib, with 1W of transmit power. On the RX end, Brady was 3.8 km away. The UV-5R was configured with a SMA->BNC connector and MFJ magmount antenna. The UV-5R was interfaced to his laptop via a ‘kludge’ cable and USB audio interface.

Some errors can be heard in the decoded audio of this sample – we think the modem tones were clipping on the HT’s audio and introducing a few bit errors.

These new VHF modes are available in the FreeDV API and can be tested on the command line using freedv_tx and freedv_rx (example at the end of this post).

We would like to integrate FreeDV 2400B into the FreeDV GUI program and SM1000. It would be great to have some volunteers help with these tasks – please contact me if you can help!

FreeDV 2400A requires a SDR with a 5kHz RF bandwidth, and will be integrated into the SM2000 VHF radio.

Links

Modems for VHF Digital Voice
FreeDV 2400A
FreeDV 2016 Roadmap

FreeDV 2400A

Brady O’Brien, KC9TPA, has been working hard on two new FreeDV modes for VHF/UHF radio. To the existing Codec 2 1300 bit/s mode, he has added framing/sync logic and our high performance 4FSK modem. This mode is designed to be “readability 5” at -132dBm, which is 10dB beyond the point where analog FM and 1st generation DV systems stop working.

Brady tested the system by setting up a low power transmitter using a HackRF connected directly to an antenna (tx power about 20mW). A GNU Radio system was used to play FreeDV 2400A and analog FM signals at the same transmit power:

He then went for a drive and found a spot 2.5km away where the signal was weak, but still decodable.

Here is a FM sample and DV sample for comparison. At the same power even SSB would be a scratchy 6dB SNR copy (noise measured in a 3000Hz bandwidth).

Here is a spectogram of the two signals, FM/2400A/FM/2400A.

SDR radios are required to reach the performance goals for this mode. FreeDV 2400A is not designed to be run on legacy FM radios, even those with data ports. The RF bandwidth is 5kHz, too wide for SSB radios. This represents a complete departure from “FM” friendly VHF DV modes – DStar/C4FM/DMR which pass through an analog FM modem, and suffer performance degradation because of it. The mode has been designed without compromise in the modem and to explore new ground. It is also completely open source – especially the codec.

However we are also developing FreeDV 2400B – which is designed to run though any FM radio, even a $40 HT. Some test results on that soon.

FreeDV 2400A is available now in the FreeDV API and can be tested using the FreeDV command line utilities, for example:

./freedv_tx 2400A ../../raw/ve9qrp_10s.raw - | ./freedv_rx 2400A - - | play -t raw -r 8000 -s -2 -

It requires a 48kHz interface to the SDR.

Some information on the FreeDV 2400A mode:

Bit Rate 2400 bit/s
RF Bandwidth 5 kHz
Suggested Channel Spacing 6.25 kHz
Modulation 4FSK with non coherent demodulation
Symbol Rate 1200 symbols/s
Tone Spacing 1200 Hz
Frame Period 40ms
Bits/Frame 96
Unique Word 16 bits/frame
Codec 2 1300 52 bits/frame
Spare Bits 28 bits/frame

The spare bits are currently undefined but could be used for data, routing information, or FEC. It’s early days but this is an important first step – well done Brady!