Testing a FDMDV Modem

A key use for Codec 2 is digital voice over HF and VHF radio. A few months ago I figured we needed to get Codec 2 on the air. With a PC, codec and modem software, two sound cards, and a Single Sideband (SSB) radio it is possible to send and receive Digital Voice (DV) signals over HF radio.

This requires an HF modem optimised for digital speech, in particular fast sync, no multi-second training sequences, the ability to recover quickly after a fade, and no automatic re-transmission of “bad” packets. FDMDV was a working system for HF Digital Voice from a few years ago, so it seemed like a good starting point. It embodies a lot of experience from Digital Voice pioneers like Mel Whitten.

FDMDV stands for Frequency Division Multiplexed Digital Voice. An FDM modem is basically a bunch of slow modems running in parallel. For example, FDMDV has 14 carriers spaced 75 Hz apart, each running at 50 symbols/second. Due to multipath problems on HF this approach works better than one carrier running at 14×50 = 700 symbols/second. Each symbol encodes two bits using differential QPSK (DQPSK), so the bit rate is 14×50×2 = 1400 bit/s.
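To put some numbers on that, here is a quick back-of-the-envelope sketch in C. The audio centre frequency used below is just an assumption for illustration, not a figure taken from the FDMDV spec:

/* Back-of-envelope FDMDV waveform numbers (illustrative sketch only) */
#include <stdio.h>

int main(void)
{
    int   Nc              = 14;     /* number of data carriers            */
    float spacing_hz      = 75.0;   /* carrier spacing, Hz                */
    int   Rs              = 50;     /* symbol rate per carrier, symbols/s */
    int   bits_per_symbol = 2;      /* DQPSK: 2 bits/symbol               */
    float centre_hz       = 1200.0; /* assumed audio centre frequency     */

    printf("bit rate           = %d bit/s\n", Nc*Rs*bits_per_symbol);
    printf("occupied bandwidth ~ %.0f Hz\n", Nc*spacing_hz);

    /* illustrative carrier placement either side of the centre */
    for (int c = 0; c < Nc; c++)
        printf("carrier %2d: %.1f Hz\n", c+1,
               centre_hz + (c - (Nc-1)/2.0)*spacing_hz);

    return 0;
}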

A few months ago I started experimenting with GNU Octave simulations of parts of the FDMDV modem. One thing led to another and I ended up writing an open source version of the FDMDV modem, based on the FDMDV spec.

I am in the final stages of the C version of that modem, currently writing command line demo programs. I am not sure what the “best” HF DV system would look like (Codec/FEC/protocol/modem) but I feel the best way to find out is to build something and iterate on it. Rather than concentrating on the Codec alone I wanted to get some real world HF DV experience to tune and evolve the system as a whole.

The cool thing about open source is it attracts the best in the field. I have been in regular contact with HF modem gurus like Peter Martinez G3PLX (PSK31), and Rick Muething KN6KB (WINMOR) who have been very helpful with suggestions and support as I re-implemented the FDMDV modem. Rick also has some great ideas for more advanced modulation schemes (trellis coded PSK) that would be nice to try later. I have also had some great help from Bill Cowley, who has 25 years of PSK modem experience.

Testing the Modem

After developing the modem algorithms for about two months using GNU Octave I was ready to test over a real HF channel. So a few days ago I sent a wave file of the modem signal to Mel and Tony (K2MO). They kindly played the tones over a 925 mile HF channel and sent me a recording of the received signal.

I ran the files through my FDMDV modem code. On the first pass the scatter diagram was a mess and the Bit Error Rate (BER) was about 10% – suspiciously high.

Then I noticed the timing offset was changing very quickly, as you can see in the plot below:

The demod estimates the best time to sample the received symbols. This is known as the “timing offset”. In the real world the sample clocks used at the transmitter and receiver tend to be a little different, for example 8000 and 8001 Hz. In our case the sample clocks are in the sound device hardware used to play and record the modem signals. So we expect the timing offset to drift a little. I had been simulating just such problems during the modem development, for example testing clock differences of up to 2000 ppm (16 Hz at an 8000 Hz sample rate).

Now the demod code keeps an eye on the drift in the timing estimate, and reshuffles buffers every now and again to keep them from overflowing. Hence the saw-tooth effect.
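Here is a minimal sketch of that idea in C. It is not the actual fdmdv demod code, just an illustration of how the fine timing estimate can be kept bounded by occasionally slipping the input buffer, which is what produces the saw-tooth:

/* Illustrative sketch only, not the real fdmdv demod logic.
 * When the fine timing estimate drifts past half a symbol we adjust
 * how many new samples are read next frame, re-centering the estimate. */
#define FS 8000            /* nominal sample rate, Hz  */
#define RS 50              /* symbol rate, symbols/s   */
#define M  (FS/RS)         /* samples per symbol (160) */

int samples_to_read_next_frame(float rx_timing, int nominal_nin)
{
    int nin = nominal_nin;

    if (rx_timing > M/2)
        nin += M/4;        /* slip forward: read extra samples next frame */
    else if (rx_timing < -M/2)
        nin -= M/4;        /* slip back: read fewer samples next frame    */

    return nin;
}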

If we count how many “teeth per second” there are in the saw-tooth, we can estimate the difference between the transmit and receive sample clocks. I estimated about 2.5 teeth per second, of 40 samples each. So every second that’s 2.5×40 = 100 samples, or a 100 Hz difference, or 12500 ppm! It’s as if the PC playing the signal was sampling at 8000 Hz and the PC receiving the signal was sampling at 8100 Hz.
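The same arithmetic as a tiny C sanity check, using the numbers estimated from the plot:

/* Sanity check of the clock offset estimate from the saw-tooth plot */
#include <stdio.h>

int main(void)
{
    float teeth_per_sec     = 2.5;    /* saw-tooth cycles per second  */
    float samples_per_tooth = 40.0;   /* buffer shuffle size, samples */
    float fs                = 8000.0; /* nominal sample rate, Hz      */

    float offset_hz  = teeth_per_sec*samples_per_tooth;  /* 100 Hz    */
    float offset_ppm = offset_hz/fs*1e6;                  /* 12500 ppm */

    printf("sample clock offset: %.0f Hz (%.0f ppm)\n", offset_hz, offset_ppm);
    return 0;
}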

I re-sampled the signal to correct the large sample clock offset using Sox:
sox -r 8100 -s -2 for_david.raw -s -2 for_david_8000hz.raw rate -h 8000

and the results were perfect – 0 bit errors, except when there was SSB interference across the signal! This was very exciting for me – the first verification that my modem actually worked over real HF channels.

Turns out some sound cards can’t accurately sample at 8000 Hz. This was something I had been warned of by my HF modem brains trust. The solution is to use the 48000 Hz sound card rate, which most sound cards seem to handle better.
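If the audio is captured at 48000 Hz it then needs to be brought back down to 8000 Hz for the modem, for example with a Sox command along the same lines as above (the file names here are just placeholders):

sox -r 48000 -s -2 radio_48k.raw -s -2 radio_8k.raw rate -h 8000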

FDM Modem in Action

Here are samples of the first 5 seconds of the transmit and receive modem signal. Now look at the spectrogram of the received signal:

Time is along the x axis, frequency along the y axis. The “hotter” the colour, the stronger the signal. Our FDM signal is the parallel red lines between 600 and 1700Hz. Above the modem signal is some analog SSB. You can hear this as the high frequency “Donald Duck” sound in the received signal. Now around 2.5 and 3.3 seconds there are strong bursts of SSB right on top of our signal, in the 0 to 1100 Hz range.

So how does our modem do? The “Bit errors for test frames” and “Test Frame Sync” plots below tell the story:

Look at the centre plot: it is a measure of bit errors for each test frame received by the demod. Between 2.5 and 3.5 seconds you can see several error bursts. However the demod recovered quickly after the SSB interference. The BPSK sync and test frame sync plots are unbroken, indicating our demod didn’t “lose it” during the interfering burst. If this was Codec (digital voice) data we would hear some degraded speech, but the system would soldier on between interfering analog SSB bursts. Just what we want.
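For reference, counting bit errors against the known test frame is conceptually as simple as the sketch below. The frame size and names are assumptions for illustration, not the actual fdmdv API:

/* Illustrative only: compare received bits against the known
 * pseudo-random test pattern and count the differences. */
#define TEST_BITS_PER_FRAME 56    /* assumed bits per test frame */

int count_frame_errors(const int rx_bits[], const int test_bits[])
{
    int i, errors = 0;

    for (i = 0; i < TEST_BITS_PER_FRAME; i++)
        if (rx_bits[i] != test_bits[i])
            errors++;

    return errors;
}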

This sample also shows some of the accumulated wisdom that went into the FDMDV system design. It is a narrow signal (just 1100 Hz), so there is less mutual interference with adjacent users on the busy HF bands compared to a system using a full SSB bandwidth of say 2400 Hz. Narrow band means we can pack more energy into fewer carriers. The signal to noise ratio is relatively high, and the BER due to gaussian type channel noise (AWGN) is practically zero. Rather, bit errors come from adjacent users and multipath fading effects (the latter not illustrated here).

Next steps are to integrate the Codec and Modem into an easy to use GUI program for Windows and Linux. This will help us obtain some real world experience which we can use to tune and further develop the entire system.

29 thoughts on “Testing a FDMDV Modem”

    1. Turns out it’s possible to resample between two rates without loss. This happens at a couple of points in the modem. For example we upconvert from the symbol rate (50 samples/s) to the sample rate (8000 Hz) using an interpolating filter.

      On the demod, when we determine the best timing estimate, we resample the signal at that point in time using an oversampled receive filter. These filters effectively take into account many samples either side.

  1. Hi David, that looks pretty awesome to me. Will this be able to be run through a standard PSK/Echolink interface? If you let me know when you’re ready for some testing I would love to give you a hand!

    1. Thanks Darryl. I am not sure what’s required to interface to PSK/Echolink but would be happy to give it a go. Thanks for the offer of help with testing!

    2. A standard audio/PTT interface for data modes should work fine with this kind of modem, but be aware it will take up an audio input and output.

      I use a Tigertronics SignaLink USB, which appears as a separate USB sound card, leaving my onboard sound-card free for microphone input and demodulated audio output.

      1. You might be able to do this with a single sound card if you build a more complicated harness than the EchoLink/PSK one. It’s possible to do the microphone/speaker audio on one of the stereo channels, and the radio baseband signal input/output on the other one. We haven’t tested if this has unacceptable crosstalk or any other problems yet.

  2. The big mismatch in sampling rates you are experiencing is a common problem with modern sound cards. It has caused serious problems for acoustic echo cancellation, demanding a rate adaptor be used to pull the sample rates into line, so you can actually do an NLMS/PNLMS/FAP/etc adaptation to the echo.

    The problem usually stems from the chipset’s ability to sync to a SPDIF source. This means they run all input, including the ADC, from a PLL, and its timing is unrelated to the timing of the DAC. When there is no SPDIF signal, the PLL tends to drift to one end of its range, and you have a big sampling rate offset. In some cases it drifts around endlessly, meaning the sampling rate is pretty unstable. With some of these cards simply connecting the SPDIF output to the SPDIF input will pull the two sampling rates into line. I think with others you need to do some port selection to actually make the SPDIF input active. Then the ADC will sample at a rate determined by the crystal clock driving the DAC. This avoids the problem with AEC completely, and will make your sampling rate a lot closer to the expected value.

  3. Steve, would the lack of a SPDIF source also be an issue on sound hardware without SPDIF, e.g. the sound chipset in my laptop?

  4. I don’t think there is any PC sound hardware without SPDIF these days. You may not have a connector, but that doesn’t mean the function isn’t in the silicon.

  5. I’m happy to test. I’m set up for digital modes. Just need to buy a cheap USB sound fob for the headset.

    VK3JED

  6. What sort of error correction coding are you using? At such a low bit rate, it may be practical to use low density parity codes or turbo codes to squeeze a few extra decibels of noise tolerance out of it.

    1. No FEC at present. Are LDPC and Turbo codes practical with small blocks, e.g. the 56 bits/frame that we are using?

      1. I have a great expert in LDPC codes nearby and from talking to him I understood that LDPC is only good for big blocks of data – the bigger the better. So it’s a good choice for 1GbE with jumbo frames, but is not quite useful for speech codecs.

  7. I’m not an expert on the subject, though I have a fancy book on it. But looking it up, it seems that these codes do rely on large block sizes. Maybe a standard Hamming code would work best. If bandwidth is a major concern, you could just protect the most important bits of your code.

  8. Thanks everyone.
    I’ve been testing the FDMDV 1.3 system back to back on two computers, using two audio cards on each computer.

    In Audio mode (Plain Language) the system seems to work with almost perfect audio recovery, however in Data mode a different picture is emerging.

    I was wondering if anyone else has ever tried to put two computers back to back. What type of interface did they use, and what were the results?
    VK4GRA(ham)

  9. Regarding the sound cards issue. From my experience of ~5 years ago, it’s usually much better to use the maximum sample rate a sound card can offer and then resample in software. The thing is that many sound cards employ very simple linear interpolation/decimation techniques and distort the signal quite a lot. So my recommendation is to detect the maximum sample rate supported by the sound card at start-up and talk to it at that sample rate.

  10. I have had some success using FDMDV with IC718 & FT450 radios, and with 2 Dell D630 laptops, using 1:1 audio transformers as rig/PC isolators. I have found I need to run the power at 25% of ALC clipping for best performance.
    Even then the recovered audio is a bit choppy and a bit tinny. Might be my poor quality USB headset?

  11. I played around with the FDMDV mod/demod program and I found out that it works really well for sending small computer files (64-128 kilobytes) acoustically from speaker to microphone. I used a CRC-16 to see if the file went through OK. I messed around with the number of characters and got at least 300 bytes per second at about 2 feet away. I tried running Reed Solomon coding but I didn’t know much about how to interleave the data, and the computer chokes every few seconds as it corrects the bits (I used 8 blocks of 223 bytes in a pseudo-random permutation). One thing is that it doesn’t work too well on compact cassette tape. Manchester coding works better for that. If I could figure out how to compile a Codec2 player on the Sony PlayStation Portable I’d try using Manchester coding to make a digital cassette tape. A few years back I used the proprietary Digital Radio Mondiale 10 kHz mode and got 13 kbps music playing off a cassette tape in mono using one channel. A patent-free open source digital cassette tape (that uses an ordinary analog recorder) for music would be cool.

    1. There has been some progress on a turn-key FDMDV2 application, but it’s not ready for release yet. Hopefully later this year.

  12. Well, congratulations on the revival of Digital Voice! It looks very hard to find QSO partners, but I will try harder this December! Is there a small chance we will see digital voice embedded into a DSP chipset for a standalone “AOR”-like unit?

  13. David, do you have any data on how well the frequency lock survives various frequency-unstable input signals? I’m thinking doppler or LO drift which might be a problem for non-HF use.

  14. Sorry David about not getting back to you earlier.
    I’ve been using FDMDV connected to an HF transceiver (FT 901d) using just 2 isolation transformers and a couple of resistors to adjust the audio levels in and out of the radio and my laptop.
    Using the straight audio function (Analogue) on the FDMDV, I experienced considerable echoing of my transmitted voice. About three separate echoes were heard by the remote station.
    After trying lots of combinations in my hardware interface, I tracked down a solution. It lay in a function of the audio adjustment software.
    I found that going into the advanced options of the Microphone settings and deselecting the function marked “AGC” solved the problem of uncontrolled echoing.

    Graham VK4GRA

  15. The whole point of the pilot carrier (the one in the middle, 3 dB higher than the data carriers, speaking in the frequency domain) is that the receiving end will lock onto that carrier and everything can run asynchronously at the receiving end: the transmit and receive clocks don’t need to be extremely close to the same frequency, the package being self synchronizing. There is intentionally no error correction, as the mode is intended for near-real-time voice transmission and adding error correction would make it less near-real-time, i.e. a longer delay for encode and decode. Human speech contains lots of redundancy and missed bits will be filled in by the brain. Anyway, anyone near Kauai who wants to experiment, give me a call plus eight zero eight 338 twenty twenty. Thus are my latent (non-near-real-time) comments on the subject. Regards,

    George KH6/AH8H
