I’m developing an open source data mode using an FSK modem and powerful LDPC codes. The initial use case is the Open IP over UHF/VHF project, but it’s available in the FreeDV API as a general purpose mode for sending data over radio channels.
It uses 2FSK or 4FSK, has a variety of LDPC codes available, works with bursts or streaming frames, and the sample rate and symbol rate can be set at init time.
The FSK modem has been around for some time, and is used for several applications such as balloon telemetry and FreeDV digital voice modes. Bill, VK5DSP, has recently done some fine work to tightly integrate the LDPC codes with the modem. The FSK_LDPC system has been tested over the air in Octave simulation form, ported to C, and bundled up into the FreeDV API to make using it straightforward from the command line, C or Python.
We’re not using a “black box” chipset here – this is ground-up development of the physical layer using open source, careful simulation, automated testing, and verification of our work on real RF signals. As it’s open source, the modem is not buried in proprietary silicon, so we can look inside, debug issues and integrate powerful FEC codes. Using a standard RTLSDR with a 6dB noise figure, FSK_LDPC is roughly 10dB ahead of the receiver in a sample chipset. That’s a factor of 10 in power efficiency or bit rate – your choice!
Performance
The performance is pretty close to what is theoretically possible for coded FSK [6]. This is about Eb/No=8dB (2FSK) and Eb/No=6dB (4FSK) for error free transmission of coded data. You can work out what that means for your application using:
MDS = Eb/No + 10*log10(Rb) + NF - 174
SNR = Eb/No + 10*log10(Rb/B)
So if you were using 4FSK at 100 bits/s, with a 6dB Noise figure, the Minimum Detectable Signal (MDS) would be:
MDS = 6 + 10*log10(100) + 6 - 174 = -142dBm
Given a 3kHz noise bandwidth, the SNR would be:
SNR = 6 + 10*log10(100/3000) = -8.8 dB
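As a quick sanity check, here are the same sums in a few lines of Python (plain arithmetic, nothing FSK_LDPC specific):

import math

# Worked example from above: 4FSK at Rb = 100 bit/s, 6dB noise figure,
# Eb/No = 6dB, and a 3kHz noise bandwidth.
EbNodB, Rb, NF, B = 6, 100, 6, 3000

MDS = EbNodB + 10 * math.log10(Rb) + NF - 174   # dBm
SNR = EbNodB + 10 * math.log10(Rb / B)          # dB
print(f"MDS = {MDS:.0f} dBm, SNR = {SNR:.1f} dB")   # MDS = -142 dBm, SNR = -8.8 dB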
How it Works
Here is the FSK_LDPC frame design:
At the start of a burst we transmit a preamble to allow the modem to synchronise. Only one preamble is transmitted for each data burst, which can contain as many frames as you like. Each frame starts with a 32 bit Unique Word (UW), followed by the FEC codeword consisting of the data and parity bits. At the end of the data bits we reserve 16 bits for a CRC.
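To make the layout concrete, here is a rough sketch of how one frame could be assembled. The 32 bit UW and 16 bit CRC come from the design above; the data/parity sizes and the crc16/ldpc_encode helpers are illustrative placeholders, not the codec2 implementation:

import numpy as np

UW_BITS, CRC_BITS = 32, 16          # from the frame design above
DATA_BITS, PARITY_BITS = 256, 256   # illustrative: a rate 0.5 (512,256) code

def make_frame(payload_bits, uw, crc16, ldpc_encode):
    """One frame: UW, then the LDPC codeword (data bits ending in a CRC, then parity)."""
    assert len(payload_bits) == DATA_BITS - CRC_BITS
    data = np.concatenate([payload_bits, crc16(payload_bits)])   # data bits + 16 bit CRC
    parity = ldpc_encode(data)                                   # LDPC parity bits
    return np.concatenate([uw, data, parity])

# A burst is a single preamble followed by one or more frames:
#   [ preamble | frame 1 | frame 2 | ... | frame N ]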
This figure shows the processing steps for the receive side:
Unique Word Selection
The Unique Word (UW) is a known sequence of bits we use to obtain “frame sync”, or identify the start of the frame. We need this information so we can feed the received symbols into the LDPC decoder in the correct order.
To find the UW we slide it against the incoming bit stream and count the number of errors at each position. If the number of errors is beneath a certain threshold, we declare a valid frame and try to decode it with the LDPC decoder.
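In code, the search is just a sliding error count (a minimal sketch of the idea, not the codec2 implementation):

import numpy as np

def find_uw(rx_bits, uw, thresh=6):
    """Slide the UW along the received bits and return the offsets where the
    error count is at or below the threshold (candidate frame starts)."""
    rx_bits, uw = np.asarray(rx_bits), np.asarray(uw)
    candidates = []
    for i in range(len(rx_bits) - len(uw) + 1):
        errors = int(np.sum(rx_bits[i:i + len(uw)] != uw))
        if errors <= thresh:
            candidates.append((i, errors))
    return candidates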
Even with pure noise (no signal), a random sequence of bits will occasionally get a partial match (better than our threshold) with the UW. That means the occasional dud frame detection. However, if we make the threshold too strict, we might miss good frames that just happen to have a few too many errors in the UW.
So how do we select the length of the UW and the threshold? Well, for the last few decades I’ve been guessing. However, despite being allergic to probability theory, I have recently started using the Binomial Distribution to answer this question.
Let’s say we have a 32 bit UW; let’s plot the Binomial PDF and CDF:
The x-axis is the number of errors. On each graph I’ve plotted two cases:
- A 50% Bit Error Rate (BER). This is what we get when no valid signal is present, just random bits from the demodulator.
- A 10% bit error rate. This is the worst case where we need to get frame sync – a valid, but low SNR signal. The rate half LDPC codes fall over at about 10% BER.
The CDF tells us “what is the chance of this many or fewer errors”. We can use it to pick the UW length and threshold.
In this example, say we select a “valid UW threshold” of 6 bit errors out of 32. Imagine we are sliding the UW over random bits. Looking at the 50% BER CDF curve, we have a probability of 2.6E-4 (0.026%) of getting 6 or fewer errors. Looking at the 10% curve, we have a probability of 0.96 (96%) of detecting a valid frame – or in other words we will miss 100 - 96 = 4% of the valid frames that just happen to have 7 or more errors in the unique word.
So there is a trade-off between false detections on random noise and missing valid frames. A longer UW helps separate the two cases, but adds some overhead – UW bits don’t carry any payload data. A lower threshold means you are less likely to trigger on noise, but more likely to miss a valid frame that has a few errors in the UW.
Continuing our example, let’s say we try to match the UW on a stream of random bits from off-air noise. Because we don’t know where the frame starts, we need to test every single bit position. So at a bit rate of 1000 bits/s we attempt a match 1000 times a second. The expected number of random matches in 1000 bits (1 second) is 1000*2.6E-4 = 0.26, or about 1 chance in 4. So every 4 seconds, on average, we will get an accidental UW match on random data. That’s not great, as we don’t want to output garbage frames to higher layers of our system. So a CRC on the decoded data is performed as a final check to determine if the frame is indeed valid.
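If you’d like to check these numbers yourself, scipy will do the binomial arithmetic:

from scipy.stats import binom

n, thresh = 32, 6                        # 32 bit UW, accept up to 6 bit errors
p_false  = binom.cdf(thresh, n, 0.5)     # random bits (no signal): ~2.6E-4
p_detect = binom.cdf(thresh, n, 0.1)     # valid frame at 10% BER:  ~0.96

Rb = 1000                                # bits/s, so 1000 UW tests per second
print(p_false, p_detect, Rb * p_false)   # ~0.26 false triggers/second on noise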
Putting it all together
We prototyped the system in GNU Octave first, then ported the individual components to stand-alone C programs that we can string together using stdin/stdout pipes:
$ cd codec2/build_linux
$ cd src/
$ ./ldpc_enc /dev/zero - --code H_256_512_4 --testframes 200 |
  ./framer - - 512 5186 |
  ./fsk_mod 4 8000 5 1000 100 - - |
  ./cohpsk_ch - - -10.5 --Fs 8000 |
  ./fsk_demod --mask 100 -s 4 8000 5 - - |
  ./deframer - - 512 5186 |
  ./ldpc_dec - /dev/null --code H_256_512_4 --testframes
--snip--
Raw   Tbits: 101888 Terr: 8767 BER: 0.086
Coded Tbits:  50944 Terr:  970 BER: 0.019
Tpkts: 199 Tper: 23 PER: 0.116
The example above runs 4FSK at 5 symbols/second (10 bits/s), at a sample rate of 8000 Hz. It uses a rate 0.5 LDPC code, so the throughput is 5 bit/s and it works down to -24dB SNR (at around 10% PER). This is what it sounds like on a SSB receiver:
Yeah I know. But it’s in there. Trust me.
The command line programs above are great for development, but unwieldy for real world use. So they’ve been combined into single FreeDV API functions. These functions take data bytes and convert them to samples you send through your radio, then at the receiver convert the received samples back to bytes again. Here’s a simple example of sending some text using the FreeDV raw data API test programs:
$ cd codec2/build_linux/src
$ echo 'Hello World ' |
  ./freedv_data_raw_tx FSK_LDPC - - 2>/dev/null |
  ./freedv_data_raw_rx FSK_LDPC - - 2>/dev/null |
  hexdump -C
48 65 6c 6c 6f 20 57 6f 72 6c 64 20 20 20 20 20  |Hello World     |
20 20 20 20 20 20 20 20 20 20 20 20 20 20 11 c6  |              ..|
The “2>/dev/null” hides some of the verbose debug information, to make this example quieter. The 0x11c6 at the end is the 16 bit CRC. This particular example uses frames of 32 bytes, so I’ve padded the input data with spaces.
My current radio for real world testing is a Raspberry Pi Tx and RTLSDR Rx, but FSK_LDPC could be used over regular SSB radios (just pipe the audio into and out of your radio with a sound card), or other SDRs. FSK chips could be used as the Tx (although their receivers are often sub-optimal as we shall see). You could even try it on HF, and receive the signal remotely with a KiwiSDR.
I’ve used a HackRF as a Tx for low level testing. After a few days of tuning and tweaking it works as advertised – I’m getting within 1dB of theory when tested on the bench at rates between 500 and 20000 bits/s. In the table below, Minimum Detectable Signal (MDS) is defined as 10% PER, measured over 100 packets. I send the packets arranged as 10 “bursts” of 10 packets each, with a gap between bursts. This gives the acquisition a bit of a workout (burst operation is typically tougher than streaming):
| Info bit rate (bits/s) | Mode | NF (dB) | Expected MDS (dBm) | Measured MDS (dBm) | Si4464 MDS (dBm) |
|---|---|---|---|---|---|
| 1000  | 4FSK | 6 | -132 | -131 | -123 |
| 10000 | 4FSK | 6 | -122 | -120 | -110 |
| 5000  | 2FSK | 6 | -123 | -123 | -113 |
The Si4464 is used as an example of a chipset implementation. The Rx sensitivity figures were extrapolated from the nearest bit rate in Table 3 of the Si4464 data sheet. It’s hard to compare exactly as the Si4464 doesn’t have FEC. In fact it’s not possible to fully utilise the performance of high performance FEC codes on chipsets, as they generally don’t have soft decision outputs.
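The Expected MDS column is just the MDS formula from the Performance section evaluated at each bit rate, using the roughly 6dB (4FSK) and 8dB (2FSK) Eb/No operating points:

import math

def expected_mds(EbNodB, Rb, NF=6):
    return EbNodB + 10 * math.log10(Rb) + NF - 174

print(expected_mds(6, 1000))    # 4FSK,  1000 bit/s: -132 dBm
print(expected_mds(6, 10000))   # 4FSK, 10000 bit/s: -122 dBm
print(expected_mds(8, 5000))    # 2FSK,  5000 bit/s: -123 dBm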
FSK_LDPC can scale to any bit rate you like. The ratio of the sample rate to symbol rate Fs/Rs = 8000/1000 (8kHz, 1000 bits/s) is the same as Fs/Rs = 800000/100000 (800kHz, 100k bits/s), so it’s the same thing to the modem. I’ve tried FSK_LDPC between 5 and 40k bit/s so far.
With a decent LNA in front of the RTLSDR, I measured MDS figures about 4dB lower at each bit rate. I used a rate 0.5 code for the tests to date, but other codes are available (thanks to Bill and the CML library!).
There are a few improvements I’d like to make. In some tests I’m not seeing the 2dB advantage 4FSK should be delivering. Synchronisation is trickier for 4FSK, as we have 4 tones, and the raw modem operating point is 2dB further down the Eb/No curve than 2FSK. I’d also like to add some GFSK style pulse shaping to make the Tx spectrum cleaner. I’m sure some testing over real world links will also show up a few issues.
It’s fun building, then testing, tuning and pushing through one bug after another to build your very own physical layer! It’s a special sort of magic when the real world results start to approach what the theory says is possible.
Reading Further
[1] Open IP over UHF/VHF Part 1 and Part 2 – my first use case for the FSK_LDPC protocol described in this post.
[2] README_FSK – recently updated documentation on the Codec 2 FSK modem, including lots of examples.
[3] README_data – new documentation on Codec 2 data modes, including the FSK_LDPC mode described in this post.
[4] 4FSK on 25 Microwatts – Bill and I sending 4FSK signals across Adelaide, using an early GNU Octave simulation version of the FSK_LDPC mode described in this post.
[5] Bill’s LowSNR blog.
[6] Coded Modulation Library Overview – CML is a wonderful library that we are using in Codec 2 for our LDPC work. Slide 56 tells us the theoretical minimum Eb/No for coded FSK (about 8dB for 2FSK and 6dB for 4FSK).
[7] 4FSK LLR Estimation Part 2 – GitHub PR used for development of the FSK_LDPC mode.
Hi David :),
Very impressive!! I am hoping to figure out some simple way to log in to my factory 10 miles away more or less reliably via RF. The path is not line of sight :(. I am wondering if some frequency agile transceivers on each end could find and maintain a path.
I am guessing it might work :).
New e-mail. Old one was getting too full!
warm regards,
John – Concord NH
Great development – I was just looking up your README_FSK a couple of days before this was posted.
What penalty would you expect if, for simplicity/laziness, one would still use a chipset trx for the bitstream? As you mentioned soft decision won’t be available, but I can’t really imagine how much of a difference that would make.
Also I think for now it would be interesting to be able to interface with existing chipset based counterparts over the air. I’m just wondering if an ARM MCU plus a minimal kind of SDR (tinySDR, maybe even a slimmed down version of it) is still consuming more power than a ready-made chipset like something from TI’s CC2xxx series or nRF52xxx etc.
But yep, in the future free modems without any restrictions from silicon vendors should take over the world :)
Hi Diego,
Well the chipset Tx would work OK. For the Rx side there are a few sources of loss: (i) the noise figure (might be OK), (ii) losses in the demod implementation, (iii) the hard decisions would need to be mapped to log likelihood ratios, and you’d need a guess at the SNR. So these would all need to be measured/simulated.
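As a sketch of (iii) – assuming we model the chipset receiver as a binary symmetric channel with crossover probability p (which is where the SNR guess comes in); illustrative only, not what the codec2 demod does:

import numpy as np

def hard_bits_to_llrs(bits, p=0.1):
    # Every hard bit gets the same confidence, set by the assumed error rate p
    llr_mag = np.log((1 - p) / p)
    return (1 - 2 * np.asarray(bits)) * llr_mag   # bit 0 -> +LLR, bit 1 -> -LLR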
Good idea – I’m sure it’s possible to build a simple SDR and hook it up to a uC for the signal processing.
Thanks David for your explanation, very interesting topic.
Actually I recently took out my RC gear and looked into how they do the 2.4GHz stuff, and so thought I’d look up your modem concepts to compare.
As far as I can say they all use several different chipsets for 802.15.4, wireless mice and keyboards, etc. etc. and combine them with an ARM or AVR MCU, for hopping patterns and the RC stuff.
That’s where my idea came from: going one size up in MCU would allow using your more flexible and more efficient modems, but the disadvantage is you can’t use those dirt cheap radios that come on a nice pre-certified module that’s easy to handle. Next step would be finding a chipset which allows sneaking out I/Q or “audio band” IF…
The closest is the AT86RF215IQ, if it weren’t for that 128Mbit LVDS interface, which seems to always run at full speed even though there’s a down sampler on board – either that or a pretty horrible datasheet :)
But well, I’m again drifting from your basic research into application engineering – it seems an engineer can’t ever escape engineering, it’s just too interesting :)
We could also build our own mixers/LO and make a simple IQ down converter to be sampled by a uC.
Having said that, the chipsets are no doubt “good enough” for most applications, and very convenient. Takes a lot of hassles away if they work in your use case (e.g. if you have lots of signal). I just like playing close to the SNR limits.
The good enough mixed with the easy handling…
But interoperability is plain crap unless you can use a standard like IEEE or so.
Happy to see your work taking shape, and as it’s open source it’s great to learn from – something a chipset will never do.
David,
Thanks for the continued great work!
I have a number of questions and comments.
1) I assume you both sync time and frequency during the preamble. Do you continue to sync either of these as each frame is received?
2) I take it your tones (at this time) have no shaping, i.e. they switch from one tone to the next instantly. Is that correct?
3) Are your tones orthogonal? (e.g. 4FSK using -2400, -800, 800, 2400 Hz at 1600 symbols/sec, yielding 3200 bps, has orthogonal tones, resulting in no ISI). Digital Voice systems like P25 and Yaesu System Fusion do not use orthogonal tones, but do use raised cosine pre-filtering, which eliminates ISI.
4) For your simulations, does the SNR (Eb/No) correspond to the raw bit rate, or the information bit rate? (e.g. say it’s 200 bps = 100 bps information / 100 bps FEC. The energy of 200 bits is needed to deliver just 100 bits of information, so the Eb number should not be based on the raw bps, but on the lower information bit rate.)
5) Regarding the slide 56 reference in [6], the 6 dB number for 4FSK: if I understand it correctly, it says that by using appropriate error correcting codes, it’s possible to achieve arbitrarily low error rates at the plotted code rate vs SNR.
Again I wonder if the SNR (Eb/No) reflects the raw bps or the information bps. I read one of the papers referenced on slide 56, but I haven’t been able to determine the answer.
http://web.eecs.umich.edu/~stark/1985D.pdf
Finally, one of the important points in the referenced paper on slide 56 of [6] is that the performance is good not just in AWGN, but also in more realistic Rician and Rayleigh channels.
The paper also claims that by using the appropriate FEC, the resulting system will perform as well as an AWGN system (with no FEC) at only 1.35dB more signal.
From the paper:
Here we show that the loss due to fading can be reduced to approximately 1.35dB by the use of optimal codes.
Comment:
Regarding the performance relative to other chipsets.
The Kenwood P25 transceiver has a sensitivity of 0.25 microvolts (-119 dBm) at a BER of 5% at 9600 bps.
https://comms.kenwood.com/common/pdf/download/TK-5720_5820_Specsheet.pdf
Thanks again for another great post.
Hi Rick,
Thanks for your kind words.
1) Yes coarse timing (frame sync), fine timing, and freq estimation continue throughout the burst.
2) Nope, hard FSK.
3) Yes, orthogonal – tones are chosen to be at least the symbol rate apart. The 4FSK that C4FM, DMR and friends use puzzles me a bit; I believe they pay a price of several dB for having tones close together compared to an ideal non-coherent FSK scheme. I figure you can’t get that bandwidth saving for free. We simulated those modems here and couldn’t get closer than about 5dB from ideal non-coherent (orthogonal) FSK. Perhaps they know some more tricks – at the time I couldn’t find any studies on those modems. I also guess RF bandwidth and low adjacent channel interference are more important than power efficiency for those services, so they’re happy to take a hit of a few dB.
4) Eb is per information bit. So for the rate 0.5 code used above, the raw uncoded 4FSK modem was running at Eb/No = 3dB (nearly 10% raw BER). Good point – I’m not sure about multipath channels; an extra hit of just 1.35dB would indeed be nice.
5) Yes, it seems a lot of DV and indeed analog FM systems quote an MDS at around -120dBm. It would be interesting to actually MDS test one of those radios. It’s tricky to compare uncoded systems with coded systems – e.g. we wouldn’t get much data through at a 5% BER, and to get a decent PER you would need a pretty strong code, say rate 0.5, which means we’re down to -116dBm for 9600 information bits/s, and then you would need soft decision to make it work, and so on.
Cheers,
David
Look at slide 57 – it shows the minimum SNR for non-coherent FSK. If I understand the graph correctly, P25 would be about h = 0.3, corresponding to ~12 dB vs ~6 dB for orthogonal (h = 1).
So I believe this is consistent with your previous results. I believe this is for coded transmissions, not raw (see following slides).
Thanks for pointing that out Rick – I hadn’t noticed slide 57 before!! Great to have a theoretical framework around non-orthogonal FSK so we can better understand the trade offs. I’m pleased that it matches our simulations.
The MSK (h=0.5) results are interesting, as MSK can operate quite a bit lower than non-coherent FSK when coherently demodulated.
Looking at slide 56 again I can see our operating point Eb/Nos (8 and 6dB) are about 1dB off theory, not bad for the code length we have.
That CML library sure is a fine piece of work.
Hi Dave and Rick,
I don’t think there are any special tricks in commercial C4FM. I think they’re just using ordinary FM discriminators, presumably adjusted to the (narrower) bandwidth. That means performance will be roughly comparable to analog FM of the same bandwidth. FEC can only help a little because FM output SNR drops so sharply below a pre-demod SNR of 10 dB or so that the FEC can’t keep up. That’s the reason for the U-shaped curve in Stark’s paper showing that the optimal code rate above noncoherent modulation is roughly 0.4-0.5 depending on the fading model. This is something I learned in the early days of Qualcomm CDMA (the return channel code rate is 1/3) and applied in my design of the AO-40/Funcube and ARISSat-1 telemetry schemes.
I’ve also been reading up on and experimenting with FM threshold extension, which can buy you maybe 2 dB. Not great, but worthwhile. NASA threw some money at this problem during Apollo because the lunar rover TV was FM right at the threshold and “sparklies” were often visible.
A carefully designed PLL demodulator is the usual approach but I’ve also been playing with my own extension algorithm. It’s based on the fact that the “popcorn” noise you hear from FM near threshold occurs whenever noise takes the IF signal vector close to the origin, making the angle determination unreliable. This happens even on AWGN because of the nonlinearity of the demodulation function; as the amplitude decreases, small variations in the noise are magnified into larger and larger errors in the arctangent output. (The moral is to not limit FM before detection; the amplitude may be noise but it’s still useful.)
So if the instantaneous amplitude of the IF signal is less than about 40-55% (empirically determined) of the average amplitude, I blank that sample and interpolate the demodulated output. It’s not a panacea (FM threshold extension never is) but it does seem to make a qualitative improvement.
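Roughly, in sketch form (illustrative only, not the actual implementation):

import numpy as np

def blank_and_interpolate(iq, demod, frac=0.5):
    # Where the IF envelope drops below ~40-55% of its average the angle is
    # unreliable, so drop that demodulated sample and interpolate across it.
    amp = np.abs(iq)
    bad = amp < frac * np.mean(amp)
    idx = np.arange(len(demod))
    out = np.array(demod, dtype=float)
    out[bad] = np.interp(idx[bad], idx[~bad], out[~bad])
    return out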
I haven’t tried it with digital data but it should help there too. I’d be especially interested in any quantitative comparisons with the more classic forms of FM threshold extension.
Hi Phil – fine business on your threshold extension work. That’s a good point about throwing away amplitude information; I have encountered something similar – a performance hit when coarsely quantising the input samples to an FSK demod.
I’m a modem nerd, so non-optimal waveforms like C4FM and AFSK over FM bother me. However most Hams don’t seem to mind, and indeed some of the latest work (M17, New Packet Radio) uses FSK shifts similar to C4FM, chipset or analog FM demods, and hard decision FEC. Other factors come into play for Hams, like the availability of hardware and spectral efficiency, rather than power efficiency.
You and me both — I’m really bothered when I *know* something could be done better. Maybe some demos showing how to do it right would have an effect.
What do you think of pi/4 QPSK? Unlike C4FM it seems reasonably efficient and I think it’s the next “phase” of APCO digital mobile radio. The pi/4 shift is to avoid 180 deg transitions through the origin. I’m thinking I could implement it as 8-PSK with trellis coding to allow only 4 transitions.
Hi Phil, yes a few posts back http://www.rowetel.com/?p=7273 we demoed a few hundred bits/s over a city path at 25uW Tx power – that was good fun. I’d also like to move this forward to something close to “plug and play” so others can experiment. We’re also pretty close to running FreeDV (or say M17) on a Pi+RTLSDR.
Would pi/4 (or offset) QPSK make it through a class C PA OK? I’m also pretty keen on coherently detected (G)MSK, as it’s as good as PSK in terms of SNR.