Binary Telemetry Protocol

Last week I tagged along on a Project Horus balloon launch with Mark, VK5QI. The purpose of this launch was to test a new balloon release and telemetry system that uses the closed source Lora chipset. We had an enjoyable day driving about the Adelaide Hills tracking the balloon, then DF-ing the payload on the ground.

The balloon also flew the RTTY based telemetry system. To receive the RTTY telemetry I used the fsk_horus.m FSK modem developed in October. This modem has near ideal performance in converting radio signals to binary digits. However its performance is limited by the RTTY protocol.

On the way home Mark suggested we fly another balloon in a few days, and we decided to try a new, binary protocol with the ideal modem. A furious two days of coding and integration ensued, but we managed to develop a Horus Layer 2 protocol in C, get it running on the payload, and integrate with the HabHub tracking system.

On Saturday 2 Jan 2016 we launched and it worked really well! Here is a plot of the balloons path:

Even with a very weak signal (we could just hear it on the SSB radios), the binary protocol was pulling packets with valid checksums out of the noise. Here is the telemetry from this sample of the received signal we recorded from the Mt Barker home of VK5FJ:
HORUS,1204,02:57:53,-34.)0819,539.59149 95 6 72,9,-3;,1416 CRC BAD
HORUS,12 5,02858:05,,34.90794,139.59418,9601,71,1,-13,1613 CRC BAD
HORUS,1210,02:59:05,-34 CRC BAD
HORUS,1211,02:59:17,-34.90725,139.61046,9697,74,9,-12,1408 CRC OK
HORUS,1212,02:59:29,-34.90722,139.61319,9714,76,9,-12,1418 fixed
HORUS,1213,02:59:41,-34.90720,139.61592,9729,74,9,-12,1418 fixed

1,1202,02:57:29,-34.908298,139.984512,9556,67,203,-13,156,adbf CRC BAD
1,1204,02:57:53,-34.908089,139.591492,9586,72,9,-13,156,adb3 CRC OK
1,1205,02:58:05,-34.907940,139.594177,9601,71,9,-13,154,999c CRC OK
1,1207,02:58:29,-34.907639,139.599594,9633,74,9,-13,155,9b28 CRC OK
1,1210,02:59:05,-34.907299,139.607758,9683,73,9,-12,154,27eb CRC OK
1,1211,02:59:17,-34.907249,139.610458,9697,74,9,-12,154,8fb9 CRC OK
1,1212,02:59:29,-34.907219,139.613190,9714,76,9,-12,155,a2e8 CRC OK
1,1213,02:59:41,-34.907200,139.615921,9729,74,9,-12,156,b378 CRC OK

The payload is transmitting RTTY and binary packets. The lines starting with “HORUS” come from the RTTY protocol. The lower lines starting with “1” from the new binary protocol. The binary protocol was delivering packets at Eb/No as low as 6dB (SNR in 3000Hz of -9dB) at 100 bit/s.

Here are some packets from the very end of the flight, from a sample provided by VK5EI in Adelaide:
HORUS,2513,07:19:41,-35.12791,140.72295,7992,50,9,-14,1393 CRC OK
HORUS,2514,07:19:53,-35.12800,140.72472,7838,49,9,-14,1386 CRC OK
HORUS,2515,07:20:05,-35.12794,140.72639,7680,43,9,-13,1395 CRC OK
HOR-(SMJJIRKH IKANHS H )12780,140VHIHCN@HHH0,38,9,-13,1400 CRC BAD
HORUS,2517,07 MOEMBA LJ@N HIIS K !72926,738C I SD PLM#! (1 CRC BAD

1,2513,07:19:41,-35.127911,140.722946,7992,50,9,-14,151,f565 CRC OK
1,2514,07:19:53,-35.127998,140.724716,7838,49,9,-14,151,c1ac CRC OK
1,2515,07:20:05,-35.127941,140.726395,7698,43,9,-13,150,0634 CRC BAD
1,2517,07:20:29,-0.000000,26334306.000000,3671,108,1,84,128,a66f CRC BAD

The payload was 300km to the East, and disappearing behind the Mt Lofty ranges as it descended beneath 7700m. Once again RTTY at the top, binary at the bottom. In this case RTTY managed to decode the last packet. The following plot helps explain why:

This is a plot of the output energy from the two FSK filters inside the demodulator. The gap between them is a measure of signal quality or SNR. The x-axis is the time in “bits”, there are 100 bit/s. At the start of this sample, the signal is very clean. Then at about bit 25000 it disappears abruptly into the noise, and by bit 26000 it is gone. One thousand bits is about the time it takes to send one RTTY and one binary packet. Once the signal is gone completely, neither protocol can do much with it.

Overall, a very satisfying result, especially on top of the “ideal” FSK modem development from October. I feel like we are pushing the art of open source telemetry forward. While useful and fun for balloon work, this work has far wider applications, such as IoT.

Protocol Design

Mark provided a packed binary structure for the payload data. I put some thought into the protocol design, carefully considering the use case. This is a lesson I learned from FreeDV – where “voice is not like data”.

I realised that with balloon telemetry data, losing a few packets is OK. When floating along at high altitude the last packet is often very similar to the next one. However it is really important to get some packets through. We don’t want the link to fall over entirely, but can tolerate a high packet error rate.

However when the payload is descending rapidly and close to the ground, reliably receiving packets every few seconds is important. It gives you a good chance of finding the payload on the ground.

The new binary protocol consists of a 16-bit unique word for finding the start of the packet, 176 bits (22 bytes) of binary payload data, which are protected by a (23,11) Golay block code to give a total packet size of 360 bits.

Given we have a few 100 bits of payload data I estimated a Bit Error Rate (BER) of 1E-3 would give us a fair chance of getting a packet through. Add a rate 1/2 code and we can handle a few % BER. Here is the unit test output:
$ gcc horus_l2.c -o horus_l2 -Wall -DHORUS_L2_UNITTEST $ ./horus_l2 test 0: BER: 0.00 ...........: 0
test 1: BER: 0.01 ...........: 0
test 2: BER: 0.05 ...........: 0
test 3: BER: 0.10 ...........: 10

OK, so it’s correcting (0 bit errors after decode) at random BER up to 5%. The Golay (23,11) code can correct 3 errors in a 23 bit codeword which (IIRC) means it falls over at about BER=0.08. This channel is arguably perfect AWGN – a balloon 30km in the air with a line of sight path to our receivers. So random (rather than burst) errors is a reasonable channel model.

Looking at the Eb/No versus BER curves for ideal 2FSK we get a BER of 0.05 at an Eb/No of 6.5dB. So thats where we would expect our binary protocol to fall over. Which is exactly what happened in our tests.

A BER of 1E-3 after FEC decoding is rather high. Its a region where FEC codes don’t work too well. I compared a rate 1/2 convolutional code at the same operating point. It had the same coding gain as the Golay block code (which is just 1.5dB). So might as well use the much simpler block code.

In fact, I am wondering if FEC helps us at all. We may be better off just sending the binary data at half the bit rate, and getting a 3dB increase in our energy/bit. Or send it twice, then combining the received symbols (diversity). More research required.

As I discovered with FreeDV, FEC is not a panacea. Simply slapping FEC onto your system without considering the requirements is naive.

The RTTY protocol has long packets of about 600 hundred bits and almost no protection from bit errors. So we could argue it requires a BER of 1E-3, which is an Eb/No of 10.5dB. This means our binary protocol has a “gain” of around 4dB. I haven’t confirmed this, but suspect most of this gain is from simply having a shorter packet.

However the real world improvement with the binary protocol was significant. There were many times during the cruise phase of the flight where it reliable returned packets when the RTTY protocol experienced problems. So perhaps there are other sources of bit errors that mean a little FEC helps a lot.

The Horus Binary protocol is implemented in a single C file horus_l2.c. Using #defines, the encoder/tx side can be compiled down to a very small module that will run on a tiny 8-bit uC. It can also be compiled to run unit tests, or as the decoder/rx side for the ground station.

Towards Open Source Telemetry

I’m interested in developing an open source telemetry system in 2016, and think we can outperform closed source systems such as Lora because … open source. In October we developed an “ideal” FSK Modem, now we have experience and good results with a protocol. Here is a work flow diagram for the project:

The fsk_horus.m modem needs to be ported to C, converted to fixed point, and then run on a modest uC which will give us a complete, open source telemetry system.

One important step is some simple, low cost, radio hardware. Not a chipset, but our very own open radio hardware. I have prototyped some of the radio already – and received 440MHz signals using a Si5351, NE602, and a few transistors (block diagram above)

Open Source – When Experts Collide

As our balloon was wafting about South Australia I was admiring the HabHub software. Some web developers really know their stuff and now I enjoy the benefits of that. I have no idea how to make nice web sites.

It dawned on me that what Mark and I are doing is applying our expertise to the physical layer of the system – modems and radio hardware. The web developers, smart as they are, would be amazed by our skills in that area. However we can link our code to theirs in a few minutes – no NDAs, no permission required. No road blocks to our innovation.

I keep seeing (and then demonstrating) large gains in modems – HF digital voice, VHF digital voice, and now telemetry. As I explore assumptions (“you can’t violate OSI model layers”, “you must have FEC”, “Chipset XXX is the best”, “you can’t build your own codec/modem/radio hardware”, “DSP must run on custom hardware”) I find many of them misleading or plain wrong.

In real terms – performance – the incumbent closed source systems have been crippled by the fact they are closed source. Then they tell us “you can’t play there”. Wrong.

Even the RF hardware is now “opening up” – I managed to get a prototype telemetry Rx working in a few hours on my bench. It’s not scary when you know how. Just open up the black box and peer inside. Refuse to accept the black box is all there is. Don’t stop until you hit the laws of physics.

Just like nuclear fusion – push together a few domain experts and a great deal of energy is released.

Further Work

Mark pointed out we need the new modem and protocol into a form usable by end users. We are currently piping a bunch of scripts together written in GNU Octave, C, and python. Fantastic for rapid prototyping but the end users need a cross platform GUI application like fldigi or FreeDV.

The current system sends RTTY/Binary packets one after the other. The RTTY packets take 70% of the time. With just the binary protocol we could get 3 times the packet rate which would improve the likelihood of getting valid packets.

An interleaver may also help, for times when there are burst errors. I have tested an initial version but it doesn’t separate the bits enough. More work needed.

It was unclear if some long strings of 1’s and 0’s were upsetting the fsk_horus.m frequency and timing offset estimators. More work needed to determine if this is a real problem. The interleaver would help, and we could always use a scrambler if it turns out to be a real problem.

Halving the bit rate to 50 baud would give us 3dB, and still an acceptable a packet update rate. Using 4FSK rather than 2FSK gives us another 3dB, so in total that’s an easy 6dB gain. 4FSK is possible to generate using more or less the current payload hardware (you might need two GPIOs bits driving a VCO). In a line of sight channel 6dB is double the range. In terms of transmit power that’s like having 4 times the transmit power. That may be “enough”; at 30km altitude the curvature of the Earth may obscure the signal first before you run out of link budget!

6 thoughts on “Binary Telemetry Protocol”

  1. “The web developers, smart as they are, would be amazed by our skills in that area.”

    In defence of the HabHub web developers, some of them are doing PhDs in communications and information theory, and have flown stuff with state of the art error correction codes, and so on. It’s more just a question of where you focus your energy to make things better (and easy to implement) for the community as a whole, where having an advanced degree [as a pre-req for good telemetry] isn’t an option. The web stuff is pretty good now, so reliable two-way comms for everyone would be an excellent area for those with skills and, crucially, time/inclination to advance and then package up nicely so it can be easily adopted outside the nerdier core.

    I’m completely with you on open source, and how the only limits you should worry about are the ones imposed by physical laws. True for all engineering endeavors.

    1. I’m even more impressed if they can do great Web development and also hold PhDs in communications!

  2. If you back off the Golay coding, the CRC could also be used for error correction, though the CRC-16-CCITT provides only Hamming Distance 4 (to 32751 bits)
    If you’re constrained by the radio hardware providing the CRC (ie SiLabs Si1000 ISM like Tridge’s SiK), you may be able to use it with CRC-32 IEEE 802.3 (inbuilt in the Si1000) which will give you Hamming distance 6 to 268 bits.
    If you can choose the CRC, KoopMan gives CRC-24K/7 with HD 7 to 231 bits, and CRC-24/4 with HD 6 to 822 bits. Using a 4-bit array lookup CRC algorithm would cost only 3*16 = 48 bytes for the array.
    Depending on the transmitter micro, the SiK SDCC approach may work.

    The website links to Koopman’s Checksum & CRC blog.
    Known zero bits in the timestamp fields could be used for correction.

    1. Thank you – some great ideas. In the previous post in this series I describe a soft decision bit-flipping approach that’s sounds similar to your suggestion. The “fixed” text in the ASCII protocol example above is the algorithm in action.

      However I hadn’t thought about it in terms of Hamming distance, or that other CRCs could give better results. So nice to see this technique has been “engineered”. I wonder how this technique compares to other FEC codes, e.g. in terms or rate and coding gain? Reminds me of LPDC parity checks.


      If you’re constrained by the radio hardware

      …. that is exactly whey we are building an open source telemetry radio – no more constraints!

  3. Hi David,

    p.s.- FWIW your website has stopped allowing commenting from Dillo! So I have to now use a PIG browser :(.

    FANTASTIC. Most impressive.

    I wonder if it might make sense to broadcast sequentially different types of packets from a live balloon to see which works best under real field conditions?

    And I bet a cheap telephone could be taught to decode whatever.

    Which brings up an interesting question. A cheap telephone in addition to a user processor must have a pretty good RF processor along with receive and transmit electronics and some not to shoddy peripherals. Further the RF stuff ‘must’ be visible from the main processor and is probably also patchable from there.

    Wouldn’t it be neat to hack a cheap telephone into a ham rig. I bet there are folks around this blog that could supply enough information to get things started?

    Dual boot – telephone or Linux RF box :).

    Lots of fun :).


  4. The 24-bit CRC was assuming an 8-bitter like an 8051(?) as the transmitter processor.
    If using a more capable processor, then CRC-32/8 gives HD 8 to 992 bits if I read the table correctly.

Comments are closed.