WaveNet and Codec 2

Yesterday my friend and fellow open source speech coder Jean-Marc Valin (of Speex and Opus fame) emailed me with some exciting news. W. Bastiaan Kleijn and friends have published a paper called “Wavenet based low rate speech coding”. Basically they take the bit stream of Codec 2 running at 2400 bit/s, and replace the Codec 2 decoder with the WaveNet deep learning generative model.

What is amazing is the quality – it sounds as good as an 8000 bit/s wideband speech codec! They have generated wideband audio from the narrowband Codec 2 model parameters. Here are the samples – compare “Parametrics WaveNet” to Codec 2!

This is a game changer for low bit rate speech coding.

I’m also happy that Codec 2 has been useful for academic research (Yay open source), and that the MOS scores in the paper show it’s close to MELP at 2400 bit/s. Last year we discovered Codec 2 is better than MELP at 600 bit/s. Not bad for an open source codec written (more or less) by one person.

Now I need to do some reading on Deep Learning!

Reading Further

Wavenet based low rate speech coding
Wavenet Speech Samples
AMBE+2 and MELPe 600 Compared to Codec 2

Lithium Cell Amp Hour Tester and Electric Sailing

I recently electrocuted my little sail boat. I built the battery pack using some second hand Lithium cells donated by my EV. However after 8 years of abuse from my kids and me those cells are of varying quality. So I set about developing an Amp-Hour tester to determine the capacity of the cells.

The system has a relay that switches a low value power resistor (OK some coat hanger wire) across the 3.2V cell terminals, loading it up at about 27A, roughly the cruise current for my e-boat. It’s about 0.12 ohms once it heats up. This gets too hot to touch but not red hot, it’s only 86W being dissipated along about 1m of wire. When I built my EV I used the coat hanger wire load trick to test 3kW loads, that was a bit more exciting!

The empty beer can in the background makes a useful insulated stand off. Might need to make more of those.

When I first installed Lithium cells in my EV I developed a charge controller for my EV. I borrowed a small part of that circuit; a two transistor flip flop and a Battery Management System (BMS) module:

Across the cell under test is a CM090 BMS module from EV Power. That’s the good looking red PCB in the photos, onto which I have tacked the circuit above. These modules have a switch that opens when the cell voltage drops beneath 2.5V.

Taking the base of either transistor to ground switches on the other transistor. In logic terms, it’s a “not set” and “not reset” operation. When power is applied, the BMS module switch is closed. The 10uF capacitor is discharged, so provides a momentary short to ground, turning Q1 off, and Q2 on. Current flows through the automotive relay, switching on the load to the battery.

After a few hours the cell discharges beneath 2.5V, the BMS switch opens and Q2 is switched off. The collector voltage on Q2 rises, switching on Q1. Due to the latching operation of the flip flop, it stays in this state. This is important, as when the relay opens, the cell will be unloaded, its voltage will rise again, and the BMS module switch will close. In the initial design without a flip flop, this caused the relay to buzz as the cell voltage oscillated about 2.5V as the relay opened and closed! I need the test to stop and stay stopped – it will be operating unattended so I don’t want to damage the cell by completely discharging it.
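The difference between the buzzing design and the latched one can be sketched with a toy simulation. This is only an illustration in Python: the 2.5V cut-off comes from the text, but the starting voltage, the sag under load, the per-step drop and all names are made-up assumptions, not measurements from the real circuit.

```python
BMS_CUTOFF_MV = 2500  # BMS switch opens below 2.5V (from the text)

def run_discharge(latched, steps=20, start_mv=2800, sag_mv=200, drop_mv=50):
    """Return (relay_transitions, relay_on) after a simulated discharge.

    start_mv, sag_mv and drop_mv are illustrative values only.
    """
    open_circuit_mv = start_mv
    relay_on = True
    tripped = False
    transitions = 0
    for _ in range(steps):
        # Under load the terminal voltage sags below the open circuit voltage.
        cell_mv = open_circuit_mv - sag_mv if relay_on else open_circuit_mv
        bms_closed = cell_mv >= BMS_CUTOFF_MV
        if latched:
            # Flip flop: once the BMS switch has opened, stay off for good.
            tripped = tripped or not bms_closed
            want_on = not tripped
        else:
            # No flip flop: the relay simply follows the BMS switch.
            want_on = bms_closed
        if want_on != relay_on:
            transitions += 1
            relay_on = want_on
        if relay_on:
            open_circuit_mv -= drop_mv  # the cell only discharges while loaded
    return transitions, relay_on

print("latched:  ", run_discharge(latched=True))
print("unlatched:", run_discharge(latched=False))
```

With the latch the relay changes state once and stays off; without it the relay toggles repeatedly as the unloaded cell voltage recovers above the cut-off.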

The LED was inserted to ensure the base voltage on Q1 was low enough to switch Q1 off when Q2 was on (Vce of Q2 is not zero), and has the neat side effect of lighting the LED when the test is complete!

In operation, I point a cell phone taking time lapse video of the LED and some multi-meters, and start the test:

I wander back after 3 hours and jog-shuttle the time lapse video to determine the time when the LED came on:

The time lapse feature on this phone runs in 1/10 of real time. For example Cell #9 discharged in 12:12 on the time lapse video. So we convert that time to seconds, multiply by 10 to get “seconds of real time”, then divide by 3600 to get the run time in hours. Multiplying by the discharge current of 27(ish) Amps we get the cell capacity:

  12:12 time lapse, 27*(12*60+12)*10/3600 = 55AH

So this cell’s a bit low, and won’t be finding its way onto my boat!
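The same arithmetic as a small Python helper, in case you have a pile of cells to work through. The 27A load and the 10x time lapse factor are the figures from the text; the function and parameter names are just for illustration.

```python
def cell_capacity_ah(timelapse_min, timelapse_sec, current_a=27.0, speedup=10):
    """Capacity in Amp-Hours from the time lapse reading when the LED lit."""
    real_seconds = (timelapse_min * 60 + timelapse_sec) * speedup
    return current_a * real_seconds / 3600.0

# Cell #9: LED came on at 12:12 on the time lapse video
print(round(cell_capacity_ah(12, 12)))
```

Since the load current is only "27(ish)" Amps, the result is an estimate rather than a calibrated measurement.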

Another alternative is a logging multimeter; one could even measure and integrate the discharge current over time. Or I could have just bought or borrowed a proper discharge tester, but where’s the fun in that?

Results

It was fun to develop, a few Saturday afternoons of sitting in the driveway soldering, occasional burns from 86W of hot wire, and a little head scratching while I figured out how to take the design from an expensive buzzer to a working circuit. Nice to do some soldering after months of software based DSP. I’m also happy that I could develop a transistor circuit from first principles.

I’ve now tested 12 cells (I have 40 to work through), and measured capacities of 50 to 75AH (they are rated at 100AH new). Some cells have odd behavior under load; dipping beneath 3V right at the start of the test rather than holding 3.2V for a few hours – indicating high internal resistance.

My beloved sail e-boat is already doing better. Last weekend, using the best cells I had tested at that point, I e-motored all day on varying power levels.

One neat trick, explained to me by Matt, is motor-sailing. Using a little bit of outboard power, the boat overcomes hydrodynamic friction (it gets moving in the water) and the sail is moved out of stall (like an airplane wing moving to just above stall speed). This means the boat moves a lot faster than under motor or sail alone in light winds. For example the motor was registering just 80W, but we were doing 3 knots in light winds. This same trick can be done with a stink-motor and dinosaur juice, but the e-motor is completely silent, we forgot it was on for hours at a time!

Reading Further

Electric Car BMS Controller
New Lithium Battery Pack for my EV
Engage the Silent Drive
EV Bugs

Testing HAB Telemetry Protocols

On Saturday Mark and I had a pleasant day bench testing High Altitude Balloon (HAB) Telemetry protocols and demodulators.

Project Horus HAB flights use a low power transmitter to send regular updates of the balloon’s position and status. To date, this has been sent using RTTY, and demodulated using Fldigi, or a special version modified for HAB work called dl-Fldigi.

Lora is becoming common in HAB circles, however I am confident we can do better using a custom protocol and well engineered, and most importantly – open source – modems. While very well designed and conveniently packaged, Lora is not magic – modem performance is defined by physics.

A few years ago, Mark and I developed and flight tested a binary protocol (Horus Binary) for HAB flights. We have dusted this off, and I’ve written a C callable API (horus_api.c) to make Horus RTTY and Binary easy to use. The plan is to release a cross platform GUI application that supports Horus Binary, so anyone with a SSB receiver can join in the fun of tracking Horus flights using Horus Binary.

A good HAB telemetry protocol works at low SNRs, and has fast updates to allow accurate positioning of the payload during the final descent. A way of measuring the performance is Packet Error Rate (PER) – how many telemetry packets get through at a given Signal to Noise Ratio (SNR).

So we generated some synthetic Horus RTTY and Binary packets at calibrated SNRs using GNU Octave simulation code (fsk_horus.m), then played the wave files through several modems.

Here are the results (click for a larger version):

The X-axis is in Eb/No, which is proportional to SNR:

  SNR(dB) = Eb/No(dB) + 10log10(Rb/BW)

where Rb is the bit rate and BW is the noise bandwidth you want to measure SNR in. Eb/No is handy as it normalises for the effect of bit rate and noise bandwidth, making modem comparison easier.
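As a quick sketch, the Eb/No to SNR conversion above can be written in a few lines of Python. The function name and the 3000 Hz default noise bandwidth are choices made for this example.

```python
import math

def snr_db(ebno_db, rb, bw=3000.0):
    """SNR in dB for a given Eb/No (dB), bit rate rb (bit/s) and noise bandwidth bw (Hz)."""
    return ebno_db + 10.0 * math.log10(rb / bw)

# 100 bit/s RTTY and 200 bit/s binary, both measured in a 3000 Hz noise bandwidth
for name, ebno, rb in [("RTTY, Eb/No 13.0dB", 13.0, 100),
                       ("Binary, Eb/No 4.5dB", 4.5, 200)]:
    print("%s: SNR = %.2f dB" % (name, snr_db(ebno, rb)))
```

Changing bw lets you quote the same modem performance in whatever noise bandwidth your receiver or test equipment uses.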

Protocol              dl-Fldigi RTTY   Fldigi RTTY   Horus RTTY   Horus Binary
Eb/No (50% PER), dB   13.0             12.0          11.5         4.5
Rb, bit/s             100              100           100          200
SNR (3000Hz), dB      -1.7             -2.7          -3.2         -7.2
Packet Duration, s    6                6             6            1.6
Wave File             Listen           Listen        Listen       Listen

Discussion

The older dl-Fldigi is a few dB behind the more modern Fldigi. Our Horus RTTY and especially Binary protocols are doing very well. At the same bit rate (Eb/No curve), Horus Binary is about 9dB ahead of dl-Fldigi, which is a very useful gain; at least double the Line of Sight (LOS) range, and equivalent to roughly 8x the transmit power. The Binary packets are fast as well, allowing for rapid position updates in the final descent.

Trade offs are possible, for example if we slowed Horus Binary to 50 bits/s, its packet duration would be 6.4s (about the same as RTTY), however 50% PER would occur at a SNR of -13dB, an improvement of about 11dB over dl-Fldigi.
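These trade offs are easy to explore numerically. The Python sketch below assumes the Eb/No needed for 50% PER stays the same when the bit rate changes, and that LOS range follows free space (inverse square law) propagation; both are simplifying assumptions, and all names are just for this example.

```python
import math

def power_ratio(gain_db):
    """Transmit power ratio equivalent to a gain in dB."""
    return 10.0 ** (gain_db / 10.0)

def los_range_factor(gain_db):
    """Range multiplier for a gain in dB, assuming free space path loss."""
    return 10.0 ** (gain_db / 20.0)

def slow_down(ebno_db, rb, duration_s, new_rb, bw=3000.0):
    """Packet duration (s) and SNR (dB) after changing the bit rate to new_rb."""
    new_duration = duration_s * rb / new_rb
    new_snr_db = ebno_db + 10.0 * math.log10(new_rb / bw)
    return new_duration, new_snr_db

gap_db = 13.0 - 4.5  # dl-Fldigi RTTY vs Horus Binary Eb/No from the table
print("power ratio %.1fx, LOS range %.1fx" % (power_ratio(gap_db), los_range_factor(gap_db)))

duration, snr = slow_down(4.5, 200, 1.6, 50)
print("50 bit/s: %.1f s packets, 50%% PER at %.1f dB SNR" % (duration, snr))
```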

Reading Further

Project Horus
Binary Telemetry Protocol
All Your Modem are Belong To Us
SNR and Eb/No Worked Example

Measuring SDR Noise Figure in Real Time

I’m building a sensitive receiver for FreeDV 2400A signals. As a first step I tried a HackRF with an external Low Noise Amplifier (LNA), and attempted to measure the Noise Figure (NF) using the system Mark and I developed two years ago.

However I was getting results that didn’t make sense and were not repeatable. So over the course of a few early morning sessions I came up with a real time NF measurement system, and wrinkled several bugs out of it. I also purchased a few Airspy SDRs, and managed to measure NF on them as well as the HackRF.

It’s a GNU Octave script called nf_from_stdio.m that accepts a sample stream from stdio. It assumes the signal contains a sine wave test tone from a calibrated signal generator, and noise from the receiver under test. By sampling the test tone it can establish the gain of the receiver, and by sampling the noise spectrum it can estimate the noise power.
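The idea behind the measurement can be sketched as follows. This is not code from nf_from_stdio.m, just a minimal Python illustration of the tone-plus-noise method with hypothetical names; it assumes a thermal noise floor of about -174 dBm/Hz (room temperature) and that the tone and noise powers are measured in the same arbitrary dB units at the receiver output.

```python
import math

THERMAL_NOISE_DBM_HZ = -174.0  # approximate kT at room temperature (assumed)

def noise_figure_db(tone_in_dbm, tone_out_db, noise_out_db, noise_bw_hz):
    """Estimate NF from a calibrated test tone and the output noise power.

    tone_in_dbm  : calibrated tone power at the receiver input (dBm)
    tone_out_db  : measured tone power at the receiver output (dB, arbitrary ref)
    noise_out_db : measured noise power in noise_bw_hz at the output (same ref)
    """
    gain_db = tone_out_db - tone_in_dbm                 # receiver gain from the tone
    noise_density_out = noise_out_db - 10.0 * math.log10(noise_bw_hz)
    noise_density_in = noise_density_out - gain_db      # refer noise to the input
    return noise_density_in - THERMAL_NOISE_DBM_HZ

# Made-up numbers: -100 dBm tone measured as 0 dB, noise of -35 dB in 5000 Hz
print("NF = %.1f dB" % noise_figure_db(-100.0, 0.0, -35.0, 5000.0))
```

The key point is that the calibrated tone lets the noise be referred back to the receiver input, where it can be compared with the thermal noise floor.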

The script can be driven from command line utilities like hackrf_transfer or airspy_rx or via software receivers like gqrx that can send SSB-demodulated samples over UDP. Instructions are at the top of the script.

Equipment

I’m working from a home workbench, with rudimentary RF skills, a strong signal processing background and determination. I do have a good second hand signal generator (Marconi 2031), that cost AUD$1000 at a Hamfest, and a Rigol 815 Spec An (generously donated by Mel K0PFX, and Jim, N0OB) to support my FreeDV work. Both very useful and highly recommended. I cross-checked the sig-gen calibrated output using an oscilloscope and external attenuator (within 0.5dB). The Rigol is less accurate in amplitude (1.5dB on its specs), but useful for relative measurements, e.g. comparing cable attenuation.

The NF test method I have used requires a calibrated signal source. I performed my tests at 435MHz using a -100dBm carrier generated from the Marconi 2031 sig-gen.

Usage and Results

The script accepts real samples from a SSB demod, or complex samples from an IQ source. Tune your receiver so that the sinusoidal test tone is in the 2000 to 4000 Hz range as displayed on Fig 2 of the script. In general for minimum NF turn all SDR gains up to maximum. Check Fig 1 to ensure the signal is not clipping, reduce the baseband gain if necessary.

Noise is measured between 5000 and 10000 Hz, so ensure the receiver passband is flat in that region. When using gqrx, I drag the filter bandwidth out to 12000 Hz.

The noise estimates are less stable than the tone power estimate, leading to some sample/sample variation in the NF estimate. I take the median of the last five estimates.
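A median over the last five estimates is simple to implement. Here is a minimal Python sketch (the class name is made up for this example) that keeps a sliding window and returns the median after each new estimate:

```python
from collections import deque
from statistics import median

class MedianSmoother:
    """Median of the most recent `window` estimates."""

    def __init__(self, window=5):
        self.history = deque(maxlen=window)  # old estimates fall off automatically

    def update(self, estimate):
        self.history.append(estimate)
        return median(self.history)

smoother = MedianSmoother()
for nf in [2.0, 2.1, 9.0, 1.9, 2.2]:  # one wild estimate in the middle
    smoothed = smoother.update(nf)
print("smoothed NF = %.1f dB" % smoothed)
```

Unlike a mean, the median ignores the occasional outlier rather than being dragged towards it.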

I tried supplying samples to nf_from_stdio using two methods:

  1. Using gqrx in UDP mode to supply samples over UDP. This allows easy tuning and the ability to adjust the SDR gains in real time, but requires a few steps to set up
  2. Using a “single” command line approach that consists of a chain of processing steps concatenated together. Once your signal is tuned you can start the NF measurements with a single step.

Instructions on how to use both methods are at the top of nf_from_stdio.m

Here are some results using both gqrx and command line methods, with and without an external (20dB gain/1dB NF) LNA. They were consistent across two laptops.

SDR           Gqrx, LNA (dB)   Cmd Line, LNA (dB)   Cmd Line, no LNA (dB)
AirSpy Mini   2.0              2.2                  7.9
AirSpy R2     1.7              1.7                  7.0
HackRF One    2.6              3.4                  11.1

The results with LNA are what we would expect for system noise figures with a good LNA at the front end.

The “no LNA” Airspy NF results are curious – the Airspy specs state a NF of just 3.5dB. So we contacted Airspy via Twitter and email to see how they measured their stated NF. We haven’t received a response to date. I posted to the Airspy mailing list and one gentleman (Dave – WØLEV) kindly replied and has measured noise figures of 4dB using calibrated noise sources and attenuators.

Looking into the data sheets for the Airspy, it appears the R820T tuner at the front end of the Airspy has a NF of 3.5dB. However a system NF will always be worse than the first device, as other devices (e.g. the ADC) also inject noise.
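The way later stages add to the system noise figure is described by the textbook Friis cascade formula, F = F1 + (F2 - 1)/G1 + ..., with all terms in linear units. A minimal Python sketch (function names are made up for this example; it is an idealised model that ignores cable and connector losses, so real measurements will typically come out somewhat higher):

```python
import math

def db_to_lin(db):
    return 10.0 ** (db / 10.0)

def lin_to_db(x):
    return 10.0 * math.log10(x)

def cascade_nf_db(stages):
    """System NF in dB for a list of (gain_db, nf_db) stages, antenna side first."""
    total_f = 0.0
    gain_so_far = 1.0
    for i, (gain_db, nf_db) in enumerate(stages):
        f = db_to_lin(nf_db)
        # Noise from each later stage is divided by the gain in front of it.
        total_f += f if i == 0 else (f - 1.0) / gain_so_far
        gain_so_far *= db_to_lin(gain_db)
    return lin_to_db(total_f)

# A 20dB gain/1dB NF LNA in front of an SDR with a 7.9dB NF (SDR gain not needed here)
print("system NF = %.2f dB" % cascade_nf_db([(20.0, 1.0), (0.0, 7.9)]))
```

The same function shows why the first device sets a floor: adding any noisy stage after it can only increase the total.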

Other possibilities for my figures are measurement error, ambient noise sources at my site, frequency dependent NF, or variations in individual R820T samples.

In our past work we have used Bit Error Rate (BER) results as an independent method of confirming system noise figure. We found a close match between theoretical and measured BER when testing with and without a LNA. I’ll be repeating similar low level BER tests with FreeDV 2400A soon.

Real Time Noise Figure

It’s really nice to read the system noise figure in real time. For example you can start it running, then experiment with grounding, tightening connectors, or moving the SDR away from the laptop, or connect/disconnect a LNA in real time and watch the results. Really helps catch little issues in these difficult to perform tests. After all – we are measuring thermal noise, a very weak signal.

Some of the NF problems I could find and remove with a real time measurement:

  • The Airspy mini is nearly 1dB worse on the front left USB port than the rear left USB port on my X220 Thinkpad!
  • The Airspy mini really likes USB extension cables with ferrite clamps – without the ferrite I found the LNA was ineffective in reducing the NF – being swamped by conducted laptop noise I guess.
  • Loose connectors can make the noise figure a few dB worse. Wiggle and tighten them all.
  • Position of SDR/LNA near the radio and other bench equipment.
  • My magic touch can decrease noise figure! Grounding effect I guess?

Development Bugs

I had to work through several problems before I started getting sensible numbers. This was quite discouraging for a while as the numbers were jumping all over the place. However it’s fair to say measuring NF is a tough problem. From what I can Google it’s an uncommon measurement for people in home workshops.

These bugs are worth mentioning as traps for anyone else attempting home NF measurements:

  1. Cable loss: I found a 1.5dB loss in some cable I was using between the sig gen and the SDR under test. I measured the loss by comparing a few cables connected between my sig gen and spec an. While the 815 is not accurate in terms of absolute calibration (rated at 1.5dB), it can still be used for comparative measurements. The cable loss can be added to the calculations or just choose a low loss cable.
  2. Filter shape: I had initially placed the test tone under 1000Hz. However I noticed that the gqrx signal had a few dB of high pass filtering in this region (Fig 2 below). Not an issue for regular USB demodulation, but a few dB really matters for NF! So I moved the test tone to the 2-4kHz region where the gqrx output was nice and flat.
  3. A noisy USB port, especially without a clamp, on the Airspy Mini (photo below). Found by trying different SDRs and USB ports, and finally a clamp. Oh Boy, never expected that one. I was connecting the LNA and the NF was stuck at 4dB – swamped by noise from the USB Port I guess.
  4. Compression: Worth checking the SDR output is not clipped or in compression. I adjusted the sig gen output up and down 3dB, and checked the power estimate from the script changed by 3dB. Also worth monitoring Fig 1 from the script, make sure it’s not hitting the limits. The HackRF needed its baseband gain reduced, but the Airspys were OK.
  5. I used latest Airspy tools built from source (rather than Ubuntu 17 package) to get stdout piping working properly and not have other status information from printfs injected into the sample stream!

Credits

Thanks Mark, for the use of your RF hardware, and I’d also like to mention the awesome CSDR tools and fantastic gqrx software – both very handy for SDR work.

Engage the Silent Drive

I’ve been busy electrocuting my boat – here are our first impressions of the Torqeedo Cruise 2.0T on the water.

About 2 years ago I decided to try sailing, so I bought a second hand Hartley TS16; a popular small “trailer sailor” here in Australia. Since then I have been getting out once every week, having some very pleasant days with friends and family, and even at times by myself. Sailing really takes you away from everything else in the world. It keeps you busy as you are always pulling a rope or adjusting this and that, and is physically very active as you are clambering all over the boat. Mentally there is a lot to learn, and I started as a complete nautical noob.

Sailing is so quiet and peaceful, you get propelled by the wind using aerodynamics and it feels like magic. However this is marred by the noise of outboard motors, which are typically used at the start and end of the day to get the boat to the point where it can sail. They are also useful to get you out of trouble in high seas/wind, or when the wind dies. I often use the motor to “un hit” Australia when I accidentally lodge myself on a sand bar (I have a lot of accidents like that).

The boat came with an ancient 2 stroke which belched smoke and noise. After about 12 months this motor suffered a terminal melt down (impeller failure and over heated) so it was replaced with a modern 5HP Honda 4-stroke, which is much quieter and very fuel efficient.

My long term goal was to “electrocute” the boat and replace the infernal combustion outboard engine with an electric motor and battery pack. I recently bit the bullet and obtained a Torqeedo Cruise 2kW outboard from Eco Boats Australia.

My friend Matt and I tested the motor today and are really thrilled. Matt is an experienced Electrical Engineer and sailor so was an ideal companion for the first run of the Torqeedo.

Torqeedo Cruise 2.0 First Impressions

It’s silent – incredibly so. Just a slight whine conducted from the motor/gearbox pod beneath the water. The sound of water flowing around the boat is louder!

The acceleration is impressive, better than the 4-stroke. Make sure you sit down. That huge, low RPM prop and loads of torque. We settled on 1000W, experimenting with other power levels.

The throttle control is excellent, you can dial up any speed you want. This made parking (mooring) very easy compared to the 4-stroke which is more of a “single speed” motor (idles at 3 knots, 4-5 knots top speed) and is unwieldy for parking.

It’s fit for purpose. This is not a low power “trolling” motor, it is every bit as powerful as the modern Honda 5HP 4-stroke. We did an A/B test and obtained the same top speed (5 knots) in the same conditions (wind/tide/stretch of water). We used it with 15 knot winds and 1m seas and it was the real deal – pushing the boat exactly where we wanted to go with authority. This is not a compromise solution. The Torqeedo shows internal combustion whose house it is.

We had some fun sneaking up on kayaks at low power, getting to within a few metres before they heard us. Other boaties saw us gliding past with the sails down and couldn’t work out how we were moving!

A hidden feature is Azipod steering – it steers through more than 270 degrees. You can reverse without reverse gear, and we did “donuts” spinning on the keel!

Some minor issues: Unlike the Honda, the Torqeedo doesn’t tilt completely out of the water when sailing, leaving some residual drag from the motor/propeller pod. It also has to be removed from the boat for trailering, due to insufficient road clearance.

Walk Through

Here are the two motors with the boat out of the water:

It’s quite a bit longer than the Honda, mainly due to the enormous prop. The centres of the two props are actually only 7cm apart in height above ground. I had some concerns about ground clearance, both when trailering and also in the water. I have enough problems hitting Australia and like the way my boat can float in just 30cm of water. I discussed this with my very helpful Torqeedo dealer, Chris. He said tests with the short and long versions suggested this wasn’t a problem and in fact the “long” version provided better directional control. More water on top of the prop is a good thing. They recommend 50mm minimum, I have about 100mm.

To get started I made up a 24V battery pack using a plastic tub and 8 x 3.2V 100AH Lithium cells, left over from my recent EV battery upgrade. The cells are in varying conditions; I doubt any of them have 100AH capacity after 8 years of being hammered in my EV. On the day we ran for nearly 2 hours before one of the weaker cells dipped beneath 2.5V. I’ll sort through my stock of second hand cells some time to optimise the pack.

The pack plus motor weighs 41kg, the 5HP Honda plus 5l petrol 32kg. At low power (600W, 3.5 knots), this 2.5kWHr pack will give us a range of 14 nm or 28km. Plenty – on a huge day’s sailing we cover 40km, of which just 5km would be on motor.
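The range estimate is just energy divided by power, times speed. A small Python sketch using the figures from the text (8 cells, 3.2V, 100AH, 600W at 3.5 knots); the function names are made up, and it optimistically assumes the cells still deliver their rated capacity, which these second hand cells probably don’t:

```python
KM_PER_NM = 1.852  # kilometres per nautical mile

def pack_energy_wh(cells=8, cell_v=3.2, cell_ah=100.0):
    """Nominal pack energy in Watt-hours."""
    return cells * cell_v * cell_ah

def range_nm(energy_wh, power_w, speed_knots):
    """Range in nautical miles: run time in hours multiplied by speed."""
    return energy_wh / power_w * speed_knots

energy = pack_energy_wh()
rng = range_nm(energy, power_w=600.0, speed_knots=3.5)
print("%.0f Wh, %.1f nm, %.1f km" % (energy, rng, rng * KM_PER_NM))
```

Plugging in measured cell capacities instead of the rated 100AH gives a more realistic figure.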

All that power on board is handy too, for example the load of a fridge would be trivial compared to the motor, and a 100W HF radio no problem. So now I can quaff ice-cold sparkling shiraz or a nice beer, while having an actual conversation and not choking on exhaust fumes!

Here’s Matt taking us for a test drive, not much to the Torqeedo above the water:

For a bit of fun we ran both motors (maybe 10HP equivalent) and hit 7 knots, almost getting the Hartley up on the plane. Does this make it a Hybrid boat?

Conclusions

We are in love. This is the future of boating. For sale – one 5HP Honda 4-stroke.

How Inlets Generate Thrust on Supersonic Aircraft

Some time ago I read Skunk Works, a very good “engineering” read.

In the section on the SR-71, the author Ben Rich made a statement that has puzzled me ever since, something like: “Most of the engine’s thrust is developed by the intake”. I didn’t get it – surely an intake is a source of drag rather than thrust? I have since read the same statement about the Concorde and its inlets.

Lately I’ve been watching a lot of AgentJayZ Gas Turbine videos. This guy services gas turbines for a living and is kind enough to present a lot of intricate detail and answer questions from people. I find his presentation style and personality really engaging, and get a buzz out of his enthusiasm, love for his work, and willingness to share all sorts of geeky, intricate details.

So inspired by AgentJayZ I did some furious Googling and finally worked out why supersonic planes develop thrust from their inlets. I don’t feel it’s well explained elsewhere so here is my attempt:

  1. Gas turbine jet engines only work if the air is moving into the compressor at subsonic speeds. So the job of the inlet is to slow the air down from say Mach 2 to Mach 0.5.
  2. When you slow down a stream of air, the pressure increases. Like when you feel the wind pushing on your face on a bike. Imagine (don’t try) the pressure on your arm hanging out of a car window at 100 km/hr. Now imagine the pressure at 3000 km/hr. Lots. Around a 40 times increase for the inlets used in supersonic aircraft.
  3. So now we have this big box (the inlet chamber) full of high pressure air. Like a balloon this pressure is pushing equally on all sides of the box. Net thrust is zero.
  4. If we untie the balloon neck, the air can escape, and the balloon shoots off in the opposite direction.
  5. Back to the inlet on the supersonic aircraft. It has a big vacuum cleaner at the back – the compressor inlet of the gas turbine. It is sucking air out of the inlet as fast as it can. So – the air can get out, just like the balloon, and the inlet and the aircraft attached to it is thrust in the opposite direction. That’s how an inlet generates thrust.
  6. While there is also thrust from the gas turbine and its afterburner, turns out that pressure release in the inlet contributes the majority of the thrust. I don’t know why it’s the majority. Guess I need to do some more reading and get my gas equations on.
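To get a feel for the size of the pressure rise in step 2, here is a rough Python sketch using the standard isentropic (lossless) relation for air, p0/p = (1 + (g-1)/2 M^2)^(g/(g-1)) with g = 1.4. It is an idealisation: real inlets lose some pressure in shock waves, and the Mach numbers in the example are just illustrative.

```python
GAMMA = 1.4  # ratio of specific heats for air

def stagnation_pressure_ratio(mach, gamma=GAMMA):
    """Total (stagnation) pressure over static pressure at a given Mach number."""
    return (1.0 + (gamma - 1.0) / 2.0 * mach ** 2) ** (gamma / (gamma - 1.0))

def ideal_inlet_pressure_rise(mach_in, mach_out, gamma=GAMMA):
    """Static pressure rise when air is slowed from mach_in to mach_out without losses."""
    return stagnation_pressure_ratio(mach_in, gamma) / stagnation_pressure_ratio(mach_out, gamma)

for mach in (2.0, 3.0):
    print("Mach %.1f slowed to Mach 0.5: pressure up %.0f times"
          % (mach, ideal_inlet_pressure_rise(mach, 0.5)))
```

The steep growth with Mach number suggests why the effect only becomes important on very fast aircraft.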

Another important point – the aircraft really does experience that extra thrust from the inlet – e.g. it’s transmitted to the aircraft by the engine mounts on the inlet, and the mounts must be designed with those loads in mind. This helps me understand the definition of “thrust from the inlet”.

Steve Ports an OFDM modem from Octave to C

Earlier this year I asked for some help. Steve Sampson K5OKC stepped up, and has done some fine work in porting the OFDM modem from Octave to C. I was so happy with his work I asked him to write a guest post on my blog on his experience and here it is!

On a personal level working with Steve was a great experience for me. I always enjoy and appreciate other people working on FreeDV with me, however it is quite rare to have people help out with programming. As you will see, Steve enjoyed the work and learned a great deal in the process.

The Problem with Porting

But first some background on the process involved. In signal processing it is common to develop algorithms in a convenient domain-specific scripting language such as GNU Octave. These languages can do a lot with one line of code and have powerful visualisation tools.

Usually, the algorithm then needs to be ported to a language suitable for real time implementation. For most of my career that has been C. For high speed operation on FPGAs it might be VHDL. It is also common to port algorithms from floating point to fixed point so they can run on low cost hardware.

We don’t develop algorithms directly in the target real-time language as signal processing is hard. Bugs are difficult to find and correct. They may be 10 or 100 times harder (in terms of person-hours) to find in C or VHDL than in, say, GNU Octave.

So a common task in my industry is porting an algorithm from one language to another. Generally the process involves taking a working simulation and injecting a bunch of hard to find bugs into the real time implementation. It’s an excellent way for engineering companies to go bankrupt and upset customers. I have seen and indeed participated in this process (screwing up real time implementations) many times.

The other problem is algorithm development is hard, and not many people can do it. They are hard to find, cost a lot of money to employ, and can be very nerdy (like me). So if you can find a way to get people with C, but not high level DSP skills, to work on these ports – then it’s a huge win from a resourcing perspective. The person doing the C port learns a lot, and managers are happy as there is some predictability in the engineering process and schedule.

The process I have developed allows people with C coding (but not DSP) skills to port complex signal processing algorithms from one language to another. In this case it’s from GNU Octave to floating point C. The figures below show how it all fits together.

Here is a sample output plot, in this case a buffer of received samples in the demodulator. This signal is plotted in green, and the difference between C and Octave in red. The red line is all zeros, as it should be.

This particular test generates 12 plots. Running is easy:

$ cd codec2-dev/octave
$ ../build_linux/unittest/tofdm
$ octave
>> tofdm
W........................: OK
tx_bits..................: OK
tx.......................: OK
rx.......................: OK
rxbuf in.................: OK
rxbuf....................: OK
rx_sym...................: FAIL (0.002037)
phase_est_pilot..........: FAIL (0.001318)
rx_amp...................: OK
timing_est...............: OK
sample_point.............: OK
foff_est_hz..............: OK
rx_bits..................: OK

This shows a fail case – two vectors just failed, so some further inspection is required.

Key points are:

  1. We make sure the C and Octave versions are identical. Near enough is not good enough. For floating point I set a tolerance like 1 part in 1000. For fixed point ports it can be bit exact – zero difference.
  2. We dump a lot of internal states, not just the inputs and outputs. This helps point us at exactly where the problem is.
  3. There is an automatic checklist to give us pass/fail reports of each stage.
  4. This process is not particularly original. It’s not rocket science, but getting people (especially managers) to support and follow such a process is. This part – the human factor – is really hard to get right.
  5. The same process can be used between any two versions of an algorithm. Fixed and float point, fixed point C and VHDL, or a reference implementation and another one that has memory or CPU optimisations. The same basic idea: take a reference version and use software to compare it.
  6. It makes porting fun and strangely satisfying. You get constant forward progress and no hard to find bugs. Things work when they hit real time. After months of tough, brain hurting, algorithm development, I find myself looking forward to the productivity of the porting phase.
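The heart of the automatic checklist is a comparison of each dumped vector against the reference with a tolerance. Here is a minimal sketch of that idea in Python (the real tests are the Octave script tofdm.m and the C unit test; the function below, its name, and its use of an absolute tolerance are assumptions for illustration only):

```python
def check(name, reference, ported, tol=1e-3):
    """Compare two dumped vectors and print a pass/fail line for the checklist."""
    same_length = len(reference) == len(ported)
    # abs() works for both real and complex samples
    max_diff = max((abs(a - b) for a, b in zip(reference, ported)), default=0.0)
    ok = same_length and max_diff < tol
    status = "OK" if ok else "FAIL (%f)" % max_diff
    print("%s: %s" % (name.ljust(25, "."), status))
    return ok

# Made-up vectors: one identical pair, one with a small error in the port
results = [
    check("tx_bits", [1, 0, 1, 1], [1, 0, 1, 1]),
    check("rx_sym", [1.0 + 0.5j, -1.0 + 0.0j], [1.0 + 0.5j, -1.002037 + 0.0j]),
]
print("passed" if all(results) else "failed")
```

Dumping and checking many intermediate vectors, rather than just the final output, means the first FAIL line points at the stage where the two versions diverge.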

In this case Steve was the man doing the C port. Here is his story…..

Initial Code Construction

I’m a big fan of the Integrated Development Environment (IDE). I’ve used various versions over the years, but mostly only use Netbeans IDE. This is my current favorite, as it works well with C and Java.

When I take on a new programming project I just create a new IDE project and paste in whatever I want to translate, and start filling-in the Java or C code. In the OFDM modem case, it was the Octave source code ofdm_lib.m.

Obviously this code won’t do anything or compile, but it allows me to write C functions for each of the Octave code blocks. Sooner or later, all the Octave code is gone, and only C code remains.

I have very little experience with Octave, but I did use some Matlab in college. It was a new system just being introduced when I was near graduation. I spent a little time trying to make the program as dynamic as the Octave code. But it became mired in memory allocation.

Once David approved the decision for me to go with fixed configuration values (Symbol rate, Sample rate, etc), I was able to quickly create the header files. We could adjust these header files as we went along.

One thing about Octave, is you don’t have to specify the array sizes. So for the C port, one of my tasks was to figure out the array sizes for all the data structures. In some cases I just typed the array name in Octave, and it printed out its value, and then presto I now knew the size. Inspector Clouseau wins again!

The include files were pretty much patterned the same as FDMDV and COHPSK modems.

Code Starting Point

When it comes to modems, the easiest thing to create first is the modulator. It proved true in this case as well. I did have some trouble early on, because of a bug I created in my testing code. My spectrum looked different than David’s. Once this bug was ironed out the spectrums looked similar. David recommended I create a test program, like he had done for other modems.

The output may look similar, but who knows really? I’m certainly not going to go line by line through comma-separated values, and anyway Octave floating point values aren’t the same as C values past some number of decimal points.

This testing program was a little over my head, and since David has written many of these before, he decided to just crank it out and save me the learning curve.

We made a few data structure changes to the C program, but generally it was straightforward. Basically we plotted the outputs of the C and Octave modulators in different colors, so any difference between them stood out. Luckily we finally got no differences.

OFDM Design

As I was writing the modulator, I also had to try and understand this particular OFDM design. I deduced that it was basically eighteen (18) carriers that were grouped into eight (8) rows. The first row was the complex “pilot” symbols (BPSK), and the remaining 7 rows were the 112 complex “data” symbols (QPSK).

But there was a little magic going on, in that the pilots were 18 columns, but the data was only using 16. So in the 7 rows of data, the first and last columns were set to a fixed complex “zero.”

This produces the 16 x 7 or 112 complex data symbols. Each QPSK symbol is two-bits, so each OFDM frame represents 224 bits of data. It wasn’t until I began working on the receiver code that all of this started to make sense.
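The frame arithmetic above can be written down as a few constants. This is a minimal sketch using the numbers from the text; the names (NC, ROWS and so on) are made up for illustration and are not necessarily the ones in the modem source.

```c
#include <assert.h>

/* Illustrative constants taken from the description above. */
enum {
    NC            = 18,        /* carriers (columns) in every row             */
    ROWS          = 8,         /* rows per OFDM frame: 1 pilot + 7 data       */
    DATA_COLS     = NC - 2,    /* first and last data columns are fixed zero  */
    DATA_ROWS     = ROWS - 1,  /* every row except the pilot row              */
    BITS_PER_QPSK = 2          /* each QPSK symbol carries two bits           */
};

/* Complex data symbols in one frame: 16 x 7 */
static int data_symbols_per_frame(void)
{
    return DATA_COLS * DATA_ROWS;
}

/* Payload bits in one frame */
static int bits_per_frame(void)
{
    return data_symbols_per_frame() * BITS_PER_QPSK;
}

/* Returns 1 if the zero-based (row, col) position holds a QPSK data symbol,
   0 if it is a pilot (row 0) or one of the fixed zero edge columns. */
static int is_data_symbol(int row, int col)
{
    assert(row >= 0 && row < ROWS && col >= 0 && col < NC);
    return row > 0 && col > 0 && col < NC - 1;
}
```

Note the indices here are zero-based, as in C, so the pilot row is row 0 and the zeroed columns are 0 and 17.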

With this information, I was able to drive the modulator with the correct number of bits, and collect the output and convert it to PCM for testing with Audacity.

DFT Versus FFT

This OFDM modem uses a DFT and IDFT. This greatly simplifies things. All I have to do is a multiply and summation. With only 18 carriers, this is easily fast enough for the task. We just zip through the 18 carriers, and return the frequency or time domain. Obviously this code can be optimized for firmware later on.

The final part of the modulator is the guard period, called the Cyclic Prefix (CP). By making a copy of the last 16 of the 144 complex time-domain samples, and putting them at the head, we produce 160 complex samples for each row, giving us 160 x 8 rows, or 1280 complex samples every OFDM frame. We send this to the transmitter.
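Adding the cyclic prefix is just two copies. A minimal sketch, using the sizes given above; the names M, NCP, ROWS and add_cyclic_prefix() are hypothetical.

```c
#include <assert.h>
#include <complex.h>
#include <string.h>

#define M    144   /* time-domain samples per row, from the text     */
#define NCP  16    /* cyclic prefix length in samples, from the text */
#define ROWS 8     /* rows per OFDM frame, from the text             */

/* Copy the last NCP samples of one row to the head of the output, then
   the whole row after them, giving M + NCP = 160 samples. */
static void add_cyclic_prefix(float complex out[M + NCP],
                              const float complex in[M])
{
    assert(out != NULL && in != NULL);

    memcpy(out, &in[M - NCP], NCP * sizeof in[0]);  /* prefix: tail of row */
    memcpy(&out[NCP], in, M * sizeof in[0]);        /* then the row itself */
}

/* Complex samples sent to the transmitter per OFDM frame */
static int samples_per_frame(void)
{
    return (M + NCP) * ROWS;
}
```

Calling add_cyclic_prefix() once per row fills a 1280 sample frame buffer, 160 samples at a time.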

There will probably need to be some filtering, and a function of adjusting gain in the API.

OFDM Modulator

That left the Demodulator which looked much more complex. It took me quite a long time just to get the Octave into some semblance of C. One problem was that Octave arrays start at 1 and C starts at 0. In my initial translation, I just ignored this. I told myself we would find the right numbers when we started pushing data through it.

I won’t kid anyone, I had no idea what was going on, but it didn’t matter. Slowly, after the basic code was doing something, I began to figure out the function of various parts. Again though, we have no idea if the C code is producing the same data as the Octave code. We needed some testing functions, and these were added to tofdm.m and tofdm.c. David wrote this part of the code, and I massaged the C modem code until one day the data were the same. This was pretty exciting to see it passing tests.

One thing I found, was that you can reach an underflow with single precision. Whenever I was really stumped, I would change the single precision to a double, and then see where the problem was. I was trying to stay completely within single precision floating point, because this modem is going to be embedded firmware someday.
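Here is a small illustration of the kind of underflow that single precision can hit. It is a made-up example, not code from the modem: squaring a very small value gives zero as a float, but a usable number as a double.

```c
#include <assert.h>

/* Square a value in single precision. The volatile store forces the
   result to be rounded to a 32-bit float rather than being kept in a
   wider register. */
static float square_single(float x)
{
    volatile float r = x * x;
    return r;
}

/* The same calculation carried out in double precision */
static double square_double(float x)
{
    return (double)x * (double)x;
}

/* Returns 1 when the single precision result has underflowed to zero
   but the double precision result has not. */
static int underflows_in_single(float x)
{
    assert(x != 0.0f);
    return square_single(x) == 0.0f && square_double(x) != 0.0;
}
```

For example, 1e-30f squared is 1e-60, which is well below the smallest value a float can hold, so underflows_in_single(1e-30f) reports the problem. Temporarily switching a variable to double in this way shows where the small numbers are coming from.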

Testing Process

There was no way that I could have reached a successful conclusion without the testing code. As a matter of fact, a lot of programming errors were found. You would be surprised at how much damage a misplaced parenthesis can do to a math equation! I’ve had enough math to know how to do the basic operations involved in DSP. I’m sure that as this code is ported to firmware, it can be simplified, optimized, and unrolled a bit for added speed. At this point, we just want valid waveforms.

C99 and Complex Math

Working with David was pretty easy, even though we are almost 16 time-zones apart. We don’t need an answer right now, and we aren’t working on a deadline. Sometimes I would send an email, and then four hours later I would find the problem myself, and the morning was still hours away in his time zone. So he sometimes got some strange emails from me that didn’t require an answer.

David was hands-off on this project, and doesn’t seem to be a control freak, so he just let me go at it, and then teamed-up when we had to merge things in giving us comparable output. Sometimes a simple answer was all I needed to blow through an Octave brain teaser.

I’ve been working in C99 for the past year. 1999 was a long time ago, but many of us still tend to program C the same way we always have. For working with complex numbers, though, the language and its library have been greatly expanded. For example, to multiply two complex numbers, you type “A * B”. That’s it. No need to worry about simulating a complex number using a structure. If you need a complex exponential, you type “cexp(I * W)”, where “I” is sqrt(-1). All of this is hidden away inside the compiler.

For me, this became useful when translating Octave to C. Most of the complex functions have the same name. The only things I had to create were a matrix multiply and a summation function for the DFT. The rest was straightforward. Still a lot of work, but it was enjoyable work.

Where we might have problems interfacing to legacy code, there are functions in the library to extract the real and imaginary parts. We can easily interface to the old structure method. You can see examples of this in the testing code.

Looking back, I don’t think I would do anything differently. Translating code is tedious no matter how you go about it. In this case, translating Octave is 10 times easier than translating Fortran to C, or C to Java.

The best course is where you can start seeing some output early on. This keeps you motivated. I was a happy camper when I could look and listen to the modem using Audacity. Once you see progress, you can’t give up, and want to press on.

Steve/k5okc

Reading Further

The Bit Exact Fairy Tale is a story of fixed point porting. Writing this helped me vent a lot of steam at the time – I’d just left a company that was really good at messing up these sorts of projects.

Modems for HF Digital Voice Part 1 and Part 2.

The cohpsk_frame_design spreadsheet includes some design calculations on the OFDM modem and a map of where the data and pilot symbols go in time and frequency.

Reducing FDMDV Modem Memory is an example of using automated testing to port an earlier HF modem to the SM1000. In this case the goal was to reduce memory consumption without breaking anything.

Fixed Point Scaling – Low Pass Filter example – is consistently one of the most popular posts on this blog. It’s a worked example of a fixed point port of a low pass filter.

New Lithium Battery Pack for my EV

Eight years ago I installed a pack of 36 Lithium cells in my EV. After about 50,000km and several near-death battery pack experiences (over discharge) the range decreased beneath a useful level so I have just purchased a new pack.

Same sort of cells, CALB 100AH, 3.2V per cell (80km range). The pack was about AUD$6,000 delivered and took an afternoon to install. I’ve adjusted my Zivan NG3 to cut out at an average of 3.6 v/cell (129.6V), and still have the BMS system that will drop out the charger if any one cell exceeds 4.1V.

The original pack was rated at 10 years (3000 cycles) and given the abuse we subjected it to I’m quite pleased it lasted 8 years. I don’t have a fail-safe battery management system like a modern factory EV so we occasionally drove the car when dead flat. While I could normally pick this problem quickly from the instrumentation my teenage children tended to just blissfully drive on. Oh well, this is an experimental hobby, and mistakes will be made. The Wright brothers broke a few wings……

I just took the car with its new battery pack for a 25km test drive and all seems well. The battery voltage is about 118V at rest, and 114V when cruising at 60 km/hr. It’s not dropping beneath 110V during acceleration, much better than the old pack which would sag beneath 100V. I guess the internal resistance of the new cells is much lower.

I plan to keep driving my little home-brew EV until I can buy a commercial EV with a > 200km range here in Australia for about $30k, which I estimate will happen around 2020.

It’s nice to have my little EV back on the road.

Codec 2 Wideband

I’m spending a month or so improving the speech quality of a couple of Codec 2 modes. I have two aims:

  1. Make the 700 bit/s codec sound better, to improve speech quality on low SNR HF channels (beneath 0dB).
  2. Develop a higher quality mode in the 2000 to 3000 bit/s range, that can be used on HF channels with modest SNRs (around 10dB)

I ran some numbers on the new OFDM modem and LDPC codes, and it turns out we can get 3000 bit/s of codec data through a 2000 Hz channel at down to 7dB SNR.

Now 3000 bit/s is broadband for me – I’ve spent years being very frugal with my bits while I play in low SNR HF land. However it’s still a bit low for Opus which kicks in at 6000 bit/s. I can’t squeeze 6000 bit/s through a 2000 Hz RF channel without higher order QAM constellations which means SNRs approaching 20dB.

So – what can I do with 3000 bit/s and Codec 2? I decided to try wideband(-ish) audio – the sort of audio bandwidth you get from Skype or AM broadcast radio. So I spent a few weeks modifying Codec 2 to work at 16 kHz sample rate, and Jean-Marc gave me a few tips on using DCTs to code the bits.

It’s early days but here are a few samples:

Description Sample
1 Original Speech Listen
2 Codec 2 Model, original amplitudes and phases Listen
3 Synthetic phase, one bit voicing, original amplitudes Listen
4 Synthetic phase, one bit voicing, amplitudes at 1800 bit/s Listen
5 Simulated analog SSB, 300-2600Hz BPF, 10dB SNR Listen

Couple of interesting points:

  • Sample (2) is as good as Codec 2 can do, it’s the unquantised model parameters (harmonic phases and amplitudes). It’s all downhill from here as we quantise or toss away parameters.
  • In (3) I’m using a one bit voicing model, this is very vocoder and shouldn’t work this well. MBE/MELP all say you need mixed excitation. Exploring that conundrum would be a good Masters degree topic.
  • In (3) I can hear the pitch estimator making a few mistakes, e.g. around “sheet” on the female.
  • The extra 4kHz of audio bandwidth doesn’t take many more bits to encode, as the ear has a log frequency response. It’s maybe 20% more bits than 4kHz audio.
  • You can hear some words like “well” are muddy and indistinct in the 1800 bit/s sample (4). This usually means the formant (spectral) peaks are not well defined, so we might be tossing away a little too much information.
  • The clipping on the SSB sample (5) around the words “depth” and “hours” is an artifact of the PathSim AGC. But dat noise. It gets really fatiguing after a while.

Wideband audio is a big paradigm shift for Push To Talk (PTT) radio. You can’t do this with analog radio: 2000 Hz of RF bandwidth, 8000 Hz of audio bandwidth. I’m not aware of any wideband PTT radio systems – they all work at best 4000 Hz audio bandwidth. DVSI has a wideband codec, but at a much higher bit rate (8000 bits/s).

Current wideband codecs shoot for artifact-free speech (and indeed general audio signals like music). Codec 2 wideband will still have noticeable artifacts, and probably won’t like music. Big question is will end users prefer this over SSB, or say analog FM – at the same SNR? What will 8kHz audio sound like on your HT?

We shall see. I need to spend some time cleaning up the algorithms, chasing down a few bugs, and getting it all into C, but I plan to be testing over the air later this year.

Let me know if you want to help.

Play Along

Unquantised Codec 2 with 16 kHz sample rate:

$ ./c2sim ~/Desktop/c2_hd/speech_orig_16k.wav --Fs 16000 -o - | play -t raw -r 16000 -s -2 -

With “Phase 0” synthetic phase and 1 bit voicing:

$ ./c2sim ~/Desktop/c2_hd/speech_orig_16k.wav --Fs 16000 --phase0 --postfilter -o - | play -t raw -r 16000 -s -2 -

Links

FreeDV 2017 Road Map – this work is part of the “Codec 2 Quality” work package.

Codec 2 page – has an explanation of the way Codec 2 models speech with harmonic amplitudes and phases.