Earlier this year I asked for some help. Steve Sampson K5OKC stepped up, and has done some fine work in porting the OFDM modem from Octave to C. I was so happy with his work I asked him to write a guest post on my blog on his experience and here it is!
On a personal level working with Steve was a great experience for me. I always enjoy and appreciate other people working on FreeDV with me, however it is quite rare to have people help out with programming. As you will see, Steve enjoyed the process and learned a great deal in the process.
The Problem with Porting
But first some background on the process involved. In signal processing it is common to develop algorithms in a convenient domain-specific scripting language such as GNU Octave. These languages can do a lot with one line of code and have powerul visualisation tools.
Usually, the algorithm then needs to be ported to a language suitable for real time implementation. For most of my career that has been C. For high speed operation on FPGAs it might be VHDL. It is also common to port algorithms from floating point to fixed point so they can run on low cost hardware.
We don’t develop algorithms directly in the target real-time language as signal processing is hard. Bugs are difficult to find and correct. They may be 10x or 100x times harder (in terms of person-hours) to find in C or VHDL than say GNU Octave.
So a common task in my industry is porting an algorithm from one language to another. Generally the process involves taking a working simulation and injecting a bunch of hard to find bugs into the real time implementation. It’s an excellent way for engineering companies to go bankrupt and upset customers. I have seen and indeed participated in this process (screwing up real time implementations) many times.
The other problem is algorithm development is hard, and not many people can do it. They are hard to find, cost a lot of money to employ, and can be very nerdy (like me). So if you can find a way to get people with C, but not high level DSP skills, to work on these ports – then it’s a huge win from a resourcing perspective. The person doing the C port learns a lot, and managers are happy as there is some predictability in the engineering process and schedule.
The process I have developed allows people with C coding (but not DSP) skills to port complex signal processing algorithms from one language to another. In this case its from GNU Octave to floating point C. The figures below shows how it all fits together.
Here is a sample output plot, in this case a buffer of received samples in the demodulator. This signal is plotted in green, and the difference between C and Octave in red. The red line is all zeros, as it should be.
This particular test generates 12 plots. Running is easy:
$ cd codec2-dev/octave $ ../build_linux/unittest/tofdm $ octave >> tofdm W........................: OK tx_bits..................: OK tx.......................: OK rx.......................: OK rxbuf in.................: OK rxbuf....................: OK rx_sym...................: FAIL (0.002037) phase_est_pilot..........: FAIL (0.001318) rx_amp...................: OK timing_est...............: OK sample_point.............: OK foff_est_hz..............: OK rx_bits..................: OK
This shows a fail case – two vectors just failed so some further inspection required.
Key points are:
- We make sure the C and Octave versions are identical. Near enough is not good enough. For floating point I set a tolerance like 1 part in 1000. For fixed point ports it can be bit exact – zero difference.
- We dump a lot of internal states, not just the inputs and outputs. This helps point us at exactly where the problem is.
- There is an automatic checklist to give us pass/fail reports of each stage.
- This process is not particularly original. It’s not rocket science, but getting people (especially managers) to support and follow such a process is. This part – the human factor – is really hard to get right.
- The same process can be used between any two versions of an algorithm. Fixed and float point, fixed point C and VHDL, or a reference implementation and another one that has memory or CPU optimisations. The same basic idea: take a reference version and use software to compare it.
- It makes porting fun and strangely satisfying. You get constant forward progress and no hard to find bugs. Things work when they hit real time. After months of tough, brain hurting, algorithm development, I find myself looking forward to the productivity the porting phase.
In this case Steve was the man doing the C port. Here is his story…..
Initial Code Construction
I’m a big fan of the Integrated Debugging Environment (IDE). I’ve used various versions over the years, but mostly only use Netbeans IDE. This is my current favorite, as it works well with C and Java.
When I take on a new programming project I just create a new IDE project and paste in whatever I want to translate, and start filling-in the Java or C code. In the OFDM modem case, it was the Octave source code ofdm_lib.m.
Obviously this code won’t do anything or compile, but it allows me to write C functions for each of the Octave code blocks. Sooner or later, all the Octave code is gone, and only C code remains.
I have very little experience with Octave, but I did use some Matlab in college. It was a new system just being introduced when I was near graduation. I spent a little time trying to make the program as dynamic as the Octave code. But it became mired in memory allocation.
Once David approved the decision for me to go with fixed configuration values (Symbol rate, Sample rate, etc), I was able to quickly create the header files. We could adjust these header files as we went along.
One thing about Octave, is you don’t have to specify the array sizes. So for the C port, one of my tasks was to figure out the array sizes for all the data structures. In some cases I just typed the array name in Octave, and it printed out its value, and then presto I now knew the size. Inspector Clouseau wins again!
The include files were pretty much patterned the same as FDMDV and COHPSK modems.
Code Starting Point
When it comes to modems, the easiest thing to create first is the modulator. It proved true in this case as well. I did have some trouble early on, because of a bug I created in my testing code. My spectrum looked different than Davids. Once this bug was ironed out the spectrums looked similar. David recommended I create a test program, like he had done for other modems.
The output may look similar, but who knows really? I’m certainly not going to go line by line through comma-separated values, and anyway Octave floating point values aren’t the same as C values past some number of decimal points.
This testing program was a little over my head, and since David has written many of these before, he decided to just crank it out and save me the learning curve.
We made a few data structure changes to the C program, but generally it was straight forward. Basically we had the outputs of the C and Octave modulators, and the difference is shown by their different colors. Luckily we finally got no differences.
As I was writing the modulator, I also had to try and understand this particular OFDM design. I deduced that it was basically eighteen (18) carriers that were grouped into eight (8) rows. The first row was the complex “pilot” symbols (BPSK), and the remaining 7 rows were the 112 complex “data” symbols (QPSK).
But there was a little magic going on, in that the pilots were 18 columns, but the data was only using 16. So in the 7 rows of data, the first and last columns were set to a fixed complex “zero.”
This produces the 16 x 7 or 112 complex data symbols. Each QPSK symbol is two-bits, so each OFDM frame represents 224 bits of data. It wasn’t until I began working on the receiver code that all of this started to make sense.
With this information, I was able to drive the modulator with the correct number of bits, and collect the output and convert it to PCM for testing with Audacity.
DFT Versus FFT
This OFDM modem uses a DFT and IDFT. This greatly simplifies things. All I have to do is a multiply and summation. With only 18 carriers, this is easily fast enough for the task. We just zip through the 18 carriers, and return the frequency or time domain. Obviously this code can be optimized for firmware later on.
The final part of the modulator, is the need for a guard period called the Cyclic Prefix (CP). So by making a copy of the last 16 of the 144 complex time-domain samples, and putting them at the head, we produce 160 complex samples for each row, giving us 160 x 8 rows, or 1280 complex samples every OFDM frame. We send this to the transmitter.
There will probably need to be some filtering, and a function of adjusting gain in the API.
That left the Demodulator which looked much more complex. It took me quite a long time just to get the Octave into some semblance of C. One problem was that Octave arrays start at 1 and C starts at 0. In my initial translation, I just ignored this. I told myself we would find the right numbers when we started pushing data through it.
I won’t kid anyone, I had no idea what was going on, but it didn’t matter. Slowly, after the basic code was doing something, I began to figure out the function of various parts. Again though, we have no idea if the C code is producing the same data as the Octave code. We needed some testing functions, and these were added to tofdm.m and tofdm.c. David wrote this part of the code, and I massaged the C modem code until one day the data were the same. This was pretty exciting to see it passing tests.
One thing I found, was that you can reach an underflow with single precision. Whenever I was really stumped, I would change the single precision to a double, and then see where the problem was. I was trying to stay completely within single precision floating point, because this modem is going to be embedded firmware someday.
There was no way that I could have reached a successful conclusion without the testing code. As a matter of fact, a lot of programming errors were found. You would be surprised at how much damage a miss placed parenthesis can do to a math equation! I’ve had enough math to know how to do the basic operations involved in DSP. I’m sure that as this code is ported to firmware, it can be simplified, optimized, and unrolled a bit for added speed. At this point, we just want valid waveforms.
C99 and Complex Math
Working with David was pretty easy, even though we are almost 16 time-zones apart. We don’t need an answer right now, and we aren’t working on a deadline. Sometimes I would send an email, and then four hours later I would find the problem myself, and the morning was still hours away in his time zone. So he sometimes got some strange emails from me that didn’t require an answer.
David was hands-off on this project, and doesn’t seem to be a control freak, so he just let me go at it, and then teamed-up when we had to merge things in giving us comparable output. Sometimes a simple answer was all I needed to blow through an Octave brain teaser.
I’ve been working in C99 for the past year. For those who haven’t kept up (1999 was a long time ago), but still, we tend to program C in the same way. In working with complex numbers though, the C library has been greatly expanded. For example, to multiply two complex numbers, you type” “A * B”. That’s it. No need to worry about a simulated complex number using a structure. You need a complex exponent, you type “cexp(I * W)” where “I” is the sqrt(-1). But all of this is hidden away inside the compiler.
For me, this became useful when translating Octave to C. Most of the complex functions have the same name. The only thing I had to do, was create a matrix multiply, and a summation function for the DFT. The rest was straight forward. Still a lot of work, but it was enjoyable work.
Where we might have problems interfacing to legacy code, there are functions in the library to extract the real and imaginary parts. We can easily interface to the old structure method. You can see examples of this in the testing code.
Looking back, I don’t think I would do anything different. Translating code is tedious no matter how you go. In this case Octave is 10 times easier than translating Fortran to C, or C to Java.
The best course is where you can start seeing some output early on. This keeps you motivated. I was a happy camper when I could look and listen to the modem using Audacity. Once you see progress, you can’t give up, and want to press on.
The Bit Exact Fairy Tale is a story of fixed point porting. Writing this helped me vent a lot of steam at the time – I’d just left a company that was really good at messing up these sorts of projects.
The cohpsk_frame_design spreadsheet includes some design calculations on the OFDM modem and a map of where the data and pilot symbols go in time and frequency.
Fixed Point Scaling – Low Pass Filter example – is consistently one of the most popular posts on this blog. It’s a worked example of a fixed point port of a low pass filter.