Open Source Echo Canceller Part 5 – Ready for Beta Testing

The continued trials and tribulations of echo canceller development! Since the last post the echo canceller (named Oslec) has been tested at several alpha sites. Oslec is performing well and is now ready for Beta testing. See the Oslec home page for the current status of the echo canceller.

The core of Oslec is the echo.c file.

Why X100P Cards Have Echo Problems

After the good results I obtained in the previous post I asked a few friends to test Oslec on their Asterisk systems. These guys were using X100P cards for the FXO interface. Oslec performed poorly so I asked them to collect some samples of the echo signal so I could analyse them off line. As part of the development strategy Oslec has built in echo sampling code.

Here is a plot of the X100P signals. Open it in another window or tab. Note that the green receive signal has a slight DC offset: it doesn’t quite sit on the zero line. Note also that the echo canceller output (blue) is quite large, i.e. the cancellation is poor.

There is also a series of small spikes on the green and blue signals – this is 60Hz hum that the X100P has mistakenly delivered to us. We don’t normally hear this hum as the phones we use tend to filter it out.

The combination of DC offset and 60 Hz hum confuses the echo canceller algorithm and makes it converge slowly. This means poor performance and lots of echo.

The trick was to remove the hum and DC with a simple high pass filter. Here is a plot of the X100P signals with the high pass filter. To see the effect of the filter use your browser’s forward and backward buttons to flick between the two images (with and without filtering).

See how the hum and DC offset have gone away after filtering? The green and blue lines are both now on the 0 line. But best of all the echo level (blue) has been greatly reduced, as without the hum and DC offset the echo canceller can do a much better job. Pretty cool, huh? DSP in action :-)
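
To make the filtering idea concrete, here is a minimal sketch of a first order high pass (DC blocking) filter in C. It is not the filter used in echo.c; the names hpf_state, hpf_init and hpf_step, the floating point arithmetic and the example pole value are all assumptions made for illustration.

```c
/* State for a first order high pass (DC blocking) filter.
 * Illustrative sketch only, not the Oslec implementation. */
typedef struct {
    float prev_in;   /* x[n-1] */
    float prev_out;  /* y[n-1] */
    float pole;      /* 0 < pole < 1; closer to 1 gives a lower cutoff */
} hpf_state;

void hpf_init(hpf_state *s, float pole)
{
    s->prev_in = 0.0f;
    s->prev_out = 0.0f;
    s->pole = pole;
}

/* Process one sample: y[n] = x[n] - x[n-1] + pole * y[n-1].
 * The x[n] - x[n-1] term puts a zero at DC, so any constant
 * offset in the input is removed from the output. */
float hpf_step(hpf_state *s, float x)
{
    float y = x - s->prev_in + s->pole * s->prev_out;
    s->prev_in = x;
    s->prev_out = y;
    return y;
}
```

With the pole at 0.9, a constant (DC) input decays towards zero within a hundred samples or so, while speech band energy passes through. Moving the pole closer to 1 lowers the cutoff frequency; a real design would choose the cutoff, and probably a higher filter order, to suit the hum being removed.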

The observant will also note a DC offset on the (red) transmit signal. I am not sure why this is present, it’s in the signal from the SIP phone in this particular test. Perhaps a DC offset in the SIP phone electronics.

Since the filter was added Oslec has been in constant use on an X100P home IP-PBX system in Ottawa with great results.

This is an exciting result – it means low cost ($10) FXO hardware can have high quality echo cancellation.

Handling Background Noise

Inside echo cancellers there is an animal called a “Non Linear Processor” or NLP. After the adaptive filter part of the echo canceller has done its best this gizmo is used to remove any remaining echo.

Initially Oslec had a very simple mute algorithm – if the residual echo level was low enough the output would be muted. You can see this muting in action in the X100P plot above. Around sample 1000, the blue line “flat lines” as the NLP cuts off any residual echo.

The problem is that muting is crude – it removes echo but also throws out the background noise. Without background noise calls sound unnatural.

I experimented with a couple of ideas here. The first was to insert “comfort noise” instead of muting. The level of the noise is set to match that of the background noise level. However the noise I used sounded unnatural compared to the actual background noise (which in my office is computer fans).

The best NLP solution I have found so far is simply to clip the residual echo signal to the level of the background noise:
/* This sounds much better than CNG */
if (ec->clean_nlp > ec->Lbgn)
    ec->clean_nlp = ec->Lbgn;
if (ec->clean_nlp < -ec->Lbgn)
    ec->clean_nlp = -ec->Lbgn;

This works surprisingly well, even very weird background noise like a lawnmower working in my backyard came through fine and I still couldn’t hear any echo. BTW I found the idea of clipping on the data sheet of a commercial “hardware” echo cancellation chip.

I struggled with the code to estimate the background noise level for some time before settling on a really simple algorithm. I just average the level using a slow (1 second time const) filter if the current level is less than an (experimentally derived) constant:
if (ec->Lclean < 40) {
    ec->Lbgn_acc += abs(ec->clean) - ec->Lbgn;
    ec->Lbgn = (ec->Lbgn_acc + (1<<11)) >> 12;
}

The 40 was measured experimentally, and is roughly the lowest level of any near end speech. The idea is that we don’t want to include any near end speech (which is generally higher in level than background noise) in our estimate of the background noise level. This estimator has worked well to date however I would welcome feedback from beta testers on Oslec background noise performance.
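
The estimator and the clipping NLP can be tried in isolation with a small stand-alone sketch like the one below. The arithmetic follows the two snippets above, but the struct bgn_state and the functions bgn_update and nlp_clip are hypothetical names invented for this example, and plain int is assumed to be at least 32 bits wide.

```c
#include <stdlib.h>

/* Hypothetical container for the two background noise variables. */
typedef struct {
    int Lbgn_acc;  /* accumulator, holds the estimate scaled by 2^12 */
    int Lbgn;      /* current background noise level estimate */
} bgn_state;

/* Update the estimate, but only when the short term level Lclean is
 * below the threshold of 40, i.e. when no near end speech is likely. */
void bgn_update(bgn_state *s, int clean, int Lclean)
{
    if (Lclean < 40) {
        s->Lbgn_acc += abs(clean) - s->Lbgn;
        /* add half an LSB (1<<11) before shifting, so we round */
        s->Lbgn = (s->Lbgn_acc + (1 << 11)) >> 12;
    }
}

/* Clip the residual echo sample to +/- the background noise level. */
int nlp_clip(const bgn_state *s, int clean)
{
    if (clean > s->Lbgn)
        return s->Lbgn;
    if (clean < -s->Lbgn)
        return -s->Lbgn;
    return clean;
}
```

Feeding this a steady low level signal makes Lbgn creep up to that level over several thousand samples, while loud samples (Lclean of 40 or more) leave the estimate untouched.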

Fun with Soft Phones

One gentleman (Pavel) who tested Oslec complained of echo break through with pop sounds like “P” in Peter. He was using a kiax soft phone connected to an FXO port via Asterisk. The problem was due to an interesting combination of high quality audio (the microphone and sound blaster) and the telephone network.

Pavel had initially thought his microphone was faulty in some way, however it turns out it was too good! The microphone/sound blaster combination lets low frequency signals (e.g. down to 20Hz) through to the FXO port. However the FXO port electronics (in particular the hybrid) are designed for telephone bandwidth signals (300-3300Hz). The result was a temporary failure of the hybrid, which let some echo slip through.

Figure 1 is a close up of the “pop” waveform. The red line (tx) is the microphone signal. Note how around sample 300 it goes down off the screen then comes back up around sample 400. This signal is very low frequency compared to normal speech (it changes slowly compared to other parts of the red tx signal).

In Figure 2 the lower brown line shows how well the FXO port hybrid is working. It normally varies between 6-15dB. However near the pop it first goes up to 60 then down to 0 dB. The low frequency pop signal is messing up the hybrid, making it non-linear. This means the echo canceller cannot cope, in fact it resets the coefficients. Echo cancellers depend on the hybrid working in a linear mode all the time.

So the solution is to high pass filter the tx signal the microphone sends to your FXO port. By removing the low frequency pop energy the hybrid will remain linear, and the echo canceller will work with soft phones like kiax.

Are 128ms Tails Really Needed?

There is a school of thought that says a “128ms tail” is required for “serious” echo cancellation. I am not sure where that requirement comes from. I have collected many echo samples from all over the world and 9/10 would have worked with a 16ms tail. The 10th was a long distance T1 line, and even that fits comfortably in a 32ms tail.
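
For reference, tail length translates into adaptive filter length as sketched below, assuming the usual 8 kHz telephony sample rate. The helper name tail_ms_to_taps is hypothetical.

```c
/* Telephony audio is normally sampled at 8000 samples per second. */
enum { SAMPLE_RATE_HZ = 8000 };

/* Number of filter taps needed to span a tail of tail_ms milliseconds. */
int tail_ms_to_taps(int tail_ms)
{
    return tail_ms * SAMPLE_RATE_HZ / 1000;
}
```

So a 16ms tail is 128 taps, a 32ms tail is 256 taps, and a 128ms tail is 1024 taps, which is why the longer tail costs roughly eight times the filtering work of the 16ms one.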

If any PSTN FXO phone line has a 128ms echo then an analog phone connected to that line would be unusable. My understanding (and experience) is that Telcos insert network echo cancellers after a certain delay, for example on long distance calls. That’s why you can make regular long distance calls without echo.

Perhaps 128ms tails are required for operating inside of Telco networks or when Telcos handle international trunks. Anyway I suspect that the 128ms requirement is just another piece of FUD that surrounds echo cancellation, like “hardware echo cancellation is superior to software” or “you must have a DSP chip”. Perhaps “128ms tail” has become confused with “a working echo canceller algorithm”.

So my alpha testers and I typically use Oslec with a 16 or 32ms tail and it works just fine. BTW I would love to see a sample that refutes this argument. If anyone can send me a sample that shows a tail greater than 32ms for a PSTN FXO line I will happily publish it here and recant my heresy against the church of the 128ms tail!

Open Development Works!

In Part One I spoke about the idea of people collecting and sending samples of bad echo. This has really worked well. Several times during testing something went wrong at an alpha site, and collecting a sample really helped me work out why and improve Oslec. Thanks especially to Mark, Pawel, and Pavel plus many others for sending in samples.

I have also had a lot of useful comments from my DSP “brains trust” – Steve, Jean-Marc and Ramakrishnan.

This was truly an open development effort. Echo cancellers are tricky voodoo, requiring lots of practical tricks on top of the standard DSP algorithms. I have tried off and on for 15 years to come up with a viable echo canceller. The Zaptel echo cancellers have been works-in-progress for 6 years. Other companies have invested 10’s of man years (e.g. teams of 20 people for several years).

So I think this is a great example of where open development techniques have been used to achieve excellent results in a short time.

How to Test an Echo Canceller

Testing an echo canceller can be tricky. This section explains how to test an echo canceller, along with some of the traps.

Set up a SIP phone to FXS call. This way there is plenty of delay in the circuit and you will get a nice echo. An analog to analog call (say FXS to FXS) might be TDM bridged and not have any delay.

Once the call is running mute the analog phone. Most phones have a button for this – if they don’t then unplug the handset from the phone base. To test an echo canceller it is important to have no near end speech present. If the FXS phone is in a different room it’s OK to leave it unmuted – just make sure it can’t pick up anything you are saying into the SIP phone.

Speak loudly into the SIP phone and listen for any echoes. It is fun to try this with the Oslec control panel:

For example try the “Disable” and “Enable” buttons and listen to the echo with and without the echo canceller.

Another good test is double talk. Get a person in a different room to use the FXS phone while you use the SIP phone. Get them to speak constantly and try to talk on top of them. Double talk is a good test as it can confuse echo cancellers.

If you are testing a SIP to FXO call, make sure the phone at the other end of the connection (say a cell or desktop phone) is out of audio range of the SIP phone, for example in another room. Alternatively, mute the telephone.

Conclusion

A good quality line echo canceller has been a missing part of the open source telephony scene for a long time. Through the efforts of a team of DSP engineers and several alpha testers we have developed a good candidate for a free (as in speech) line echo canceller. Please try Oslec yourself and tell us what you think.

Reading Further

Oslec Home Page
Part 1 – Introduction
Part 2 – How Echo Cancellers Work
Part 3 – Two Prototypes
Part 4 – First Phone Calls

26 thoughts on “Open Source Echo Canceller Part 5 – Ready for Beta Testing”

  1. It’s good to see an update on this work.

    In developing my FAX modems I have gathered samples with the wackiest DC offsets imaginable. You would typically expect there to be no DC continuity in a PSTN telephony path. Thus, you would typically expect a DC offset to decay to zero fairly quickly. Don’t bank on it. I have FAX modem traces where with each change of tx modem the DC hops, and stays solidly at the new level until the next change of mode makes it hop again. I have never really figured out where this comes from. I have seen DC so big that the ulaw/alaw codec design’s assumption of small signals being in the middle is so wrong there is serious distortion.

    As for long echo cancellers….

    A modem’s echo canceller needs to be long. They use tones to switch off the in-network cancellers, and cancel end-to-end. They must, or they just can’t clean up the channel well enough. They generally work in a sparse manner, even for fairly short echoes. This is because the distant echo can have a frequency offset, due to FDM equipment in the path. FDM kit is probably fairly rare these days, but was still common in the 90s when most modem designs were developed. They use a Hilbert transform, a tracking phase rotator, and a complex EC. I’ve never seen a voice canceller doing this.

    An in-network voice echo canceller may need to be long, as the path may span the globe, and connect with high latency terminals, like cellphones. However, most of that path will be 4-wire, so sparse techniques are used.

    If you are cancelling local and distant echoes, and you really have a significant distant echo, it’s probably there because it’s a fairly local call with no in-network canceller in the path. Therefore, if you don’t disable the in-network canceller (which you don’t for voice) you shouldn’t see long echoes. The snag here is that tandem cancellers may get into nasty hunting modes, where they just can’t lock down the channel estimate correctly.

    The centre clipping idea for NLP sounds interesting. From your description it sounds like you are getting most of the benefit of a true signal estimation and resynthesis, with almost no complexity. Excellent. Now you have described the idea, I kinda think I heard this before somewhere and didn’t put 2+2 together, to properly understand what they were getting at.

  2. David,

    Glad to see the progress. I have gone thru the exact problem you described, i.e. the DC offset issue, while I did the ecan on a MIPS 4kc based processor. It makes ecan go haywire!

    I also like the clipping based solution you have come up with. The ones I had worked with had CNG using pink noise.

    Great going. Congratulations on the work.

  3. Thanks for the kind words Guys.

    Steve, thanks for your explanation of why long tails are required in certain circumstances. A sparse 128ms algorithm is something I would like to look at should people using Oslec come up with any FXO PSTN lines that can’t be handled by a 32ms tail. At present my main interest is in FXO PSTN lines and voice rather than fax or data lines, so I am hoping 32ms tails will be adequate.

    Thanks again,

    David

  4. Thank you for your great and extremely informative blog. I really think what you’re doing is amazing.

    I know this is a bit off topic, but I was wondering if there is anyplace where I can get some examples and demos of echo cancellation using BSS, and really how good are they in comparison to the classic adaptive filtering models.

    Thanks,

    Ameirah

  5. Ameirah,

    It’s not clear to me that BSS techniques would have any real benefit for line echo cancellation. Their usual application has been in multi-channel acoustic echo cancellation, such as conference systems. Do you have anything specific in mind?

  6. Yes, that’s what I meant. I have looked for examples of some algorithms, and I found EcoBliSS (Schobben and Sommen). My resources are very limited and I can’t seem to find any other examples. Which products actually use MC-AEC with BSS, if any? I am sorry if I am making no sense, I am still a student trying to do a paper on the topic, and it is a bit out of my field.

    Thanks again,

    Ameirah

  7. David,
    I’m glad to discover this work. I’m currently doing
    line echo cancellation via dedicated dsp hardware and
    dsp assembly language software. However, I have a new
    application where I need to do x86 host based echo
    cancellation using a high level language like C.

    After reviewing the info available from your website,
    I would like to offer the following comments/suggestions
    for possible improvements for x86 windows platform users
    like myself:

    1. Give the visual studio VC++ 6.0 source project for
    the libspandsp.dll. All I should have to do is open the
    project.dsp and compile it. Maybe it’s there already and
    I missed it.

    2. There should be a block diagram of your echo canceller
    implementation similar to the Messerschmitt app note.

    3. There should be some performance info showing mips
    (or cpu usage) per echo channel under clearly stated
    “typical” conditions for all platforms supported.
    Also, what is the memory footprint required per channel?

    4. Is your implementation thread safe for multi-threaded
    x86 apps?

    5. There should be a table showing some basic G.168
    requirements and the oslec performance, again similar
    to the Messerschmitt app note.

    Well, there you have it. I really didn’t mean to write an
    epistle – but I hope you will view this as some positive
    feedback. It would be greatly appreciated if you could
    respond to some or all of my points in the very near future
    as I’m immersed in this work now.

    Thanks in advance for your consideration.

    Regards,

    Bill Salibrici

  8. David,
    I’m sorry I forgot a couple of points in my prior post
    today as follows:

    A. There should be a Setup.exe for a simple install for
    x86 windows users like myself. The InstallShield package
    could be useful for this.

    B. A simple example.c to illustrate basic echo cancel usage
    for a single channel would be very helpful for a newcomer
    to what you are doing.

    Thanks for your patience.

    Regards,

    Bill Salibrici

  9. Hi Bill,

    Thanks for your quality comments and suggestions. Some great ideas there. I can comment/answer some of your list now, others I will think about for the future.

    Re (1) and (A) these days I am a 100% Linux guy so not particularly interested in maintaining a Windows version. However if anyone else wants to address Windows support of Oslec I am happy to link to it, or publish the results on my site. Just out of curiosity – what sort of telephony work is going on in Windows these days that would need echo cancellation?

    Re (3) typical CPU usage is 20 MIPs/ch for a 32ms tail on an x86 with MMX enabled, and about 20 MIPs/ch on a Blackfin for a 16ms tail including all the zaptel overheads. So on a 3 GHz x86 you get 3E9/20E6 = 150 channels, although I haven’t pushed it that far myself.

    Re (4) yes it is thread safe, several people are using it for multiple T1/E1 spans.

    (5) is a good idea, it actually outputs a table similar to this as part of the test scripts.

    Re (B) good idea, the closest to this right now is the spandsp/tests/echo_test.c program, in say –file mode. Or perhaps the user/speedtest.c

    I agree – much of what you ask for should be clearly stated some where in a specs section on the oslec web page.

    Thanks,

    David

  10. David,
    I have discovered a memory leak for the echo canceller.
    echo_can_create does two invocations of fir16_create.
    However, echo_can_free calls fir16_free only once.

    I added the following line to echo_can_free and it seemed
    to cure the problem:

    void echo_can_free(echo_can_state_t *ec)
    {
        int i;

        fir16_free(&ec->fir_state);
        fir16_free(&ec->fir_state_bg); // Add this line [wjs 19july07]
        for (i = 0; i < 2; i++)
            free(ec->fir_taps16[i]);
        free(ec->snapshot);
        free(ec);
    }

    Hopefully you have a place where updates are collected and
    you will notify us about future releases that contain bug
    fixes and /or feature enhancements.

    Thanks again for your good work in a difficult area.

    –Bill

  11. Hi David,

    I am a frequent visitor to your blog, hoping to see changes in your echo.c file. I am thankful to you; you have done amazing work and it is of great help to me.

    I intended to implement an echo canceller on an FPGA. Your blog gave me a quick start on that, and I am pretty much done translating the algorithm you wrote in C to the FPGA. Now I am trying to implement the same algorithm for a 128ms tail, but the problem I face is that I am not able to carry the arithmetic and logical operations you perform for 256 taps over to 1024 taps. I wanted to ask you what is base for your adding certain values and shifting values for 256 taps in your C code. I tried to work out some kind of relation between the number of taps and the arithmetic and logical operations you have used, but I could not. If I could understand that, I would be able to implement the same thing for 1024 taps.

    It would be great if you can help me out on this….

    Thank you,
    Mahesh

    mahesh@cem-solutions.net

  12. Hi Mahesh,

    Thanks for your kind words, and I am glad that it is helpful to you :-)

    The FPGA implementation sounds interesting. To maximise your chances of success I would suggest you make sure that your FPGA code is “bit exact” with the C simulation using several test vectors.

    128ms presents some problems. For stable adaption the convergence rate tends to be slower for longer tails. I found that with >32ms tails Oslec would not converge fast enough to be useful.

    The solution many people use is a “sparse” tail, where they just adapt short (say 128 tap, 16ms) tails in the region of the tail where the echo is estimated to be. They estimate the position of the echo using an open loop method such as cross correlation. Sometimes they use 2 or more 16 ms sections in the 128ms span to cope with more than one echo source.
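
    As a rough illustration of the open loop delay estimate mentioned above, the sketch below picks the lag at which the cross correlation between the transmit and receive signals is largest in magnitude. This is not code from Oslec; the function name estimate_echo_delay and its parameters are assumptions made for this example, and a practical version would run incrementally rather than over a whole buffer.

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Return the lag (in samples, 0..max_lag) at which rx best matches a
     * delayed copy of tx. The magnitude of the correlation is used because
     * the echo may be inverted relative to the transmitted signal. */
    int estimate_echo_delay(const int16_t *tx, const int16_t *rx,
                            size_t len, int max_lag)
    {
        int best_lag = 0;
        int64_t best_mag = 0;

        for (int lag = 0; lag <= max_lag; lag++) {
            int64_t sum = 0;
            for (size_t n = (size_t)lag; n < len; n++)
                sum += (int64_t)tx[n - (size_t)lag] * rx[n];

            int64_t mag = sum < 0 ? -sum : sum;
            if (mag > best_mag) {
                best_mag = mag;
                best_lag = lag;
            }
        }
        return best_lag;
    }
    ```

    A short adaptive filter (say 128 taps) could then be positioned to start near the estimated lag instead of covering the whole 128ms span.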

    Sorry, I do not quite understand your question “I wanted to ask you what is base for your adding certain values and shifting values for 256 taps in your C code”? Could you explain it another way please?

    Thanks,

    David

  13. Hi David,

    Thank you for your effort to help me, i appreciate it.

    You were right, “bit exact” was a problem I faced while trying to convert the C program to Verilog. I had to understand every line of your code to transfer it efficiently into Verilog, and I faced a couple of problems while doing that.

    I am planning to give the “sparse” tail method a shot. If you have some reading material which could help me out on this, would you please forward it to my email (mahesh@cem-solutions.net).

    what i meant is, in your code the line

    ec->Lrx = (ec->Lrxacc + (1<<4)) >> 5;

    why exactly add 16 and divide it by 32 ( i was not able to draw a relation between numbers and with taps)

    ec->Lbgn = (ec->Lbgn_acc + (1<<11)) >> 12;

    “” same here ???

    Thank you,
    Mahesh

  14. Hi Mahesh,

    Good work on taking the time to make your Verilog bit exact – it is much better to spend the time now being careful than get stuck in painful debugging later on! FYI I have written more about bit exact DSP techniques in another post. This post also pokes a bit of fun at typical company management :-)

    Re the sparse tail method I have seen some application notes from Texas Instruments for Acoustic Echo Cancellers that give detailed explanations. These app notes are aimed at TI DSPs and Acoustic rather than line echo cancellation however the principles are the same. I suggest using Oslec in simulation mode to test the sparse techniques, just introduce an extra pure delay component into the echo path to give the delay estimation a work out. Aim to pass the same G.168 tests.

    OK the code you refer to effectively adds 0.5 of a LSB before truncation, which gives more accurate rounding. It is like saying:

    ec->Lrx = floor(ec->Lrxacc/32 + 0.5);
    ec->Lrx = floor(ec->Lrxacc/32 + (32/32)*0.5);
    ec->Lrx = floor((ec->Lrxacc+16)/32);

    If we rounded without adding the 0.5LSB you would get some bias which upsets some algorithms, like IIR filters (e.g. they may not track down to zero with 0 input) and NLMS filters where the 0.5LSB is a significant quantity compared to the correction being applied at each step.
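
    The difference between the two approaches can be seen with a small sketch. The function names scale_truncate and scale_round are made up for this example, and only non-negative inputs are considered (right shifting a negative int is implementation defined in C).

    ```c
    /* Divide by 32 by shifting alone: the fraction is simply discarded,
     * so results are biased downwards by half an LSB on average. */
    int scale_truncate(int acc)
    {
        return acc >> 5;
    }

    /* Add half an LSB (16 = 1<<4) before shifting: the result is
     * rounded to the nearest integer instead. */
    int scale_round(int acc)
    {
        return (acc + (1 << 4)) >> 5;
    }
    ```

    For example an accumulator value of 31 (just under 1.0 after scaling) truncates to 0 but rounds to 1, while 15 (just under 0.5) gives 0 either way.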

    Cheers,

    David

  15. “This works surprisingly well, even very weird background noise like a lawnmower working in my backyard came through fine and I still couldn’t hear any echo”

    No need to be surprised. If you can’t hear any echo, that only shows that the echo canceller at the other end is working well.

    Your echo canceller is supposed to cancel echo for the far-end party and vice versa: the far-end echo canceller needs to cancel echo for you (the near end). Ask the party at the far end whether they hear echo. The presence of echo at the far end is a sign that your echo canceller is not working well.

    It seems to me like a misunderstanding of how an echo canceller is supposed to work!

  16. Hello,
    I would like to ask you about the algorithm for subtraction of the telephone echo.
    Thanks.
    I don’t want to publish my email.
    CAT UYEN LE

  17. Hi David
    I don’t understand the blackfin optimization of lms_adapt_bg (as I’m trying to do such a thing on a C54x). In fact, the phist pointer seems to be out of bounds of the history table. It starts at curr_pos, but increments “taps” times, so I think it overflows. It does work only if phist is incremented modulo taps. Or is there missing a test with j within the loop ? (j is declared and initialized but not used at all).
    Thanks for all your work !
    Guillaume

  18. Hi Guillaume,

    In fir.h we allocate 2*taps for fir->history then with each input make two copies of the input sample:

    fir->history[fir->curr_pos] = sample;
    fir->history[fir->curr_pos + fir->taps] = sample;

    A circular buffer would be more elegant (less memory).
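
    A minimal sketch of this doubled history trick is shown below. It is an illustration rather than a copy of fir.h: the names fir_hist, fir_hist_init and fir_hist_filter, the fixed TAPS value and the choice of a decrementing write index are all assumptions.

    ```c
    #include <stdint.h>
    #include <string.h>

    enum { TAPS = 4 };

    typedef struct {
        int16_t history[2 * TAPS];  /* twice the filter length */
        int curr_pos;               /* write index, always in 0..TAPS-1 */
    } fir_hist;

    void fir_hist_init(fir_hist *f)
    {
        memset(f->history, 0, sizeof(f->history));
        f->curr_pos = TAPS - 1;
    }

    /* Store the new sample twice, TAPS apart. The TAPS entries starting
     * at curr_pos are then always the newest-first history, laid out
     * contiguously, so the inner loop needs no modulo or wrap test. */
    int32_t fir_hist_filter(fir_hist *f, const int16_t coeffs[TAPS],
                            int16_t sample)
    {
        f->history[f->curr_pos] = sample;
        f->history[f->curr_pos + TAPS] = sample;

        const int16_t *window = &f->history[f->curr_pos];
        int32_t y = 0;
        for (int i = 0; i < TAPS; i++)
            y += (int32_t)coeffs[i] * window[i];

        /* step the write index backwards, wrapping at zero */
        if (f->curr_pos == 0)
            f->curr_pos = TAPS;
        f->curr_pos--;
        return y;
    }
    ```

    The read pointer walks up to TAPS entries past curr_pos, which stays inside the 2*TAPS allocation. The cost is double the memory and one extra store per sample.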

    – David

  19. Hi David,

    I must say that your blog/website was one of the things that inspired us to start myopen, an open-hardware myoelectric processor. (still in development! based on blackfin processor!)

    If you need, want, or are curious about an assembly version of LMS, I’ve written one for blackfin to cancel 50/60Hz noise (or anything else, e.g. echo) in the EMG signals:
    http://code.google.com/p/myopen/source/browse/trunk/firmware/lms.asm
    It’s well tested, works reliably, and is as fast as my ability permits.

    :-)
    Tim

  20. Good to see you have been inspired to do some open hardware work Tim. Sure, I’ll take a look at your LMS code.

    – David

  21. Hi David,

    I have the following install.

    Analogue/VOIP extensions <-> Asterisk <-> mISDN (hfcpci) <-> PSTN

    I have been struggling with echo on some calls since the beginning. Note that these calls were always to the same people on the PSTN side (about 10%), and any VOIP or Analogue extension inside. Also the same people on the PSTN always echoed.

    Changing to OSLEC (from MG2) on mISDN did not work. Even with 32ms tail. I then tried SoftEcho with 128ms and immediately echo was gone.

    I believe that I am a test case that proves that 128ms tails are necessary.
    If you are interested in wav files or any more info, please let me know.

    Johan

  22. Hi Johan,

    Thanks for your test report.

    The mISDN driver has a bug that introduces random and variable delay in the echo path. An echo canceller with a 128ms tail can cope with this bug, Oslec cannot.

    My recommendation is to fix the bug in mISDN, I am not inclined to modify Oslec for 128ms tails to cope with a bug in one driver. Unfortunately it’s been a few years, and no one has progressed this issue with mISDN. So it still doesn’t have a free echo canceller.

    Some notes on the problem here.

    If you think about it, an ISDN call between two ISDN phones must have very low delay, else echo cancellation would be required on every ISDN telephone. If the ISDN path really had 128ms echo, you wouldn’t be able to use analog (FXS) type phones anywhere in the network.

    Cheers,

    David

  23. Peter,

    Just to satisfy my own curiosity….

    Does this mean that if I add Peter Schlaile’s mISDN_delay patch, OSLEC will work?

    Or will it not because of the fact that the delay is “random and variable” ?

    Thanks,

    Johan

Comments are closed.