For my latest Codec 2 brainstorms I need to generate a phase spectrum from a magnitude spectrum. I’m using cepstral/minimum phase techniques. Despite plenty of theory and even code on the Internet, it took me a while to get something working. So I thought I’d post a worked example here. I must admit the theory still makes my eyes glaze over. However, a working demo is a great start to understanding the theory if you’re even nerdier than me.
Codec 2 just transmits the magnitude of the speech spectrum to the decoder. The phases are estimated at the encoder but take too many bits to encode, and aren’t that important for communications-quality speech. So we toss them away and reconstruct them at the decoder using some sort of rule-based approach. I’m messing about with a new way of modeling the speech spectrum, so I needed a new way to generate the phase spectra at the decoder.
Here is the mag_to_phase.m function, which is a slightly modified version of this Octave code that I found in my meanderings on the InterWebs. I think there is also a Matlab/Octave function called mps.m which does a similar job.
I decided to test it using a 10th order LPC synthesis filter. These filters are known to be minimum phase: a stable all-pole filter has all of its poles inside the unit circle and no zeros outside it. So if the algorithm is working, it will generate exactly the same phase spectrum as the filter itself.
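To make the test concrete, here is a minimal Python sketch of the cepstral approach, checked against an all-pole filter. It is not the mag_to_phase.m code from this post: the helper names (`dft`, `mag_to_phase_sketch`, `allpole_response`) are made up for illustration, the filter is an arbitrary 2nd order example rather than a 10th order LPC fit to speech, and a slow naive DFT stands in for an FFT so the example needs nothing beyond the standard library.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT, adequate for a short demo (no FFT library assumed)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    scale = 1.0 / n if inverse else 1.0
    return [
        scale * sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def mag_to_phase_sketch(mag):
    """Estimate the minimum phase (radians) from magnitudes sampled on a full,
    even-length DFT grid. Hypothetical helper, not the post's mag_to_phase.m."""
    n = len(mag)
    # Real cepstrum: inverse DFT of the log magnitude spectrum.
    cep = [c.real for c in dft([math.log(m) for m in mag], inverse=True)]
    # Fold onto positive time: keep c[0] and c[N/2], double 1..N/2-1, zero the rest.
    folded = [0.0] * n
    folded[0] = cep[0]
    folded[n // 2] = cep[n // 2]
    for i in range(1, n // 2):
        folded[i] = 2.0 * cep[i]
    # The DFT of the folded cepstrum is the log of the minimum phase spectrum,
    # so its imaginary part is the phase.
    return [z.imag for z in dft(folded)]


def allpole_response(a, n):
    """Frequency response of 1/A(z), with A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ..."""
    resp = []
    for k in range(n):
        w = 2.0 * math.pi * k / n
        A = 1.0 + sum(ai * cmath.exp(-1j * w * (i + 1)) for i, ai in enumerate(a))
        resp.append(1.0 / A)
    return resp


# Illustrative filter: poles at 0.8*exp(+/-j*pi/4), safely inside the unit circle.
r, theta = 0.8, math.pi / 4
a = [-2.0 * r * math.cos(theta), r * r]
H = allpole_response(a, 128)

measured = [cmath.phase(h) for h in H]                 # phase of the filter itself
estimated = mag_to_phase_sketch([abs(h) for h in H])   # phase from magnitude only
max_err = max(abs(m - e) for m, e in zip(measured, estimated))
print("max phase error (rad):", max_err)
```

The error is not exactly zero because sampling the log magnitude on a finite grid aliases the cepstrum in time. It shrinks as the DFT gets longer or the poles move further inside the unit circle.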
So we start with 40ms of speech:
Then we find the phase spectrum (bottom) given the magnitude spectrum (top):
On the bottom, the green line is the measured phase spectrum of the filter, and the blue line is what the mag_to_phase.m function came up with. They are identical; I’ve just offset them by 0.5 rads on the plot. So it works. Yayyyy! We can find a minimum phase spectrum from just the magnitude spectrum of a filter.
This is the impulse response, which the algorithm spits out as an intermediate product. One interpretation of minimum phase (so I’m told) is that the energy is all collected near the start of the pulse:
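A small Python sketch of that "energy near the start" interpretation, using a made-up 3-tap example rather than the speech data in this post: a minimum phase sequence and its time-reversed copy share the same magnitude spectrum, but the running energy of the minimum phase version builds up sooner. The helper names `convolve` and `partial_energy` are hypothetical.

```python
def convolve(x, y):
    """Plain linear convolution of two short sequences."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out


def partial_energy(h):
    """Running sum of squared samples: the energy collected up to each index."""
    total, out = 0.0, []
    for v in h:
        total += v * v
        out.append(total)
    return out


# Zeros at -0.5 and 0.4 are inside the unit circle, so this sequence is minimum phase.
h_min = convolve([1.0, 0.5], [1.0, -0.4])
# Time reversal keeps the magnitude spectrum but reflects the zeros outside the circle.
h_max = h_min[::-1]

e_min = partial_energy(h_min)
e_max = partial_energy(h_max)
for n, (em, ex) in enumerate(zip(e_min, e_max)):
    print(n, round(em, 4), round(ex, 4))
```

Both sequences end with the same total energy. The difference is only in how early it arrives.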
As the DFT is cyclical the bit on the right is actually concatenated with the bit on the left to make one continuous pulse centered on time = 0. All a bit “Dr Who” I know but this is DSP after all! With a bit of imagination you can see it looks like one period of the original input speech in the first plot above.
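The wrap-around can be undone for plotting by rotating the sequence so that time = 0 sits in the middle. Here is a minimal Python sketch with a made-up 8-sample pulse. The function name `center_on_zero` is hypothetical, and the rotation is the same idea as the `fftshift` helper found in Octave and NumPy.

```python
def center_on_zero(x):
    """Rotate a DFT-ordered sequence so that index 0 (time = 0) sits in the middle."""
    split = (len(x) + 1) // 2          # first index that represents negative time
    return x[split:] + x[:split]


# Made-up pulse in DFT order: the last two samples are really times -2 and -1.
pulse = [1.0, 0.6, 0.2, 0.0, 0.0, 0.0, 0.1, 0.4]
centered = center_on_zero(pulse)
times = [i - len(pulse) // 2 for i in range(len(pulse))]

for t, v in zip(times, centered):
    print(t, v)
```

After the rotation, the samples that sat at the far right of the DFT output appear just to the left of time = 0, which gives one continuous pulse.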