Pillai,S U, Shim, T.I., Batalama, S N, Kazakos, D, Daum, F. "Spectral Estimation and Modeling The electrical Engineering Handbook Ed. Richard C. dorf Boca Raton CRC Press llc. 2000
Pillai, S.U., Shim, T.I., Batalama, S.N., Kazakos, D., Daum, F. “Spectral Estimation and Modeling” The Electrical Engineering Handbook Ed. Richard C. Dorf Boca Raton: CRC Press LLC, 2000
16 Spectral estimation S Unnikrishna pillai Polytechnic University and modeling Theodore I Shim lan. batalama 16.1 Spectral Analysis niversity of New York Historical Perspective. Modern Spectral Analysis Dimitri Kazakos Bayesian Estimation. Mean-Square Estimation. Minimax University of Southwestern Estimation. Maximum Likelihood Estimation Other Parameter Fred Daum 16.3 Kalman Filtering Kalman Filter Equations. Kalman Filter Examples. Extended Kalman Filter· Nonlinear Filters· Practical issues 16.1 Spectral Analysis S. Unnikrishna pillai and theodore l shim Historical Perspective Modern spectral analysis dates back at least to Sir Isaac Newton [Newton, 1671, whose prism experiments with sunlight led him to discover that each color represented a particular wavelength of that the sunlight contained all wavelengths. Newton used the word spectrum, a variant of the Latin word specter, to describe the band of visible light colors. vibrating string can be expressed as an infinite sum containing weighted sine and cosine terms. Later, the Felly 9 In the early eighteenth century, Bernoulli discovered that the solution to the wave equation describing engineer Joseph Fourier in his Analytical Theory of Heat[ Fourier, 1822] extended Bernoullis wave equation results to arbitrary periodic functions that might contain a finite number of jump discontinuities. Thus, for oftea 70, if f(0=f(t+ To)for all t, then f(o) represents a periodic signal with period To and in the case al signals, it has the Fourier series representation f(t)=Ao+2>(Ak cos koot+ Bk sin koot) where @o= 2T/To and A f(t)cos ko Cot dt, BKT f(t) sin koot dt with Ao representing the dc term(k= 0). Moreover, the infinite sum on the right-hand side of the above expression converges to [f(to)+f(to)J/2. The total power P of the periodic function satisfies the relation c 2000 by CRC Press LLC
© 2000 by CRC Press LLC 16 Spectral Estimation and Modeling 16.1 Spectral Analysis Historical Perspective • Modern Spectral Analysis 16.2 Parameter Estimation Bayesian Estimation • Mean-Square Estimation • Minimax Estimation • Maximum Likelihood Estimation • Other Parameter Estimation Schemes 16.3 Kalman Filtering Kalman Filter Equations • Kalman Filter Examples • Extended Kalman Filter • Nonlinear Filters • Practical Issues 16.1 Spectral Analysis S. Unnikrishna Pillai and Theodore I. Shim Historical Perspective Modern spectral analysis dates back at least to Sir Isaac Newton [Newton, 1671], whose prism experiments with sunlight led him to discover that each color represented a particular wavelength of light and that the sunlight contained all wavelengths. Newton used the word spectrum, a variant of the Latin word specter, to describe the band of visible light colors. In the early eighteenth century, Bernoulli discovered that the solution to the wave equation describing a vibrating string can be expressed as an infinite sum containing weighted sine and cosine terms. Later, the French engineer Joseph Fourier in his Analytical Theory of Heat [Fourier, 1822] extended Bernoulli’s wave equation results to arbitrary periodic functions that might contain a finite number of jump discontinuities. Thus, for some T0 > 0, if f(t) = f(t + T0) for all t, then f(t) represents a periodic signal with period T0 and in the case of real signals, it has the Fourier series representation where ω0 = 2π/T0, and with A0 representing the dc term (k = 0). Moreover, the infinite sum on the right-hand side of the above expression converges to [f(t–0) + f(t+0)]/2. The total power P of the periodic function satisfies the relation ft A A k t B k t k k k ( ) ( cos sin ) =+ + = ∞ 0 00 ∑ 1 2 ω ω A T f t k t dt B T f t k t dt k k T T = = ∫ ∫ 1 1 0 0 0 0 0 0 0 0 ( ) cos , ( ) sin ω ω S. Unnikrishna Pillai Polytechnic University Theodore I. Shim Polytechnic University Stella N. Batalama State University of New York at Buffalo Dimitri Kazakos University of Southwestern Louisiana Fred Daum Raytheon Company
p=fPd=A2+2∑(A+B) implying that the total power is distributed only among the dc term, the fundamental frequency 2/ and its harmonics koo, k> 1, with 2(Af+Bk) representing the power contained at the harmonic koo. For every periodic signal with finite power, since A,+0, Bx0, eventually the overharmonics become of decreasing The British physicist Schuster [Schuster, 1898] used this observation to suggest that the partial power Pk 2(Ak Bk) at frequency koos k=0-o0, be displayed as the spectrum. Schuster termed this method the periodogram, and information over a multiple of periods was used to compute the Fourier coefficients and/or to smooth the periodogram, since depending on the starting time, the periodogram may contain irregular and spurious peaks. a notable exception to periodogram was the linear regression analysis introduced by the British statistician Yule [Yule, 1927] to obtain a more accurate description of the periodicities in noisy data. Because he sampled periodic process x(k)=cos koo T containing a single harmonic component satisfies the recursive relation where a=2 cos @. T represents the harmonic component, its noisy version x(k)+ n(k)satisfies the recursion x(k)=ax(k-1)-x(k-2)+n(k) Yule interpreted this time series model as a recursive harmonic process driven by a ner generalized the above form to determine the periodicity in the sequence of sunspot numbers. Yule furt recursion to x(k)=ax(k-1)+bx(k-2)+n(k) where a and b are arbitrary, to describe a truly autoregressive process and since for the right choice of a, b the least-square solution to the above autoregressive equation is a damped sinusoid, this generalization forms the basis for the modern day parametric methods Modern Spectral analysis Norbert Wiener's classic work on Generalized Harmonic Analysis [ Wiener, 1930] gave random processes a firm statistical foundation, and with the notion of ensemble average several key concepts were then introduced. The formalization of modern day probability theory by Kolmogorov and his school also played an indispensable part in this development. Thus, if xt) represents a continuous-time stochastic (random)process, then for every fixed t, it behaves like a random variable with some probability density function f(x, t). The ensemble average or expected value of the process is given by (t)=E[x(t)]=xf (x,t)dx and the statistical correlation between two time instants t and s of the random process is described through its autocorrelation function Rx(t1,t2)=E[x(t1)x*(t2)]= x1x2f1x2(x,x2,1,t2 (t2,t1) c 2000 by CRC Press LLC
© 2000 by CRC Press LLC implying that the total power is distributed only among the dc term, the fundamental frequency w0 = 2p/T0 and its harmonics kw0, k ³ 1, with 2(A2 k + B2 k ) representing the power contained at the harmonic kw0. For every periodic signal with finite power, since Ak Æ 0, Bk Æ 0, eventually the overharmonics become of decreasing importance. The British physicist Schuster [Schuster, 1898] used this observation to suggest that the partial power Pk = 2(A2 k + B2 k ) at frequency kw0, k = 0 Æ •, be displayed as the spectrum. Schuster termed this method the periodogram, and information over a multiple of periods was used to compute the Fourier coefficients and/or to smooth the periodogram, since depending on the starting time, the periodogram may contain irregular and spurious peaks. A notable exception to periodogram was the linear regression analysis introduced by the British statistician Yule [Yule, 1927] to obtain a more accurate description of the periodicities in noisy data. Because the sampled periodic process x(k) = cos kw0T containing a single harmonic component satisfies the recursive relation x(k) = ax(k – 1) – x(k – 2) where a = 2 cos w0T represents the harmonic component, its noisy version x(k) + n(k) satisfies the recursion x(k) = ax(k – 1) – x(k – 2) + n(k) Yule interpreted this time series model as a recursive harmonic process driven by a noise process and used this form to determine the periodicity in the sequence of sunspot numbers. Yule further generalized the above recursion to x(k) = ax(k – 1) + bx(k – 2) + n(k) where a and b are arbitrary, to describe a truly autoregressive process and since for the right choice of a, b the least-square solution to the above autoregressive equation is a damped sinusoid, this generalization forms the basis for the modern day parametric methods. Modern Spectral Analysis Norbert Wiener’s classic work on Generalized Harmonic Analysis [Wiener, 1930] gave random processes a firm statistical foundation, and with the notion of ensemble average several key concepts were then introduced. The formalization of modern day probability theory by Kolmogorov and his school also played an indispensable part in this development. Thus, if x(t) represents a continuous-time stochastic (random) process, then for every fixed t, it behaves like a random variable with some probability density function fx(x,t). The ensemble average or expected value of the process is given by and the statistical correlation between two time instants t1 and t2 of the random process is described through its autocorrelation function P T f t dt A Ak k B k T = = + + = • Ú Â 1 2 0 2 0 2 2 2 1 0 0 * ( )* ( ) mx x (t) = = E[x(t)] x f (x, t) dx -• • Ú R t t E x t x t x x f x x t t dx dx R t t xx x x xx ( , ) [ ( ) * ( )] * ( , , , ) * ( , ) – – 1 2 1 2 1 2 1 2 1 2 1 2 2 1 1 2 = = = • • -• • Ú Ú
where fx(x, x2,t, t2)represents the joint probability density function of the random variable x=xt,) and x2=xt2)and*denotes the complex conjugate transpose in general. Processes with autocorrelation functions that depend only upon the difference of the time intervals t, and t, are known as wide sense stationary processes Thus, if x(t) is wide sense stationary, then ELx(t+ t)x(t]=Ru(t)=ri(t) To obtain the distribution of power versus frequency in the case of a stochastic process, one can make use the Fourier transform based on a finite segment of the data. Letting 0)= x(t)e / u dt represent the power contained in a typical realization over the interval (-T D), its ensemble average value as T-o represents the true power contained at frequency o. Thus, for wide sense stationary processes So)=imEP2(o)=im「「R2(t1-1)lmth2 (16.1) lim| Rxx(t)1 Rx(eodt≥0 27 Moreover, the inverse relation gives R2(T)=1∫ S(o)e/o da (16.2) and hence R2(0)=Ex=p=1∫so)do Thus S(o)represents the power spectral density and from Eqs. (16. 1)and(16.2), the power spectral density and the autocorrelation function form a Fourier transform pair, the well-known Wiener-Khinchin theorem If xkD) represents a discrete-time wide sense stationary stochastic process, then Elx(on+k)])x'(nT)) and the power spectral density is given by or in terms of the normalized variable e=ot. c 2000 by CRC Press LLC
© 2000 by CRC Press LLC where fx1x2(x1 , x2 , t1 , t2 ) represents the joint probability density function of the random variable x1 = x(t1) and x2 = x(t2) and * denotes the complex conjugate transpose in general. Processes with autocorrelation functions that depend only upon the difference of the time intervals t1 and t2 are known as wide sense stationary processes. Thus, if x(t) is wide sense stationary, then E[x(t + t)x*(t)] = Rxx(t) = R* xx (–t) To obtain the distribution of power versus frequency in the case of a stochastic process, one can make use of the Fourier transform based on a finite segment of the data. Letting represent the power contained in a typical realization over the interval (–T, T), its ensemble average value as T Æ • represents the true power contained at frequency w. Thus, for wide sense stationary processes (16.1) Moreover, the inverse relation gives (16.2) and hence Thus S(w) represents the power spectral density and from Eqs. (16.1) and (16.2), the power spectral density and the autocorrelation function form a Fourier transform pair, the well-known Wiener–Khinchin theorem. If x(kT) represents a discrete-time wide sense stationary stochastic process, then and the power spectral density is given by or in terms of the normalized variable q = wT, P T x t e dt T j t T T ( ) w ( ) w = - -Ú 1 2 2 S E P R t t e dt dt R T e d R e d T T T xx T T T T jtt T xx T T j xx j ( ) lim [ ( )] lim ( – ) lim ( ) – ( ) – – – ( – ) – – w w t t t t t w wt wt = = = Ê Ë Á ˆ ¯ ˜ = ³ Æ • Æ • Æ • - -• • Ú Ú Ú Ú 1 2 1 2 2 2 1 2 1 2 0 * * R S e d xx j ( )t ( ) p w w wt = -• • Ú 1 2 R E x t P S d xx ( ) [ ] ( ) ( ) – 0 1 2 2 = = = • • Ú * * p w w r E x n k T x nT r k k = + ( ) = - { ( ) * ( )} * S r ek j k T k (w) w = - =-• • Â
S(6)= =S(6+2πk)≥0 (16.3) and the inverse relation gives the autocorrelations to be S(e)e/de = r-k Thus, the power spectral density of a discrete-time process is periodic. Such a process can be obtained by sampling a continuous-time process at t=kT, K=0-o0, and if the original continuous-time process is band limited with a two-sided bandwidth equal to 20,=2T/T, then the set of discrete samples so obtained is equivalent to the original process in a mean-square sense As Schur [Schur, 1917] has observed, for discrete-time stationary processes the nonnegativity of the power pectrum is equivalent to the nonnegative definiteness of the Hermitian Toeplitz matrices, i.e r rk S(6)≥0台 T≥0,k=0 To If xnT)is the output of a discrete-time linear time-invariant causal system driven by w(nt), then we have the w(m)-H()=∑MkT)1→xn)=∑MkTm(m-m)165 In the case of a stationary input, the output is also stationary, and its power spectral density is given by S6)=|Hef)2S() (16.6) then Sm(0)=0and the power spectral density of the input process. If the input is a white noise process, S6)=oH(e)2 Clearly if H(z)is rational, so is S,(e). Conversely, given a power spectral density
© 2000 by CRC Press LLC (16.3) and the inverse relation gives the autocorrelations to be Thus, the power spectral density of a discrete-time process is periodic. Such a process can be obtained by sampling a continuous-time process at t = kT, *k* = 0 Æ •, and if the original continuous-time process is bandlimited with a two-sided bandwidth equal to 2wb = 2p/T, then the set of discrete samples so obtained is equivalent to the original process in a mean-square sense. As Schur [Schur, 1917] has observed, for discrete-time stationary processes the nonnegativity of the power spectrum is equivalent to the nonnegative definiteness of the Hermitian Toeplitz matrices, i.e., (16.4) If x(nT) is the output of a discrete-time linear time-invariant causal system driven by w(nT), then we have the following representation: (16.5) In the case of a stationary input, the output is also stationary, and its power spectral density is given by Sx(q) = *H(ejq)* 2Sw(q) (16.6) where Sw(q) represents the power spectral density of the input process. If the input is a white noise process, then Sw(q) = s2 and Sx(q) = s2 *H(ejq)* 2 Clearly if H(z) is rational, so is Sx(q). Conversely, given a power spectral density (16.7) that satisfies the integrability condition (16.8) S r e S k k j k k ( ) q (q p ) q = = + ³ - =-• • Â 2 0 r S e d r k j k = = -k -Ú 1 2p q q q p p ( ) * S r r r r r r r r r k k k k k k k ( ) ... * ... * ... * , – – q ³ ¤ = Ê Ë Á Á Á Á Á ˆ ¯ ˜ ˜ ˜ ˜ ˜ 0 = ³ 0 = 0 Æ • 0 1 1 0 1 1 0 T T M M O M w nT H z h kT z x nT h kT w n k T k k k ( ) Æ = ( ) ( ) Æ ( ) = ( ) (( - ) ) = • = • Â Â 0 0 S r e x k j k k (q) q = ³ =-• • Â 0 S d x (q q ) p p < • -Ú