WiTrace: Centimeter-Level Passive Gesture Tracking Using OFDM signals

Lei Wang, Ke Sun, Haipeng Dai, Member, IEEE, Wei Wang, Member, IEEE, Kang Huang, Alex X. Liu, Senior Member, IEEE, Xiaoyu Wang, and Qing Gu, Member, IEEE

Abstract—Gesture tracking is a basic Human-Computer Interaction mechanism to control devices such as IoT and VR/AR devices. However, prior OFDM signal based systems focus on gesture recognition and provide results with insufficient accuracy, and thus cannot be applied to high-precision gesture tracking. In this paper, we propose a CSI based device-free gesture tracking system, called WiTrace, which leverages the CSI values extracted from OFDM signals to enable accurate gesture tracking. For 1D tracking, WiTrace derives the phase of the signals reflected by the hand from the composite signals, and measures the phase changes to obtain the movement distance. For 2D tracking, WiTrace proposes the first CSI based scheme to accurately estimate the initial position, and adopts a Kalman Filter based on the continuous Wiener process acceleration model to further filter out tracking noise. Our results show that WiTrace achieves an average accuracy of 6.23 cm for initial position estimation, and achieves cm-level accuracy, with average tracking errors of 1.46 cm and 2.09 cm for 1D tracking and 2D tracking, respectively.

Index Terms—CSI, Gesture Tracking

• L. Wang, K. Sun, H. Dai, W. Wang, K. Huang, X. Wang, Q. Gu are with the State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China (e-mail: wangl@smail.nju.edu.cn; kesun@smail.nju.edu.cn; haipengdai@nju.edu.cn; ww@nju.edu.cn; hkwany520@gmail.com; mg1633074@smail.nju.edu.cn; guq@nju.edu.cn).
• Alex X. Liu is with the State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China, and also with the Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824 USA (e-mail: alexliu@cse.msu.edu).
A preliminary version of this work appeared in the proceedings of IEEE SECON 2018 [1]. Manuscript received April 19, 2005; revised August 26, 2015. Corresponding author: Wei Wang.

1 INTRODUCTION

Gesture tracking is a basic Human-Computer Interaction mechanism to control not only electronic Internet of Things (IoT) devices but also VR/AR devices. In smart homes, gestures are recognized to change the channel of TVs or adjust the temperature of air conditioners. In VR/AR applications, users use gestures to interact with devices, such as typewriting in the air [2]. Gesture tracking can also be used for writing-in-the-air and gesture-based games. Recently, OFDM based WiFi signals have been widely used for passive sensing of gesture movement [3]–[5] due to their particular advantages. In comparison with vision based methods [6], [7], WiFi based approaches are not limited by lighting conditions and room layout, as WiFi signals are able to penetrate walls. Meanwhile, users do not need to wear devices [8], which is convenient and saves the extra cost of wearable devices.

Prior WiFi based gesture recognition systems extract features from reflected signals for different gestures [4], [9] and use machine learning methods to recognize gestures. Nevertheless, these methods provide results with insufficient accuracy and cannot be applied to high-precision gesture tracking. Existing WiFi based tracking schemes include WiDraw [10], Widar [11], Widar 2.0 [12], and QGesture [13]. WiDraw uses Angle-of-Arrival (AOA) measurements to achieve a tracking accuracy of 5 cm, which allows the user to draw in the air in densely deployed areas with more than 25 WiFi transmitters surrounding the user. Widar and Widar 2.0 are human tracking schemes which treat the human body as one single object with decimeter-level resolution. Our recent work, QGesture, uses phase information extracted from WiFi signals to track human hands. However, due to phase noise, QGesture has a limited accuracy of 5.5 cm at a distance of 2 m and needs to know the initial position before performing 2D tracking. The dominant technologies above are shown in Table 1.

Other Radio Frequency (RF) and acoustic signal based tracking schemes use localization technologies to track gestures. WiTrack [14], [15] proposes to use a specially designed Frequency-Modulated Continuous-Wave (FMCW) radar with a high bandwidth of 1.79 GHz to track human movement behind the wall with a resolution of 11 cm to 20 cm, which needs special hardware. Similarly, although acoustic tracking schemes [16]–[19] have high accuracy, these systems cannot serve as a remote control interface for home applications due to their limited working range.

In this paper, we propose WiTrace, a WiFi OFDM based device-free cm-level gesture tracking system. Our key idea is to use the Channel State Information (CSI) values of WiFi to track the hand with centimeter-level accuracy in 2D space. We utilize the fact that the phase changes of CSI values reflected by the hand are proportional to the propagation path length changes of the hand. Since the wavelength of 2.4 GHz WiFi signals is around 12.5 cm, hand movement of a few centimeters will significantly affect the CSI values. WiTrace uses Universal Software Radio Peripheral (USRP) to transmit and receive the Commercial-Off-the-Shelf (COTS) 802.11g signals with a carrier frequency of 2.4 GHz and a bandwidth of 20 MHz. For 1D tracking, WiTrace extracts the phase of the signals reflected by the hand from the composite signals, and measures the phase changes to obtain the movement distance. Furthermore, WiTrace uses one transmitter and two receivers to enable 2D tracking of the hand. We propose the first CSI based scheme to accurately estimate the initial position, which has a huge impact on the overall system performance. Furthermore, we adopt a Kalman Filter (KF) based method to filter out tracking noise.
TRANSACTIONS ON MOBILE COMPUTING, VOL. 17, NO. 10, OCTOBER 2018

Table 1. Comparison of different WiFi-based systems

System            Object        Granularity    Range          TX&RX
WiDraw [10]       Hand          5 cm           0.6 m          27
QGesture [13]     Hand          5.5 cm         2 m            3
Widar [11]        Human body    25 cm          0.8 ∼ 3.2 m    3
Widar 2.0 [12]    Human body    75 cm          8 m            2
Wikey [3]         Gesture       Recognition    4 m            5
WiFinger [4]      Gesture       Recognition    1 ∼ 4 m        5
Wigesture [9]     Gesture       Recognition    ≥ 2 m          ≥ 2
WiTrace           Gesture       2.09 cm        ≥ 3.5 m        3

WiTrace addresses three critical challenges. The first challenge is to achieve cm-level hand tracking accuracy over a large range based on WiFi signals. A prior WiFi based tracking scheme uses AOA to track the hand with a large number of transmitters within a range of 2 feet [10]. In contrast, we leverage the fact that the phase changes of the dynamic component of CSI are proportional to the path length changes caused by the object movement. By measuring and analyzing the phase changes, WiTrace achieves an average distance error of 1.46 cm when pushing the hand for 30 cm at a range of 1.2 m using omnidirectional antennas.

The second challenge is to separate the phase changes caused by the moving hand from the CSI components caused by the rest of the environment. The Signal-to-Noise Ratio (SNR), which represents the ratio of the power reflected by the target object to that reflected by other static objects, decreases at long distances. As a result, the phase changes caused by the moving hand can be easily contaminated by ambient interference, which makes it challenging to extract the phase changes from the mixed signals. To address this challenge, we apply a heuristic algorithm, Extracting Static Component (ESC), whose strength lies in its robustness to ambient interference. For the In-phase (I) or Quadrature (Q) component of CSI, we first find the nearby local maxima and minima using an empirical threshold. To wipe out noisy extreme points, we set a temporal threshold that is determined by the maximal Doppler frequency.
The third challenge is to estimate the initial position of the hand in 2D space. Although we can precisely measure the distance changes of hand movements, it is difficult to locate the absolute position of the hand without the initial hand location. Existing indoor localization schemes based on WiFi signals [20]–[22] can be used for initial location estimation. However, these systems only obtain a coarse, decimeter-level location of the human body, which is insufficient for gesture tracking. To address this challenge, we first estimate the coarse initial hand position based on the CSI phase difference across different subcarriers caused by hand movement. This coarse estimate narrows down the candidate region for the subsequent fine estimation step, so that the computational complexity of the fine estimation is significantly reduced. Then, we utilize the fact that the estimated trajectory differs for different initial positions. We use the results of two preamble gestures as fingerprints of the initial positions and combine the two directions to refine the initial position estimate. Our approach achieves an average accuracy of 6.23 cm for initial position estimation.

We implemented WiTrace using USRP transceivers. Our experimental results show that WiTrace estimates the initial hand position with an average error of 6.23 cm, and tracks hand movement with mean errors of 1.46 cm for 1D tracking and 2.09 cm for 2D tracking. The results also show that WiTrace reaches an overall mean direction error of 7.32 degrees across five different directions in the 2D case.

Figure 1. Illustration of multiple paths (labels in the figure: Transmitter, Receiver, Wall, Push, LOS B, NLOS A, NLOS C).

2 CSI PHASE MODEL

In this section, we describe the theoretical model of Channel State Information (CSI) regarding dynamic gesture movement.
Specifically, CSI estimates the channel properties of a communication link, which are described by the channel frequency response (CFR) at the $k$-th subcarrier frequency $f_k$ [23]. The CSI of the $k$-th subcarrier at time $t$ is the superposition of the responses of all transmission paths [24]:

$$\vec{H}_k(t) = \left( \sum_{i=1}^{K} \alpha^{k}_{i,t}\, e^{j\left(2\pi f_k d_i(t)/c + \phi_i\right)} \right) e^{j\psi(f_k, t)}, \tag{1}$$

where $K$ is the total number of paths, $\alpha^{k}_{i,t}$ is the attenuation coefficient of the $k$-th subcarrier, $d_i(t)$ is the length of path $i$, $c$ is the speed of the wireless signal, and $\phi_i$ is the initial phase caused by the time delay of the imperfect hardware. Additionally, traditional CSI measurements typically have a phase shift of $\psi(f_k, t)$, which is caused by the residual frequency offset due to non-synchronized clocks between the transceiver pair. In order to rule out these phase errors, we use an external clock [25] to connect the transmitter and the receiver in our system.

As shown in Figure 1, all of the paths can be divided into static paths, e.g., the wall and LoS path, and dynamic paths, e.g., the hand. For a static path $i$, the path length $d_i$ can be considered fixed during a short period. As a result, Eq. (1) can be rewritten as:

$$\vec{H}_k(t) = \vec{H}_k^{st} + \sum_{i \in P_d} \alpha^{k}_{i,t}\, e^{j\left(2\pi d_i(t)/\lambda_k + \phi_i\right)}, \tag{2}$$

where $\vec{H}_k^{st}$ is the sum of the CSI of the static paths, which is constant for a short duration, $P_d$ is the set of dynamic paths, and $\lambda_k = c/f_k$ is the wavelength for frequency $f_k$.

Suppose we can derive the phase change of path $i$, i.e., $\Delta\varphi_i$, where the phase $\varphi_i$ is $\varphi_i = 2\pi d_i(t)/\lambda_k + \phi_i$. Then the length change of dynamic path $i$ is given by:

$$\Delta d_i = \frac{\Delta\varphi_i\, \lambda_k}{2\pi}. \tag{3}$$

Finally, our goal is to measure the phase changes of the dynamic path caused by hand movement, and thereby determine the length change of the dynamic path to track the hand in the air.

3 CSI PHASE BASED DISTANCE MEASUREMENT

In this section, we propose a method to measure hand movement. Our measurement method contains four steps, as shown in Figure 2.
First, we apply the Hampel filter to remove the noise of
the CSI signal. After removing the noise, we can verify the CSI phase model by illustrating the CSI signal in two dimensions. Second, we use the variance of the CSI amplitude to detect the start of the movement. Third, we propose a heuristic algorithm to remove the static vector from the CSI signal. Finally, we transform the phase change of the CSI signal into the movement distance.

Figure 2. Processing flow for 1D tracking (blocks in the figure: receive CSI measurements, denoise CSI measurements, detect the movement with a Yes/No branch, remove static vector, measure moving distance).

3.1 CSI Signal Preprocessing

As shown in Figure 3(a), raw CSI signals (red curves) have large jitters as they contain various types of noise [26]. On one hand, there are many outliers in the CSI signal, which are mostly caused by wireless interference. On the other hand, the system hardware may generate high frequency noise [27]. As a result, we need to preprocess the I/Q components of the CSI signal to remove this noise. First, we apply the Hampel filter [28] to filter out the outliers that differ significantly from the other samples. The green curve in Figure 3(a) shows the result after applying the Hampel filter. Second, we utilize a moving average low-pass filter to further remove high frequency noise. The black curve in Figure 3(a) shows the signal after filtering. We then obtain the complex valued CSI measurements by using the I/Q components as the real/imaginary parts.

3.2 CSI Phase Model Verification

In Figure 3(b), we use a real world CSI measurement to illustrate how the CSI phase changes in our CSI phase model. During a short time period from 1.25 to 1.7 seconds, a user pushes his hand for 28 cm towards the receiving end. As the two-way path length change is $\Delta d = 28 \times 2$ cm and the wavelength $\lambda$ is 12.5 cm, we find that the CSI rotates clockwise by $\Delta d/\lambda = 28 \times 2/12.5 \approx 4.5$ cycles.
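The rotation count above follows from Eq. (3), and the relation can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it synthesizes a CSI trace from Eq. (2) with a single dynamic path, assumes that the static vector is known and constant (in practice it is neither), and uses hypothetical values for the static vector, the reflection amplitude, and the initial path length. The function names are ours and are not part of WiTrace.

```python
import cmath
import math

WAVELENGTH_M = 0.125  # approximate wavelength of 2.4 GHz WiFi, as stated in the text


def unwrap(phases):
    """Remove 2*pi jumps so that consecutive samples differ by at most pi."""
    out = [phases[0]]
    for p in phases[1:]:
        delta = p - out[-1]
        delta -= 2 * math.pi * round(delta / (2 * math.pi))
        out.append(out[-1] + delta)
    return out


def path_length_change(csi, static_vector, wavelength=WAVELENGTH_M):
    """Eq. (3): phase change of the dynamic vector -> path length change (metres)."""
    phases = unwrap([cmath.phase(h - static_vector) for h in csi])
    return (phases[-1] - phases[0]) * wavelength / (2 * math.pi)


# Synthetic push: the reflected path shortens by 2 * 28 cm over 450 samples.
n = 450
static_vector = 3.0 + 2.0j   # hypothetical; assumed known and constant
d0 = 2.4                     # hypothetical initial path length in metres
path = [d0 - 0.56 * i / (n - 1) for i in range(n)]
csi = [static_vector + 0.3 * cmath.exp(1j * 2 * math.pi * d / WAVELENGTH_M)
       for d in path]

delta_d = path_length_change(csi, static_vector)
cycles = abs(delta_d) / WAVELENGTH_M      # number of full rotations of the dynamic vector
hand_displacement = abs(delta_d) / 2      # two-way path, as in the text

print(f"path change {delta_d:.3f} m, {cycles:.2f} cycles, hand moved {hand_displacement:.3f} m")
```

Under these assumptions the recovered path length change is negative (the path shortens by 56 cm), which corresponds to about 4.48 rotations and a hand displacement of 28 cm. The sketch only works because the phase step between consecutive samples stays well below π; a real trace additionally needs the static vector to be estimated first.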
We further observe that the static vector, which corresponds to the static paths, is not constant during this period. This is mainly due to other slow changes in the ambient environment during the hand movement. Meanwhile, the CSI amplitude is not stable during the period, owing to the increasing strength of the reflected signal. As a result, we need to remove the static vector and extract the dynamic vector, which corresponds to the dynamic path, to get the phase change of the dynamic vector as shown in Figure 3(c). Note that the centers of the traces of the dynamic vectors are moved to the origin so that we can directly measure their phase.

3.3 Movement Detection

Before measuring the movement, our system needs to detect the start of the movement. When the user stays still, the amplitude of CSI is stable except for small fluctuations caused by ambient noise. Once the user begins to move his hand, the amplitude of CSI experiences large fluctuations because of the phase change. We apply a sliding window to compute the variance of the amplitude continuously. We choose the amplitude instead of the phase mainly because it is cheaper to compute: phase calculation requires an unwrapping step to rectify discontinuities in the phase, which incurs additional computation. In Figure 3(d), Std represents the standard deviation of each short period, and I component means the In-phase component of the CSI values. Both Std and I component are normalized to [−1, 1] for a clear illustration. As shown in Figure 3(d), the variance in the static period is much smaller than the variance in the dynamic period, so the movement period can be easily detected by using an empirical threshold. However, there may still exist some abnormal variances due to the multipath effect on a single subcarrier frequency. These multipaths are mainly caused by the movement of other body parts.
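The sliding-window detection described above can be sketched for a single subcarrier as follows. This is a minimal illustration, not the authors' implementation: the window length, the threshold, and the synthetic amplitude trace are hypothetical values chosen for the example, and the function names are ours.

```python
import math
from statistics import pvariance


def sliding_variance(amplitude, window):
    """Variance of the amplitude inside each sliding window of length `window`."""
    return [pvariance(amplitude[i:i + window])
            for i in range(len(amplitude) - window + 1)]


def detect_movement(amplitude, window, threshold):
    """Return (start, end) sample indices of the detected movement, or None.

    A window is 'active' when its variance exceeds the empirical threshold.
    The start is the last sample of the first active window and the end is
    the first sample of the last active window.
    """
    variances = sliding_variance(amplitude, window)
    active = [i for i, v in enumerate(variances) if v > threshold]
    if not active:
        return None
    return active[0] + window - 1, active[-1]


# Synthetic amplitude: static, then movement during samples 200..399, then static.
# All numbers below are hypothetical.
ripple = [0.001 * (-1) ** i for i in range(600)]   # small ambient fluctuation
swing = [0.3 * math.sin(2 * math.pi * (i - 200) / 40) if 200 <= i < 400 else 0.0
         for i in range(600)]
amplitude = [1.0 + r + s for r, s in zip(ripple, swing)]

start, end = detect_movement(amplitude, window=20, threshold=1e-3)
print(start, end)
```

The detected boundaries land within a few samples of the true start and end of the synthetic movement. A shorter window reacts faster but makes the variance estimate noisier, which is one reason the threshold has to be chosen empirically.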
As a result, we combine the results of all the subcarriers by taking their mathematical expectation to mitigate the effect of multipath. Then, we use a predefined empirical threshold to detect the beginning and end of the movement.

3.4 Static Vector Elimination

In reality, it is challenging to remove the static vector from the CSI measurement. On one hand, the static vector, which is mainly caused by static reflectors (e.g., Path B and Path C in Figure 1), is much stronger than the dynamic vector caused by the hand (e.g., Path A). On the other hand, the static vector may change slowly as the hand moves, due to blocking of other reflectors and the slow movement of other body parts (e.g., the arm). Additionally, even though the dynamic vector of the hand dominates the variation of the CSI, the SNR degrades with the distance between the hand and the receiving end.

There are some existing algorithms that separate the static vector from the dynamic vector of the hand. Dual-Differential Background Removal (DDBR) and Phase Counting and Reconstruction (PCR) are used in 60 GHz radar systems, such as mTrack [29], to remove the static vector. However, neither method is suitable for the CSI signal, due to the high noise level of CSI and the lack of the periodicity that PCR requires. LLAP [16], which is based on ultrasound, applies a heuristic algorithm called Local Extreme Value Detection (LEVD), based on the Empirical Mode Decomposition (EMD) algorithm [30], to estimate the static vector. It isolates the static vector by detecting whether the gap between alternate local maximum and minimum points is larger than an empirical threshold Thr. Here Thr is set to three times the standard deviation of the baseband signal in a static environment. However, for the CSI signal, the static vector is always contaminated by surrounding noise; thus, it is difficult to reliably detect the local maximum and minimum points with the threshold Thr. For example, LEVD fails to detect most of the maximum and minimum points in Figure 3(e).
As a result, we propose the Extracting Static Component (ESC) method, shown in Algorithm 1, to estimate the static vector. Instead of using a threshold that is three times the standard deviation, we use an empirical threshold called $Thr_m$, which is far below the previous threshold, to avoid neglecting small local extreme points. However, this operation may include some noisy points, leading to incorrect static components. To remove noisy extreme points in the environment, we set a temporal threshold $T_d$, related to the frequency shift of signals caused by gesture movement, to $1/(2f_{dmax}) - \mu$, where $f_{dmax}$ is the largest Doppler frequency shift for each short period and $\mu$ is a small positive constant. We apply the Short Time Fourier Transform (STFT) method on CSI measurements to derive the instantaneous Doppler
frequency shift $f_{dmax}$. The duration between any two adjacent extreme points should be more than $T_d$, which is slightly smaller than half of the shortest period. As shown in Figure 3(e), ESC improves the accuracy of detecting the hand movement and avoids the small noise induced by the ambient environment within the first 0.15 seconds.

Figure 3. 1D tracking measurement. Panels: (a) CSI preprocessing (I component vs. time; raw I component, Step 1 Hampel filter, Step 2 moving average filter); (b) I/Q trace of filtered CSI (Q component vs. I component); (c) I/Q trace of dynamic vector (Q component vs. I component); (d) Variance of I/Q component (normalized Std and I component vs. time, with threshold and movement period); (e) Static vector estimation of LEVD and ESC (I/Q components vs. time; raw signal, ESC, LEVD, ESC extreme points, LEVD extreme points).

Algorithm 1: Extracting Static Component Algorithm
Input: CSI signal of real or imaginary part $X(t) = X^I(t)$ or $X^Q(t)$; CSI deviation of real or imaginary part $d_s = d_s^I$ or $d_s^Q$ for the previous static period.
Output: Estimated static vector $\vec{S}(t)$
    Initialize extrema of real and imaginary part: $E(t_i) = E^I(t_i)$ or $E^Q(t_i)$, where $t_i$ is the timestamp of the $i$-th extremum;
    for each time $t$ do
        if movement is detected then
            /* Find extreme point of X(t) */
            if $X(t)$ is a local extremum and $|X(t) - E(t_i)| \ge d_s$ then
                Take STFT of $X(t)$ to get the maximum Doppler frequency $f_{max}$;
                if $t - t_i \ge 1/(2f_{max}) - \mu$ then
                    $i \leftarrow i + 1$;
                    $t_i \leftarrow t$;
                    $E(t_i) \leftarrow X(t)$;
                    $X_s(t) \leftarrow (E(t_i) + E(t_{i-1}))/2$;
        /* Update the static vector */
        $\vec{S}(t) \leftarrow X_s^I(t) + jX_s^Q(t)$;
    return $\vec{S}(t)$;
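Algorithm 1 can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the authors' code: the input is assumed to contain only samples from a detected movement period, the maximum Doppler frequency `f_max` is supplied by the caller instead of being derived from an STFT, the first sample is used as the initial extremum, and the temporal check is skipped for the first accepted extremum. All function and parameter names are hypothetical.

```python
import numpy as np

def esc_static_component(x, fs, d_s, f_max, mu=1e-3):
    """Estimate the static component of one real-valued CSI stream.

    x     : I or Q samples recorded while the hand is moving
    fs    : CSI sampling rate in Hz
    d_s   : amplitude threshold between consecutive extrema
    f_max : maximum Doppler frequency in Hz (assumed known here)
    mu    : small positive constant in T_d = 1/(2 f_max) - mu
    """
    x = np.asarray(x, dtype=float)
    t_d = 1.0 / (2.0 * f_max) - mu
    static = np.full(len(x), x[0])
    prev_val, prev_idx = x[0], None  # last accepted extremum E(t_i), t_i
    current = x[0]
    for n in range(1, len(x) - 1):
        # Local extremum: the slope changes sign around sample n
        is_extremum = (x[n] - x[n - 1]) * (x[n + 1] - x[n]) < 0
        far_in_value = abs(x[n] - prev_val) >= d_s
        far_in_time = prev_idx is None or (n - prev_idx) / fs >= t_d
        if is_extremum and far_in_value and far_in_time:
            # Static component is the midpoint of two adjacent extrema
            current = 0.5 * (x[n] + prev_val)
            prev_val, prev_idx = x[n], n
        static[n] = current
    static[-1] = current
    return static

def esc_static_vector(h, fs, d_s_i, d_s_q, f_max, mu=1e-3):
    """Run the estimator on I and Q separately and recombine."""
    h = np.asarray(h)
    return (esc_static_component(h.real, fs, d_s_i, f_max, mu)
            + 1j * esc_static_component(h.imag, fs, d_s_q, f_max, mu))
```

Subtracting the returned static vector from the filtered CSI leaves an estimate of the dynamic vector whose trace is centered near the origin.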
3.5 Distance Measurement

After movement detection and static vector elimination, the phase of the dynamic vector changes linearly with the path length change according to Eq. (3). As Figure 1 shows, since the transmitter/receiver and the hand are set on the same line, the real movement distance of the hand is half of the path length change, e.g., $\Delta d = (d_{t_i} - d_{t_j})/2$ when a user pushes his hand from moment $t_i$ to $t_j$. Although we have mitigated the effect of static multipath by removing the static vector, some dynamic multipath effect remains when the hand moves. We utilize the fact that different subcarriers have different frequencies: when there is no multipath effect, the measured distance changes should be the same for all subcarriers, while the phase changes of different subcarriers are different. As a result, we combine the results of different subcarriers by using linear regression [16], which finds the best value of $\Delta d$ that fits all phase changes obtained from different frequencies, to further mitigate the multipath effect.

4 2D TRACKING

In this section, we present our 2D tracking algorithm based on the distance measurements in Section 2. We first discuss the CSI measurement noise sources that make our initial position estimation results deviate significantly from the ground truth, and describe methods to reduce them. Then, we apply the phase difference over different subcarriers to coarsely estimate the initial hand position and propose a novel algorithm to subsequently refine the resolution. Finally, we present a KF based algorithm to further refine the trajectory.

4.1 CSI Measurement Evaluation and Denoising

There are mainly three kinds of noise in CSI measurements that may give rise to large phase errors.

Carrier Frequency Offset (CFO): CFO occurs when there exists a carrier frequency mismatch between the transmitter and receiver oscillators. Although the CFO is compensated by the
external clock in our system, the compensation is not perfect due to hardware imperfection. The residual CFO leads to a time-varying offset in the CSI values, $\vec{H}_c(f, t) = e^{j2\pi\Delta f t}\vec{H}(f, t)$, where $\Delta f$ is the corresponding CFO. Fortunately, as shown in Figure 4, the phase drift over a silent duration of 1.5 seconds is only 0.044 radians for all subcarriers, which is small relative to the Doppler frequency shift caused by hand movements. Note that there are no dynamic paths over this silent duration. For example, the Doppler frequency is at least 16 Hz when the velocity of a pushing hand is 1 m/s. This indicates that the impact of CFO can be ignored, since it is small compared to the impact of hand movements.

Figure 4. CSI phase changes of all subcarriers (phase in radians vs. time in seconds; subcarriers 1, 15, 25, 35, 45, 55).
Figure 5. Raw CSI phase changes of various moment (phase in radians vs. subcarrier index at 0.1 s, 0.3 s, 0.5 s, 0.7 s, 0.9 s).
Figure 6. Denoised CSI phase of various moment (phase in radians vs. subcarrier index at t = 0.1 s, 0.3 s, 0.5 s, 0.7 s, 0.9 s).
Figure 7. CDF of phase variance for all subcarriers W/O SFO (CDF vs. standard deviation in radians; SFO retained, SFO removed).

Sampling Frequency Offset (SFO) and Packet Detection Delay (PDD): In addition to CFO, two other noise sources (i.e., SFO and PDD) [13], [27] have a similar effect on the CSI measurements. As shown in Figure 5, the slopes of the linear phase shift φ are almost the same across the different frames during the same measurement. However, the slope changes randomly across different measurements when the equipment restarts, due to the randomness in the PLL locking process of the clock.
Thus, we compensate the SFO/PDD offset whenever the system restarts. We use linear regression to remove the SFO/PDD offset [13]. Figure 6 shows the CSI phase before and after correction. We observe that the CSI phase is more consistent over different subcarriers after SFO/PDD correction. The largest phase difference for any two subcarriers is smaller than 0.03 radians, and there is only a small residual difference of less than 0.01 radians after phase correction. When the system restarts, the CSI phase may have inconsistent initial values. To demonstrate this, we collect 150 CSI samples across different measurements to compute the standard deviation of CSI phase over all subcarriers. Figure 7 shows that the standard deviation of phase, averaged over different subcarriers, is reduced from 0.61 radians before removing the SFO to 0.01 radians after removing the SFO.

CSI variance: Although we have applied a low-pass filter to remove high-frequency noise at the denoising step, there is still CSI variance even in the silent period. For example, the standard deviation of CSI phase for the time period 0 to 1.5 s shown in Figure 4 is 0.013 radians. This variance significantly degrades the initial hand position estimation. To remove CSI noise efficiently, we cascade three low-pass filters to constitute a new filter $h'(t) = h(t) * h(t) * h(t)$, where $h(t)$ is the impulse response of one low-pass filter and $*$ represents convolution. Figure 8 shows the frequency response of the cascading filter compared with the previous moving average filter. In Figure 9, we observe that high-frequency noise is attenuated significantly by the filter so that the SNR is improved, where SNR here denotes the ratio of the power of normal hand movements to the power of high-frequency noise on the I/Q components. To demonstrate the effect of the filter, we collect 150 CSI samples with a duration of 96 ms.
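The two denoising steps above, removing the linear phase slope across subcarriers and cascading three low-pass filters, can be sketched as follows. This is an illustration under assumptions, not the procedure of [13] or of the authors: the slope is estimated by a least-squares line fit to the unwrapped phase of one reference packet, subcarriers are indexed from 0, and all function names are hypothetical.

```python
import numpy as np

def estimate_phase_slope(csi_packet):
    """Least-squares slope (radians per subcarrier) of the unwrapped phase."""
    phase = np.unwrap(np.angle(np.asarray(csi_packet)))
    k = np.arange(len(phase))
    slope, _intercept = np.polyfit(k, phase, 1)
    return slope

def remove_phase_slope(csi_packet, slope):
    """Rotate each subcarrier back by slope * k, keeping the common phase."""
    csi_packet = np.asarray(csi_packet, dtype=complex)
    k = np.arange(len(csi_packet))
    return csi_packet * np.exp(-1j * slope * k)

def cascaded_kernel(h, stages=3):
    """Impulse response h * h * ... * h for `stages` identical filters."""
    h = np.asarray(h, dtype=float)
    out = h
    for _ in range(stages - 1):
        out = np.convolve(out, h)
    return out

def cascaded_lowpass(x, h, stages=3):
    """Apply the cascaded filter to a (real or complex) sample stream."""
    return np.convolve(x, cascaded_kernel(h, stages), mode="same")
```

Because the slope is described as stable within one measurement, it would be estimated once after each restart and then reused for subsequent packets. With a moving-average `h`, the cascaded kernel is wider and smoother than a single stage, which is consistent with the stronger high-frequency attenuation reported in the text.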
Then, we compute the standard deviation of CSI phase over the 96 ms segment for each subcarrier. Figure 10 shows that the average phase deviation is reduced from 0.032 radians before filtering to 0.025 radians after filtering. The denoising process is necessary for the coarse estimation in Section 4.2, since the performance of coarse estimation would degrade severely without denoising.

4.2 Coarse Initial Position Estimation

After removing CSI noise, we use the phase difference between subcarriers to coarsely estimate the initial position of the hand. The coarse initial position estimation narrows down the candidate region, which greatly reduces the complexity of the later refinement algorithm. As shown in Eq. (2), the CSI of one transceiver pair is the superimposition of all paths. Suppose that there is only one dynamic path (i.e., hand movement) in our system. Thus, the CSI vector of all paths on the $k$-th subcarrier at time $t$ can be described as:

$$\vec{H}^{k}(t) = \vec{H}^{k}_{st} + \alpha^{k}_{hd}(t)\, e^{j(2\pi d_{hd}(t)/\lambda_k + \phi_{hd})} \tag{4}$$

where $\vec{H}^{k}_{st}$, $\alpha^{k}_{hd}(t)$, $d_{hd}(t)$, and $\phi_{hd}$ are the constant static vector, the attenuation coefficient, the path length, and the initial phase for hand reflection, respectively. We can use $\vec{H}^{k}_{hd}(t) = \alpha^{k}_{hd}(t)\, e^{j(2\pi d_{hd}(t)/\lambda_k + \phi_{hd})}$ to represent the dynamic vector. As shown in Figure 11, the dynamic vector rotates due to the phase shift caused by hand movements. Furthermore, the phase difference of two subcarriers due to hand movement at time $t$ can be described as:

$$\begin{aligned} \Delta p_{k_1,k_2}(t) &= \arg(\vec{H}^{k_2}_{hd}(t)) - \arg(\vec{H}^{k_1}_{hd}(t)) \\ &= 2\pi d_{hd}(t)/\lambda_{k_2} - 2\pi d_{hd}(t)/\lambda_{k_1} \end{aligned} \tag{5}$$

where we use $k_1$, $k_2$ to represent two different subcarriers and $\arg(\vec{H}^{k}_{hd}(t))$ represents the phase of the dynamic vector corresponding to hand movement on subcarrier $k$. Once we know $\Delta p_{k_1,k_2}(t)$, the path length can be calculated as:

$$d_{hd}(t) = \frac{\Delta p_{k_1,k_2}(t)\, \lambda_{k_1} \lambda_{k_2}}{2\pi(\lambda_{k_1} - \lambda_{k_2})}. \tag{6}$$

We use the tangent vector corresponding to the dynamic vector $\vec{H}^{k}_{hd}(t)$, denoted as $\vec{H}^{k}_{tg}(t)$, to estimate $\Delta p_{k_1,k_2}$.
The phase change of the dynamic vector is equivalent to the phase change in its tangent vector:

$$\begin{aligned} \Delta p_{k_1,k_2}(t) &= [\arg(\vec{H}^{k_2}_{tg}(t)) - \pi/2] - [\arg(\vec{H}^{k_1}_{tg}(t)) - \pi/2] \\ &= \arg(\vec{H}^{k_2}_{tg}(t)) - \arg(\vec{H}^{k_1}_{tg}(t)). \end{aligned} \tag{7}$$
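As a worked illustration of Eqs. (5) and (6), the sketch below converts the phase difference between two subcarriers into a path length. The carrier frequencies and function names are assumptions for the example only. Because the phase difference is measured modulo $2\pi$, the result is unambiguous only while the path length stays below $c/|f_{k_2} - f_{k_1}|$.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def phase_difference(h_k1, h_k2):
    """Eq. (5): arg(H^{k2}) - arg(H^{k1}), wrapped to (-pi, pi]."""
    return np.angle(h_k2 * np.conj(h_k1))

def path_length_from_phase_diff(delta_p, f_k1, f_k2):
    """Eq. (6): path length from the inter-subcarrier phase difference.

    f_k1, f_k2 are the subcarrier frequencies in Hz; the wavelengths
    follow from lambda = c / f.
    """
    lam1 = C / f_k1
    lam2 = C / f_k2
    return delta_p * lam1 * lam2 / (2 * np.pi * (lam1 - lam2))

# Example with hypothetical values: a 3 m reflection path observed on
# two subcarriers spaced 20 MHz apart.
f1, f2 = 2.412e9, 2.432e9
d_true = 3.0
h1 = np.exp(1j * 2 * np.pi * d_true * f1 / C)
h2 = np.exp(1j * 2 * np.pi * d_true * f2 / C)
d_est = path_length_from_phase_diff(phase_difference(h1, h2), f1, f2)
```

The same functions apply unchanged when the tangent vectors of Eq. (7) are passed in place of the dynamic vectors, since the $\pi/2$ offsets cancel in the difference.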