Understanding and Modeling of WiFi Signal Based Human Activity Recognition

Wei Wang† Alex X. Liu†‡ Muhammad Shahzad‡ Kang Ling† Sanglu Lu†
†State Key Laboratory for Novel Software Technology, Nanjing University, China
‡Dept. of Computer Science and Engineering, Michigan State University, USA
ww@nju.edu.cn, {alexliu,shahzadm}@cse.msu.edu, lingkang@smail.nju.edu.cn, sanglu@nju.edu.cn

Abstract
Some pioneer WiFi signal based human activity recognition systems have been proposed. Their key limitation lies in the lack of a model that can quantitatively correlate CSI dynamics and human activities. In this paper, we propose CARM, a CSI based human Activity Recognition and Monitoring system. CARM has two theoretical underpinnings: a CSI-speed model, which quantifies the correlation between CSI value dynamics and human movement speeds, and a CSI-activity model, which quantifies the correlation between the movement speeds of different human body parts and a specific human activity. By these two models, we quantitatively build the correlation between CSI value dynamics and a specific human activity. CARM uses this correlation as the profiling mechanism and recognizes a given activity by matching it to the best-fit profile. We implemented CARM using commercial WiFi devices and evaluated it in several different environments. Our results show that CARM achieves an average accuracy of greater than 96%.

Categories and Subject Descriptors
C.2.1 [Network Architecture and Design]: Wireless communication

General Terms
Experimentation, Measurement

Keywords
Channel State Information (CSI); WiFi; Activity Recognition

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org.
MobiCom'15, September 7–11, 2015, Paris, France.
© 2015 ACM. ISBN 978-1-4503-3619-2/15/09 ...$15.00.
DOI: http://dx.doi.org/10.1145/2789168.2790093.

1. INTRODUCTION
1.1 Motivation
Human activity recognition is the core technology that enables a wide variety of applications such as health care, smart homes, fitness tracking, and building surveillance. Traditional approaches use cameras [6], radars [2], or wearable sensors [7, 33]. However, camera based approaches have the fundamental limitations of requiring line of sight with enough lighting and potentially breaching human privacy. Low cost 60 GHz radar solutions have a limited operation range of just tens of centimeters [2]. Wearable sensor based approaches are sometimes inconvenient because users have to wear the sensors. Recently, WiFi signal based human activity recognition systems, such as WiSee [17], E-eyes [27], and WiHear [26], have been proposed based on the observation that different human activities introduce different multi-path distortions in WiFi signals. WiSee uses USRP to capture the OFDM signals and measures the Doppler shift in signals reflected by human bodies to recognize nine gestures. E-eyes uses Channel State Information (CSI) histograms as fingerprints for recognizing daily human activities such as brushing teeth. WiHear uses specialized directional antennas to obtain CSI variations caused by lip movement for recognizing spoken words.
Their key advantages over camera and sensor based approaches are that they do not require lighting, provide better coverage as they can operate through walls, preserve user privacy, and do not require users to carry any devices as they rely on the WiFi signals reflected by humans.

1.2 Limitations of Prior Art
The key limitation of these pioneer WiFi based human activity recognition systems is the lack of a model that can quantitatively correlate CSI dynamics and human activities. As such, these systems mostly rely on the statistical characteristics of WiFi signals, such as Doppler movement directions and distributions of signal strength, to distinguish different human activities. The lack of such a model limits the further development of WiFi based human activity recognition technologies. Without such a model, it is difficult to understand the correlation between WiFi signal dynamics and human activities. Furthermore, without such a model, it is difficult to optimize the performance of such systems due to the lack of adjustable parameters, and we have to resort to trial-and-error for performance optimization.

1.3 Proposed Approach
In this paper, we propose CARM, a CSI based human Activity Recognition and Monitoring system. CARM consists of two Commercial Off-The-Shelf (COTS) WiFi devices as shown in Figure 1, one for continuously sending signals, which can be a router, and one for continuously receiving signals, which can be a laptop. When a human activity is performed in the range of these two devices, on the WiFi signal receiver end, CARM recognizes the human activity based on how the CSI value changes. CARM has two theoretical underpinnings that we propose in this paper: a CSI-speed model and a CSI-activity model. Our CSI-speed model quantifies the correlation between CSI value dynamics and human movement speeds. Our CSI-activity model quantifies the correlation between the movement speeds of different human body parts and a specific human activity.
By these two models, we quantitatively build the correlation between CSI value dynamics and a specific human activity. CARM uses this quantitative correlation as the profiling mechanism and recognizes a given activity by matching it to the best-fit profile.

[Figure 1: CARM System. Labels: wireless router, laptop, wireless signal reflection.]

Our CSI-speed model and CSI-activity model advance the state-of-the-art on WiFi signal based human activity recognition on two fronts. First, they provide us with the theoretical basis to understand, even quantitatively, the relationship between CSI value dynamics and human movement speeds, and the relationship between the movement speeds of different human body parts and human activities. Regarding the relationship between CSI value dynamics and human movement speeds, for example, our model shows that high-speed body part movement generates high-frequency changes in CSI values. Regarding the relationship between the movement speeds of different human body parts and human activities, taking the activity of falling down as an example, our model shows that it can be characterized as a sudden increase in body movement speed in less than one second. Second, these two models provide us with tunable parameters to optimize the performance of WiFi signal based human activity recognition. For example, according to our models, the CSI sampling rate should be chosen as 800 samples per second because the typical human movement speed corresponds to CSI components of lower than 300 Hz.

1.4 Technical Challenges and Our Solutions
The first technical challenge is to estimate human movement speeds from CSI values based on our CSI-speed model. This is challenging because the CSI measurements at the receiver are mixed WiFi signals arriving from multiple paths, which change as the human moves. Furthermore, different human body parts move at different speeds for a given activity and the WiFi signals reflected by different body parts are also mixed at the receiver.
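As a quick check of the sampling-rate guideline in Section 1.3, the following minimal sketch applies the Nyquist criterion to the two figures quoted there (800 samples per second, CSI components below 300 Hz). It also converts a path-length change rate into a phase cycle rate, using the fact from Section 3.2 that a path-length change of one wavelength produces one full 2π phase cycle. The function names, the 5 GHz carrier, and the 1 m/s change rate are illustrative assumptions, not values or code from CARM.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def nyquist_limit_hz(sampling_rate_hz):
    """Highest frequency that can be represented without aliasing."""
    return sampling_rate_hz / 2.0


def is_rate_sufficient(sampling_rate_hz, max_component_hz):
    """True if the sampling rate captures components up to max_component_hz."""
    return max_component_hz < nyquist_limit_hz(sampling_rate_hz)


def phase_cycle_rate_hz(path_change_rate_mps, carrier_hz):
    """Full phase cycles per second on a path whose length changes at the
    given rate: one cycle per wavelength of path-length change."""
    wavelength_m = SPEED_OF_LIGHT / carrier_hz
    return path_change_rate_mps / wavelength_m


if __name__ == "__main__":
    # Figures quoted in the text: 800 samples/s versus components < 300 Hz.
    print("Nyquist limit at 800 samples/s:", nyquist_limit_hz(800), "Hz")
    print("800 samples/s sufficient for 300 Hz:", is_rate_sufficient(800, 300))
    # Assumed values for illustration only: 5 GHz carrier, 1 m/s change rate.
    print("Phase cycle rate:", phase_cycle_rate_hz(1.0, 5e9), "Hz")
```

A rate of 800 samples per second leaves a margin above the 600 samples per second that the Nyquist criterion alone would require for 300 Hz components.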
Our key observation is that these signals are linearly combined so that their frequencies are preserved when they are mixed together. Therefore, we use Discrete Wavelet Transform (DWT) to separate the frequency components that represent different movement speeds. The advantage of DWT is that it provides a proper tradeoff between time and frequency resolution and enables the measurement of both fast and slow activities.

The second challenge is to build a CSI-activity model that is robust across different people. For the same activity, to a certain degree, different people perform it differently and even the same person performs it differently at different times. To address this challenge, we propose a Hidden Markov Model (HMM) based human activity recognition approach. We use the patterns of movement speeds for different activities to build their corresponding HMM based models. The features that we extract to infer the speed patterns are only affected by movement speeds of the body and are relatively agnostic to environmental changes. This enables us to recognize activities even when the environment changes. We choose HMM because of its inherent capability to recognize the same activities that are done at different speeds. To recognize a sample of an unknown activity, we evaluate the unknown sample against the HMMs of all activities and find the model that gives the highest likelihood.

The third challenge is that CSI values are too noisy to be directly used for human activity recognition. Even in a static environment without any human activity, CSI values fluctuate because WiFi devices are susceptible to surrounding electromagnetic noises. Moreover, the internal state changes in WiFi devices, e.g., transmission rate adaptation and transmission power adaptation, often introduce impulse and burst noises in CSI values.
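The recognition step of the second challenge (scoring an unknown sample against the HMM of every activity and choosing the most likely one) can be sketched as follows. This is a toy illustration rather than CARM's implementation: the two activity models, their probabilities, and the quantization of movement speed into two symbols (0 = slow, 1 = fast) are all hypothetical, and discrete emissions are an assumption made to keep the example free of dependencies.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability that state i emits symbol o
    """
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step, then weight by the emission probability.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def classify(obs, models):
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))


# Hypothetical toy models over symbols 0 (slow) and 1 (fast).
MODELS = {
    # Left-to-right: a slow phase followed by a sudden, sustained fast phase.
    "fall": ([1.0, 0.0], [[0.7, 0.3], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]),
    # Single state that almost always emits "slow".
    "idle": ([1.0], [[1.0]], [[0.9, 0.1]]),
}

if __name__ == "__main__":
    print(classify([0, 0, 0, 1, 1, 1], MODELS))
    print(classify([0, 0, 0, 0, 0, 0], MODELS))
```

Because each model is scored independently, adding a new activity in this sketch only requires adding one more entry to `MODELS`.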
General purpose denoising methods, such as low-pass filters or median filters, do not perform well in removing these impulse and burst noises for two reasons: First, the sampling rates that these methods require are much higher than the frequency of the WiFi signal. Second, the noise density in CSI values is too high for traditional filters, which only work well for low density noise. In this paper, we propose a principal component analysis (PCA) based CSI denoising scheme. This scheme is based on our observation that the signal fluctuations caused by body movements in all subcarriers of the CSI values are correlated.

The fourth challenge is to capture body movements in the presence of carrier frequency offset (CFO). CFO is the dynamically changing difference in carrier frequencies between a pair of WiFi devices, which occurs due to minor physical differences in hardware and other factors such as temperature changes [8]. CFO causes the phase values of the received signal to change, making it hard to determine whether a phase change is due to CFO or to human movement. To address this challenge, we use the CSI signal power to infer the body movement. We show that CSI signal power is not affected by CFO, but retains information about the movement speeds of the body.

The fifth challenge is to automatically detect the start and end of a human activity. To address this challenge, we use the eigenvectors obtained from PCA. The key idea is that in the absence of any activity, the time-series of CSI values contain random noise and consequently, the signal eigenvector varies randomly. During a human activity, the signals in subcarriers become correlated and the signal eigenvector becomes smooth. We capture the smoothness of the eigenvector by calculating its high-frequency energy and comparing it to a dynamically adapting threshold to detect the start and end.

1.5 Key Technical Novelty and Results
The key technical novelty of this paper is twofold.
First, we propose the CSI-speed model and the CSI-activity model to quantify the correlation between CSI value dynamics and a specific human activity. Second, we propose a set of signal processing techniques, such as PCA based denoising and DWT based feature extraction, for human activity recognition based on the CSI-speed model and the CSI-activity model. The key technical depth of this paper lies in the signal processing aspect, such as the theoretical analysis of the correlation between CSI values of subcarriers and the relationship between multi-path speeds and CFR power. We implemented CARM on commercial WiFi devices and evaluated it in multiple environments. Our results show that CARM achieves an average activity recognition accuracy of 96%. For a new environment and a new person that the system has never been trained on, CARM can still achieve a recognition accuracy of more than 80%.

2. RELATED WORK
Existing work on device-free human activity recognition and localization can be divided into four categories: Received Signal Strength Indicator (RSSI) based, specialized hardware based, radar based, and CSI based.

RSSI Based: RSSI based human activity recognition systems leverage the signal strength changes caused by human activities [3, 22, 23]. This approach can only do coarse grained human activity
Thus,the phase shift e()on this path can be written as 3.UNDERSTANDING WIFI MULTI-PATH e(),which means that when the path length changes by one wavelength,the receiver experiences a phase shift of 2 in the received subcarrier. 3.1 Overview of CSI WiFi NICs continuously monitor variations in the wireless chan- 3.3 Practical Limitations nel using CSI,which characterizes the frequency response of the Theoretically,it is possible to precisely measure the phase of the wireless channel [1].Let X(f,t)and Y(f,t)be the frequency do- path in systems where sender and receiver are perfectly synchron- main representations of transmitted and received signals,respect- ized,e.g.,as in RFID systems [31].But,unfortunately,commercial ively,with carrier frequency f.The two signals are related by WiFi devices have non-negligible carrier frequency offsets(CFO) the expression Y(f,t)=H(f,t)xX(f,t),where H(f,t)is the due to hardware imperfections and environmental variations [8]. complex valued channel frequency response(CFR)for carrier fre- IEEE 802.11n standard allows the carrier frequency of a device to
recognition with low accuracy because the RSSI values provided by the commercial devices have very low resolution [23]. For RSSI based gesture recognition, the accuracy is 56% over 7 different gestures [21]. Sigg et al. use software radio to improve the granularity of RSSI values and consequently improve the accuracy of activity recognition to 72% for 4 activities [22]. In comparison, CARM uses CSI values and achieves an accuracy of 96%. Specialized Hardware Based: Fine-grained radio signal measurements can be collected by software defined radio or specially designed hardware [11, 12, 14, 17]. WiSee uses USRP to capture the WiFi OFDM signals and measures the Doppler shift in signals reflected by human bodies to recognize a set of nine different gestures with an accuracy of 95% [17]. AllSee uses a specially designed analog circuit to extract the amplitude of the received signals and uses their envelopes to recognize gestures within a short distance of 2.5 feet [14]. Wision uses multi-path reflections to build an image for nearby objects [11]. In comparison, CARM requires no specialized hardware and at the same time achieves high activity recognition accuracy at longer distances. Radar Based: Device-free human activity recognition has also been studied using radar technology [4, 5, 16, 25]. Using the micro Doppler information, radars can measure the movement speeds of different parts of human body [25]. WiTrack uses specially designed Frequency Modulated Carrier Wave (FMCW) signals to track human movements behind the wall with a resolution of approximately 20cm [4, 5]. Compared to the specially designed radar signals such as FMCW or Ultra-wideband (UWB) signals, WiFi signals have much narrower bandwidth. For example, 802.11a/b/g usually use a bandwidth of 20 MHz, while FMCW uses bandwidth of up to 1.79 GHz [4]. Compared to prior work in radar technology, CARM designed a new set of signal processing methods that are suitable for the OFDM signal used in WiFi. 
CSI Based: CSI values are available in many commercial devices such as Intel 5300 [9] and Atheros 9390 network interface cards (NICs) [19]. Recently, CSI has been used for human activity recognition [10,26,27,30,35] as well as indoor localization [19,32]. Han et al. proposed to use CSI to detect a single human activity of falling [10]. Zhou et al. proposed to use CSI to detect the presence of a person in an environment [35]. Xi et al. proposed to use CSI to count the number of people in a crowd [30]. WiHear uses specialized directional antennas to obtain CSI variations caused by lip movement for recognizing spoken words [26]. E-eyes recognizes a set of nine daily human activities using CSI [27]. Note that WiHear and E-eyes use CSI in quite different ways than CARM. WiHear does not effectively denoise CSI values; thus, it has to use directional antennas to reduce the noise in CSI values to achieve acceptable accuracy. In comparison, we denoise CSI values and use commercial WiFi devices with built-in omnidirectional antennas. E-eyes uses CSI histograms as fingerprints for recognizing human daily activities, such as brushing teeth, taking showers, and washing dishes, which are relatively location dependent. In comparison, CARM uses CSI values based on our CSI-speed and CSI-activity models.

3. UNDERSTANDING WIFI MULTI-PATH

3.1 Overview of CSI

WiFi NICs continuously monitor variations in the wireless channel using CSI, which characterizes the frequency response of the wireless channel [1]. Let $X(f,t)$ and $Y(f,t)$ be the frequency domain representations of transmitted and received signals, respectively, with carrier frequency $f$. The two signals are related by the expression $Y(f,t) = H(f,t) \times X(f,t)$, where $H(f,t)$ is the complex valued channel frequency response (CFR) for carrier frequency $f$ measured at time $t$. CSI measurements basically contain these CFR values. Let $N_{Tx}$ and $N_{Rx}$ represent the number of transmitting and receiving antennas, respectively.
As CSI is measured on 30 selected OFDM subcarriers for a received 802.11 frame, each CSI measurement contains 30 matrices with dimensions $N_{Tx} \times N_{Rx}$. Each entry in any matrix is a CFR value between an antenna pair at a certain OFDM subcarrier frequency at a particular time. Hereafter, we call the time-series of CFR values for a given antenna pair and OFDM subcarrier a CSI stream. Thus, there are $30 \times N_{Tx} \times N_{Rx}$ CSI streams in a time-series of CSI values.

[Figure 2: Multi-paths caused by human movements. (a) Visual representation: sender, receiver, the LoS path, a path reflected by the wall, and a path reflected by the body whose length changes from $d_k(0)$ to $d_k(t)$. (b) Phasor representation in the I-Q plane: the combined CFR $H(f,t)$ is the sum of the static component $H_s(f,t)$ and the dynamic component $H_d(f,t)$.]

3.2 Phase Changes for Paths

Surrounding objects reflect wireless signals, due to which a transmitted signal arrives at the receiver through multiple paths. If a radio signal arrives at the receiver through $N$ different paths, then $H(f,t)$ is given by the following equation [24]:

$$H(f,t) = e^{-j2\pi\Delta f t} \sum_{k=1}^{N} a_k(f,t)\, e^{-j2\pi f \tau_k(t)} \qquad (1)$$

where $a_k(f,t)$ is the complex valued representation of attenuation and initial phase offset of the $k$-th path, $e^{-j2\pi f \tau_k(t)}$ is the phase shift on the $k$-th path that has a propagation delay of $\tau_k(t)$, and $e^{-j2\pi\Delta f t}$ is the phase shift caused by the carrier frequency difference $\Delta f$ between the sender and the receiver.

The changes in the length of a path lead to changes in the phase of the WiFi signal on the corresponding path. Consider the scenario in Figure 2(a), where the WiFi signal is reflected by the human body through the $k$-th path. When the human body moves by a small distance between time 0 and time $t$, the length of the $k$-th path changes from $d_k(0)$ to $d_k(t)$. Since wireless signals travel at the speed of light, denoted as $c$, the delay of the $k$-th path, denoted as $\tau_k(t)$, can be written as $\tau_k(t) = d_k(t)/c$. Let $f$ and $\lambda$ represent the carrier frequency and the wavelength, where $\lambda = c/f$.
Thus, the phase shift $e^{-j2\pi f \tau_k(t)}$ on this path can be written as $e^{-j2\pi d_k(t)/\lambda}$, which means that when the path length changes by one wavelength, the receiver experiences a phase shift of $2\pi$ in the received subcarrier.

3.3 Practical Limitations

Theoretically, it is possible to precisely measure the phase of the path in systems where sender and receiver are perfectly synchronized, e.g., as in RFID systems [31]. But, unfortunately, commercial WiFi devices have non-negligible carrier frequency offsets (CFO) due to hardware imperfections and environmental variations [8]. The IEEE 802.11n standard allows the carrier frequency of a device to
drift by up to 100 kHz from the central frequency of the channel for the 5 GHz band [1]. Such frequency drift leads to rapid phase changes in CSI values. Commercial WiFi NICs take one set of CSI measurements per frame. With a transmission rate of 4,000 frames per second, which is around the maximum number of frames that the commercial device can continuously transmit due to the frame aggregation mechanism in 802.11n [1], the phase shift caused by the term $e^{-j2\pi\Delta f t}$ in Equation (1) could be as large as $50\pi$ between consecutive CSI values.

Our measurements on commercial devices show that phases of CFR are too noisy to be used for activity recognition due to CFO. Figure 3 shows the CSI phase differences for consecutive frames sent through a WiFi link between two commercial devices. Due to the randomness of the packet sending process, the interval $\Delta t$ between two consecutive frames is randomly distributed in the range of 300∼550 microseconds (µs). This gives us a chance to measure the fine grained phase differences for different $\Delta t$. Each dot in Figure 3 gives the phase difference for a pair of frames separated by the given $\Delta t$; thus, we can obtain the relationship between $\Delta t$ and the phase shift. As shown by Figure 3, the phase difference $2\pi\Delta f \Delta t$ changes by $8\pi$ (four vertical strips) when $\Delta t$ increases from 350 µs to 400 µs. Thus, the CFO can be calculated as $\Delta f = \frac{8\pi}{2\pi \times (400 - 350)\,\mu\mathrm{s}} = 80$ kHz. There are two causes that lead to the imprecision of CFR phase. First, from the width of the vertical strips in Figure 3, we observe that CFR phase has measurement error as large as $0.5\pi$. In most cases, the phase changes caused by human reflection are much smaller than $0.5\pi$. Thus, phase changes caused by movements are often buried in phase noise.
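As a quick check on the CFO figures quoted above, the following minimal sketch reproduces the two numbers from the text. The function names are hypothetical; only the 100 kHz drift, the 4,000 frames per second rate, and the $8\pi$ change over 350∼400 µs come from the text.

```python
def phase_shift_in_pi(cfo_hz, interval_s):
    """Phase rotation 2*pi*cfo*interval, expressed as a multiple of pi."""
    return 2.0 * cfo_hz * interval_s


def cfo_from_phase_slope(delta_phase_in_pi, delta_interval_s):
    """CFO in Hz, from how much the phase difference (in multiples of pi)
    grows when the frame interval grows by delta_interval_s."""
    return delta_phase_in_pi / (2.0 * delta_interval_s)


# Worst case allowed by the standard: 100 kHz drift, one frame every 1/4000 s.
frame_interval = 1.0 / 4000
worst_case = phase_shift_in_pi(100e3, frame_interval)

# Measured slope in Figure 3: 8*pi of extra phase between 350 us and 400 us.
estimated_cfo = cfo_from_phase_slope(8.0, 400e-6 - 350e-6)

print(f"worst-case shift: {worst_case:.1f} * pi")
print(f"estimated CFO: {estimated_cfo / 1e3:.1f} kHz")
```

The first value corresponds to the $50\pi$ bound and the second to the 80 kHz estimate.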
Second, our measurements on commercial devices show that the central frequency often drifts by tens of Hz per second, making it hard to predict CFR phase and separate the phase change caused by clock drifts from the small phase shifts caused by body movements. Furthermore, the phase sanitization method introduced in [20] does not work for our case because the phase sanitization process also removes the phase shifts caused by body movements.

[Figure 3: Phase differences for consecutive frames (phase difference in rad vs. $\Delta t$ in µs).]

3.4 CSI-Speed Model

While it is hard to directly measure the phase of a path, it is possible to infer the phase of a path using the CFR power, i.e., $|H(f,t)|^2$. The principle behind our method is that when the lengths of multi-paths change, the CFR power varies according to the path length change.

To understand the relationship between CFR power and the length change of a path, we first express CFR as a sum of dynamic CFR and static CFR and then calculate the power. Dynamic CFR, represented by $H_d(f,t)$, is the sum of CFRs for paths whose lengths change with the human movement, and is given by $H_d(f,t) = \sum_{k \in P_d} a_k(f,t)\, e^{-j2\pi d_k(t)/\lambda}$, where $P_d$ is the set of dynamic paths whose lengths change. Static CFR, represented by $H_s(f)$, is the sum of CFRs for static paths. Thus, the total CFR is given by the following equation:

$$H(f,t) = e^{-j2\pi\Delta f t} \left( H_s(f) + \sum_{k \in P_d} a_k(f,t)\, e^{-j\frac{2\pi d_k(t)}{\lambda}} \right) \qquad (2)$$

The total CFR has time-varying power because, in the complex plane, the static component $H_s(f)$ is a constant vector while the dynamic component $H_d(f,t)$ is a superposition of vectors with time-varying phases and amplitudes, as shown in Figure 2(b). When the phase of the dynamic component changes, the magnitude of the combined CFR changes accordingly.

Now, consider how CFR power changes with an object moving around.
Let an object move at a constant speed such that the length of the $k$-th path changes at a constant speed $v_k$ for a short time period, e.g., 100 milliseconds. Let $d_k(t)$ be the length of the $k$-th path at time $t$. Thus, $d_k(t) = d_k(0) + v_k t$. The instantaneous CFR power at time $t$ can be derived as follows (detailed derivations are omitted due to space constraints).

$$
\begin{aligned}
|H(f,t)|^2 ={} & \sum_{k \in P_d} 2\,|H_s(f)\,a_k(f,t)| \cos\!\left( \frac{2\pi v_k t}{\lambda} + \frac{2\pi d_k(0)}{\lambda} + \phi_{sk} \right) \\
& + \sum_{\substack{k,l \in P_d \\ k \neq l}} 2\,|a_k(f,t)\,a_l(f,t)| \cos\!\left( \frac{2\pi (v_k - v_l) t}{\lambda} + \frac{2\pi \left(d_k(0) - d_l(0)\right)}{\lambda} + \phi_{kl} \right) \\
& + \sum_{k \in P_d} |a_k(f,t)|^2 + |H_s(f)|^2 \qquad (3)
\end{aligned}
$$

where $\frac{2\pi d_k(0)}{\lambda} + \phi_{sk}$ and $\frac{2\pi (d_k(0) - d_l(0))}{\lambda} + \phi_{kl}$ are constant values representing initial phase offsets.

Equation (3) provides a key insight: the total CFR power is the sum of a constant offset and a set of sinusoids, where the frequencies of the sinusoids are functions of the speeds of path length changes. By measuring the frequencies of these sinusoids and multiplying them with the carrier wavelength, we can obtain the speeds of path length change. In this way, we can build a CSI-speed model which relates the variations in CSI power to the movement speeds.

3.5 Model Verification

We use a simple moving object to verify our CSI-speed model in Equation (3). We move a steel plate with a diameter of 30 cm along the perpendicular bisector of the sender/receiver, similar to the scenario shown in Figure 2(a). Flat steel objects can serve as mirrors for radio waves [34]. Thus, there is only one path dominating the signal reflected by the steel plate and Equation (3) reduces to one sinusoid wave plus a constant offset. The frequency of the sinusoid changes according to the instantaneous moving speed. This can be verified by Figure 4(a), which shows the CSI waveform caused by steel plate movements. When there is only one dominating sinusoid wave, the movement distance can be calculated by measuring the phase change of the signal, which is the integral of the signal frequency over time.
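The single-reflector case can be illustrated with a small simulation. The sketch below is not the authors' implementation: it synthesizes the CFR power from Equation (2) with one dynamic path, finds the dominant frequency with a plain DFT, and multiplies it by the wavelength. The static and dynamic amplitudes, the initial path length, the 1,000 samples per second rate, and the 1 m/s path length change speed are all assumed values chosen for illustration.

```python
import cmath
import math

C = 299_792_458.0            # speed of light (m/s)
F_CARRIER = 5.825e9          # carrier frequency used in the experiments (Hz)
WAVELENGTH = C / F_CARRIER   # about 5.15 cm


def cfr_power(t, v, d0=3.0, h_static=1.0 + 0.5j, a_dyn=0.2, cfo_hz=80e3):
    """|H(f,t)|^2 from Equation (2) with one dynamic path of length d0 + v*t.

    d0, h_static and a_dyn are assumed values, not measurements.
    """
    dynamic = a_dyn * cmath.exp(-2j * math.pi * (d0 + v * t) / WAVELENGTH)
    cfo_term = cmath.exp(-2j * math.pi * cfo_hz * t)
    return abs(cfo_term * (h_static + dynamic)) ** 2


def dominant_frequency(samples, rate_hz, max_hz=100):
    """Integer frequency (Hz) with the largest DFT magnitude after DC removal."""
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]
    best_f, best_mag = 0, 0.0
    for f in range(1, max_hz + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * f * n / rate_hz)
                  for n, x in enumerate(centered))
        if abs(acc) > best_mag:
            best_f, best_mag = f, abs(acc)
    return best_f


RATE = 1000        # assumed CSI sampling rate (samples per second)
TRUE_SPEED = 1.0   # assumed path length change speed (m/s)

power = [cfr_power(n / RATE, TRUE_SPEED) for n in range(RATE)]  # 1 s of samples
freq = dominant_frequency(power, RATE)
estimated_speed = freq * WAVELENGTH

print(f"dominant frequency: {freq} Hz, estimated speed: {estimated_speed:.3f} m/s")
```

Because the CFO term has unit magnitude, it cancels when the power is taken, which is why the sketch recovers the speed even with an 80 kHz offset. The estimate is quantized by the 1 Hz resolution of the one-second window.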
We use the Hilbert Transform to calculate the phase change of the waveform as follows. We first remove the DC component that accounts for the static paths. We then use the Hilbert Transform to derive the analytic signal from the waveform. The unwrapped instantaneous phase of the analytic signal keeps track of the phase change of the waveform. We can then multiply the phase change, measured in cycles (i.e., divided by $2\pi$), by the wavelength to get the path length change. Since the reflected signal goes through a round-trip from the reflector, the path length change is approximately two times the movement distance of the reflector in this case [29].

The Hilbert Transform based distance measurement has an average error of 2.86 cm, as shown in Figure 4(b) and 4(c). In the experiments, we move the steel plate for a random distance in the
range of 0∼1.6 m, which incurs a 0∼3.2 m path length change. The ground truth path length change is measured by a laser rangefinder, which provides distance measurement accuracy of 0.1 cm. Under a carrier frequency of 5.825 GHz, which has a wavelength of 5.15 cm, our path length measurement has a maximal error of 5.87 cm and a mean error of 2.86 cm. The major error sources are errors in deciding the phase of the starting and ending cycle. Therefore, the measurement error does not increase with the movement distance and is uniformly distributed in the range of 0∼6 cm, see Figure 4(c).

[Figure 4: Experiments with steel plates moving along a straight line. (a) CSI waveform for a movement with 0.8 m path length change (CSI power vs. time in seconds). (b) Measurements of path length change (path length change vs. moving distance, both in meters; ground truth and measurement results). (c) CDF of measurement error (in meters).]

4. MODELING OF HUMAN ACTIVITIES

4.1 Human Activity Characteristics

Modeling CFR power change caused by human activity is challenging. Unlike the simple object used in Section 3.5, human bodies have complex shapes and different body parts can move at different speeds. Moreover, the reflections from body parts may go through different paths in complex indoor environments. From Equation (3), we see that the CFR power is a linear combination of all the reflected paths and the speeds of path length change are preserved in the combination process. Therefore, we can use Time-Frequency analysis tools, such as Short-Time Fourier Transform (STFT) or Discrete Wavelet Transform (DWT), to separate these components in the frequency domain. Human activity can be modeled by profiling the energy of each frequency component derived from Time-Frequency analysis tools.
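To make the time-frequency idea concrete, here is a minimal STFT sketch. It is an illustration under assumed parameters, not the processing pipeline of CARM: the 400 samples per second rate, the 0.5 s Hann window, and the synthetic waveform (1 s at 10 Hz followed by 1 s at 40 Hz, standing in for a slow movement followed by a fast one) are all hypothetical. Each peak frequency is converted to a reflector speed using the round-trip relation, speed = frequency × wavelength / 2.

```python
import cmath
import math

WAVELENGTH = 0.0515   # metres, the 5.825 GHz wavelength quoted in the text
RATE = 400            # assumed sampling rate (samples per second)
WINDOW = 200          # assumed window length: 0.5 s, i.e. 2 Hz bin spacing


def hann(n_points):
    """Periodic Hann window of length n_points."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_points) for n in range(n_points)]


def spectrogram(samples, window, hop, n_bins):
    """One row of DFT magnitudes (bins 0..n_bins-1) per windowed frame."""
    taper = hann(window)
    rows = []
    for start in range(0, len(samples) - window + 1, hop):
        frame = [samples[start + n] * taper[n] for n in range(window)]
        rows.append([
            abs(sum(x * cmath.exp(-2j * math.pi * k * n / window)
                    for n, x in enumerate(frame)))
            for k in range(n_bins)
        ])
    return rows


def peak_frequencies(rows, rate_hz, window):
    """Frequency (Hz) of the strongest non-DC bin in each frame."""
    return [max(range(1, len(row)), key=row.__getitem__) * rate_hz / window
            for row in rows]


# Synthetic CSI power variation: slow movement (10 Hz), then fast movement (40 Hz).
signal = [math.cos(2 * math.pi * (10 if n < RATE else 40) * n / RATE)
          for n in range(2 * RATE)]

rows = spectrogram(signal, WINDOW, hop=WINDOW, n_bins=51)   # bins cover 0..100 Hz
peaks = peak_frequencies(rows, RATE, WINDOW)
speeds = [f * WAVELENGTH / 2 for f in peaks]                # round-trip path

print(peaks)
print([round(s, 4) for s in speeds])
```

A longer window sharpens the frequency (speed) resolution at the cost of time resolution, which is the usual STFT trade-off when profiling short activities.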
As an example, Figure 5 illustrates the waveform and the corresponding spectrogram for three human activities: walking, falling and sitting down. The spectrogram shows how the energy of each frequency component evolves with time, where high-energy components are colored in red. In the spectrogram for the walking activity, there is a high-energy band around 35∼40 Hz, as shown in Figure 5(d). With a wavelength of 5.15 cm, these frequency components represent 0.9∼1.0 m/s movement speed after considering the round-trip path length change. This coincides with the normal movement speed of the human torso while walking [25]. Figure 5(e) shows the spectrogram of falling, which has an energy increase in the frequency range of 40∼80 Hz between 1∼1.5 seconds. This indicates a fast speed-up from below 0.5 m/s to 2 m/s during a short time period of 0.5 seconds, which is a clear sign of falling. The activity of sitting down shown in Figure 5(f) is different from falling, as the speed for sitting down is much slower. Using the energy profile of different frequencies, we can build a CSI-activity model, which quantifies the correlation between the movement speeds of different human body parts and a specific human activity.

4.2 Robustness of Activity Speeds

We next study whether the speed based CSI-activity model is robust across different scenarios. It is well known that the path length change is determined by both the position of the sender/receiver and the movement directions [29]. Movements with the same speed may introduce different path length change speeds when movement directions are different. Furthermore, different people may perform the same activity with different speeds and the multi-path conditions may change under different environments.
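The dependence on movement direction can be seen with simple geometry. The sketch below assumes a hypothetical 2 m link and a single point reflector 2 m away on the perpendicular bisector, and computes how much the sender-reflector-receiver path changes per metre of movement in three directions. The positions and function names are illustrative assumptions.

```python
import math


def path_length(sender, receiver, reflector):
    """Length of the sender -> reflector -> receiver path."""
    return math.dist(sender, reflector) + math.dist(reflector, receiver)


def path_change_per_metre(sender, receiver, reflector, direction_deg, step=0.01):
    """Path length change per metre moved, for a small step in the given direction."""
    theta = math.radians(direction_deg)
    moved = (reflector[0] + step * math.cos(theta),
             reflector[1] + step * math.sin(theta))
    before = path_length(sender, receiver, reflector)
    after = path_length(sender, receiver, moved)
    return (after - before) / step


SENDER, RECEIVER = (-1.0, 0.0), (1.0, 0.0)   # assumed 2 m link along the x axis
BODY = (0.0, 2.0)                            # reflector on the perpendicular bisector

# 90 deg: straight away from the link; 0 deg: parallel to the link.
ratios = {deg: path_change_per_metre(SENDER, RECEIVER, BODY, deg)
          for deg in (90, 45, 0)}

for deg, ratio in ratios.items():
    print(f"direction {deg:>2} deg: {ratio:.3f} m of path change per m moved")
```

In this geometry the ratio stays below 2 in every direction and drops to almost zero for movement parallel to the link, so the same body speed can map to quite different path length change speeds.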
Our experiments show that different human activities actually incur path length change speeds with significant differences, so that the minor measurement differences caused by movement direction and the different ways to perform the same activity can be safely ignored. To study the robustness of the movement speeds, we collect more than 780 activity samples for three activities, walking, running and sitting down, performed by 25 volunteers with different ages and genders. The activities are performed at different locations with different directions, e.g., we ask the volunteer to walk around a large table so that four different walking directions are captured. Figure 6 shows the estimated torso speed distribution for the three different activities. Note that we estimate the torso speed by dividing the speed of path length change by two. This usually gives a smaller estimate than the actual speed because, depending on the movement direction, moving by 1 cm usually causes less than 2 cm of path length change [29]. Even with different movement directions, we observe that the three activities have different speeds in Figure 6. Such speed differences can be used for activity classification. As an example, we can achieve a classification accuracy of 88% for all three activities when we divide the samples into three types with estimated speeds of 0∼0.61 m/s, 0.61∼1.0 m/s and above 1.0 m/s. By looking at various different activities, we found that most human activities contain speed components ranging from 0∼2.5 m/s and the frequency components for a given activity are stable across different scenarios, including apartments, offices, and large open areas, see our evaluations in Section 8. Therefore, the strength of the frequency components can serve as a robust feature for human activities.
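The three-range split described above amounts to a threshold rule on the estimated torso speed. The following sketch uses the thresholds from the text; the assignment of labels to ranges (sitting down slowest, running fastest) is an assumption inferred from the ordering of the activities, and the function names are hypothetical.

```python
SLOW_LIMIT = 0.61   # m/s, upper bound of the lowest speed range in the text
FAST_LIMIT = 1.0    # m/s, upper bound of the middle speed range in the text


def torso_speed(path_speed_mps):
    """Estimated torso speed: half of the path length change speed."""
    return path_speed_mps / 2.0


def classify_by_speed(speed_mps):
    """Map an estimated torso speed to one of three activities.

    The label-to-range mapping is an assumption, not stated in the text.
    """
    if speed_mps < SLOW_LIMIT:
        return "sitting down"
    if speed_mps <= FAST_LIMIT:
        return "walking"
    return "running"


# Hypothetical path length change speeds (m/s) measured from CSI.
for path_speed in (0.8, 1.6, 2.6):
    estimate = torso_speed(path_speed)
    print(f"{path_speed} m/s path change -> {estimate} m/s -> {classify_by_speed(estimate)}")
```

A single speed threshold is only a baseline; it ignores how the speed evolves over time, which is what the state-based model in the next subsection is meant to capture.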
[Figure 6: Histogram of speeds for different activities (probability vs. estimated speed in m/s, for running, walking and sitting down).]

4.3 CSI-Activity Model

We propose to use Hidden Markov Model (HMM) to build CSI-activity models that consist of multiple movement states. As an example, we observe from Figure 5(e) that the action of falling comprises several states. The person first moves slowly, with most CSI energy on the low frequency (slow movement) components. Then, there is a fast transition to very high speed movement where sub-