RespTracker: Multi-user Room-scale Respiration Tracking with Commercial Acoustic Devices

Haoran Wan, Shuyu Shi, Wenyu Cao, Wei Wang, Guihai Chen
State Key Laboratory for Novel Software Technology, Nanjing University
{wanhr, wenyucao}@smail.nju.edu.cn, {ssy, ww, gchen}@nju.edu.cn

Abstract: Continuous domestic respiration monitoring provides vital information for diagnosing assorted diseases. In this paper, we introduce RESPTRACKER, the first continuous, multiple-person respiration tracking system in domestic settings using acoustic-based COTS devices. RESPTRACKER uses a two-stage algorithm to separate and recombine respiration signals from multiple paths in a short period so that it can track the respiration rate of multiple moving subjects. Our experimental results show that our two-stage algorithm can distinguish the respiration of at least four subjects at a distance of three meters.

I. INTRODUCTION

Background and Motivation: Respiration is one of the vital signs that contain valuable information for diagnosing assorted diseases, e.g., pulmonary disease [1], heart failure [2], anxiety [3], and sleep disorders [4]. Clinical instruments, such as capnography or plethysmography, provide reliable respiration measurements. However, they need professional operators and cannot be deployed in the domestic scenario to perform long-term monitoring, which is vital to the early diagnosis of chronic diseases, such as obstructive sleep apnea syndrome (OSAS) and chronic obstructive pulmonary disease (COPD). As a result, the development of domestic continuous respiratory monitoring systems has attracted increasing research interest in recent years.

There are domestic respiratory monitoring systems based on cameras [5] or on special devices, including belts integrated with capacitive sensors [6] and smart cushions with air pressure sensors [7].
However, user studies have shown that people are reluctant to deploy these devices due to privacy concerns [5], [8] or the high cost and long-term physical contact requirements [6], [7]. A more promising solution is to enable device-free respiratory monitoring with the ubiquitously available wireless signals emitted by commercial off-the-shelf (COTS) devices in domestic settings [9]–[11].

Figure 1. General application scenario of RESPTRACKER.

Shuyu Shi is the corresponding author.

Limitations of Prior Art: Existing device-free respiratory monitoring systems leverage two types of signals emitted by COTS devices: radio frequency (RF) signals and ultrasound signals. One popular solution for RF-based systems is collecting Wi-Fi channel state information (CSI) for further respiration measurements [12]. However, due to the narrow bandwidth of Wi-Fi signals, the range resolution of CSI is too low to separate two nearby respiration signals. For example, the aliasing range between two non-resolvable paths is 7.5 m for a Wi-Fi bandwidth of 40 MHz. Existing works either rely on differences in the respiration rate [11] or use specialized high-bandwidth frequency modulated continuous wave (FMCW) radar and Independent Component Analysis (ICA) [9] to separate multiple users. These solutions impose extra assumptions on respiration patterns or need specialized devices that increase the domestic deployment cost.

Acoustic-based systems turn the speaker-microphone pair integrated into COTS devices, such as mobile phones and smart speakers, into an active sonar to perform the respiration monitoring task. The advantage of acoustic-based systems is their higher range resolution [10], e.g., a typical bandwidth of 4 kHz leads to a range resolution of 8.5 cm for ultrasound signals. However, due to the fast attenuation of sound signals, most acoustic-based systems have a limited range of 0.7∼1.1 m [10], [13], [14].
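The two resolution figures quoted above are consistent with the usual relation between path-length resolution and bandwidth, written here as \Delta d = c / B. The worked check below is only a sketch under that assumption. The relation itself, the radio propagation speed of 3 x 10^8 m/s, and the speed of sound of about 340 m/s are not stated in the text and are assumed for illustration.

```latex
\begin{align*}
\Delta d_{\text{Wi-Fi}} &= \frac{c_{\text{RF}}}{B}
  = \frac{3 \times 10^{8}\ \text{m/s}}{40 \times 10^{6}\ \text{Hz}}
  = 7.5\ \text{m}, \\
\Delta d_{\text{sound}} &= \frac{c_{\text{sound}}}{B}
  = \frac{340\ \text{m/s}}{4 \times 10^{3}\ \text{Hz}}
  = 0.085\ \text{m} = 8.5\ \text{cm}.
\end{align*}
```

Under these assumptions, the much slower propagation speed of sound more than compensates for its far narrower bandwidth, which is why an acoustic system can resolve paths that a 40 MHz Wi-Fi channel cannot.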
Therefore, their applications are limited to sleep monitoring instead of room-scale domestic deployment for continuous respiratory monitoring and tracking.

Proposed Approach: In this paper, we introduce RESPTRACKER, the first continuous, multiple-person respiration tracking system in domestic settings using acoustic-based COTS devices. As shown in Figure 1, the respiration signals of different users may arrive at the receiver through multiple paths. RESPTRACKER proposes a multipath separation and combination framework for robust respiration signal tracking.

First, RESPTRACKER uses inaudible sound signals modulated by the Zadoff-Chu (ZC) sequence to separate sound reflections from different users. Compared to traditional FMCW-based systems, the key advantage of our separation scheme
is that we can precisely measure both the amplitude and the phase of individual reflection paths. Then, RESPTRACKER turns the indoor multipath effect to our advantage by recombining the multipath signals belonging to the same user. Our signal combination algorithm performs a multi-dimensional search and analysis among different distances, multiple receiving microphones, and different time-frames, based on the amplitude and phase measurements of the ZC signal. In this way, we can reliably cluster reflection paths to different users even if they have similar respiration rates. With our two-stage scheme, RESPTRACKER can reliably detect a single person's respiration signal at a distance of 3 meters and regain tracking of each user within 20 seconds after a movement. We can also separate multiple subjects' respiration and track each of them in domestic settings.

Technical Challenges and Solutions: The first challenge is to reliably separate multiple breath signals. Existing work on multi-user breath detection [9] leverages the ICA algorithm to extract different subjects' respiration. As multiple reflections of the wireless signal are mixed at the receiver due to the limited range resolution, such systems need a reliable decomposition algorithm to separate them. To address this challenge, we use the ZC sequence to distinguish different sound reflection paths with a high resolution of less than 10 cm. In addition, we can measure the features of individual paths in terms of the channel impulse response (CIR). In this way, each path contains less interference from other subjects, so the difficulty of signal decomposition is greatly reduced.

The second challenge is to expand the monitoring range to room scale. Since the ultrasonic signal attenuates quickly in indoor environments, the measurement of a single path could be noisy and inaccurate.
The traditional delay-and-sum beamforming algorithm blindly combines signals from the same distance and angle, so the weak respiration signal may be destroyed by out-of-phase combination. To resolve this issue, we use a multi-dimensional signal combination scheme to select and recombine the respiration signals from the same user. We first leverage the multiple microphones that are common on COTS devices, such as Amazon Echo and Google Home, to collect multiple copies of the sound reflections. Based on the multipath phenomenon, we also collect sound reflections on paths at different distances that arrive at the same microphone. By clustering these multi-dimensional reflection signals, we can determine whether a given path on a given microphone contains the respiration signal and which user the respiration signal belongs to. In this way, we are able to combine a large number of weak paths from the same user, thereby reconstructing the respiration signal reliably and achieving long-distance monitoring.

Figure 2. CIR waveform of a single subject. (a) CIR amplitude of a single frame. (b) Time variations of CIR amplitudes. (c) Filtered CIR amplitude at different distances (normalized amplitude versus time in seconds, for the amplitude at 50 cm and the amplitude at 80 cm). (d) Reconstructed respiration signal (normalized amplitude versus time in seconds, for the reconstructed waveform and the ground truth waveform).

The third challenge is to track the respiration signal while the subject is moving. As users may not stay static in their daily routine, our monitoring system should be able to keep tracking while users change their position or orientation. To achieve respiration tracking under dynamic position and orientation, we divide the signal into short observation slots with a duration of twenty seconds. Within each slot, we first determine whether there are users' movements and then track the distance change of each reflection path.
Therefore, we can quickly use the historical data to regain synchronization within twenty seconds after the movement.

Summary of Experimental Results: In the single-user scenario, our system can robustly estimate the respiration rate with an error under 0.6 Beats per Minute (BPM) in different environments, such as hallways, offices, and conference rooms. RESPTRACKER can also achieve an error of less than 1 BPM within a distance of three meters and maintain an error of less than 0.8 BPM while the user is moving. In the multi-user scenario, RESPTRACKER can separate the respiration signals of more than four users in the same room and achieve an error of less than 1 BPM for each user.

II. SYSTEM OVERVIEW

RESPTRACKER aims at multiple-person room-scale respiration tracking. Therefore, the system is supposed to reliably detect and separate weak reflection signals at a long range.

A. Design Motivations

To understand the design challenges for long-range respiration signal detection and separation, we provide a typical respiration signal illustration in Figure 2. Figure 2(a) shows the amplitude of multipath signals at different distances, where
each peak corresponds to one signal path. From Figure 2(a), we have two observations. First, due to the high resolution of the sound signal, the width of each peak is less than 10 cm, so that theoretically we can separate two users even if they are just 10 cm apart. Second, the sound signal attenuates quickly and it is hard to reliably detect peaks at a distance of 4 meters.

Figure 3. System Overview of RESPTRACKER.

Figure 2(b) further illustrates the time variations of the paths, where we remove the static components by subtracting the paths that do not change within a period of half a minute, e.g., the LOS path and reflections from walls. We observe from Figure 2(b) that the respiration of a user causes regular fluctuations in the corresponding path. More interestingly, a single user may incur correlated changes in multiple paths, as the signal may be reflected by the wall before reaching the chest of the user and may reflect from different parts of the chest. While these reflections are weak, they provide important respiration information about the same user. This matters because it is well known that the signal quality of a single path largely depends on the posture and angle of the user [11]. The fluctuations of a single path may be undetectable for certain user orientations, which leads to interruptions in continuous monitoring. Therefore, it is vital to combine the information of different paths to perform reliable continuous monitoring.

Figure 2(c) shows the waveform of the respiration signal of the same user at reflection paths at different distances. While the patterns of these signals are similar, they have different phases and signal details. Therefore, directly adding these paths may not be an effective way to enhance the signal.

Based on the above observations, we find that RESPTRACKER needs to address two important challenges. First, how to efficiently separate and identify the multipaths of different users?
Second, how to reliably combine and reconstruct the breath signals from different paths belonging to a single user?

B. System Design

To address the above challenges, RESPTRACKER proposes a two-stage design as shown in Figure 3.

The first stage is signal separation. We use COTS speakers to transmit ZC-modulated sound signals. The reflected signals are received by a microphone array that collects multiple copies of the reflection signal. We perform frequency-domain cross-correlation between the received and the transmitted signal to derive the CIR. We detect each path in randomly sampled frames and calculate the respiration SNR in the frequency domain to select the paths that are candidates for respiration-related reflections, which will be used in the second stage.

The second stage is path combination. To expand the sensing range, we first perform cross-correlation between the detected paths and their surrounding samples to calculate the delay and conduct delay-and-sum in the local paths. We then use a Principal Component Analysis (PCA) algorithm to optimally combine the time-domain waveforms of the detected paths. Based on the combined respiration signal, we perform room-scale tracking by calculating the waveform of each observation slot independently. Finally, we use the reconstructed breath signal to perform breath rate estimation for each user.

III. SIGNAL SEPARATION

We use ZC sequences, which have an ideal auto-correlation property, to separate paths of different users and at different distances.

A. ZC Modulation

The transmitting signal used in RESPTRACKER is the ZC sequence modulated by a sinusoid carrier [15]. The ZC sequence with a length of N_zc is given by:

$$zc[n] = e^{-j \frac{\pi u n (n + 1 + 2q)}{N_{zc}}}, \quad n = 0, \ldots, N_{zc} - 1, \qquad (1)$$

where u and q are the parameters of the sequence. We set q to 0, u to 1, and N_zc to 199, representing a 2 kHz bandwidth in the modulated signal.
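The following NumPy sketch generates the sequence of Eq. (1) with the parameters given above and checks the auto-correlation property numerically. It assumes the standard Zadoff-Chu form, in which the phase is \pi u n (n + 1 + 2q) / N_zc. The function names `zadoff_chu` and `circular_autocorrelation` are hypothetical and chosen for illustration only.

```python
import numpy as np


def zadoff_chu(n_zc: int = 199, u: int = 1, q: int = 0) -> np.ndarray:
    """Zadoff-Chu sequence of odd length n_zc with root u and shift q.

    Assumes the standard form exp(-j*pi*u*n*(n + 1 + 2q) / n_zc).
    """
    n = np.arange(n_zc)
    return np.exp(-1j * np.pi * u * n * (n + 1 + 2 * q) / n_zc)


def circular_autocorrelation(x: np.ndarray) -> np.ndarray:
    """Circular auto-correlation of x, computed through the FFT."""
    spectrum = np.fft.fft(x)
    return np.fft.ifft(spectrum * np.conj(spectrum))


zc = zadoff_chu()                          # q = 0, u = 1, N_zc = 199
acf = np.abs(circular_autocorrelation(zc))

# Every sample has unit magnitude, and all the correlation energy sits at lag 0.
print("max |zc[n]|      :", np.abs(zc).max())
print("lag-0 correlation:", acf[0])
print("largest side lag :", acf[1:].max())
```

The near-zero correlation at every non-zero lag is what allows closely spaced reflections to appear as separate peaks once the received signal is correlated with the transmitted one.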
Once we get the baseband signal, we use frequency-domain interpolation to expand the sequence to a length of L, which is the frame length of our OFDM symbol and is set to 4800 samples in our scheme. We then modulate the signal with a carrier sinusoid at a frequency of f_c by moving the baseband sequence to the higher frequency part. Before performing the Inverse Fast Fourier Transform (IFFT) for OFDM modulation, we set the negative frequency part to the conjugate counterpart of the signal on the positive frequency. Algorithm 1 shows the detailed process, where f_s is the sampling frequency. After we generate one frame of the time-domain real signal zc_T[n], we transmit it repeatedly so that the transmitted signals are cyclical OFDM symbols.

Algorithm 1: Transmitting signal generation
Result: The modulated sequence zc_T[n] with a length of L and a carrier frequency of f_c.
1 Generate zc[n] from Eq. (1) with a length of N_zc.
2 Perform FFT on zc[n] to get ZC[n].
3 Perform FFT shift on ZC[n] to get ZC_s[n].
4 Generate an all-zero sequence ZC_d[n] with a length of L.
5 ZC_d[f_c L / f_s − (N_zc − 1)/2 : f_c L / f_s + (N_zc − 1)/2] ⇐ ZC_s[n].
6 ZC_d[L − f_c L / f_s − (N_zc − 1)/2 : L − f_c L / f_s + (N_zc − 1)/2] ⇐ ZC_s^*[n].
7 Perform IFFT on ZC_d to get the time-domain zc_T[n].
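Algorithm 1 can be sketched in NumPy as follows. This is a minimal illustration and not the authors' implementation. The sampling rate of 48 kHz is inferred from a 4800-sample frame lasting 0.1 s, and the 20 kHz carrier is an assumed value, since f_c is not given in the text. The names `make_frame` and `estimate_delay` are hypothetical. Step 6 is implemented as a mirrored conjugate so that the IFFT output is real, which is one reading of "conjugate counterpart". The last function is a round-trip check that recovers a known circular delay by correlating in the frequency domain.

```python
import numpy as np

FS = 48_000   # sampling rate in Hz (inferred: 4800 samples per 0.1 s frame)
FC = 20_000   # carrier frequency in Hz (assumed; not stated in the text)
L = 4800      # OFDM frame length in samples
N_ZC = 199    # ZC sequence length


def zadoff_chu(n_zc=N_ZC, u=1, q=0):
    """Zadoff-Chu sequence in the standard form (assumed for Eq. 1)."""
    n = np.arange(n_zc)
    return np.exp(-1j * np.pi * u * n * (n + 1 + 2 * q) / n_zc)


def passband_bins(frame_len, n_zc, fs, fc):
    """FFT bin indices occupied by the ZC spectrum around the carrier."""
    centre = round(fc * frame_len / fs)
    half = (n_zc - 1) // 2
    return np.arange(centre - half, centre + half + 1)


def make_frame(fs=FS, fc=FC, frame_len=L, n_zc=N_ZC):
    """Return one real time-domain frame and the shifted ZC spectrum."""
    zc_spec = np.fft.fftshift(np.fft.fft(zadoff_chu(n_zc)))  # steps 1-3
    spectrum = np.zeros(frame_len, dtype=complex)            # step 4
    band = passband_bins(frame_len, n_zc, fs, fc)
    spectrum[band] = zc_spec                                 # step 5
    # Step 6: bin (frame_len - k) receives the conjugate of bin k, which
    # makes the spectrum Hermitian and the time-domain frame real.
    spectrum[frame_len - band] = np.conj(zc_spec)
    frame = np.fft.ifft(spectrum)                            # step 7
    return frame.real, zc_spec


def estimate_delay(received, zc_spec, fs=FS, fc=FC):
    """Locate the strongest path, in samples, by frequency-domain correlation."""
    frame_len, n_zc = len(received), len(zc_spec)
    band = passband_bins(frame_len, n_zc, fs, fc)
    corr = np.fft.fft(received)[band] * np.conj(zc_spec)
    # Move the correlated band down to baseband (centred on bin 0) and
    # zero-pad to the frame length before returning to the time domain.
    half = (n_zc - 1) // 2
    baseband = np.zeros(frame_len, dtype=complex)
    baseband[np.arange(-half, half + 1) % frame_len] = corr
    cir = np.fft.ifft(baseband)
    return int(np.argmax(np.abs(cir)))


frame, zc_spec = make_frame()
delayed = np.roll(frame, 137)              # emulate one path, 137 samples late
print("estimated delay:", estimate_delay(delayed, zc_spec))
```

Because the frame is transmitted cyclically, a propagation delay appears as a circular shift within each frame, which is why `np.roll` is a reasonable stand-in for a single echo in this check.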
B. ZC Demodulation

After the signal is transmitted from the speaker, the microphone array at the receiver side records the signals that come from both the LOS path and the reflections from subjects and the environment. On one speaker/microphone pair, we can extract one set of CIR per OFDM frame by performing cross-correlation between the received signal and the known transmitted signal [15]. Instead of using time-domain down-conversion and correlation as in [15], we perform the correlation through frequency-domain multiplication, which greatly reduces the computational complexity of the correlation.

The received signal is modeled as:

$$zc_R[n] = \sum_{i=1}^{P} A_i e^{-j\phi_i(t)} \, zc_T[n - \tau_i], \qquad (2)$$

where zc_R[n] is the received signal, P is the number of paths, A_i is the attenuation coefficient of path i, φ_i is the phase shift caused by the propagation/reflection of path i, and τ_i is the time of flight (ToF) of path i. We first segment the received signal into frames with the same length of L. We then perform FFT on each frame and extract the OFDM passband frequency components ZC_R[n] corresponding to the transmitted ZC_s[n]. We multiply ZC_R[n] by ZC_s^*[n] to perform cross-correlation in the frequency domain. According to the ideal auto-correlation property of the ZC sequence [16], the auto-correlation ZC_s[n] × ZC_s^*[n] is all 1 in the frequency domain. Therefore, the cross-correlation gives an ideal CIR under the bandwidth limitation. We use zero-padding to expand the frequency-domain baseband length to L and then perform an IFFT to get an interpolated time-domain CIR. The peaks in the resulting CIR denote differently delayed versions of the transmitted signal from different paths, as shown in Figure 2(a). Algorithm 2 shows the detailed demodulation process.

Algorithm 2: Received signal demodulation
Result: The interpolated time-domain cir[n].
1 Perform FFT on zc_R[n] to get ZC_R[n].
2 CIR_baseband[n] ⇐ ZC_R[n] × ZC_s^*[n].
3 Generate an all-zero sequence CIR[n].
4 CIR[0 : (N_zc − 1)/2] ⇐ CIR_baseband[0 : (N_zc − 1)/2].
5 CIR[L − (N_zc − 1)/2 : L] ⇐ CIR_baseband[(N_zc − 1)/2 : N_zc].
6 Perform IFFT on CIR[n] to get the time-domain cir[n].

On each speaker/microphone pair, we obtain one measurement of cir[n] per OFDM frame, which has a duration of 0.1 second. We assemble the CIR measurements of multiple OFDM frames within an Observation Slot to form a 2D CIR map, as shown in Figure 2(b). The time-domain resolution of 0.1 s in the CIR map gives a sampling rate of 10 Hz, which is adequate for monitoring respiration signals that have a typical frequency of 0.1∼0.5 Hz.

C. Path Selection

Before we reconstruct the respiration signals from multiple paths, we first need to select the paths that contain breath-related signal patterns. Following the model in Eq. (2), we can denote breath-related reflections as:

$$zc_R[n] = A e^{-j(a + p)} \, zc_T\!\left[n - \frac{d_{body} + d(t)}{c} \times f_s\right], \qquad (3)$$

where d_body is the path length of the reflection from the user's body, d(t) is the chest movement during exhaling and inhaling, which is a periodic signal, and p is the phase shift caused by the software delay and reflection phase inversion. Under this model, the corresponding CIR is:

$$cir[n] = A e^{-j(a + p)} \, \mathrm{sinc}\!\left(n - \frac{d_{body} + d(t)}{c} \times f_s\right). \qquad (4)$$

As the OFDM signal is band-limited with a rectangular frequency gate function, the corresponding time-domain CIR is a convolution of the sinc function with the impulse response. For a breath movement with a given period, the corresponding CIR peak moves back and forth with an amplitude of d_r around d_body. As the LOS path and the reflections from the static environment or static body parts remain almost the same over time, we can separate the static paths and the breath-related paths by their periodicity. After the system starts monitoring, we first determine the location of the LOS path by voting on the maximum peak location over an initial window of frames, which is set to 20 frames in our experiments and corresponds to 2 seconds. The LOS localization is a one-time calibration because the distance between the speaker and the microphone is fixed during the monitoring.

Static Signal Removal and Random Sampling: After localizing the LOS path, we remove both the LOS path and the static reflections. As the LOS path and the static reflections correspond to peaks with quasi-static amplitude and phase, we can remove them by subtracting the average complex-valued CIR of each observation slot from each CIR frame. In this way, the remaining non-zero peaks correspond to dynamical paths. We then randomly sample R frames in the observation slot to detect the dynamical paths. The random sampling scheme is robust for respiration detection, as the paths corresponding to respiration may periodically disappear due to chest movements.

We use two extra constraints to remove the interference of noisy paths. First, we remove peaks whose amplitude is smaller than a fraction β of the amplitude of the maximum dynamical path. This effectively removes the fluctuations caused by the side-lobes of the sinc function, and we set the threshold β = 0.2. Second, we remove paths that lie within T_o sample points of one another to avoid repetition.

Breath SNR Calculation: After detecting the dynamical paths, we use the breath SNR to determine whether a path contains the respiration signal or other interfering movements. The breath SNR is based on the observation that the respiration signal has a strong frequency component within the breath frequency range of 0.1∼0.5 Hz, as indicated in Eq. (4). Therefore, for a specific dynamical path, we first perform an
B. ZC Demodulation

After the signal is transmitted from the speaker, the microphone array at the receiver side records the signals that come from both the LOS path and the reflections from the subjects and the environment. For each speaker/microphone pair, we can extract one set of CIR per OFDM frame by cross-correlating the received signal with the known transmitted signal [15]. Instead of using the time-domain down-conversion and correlation as in [15], we perform the correlation as a multiplication in the frequency domain, which greatly reduces the computational complexity of the correlation. The received signal is modeled as:

$$zc_R[n] = \sum_{i=1}^{P} A_i e^{-j\phi_i(t)}\, zc_T\left[n - \tau_i f_s\right], \tag{2}$$

where $zc_R[n]$ is the received signal, $P$ is the number of paths, $A_i$ is the attenuation coefficient of path $i$, $\phi_i$ is the phase shift caused by the propagation/reflection of path $i$, and $\tau_i$ is the time of flight (ToF) of path $i$. We first segment the received signal into frames of the same length $L$. We then perform an FFT on each frame and extract the OFDM passband frequency components $ZC_R[n]$ corresponding to the transmitted $ZC_s[n]$. We multiply $ZC_R[n]$ by $ZC_s^*[n]$ to perform the cross-correlation in the frequency domain. According to the ideal auto-correlation property of the ZC sequence [16], the auto-correlation $ZC_s[n] \times ZC_s^*[n]$ is all ones in the frequency domain. Therefore, the cross-correlation gives an ideal CIR under the bandwidth limitation. We use zero-padding to expand the frequency-domain baseband length to $L$ and then perform an IFFT to get an interpolated time-domain CIR. The peaks in the resulting CIR denote differently delayed versions of the transmitted signal arriving from different paths, as shown in Figure 2(a). Algorithm 2 shows the detailed demodulation process.

Algorithm 2: Received signal demodulation
Result: The interpolated time-domain $cir[n]$.
1 Perform FFT on $zc_R[n]$ to get $ZC_R[n]$.
2 $CIR_{baseband}[n] \Leftarrow ZC_R[n] \times ZC_s^*[n]$.
3 Generate an all-zero sequence $CIR[n]$.
4 $CIR[0 : \frac{N_{zc}-1}{2}] \Leftarrow CIR_{baseband}[0 : \frac{N_{zc}-1}{2}]$.
5 $CIR[L - \frac{N_{zc}+1}{2} : L] \Leftarrow CIR_{baseband}[\frac{N_{zc}+1}{2} : N_{zc}]$.
6 Perform IFFT on $CIR[n]$ to get the time-domain $cir[n]$.

For each speaker/microphone pair, we obtain one measurement of $cir[n]$ per OFDM frame, which has a duration of 0.1 second. We assemble the CIR measurements of multiple OFDM frames within an Observation Slot to form a 2D CIR map, as shown in Figure 2(b). The time-domain resolution of 0.1 s in the CIR map gives a sampling rate of 10 Hz, which is adequate for monitoring respiration signals that have a typical frequency of 0.1∼0.5 Hz.

C. Path Selection

Before we reconstruct the respiration signals from multiple paths, we first need to select the paths that contain breath-related signal patterns. As modeled in Eq. (2), we can denote breath-related reflections as:

$$zc_{R_b}[t] = A e^{-j\left(\frac{2\pi f d(t)}{c} + p\right)}\, zc_T\left[n - \frac{d_{body} + d(t)}{c} \times f_s\right], \tag{3}$$

where $d_{body}$ is the path length of the reflection from the user's body, $d(t)$ is the chest movement during exhaling and inhaling, which is a periodic signal, and $p$ is the phase shift caused by the software delay and the reflection phase inversion. Under this model, the corresponding CIR is:

$$cir_{R_b}[t] = A e^{-j\left(\frac{2\pi f d(t)}{c} + p\right)}\, \mathrm{sinc}\left(n - \frac{d_{body} + d(t)}{c} \times f_s\right). \tag{4}$$

As the OFDM signal is band-limited with a rectangular frequency gate function, the corresponding time-domain CIR is a convolution of the sinc function with the impulse response. For a breath movement with a period of $1/f_b$, the corresponding CIR peak moves back and forth with an amplitude of $d_r$ around $d_{body}$. As the LOS path and the reflections from the static environment or static body parts remain almost the same over time, we can separate the static paths from the breath-related paths by their periodicity. After the system starts monitoring, we first determine the location of the LOS path by voting for the maximum peak location over the first $L_v$ frames, where $L_v$ is set to 20 in our experiments, corresponding to 2 seconds.
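As an illustration, Algorithm 2 and the LOS voting step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`dft`, `zc_sequence`, `demodulate`, `peak_index`, `vote_los`), the ZC root, the sequence length of 31, the 4x interpolation factor, and the simulated two-path channel are all hypothetical. The frame is assumed to contain only the ZC bins (no passband extraction), the slice bounds in steps 4 and 5 follow one reading of the algorithm using Python's half-open ranges, and a naive DFT replaces the FFT only to keep the example free of dependencies.

```python
import cmath
from collections import Counter


def dft(x, inverse=False):
    """Naive O(N^2) DFT/IDFT; stands in for an FFT to avoid dependencies."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
               for m in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def zc_sequence(n_zc, root=1):
    """Unit-magnitude Zadoff-Chu sequence of odd length n_zc (assumed root)."""
    return [cmath.exp(-1j * cmath.pi * root * k * (k + 1) / n_zc)
            for k in range(n_zc)]


def demodulate(rx_frame, zc_ref, out_len):
    """Steps 1-6 of Algorithm 2 for a frame that holds only the ZC bins."""
    n_zc = len(zc_ref)
    rx_freq = dft(rx_frame)                                      # step 1
    base = [r * s.conjugate() for r, s in zip(rx_freq, zc_ref)]  # step 2
    padded = [0j] * out_len                                      # step 3
    half = (n_zc + 1) // 2
    padded[:half] = base[:half]                                  # step 4: low bins
    padded[out_len - (n_zc - half):] = base[half:]               # step 5: high bins
    return dft(padded, inverse=True)                             # step 6


def peak_index(cir):
    """Index of the strongest tap in one CIR frame."""
    return max(range(len(cir)), key=lambda n: abs(cir[n]))


def vote_los(cir_frames):
    """Majority vote over per-frame peak locations (LOS calibration)."""
    votes = Counter(peak_index(c) for c in cir_frames)
    return votes.most_common(1)[0][0]


# Hypothetical channel: a strong path delayed by 5 samples (LOS) and a
# weaker reflection delayed by 12 samples, both applied circularly.
zc_ref = zc_sequence(31)          # frequency-domain reference ZC_s
tx = dft(zc_ref, inverse=True)    # time-domain transmitted frame
paths = [(5, 1.0), (12, 0.5)]     # (delay in samples, attenuation)
rx = [sum(a * tx[(n - d) % len(tx)] for d, a in paths) for n in range(len(tx))]

cir = demodulate(rx, zc_ref, out_len=124)   # 4x interpolation
peak = peak_index(cir)
los = vote_los([cir] * 20)                  # 20 identical frames, for illustration
print(peak, los)
```

With 4x interpolation, the 5-sample delay of the stronger path appears at index 20 of the interpolated CIR, and the weaker path at index 48.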
The LOS localization is a one-time calibration because the distance between the speaker and the microphone is fixed during the monitoring.

Static Signal Removal and Random Sampling: After localizing the LOS path, we remove both the LOS path and the static reflections. As the LOS path and the static reflections correspond to peaks with quasi-static amplitude and phase, we can remove them by subtracting the average complex-valued CIR of each observation slot from each CIR frame. In this way, the remaining non-zero peaks correspond to dynamical paths. We then randomly sample $R$ frames in the observation slot to detect the dynamical paths. The random sampling scheme is robust for respiration detection, as the paths corresponding to respiration may periodically disappear due to chest movements.

We use two extra constraints to remove the interference of noisy paths. First, we remove peaks whose amplitude is smaller than a fraction $\beta$ of the maximum dynamical path. This effectively removes the fluctuation caused by the sidelobes of the sinc function, and we set the threshold to $\beta = 0.2$. Second, we remove paths that are within $T_b$ sample points of another detected path to avoid repetition.

Breath SNR Calculation: After detecting the dynamical paths, we use the breath SNR to determine whether a path contains the respiration signal or other interfering movements. The breath SNR is based on the observation that the respiration signal has a strong frequency component within the breath frequency range of 0.1∼0.5 Hz, as indicated in Eq. (4). Therefore, for a specific dynamical path, we first perform an
FFT along the time axis to get the spectrum of the path. We then measure the maximum energy in the FFT bins within the breath frequency range of 0.1∼0.5 Hz as $E_{max}$. The breath SNR is defined as

$$w_1 \frac{E_{max}}{\left(\sum_{f \in [0.1, 0.5]} E_f\right) - E_{max}} + w_2 \frac{E_{max}}{\sum_{f \in [0.5, 5]} E_f},$$

which is a weighted sum of the uniqueness of the peak within the breath frequency range and the strength of the peak compared to other movements. In this way, we can detect the candidate paths that correspond to breath movements for further path combination in the next section.

IV. PATH COMBINATION

In this section, we reconstruct the respiration signal through two-round path combinations on the candidate paths detected in the previous section. We also illustrate how to separate the respiration signals of multiple users and how to track users if they move during the monitoring.

A. Two-Round Combinations

Traditional delay-and-sum combination for beamforming does not work well for respiration signal reconstruction, as shown by our experimental results in Section V. Therefore, we propose a two-round combination scheme to enhance the respiration signal.

Local Path Combination: According to our signal model, the CIR samples surrounding each peak share the same pattern as the path at the peak, so we can combine them to enhance the common features caused by breathing. Specifically, we calculate the cross-correlation between each candidate path and the $N_{local}$ path samples around it to get the weight parameters. We then delay the surrounding paths and add them to the candidate path with a weighted sum to reduce the noise of the single sample at the candidate peak.

Path Combination from Different Distances: In this subsection, we first consider path combination for a single user, where all candidate paths come from the same respiration movements.
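For reference, the breath SNR defined at the end of Section III can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: the function name `breath_snr`, the equal weights `w1 = w2 = 0.5`, the small `eps` guard against division by zero, and the synthetic test signals are all hypothetical, since the source does not give the weight values. The 10 Hz sampling rate and the 200-frame slot follow the text, and a naive DFT is used only to keep the example free of dependencies.

```python
import cmath
import math
import random


def breath_snr(path, fs=10.0, w1=0.5, w2=0.5, eps=1e-12):
    """Breath SNR of one path sampled along the time axis at fs Hz.

    w1, w2 and eps are assumed values; the paper does not state them.
    """
    n = len(path)
    energy = {}
    # Energy of every positive-frequency bin up to the Nyquist frequency.
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * m / n)
                    for m, x in enumerate(path))
        energy[k * fs / n] = abs(coeff) ** 2
    breath_band = [e for f, e in energy.items() if 0.1 <= f <= 0.5]
    other_band = [e for f, e in energy.items() if 0.5 <= f <= 5.0]
    e_max = max(breath_band)
    uniqueness = e_max / (sum(breath_band) - e_max + eps)
    strength = e_max / (sum(other_band) + eps)
    return w1 * uniqueness + w2 * strength


# Synthetic 20 s paths: one dominated by 0.25 Hz breathing, one dominated
# by a faster 2 Hz movement, both with a little Gaussian noise.
rng = random.Random(0)
times = [m / 10.0 for m in range(200)]
breath_path = [math.sin(2 * math.pi * 0.25 * t) + rng.gauss(0, 0.1) for t in times]
motion_path = [math.sin(2 * math.pi * 2.0 * t) + rng.gauss(0, 0.1) for t in times]

snr_breath = breath_snr(breath_path)
snr_motion = breath_snr(motion_path)
print(snr_breath > snr_motion)
```

A path would then be kept as a candidate when its breath SNR exceeds a threshold; the threshold value itself is not specified in this section.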
After the local combination, we gather the candidate paths from different distances and microphones to form a matrix $X$ of size $n \times T_p$, where $n$ is the total number of candidate paths and $T_p$ is the number of frames in the observation slot. We use only the amplitude of the candidate paths to avoid the phase noise in the paths. We then remove the static part of each row through the LEVD algorithm [17] and apply a moving average filter with a length of nine samples to smooth the waveform.

Although these data all come from a single user and share the same breath pattern, they have different phases and signal details, see Figure 2(c), caused by the propagation delay and environment reflections. A straightforward method is to use the breath SNR as an indicator and exhaustively search all possible phase delay parameters to maximize the SNR of the generated signal, which is time-consuming. Instead, we use the PCA algorithm to extract the principal components that are strongly correlated with the respiration signal. The first principal component of the signal matrix gives a low-noise reconstruction of the respiration signal. Figure 2(d) compares the reconstructed respiration waveform with the ground truth waveform captured by the respiration belt.

Figure 4. CIR map with two users in the environment.

B. Path Clustering for Multiple Users

In real-world scenarios, there might be more than one user in the room. Therefore, we need to distinguish the paths belonging to each user before performing the combination.

Separation of Different Users: As our ZC sequence has a range resolution of around 10 cm, we can separate users by their different distances to the receiver. Figure 4 shows the CIR map when there are two users at distances of 1 meter and 1.5 meters. We can clearly observe two traces related to the respiration signals of these two users at the corresponding distances.
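Returning briefly to the single-user combination of Section IV-A, the PCA step can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the synthetic four-path data, and the noise level are hypothetical; simple mean removal stands in for the LEVD algorithm [17]; and the first principal component is obtained by power iteration on the $n \times n$ covariance matrix only to keep the example free of dependencies.

```python
import math
import random


def moving_average(row, length=9):
    """Centered moving average; the window shrinks at the slot edges."""
    half = length // 2
    out = []
    for t in range(len(row)):
        window = row[max(0, t - half): t + half + 1]
        out.append(sum(window) / len(window))
    return out


def first_principal_component(paths, iterations=200):
    """Project the n x Tp amplitude matrix onto its leading eigenvector."""
    rows = []
    for p in paths:
        smoothed = moving_average(p)
        mean = sum(smoothed) / len(smoothed)
        rows.append([x - mean for x in smoothed])   # stand-in for LEVD
    n, t_p = len(rows), len(rows[0])
    cov = [[sum(a * b for a, b in zip(rows[i], rows[j])) / t_p
            for j in range(n)] for i in range(n)]
    v = [1.0 / math.sqrt(n)] * n                    # initial guess
    for _ in range(iterations):                     # power iteration
        w = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(v[i] * rows[i][t] for i in range(n)) for t in range(t_p)]


def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den


# Synthetic slot: four noisy paths carrying the same 0.25 Hz breath pattern
# with different gains and signs (hypothetical values).
rng = random.Random(1)
truth = [math.sin(2 * math.pi * 0.25 * m / 10.0) for m in range(200)]
gains = [1.0, -0.8, 0.6, 0.9]
paths = [[g * s + rng.gauss(0, 0.3) for s in truth] for g in gains]

component = first_principal_component(paths)
corr = correlation(component, truth)
print(abs(corr) > 0.9)
```

The sign of a principal component is arbitrary, so the sketch compares the absolute correlation with the reference waveform. PCA absorbs per-path gain and sign differences but not arbitrary phase delays, which this toy data does not model.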
We treat the user separation problem as an unsupervised classification problem and use the K-means algorithm to cluster the paths. As different users may have similar breath rates and phases, we use the distance as the feature for the clustering algorithm. After the clustering, the paths of the same user are more likely to be placed in the same class, since the effective multipath reflections are mostly located around the direct reflection. We then perform the two-round combination algorithm to reconstruct the respiration signal of each user.

In the multi-user scenario, each user may have a different breath SNR. Therefore, we lower the SNR threshold so that more paths are included in the multipath clustering.

C. Tracking

Users may move during the respiration monitoring period. Therefore, we need to relocate the users and regain synchronization after each movement. To achieve this, we divide the continuous monitoring period into shorter observation slots and perform user tracking within each slot.

To balance the accuracy of movement detection against the delay of respiration rate estimation, we set the observation slot length to 20 seconds, which spans 200 OFDM frames. Within each observation slot, we perform movement detection on the path index change and the combination result. When a movement occurs, the peaks found for breathing shift substantially and the periodic pattern of the result is destroyed. Brief movements have little impact: because we sample the frames randomly within the observation slot, the probability that the small portion of frames containing movement is sampled is low, and the major component of the selected paths is still breath related. However, when there are movements across the whole slot, we should discard the given slot entirely. In this case, the reconstructed waveform has an abrupt shape with no periodicity