182:6 • Wang et al.

[Fig. 3. Principle analysis of vibration sensing from the IQ plane. (a) USRP signal components (amplitude vs. time (s); labeled segments: QUERY, ACK, RN16, EPC, CW). (b) Constellation of tag movement (in-phase axis; ON and OFF/CW states). (c) Amplitude of USRP signal (amplitude vs. time (s); USRP-100Hz and USRP-300Hz). (d) Frequency analysis of raw signal (FFT vs. frequency (Hz); USRP-100Hz and USRP-300Hz).]

Observation 1: The USRP reader, with its higher sampling rate, is more suitable for eavesdropping than the COTS RFID reader.

For the COTS RFID reader, we can only detect the 100 Hz sound in the frequency domain and the time domain, i.e., the orange wave in Figure 2(b) and the orange peak in Figure 2(c). According to the Nyquist-Shannon sampling theorem [31], a sampling rate above 600 Hz is required to capture the 300 Hz sound. Although compressive reading [40, 41] can recover mechanical vibration, it cannot sense the human voice, which has complicated frequency bands. Therefore, we do not consider compressive sensing and use the traditional FFT to measure the frequency bands. For the USRP reader, even though the reader signal is much stronger than the tag signal, leading to large signal noise, we can still observe the weak tag signals of 100 Hz and 300 Hz in the time domain and the frequency domain, i.e., the 100 Hz red wave and the 300 Hz jitters in Figure 2(b), and the corresponding blue peaks in Figure 2(c). Thus, when we focus on the human voice with complicated frequency bands, the USRP platform is more suitable than the COTS RFID readers.

3.2 Tag Movement vs. Tag Vibration

Since the tag vibration can be regarded as a small tag movement, we next investigate how the physical-layer signal changes with the tag movement by pushing the tag close to the antenna.

Observation 2: The tag movement leads to a wavy change in the time domain and a rotation of the signal vector in the IQ plane.
As shown in Figure 3(a), when we push the tag toward the antennas from 1.5 m to 1.3 m, the signal amplitude varies like a cosine function. As shown in Figure 3(b), when we push the tag toward the antenna, the signal rotates in the IQ plane, and the rotation center is not at the origin. This means that the received signal does not change linearly with the tag-antenna distance. Moreover, two main circles are formed in this figure.

Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., Vol. 5, No. 4, Article 182. Publication date: December 2021

In the enlarged time-domain signal of Figure 3(a), we can clearly see the QUERY and ACK commands from the
reader, as well as the RN16 response and EPC response from the tag. Comparing Figure 3(b) with Figure 3(a), the two circles in Figure 3(b) are caused by the changes of the CW signals and the tag backscattered signals, which correspond to the OFF and ON states of tag modulation [13]. Note that when we push the tag about 20 cm, which is about 1.23× the half wavelength of the CW signal, the signal rotates about 1.23 circles. Since the tag vibration is a small tag movement, it leads to a small wavy change in the time domain and a small rotation of the signal vector in the IQ plane, which are used to build the model in Section 4.

Thru-the-wall Eavesdropping on Loudspeakers via RFID by Capturing Sub-mm Level Vibration • 182:7

[Fig. 4. Transmission model in RFID system. (a) Signal components in RFID (TX, RX, tag displacement, environment; CW signal, multi-path signal, backscattered signal, leakage signal). (b) Signal in the IQ plane (in-phase vs. quadrature; signal changes due to movement).]

3.3 Tag Vibration vs. Diaphragm Vibration

Since both the tag and the diaphragm may vibrate due to the sound pressure, we conduct experiments to study their different influences. Particularly, we remove the tag in front of the loudspeaker shown in Figure 2(a) to capture the diaphragm vibration from the CW signal.

Observation 3: The tag vibration captured by backscattered signals is much larger than the loudspeaker diaphragm vibration captured by CW signals.

Comparing Figure 2(b) with Figure 3(c), when we remove the tag from the loudspeaker, the periodic patterns are distinctly reduced. Particularly, for the 100 Hz sound, we can still observe a weak periodic pattern in Figure 3(c), but the amplitude is much weaker than in Figure 2(b). For the 300 Hz sound, no periodic pattern can be found in Figure 3(c) and Figure 3(d). The reason is that the metallic tag backscatters more RF signal than the paper diaphragm. Thus, the attached tag amplifies the interference of the loudspeaker through backscattering.
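As a rough numerical check of the rotation count quoted in Section 3.2, the sketch below computes the round-trip phase change for a given tag displacement. The 920 MHz carrier is an assumption (the text does not state the frequency), and the 0.5 mm vibration amplitude and the function name `round_trip_phase` are hypothetical, chosen only for illustration.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def round_trip_phase(displacement_m: float, wavelength_m: float) -> float:
    """Phase change in radians of the backscattered signal.

    Moving the tag by `displacement_m` changes the reader-tag-reader path
    by twice that amount, so the phase changes by
    2*pi * (2 * displacement) / wavelength.
    """
    return 2.0 * math.pi * 2.0 * displacement_m / wavelength_m


# Assumption: 920 MHz carrier (a typical UHF RFID frequency); not given in the text.
wavelength = SPEED_OF_LIGHT / 920e6

# The 20 cm push described in Section 3.2.
push_phase = round_trip_phase(0.20, wavelength)
push_turns = push_phase / (2.0 * math.pi)

# Hypothetical 0.5 mm vibration amplitude (sub-mm, illustration only).
vib_phase = round_trip_phase(0.0005, wavelength)

print(f"wavelength: {wavelength * 100:.1f} cm")
print(f"20 cm push: {push_turns:.2f} turns, {push_phase / math.pi:.2f} pi rad")
print(f"0.5 mm vibration: {math.degrees(vib_phase):.2f} degrees")
```

Under the assumed carrier, the 20 cm push comes out close to the 1.23 rotations reported above, while a sub-mm vibration rotates the signal vector by only about one degree, which is why the vibration appears as a small rotation in the IQ plane.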
4 SYSTEM DESIGN

In this section, we introduce the principle of Tag-Bug, which extracts the vibration of the tag based on the signal model. In particular, we propose to extract the sound from either the vibration effect or the reflection effect of the tag. According to the sound extraction model, we design a new tag response mechanism, which randomizes the tag responses and improves the sound quality.

4.1 Transmitting Model

Uplink. In RFID systems, the transmitting antenna TX sends the CW signal to activate the tag, as shown in Figure 4(a). Due to the multi-path effect, the signal reflected from the environment also arrives at the tag together with the CW signal:

$$S_{tag} = S_{TX}(h_d + h_E). \tag{1}$$

Here, $S_{tag}$ indicates the signal received by the tag, $S_{TX}$ is the CW signal sent by the TX antenna, $h_d$ is the signal attenuation due to the transmitting distance, and $h_E$ is the signal attenuation due to the multi-path effect of the environment. Particularly, in an ideal channel model [13], $h_d$ can be calculated as $h_d = \frac{1}{d} e^{\mathrm{j}\theta_d}$, where $d$ is the distance between the TX antenna and the tag, and $\mathrm{j}$ is the imaginary unit. $\theta_d$ is the phase calculated from distance
$d$ and wavelength $\lambda$, as:

$$\theta_d = 2\pi \frac{d}{\lambda} \bmod 2\pi. \tag{2}$$

$h_E$ is related to the distance $d$ and the transmitting environment in principle.

Downlink. After the tag receives the signal, the tag backscatters the signal with FM0 or Miller modulation, which encodes the binary bits with ON and OFF states [13]. For the OFF state, the tag backscatters all the CW signal, which has a small amplitude. Therefore, the signal received by the reader is the combination of the backscattered signal from the tag, $S_{tag}(h_{d'} + h_{E'})$, and the leakage signal from the reader, $S_{TX} h_L$:

$$S_{RX,0} = S_{TX} h_L + S_{tag}(h_{d'} + h_{E'}) = S_{TX}(h_L + h_d h_{d'} + h_{E,d}), \tag{3}$$

where $h_{d'}$ is the signal attenuation due to the downlink transmitting distance and $h_{E'}$ indicates the environment influence in the backscattered channel. For simplicity, we use $h_{E,d}$ to represent the overall signal attenuation due to the environment, which is also related to the distance $d$.

For the ON state, the tag backscatters a large-amplitude signal by changing the state of the tag antenna. Thus, the received signal is:

$$S_{RX,1} = S_{TX} h_L + S_{tag}(h_{d'} + h_{E'}) h_1 = S_{TX}(h_L + h_1 h_d h_{d'} + h'_{E,d}), \tag{4}$$

where $h_1$ is the modulation gain of the tag, and $h'_{E,d}$ is the overall signal attenuation due to the environment for the ON state. In RFID systems, the tag changes the antenna capacitance to modulate the CW signal during backscattering, so $h_1$ is usually regarded as a signal enhancement. Particularly, because the multi-path effect from the environment is relatively small, we omit the influence of $h_1$ on the environment term and regard $h'_{E,d} \approx h_{E,d}$. As a result, the signal received by the reader can be divided into three parts: the leakage signal $S_L$, the multi-path signal $S_E$, and the backscattered signal $S_0$ or $S_1$, where

$$\begin{cases} S_L = S_{TX} h_L, \\ S_E = S_{TX} h_{E,d}, \\ S_0 = S_{TX} h_d h_{d'}, \\ S_1 = S_{TX} h_d h_{d'} h_1. \end{cases} \tag{5}$$
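The decomposition in Eqs. (1)-(5) can be sketched numerically as follows. This is only an illustrative reading of the model: the function names `h_dist` and `received`, the wavelength, and all channel values (`H_L`, `H_ENV`, `H_1`) are hypothetical, and the environment term is treated as identical in both states, as the text assumes.

```python
import cmath
import math


def h_dist(d: float, wavelength: float) -> complex:
    """Ideal distance channel h_d = (1/d) * exp(j*theta_d), theta_d from Eq. (2)."""
    theta = (2.0 * math.pi * d / wavelength) % (2.0 * math.pi)
    return cmath.exp(1j * theta) / d


def received(d, d_prime, wavelength, s_tx, h_l, h_env, h_1):
    """Return (S_RX0, S_RX1) as in Eqs. (3) and (4), with h'_{E,d} = h_{E,d}."""
    s_l = s_tx * h_l                                  # leakage signal S_L
    s_e = s_tx * h_env                                # multi-path signal S_E
    s_0 = s_tx * h_dist(d, wavelength) * h_dist(d_prime, wavelength)  # OFF state
    s_1 = s_0 * h_1                                   # ON state
    return s_l + s_e + s_0, s_l + s_e + s_1


# Hypothetical values, chosen only to make the structure visible.
S_TX = 1.0 + 0j
H_L = 0.5 + 0.1j       # leakage channel
H_ENV = 0.02 - 0.01j   # static multi-path channel
H_1 = 3.0              # modulation gain
WAVELENGTH = 0.326     # m, assumed UHF RFID wavelength

center = S_TX * (H_L + H_ENV)  # common rotation center S_L + S_E
for d in (1.50, 1.45, 1.40, 1.35, 1.30):
    rx0, rx1 = received(d, d, WAVELENGTH, S_TX, H_L, H_ENV, H_1)
    print(f"d={d:.2f} m  |RX0-center|={abs(rx0 - center):.3f}  "
          f"|RX1-center|={abs(rx1 - center):.3f}")
```

In this sketch both states rotate around the same off-origin center $S_L + S_E$, with the ON state at a larger radius, which is one way to picture the two arcs of Figure 3(b).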
When the TX antenna and RX antenna are placed close to each other and the tag is relatively far from the two antennas, we regard $d' \approx d$. Thus, both $S_0$ and $S_1$ are proportional to $h_d h_{d'} = h_d^2 = \frac{1}{d^2} e^{\mathrm{j}2\theta_d}$, indicating that the phase change is $2\pi \frac{2d}{\lambda}$. Such a phase change is consistent with the results in Figure 3(b), where a 20 cm movement leads to a phase change of $2.45\pi$ radians.

IQ plane analysis. Figure 4(b) presents the signal model in the IQ plane. The transmitting distance $d$/$d'$ changes with the tag movement, leading to a change of both the multi-path signal $S_E$ and the backscattered signal $S_0$/$S_1$. Thus, the phases of $S_E$ and $S_0$/$S_1$ change, resulting in the rotation of the corresponding signals. The phase change of $S_0$/$S_1$ is caused by the signal attenuation $h_d^2$, whose phase change is $2\pi \frac{2d}{\lambda}$. Therefore, both $S_0$ and $S_1$ rotate with the transmitting distance $d$, which leads to two arcs in the IQ plane. Since $S_E$ is usually static, we omit it for simplicity. These results explain the signal change in Figure 3(b).

4.2 Sound Extraction from Vibration Effect

Theoretically, the vibration of the tag due to the sound leads to a variation of the transmitting distance as $d = d_0 + f(t, d_v)$. Here, $d_0$ is the average tag-antenna distance, and $f(t, d_v)$ is the distance variation related to time $t$ and vibration amplitude $d_v$. For a mono-tone sound with frequency $\varphi$, $f(t, d_v) = d_v \cos(2\pi\varphi t)$, which can be extended to any complicated sound with multiple tones. For simplicity, we introduce the algorithm with a mono-tone sound. In an ideal model, such tag vibration can be directly captured by the received signals $S_0$ and $S_1$. However, since the leakage signal $S_L$ is much stronger than the backscattered signals $S_0$ and $S_1$, the small changes of $S_0$ and $S_1$ do not remarkably affect the received signals $S_{RX,0}$ and $S_{RX,1}$. Figure 5(a) plots the vibration-based
signal change by omitting $S_E$. Both $S_{RX,0}$ and $S_{RX,1}$ rotate slightly, and the raw phase change is very small due to the strong leakage signal. Moreover, the sub-mm level vibration of the tag due to the sound can easily be drowned out by the ambient noise. Thus, we need to amplify the vibration effect by removing the strong interference.

[Fig. 5. Vibration extraction mechanisms. (a) Raw signal vs. centralized signal (IQ plane; raw phase change and centralized phase change). (b) Signal cancellation from adjacent samples (amplitude vs. time (s); annotations: large signal noise, long time interval). (c) Amplified MSD vs. Static MSD (IQ plane). (d) Vibration extraction results of different cancellation methods (FFT vs. frequency (Hz) for raw phase, adjacent samples, mean of $S_{RX,1}$, and mean of $S_{RX,0}$; the 300 Hz peak is marked).]

Naïve Normalization. The direct way is to centralize $S_{RX,1}$ by subtracting its average value $\bar{S}_{RX,1}$, as shown in Figure 5(a). The phase variation range can then be amplified to $[0, 2\pi]$. However, in a real system, $S_{RX,1}$ contains large ambient noise, and such a subtraction also brings in the noise signal. Thus, both the vibration effect and the signal noise are amplified.
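The effect of centralizing can be illustrated with a small noise-free simulation of a vibrating tag. Everything numeric here is an assumption made for illustration (wavelength, distances, leakage value, backscatter gain, sampling rate), and `s_rx1` and `phase_span` are hypothetical helper names; the sketch only compares the phase swing before and after subtracting the mean.

```python
import cmath
import math

WAVELENGTH = 0.326       # m, assumed UHF RFID wavelength
D0 = 1.5                 # m, average tag-antenna distance (hypothetical)
DV = 0.0005              # m, vibration amplitude (hypothetical, sub-mm)
TONE_HZ = 300.0          # Hz, mono-tone sound frequency
FS = 30000.0             # Hz, hypothetical sampling rate (100 samples per period)
S_L = 0.5 + 0.1j         # hypothetical strong leakage signal
BACKSCATTER_GAIN = 0.05  # hypothetical S_TX * h_1, much weaker than the leakage


def s_rx1(t: float) -> complex:
    """ON-state received signal S_L + S_1(t) for d = d0 + dv*cos(2*pi*f*t)."""
    d = D0 + DV * math.cos(2.0 * math.pi * TONE_HZ * t)
    theta = 2.0 * math.pi * d / WAVELENGTH
    return S_L + BACKSCATTER_GAIN * cmath.exp(2j * theta) / d ** 2


def phase_span(samples) -> float:
    """Peak-to-peak phase in radians (no unwrapping; enough for this comparison)."""
    phases = [cmath.phase(s) for s in samples]
    return max(phases) - min(phases)


samples = [s_rx1(n / FS) for n in range(1000)]   # 10 full vibration periods
mean = sum(samples) / len(samples)
centralized = [s - mean for s in samples]         # naive normalization

raw_span = phase_span(samples)
central_span = phase_span(centralized)
print(f"raw phase swing:         {raw_span:.5f} rad")
print(f"centralized phase swing: {central_span:.3f} rad")
```

The raw swing stays tiny because the leakage dominates the vector, whereas the centralized samples sit around the origin and their phase swings over a wide range. Any additive noise in the samples would be moved to the origin in the same way, which matches the drawback noted above.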
To efficiently amplify the vibration effect, our basic idea is to extract the backscattered signals, which are related to the tag displacement. If we can obtain the backscattered signal $S_0$ or $S_1$, the corresponding phase change indicates the tag displacement. However, it is difficult to measure the leakage signal $S_L$ and the environment signal $S_E$, so we cannot obtain either $S_0$ or $S_1$ individually from $S_{RX,0}$ and $S_{RX,1}$. Fortunately, since both $S_L$ and $S_E$ are static in most scenarios, by regarding $h'_{E,d} \approx h_{E,d}$, we can remove $S_L$ and $S_E$ by subtracting Eq. (3) from Eq. (4):

$$\Delta S_{RX} = S_{RX,1} - S_{RX,0} \approx S_{TX}(h_1 - 1) h_d^2. \tag{6}$$

We call it the Modulated Signal Difference (MSD). Here, only $h_d$ changes with the tag vibration in principle, meaning that the vibration can be extracted from the MSD phase.

However, in any snapshot, only one of $S_{RX,0}$ and $S_{RX,1}$ can be received. Therefore, we cannot obtain the MSD $\Delta S_{RX}$ directly in practice. For a static tag, we can use the averages of $S_{RX,0}$ and $S_{RX,1}$ to calculate the MSD $\Delta S_{RX}$, which is called the Static MSD. But for a vibrating tag, both $S_{RX,0}$ and $S_{RX,1}$ change even during one tag response. Therefore, we cannot simply calculate the MSD from the average values. To address this problem, two kinds of cancellation solutions are considered to extract the MSD efficiently.
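A minimal sketch of the Static MSD for a static tag is given below, assuming the model of Eqs. (3), (4) and (6) with $d' \approx d$. The channel values, the Gaussian noise level, the sample count, and the names `capture` and `static_msd` are all hypothetical. The ON and OFF samples are captured separately, since only one state is observable at a time, and their means are subtracted.

```python
import cmath
import math
import random

WAVELENGTH = 0.326    # m, assumed UHF RFID wavelength
S_TX = 0.05 + 0j      # hypothetical CW signal
H_L = 10.0 + 2.0j     # hypothetical leakage channel (S_L = 0.5 + 0.1j)
H_ENV = 0.4 - 0.2j    # hypothetical static multi-path channel
H_1 = 3.0             # hypothetical modulation gain


def h_d_squared(d: float) -> complex:
    """h_d^2 = (1/d^2) * exp(j*2*theta_d), with theta_d = 2*pi*d/lambda."""
    theta = 2.0 * math.pi * d / WAVELENGTH
    return cmath.exp(2j * theta) / d ** 2


def capture(d, state, n, rng, sigma=0.0005):
    """Simulate n noisy samples of S_RX,1 (state=1) or S_RX,0 (state=0)."""
    gain = H_1 if state == 1 else 1.0
    clean = S_TX * (H_L + H_ENV + gain * h_d_squared(d))
    return [clean + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(n)]


def static_msd(d, rng, n=2000):
    """Static MSD: mean of ON-state samples minus mean of OFF-state samples."""
    on = capture(d, 1, n, rng)
    off = capture(d, 0, n, rng)
    return sum(on) / n - sum(off) / n


rng = random.Random(0)
msd_a = static_msd(1.500, rng)
msd_b = static_msd(1.501, rng)   # tag placed 1 mm farther away

measured = cmath.phase(msd_b * msd_a.conjugate())
expected = 2.0 * math.pi * 2.0 * 0.001 / WAVELENGTH
print(f"MSD phase change: measured {measured:.4f} rad, model {expected:.4f} rad")
```

Because the leakage and multi-path terms are identical in both states, they cancel in the difference, and the MSD phase follows $2\theta_d$ regardless of how strong the leakage is. This only works here because the tag is static during each capture; for a vibrating tag the averages smear the motion, which is the limitation stated above.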