This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TMC.2020.3034354, IEEE Transactions on Mobile Computing.
1536-1233 (c) 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. XX, NO. XX, 2020

SpeedTalker: Automobile Speed Estimation via Mobile Phones

Xinran Lu, Lei Xie, Member, IEEE, Yafeng Yin, Member, IEEE, Wei Wang, Member, IEEE, Yanling Bu, Member, IEEE, Qing Guo, and Sanglu Lu, Member, IEEE

Abstract: Among all causes of road accidents, speeding is the deadliest. To reduce speeding, it is essential to devise efficient schemes for ubiquitous speed monitoring. Traditional approaches require either special equipment (e.g., radar speed guns) or special deployment (e.g., fixed-position cameras). In this paper, we propose SpeedTalker, a mobile phone-based approach to speed detection for automobiles. By leveraging the built-in microphones and camera of the mobile phone, SpeedTalker estimates the automobile speed by passively sensing acoustic and image signals. We propose an integrated solution that estimates the automobile's speed with COTS devices, and provide a platform for every pedestrian to help report speeding events. Specifically, we use the time difference of arrivals (TDOA) model on the acoustic signals to obtain the candidate trajectories of the automobile, and use the pin-hole model on the image frames to obtain the vertical distance between the user's position and the automobile's trajectory, which singles out the unique trajectory. Combined with the timestamps of the trajectory, the automobile speed can be estimated.
In addition, we propose a method to mitigate the influence of the movement jitter of the mobile phone. We implemented a system prototype of SpeedTalker and estimated the automobile speed with high accuracy. Experimental results show that, in the single-automobile scenario, SpeedTalker achieves an average estimation error of 6.1% relative to radar speed guns. In the multiple-automobile scenario, SpeedTalker achieves an average estimation error of 9.8%, which is acceptable for practical use.

• Xinran Lu, Lei Xie, Yafeng Yin, Wei Wang, Yanling Bu, Qing Guo and Sanglu Lu are with the State Key Laboratory for Novel Software Technology, Nanjing University, China. E-mail: luxinran@smail.nju.edu.cn, lxie@nju.edu.cn, yafeng@nju.edu.cn, ww@nju.edu.cn, yanling@smail.nju.edu.cn, guoqing@smail.nju.edu.cn, sanglu@nju.edu.cn.
• Lei Xie is the corresponding author.
Authorized licensed use limited to: Nanjing University. Downloaded on July 06, 2021 at 04:35:27 UTC from IEEE Xplore. Restrictions apply.

1 INTRODUCTION
1.1 Motivation
Nowadays, more and more traffic violations occur as the number of automobiles increases; e.g., in 2016, the number of road traffic deaths reached 1.35 million. Among all kinds of traffic violations, speeding is the most deadly factor [1]. Appropriate reductions in speed can reduce the risk of fatal and serious crashes, preventing death and serious injury [2]. To reduce speeding, it is essential to devise efficient schemes for ubiquitous traffic monitoring. The traditional ways to monitor traffic are speed radar and cameras. However, they are costly and inconvenient, since they require wide deployment of special equipment. As a result, a low-cost, mobile solution for measuring speed is needed.

Mobile phones, which embed many kinds of sensors such as cameras and microphones, have become indispensable in daily life. By utilizing the built-in sensors, we can measure automobile speed with a mobile phone. Specifically, we use the microphones and camera to recover the trajectory of the automobile and estimate its speed, and we use the IMU sensors to remove jitter and thus improve the accuracy of the system. In this way, every pedestrian can help monitor traffic conditions with his/her mobile phone. Furthermore, everyone can participate in reporting traffic conditions through crowdsourcing [3].

Fig. 1: Application scenario of SpeedTalker. (a) Illustration of the system. (b) The application of the system.

A typical scenario of SpeedTalker is as follows. In speeding-prone areas, pedestrians who volunteer to monitor the traffic arrive at the area in advance and continuously record the acoustic and visual signals of the automobiles. The pedestrians only need to use the system for a few minutes to collect the traffic speed information for that period. SpeedTalker estimates the speed of the automobiles and collects the speeding-related information. The traffic speed information is uploaded to the server of the relevant department. With the help of volunteers, data from different regions at different times can then be analyzed for traffic control. The distribution of traffic police and equipment can be optimized, and drivers and pedestrians can be warned of danger when moving in the area.

1.2 Limitation of Prior Art
There exist two main approaches to measuring the speed of automobiles. One approach is to use fixed devices. Cameras and coils are the traditional fixed devices for speed detection. They monitor whether an automobile is present at two pre-set locations. When an automobile passes both locations, the system records the time interval between the two passages, so the speed of the automobile can be easily estimated. However, if fixed speed measurement devices were widely deployed to monitor traffic, the cost would be unacceptable. Besides, drivers can easily figure out where speed measurement devices are, since their positions are fixed. Moreover, each speed detection camera needs its own parameters to estimate the speed of the automobiles.
The height, orientation, and field of view (FOV) of a camera deployed on a traffic pole determine its detection region. This makes the estimation simple, but it works only for that specific camera.

Another approach to measuring the speed of an automobile is to use portable devices, such as a radar speed gun [4] or lidar [5]. Radar speed guns use the Doppler effect to measure speed. They send out a radio signal in a narrow beam, then receive the same signal back after it bounces off the target object. If the object is moving, the frequency of the radio waves changes. From the difference between the reflected and the transmitted radio waves, the speed of the object can be calculated. However, these portable devices have limitations. First, special devices are needed to emit directional modulated electromagnetic waves at a certain frequency. This increases the hardware cost and prevents wide use by ordinary people. Second, the electromagnetic waves emitted by the equipment can be easily detected by a radar detector in the automobile. This usually makes the equipment fail to capture the speeding event, since the automobile may intentionally slow down when it passes by.

Therefore, in order to make every pedestrian a potential speeding inspector, it is essential to leverage portable everyday devices, such as mobile phones, and to propose easy-to-use methods for measuring the speed of automobiles. In fact, by making full use of embedded sensors such as the microphones and cameras, we can effectively use mobile phones to measure automobile speed.

1.3 Our Approach
In this paper, we propose SpeedTalker, a mobile phone-based approach to speed detection for automobiles. Instead of using special devices, a pedestrian on the sidewalk can use the mobile phone's built-in microphones and camera to estimate the speed of the automobile. IMU sensors are utilized to compensate for the jitter caused by the user.
Figure 1 illustrates the application scenario of the system. To perform speed detection, the user holds the mobile phone in landscape orientation as shown in the figure, i.e., the top microphone and the bottom microphone are placed side by side, one on the left and one on the right. When the automobile passes by, both microphones record the sound of the automobile, and the camera records its movement. From the measurements of these two kinds of sensors, SpeedTalker estimates the speed of the automobile. Specifically, while the automobile is passing by, the sound wave reaches the top and bottom microphones at different times. From the time difference of arrivals (TDOA) derived from the acoustic signals of the two microphones, SpeedTalker estimates the candidate trajectories of the automobile as a set of hyperbolas. From the frames obtained by the camera, SpeedTalker estimates the vertical distance between the user's position and the automobile's trajectory, by referring to the pin-hole model of the camera. Then, the trajectory of the automobile can be determined from the candidates by referring to the unique vertical distance. Combined with the temporal information in the acoustic signals, SpeedTalker is able to estimate the speed of the automobile. Besides, since the mobile phone is held in the hand, jitter may cause rotation and translation of the phone. IMU sensors can be used to compensate for these translations and rotations and reduce the errors.

1.4 Challenges
There are three main challenges in our work. The first challenge is to propose a passive sensing method to measure the speed of an automobile. Passive sensing means that the detection system does not actively transmit any probing signals, such as ultrasound or flash light. Active sensing has two limitations for speeding detection.
First, active signals, e.g., electromagnetic waves, can be easily detected by radar detectors. Second, an ultrasonic wave or flash light actively generated by the mobile phone is dramatically attenuated when transmitted outdoors. To address this challenge, we propose a passive sensing method to estimate the speed of the automobile, utilizing the two microphones and one camera of the mobile phone. Instead of actively transmitting modulated signals and receiving the reflected signals, our solution only collects the acoustic signals and the image frames from the automobiles in a passive manner. The trajectory of the automobile can be estimated from the acoustic signals of the two separated microphones and the image frames of the camera. Combined with the timestamps of the trajectory, the speed of the automobile can be estimated.

The second challenge is to derive the automobile speed from the complicated acoustic signals. The complication comes from two aspects. On one hand, automobile noise is made up of many parts, including tire noise, engine noise, exhaust noise, wind noise, etc. [6]. These noises are mixed not only in the time domain but also in the frequency domain. Therefore, it is hard to separate the different noises with the two built-in microphones of a mobile phone. On the other hand, there may be many kinds of noise in the environment, especially the sound of other automobiles on the road. It is hard to remove this environmental noise, since the sound of other automobiles lies in a frequency band very close to that of the target automobile. To address this challenge, we consider the acoustic signals over the full frequency band as a whole. We utilize the cross-correlation of the acoustic signals from the top and bottom microphones to estimate the time difference of arrivals (TDOA). As the automobile is continuously moving, we can obtain a series of time delays through TDOA at different times. The candidate trajectories of the automobile can be estimated as a set of hyperbolas according to the curve of the time delay. Thus the automobile speed can be further estimated.

The third challenge is to estimate the speed of multiple automobiles. We cannot separate the sounds of multiple automobiles. Therefore, when multiple automobiles pass by the mobile phone, it is challenging to estimate their speeds. To address this challenge, we utilize the multiple peaks in the cross-correlation between the top and bottom microphones. From these peaks we can recover the delay curve of each automobile and calculate the speed of each automobile.
detection is an important research area since undetected automobiles are likely to endanger human life.Mobile 1.5 Contributions phones can be utilized to inform the users of the approach- This paper makes four contributions:First,this is the first ing automobiles.There are three approaches to sense the work that estimates the automobile speed via mobile phones automobiles with mobile phones.The first approach is to through passive sensing of acoustic and image signals.We install applications both on the automobiles and the mobile propose an integrated solution to effectively estimate the au- phones.Oki Electric Industry Co.Ltd.develops a mobile tomobile's speed based on commercial off-the-shelf(COTS) phone that notifies the users of the presence of the auto- devices,and provide a platform for every pedestrian to mobiles using DSRC[9].Car-2-X utilizes ad-hoc and cellular help report the speeding event of automobiles.Second,we networks to inform the pedestrians of the automobile with use the time difference of arrivals (TDOA)model based on the same method[10].The second approach is to sense acoustic signals to figure out the candidate trajectories of the moving automobiles via images.Sivaraman proposed a automobile,and use the pin-hole model based on image general active-learning framework for on-road automobiles frames to figure out the vertical distance,thus to estimate recognition and tracking based on videos[11].Wang pro- the unique trajectory.Combined with the timestamp of the posed WalkSafe,a mobile phone application based on the trajectory,the automobile speed can be estimated.Third,we back camera to sense the automobiles[121.The drawback of implemented a system prototype for SpeedTalker and esti- these work is that image processing needs huge calculating mated the automobile speed with high accuracy.The system resources.And the camera of the mobile phone is needed to works in the outdoor environment and effectively mitigates face the road,which makes the 
detection inconvenient.The the ambient environmental interference.Experiment results third approach is to utilize acoustic signals to sense the auto- show that SpeedTalker can achieve an average estimation mobiles.Tsuzuki proposed an automobile sound detection error of 6.1%in the scenario of single automobile.In the system for a mobile phone[13].Takagi introduced a hybrid scenario of multiple automobiles,SpeedTalker can achieve and electric vehicles detection system[14],which focused an average estimation error of 9.8%. on switching noise of the electric motor.So they failed to detect automobiles other than these types.Li proposed Auto++,a system that detects approaching automobiles for 2 RELATED WORK smart phone users to detect all kinds of automobiles via Automobile detection via visual signals:Traditional ap- overall acoustic signals[15].However,all these works can proaches utilize cameras to calculate the speed of the au- only inform the user of the approach of the automobiles tomobiles.Kumar7]and Czajewski8]use computer vision and can not estimate the speed of the automobile. based technologies to detect automobiles.The cameras are Sensing via acoustic signals with mobile phones:Sens- deployed in fixed positions and gestures above the street. 
ing with daily equipment is a popular issue.Sound waves As a result,the detection region is known and fixed.That can easily be transmitted and received by daily equipment, means the moving distance of the automobiles can easily such as mobile phones and smart watches.Much work be acquired.Then the speed of the automobiles can be based on sound wave has been published.AAMouse mea- calculated.However,the scenarios of SpeedTalker is differ- sures the Doppler Shift of the sound waves transmitted by ent from that of traditional visual approaches.The mobile a mobile phone to track the phone itself with an accuracy 1536-1233(c)2020 IEEE.Personal use is permitted,but republication/redistribution requires IEEE permission.See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. Authorized licensed use limited to:Nanjing University.Downloaded on July 06,2021 at 04:35:27 UTC from IEEE Xplore.Restrictions apply
1536-1233 (c) 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TMC.2020.3034354, IEEE Transactions on Mobile Computing IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. XX, NO. XX, 2020 3 different noises with two built-in microphones of mobile phones. On the other hand, there might be many kinds of noises in the environment, especially for the sound of other automobiles on the road. It is hard to remove the environment noises, since the frequencies of other automobiles mainly lie in very close frequency band with the target automobile. To address this challenge, we consider the acoustic signals at full frequency as a whole. We utilize the cross-correlation of the acoustic signals from the top and bottom microphones to estimate the time difference of arrivals (TDOA). As the automobile is continuously moving, we can obtain a series of time delays through TDOA at different time. The candidate trajectories of the automobile can be estimated as a set of hyperbolas according to the curve of the time delay. Thus the automobile speed can be further estimated. The third challenge is to estimate the speed of multiple automobiles. We can not separate the sound of multiple automobiles. Therefore, when multiple automobiles pass through the mobile phone, it is challenging to estimate the speed. To address the challenge, we utilize the multiple peaks in the cross-correlation figures between the top and bottom microphones. Then we may recover the delay curve of each automobile and calculate the speed of the automobiles. 
1.5 Contributions

This paper makes three contributions: First, this is the first work that estimates the automobile speed via mobile phones through passive sensing of acoustic and image signals. We propose an integrated solution to effectively estimate the automobile's speed based on commercial off-the-shelf (COTS) devices, and provide a platform for every pedestrian to help report the speeding events of automobiles. Second, we use the time difference of arrivals (TDOA) model based on acoustic signals to figure out the candidate trajectories of the automobile, and use the pin-hole model based on image frames to figure out the vertical distance, and thus determine the unique trajectory. Combined with the timestamps of the trajectory, the automobile speed can be estimated. Third, we implemented a system prototype for SpeedTalker and estimated the automobile speed with high accuracy. The system works in the outdoor environment and effectively mitigates the ambient environmental interference. Experiment results show that SpeedTalker can achieve an average estimation error of 6.1% in the scenario of a single automobile. In the scenario of multiple automobiles, SpeedTalker can achieve an average estimation error of 9.8%.

2 RELATED WORK

Automobile detection via visual signals: Traditional approaches utilize cameras to calculate the speed of automobiles. Kumar[7] and Czajewski[8] use computer vision based technologies to detect automobiles. The cameras are deployed at fixed positions and orientations above the street. As a result, the detection region is known and fixed. That means the moving distance of the automobiles can easily be acquired, and the speed of the automobiles can then be calculated. However, the scenario of SpeedTalker is different from that of traditional visual approaches. The mobile phones are on the sidewalk, and their positions and orientations are unknown. So novel approaches that utilize mobile phones to calculate the speed of automobiles are needed.
To get the relative position between the automobiles and the mobile phone, we need to use the camera of the mobile phone, which is analogous to knowing the positions and orientations of the cameras in traditional CV-based approaches. Apart from distance calculation, SpeedTalker utilizes acoustic signals to estimate the candidate trajectories of the automobiles. Acoustic signals have two advantages over visual signals. Firstly, the detection region of acoustic signals is broader than that of visual signals. The cameras of mobile phones usually have a narrow field of view (FOV). For example, the wide-angle camera of the Samsung Galaxy Note 8 has a 77° field of view. If we utilize the microphones of the Samsung Galaxy Note 8 to detect automobiles, the detection field of view is around 160° according to the hyperbola model we propose. Secondly, the computational complexity of acoustic signal processing is much lower than that of visual signal processing. If visual signals were used to complete the same work, every frame of the video would have to be processed, and the computational cost would be unacceptable.

Automobile detection via mobile phones: Automobile detection is an important research area since undetected automobiles are likely to endanger human life. Mobile phones can be utilized to inform users of approaching automobiles. There are three approaches to sensing automobiles with mobile phones. The first approach is to install applications on both the automobiles and the mobile phones. Oki Electric Industry Co. Ltd. developed a mobile phone that notifies users of the presence of automobiles using DSRC[9]. Car-2-X utilizes ad-hoc and cellular networks to inform pedestrians of automobiles in the same way[10]. The second approach is to sense the moving automobiles via images. Sivaraman proposed a general active-learning framework for on-road automobile recognition and tracking based on videos[11].
Wang proposed WalkSafe, a mobile phone application that uses the back camera to sense automobiles[12]. The drawback of these works is that image processing needs substantial computing resources. Moreover, the camera of the mobile phone must face the road, which makes the detection inconvenient. The third approach is to utilize acoustic signals to sense the automobiles. Tsuzuki proposed an automobile sound detection system for a mobile phone[13]. Takagi introduced a detection system for hybrid and electric vehicles[14], which focuses on the switching noise of the electric motor and therefore fails to detect other types of automobiles. Li proposed Auto++, a system that detects approaching automobiles of all kinds for smartphone users via the overall acoustic signals[15]. However, all these works can only inform the user of approaching automobiles and cannot estimate the speed of the automobile.

Sensing via acoustic signals with mobile phones: Sensing with daily equipment is a popular topic. Sound waves can easily be transmitted and received by daily equipment, such as mobile phones and smart watches. Much work based on sound waves has been published. AAMouse measures the Doppler shift of the sound waves transmitted by a mobile phone to track the phone itself with an accuracy of 1.4cm[16]. Wang proposed a device-free gesture tracking method using acoustic signals[17]. It has a tracking accuracy of 3.5mm and 4.6mm for 1-D hand movement and 2-D drawing in the air, respectively. ApneaApp uses chirp signals to detect the changes in reflected sound that are caused by human breathing[18]. The system applies FFT over the acoustic signals to monitor periodic movements that have a frequency lower than 1Hz. All these works need to actively transmit sound waves to sense objects. However, they do not work outdoors over a long distance with strong environmental noise. Overall, calculating the speed of an automobile with a mobile phone in an outdoor environment is quite challenging.

Fig. 2: Simple analysis on acoustic signals. (a) Empirical study setup. (b) STFT of acoustic signals when the automobile passes by. (c) S-shaped curve.

Distance perception via cameras: Distance perception is needed in computer vision to optimize algorithms and enhance performance. The traditional approach to estimating the distance between an object and the camera is to use a binocular system to calculate the depth. Hartley gives a detailed treatment of view geometry in computer vision for distance calculation[19]. Tram utilizes two cameras mounted in automobiles to capture LED light and estimate the distance between vehicles[20]. However, the approach is not suitable for our scenario.
Although some mobile phones have multiple cameras on the back, each camera has its own role. Some cameras have wide-angle lenses, some have telephoto lenses and some have infrared lenses. They may not work together at the same time. Moreover, some mobile phones have only one camera on the back. Some other papers use one camera to estimate the distance. Diaz-Cabrera utilizes one camera to estimate the distance between the automobile and the traffic light[21]. They need to know the height of the traffic light and the parameters of the camera in advance. Rahman utilizes one camera to estimate the distance between the user and the camera[22]. They also need to know the distance between the eyes and the parameters of the camera. Our approach uses similar view geometry to calculate the distance, and we obtain the real diameter of the wheel hub through machine learning approaches.

3 EMPIRICAL STUDY AND MODELING

3.1 Acoustic Signal Study

3.1.1 Measurement of Acoustic Signals via Mobile Phones

In order to study the relation between the acoustic signals and the speed of the automobile, we need to collect acoustic signals when automobiles pass by. To avoid the influence of jitters from the mobile phone, we deploy the mobile phone on a tripod. As shown in figure 2a, the tripod is set at one side of the road with the phone's camera facing the road, and the mobile phone is in landscape orientation. The mobile phone is about 1.5m above the ground and about 8m away from the lane. The mobile phone records the sound when the automobile passes by. The sampling rate fs of the sound in the empirical study is 44.1kHz.

3.1.2 Doppler Effect

The usual way to estimate the speed of a moving object is to utilize the Doppler Effect.
If we already know the frequency f of the original wave, the frequency f' of the real-time wave is given by:

$$ f' = \frac{C^2 f}{C^2 - v^2}\left(1 - \frac{v^2 t}{\sqrt{C^2 v^2 t^2 + l^2\,(C^2 - v^2)}}\right) \tag{1} $$

where C is the velocity of sound, v is the velocity of the automobile, l is the closest distance between the mobile phone and the automobile, and t is the time[14]. The distance between the mobile phone and the automobile is shortest at t = 0. To calculate the speed of the automobile, one of the problems is to find the original frequency f and the real-time frequency f' of a specific sound wave.

First we focus on the original frequency f of the moving automobile. Since active sensing does not work in our scenario, we do not transmit a sound wave at a specific frequency. As a result, we have to analyze the sound made by the automobile to find the original frequency. In fact, automobile noise includes tyre noise, engine noise, wind noise, exhaust noise and so on. The frequency of tyre noise is widely distributed, with its peak located between 315Hz and 1000Hz[23]. The engine noise is dominated by the rotation speed of the engine. The frequency of the engine noise is mainly distributed from 1600Hz to 4000Hz and the peak part concentrates in the range from 100Hz to 400Hz[24]. The frequency of exhaust noise and wind noise is closely related to the speed of the automobile. All these noises vary with the type of automobile, tyres, engine and so on. This means automobile noise does not have a specific frequency and varies from one automobile to another. We cannot find the original frequency f in our scenario.

Then we focus on the real-time frequency f'. Figure 2b shows the short-time Fourier transform (STFT) of the recording when the automobile passes by. We can see that the power of the full frequency band increases. It is hard to focus on a specific frequency to calculate the speed. That is to say, we can hardly know the reason for the increase of the power at a specific frequency.
The increase may be caused by the approach of the automobile, or by a shift of the original high-power frequency.

Fig. 3: Empirical study. (a) Positions A to E of the automobile at different times, relative to the top and bottom microphones. (b)-(f) Signals from the top and bottom microphones at positions A to E (amplitude vs. time in s). (g)-(k) Cross-correlation at positions A to E (correlation vs. time delay in samples), with annotated peak delays of 19, 8, 0, -9 and -15, respectively.

To conclude, if the Doppler Effect could be utilized to solve the problem in our scenario, we should find some S-shaped curves[14] in the spectrogram.
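To make the expected S-shape concrete, equation (1) can be evaluated numerically. The following is a minimal sketch, not the authors' code: the function name `doppler_frequency` and the chosen time grid are illustrative assumptions, while the parameter values f = 4000Hz, l = 10m, v = 20m/s and C = 340m/s are the example values the paper gives for figure 2c.

```python
import math


def doppler_frequency(f, v, l, t, c=340.0):
    """Observed frequency f' from equation (1).

    f: original frequency (Hz), v: automobile speed (m/s),
    l: closest distance to the phone (m), t: time (s, closest at t = 0),
    c: speed of sound (m/s).
    """
    root = math.sqrt(c**2 * v**2 * t**2 + l**2 * (c**2 - v**2))
    return (c**2 * f) / (c**2 - v**2) * (1.0 - (v**2 * t) / root)


# Example values from the text; the time grid is an arbitrary illustration.
f, v, l, c = 4000.0, 20.0, 10.0, 340.0
for t in (-4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0):
    print(f"t = {t:+.1f} s  ->  f' = {doppler_frequency(f, v, l, t, c):7.1f} Hz")

# Limits of equation (1) long before and long after the pass.
f_approach = c * f / (c - v)   # 4250 Hz
f_recede = c * f / (c + v)     # about 3777.8 Hz
```

Under equation (1), the curve falls monotonically from about C f/(C - v) = 4250Hz long before the pass to about C f/(C + v) ≈ 3777.8Hz long after it, which is the S-shape one would look for in the spectrogram.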
The S-shaped curves show that some specific frequencies shift to lower frequencies in the spectrogram of the acoustic signals, which can be calculated by equation (1). For example, if we let f = 4000Hz, l = 10m, v = 20m/s, C = 340m/s, we get the S-shaped curve shown in figure 2c. However, we cannot find any S-shaped curve in the spectrogram of the acoustic signal. That means the Doppler Effect cannot be used to estimate the speed in our scenario.

3.1.3 Correlation between the Acoustic Signals

Since the frequency domain cannot help us estimate the speed of the automobile, we look for clues in the time domain. To understand how automobile speed affects the acoustic signals from automobiles, it is essential to extract spatial and temporal information from the received acoustic signals. Since we have two audio streams recorded at the same time from the top and the bottom microphones, we have the chance to calculate the spatial information.

Figure 3a shows five positions on the automobile's trace that we choose to study. We record the sound for 0.01s with both the top and bottom microphones at each position. Figure 3b to figure 3f show the raw signals at position A to position E. Although the waveforms of the two acoustic signals differ in detail due to the difference between the microphones, they have a similar shape with a certain time delay. To further study the relation between the two signals, we calculate the cross-correlation[25] between them. Figure 3g to figure 3k show the cross-correlation between the signals collected by the top and the bottom microphones at different positions. The signals in figure 3b and figure 3c are recorded when the automobile is at the top side of the mobile phone. We can see that the signals from the top microphone are ahead of the signals from the bottom microphone. From figure 3g and figure 3h we can see that the time delay can be calculated from the cross-correlation between the two signals.
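The delay read-out described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the helper names (`make_pair`, `cross_correlation`, `estimate_delay`), the white-noise test signal, the noise level and the sign convention (a positive delay means the top signal leads) are all assumptions. The 441-sample length corresponds to 0.01s at 44.1kHz.

```python
import random

FS = 44100   # sampling rate of the empirical study (Hz)
N = 441      # 0.01 s of audio at 44.1 kHz


def make_pair(delay, n=N, pad=50, seed=0):
    """Synthetic (top, bottom) pair in which bottom lags top by `delay` samples.

    Hypothetical test data: a shared white-noise source plus small
    independent noise per microphone. Requires |delay| <= pad.
    """
    rng = random.Random(seed)
    source = [rng.gauss(0.0, 1.0) for _ in range(n + 2 * pad)]
    top = [source[i + pad] + 0.3 * rng.gauss(0.0, 1.0) for i in range(n)]
    bottom = [source[i + pad - delay] + 0.3 * rng.gauss(0.0, 1.0) for i in range(n)]
    return top, bottom


def cross_correlation(x, y, max_lag):
    """Return {lag: sum_i x[i] * y[i + lag]} for lag in [-max_lag, max_lag]."""
    corr = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i, xi in enumerate(x):
            j = i + lag
            if 0 <= j < len(y):
                total += xi * y[j]
        corr[lag] = total
    return corr


def estimate_delay(x, y, max_lag=50):
    """Lag in samples at which the cross-correlation of x and y peaks."""
    corr = cross_correlation(x, y, max_lag)
    return max(corr, key=corr.get)


if __name__ == "__main__":
    for true_delay in (19, 8, 0, -9, -15):
        top, bottom = make_pair(true_delay)
        lag = estimate_delay(top, bottom)
        print(f"true delay {true_delay:+d} -> estimated {lag:+d} samples "
              f"({lag / FS * 1e3:+.3f} ms)")
```

A brute-force sum over a small lag window is used here only to keep the sketch dependency-free; for longer recordings an FFT-based correlation would be the usual choice.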
Similarly, figure 3e and figure 3f show the signals when the automobile is at the bottom side of the mobile phone. The time delays at position D and position E are -9 and -15 samples.

Since the acoustic signals from the top microphone and the bottom microphone are temporally related, we can split the signals into small segments to study the detailed relationship. This gives us the chance to calculate the speed of the automobile.

3.2 Modeling Automobile Speed via Microphone and Camera

3.2.1 Build the Coordinate System

We can use a three-dimensional coordinate system to describe the scenario, as figure 4a shows. The origin is located at the midpoint of M1M2, where M1 and M2 are the points representing the two microphones. The x-axis is horizontal and points to the right, the y-axis points outward from the screen face, and the z-axis is vertical and points up.
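For a sense of scale, the sample delays reported in Section 3.1.3 can be converted into differences in the sound's path length to the two microphones M1 and M2. The worked arithmetic below is only a sketch under stated assumptions: the delays are taken to be in samples at the 44.1kHz sampling rate of the empirical study, C = 340m/s is taken from the Doppler example, and the symbols n (delay in samples) and Δd (path-length difference) are introduced here for illustration.

```latex
\begin{align*}
\Delta d &= C\,\Delta t = C\,\frac{n}{f_s},
  \qquad C = 340~\text{m/s},\quad f_s = 44100~\text{Hz} \\
n = 19:  \quad \Delta d &= 340 \times \tfrac{19}{44100}  \approx  0.146~\text{m} \\
n = 8:   \quad \Delta d &= 340 \times \tfrac{8}{44100}   \approx  0.062~\text{m} \\
n = 0:   \quad \Delta d &= 0~\text{m} \\
n = -9:  \quad \Delta d &= 340 \times \tfrac{-9}{44100}  \approx -0.069~\text{m} \\
n = -15: \quad \Delta d &= 340 \times \tfrac{-15}{44100} \approx -0.116~\text{m}
\end{align*}
```

Under these assumptions, one sample of delay corresponds to roughly 340/44100 ≈ 7.7mm of path difference, which indicates the resolution with which the sampled delay can constrain the automobile's position relative to the two microphones.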