A Context Aware Energy-Saving Scheme for Smart Camera Phones based on Activity Sensing

Yuanyuan Fan, Lei Xie, Yafeng Yin, Sanglu Lu
State Key Laboratory for Novel Software Technology, Nanjing University, P.R. China
Email: fyymonica@dislab.nju.edu.cn, lxie@nju.edu.cn, yyf@dislab.nju.edu.cn, sanglu@nju.edu.cn

Abstract: Nowadays more and more users take photos with their smart phones. However, energy saving continues to be a thorny problem for smart camera phones, since photographing is a very power-hungry function. In this paper, we propose a context-aware energy-saving scheme for smart camera phones that accurately senses the user's activities during the photographing process. Our solution is based on the observation that, during photographing, most of the energy is wasted in the preparations before shooting. By leveraging embedded sensors such as the accelerometer and gyroscope, our solution extracts representative features to perceive the user's current activities, including body movement, arm movement and wrist movement. Furthermore, by maintaining an activity state machine, our solution can accurately determine the user's current activity state and apply the corresponding energy-saving strategy. Experimental results show that our solution perceives the user's activities with an average accuracy of 95.5% and reduces the overall energy consumption by 46.5% compared to smart camera phones without an energy-saving scheme.

I. INTRODUCTION

Nowadays smart phones are widely used in our daily lives. These devices are usually equipped with sensors such as the camera, accelerometer, and gyroscope. Due to the portability of smart phones, more and more people take photos with them. However, energy saving continues to be a vexing problem for smart camera phones, since photographing is a very power-hungry function.
For example, according to KS Mobile's [1] report in 2014, the application Camera 360 Ultimate ranks first among the top 10 battery-draining applications for Android. The huge energy consumption is therefore a non-negligible pain point for users of smart camera phones.

Nevertheless, we observe that during the process of photographing, a fairly large proportion of the energy is wasted in the preparations before shooting. For example, the user first turns on the camera; then the user may move and adjust the phone again and again to find a view; finally, the user focuses on the object and presses the button to shoot. A lot of energy is wasted between two consecutive shots, since the camera phone uses the same settings, such as the frame rate, during the whole process, and these settings require roughly the same power throughout. Besides, it is not wise to turn the camera function on and off frequently, since doing so is both annoying and energy inefficient. Therefore, it is essential to reduce the unnecessary energy consumption during the photographing process in order to greatly extend the lifetime of smart camera phones.

Corresponding Author: Dr. Lei Xie, lxie@nju.edu.cn

However, previous work on energy-saving schemes for smart phones has the following limitations. First, it mainly reduces the energy consumption in a fairly isolated way, without sufficiently considering the user's actual behavior from the application perspective; this may greatly impact the user's experience. Second, with regard to energy saving for photographing, it mainly focuses on the shooting process instead of the preparations before shooting.

In this paper, we propose a context-aware energy-saving scheme for smart camera phones that accurately senses the user's activities during the photographing process.
Our idea is that, since current smart phones are mostly equipped with tiny sensors such as the accelerometer and gyroscope, we can leverage these sensors to effectively perceive the user's activities, so that the corresponding energy-saving strategies can be applied according to those activities.

There are several challenges in building an activity sensing-based scheme for smart phones. The first challenge is to effectively classify the user's activities during the photographing process, which involve various levels of activity of the body, arms and wrists. To address this challenge, we propose a three-tier architecture for activity sensing, covering body movement, arm movement and wrist movement. Furthermore, by maintaining an activity state machine, we can accurately determine the user's current activity state and apply the corresponding energy-saving strategy. The second challenge is to make an appropriate trade-off between the accuracy of activity sensing and energy consumption. Accurately perceiving the user's activities with the embedded sensors requires more types of sensor data and higher sampling rates, which in turn consume more energy. To address this challenge, our solution leverages only low-power sensors, such as the accelerometer and gyroscope, and classifies the activities by extracting representative features that distinguish them. We further choose sampling rates according to the user's current activity. In this way, we can sufficiently reduce the energy consumption of activity sensing and so achieve overall energy efficiency.

We make the following three contributions. First, we propose a context-aware energy-saving scheme for smart camera phones that leverages the embedded sensors to conduct activity sensing. Based on the activity sensing results, we can apply the corresponding energy-saving strategies.
Second, we build a three-tier architecture for activity sensing, including the body movement, arm movement and wrist movement. We use low-power sensors like the accelerometer and gyroscope to extract representative features to distinguish the user’s
activities. By maintaining an activity state machine, we can classify the user's activities very accurately. Third, we have implemented a system prototype on Android-powered smart camera phones. The experimental results show that our solution perceives the user's activities with an average accuracy of 95.5% and reduces the overall energy consumption by 46.5% compared to smart camera phones without an energy-saving scheme.

[Fig. 1. The Process of Photographing. Linear accelerometer readings versus time (0.1 s), annotated with the actions Walk, Lift Up Arm, Phone Rotate, Fine-tuning and Shooting, Lay Down Arm and Walk, and with marked regions A to D.]

II. SYSTEM OVERVIEW

In order to adaptively reduce the power consumption based on activity sensing, we first use the built-in sensors of the phone to observe the human activities and discuss the energy consumption during the process of photographing. Then, we introduce the system architecture of our proposed energy-saving scheme.

A. Human activities related to photographing

Usually, users tend to perform similar activities during photographing. As shown in Fig. 1, we use the linear accelerometer to detect the activities. Before or after taking photos, the user may stay motionless, walk or jog. While taking photos, the user usually lifts up the arm, rotates the phone, makes fine-tuning adjustments, shoots a picture, then lays down the arm. We categorize these eight actions into the following three parts:

• Body movement: Motionless, walking and jogging
• Arm movement: Lifting up the arm and laying down the arm
• Wrist movement: Rotating the phone, making fine-tuning and shooting a picture

B. Energy consumption related to photographing

Before we propose the energy-saving scheme that reduces the power consumption based on the user's activities, we first measure the energy consumption of the phone using a Monsoon power monitor [2].
1) Energy consumption in preparation for photographing: Power consumption for shooting a picture is large. We measure the power consumption of the following four Android-based phones: Samsung GT-I9250, Huawei H30-T10, Samsung GT-I9023 and Huawei G520-T10. In Fig. 2(a), "Base" represents the phone's basic power when the phone is off. "Display" represents the power when the phone is idle and only keeps the screen on. "Camera" represents the power when the camera works in preview mode. For the four phones, keeping the screen on increases the power consumption by 20%, 18%, 27% and 26%, respectively, while using the camera increases the power consumption by 63%, 70%, 53% and 57%, respectively. Therefore, in preparation for photographing, energy is wasted on the camera and the screen.

[Fig. 2. Energy Consumption. (a) Energy Consumption Components: power (mW) of Base, Display and Camera for p1-Samsung GT-I9250, p2-Huawei H30-T10, p3-Samsung GT-I9023, p4-Huawei G520-T10; (b) Energy Consumption of Built-in Sensors: power (mW) of acc-accelerometer, la-linear accelerometer, gra-gravity sensor, gyro-gyroscope, mf-magnetic field, pro-proximity, cam-camera.]

2) Energy consumption of turning on/off the camera: Frequently turning the phone on and off is annoying and wastes energy. When we need to take photos frequently over a period of time, the camera may switch on and off frequently, and so does the screen. The user also needs to press a button and unlock the screen by scrolling, which results in a large energy waste [3].

TABLE I. ENERGY DATA OF TURNING ON/OFF PHONE (SAMSUNG GT-I9250)

Subject         Turn On (Total)    Turn Off (Total)    Preview (1 s)
Energy (uAh)    769.71             400.71              160.22

Based on Table I and Eq. (1), the energy consumed by turning the phone on and off once could keep the camera working in preview mode for about seven seconds, which is enough to shoot one picture.
(769.71 uAh + 400.71 uAh) / (160.22 uAh/s) = 7.305 s    (1)

This indicates that frequently turning the phone on and off manually is indeed inconvenient and rather energy-consuming.

3) Energy consumption of the sensors: Fig. 2(b) shows the power consumption of a phone when a sensor is turned on. All these sensors work at their maximum sampling frequency, i.e., 100 Hz. When we turn on the camera so that the phone is in the preview state, the increase in power is much larger than that of any other sensor. Therefore, low-power sensors can be used to reduce the energy consumption of photographing.
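The arithmetic behind Eq. (1) can be reproduced with a few lines of Python. This is only a worked sketch of the break-even calculation using the Table I values; the function names and the `cheaper_to_stay_in_preview` helper are illustrative assumptions and are not part of the described system.

```python
# Energy figures from Table I (Samsung GT-I9250), in microampere-hours (uAh).
TURN_ON_UAH = 769.71         # one complete turn-on
TURN_OFF_UAH = 400.71        # one complete turn-off
PREVIEW_UAH_PER_S = 160.22   # one second of camera preview


def break_even_seconds(turn_on, turn_off, preview_per_s):
    """Seconds of preview that cost as much as one on/off cycle, as in Eq. (1)."""
    return (turn_on + turn_off) / preview_per_s


def cheaper_to_stay_in_preview(idle_gap_s):
    """True if previewing through an idle gap costs less than one on/off cycle.

    Hypothetical helper for illustration only.
    """
    limit = break_even_seconds(TURN_ON_UAH, TURN_OFF_UAH, PREVIEW_UAH_PER_S)
    return idle_gap_s < limit


seconds = break_even_seconds(TURN_ON_UAH, TURN_OFF_UAH, PREVIEW_UAH_PER_S)
print(f"break-even preview time: {seconds:.3f} s")
```

Under these numbers, a gap between shots shorter than roughly seven seconds is cheaper to bridge in preview mode than by switching the phone off and on again, which is why the scheme lowers the cost of waiting rather than toggling the camera.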
According to the above observations, we can utilize the low-power built-in sensors of the phone to detect the user's activities and reduce the energy consumption of taking photos. Simple examples are turning off the screen, decreasing the brightness of the screen, or decreasing the preview frame rate of the camera when we find that the user is not taking a photo.

C. System Architecture

The architecture of our system is shown in Fig. 3. Firstly, we obtain the data mainly from low-power built-in sensors, i.e., the linear accelerometer, the gyroscope and the gravity sensor, as shown in the "Sensor" module. Secondly, we separate the data into different regions, which correspond to the user's actions, as shown in the "Activity Sensing" module. Thirdly, we adaptively adopt an appropriate energy-saving scheme for each action, as shown in the "Energy Saving" module. In the following paragraphs, we briefly introduce how we realize the activity sensing and reduce the power consumption.

[Fig. 3. System Architecture. Sensors (choosing low-power sensors: linear accelerometer, gyroscope, gravity sensor) pass sensor data to Segmentation (based on the pause between actions), which passes data segments to Activity Sensing (recognizing the activities using a state machine; body level: motionless, walking, jogging; arm level: arm lifting up, arm laying down; wrist level: mobile rotating, fine-tuning, shooting), which passes the state to Energy Saving (applying different schemes based on states: maximum energy saved at body level, medium at arm level, minimum at wrist level).]

1) Activity Sensing: Based on Section II-A, the user's actions can be categorized into three parts: body movement, arm movement and wrist movement. Correspondingly, in our system architecture, we call these parts the body level, arm level and wrist level, respectively. Each level may contain more than one action. Besides, there may be transfer relations between the different levels.
Therefore, we use a state machine to describe the specific actions of the user. In the state machine, each action is represented as a state. The transfer relations between the states are shown in Fig. 3. Before we determine the type of an action, we first estimate which level the action belongs to. Then, we infer the specific action of the user based on more sensor information.

• Body level: It includes motionlessness, walking and jogging. Motionlessness can be recognized by the low variance of the linear accelerometer's data. Walking and jogging are then distinguished by their frequency, which can be calculated using the Fast Fourier Transform.
• Arm level: It includes lifting the arm up and laying the arm down. The relationship between the data of the gravity sensor and the linear accelerometer is used to distinguish the two actions, and a voting mechanism is used to guarantee the accuracy.
• Wrist level: It includes rotating the phone, making fine-tuning, and shooting a picture. We use a linear SVM model to distinguish them, with the variance, mean, maximum and minimum of the three axes of the three sensors as its features.

[Fig. 4. Coordinates of Phone and Direction of Phone Hold. (a) Coordinates of Motion Sensors (x-axis, y-axis, z-axis); (b) hold horizontally naturally; (c) hold horizontally backward.]

2) Energy-saving Scheme: Based on the features and energy consumption of each action/state, we propose an adaptive energy-saving scheme for taking photos. For example, when the user walks, jogs or stays motionless, it is unnecessary to keep the screen on. When the user lifts the arm, it is better to turn on the screen and adjust its brightness based on the light conditions. When the user is fine-tuning to observe the camera view before shooting a picture, it is better to keep the camera in the preview state. In this way, we obtain context-aware energy-saving schemes for camera phones.

III. SYSTEM DESIGN

In this section, we present the design of our energy-saving scheme for smart camera phones based on activity sensing.
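The activity state machine and the per-level energy-saving tiers from Section II-C can be illustrated with a minimal Python sketch. The eight states, the three levels and the maximum/medium/minimum saving tiers come from the text and Fig. 3. The transition table is an assumption guessed from the action order in Fig. 1, because the actual transfer relations of Fig. 3 are not recoverable here; `device_settings` and all identifiers are likewise hypothetical.

```python
from enum import Enum


class Level(Enum):
    BODY = "body"
    ARM = "arm"
    WRIST = "wrist"


class State(Enum):
    MOTIONLESS = "motionless"
    WALKING = "walking"
    JOGGING = "jogging"
    ARM_LIFTING_UP = "arm lifting up"
    ARM_LAYING_DOWN = "arm laying down"
    PHONE_ROTATING = "phone rotating"
    FINE_TUNING = "fine-tuning"
    SHOOTING = "shooting"


BODY = {State.MOTIONLESS, State.WALKING, State.JOGGING}
ARM = {State.ARM_LIFTING_UP, State.ARM_LAYING_DOWN}
WRIST = {State.PHONE_ROTATING, State.FINE_TUNING, State.SHOOTING}

LEVEL_OF = {s: Level.BODY for s in BODY}
LEVEL_OF.update({s: Level.ARM for s in ARM})
LEVEL_OF.update({s: Level.WRIST for s in WRIST})

# Saving tier per level, as labelled in Fig. 3.
SAVING_TIER = {Level.BODY: "maximum", Level.ARM: "medium", Level.WRIST: "minimum"}

# ASSUMPTION: transitions guessed from the sequence in Fig. 1
# (body -> lift arm -> wrist actions -> lay down arm -> body).
ALLOWED = {s: BODY | {State.ARM_LIFTING_UP} for s in BODY}
ALLOWED[State.ARM_LIFTING_UP] = WRIST | {State.ARM_LAYING_DOWN}
ALLOWED.update({s: WRIST | {State.ARM_LAYING_DOWN} for s in WRIST})
ALLOWED[State.ARM_LAYING_DOWN] = BODY | {State.ARM_LIFTING_UP}


class ActivityStateMachine:
    def __init__(self, start=State.MOTIONLESS):
        self.state = start

    def observe(self, recognized):
        """Move to the recognized action only if the transition is allowed."""
        if recognized is self.state or recognized in ALLOWED[self.state]:
            self.state = recognized
        return self.state

    def saving_tier(self):
        return SAVING_TIER[LEVEL_OF[self.state]]


def device_settings(state):
    """Illustrative settings per level; only the body-level, arm-lifting and
    fine-tuning cases are stated in the text, the rest are assumptions."""
    level = LEVEL_OF[state]
    if level is Level.BODY:
        return {"screen_on": False, "preview_on": False}
    if level is Level.ARM:
        return {"screen_on": True, "preview_on": False}
    return {"screen_on": True, "preview_on": True}
```

Rejecting transitions that the table does not allow is one simple way a state machine can filter out isolated misclassifications, for example a "shooting" result while the user is still walking.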
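The body-level rule (low variance means motionless, otherwise the dominant frequency separates walking from jogging) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses a naive discrete Fourier transform from the standard library where a real system would use an FFT routine, and both thresholds are assumptions. The 2 Hz split is simply chosen between the roughly 1 Hz walking and 3.5 Hz jogging frequencies reported in Section III-A.

```python
import cmath
import math
from statistics import pvariance

SAMPLE_RATE_HZ = 100.0       # maximum sampling rate mentioned in Section II-B
MOTIONLESS_VARIANCE = 0.05   # ASSUMPTION: illustrative variance threshold
WALK_JOG_SPLIT_HZ = 2.0      # ASSUMPTION: between ~1 Hz (walk) and ~3.5 Hz (jog)


def dominant_frequency(samples, sample_rate_hz):
    """Frequency (Hz) of the strongest non-DC component, via a naive DFT."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * sample_rate_hz / n


def classify_body_level(y_axis, sample_rate_hz=SAMPLE_RATE_HZ):
    """Classify one segment of linear-accelerometer y-axis samples."""
    if pvariance(y_axis) < MOTIONLESS_VARIANCE:
        return "motionless"
    freq = dominant_frequency(y_axis, sample_rate_hz)
    return "walking" if freq < WALK_JOG_SPLIT_HZ else "jogging"
```

Checking the variance first keeps the common motionless case cheap, since the frequency analysis is only run on segments that actually contain movement.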
End of a 40 80 80 100 2 180 Time (0.1s) Fig.5.Segment the Data of Linear Accelerometer A.Activity Sensing i00 1)Raw Data Collection:We collect data from linear ac- otionlessness celerometer,gyroscope and gravity sensor of android phones. These sensors have their own coordinates as shown in Figures 4(a),which are different from the earth coordinate system 100120 180 For example,when we hold the phone as Fig.4(b),the data 0 Variance of gravity sensor's x-axis almost equals to g(9.8m/s2).When we hold the phone as Fig.4(c),the result is opposite. Fig.7. CDF of Variance of Y-axis of Three Actions in Body Level 2)Action Segmentation:From sensors,we can only get sequential raw data.To do the following activity sensing,data point of the next segment and return back to calculate the should be segmented as one segment corresponds to one action. variance in the window of linear accelerometer's data.Fourth Observation.For a user,there is always a short pause if there is too much data before the second eligible variance between two different actions shown with red rectangles (A showing up,a maximum segment size is set to segment data. B,and D)in Figure 1.However,some actions like fine- The maximum size is set as ten times of the value of sensor's tuning and shooting are very gentle,it's difficult for the linear sampling rate because most of the actions in arm and wrist accelerometer to detect the pause from the actions shown with level won't last for more than 10 seconds for common people. blue rectangle (D)in Figure 5.On this occasion,gyroscope is 3)Action Recognition:We first do the recognition in three used to assist for the segmentation because it's more sensitive levels respectively.Then we describe how to do recognition with the motion.The gyroscope data corresponding to the data among levels. in rectangle D is shown in Figures 6 and the pause between actions can be detected as shown with red rectangles (D1/D2). 
Fig. 5. Segment the Data of Linear Accelerometer. [Plot: linear accelerometer x-, y- and z-axis readings against time (0.1 s), with the sliding window, rectangles A to E, and the start and end of a segment marked.]

A. Activity Sensing

1) Raw Data Collection: We collect data from the linear accelerometer, gyroscope and gravity sensor of Android phones. These sensors use the phone's own coordinate system, shown in Fig. 4(a), which differs from the earth coordinate system. For example, when we hold the phone as in Fig. 4(b), the x-axis reading of the gravity sensor is approximately g (9.8 m/s²). When we hold the phone as in Fig. 4(c), the result is the opposite.

2) Action Segmentation: The sensors provide only a sequential stream of raw data. Before activity sensing, the stream must be segmented so that each segment corresponds to one action.

Observation. A user always pauses briefly between two different actions, as shown with the red rectangles (A, B, and D) in Figure 1. However, some actions, such as fine-tuning and shooting, are so gentle that the linear accelerometer can hardly separate the pause from the actions, as shown with the blue rectangle (D) in Figure 5. In this case the gyroscope assists the segmentation because it is more sensitive to the motion. The gyroscope data corresponding to rectangle D is shown in Figure 6, where the pauses between actions can be detected, as shown with the red rectangles (D1/D2). Returning to Figure 5, an action that lasts a long time, shown with the purple rectangle (E), may bring computational overhead.

Fig. 6. Raw Data of Gyroscope Corresponding to Rectangle D in Fig 5. [Plot: gyroscope x-, y- and z-axis readings against time (0.1 s), with pauses D1 and D2 marked.]

Segmentation. First, we use a sliding window to continuously calculate the variances of the three axes of the linear accelerometer data. Second, if all three variances are below a threshold, the window is regarded as the start/end of a segment, shown with the green rectangle (B/C) in Fig. 5.
The window size is set to half the value of the sensor's sampling frequency (that is, half a second of samples), because the pause between two consecutive actions is always less than half a second. Third, when the variances of two consecutive sliding windows are both below the threshold, we switch to the corresponding gyroscope data and calculate its variance in the sliding window. If two consecutive gyroscope windows also have variances below the threshold, we continue calculating until we find a window whose variance is above the threshold. This part is then regarded as a segment. After that, we take the last sample of the window as the start point of the next segment and return to calculating the variance over windows of the linear accelerometer data. Fourth, if too much data accumulates before the second eligible variance shows up, a maximum segment size is used to cut the data. The maximum size is set to ten times the value of the sensor's sampling rate, because most arm-level and wrist-level actions do not last more than 10 seconds for ordinary users.

Fig. 7. CDF of Variance of Y-axis of Three Actions in Body Level. [Plot: CDF against variance for jog, walk and motionlessness.]

3) Action Recognition: We first perform recognition within each of the three levels. Then we describe how to perform recognition among levels.

Body Level: The body level includes three actions: motionlessness, walking and jogging. They are important actions that connect two shoots. As the movements of walking and jogging are very pronounced, we use the linear accelerometer to classify the actions.

Observation. Motionlessness is easy to distinguish from walking and jogging because the variance of its raw linear accelerometer data is low. Figure 7 shows the distribution of the variances of the three actions. The variance of motionlessness is almost zero and can be clearly distinguished. Walking and jogging, however, cannot be distinguished by variance alone because their variances overlap.
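The segmentation steps in 2) Action Segmentation can be illustrated with a short sketch. This is a simplified reading, not the authors' implementation: it uses non-overlapping windows, and it treats a window as a pause only when both the linear accelerometer and the gyroscope are quiet, instead of switching to the gyroscope after two consecutive quiet accelerometer windows. The names `is_quiet` and `segment`, the thresholds `acc_thr` and `gyro_thr`, and the use of population variance are assumptions made for illustration; the paper does not give the segmentation threshold values.

```python
from statistics import pvariance


def is_quiet(window, threshold):
    """Return True if every axis in the window has variance below threshold.

    `window` is a list of (x, y, z) samples.
    """
    return all(pvariance(axis) < threshold for axis in zip(*window))


def segment(accel, gyro, rate, acc_thr, gyro_thr):
    """Split a recording into (start, end) sample ranges, one per action.

    accel, gyro: equally long lists of (x, y, z) samples.
    rate: sampling frequency in Hz.
    acc_thr, gyro_thr: hypothetical variance thresholds (assumptions).
    """
    win = max(2, rate // 2)   # window covers about half a second
    max_len = 10 * rate       # cap a segment at about 10 seconds
    segments = []
    start = None              # start index of the segment being built
    for i in range(0, len(accel) - win + 1, win):
        pause = (is_quiet(accel[i:i + win], acc_thr)
                 and is_quiet(gyro[i:i + win], gyro_thr))
        if pause:
            if start is not None:
                segments.append((start, i))      # a pause closes the segment
                start = None
        elif start is None:
            start = i                            # motion opens a new segment
        elif i + win - start >= max_len:
            segments.append((start, i + win))    # force a cut at the maximum size
            start = None
    if start is not None:
        segments.append((start, len(accel)))     # close a trailing segment
    return segments
```

With a 10 Hz stream, for example, `segment(accel, gyro, rate=10, acc_thr=0.1, gyro_thr=0.1)` uses 5-sample windows and cuts any action longer than 100 samples. Requiring the gyroscope to be quiet as well keeps gentle actions such as fine-tuning from being mistaken for a pause.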
We hold the phone as in Fig. 4(b) and use the linear accelerometer to collect raw data of walking and jogging, shown in Figures 8(a) and 8(c). We apply the Fast Fourier Transform (FFT) to the data to obtain the spectrum. Fig. 8(b) shows that the frequency of walking is about 1 Hz, and Fig. 8(d) shows that the frequency of jogging is about 3.5 Hz. Thus, the two actions can be distinguished by frequency.

Recognition in Body Level. Depending on the holding gesture, the data changes differently on the three axes. (1) We first determine which axis to use. Ordinary users hold the phone either perpendicular or parallel to the ground. If the phone is held perpendicular to the ground, the z-axis data does not change much at this level. If the phone is held parallel to the ground, the x-axis data does not change noticeably. Therefore, we use the y-axis data. (2) We
calculate the variance of the y-axis data of the linear accelerometer. If it is less than a threshold, the action is regarded as motionlessness. The threshold is set to 5 according to Figure 7. (3) If the action is not motionlessness, we apply FFT to the data segment. In general, the frequency of walking ranges from 1 Hz to 3 Hz and that of jogging from 3 Hz to 6 Hz. Therefore, if the frequency is less than 3 Hz, the action is recognized as walking. Otherwise, it is jogging.

Fig. 8. Frequencies of walking and jogging. (a) shows raw data of walking and (b) shows its frequency. (c) shows raw data of jogging and (d) shows its frequency. [Plots: (a) and (c) show linear acceleration (m/s²) on the x-, y- and z-axes against time (0.1 s); (b) and (d) show amplitude against frequency (Hz).]

Fig. 9. Data of linear acceleration and gravity sensor of arm level when phone held horizontally in normal and in backward direction. [Panels: (a) lift up arm, phone held horizontally in normal direction; (b) lay down arm, phone held horizontally in normal direction; (c) lift up arm, phone held horizontally in backward direction; (d) lay down arm, phone held horizontally in backward direction. Each panel plots the gravity sensor x-axis and the linear accelerometer x-axis.]

Arm Level: The arm level contains two actions, lifting the arm up and laying it down. These actions connect the body level and the wrist level. After you lift up your arm, you may start shooting. After you lay down your arm, the shooting may end.

Observation. Lifting the arm up and laying it down are two reversed actions.
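Before continuing with the arm level, the body-level rule stated above (variance threshold first, then dominant frequency) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, population variance is assumed, and a naive discrete Fourier transform stands in for the FFT so that the example needs only the standard library. The variance threshold of 5 and the 3 Hz boundary are the values given in the text.

```python
import cmath
import math
from statistics import pvariance


def dominant_frequency(samples, rate):
    """Return the frequency in Hz of the strongest non-DC spectral component.

    Uses a naive O(n^2) DFT (an assumption made for self-containment;
    the paper applies an FFT, which yields the same spectrum).
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]   # remove the DC offset
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):           # bins up to the Nyquist frequency
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * rate / n


def classify_body_action(y_axis, rate, var_threshold=5.0, freq_threshold=3.0):
    """Classify one segment of y-axis linear accelerometer data.

    y_axis: list of y-axis samples for the segment.
    rate: sampling frequency in Hz.
    """
    if pvariance(y_axis) < var_threshold:
        return "motionlessness"
    if dominant_frequency(y_axis, rate) < freq_threshold:
        return "walking"
    return "jogging"
```

Note that the sampling rate bounds what can be detected: at 10 Hz only frequencies up to 5 Hz are observable, so a higher rate would be needed to cover the full 3 Hz to 6 Hz jogging range quoted above.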
When we hold the phone horizontally in the normal direction as in Fig. 4(b), we obtain the sensor data for lifting the arm up, shown in Fig. 9(a), and for laying the arm down, shown in Fig. 9(b). In this situation, the linear accelerometer data changes mostly on the x-axis because the motion happens in that direction. Therefore, only the x-axis data of the linear accelerometer is shown. In Fig. 9(a), when the arm is lifted, the value of the gravity sensor stays positive while the value of the linear accelerometer changes from positive to negative. That is, the signs of the two sensors' values change from same to different. In Fig. 9(b), the signs of the two sensors' values change from different to same. When the phone is held as in Fig. 4(c), the corresponding sensor data is shown in Figures 9(c) and 9(d). When the arm is lifted, the signs of the x-axis values of the two sensors still change from same to different, and when the arm is laid down, the result is the opposite.

Fig. 10. Lift up the phone with rotating 360 degrees. [Panels, each titled "Lift Up with Phone Rotating": (a) gravity sensor x-, y- and z-axis, with the biggest absolute data marked; (b) linear accelerometer x-, y- and z-axis, with the corresponding data marked; (c) product of sensors.]

However, the phone may be held in any gesture. We lift up the phone and rotate it 360 degrees at the same time. The data of the gravity sensor and the linear accelerometer is shown in Figures 10(a) and 10(b). The relationship between the two sensors cannot be read off directly. Among the three axes of the gravity sensor, we choose the data whose absolute value is the largest, as shown with the black circle in Fig. 10(a). We also choose the linear accelerometer data on the corresponding axis, as shown with the black circles in Fig. 10(b). We multiply the two corresponding values, and the result is shown in Fig. 10(c).
We can see that the sign changes from positive to negative. In summary, when you lift the arm up, the signs of the gravity sensor and the linear accelerometer change from same to different. When you lay the arm down, the change is the opposite.

Recognition in Arm Level. The gravity sensor data with the maximum absolute value and the corresponding linear accelerometer data are chosen. We then analyze the relationship between the two sensors and apply a voting mechanism to suppress the noise caused by hand tremble. Finally, if the signs of the two sensors' selected data change from same to different, the state is arm lifting up. Otherwise, the state is arm laying down. The specific process is shown in Algorithm 1.

Wrist Level: Wrist movement contains phone rotating, fine-tuning and shooting. The picture is shot at this level.

Observation. The raw data of the three axes of the linear accelerometer for the three actions is shown in Figures 11(a) and 11(b). From Figure 11(a), phone rotating can be distinguished by using a plane. From Figure 11(b), the other two actions can be distinguished. Therefore, a classifier such as a Support Vector Machine (SVM) can be used for classification. We take the