Fig. 3. Touch sensor data and inertial sensor data of two touch gestures for user 1. Panels: (a) Velocity (touch sensor); (b) Direction (touch sensor); (c) Acceleration-Z (inertial sensor); (d) Rotation-Y (inertial sensor).

Fig. 4. Touch sensor data and inertial sensor data of two touch gestures for user 2. Panels: (a) Velocity (touch sensor); (b) Direction (touch sensor); (c) Acceleration-Z (inertial sensor); (d) Rotation-Y (inertial sensor).

Android API [27], the measured pressure on many smartphones is either 0 or 1, i.e., touching or non-touching, which is too coarse-grained to analyze the touch force. Therefore, we use the device's motion caused by the touch gesture to represent the pressure indirectly. As shown in Fig. 2, when a user performs a touch gesture, we can obtain the moving trajectory and the touch sizes of the fingertip over time from the touch sensor. The device's motion, in turn, is caused by the resultant of the gravity $F_g$, the force of the hand grasping the phone $F_h$, and the force of the fingertip pressing the screen $F_p$. When the device is held statically in a fixed orientation, the forces from the gravity and the hand can be treated as constant, so the device's motion is mainly caused by the finger's pressure. That is to say, the device's motion measured by the embedded accelerometer and gyroscope can represent the finger's pressure. Consequently, we can combine the on-screen gestures and the device's motion to describe a touch gesture on the mobile device.

3.2 Feasibility of User Authentication

Touch gestures are stable across repetitions by the same user while remaining distinguishable between different users. To explore whether the touch gesture can be used for user authentication, we first invite two users, each of whom performs the gesture 'L' twice on the same smartphone, as shown in Fig. 2. In Fig. 3 and Fig. 4, we show, for each user, the velocity and direction inferred from the touch sensor data, as well as the linear acceleration along the z-axis and the angular velocity around the y-axis measured by the inertial sensor. The solid and dashed lines in each subfigure indicate that the gestures from the same user have a high consistency. Comparing the figures in the same column of Fig. 3 and Fig. 4, we can conclude that gestures from different users differ in their sensor data.

To measure the similarity (or difference) between the sensor data of different gestures, we introduce the operations of normalization and interpolation, as well as the metric of Root Mean Squared Error (RMSE) [6]. For simplicity, we use $d_{1i}, i \in [1, n_1]$ and $d_{2i}, i \in [1, n_2]$ to represent the time-series data of the first gesture and
the second gesture, and then we normalize $d_{pi}, p \in \{1, 2\},$ with Eq. (1). Here, $d'_{pi}$ denotes the normalized data, and $d_{p\min} = \min_j d_{pj}$ and $d_{p\max} = \max_j d_{pj}$ are the minimum and maximum of the series.

$$
d'_{pi} =
\begin{cases}
\dfrac{d_{pi} - d_{p\min}}{d_{p\max} - d_{p\min}}, & d_{p\min} \neq d_{p\max}, \\
0, & d_{p\min} = d_{p\max}.
\end{cases}
\tag{1}
$$

Considering that the lengths of $d'_{1i}$ and $d'_{2i}$ can be different, i.e., $n_1 \neq n_2$, we introduce a linear interpolation algorithm [10] to make the lengths of $d'_{1i}$ and $d'_{2i}$ equal. Suppose $n_1 > n_2$; we then need to interpolate data points into $d'_{2i}$ to stretch its length to $n_1$. Consequently, the interval between consecutive data points becomes $\frac{n_2-1}{n_1-1}$, and the $k$th data point of the second gesture is given by Eq. (2), where $k \in [2, n_1]$; for $k = 1$, $d'_{2k} = d'_{21}$.

$$
d'_{2k} = d'_{2i} + \left(d'_{2(i+1)} - d'_{2i}\right)\left(k \cdot \frac{n_2 - 1}{n_1 - 1} - i\right),
\qquad i = \left\lfloor k \cdot \frac{n_2 - 1}{n_1 - 1} \right\rfloor
\tag{2}
$$

At this point, we have obtained the time-series data $d'_{1k}$ and $d'_{2k}$ with the same length $n_1$. Then, we can use Eq. (3) to calculate the similarity (or difference) between them, i.e., the RMSE value $r_{12}$. Here, $r_{12} \in [0, 1]$; the smaller the value of $r_{12}$, the higher the similarity (i.e., the smaller the difference).

$$
r_{12} = \sqrt{\frac{1}{n_1} \sum_{k=1}^{n_1} \left(d'_{1k} - d'_{2k}\right)^2}
\tag{3}
$$

To use the RMSE value to illustrate the stability or discriminability among gestures, we invite three users, each of whom performs gesture 'L' 50 times. Then we calculate the RMSE values of the sensor data from the same user and from different users, respectively. According to Fig. 5, the RMSE value corresponding to the same user (i.e., 'U$i$-U$j$', $i = j$) is generally less than that corresponding to different users (i.e., 'U$i$-U$j$', $i \neq j$). This indicates that gestures from the same user stay similar, while gestures from different users show unavoidable differences.

Fig. 5. The RMSE values between three users along each sensor dimension. Panels: (a) Velocity; (b) Direction; (c) Acceleration-Z; (d) Rotation-Y.
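For concreteness, the following minimal Python sketch (using NumPy; the function names and the random stand-in series are illustrative, not part of our system) chains the three operations of Eqs. (1)-(3):

```python
import numpy as np

def normalize(d):
    """Eq. (1): min-max normalization of a 1-D series into [0, 1];
    a constant series (d_min == d_max) maps to all zeros."""
    d = np.asarray(d, dtype=float)
    d_min, d_max = d.min(), d.max()
    if d_min == d_max:
        return np.zeros_like(d)
    return (d - d_min) / (d_max - d_min)

def resample_linear(d, n_target):
    """Eq. (2), 0-indexed: linearly interpolate a 1-D series onto
    n_target evenly spaced positions over the original index range."""
    d = np.asarray(d, dtype=float)
    positions = np.linspace(0, len(d) - 1, num=n_target)
    return np.interp(positions, np.arange(len(d)), d)

def rmse(d1, d2):
    """Eq. (3): Root Mean Squared Error of two equal-length series."""
    d1, d2 = np.asarray(d1, dtype=float), np.asarray(d2, dtype=float)
    return np.sqrt(np.mean((d1 - d2) ** 2))

# Usage: normalize both gestures, stretch the shorter one to the longer
# one's length, then compare; for normalized inputs the result lies in [0, 1].
g1, g2 = np.random.randn(150), np.random.randn(120)  # stand-ins for sensor series
r12 = rmse(normalize(g1), resample_linear(normalize(g2), len(g1)))
```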
3.3 Sensor Data Alignment in Space Domain

The sensor data of touch gestures has unavoidable differences in the time domain while keeping non-negligible consistencies in the space domain. Due to the uncertainty of user behaviors, when a user performs the same gesture multiple times, the duration of each gesture can be different, which will reduce the stability of the gestures from the same user and lead to an authentication error. However, due to the layout constraint of the on-screen gesture, i.e., the fixed locations of the nodes, the trajectories of the gestures corresponding to the same graphic pattern keep essential consistencies. This motivates us to align the sensor data of the gestures based on the graphic pattern, to improve the stability of gestures from the same user.
Fig. 6. Temporal characteristics of touch gestures. Panels: (a) Duration distribution for four users; (b) Angular velocity data in the y-axis of two samples from the same user; (c) Temporal alignment results.

Fig. 7. Spatial characteristics of touch gestures. Panels: (a) Sensor data is spatially consistent; (b) Spatial distribution and definition of nodes; (c) Spatial alignment results.

In fact, data alignment is an effective technique in data processing and has been used in many scenarios, such as signal alignment in communications (e.g., beam alignment in RADAR [18], optical-axis alignment of the transmitter and receiver in LiDAR [15], and C/A code alignment of the receiver and satellite in GPS [32]), point matching in point set registration [33, 39], sequence alignment in videos [5], and so on. Taking the sequence alignment task [5] as an example, it leverages both the spatial displacement and the temporal variations between image frames as cues to correlate two different video sequences of the same dynamic scene in time and in space. Differently, we adopt the layout constraint in the space domain as the spatial cue to align the time-series sensor data, as described below.

Unavoidable time difference among touch gestures: To demonstrate the time difference among touch gestures, we invite four users to perform the gesture 'L' on the screen, as shown in Fig. 2. Each one performs the same gesture 50 times. As shown in Fig. 6(a), the durations of gestures corresponding to the same graphic pattern 'L' can be different, whether the gestures are performed by the same user or by different users. Specifically, in Fig. 6(b), we show the angular velocities around the y-axis of two gestures corresponding to 'L' from the same user. The duration difference between the two gestures (i.e., sample 1 and sample 2) is about 100 ms. To calculate the similarity between them, a temporal alignment method is often adopted, e.g., using the linear interpolation algorithm [10] in the time domain to make the number of data points in sample 1 equal to that in sample 2, as shown in Fig. 6(c). However, this temporal alignment may break the consistency between the gestures, i.e., decrease the stability of gestures from the same user, as the misaligned peaks in Fig. 6(c) show. This indicates that it is inappropriate to align the sensor data in the time domain.
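This temporal-alignment baseline amounts to the following minimal sketch (the traces are random stand-ins for the gyroscope data):

```python
import numpy as np

# Hypothetical y-axis angular-velocity traces of two 'L' samples whose
# durations differ by roughly 100 ms (cf. Fig. 6(b)); real traces would
# come from the gyroscope.
gyro_y1, gyro_y2 = np.random.randn(70), np.random.randn(60)

# Purely temporal stretch of sample 2 onto sample 1's length (Eq. (2)).
positions = np.linspace(0, len(gyro_y2) - 1, num=len(gyro_y1))
aligned_2 = np.interp(positions, np.arange(len(gyro_y2)), gyro_y2)

# Peaks caused by the same turning point can still land at different
# sample indices after the stretch, which inflates the RMSE of Eq. (3).
```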
Non-negligible space consistency among gestures: Different from the sensor data in the time domain, touch gestures in the space domain are constrained by the layout of the lock screen, e.g., the 3 × 3 grid in Fig. 2. Consequently, gestures corresponding to the same graphic pattern keep their consistency in the space domain. In Fig. 7(a), we show the moving velocity of the fingertip in each of two touch gestures whose graphic pattern is 'L'. We can see that the moving velocities of the two gestures have a high consistency in the space domain, e.g., smaller velocities inside nodes and larger velocities between nodes. This indicates that it is possible to align the sensor data in the space domain to keep the stability of gestures.

To achieve the above goal, we first define the touch gesture on the screen in the space domain. As shown in Fig. 7(b), we use $p_i(x_i, y_i), i \in [1, n]$ to represent the coordinates of the $i$th point in the moving trajectory of a touch gesture, while using the pair $\langle O_k, r_k \rangle$ to represent the $k$th node on the lock screen. Here, $O_k(x_{o_k}, y_{o_k}), k \in [1, m]$ and $r_k$ represent the center and radius of the node. When considering the layout constraint of the lock screen, the points in a moving trajectory can be classified into in-node points (i.e., the blue points in Fig. 7(b)) and out-node points (i.e., the red points in Fig. 7(b)). That is to say, a touch gesture can be represented as in-node points and out-node points alternating over time.

For an on-screen point $p_i(x_i, y_i)$, if it satisfies Eq. (4), it is located in the $k$th node; otherwise, it is outside the $k$th node.

$$
\sqrt{(x_i - x_{o_k})^2 + (y_i - y_{o_k})^2} \leq r_k, \qquad k \in [1, m]
\tag{4}
$$

In this way, we can represent all the $n_k$ points in the $k$th node as $p_{k_j}, j \in [1, n_k], k_j < k_{j+1}$, in sequence, based on the occurrence time of each point. The non-node points occurring between the $k$th node and the $(k+1)$th node are represented as $[p_{k_{n_k}+1}, p_{(k+1)_1-1}]$. For simplicity, we use $N_k$ and $C_{k,k+1}$ to represent the set of points in the $k$th node and the connection part between the $k$th node and the $(k+1)$th node, as shown in Fig. 7(b). For a node $N_k, k \in [1, m]$ (or a connection part $C_{k,k+1}$), we use the linear interpolation algorithm of Eq. (2) to align the sensor data of different gestures within that node (or connection part). That is to say, after alignment, the number of data points from different gestures is the same within each node $N_k$, and likewise within each connection part $C_{k,k+1}$. When applying the linear interpolation in $N_k$ or $C_{k,k+1}$, each time we align the sensor data in one dimension, e.g., the coordinates in the x-axis, the coordinates in the y-axis, the touch areas along the time, etc. Finally, we can align all the sensor data in the space domain.

By introducing the spatial alignment, the angular velocity in Fig. 6(b) is transformed into Fig. 7(c), which solves the problem of the time difference between gestures corresponding to the same graphic pattern. Compared with the temporal alignment result in Fig. 6(c), the spatial alignment result in Fig. 7(c) keeps a higher consistency between the gestures from the same user.
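The point classification of Eq. (4) and the per-segment interpolation can be sketched as follows (assuming the node centers and radii are given as NumPy arrays; the function names are illustrative):

```python
import numpy as np

def node_of(p, centers, radii):
    """Eq. (4): return the index of the node containing point p = (x, y),
    or -1 if p lies outside every node. centers: (m, 2); radii: (m,)."""
    d = np.hypot(centers[:, 0] - p[0], centers[:, 1] - p[1])
    hits = np.flatnonzero(d <= radii)
    return int(hits[0]) if hits.size else -1

def split_trajectory(points, centers, radii):
    """Cut a trajectory into alternating runs of point indices: a run
    labeled k >= 0 is an in-node set N_k; a run labeled -1 is the
    connection part C_{k,k+1} between the surrounding nodes."""
    labels = [node_of(p, centers, radii) for p in points]
    runs, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            runs.append((labels[start], list(range(start, i))))
            start = i
    return runs

def align_segment(series, indices, n_target):
    """Spatial alignment: resample one data dimension (e.g., x, y, or
    touch area), restricted to one node or connection part, onto
    n_target points via the linear interpolation of Eq. (2)."""
    seg = np.asarray(series, dtype=float)[indices]
    return np.interp(np.linspace(0, len(seg) - 1, num=n_target),
                     np.arange(len(seg)), seg)
```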
To quantitatively measure the similarity (difference) of the spatially (temporally) aligned sensor data, we collect 50 samples of gesture 'L' performed by the same user and calculate the RMSE values of the sensor data. The average RMSE value of the temporally and the spatially aligned sensor data is 0.282 (standard deviation = 0.081) and 0.157 (standard deviation = 0.045), respectively. As mentioned before, a low RMSE value means high similarity. Therefore, our spatial alignment method can keep the stability among gestures and reduce the intra-class difference for better user authentication.

3.4 Fine-grained Modeling for Gesture Segmentation

The touch gestures in nodes and those out of nodes have different properties, especially the gesture parts located at turning points. Therefore, after aligning the sensor data in the space domain, we further segment the touch gesture into several sub-gestures, to highlight the sub-gestures that contribute more to user authentication. In Fig. 8, we illustrate the distribution of the mean acceleration in the x-axis over 50 samples of the whole touch gesture 'L', as well as of the $i$th and the $j$th segmented sub-gestures of 'L', for two different users. The large overlap in Fig. 8(a) indicates low discriminability for whole gestures. In contrast, the small overlap in Fig. 8(b) indicates high discriminability for the $i$th segmented sub-gesture, while the large overlap in Fig. 8(c) indicates very poor discriminability for the $j$th segmented sub-gesture. This means that different segments of a gesture have different stability and discriminability. Thus it is necessary and meaningful to segment the whole touch gesture into sub-gestures, to extract the sub-gestures having a good stability for the same user and a good discriminability across different users.

Fig. 8. Distribution of feature value at gesture and different sub-gesture parts. Panels: (a) x-axis acceleration of the entire gesture; (b) x-axis acceleration of the i-th sub-gesture; (c) x-axis acceleration of the j-th sub-gesture.

To segment the touch gesture, we need to split the inertial sensor data and the touch sensor data into each sub-gesture. As shown in Fig. 9, the gesture segmentation consists of three steps: data filtering and synchronization, spatial alignment of the sensor data, and gesture segmentation based on the graphic pattern. Firstly, we use a moving average filter to remove the high-frequency noise in the inertial sensor data and the touch sensor data. Besides, considering the difference between the sampling rates of the touch sensor (i.e., 60 Hz) and the inertial sensor (i.e., 100 Hz), we introduce the linear interpolation described in Eq. (2) to synchronize the sensor data and give them a uniform sampling rate, i.e., 100 Hz. Secondly, we use the spatial alignment method described in Section 3.3 to align the sensor data of gestures corresponding to the same graphic pattern, to keep the stability of gestures from the same user. Thirdly, we use the layout of the lock screen, i.e., the locations of the nodes, to segment the touch gesture into in-node sub-gestures and out-node sub-gestures by turns, as the blue segments and red segments in Fig. 7(b) show. As mentioned in Section 3.3, we use $[p_{k_1}, p_{k_{n_k}}]$ to represent the in-node points in the $k$th node. Accordingly, the times of the first and the last data points occurring in the $k$th node are represented as $t_{k_1}$ and $t_{k_{n_k}}$, respectively. Therefore, the sensor data occurring in $[t_{k_1}, t_{k_{n_k}}]$ is split into the sub-gesture located in the $k$th node, while the sensor data occurring in $[t_{k_{n_k}+1}, t_{(k+1)_1-1}]$ is split into the sub-gesture located in the connection part between the $k$th node and the $(k+1)$th node.

Fig. 9. Gesture segmentation scheme: the touch sensor data and inertial sensor data pass through data filtering & synchronization, spatial alignment of the sensor data, and gesture segmentation based on the graphic patterns, yielding each sub-gesture's sensor data.
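A minimal sketch of the first step, assuming timestamped one-dimensional streams (the window size and the helper names are illustrative choices):

```python
import numpy as np

def moving_average(x, w=5):
    """Moving-average filter to suppress high-frequency noise; the
    window size w = 5 is an illustrative choice, not specified above."""
    return np.convolve(np.asarray(x, dtype=float), np.ones(w) / w, mode='same')

def synchronize(t_src, x_src, rate_hz=100):
    """Resample a stream with timestamps t_src (in seconds) onto a
    uniform rate_hz grid via linear interpolation (Eq. (2)), so that the
    60 Hz touch stream and the 100 Hz inertial stream share one
    100 Hz timeline."""
    t_src = np.asarray(t_src, dtype=float)
    t_uniform = np.arange(t_src[0], t_src[-1], 1.0 / rate_hz)
    return t_uniform, np.interp(t_uniform, t_src, moving_average(x_src))
```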
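A corresponding sketch of the third step, which splits a synchronized stream by the in-node time windows taken from the touch trajectory (again, the names are illustrative):

```python
import numpy as np

def split_by_node_times(t, x, node_windows):
    """Split one synchronized stream x (with timestamps t) into
    sub-gestures. node_windows: list of (t_k1, t_k_nk) pairs, i.e., the
    first and last touch timestamps observed inside each node k along
    the gesture. Returns the in-node segments and the connection
    segments between consecutive nodes."""
    t, x = np.asarray(t, dtype=float), np.asarray(x, dtype=float)
    in_node, connections = [], []
    for k, (t_first, t_last) in enumerate(node_windows):
        in_node.append(x[(t >= t_first) & (t <= t_last)])
        if k + 1 < len(node_windows):
            t_next_first = node_windows[k + 1][0]
            connections.append(x[(t > t_last) & (t < t_next_first)])
    return in_node, connections
```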
4 DATA ANALYSIS AND FEATURE SELECTION

According to the observations in Section 3, the touch sensor data and the inertial sensor data of touch gestures can be used for authentication, since the sensor data shows the similarity of gestures from the same user and the difference of gestures from different users. However, the uncertainty of user behaviors may reduce the intra-class similarity and the inter-class difference. Therefore, it is necessary to analyze the sensor data in detail