Y. Yin et al.
ACM Transactions on Sensor Networks, Vol. 15, No. 4, Article 44. Publication date: October 2019.

Fig. 3. Observed 3D contours from different viewing angles: (a) from left; (b) from center; (c) from right.

Fig. 4. Different contours from the same viewing angle: panels (a)-(c) show the same character written towards different planes.
look at the contours of “b” and “q” from different viewing angles, we may see similar shapes, and it may be difficult to distinguish them. Therefore, a proper viewing angle should be selected to mitigate the confusion about character contours.

3.2.2 Writing Directions. Viewed from a fixed angle, the 2D contours of the same character are similar, whereas its 3D contours can differ considerably because the writing direction is uncertain. On a 2D plane, the contours of the same character keep their essential shape features: even if the orientation of a 2D contour changes, e.g., the contour rotates within the plane, its shape is preserved. In 3D space, however, even when we look at contours of the same character from the same viewing angle, the observed contours can be quite different, as the contours in the red circles in Figure 4 show. This is because the user can write in-air gestures towards different directions. Intuitively, if we can adaptively project the 3D contour onto the corresponding coordinate plane (e.g., the $x_h$-$z_h$ plane, $y_h$-$z_h$ plane, or $x_h$-$y_h$ plane), we may mitigate the contour distortion caused by writing directions.

3.2.3 Contour Distribution. A 2D contour lies in a single plane, while a 3D contour can be distributed across different planes. In Figure 5(a), we show the human coordinate system (human-frame for short) $x_h$-$y_h$-$z_h$. When the user writes in the air, her/his hand can move left and right, up and down, so the in-air gesture generates a 3D contour across different planes. In this case, the in-air contour may lie mainly in or close to the plane A3B3C3D3, which is not parallel to any coordinate plane, so we cannot directly project the in-air contour onto a coordinate plane (e.g., the $x_h$-$z_h$ plane) for dimensionality reduction.
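To illustrate why simply dropping a coordinate is not enough, the sketch below uses a hypothetical contour (a unit circle, not data from the article) written on a vertical plane that is rotated about the $z_h$-axis. It assumes that $x_h$ is the left-right axis and $z_h$ the vertical axis, and the function names are made up for this example. The naive projection onto the $x_h$-$z_h$ plane discards $y_h$, and the code measures how much the contour is squeezed.

```python
import math

def circle_on_tilted_plane(tilt_deg, n=360):
    """Unit circle on a vertical plane rotated by tilt_deg about the z_h axis.

    Hypothetical test contour: at tilt 0 it lies exactly in the x_h-z_h plane.
    """
    t = math.radians(tilt_deg)
    pts = []
    for k in range(n):
        a = 2 * math.pi * k / n
        u, v = math.cos(a), math.sin(a)  # coordinates inside the writing plane
        pts.append((u * math.cos(t), u * math.sin(t), v))  # (x_h, y_h, z_h)
    return pts

def project_xz(points):
    """Naive projection onto the x_h-z_h plane: drop the y_h coordinate."""
    return [(x, z) for x, _, z in points]

def aspect_ratio(points_2d):
    """Width divided by height of the bounding box of a 2D contour."""
    xs = [p[0] for p in points_2d]
    zs = [p[1] for p in points_2d]
    return (max(xs) - min(xs)) / (max(zs) - min(zs))

flat = aspect_ratio(project_xz(circle_on_tilted_plane(0)))
tilted = aspect_ratio(project_xz(circle_on_tilted_plane(60)))
print(f"aspect ratio at 0 deg: {flat:.2f}, at 60 deg: {tilted:.2f}")
```

With a 60 degree tilt the projected width shrinks by a factor of cos(60°) = 0.5, so the circle is observed as a narrow ellipse even though the written shape did not change.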
As shown in Figure 5(b), the 3D contour of “k” is distributed across different planes and lies mainly in or close to the red plane rather than any coordinate plane (e.g., the blue plane in Figure 5(a)). We call the red plane the principal plane or writing plane: it contains or is close to most of the points in the 3D contour, i.e., the contour projected onto the principal plane keeps the essential features of the 3D contour. Therefore, we need to adaptively project the 3D contour onto the principal plane to obtain the essential contour features of the handwritten character, as with the contour “k” shown in the red circle in Figure 5(b).

AirContour: Building Contour-based Model for In-Air Writing Gesture Recognition

Fig. 5. In-air gesture across different planes: (a) hand movements; (b) 3D contour across different planes.

Fig. 6. Viewing angles for writing with different hands: (a) writing with left hand; (b) writing with right hand.

3.3 Some Definitions about In-air Gestures

According to Section 3.2, an improper viewing angle distorts the observed gesture contour. To mitigate the confusion or misrecognition of gesture contours caused by viewing angles, we first define the appropriate range of viewing angles based on people’s writing habits, i.e., when the user writes in the air, her/his eyes naturally track the movement of the hand.

As shown in Figure 6(a), when the user writes with the left hand, she/he tends to write in front, on the left side, or below; the corresponding viewing angle comes from behind, from the right side, or from above. Accordingly, we select a reference coordinate plane for each viewing angle, i.e., the $x_h$-$z_h$ plane, the $y_h$-$z_h$ plane, and the $x_h$-$y_h$ plane, respectively. Similarly, as shown in Figure 6(b), when the user writes with the right hand in front, on the right side, or below, the corresponding viewing angle comes from behind, from the left side, or from above. The reference coordinate planes selected under these viewing angles are the $x_h$-$z_h$ plane, the $(-y_h)$-$z_h$ plane, and the $x_h$-$y_h$ plane, respectively. Therefore, there is a mapping between viewing angles and reference coordinate planes. With the selected reference coordinate plane, the user will not view a correctly oriented character contour as a reversed contour (referring to Figure 3(a) and Figure 3(c)).
It is worth mentioning that the selected reference coordinate plane only indicates the possible orientation of the projected contour in the principal plane, as described in Section 4.2. It does not mean that the user can only write on the $x_h$-$z_h$, $y_h$-$z_h$, or $x_h$-$y_h$ plane; in fact, the user can write towards arbitrary directions in 3D space.
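The mapping between hand, writing direction, viewing angle, and reference coordinate plane stated in this section can be written as a small lookup table. This is only an illustrative sketch: the names `REFERENCE_PLANES` and `reference_plane` and the string labels are hypothetical and do not come from the article.

```python
from typing import Dict, Tuple

# (hand, writing direction) -> (viewing angle, reference coordinate plane).
# The entries follow the text of Section 3.3; the labels are illustrative.
REFERENCE_PLANES: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("left", "front"): ("behind", "xh-zh"),
    ("left", "left side"): ("right side", "yh-zh"),
    ("left", "below"): ("above", "xh-yh"),
    ("right", "front"): ("behind", "xh-zh"),
    ("right", "right side"): ("left side", "(-yh)-zh"),
    ("right", "below"): ("above", "xh-yh"),
}

def reference_plane(hand: str, direction: str) -> Tuple[str, str]:
    """Return (viewing angle, reference plane) for a hand and writing direction."""
    try:
        return REFERENCE_PLANES[(hand, direction)]
    except KeyError:
        raise ValueError(f"unsupported combination: {hand!r}, {direction!r}") from None

print(reference_plane("right", "right side"))
```

Only the side-writing case depends on the hand: the left hand uses the $y_h$-$z_h$ plane while the right hand uses the $(-y_h)$-$z_h$ plane, which is what keeps the observed contour from appearing mirrored.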
Fig. 7. Different principal planes: panels (a), (b), and (c).
Here, the hand (left or right) and the writing direction (in front, left side, right side, or below) determine the viewing angle. To detect which hand writes in the air, we introduce an initial gesture before writing: the user stands with the hands down and then opens up the arm wearing the device until the arm is parallel to the floor. In the human coordinate system, if the hand moves left, the user writes with the left hand; otherwise, the user writes with the right hand. The human coordinate system itself is described later, in the System Design section. To detect the writing direction and project the 3D contour onto a 2D plane properly, we introduce the 3D contour-based gesture model, as described below.

4 3D CONTOUR-BASED GESTURE MODEL

Based on the accelerometer, gyroscope, and magnetometer of the wrist-worn device, we can obtain the 3D contour of the in-air gesture. However, as discussed in Section 3.2, because the viewing angle, writing direction, and contour distribution are uncertain, we must find a plane that yields a proper projection of the 3D contour for character recognition. To this end, we first use Principal Component Analysis (PCA) to adaptively detect the principal/writing plane. Then, we detect the reference coordinate plane and determine the viewing angle. Finally, we calibrate the 2D contour in the principal plane to obtain the character contour in the right orientation and at a normalized size.

4.1 Principal Plane Detection with PCA

As mentioned before, to get a properly projected 2D contour for character recognition, we need to detect the principal/writing plane, which contains or is close to most of the points in the 3D contour, as the red planes in Figure 7(a), Figure 7(b), and Figure 7(c) indicate. Note that the principal plane may not be parallel to any coordinate plane, as shown in Figure 7.
In this article, we use Principal Component Analysis (PCA) [30] to reduce the dimensionality of the 3D contour and detect the principal plane adaptively, as described below.

For convenience, we use $x_i = (x_{i1}, x_{i2}, x_{i3})^T$, $i \in [1, n]$, to represent the contour (i.e., the point sequence), where the three components are the coordinates along the $x_h$-axis, $y_h$-axis, and $z_h$-axis of the human coordinate system. First, we apply a centralization operation to update the coordinates $x_i$ of the contour, i.e.,

\[ x_{i1} \leftarrow x_{i1} - \frac{1}{n}\sum_{j=1}^{n} x_{j1}, \qquad x_{i2} \leftarrow x_{i2} - \frac{1}{n}\sum_{j=1}^{n} x_{j2}, \qquad x_{i3} \leftarrow x_{i3} - \frac{1}{n}\sum_{j=1}^{n} x_{j3}. \]

Then, we use $\omega_i = (\omega_{i1}, \omega_{i2}, \omega_{i3})^T$, $i \in [1, 2]$, to represent the orthonormal basis vectors of the principal plane, where $\|\omega_i\|_2 = 1$ and $\omega_i^T \omega_j = 0$ for $i \neq j$. As shown in Figure 8, for a point $x_i$ in the human-frame, its projection onto the principal plane is $y_i = (y_{i1}, y_{i2})^T = \Omega^T x_i$, where $\Omega = (\omega_1, \omega_2)$. We can then use $y_i$ to reconstruct the coordinates of $x_i$ as $\hat{x}_i$, as shown in Equation (1). The distance between $x_i$ and $\hat{x}_i$ is $d_i = \|x_i - \hat{x}_i\|_2^2$.

\[ \hat{x}_i = \sum_{j=1}^{2} y_{ij}\,\omega_j = \Omega(\Omega^T x_i). \tag{1} \]
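As a rough illustration of the centralization, projection, and reconstruction steps, the dependency-free sketch below assumes that $\Omega$ is built from the two leading eigenvectors of $XX^T$, which is the standard PCA solution. The Jacobi eigen-solver, the test contour (an ellipse on a plane tilted by 30 degrees), and all function names are choices made for this example and are not details given in the article.

```python
import math

def center(points):
    """Subtract the per-axis mean so the contour is centered at the origin."""
    n = len(points)
    mean = [sum(p[k] for p in points) / n for k in range(3)]
    return [[p[k] - mean[k] for k in range(3)] for p in points]

def scatter(points):
    """3x3 matrix X X^T, where the columns of X are the centered points."""
    return [[sum(p[a] * p[b] for p in points) for b in range(3)] for a in range(3)]

def jacobi_eigh(a, sweeps=50, tol=1e-24):
    """Eigen-decomposition of a symmetric 3x3 matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, eigenvectors) sorted by decreasing eigenvalue;
    eigenvectors[i] is the unit eigenvector belonging to eigenvalues[i].
    """
    a = [row[:] for row in a]
    v = [[float(i == j) for j in range(3)] for i in range(3)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(3) for j in range(3) if i != j)
        if off < tol:
            break
        for p in range(2):
            for q in range(p + 1, 3):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(3):  # rotate columns p and q of A and of V
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(3):  # rotate rows p and q of A
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    order = sorted(range(3), key=lambda i: a[i][i], reverse=True)
    return [a[i][i] for i in order], [[v[k][i] for k in range(3)] for i in order]

def principal_plane(points):
    """Return the centered points and Omega = (omega_1, omega_2)."""
    x = center(points)
    _, vecs = jacobi_eigh(scatter(x))
    return x, vecs[:2]

def project(x, omega):
    """y_i = Omega^T x_i: 2D coordinates of each point in the principal plane."""
    return [[sum(w[k] * p[k] for k in range(3)) for w in omega] for p in x]

def reconstruct(y, omega):
    """x_hat_i = sum_j y_ij * omega_j, as in Equation (1)."""
    return [[sum(yi[j] * omega[j][k] for j in range(2)) for k in range(3)] for yi in y]

def mean_sq_distance(x, x_hat):
    """Average of d_i = ||x_i - x_hat_i||^2 over the contour."""
    return sum(sum((u - w) ** 2 for u, w in zip(p, q)) for p, q in zip(x, x_hat)) / len(x)

# Hypothetical contour: an ellipse on a vertical plane rotated 30 degrees about z_h.
t = math.radians(30)
contour = [(0.2 * math.cos(ang) * math.cos(t), 0.2 * math.cos(ang) * math.sin(t), 0.1 * math.sin(ang))
           for ang in (2 * math.pi * k / 100 for k in range(100))]
x, omega = principal_plane(contour)
y = project(x, omega)
err = mean_sq_distance(x, reconstruct(y, omega))
print(f"mean squared distance to the detected plane: {err:.2e}")
```

Because every point of this test contour lies exactly on one plane, the reported distance is zero up to floating-point rounding. For a real in-air contour the distance would be positive and would measure how far the strokes stray from the detected writing plane.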
Fig. 8. The principle of writing plane detection with PCA.

Fig. 9. Relationship between contours and principal plane: (a) 3D contour; (b) projected contour.
When the average distance $\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i$, $i \in [1, n]$, reaches its minimum, the plane spanned by the orthonormal basis vectors $\Omega = (\omega_1, \omega_2)$ is the principal/writing plane, as shown in Equation (2):

\[ \arg\min_{\Omega} \; \frac{1}{n}\sum_{i=1}^{n} \|x_i - \hat{x}_i\|_2^2 \quad \text{s.t. } \Omega^T \Omega = I. \tag{2} \]

By combining Equation (1) and Equation (2), we can transform the objective in Equation (2) into Equation (3), where $X = (x_1, x_2, \ldots, x_n)$ and tr denotes the trace of a matrix, i.e., the sum of the elements on its main diagonal. The step holds because, under $\Omega^T \Omega = I$, we have $\|x_i - \Omega\Omega^T x_i\|_2^2 = \|x_i\|_2^2 - \|\Omega^T x_i\|_2^2$ and $\sum_{i=1}^{n} \|\Omega^T x_i\|_2^2 = \mathrm{tr}(\Omega^T X X^T \Omega)$, so minimizing the reconstruction error is equivalent to maximizing the trace.

\[ \arg\max_{\Omega} \; \mathrm{tr}(\Omega^T X X^T \Omega) \quad \text{s.t. } \Omega^T \Omega = I. \tag{3} \]

After that, we use the Lagrange multiplier method to obtain the orthonormal basis vectors $\{\omega_1, \omega_2\}$ from the eigenvalue decomposition of $XX^T$, as shown in Equation (4). The eigenvector with the largest eigenvalue is taken as $\omega_1$, and the eigenvector with the second-largest eigenvalue as $\omega_2$. We use $\omega_1$ and $\omega_2$ as the $x_p$-axis and $y_p$-axis of the principal plane, respectively.

\[ X X^T \omega_i = \lambda_i \omega_i. \tag{4} \]

As shown in Figure 9(a), the black line and the green line represent the first basis vector $\omega_1$ and the second basis vector $\omega_2$, respectively, while the red plane containing $\omega_1$ and $\omega_2$ is the detected principal plane. In the principal plane, we can obtain the projected 2D contour, as shown in Figure 9(b). However, due to the information loss of dimensionality reduction, there may exist the