16 2 Pattern Classification Based account in classification procedures. Although a lot of scientific developments have already been made in the area of pattern classification, existing techniques of pattern classification remain inferior to the human classification processes which perform extremely complex tasks. Hence, we attempt to develop a plau- sible tool using fuzzy relational calculus(FRC) for modeling and mimicking the cognitive process of human reasoning for pattern classification. The FRC approach to pattern classification can take care uncertainties in feature values of patterns under different conditions like measurement error, noise, etc. Though there are several existing approaches to designing a classifier using the concept of uzzy set/fuzzy logic(Mori 1983: Mori and Laface 1980: Hirota 1988; Seif and Aguilar- Martin 1980: Dubois and Jaulent 1985: Hirota et al. 1987; Huntsberger et al. 1985; Kickert and Koppleaar 1986: Siy and Chen 1974; Shimura 1975 Huntsberger et al. 1986: Lee 1972: Zadeh et al. 1975; bortolan and Degani 1983 Bortolan et al. 1988: Degani and Ortolan 1987a, b: Pedrycz 1985a, b, c: Watanabe 1985: Kumar 1977; Saitta and Tarasso 1981; Woodbury and Clive 1974: Bezdek and Pal 1992; Simpson 1992), we have selected the concept proposed by Pedrycz (1990)and suitably modified it to incorporate our oncept of computation of the derivative of the fuzzy max--function and mil function. To represent the knowledge about the training data set, we consider the onventional interpretation of multidimensional fuzzy implication(MFD)(Sugeno and Takagi 1983; Tsukamoto 1979). We introduce a novel notion of fuzzy pattern vector as stated in Appendix-A to represent a population of patterns(a set of patterns)in the pattern space. It represents the antecedent part of the said par ticular interpretation of the MFI to meaningfully carry out the task of pattern classification using FRC. during the construction of the classifier based on FRC we use fuzzy linguistic statements(or fuzzy membership function to represent the linguistic statement) to represent the values of features(e.g, feature FI is small and F2 is big)for a population of patterns (a set of patterns)represented by the above said notion of Fuzzy pattern vector. Note that the construction of the classifier essentially depends on the estimation of a fuzzy relation 3R between the antecedent part and consequent part of the rules. As, for a given problem of Is no specitic g select a particular logical operator, e.g., Mamani's min operator, Zadeh's arith- metic rule etc (Mizumoto 1985; Zadeh 1970)to translate a fuzzy implication to a fuzzy relation we estimate it(the relation R), based on different learning scheme using soft computing tools. Thus, in our entire treatment for classification and recognition we replace logic by learning. The estimated 3 is the core of the classifier(recognizer). Once the classifier is constructed, the nonfuzzy features of a pattern can be classified. At the time of classification of the nonfuzzy features of the test patterns, we use the concept of fuzzy masking to fuzzify the nonfuzzy feature values of the test patterns. The performance of the proposed scheme is tested on synthetic data. Finally, we use the proposed scheme for the vowel classification problem of an Indian language
account in classification procedures. Although a lot of scientific developments have already been made in the area of pattern classification, existing techniques of pattern classification remain inferior to the human classification processes which perform extremely complex tasks. Hence, we attempt to develop a plausible tool using fuzzy relational calculus (FRC) for modeling and mimicking the cognitive process of human reasoning for pattern classification. The FRC approach to pattern classification can take care uncertainties in feature values of patterns under different conditions like measurement error, noise, etc. Though there are several existing approaches to designing a classifier using the concept of fuzzy set/fuzzy logic (Mori 1983; Mori and Laface 1980; Hirota 1988; Seif and Aguilar-Martin 1980; Dubois and Jaulent 1985; Hirota et al. 1987; Huntsberger et al. 1985; Kickert and Koppleaar 1986; Siy and Chen 1974; Shimura 1975; Huntsberger et al. 1986; Lee 1972; Zadeh et al. 1975; Bortolan and Degani 1983; Bortolan et al. 1988; Degani and Bortolan 1987a, b; Pedrycz 1985a, b, c; Watanabe 1985; Kumar 1977; Saitta and Tarasso 1981; Woodbury and Clive 1974; Bezdek and Pal 1992; Simpson 1992), we have selected the concept proposed by Pedrycz (1990) and suitably modified it to incorporate our new concept of computation of the derivative of the fuzzy max—function and min— function. To represent the knowledge about the training data set, we consider the conventional interpretation of multidimensional fuzzy implication (MFI) (Sugeno and Takagi 1983; Tsukamoto 1979). We introduce a novel notion of fuzzy pattern vector as stated in Appendix-A to represent a population of patterns (a set of patterns) in the pattern space. It represents the antecedent part of the said particular interpretation of the MFI to meaningfully carry out the task of pattern classification using FRC. During the construction of the classifier based on FRC, we use fuzzy linguistic statements (or fuzzy membership function to represent the linguistic statement) to represent the values of features (e.g., feature F1 is small and F2 is big) for a population of patterns (a set of patterns) represented by the above said notion of Fuzzy pattern vector. Note that the construction of the classifier essentially depends on the estimation of a fuzzy relation < between the antecedent part and consequent part of the rules. As, for a given problem of pattern classification and object recognition, there is no specific guideline to select a particular logical operator, e.g., Mamani’s min operator, Zadeh’s arithmetic rule etc. (Mizumoto 1985; Zadeh 1970) to translate a fuzzy implication to a fuzzy relation we estimate it (the relation <), based on different learning scheme using soft computing tools. Thus, in our entire treatment for classification and recognition we replace logic by learning. The estimated < is the core of the classifier (recognizer). Once the classifier is constructed, the nonfuzzy features of a pattern can be classified. At the time of classification of the nonfuzzy features of the test patterns, we use the concept of fuzzy masking to fuzzify the nonfuzzy feature values of the test patterns. The performance of the proposed scheme is tested on synthetic data. Finally, we use the proposed scheme for the vowel classification problem of an Indian language. 16 2 Pattern Classification Based
2.2 Statement of the Problem 2. 2 Statement of the problem For the present problem, let us consider the conventional interpretation of a MFI [see Appendix-A, Eq. A 1 A 2] as stated below; a) if x is A and y is B then z is C b)if x is a then y is B then z is C The notion of a fuzzy pattern vector(see Appendix-A)represents the ante- cedent clauses of (a)of (2.1)and locates a population of patterns P in the quar tized pattern space. Assume that the quantized pattern space consists of"c universes U1,U2,…, U. in the form l=U1×U2×…xU, where each U presents the universe on the ith feature axis Fi Assume that D is a fuzzy relation [formed by the antecedent clauses of (a) of (2. 1)], which is a fuzzy set in quantized product space U, namely up: U-10, 1 Also, assume that there exists a set Cclass of finite number of classes CI, C2,..., Cnt e, Cclass =c1, c2, .. Cn), by which the finite range of the pattern space is overed. The consequent clause of a)of (2. 1)is a fuzzy setC=iau(ci)/c where p(Ci)denotes the degree belongingness of the population of patterns Pto the class ci, for j= 1, 2,..., n(see Example A l of Appendix-A). Therefore, by considering the conventional interpretation of a MFl, the fuzzy set D formed by the antecedent clauses of (a) of(2.1) is associated with the fuzzy set C which represents the consequent clause of (a)of(2.1). Hence, there exists a relation between D and C. More precisely, D and C are related via a certain relation 3R (i.e, D JC), which is presently unknown and has to be estimated, based on the training data set, for the design of the classifier. Now, for the testing of the classifier, we specify how C is derived from given D and estimated We may onsider the fuzzy relational equation, namely, a direct equation C=DoR where 0 max-t composition operator, where t is a T-norm operator Equation(2.2)can be rewritten, in terms of the membership function, in the following form He(C)=v.Lup(u)tpg(u, c) for j=1, 2, This explicit form of (2.3)is needed for actual design study of the classifier Let us assume that the training set consists of ordered pairs and the classifier relation is supposed to specify a system of equations
2.2 Statement of the Problem For the present problem, let us consider the conventional interpretation of a MFI [see Appendix-A, Eq. A.1 & A.2] as stated below; aÞ if x is A and y is B then z is C or bÞ if x is A then y is B then z is C : ð2:1Þ The notion of a fuzzy pattern vector (see Appendix-A) represents the antecedent clauses of (a) of (2.1) and locates a population of patterns P in the quantized pattern space. Assume that the quantized pattern space consists of ‘‘c’’ universes U1, U2, …, Uc in the form U = U1 9 U2 9 9 Uc, where each Ui represents the universe on the ith feature axis Fi, i = 1, 2, …, c. Assume that D is a fuzzy relation [formed by the antecedent clauses of (a) of (2.1)], which is a fuzzy set in quantized product space U, namely lD: U ? [0, 1]. Also, assume that there exists a set Cclass of finite number of classes c1, c2, …, cn, i.e., Cclass ¼ fc1; c2; ...; cng; by which the finite range of the pattern space is covered. The consequent clause of a) of (2.1) is a fuzzy set C ¼ Pn j¼1 lcðcjÞ=cj; where lcðCjÞ denotes the degree belongingness of the population of patterns P to the class cj, for j = 1, 2, …, n (see Example A.1 of Appendix-A). Therefore, by considering the conventional interpretation of a MFI, the fuzzy set D formed by the antecedent clauses of (a) of (2.1) is associated with the fuzzy set C which represents the consequent clause of (a) of (2.1). Hence, there exists a relation between D and C. More precisely, D and C are related via a certain relation < (i.e.,D<C), which is presently unknown and has to be estimated, based on the training data set, for the design of the classifier. Now, for the testing of the classifier, we specify how C is derived from given D and estimated <. We may consider the fuzzy relational equation, namely, a direct equation C ¼ Do< ð2:2Þ where 0 : max–t composition operator, where t is a T—norm operator. Equation (2.2) can be rewritten, in terms of the membership function, in the following form: lCðCjÞ ¼ _ u2U½lDðuÞtlRðu; cjÞ for j ¼ 1; 2; ...; n : ð2:3Þ This explicit form of (2.3) is needed for actual design study of the classifier. Let us assume that the training set consists of ordered pairs ðP1; C2Þ; ðP2; C2Þ; ... ; ðPk; CkÞ and the classifier relation is supposed to specify a system of equations Cl ¼ Dlo<l; l ¼ 1; 2; 3; ...; k ð2:4Þ 2.2 Statement of the Problem 17
2 Pattern Classification Based then the fuzzy relation which satisfies (2. 4) is given by But the above mentioned system of equations in(2. 4)may not have a solution (Pedrycz 1990). Hence, in this chapter we look for an approximate solution of the ystem of fuzzy relation equations in (2.4) 2.3 Existing Method to Solve Fuzzy Relation Equatio The numerical solution of fuzzy relational equation has been proposed by several esearchers(Wang 1993: Pedrycz 1983, 1985a, b, 1988, 1991, 1995; Ikoma et al 1993: Hellendoorn 1992; Dinola et al. 1991; Chakraborty 1985: Ovchinnikov and Riera 1992; Gottwald 1994: Wangming 1986). In this section, we briefly review ne method proposed by Pedrycz(1983). We focus our attention on max--com- position operator of fuzzy relational equations, which are defined on finite spaces C= DoR where 0 max-t composition operator, D and C are the fuzzy sets defined on the universe of discourses U=u1, 2,..., Um, and Class =c1, c2 respectively, and 3 is the fuzzy relation on U x Cclass. Let ri=(ui, ci/uieL cjeCclass), i= 1, 2,., m,j=1, 2, .,n;then, the fuzzy sets D and C and fuzzy relation R are as follows D=|p(u)1xm∈F(U C=[c(c)1xn∈F(Clas) ()mxn∈F(U×Cany) If the universe U of the quantized pattern space consists of 'c'features, say Fi i= 1, 2,-.., C, the D is a fuzzy set defined on the quantized product spaces of UI Uc that is U=U1xU2×…xUe, where L1={n1,n2,……,l} is the universe of the ith feature axis Fi with card(Ui)=m;. Let d be the fuzzy set on D=pD()1xm∈F(U)fori=1,2,…,c;then,card(U) IE=1 mi and ui is the c tuple each of type=(吨2,n2,……,/∈Up 1, 2,., c)and corresponding membership value belonging to D is determined by (2. 8)shown below; (1)1a)=/=:{()=pup(4.)A(2)…A(4) (2)1(U)=IL=1{p()=pp()(e),p(a) wherei=∑=1(m>pm)(-1)+i, for each ip=1,2,…m
then the fuzzy relation which satisfies (2.4) is given by < ¼b \k l¼1 <l: ð2:5Þ But the above mentioned system of equations in (2.4) may not have a solution (Pedrycz 1990). Hence, in this chapter we look for an approximate solution of the system of fuzzy relation equations in (2.4). 2.3 Existing Method to Solve Fuzzy Relation Equation The numerical solution of fuzzy relational equation has been proposed by several researchers (Wang 1993; Pedrycz 1983, 1985a, b, 1988, 1991, 1995; Ikoma et al. 1993; Hellendoorn 1992; Dinola et al. 1991; Chakraborty 1985; Ovchinnikov and Riera 1992; Gottwald 1994; Wangming 1986). In this section, we briefly review the method proposed by Pedrycz (1983). We focus our attention on max—composition operator of fuzzy relational equations, which are defined on finite spaces C ¼ Do< ð2:6Þ where 0 : max-t composition operator, D and C are the fuzzy sets defined on the universe of discourses U ¼ fu1; u2; ...; umg and Cclass ¼ fc1; c2; ...; cng; respectively, and < is the fuzzy relation on U 9 Cclass. Let rij ¼ ðui; cj=uieU; cjeCclassÞ; i = 1,2, …, m, j = 1,2, …, n; then, the fuzzy sets D and C and fuzzy relation < are as follows: D ¼ ½lDðuiÞ1m 2 FðUÞ C ¼ ½lCðcjÞ1n 2 FðCclassÞ <¼½l<ðrijÞmn 2 FðU CclassÞ: ð2:7Þ If the universe U of the quantized pattern space consists of ‘c’ features, say Fi, i = 1,2, …, c, the D is a fuzzy set defined on the quantized product spaces of U1, U2, …, Uc, that is U ¼ U1 U2 Uc; where Ui ¼ fui 1; ui 2; ...; ui mi g is the universe of the ith feature axis Fi with card ðUiÞ ¼ mi: Let Di be the fuzzy set on Ui, i.e., Di ¼ ½lDiðui j Þ1mi 2 FðUiÞ for i = 1, 2, …, c; then, cardð Þ¼ U m ¼ Qc i¼1 mi and ui is the c tuple each of type ui ¼ ðu1 i1 ; u2 i2 ; ...; uc ic =up ip 2 Up; p ¼ 1; 2; ...; cÞ and corresponding membership value belonging to D is determined by (2.8) shown below; ð1Þ lDðuiÞ ¼ ^c p¼1 flDp ðup ipÞg ¼ lD1 ðu1 i1 Þ ^ lD2 ðu2 i2 Þ... ^ lDc ðuc ic Þ ð2Þ lDðUiÞ ¼ Yc p¼1 flDp ðUp ipÞg ¼ lD1 ðu1 i1 Þ lD2 ðu2 i2 Þ...lDc ðuc ic Þ ð2:8Þ where i ¼ Pc1 p¼1ðPc k [pmkÞðip 1Þ þ ic; for each ip ¼ 1; 2; ...; mp; p ¼ 1; 2; ...; c: 18 2 Pattern Classification Based
2.3 Existing Method to Solve Fuzzy Relation Equation Equation(2.6)can be put in the following form ()=V{m(u)mge(},for,j=1,2,……,n (2.9) where t is the t-norm operator. Thus, from(2.)and(2.9), where t of (2.9)is one of the operators in iprod min we get following four types of problems Type I: by using(1)of (2. 8)and t prod of (2.9) Type I: by using(2)of(2.8)and t= prod of(2.9); Type I: by using(1)of (2.8) and t= min of (2.9) Type I: by using(2)of (2.8)and t= min of(2.9) Let e be the sum of the square of the error over p= 1, 2, .., n and is defined Ac(Cp (2.10) where C is the calculated fuzzy set using(2.9), and C is the desired fuzzy set ow,the basic problem is to estimate i= Lar(ri )lmxn via some given D and C which minimize E defined in(2. 10)and satisfying uR(i)A(1-AR(ri))) ≥0,ⅵi=1,2,……, m and j=1,2, g eneatio ns which form he necessary oo ditions sorea nein morm of the sauvare of the error defined in(2.10). Thus, we have [(aE)/(OuR(i)lmxn=[Omxn. Now we discuss the applicability of Newton's method and its simplification The Newton's iterative scheme for finding the solution of Lug(ri)lmxn is )(+1=1)()-ap(01R= where i=1,2,.…, m and j=1,2,…,n· as is the convergent factor and also is an nonincreasing gain factor depending on the number of iteration. It can be described as as=1/(2.0+s).>0 is chosen empirically in order to achieve good convergent properties and avoid significant oscillations in the iteration procedure(Pedrycz 1983) E app (=2ic(c)-ha()h (2.12) where opc(ci) aun(i)
Equation (2.6) can be put in the following form: lc cj ¼ _m i¼1 flDðuiÞtl<ðrijg; for; j ¼ 1; 2; ...; n ð2:9Þ where t is the t-norm operator. Thus, from (2.8) and (2.9), where t of (2.9) is one of the operators in {prod, min}, we get following four types of problems: Type I: by using (1) of (2.8) and t : prod of (2.9); Type II: by using (2) of (2.8) and t : prod of (2.9); Type III: by using (1) of (2.8) and t : min of. (2.9); Type IV: by using (2) of (2.8) and t : min of (2.9). Let E be the sum of the square of the error over p ¼ 1; 2; ...; n and is defined by E ¼ Xn p¼1 lCðcpÞ lC~ ðcpÞ 2 ð2:10Þ where C is the calculated fuzzy set using (2.9), and C~ is the desired fuzzy set. Now, the basic problem is to estimate <¼½l<ðrijÞmn via some given D and C which minimize E defined in (2.10) and satisfying fl<ðrijÞ^ð1 l<ðrijÞÞg 0; 8i ¼ 1; 2; ...; m and j ¼ 1; 2; ...; n: A general method to solve an optimization problem, defined above, is to solve a set of equations, which form the necessary conditions for a minimum of the square of the error defined in (2.10). Thus, we have ½ðoEÞ=ðol<ðrijÞÞmn ¼ ½0 mn: Now, we discuss the applicability of Newton’s method and its simplification. The Newton’s iterative scheme for finding the solution of <¼½l<ðrijÞmn is l<ðrijÞ ð Þ sþ1 ¼ l<ðrijÞ ð Þs as oE ol< rij j<¼<ðsÞ ð2:11Þ where i ¼ 1; 2; ...; m and j ¼ 1; 2; ...; n as is the convergent factor and also is an nonincreasing gain factor depending on the number of iteration. It can be described as as ¼ 1=ð2:0 þ skÞ 0 is chosen empirically in order to achieve good convergent properties and avoid significant oscillations in the iteration procedure (Pedrycz 1983). Now oE ol<ðrijÞ ¼ 2flCðCjÞ lC~ ðCjÞgPij ð2:12Þ where Pij ¼ olCðCjÞ ol<ðrijÞ ; ð2:13Þ 2.3 Existing Method to Solve Fuzzy Relation Equation 19
2 Pa P=μR(T {D(up)ra2(rp)万 Vm{p()g(m)×V{p()p( fori=1,2,,,, m and j=1,2,.…,n. If we consider t-norm operator as"prod, "then the(.9)is written as q)=V=1{(a)(r) and in this case Pi in(2. 14)is determined as (2.16), for i=1, 2 m),Vm=:{),H(m)≤(4),/e() otherwise Again, if we consider t-norm operator as"min", then(2.9)is written as (q)=V=1{(a)AHe( 1,2, (217) and in this case, Pi is determined as if ViAD(up)Ap(pi))<p(ui)AHm(i) P (2.18) and up(u)≥p(r) otherwise i=1,2,……, m and j=1,2 Here, the derivative of the max-function and min-function in the (2. 14) (2.16), and(2. 18), respectively are as follows w a ∫1,if 1o, ifw<a (2.19) here a= Vp+ lap(up)tuR(pi) and w=pp(ui)tue(rin)and (Ab)=1 ifz<b 0, if z>b (220) where b=pp(ui) and z=HR(i), which are piecewise differentiable and is undefined at w= a for max-function in (2. 19) and z= b for min-function in (2.20). Thus, we get some problems in our numerical computation (lkoma et al 1993)which may be overcome by defining the derivatives at pectively as follows
i.e., Pij ¼ o ol<ðrijÞ ½ _m p¼1 flDðupÞtl<ðrpjÞg ¼ o ol<ðrijÞ ½ _m p6¼i flDðupÞtl<ðrpjÞgg _flDðuiÞtl<ðrijÞg ð2:14Þ for i ¼ 1; 2; ...; m and j ¼ 1; 2; ...; n: If we consider t-norm operator as ‘‘prod,’’ then the (2.9) is written as lCðcjÞ ¼ _m i¼1 flDðuiÞ l<ðrijÞg; for; j ¼ 1; 2; ...; n ð2:15Þ and in this case Pij in (2.14) is determined as (2.16), for i ¼ 1; 2; ...; m and j ¼ 1; 2; ...; n: Pij ¼ lDðuiÞ; if Wm p6¼i flDðupÞ l<ðrpjÞg lDðuiÞ l<ðrijÞ 0; otherwise: ð2:16Þ Again, if we consider t-norm operator as ‘‘min’’, then (2.9) is written as lCðcjÞ ¼ _m i¼1 flDðuiÞ ^ l<ðrijÞg; for; j ¼ 1; 2; ...; n ð2:17Þ and in this case, Pij is determined as Pij ¼ 1; if _m p6¼i flDðupÞ ^ l<ðrpjÞg lDðuiÞ ^ l<ðrijÞ and lDðuiÞ l<ðrijÞ 0; otherwise 8 >>>< >>>: ð2:18Þ for i ¼ 1; 2; ...; m and j ¼ 1; 2; ...; n: Here, the derivative of the max—function and min-function in the (2.14), (2.16), and (2.18), respectively are as follows: o ow ðw _ aÞ ¼ 1; if w[ a 0; if w\a ð2:19Þ where a ¼ Wm p6¼i flDðupÞtl<ðrpjÞg and w ¼ lDðuiÞtl<ðrijÞ and o oz ðz ^ bÞ ¼ 1; if z\b 0; if z [ b ð2:20Þ where b ¼ lDðuiÞ and z ¼ l<ðrijÞ, which are piecewise differentiable and is undefined at w = a for max-function in (2.19) and z = b for min-function in (2.20). Thus, we get some problems in our numerical computation (Ikoma et al. 1993) which may be overcome by defining the derivatives at w = a and z = b, respectively as follows 20 2 Pattern Classification Based