Chapter 1 Soft Computing Approach to Pattern Classification and object recognition Abstract The basic aim of this research monograph is to develop a unified approach to supervised pattern classification (Tou and Gonzalez, Pattern Recog nition Principles. Addison-Wesley, Reading, 1974) and model based occluded object recognition(Koch and Kashyap, IEEE Trans Pattern Anal Machine Intell. 9(4): 483-494, 1987; Ray and Dutta Mazumder, Pattern Recogn Lett 9: 351-360 1989). To perform this task we essentially consider soft computing tools, viz uzzy relational calculus(FRC)(Pedrycz, Fuzzy Sets Syst 16: 163-174, 1985 Pattern Recogn 23(112): 121-146, 1990), genetic algorithm (GA)(Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning. Addison Wesley, Reading, 1989: Michalewicz, Genetic Algorithm Data Structures Evolution Programs, Springer, New York, 1994)and multilayer perceptron (MLP (Pao, Adaptive pattern recognition and neural networks. Addison Wesle Reading, 1989). The supervised approach to pattern classification and model based approach to occluded object recognition are treated in one framework which is based on either conventional interpretation or new interpretation of multidimen sional fuzzy implication(MFI(Sugeno and Takagi, Fuzzy Sets Syst 9: 313-325 1983; Tsukamoto, Advance in Fuzzy Set Theory and Applications. North-Holland, Amsterdam, 137-149, 1979)and a novel notion of fuzzy pattern vector(FPV. In the context of representation of knowledge about patterns and/or objects we try to generalize the concept of feature vector by fuzzy feature vector. Readers are advised to read Appendix-a before they go into the details of classification (recognition) concept based on soft computing tools. K. S. Ray, Soft Computing Approach to Pattern Classification and Object Recognition, DOI: 10.1007/978-1-4614-5348-2_1 Media New York 2012
Chapter 1 Soft Computing Approach to Pattern Classification and Object Recognition Abstract The basic aim of this research monograph is to develop a unified approach to supervised pattern classification (Tou and Gonzalez, Pattern Recognition Principles. Addison-Wesley, Reading, 1974) and model based occluded object recognition (Koch and Kashyap, IEEE Trans Pattern Anal Machine lntell. 9(4):483–494, 1987; Ray and Dutta Mazumder, Pattern Recogn Lett 9:351–360, 1989). To perform this task we essentially consider soft computing tools, viz., fuzzy relational calculus (FRC) (Pedrycz, Fuzzy Sets Syst 16:163–174, 1985, Pattern Recogn 23(1/2):121–146, 1990), genetic algorithm (GA) (Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning. Addison Wesley, Reading, 1989; Michalewicz, Genetic Algorithm + Data Structures = Evolution Programs, Springer, New York, 1994) and multilayer perceptron (MLP) (Pao, Adaptive pattern recognition and neural networks. Addison Wesley, Reading, 1989). The supervised approach to pattern classification and model based approach to occluded object recognition are treated in one framework which is based on either conventional interpretation or new interpretation of multidimensional fuzzy implication (MFI) (Sugeno and Takagi, Fuzzy Sets Syst 9:313–325, 1983; Tsukamoto, Advance in Fuzzy Set Theory and Applications. North-Holland, Amsterdam, 137–149, 1979) and a novel notion of fuzzy pattern vector (FPV). In the context of representation of knowledge about patterns and/or objects we try to generalize the concept of feature vector by fuzzy feature vector. Readers are advised to read Appendix-A before they go into the details of classification (recognition) concept based on soft computing tools. K. S. Ray, Soft Computing Approach to Pattern Classification and Object Recognition, DOI: 10.1007/978-1-4614-5348-2_1, Springer Science+Business Media New York 2012 1
I Soft Computing Approach to Pattern Classification 1.1 Introductio As the primary concern of soft computing approach to pattern classification(object recognition)(Zadeh 1977; Pedrycz 1990) is to mimic the cognitive process of human reasoning for classification(recognition), we try to imitate the way human beings perceive(Newell and Simon 1972)different classes of objects(patterns) based on some rough(inexact) information of certain parameters(features). For instance, human being can easily distinguish between a poor person and a rich person just by looking at the individual,s standard of living which cannot be measured explicitly by any specific scale but can onsidering the area where the person lives, the kind of food he/she takes, the kind of education his/her family takes, the kind of clothes he/she wears, the kind of ommodities he/she uses, etc. And such information(knowledge) can be repre sented by a rule(law of implication) as given below RI: if standard of living of a person is high then the person is rich, R2: if standard of living of a person is low then the person is poor; where standard of living high, standard of living low, etc, are linguistic terms and rich, poor are different classes. Each primary linguistic term(i.e. high/low, etc. associated with a term set which is finite and where each primary term in the term set is defined on the same universe of discourse. the said universe of discourse is partitioned (in an overlapped manner) by the finite elements of the term set. For instance, if we place the standard of living between 0 and 10(by some arbitrary scale)then Fig. 1. 1 explains how the elements of the term set partition the universe on which each element of the term set is defined. Each primary term of the term set the degree of uncertainty of the elements of the said universe to become member of either of the fuzzy set(fuzzy set of low/fuzzy set of medium/fuzzy set of high Now, instead of having one-dimensional implications (i.e. R, and r2)we can have multidimensional implication for representing our knowledge(information) For instance R3: if(behavior of a person is smart, appearance of a person is beautiful)then he/she becomes a candidate for interview of a personal assistant of a firm where t is transpose Usually, the above type of rule is read as R4: if behavior of a person is smart and appearance of a person is beautiful ther he/she becomes a candidate for interview of a personal assistant of a firm Such one-dimensional implication (i.e. R4)is a kind of interpretation(see Eq (A 1)of Appendix-A) of the said multi-dimensional form (i.e. R3) of an plication Here, according to the one-dimensional form of an implication, we have two antecedent clauses(e.g. smart behavior and beautiful appearance)which can be represented by two fuzzy sets defined over two different universe of discourses In the consequent part, we always have single clause which can be represented by a
1.1 Introduction As the primary concern of soft computing approach to pattern classification (object recognition) (Zadeh 1977; Pedrycz 1990) is to mimic the cognitive process of human reasoning for classification (recognition), we try to imitate the way human beings perceive (Newell and Simon 1972) different classes of objects (patterns) based on some rough (inexact) information of certain parameters (features). For instance, human being can easily distinguish between a poor person and a rich person just by looking at the individual’s standard of living which cannot be measured explicitly by any specific scale but can be indirectly estimated by considering the area where the person lives, the kind of food he/she takes, the kind of education his/her family takes, the kind of clothes he/she wears, the kind of commodities he/she uses, etc. And such information (knowledge) can be represented by a rule (law of implication) as given below. R1: if standard of living of a person is high then the person is rich, R2: if standard of living of a person is low then the person is poor; where standard of living high, standard of living low, etc., are linguistic terms and rich, poor are different classes. Each primary linguistic term (i.e. high/low, etc.) is associated with a term set which is finite and where each primary term in the term set is defined on the same universe of discourse. The said universe of discourse is partitioned (in an overlapped manner) by the finite elements of the term set. For instance, if we place the standard of living between 0 and 10 (by some arbitrary scale) then Fig. 1.1 explains how the elements of the term set partition the universe on which each element of the term set is defined. Each primary term of the term set is represented by a fuzzy set. Between partition there is an overlap which indicates the degree of uncertainty of the elements of the said universe to become member of either of the fuzzy set (fuzzy set of low/fuzzy set of medium/fuzzy set of high). Now, instead of having one-dimensional implications (i.e. R1 and R2) we can have multidimensional implication for representing our knowledge (information). For instance, R3: if (behavior of a person is smart, appearance of a person is beautiful)t then he/she becomes a candidate for interview of a personal assistant of a firm; where t is transpose. Usually, the above type of rule is read as R4: if behavior of a person is smart and appearance of a person is beautiful then he/she becomes a candidate for interview of a personal assistant of a firm. Such one-dimensional implication (i.e. R4) is a kind of interpretation (see Eq. (A.1) of Appendix-A) of the said multi-dimensional form (i.e. R3) of an implication. Here, according to the one-dimensional form of an implication, we have two antecedent clauses (e.g. smart behavior and beautiful appearance) which can be represented by two fuzzy sets defined over two different universe of discourses. In the consequent part, we always have single clause which can be represented by a 2 1 Soft Computing Approach to Pattern Classification
1.1 Introduction Fig. 1.1 Overlapped Elements of the term set discourse by the elements of he term set medium Standord of liv ing fuzzy set defined over a finite universe of different classes. The cardinality of the term set of each antecedent clause of an implication determines the number of rules that can be generated. For instance, if we have two antecedent clauses in an implication each having the cardinality 3(say, low, medium and high), then we will have total 3 x 3=9 rules. Note, that the cardinality of a term set defined over a given universe is not unique. Depending upon the need of the problem it determined. It simply indicates the granularity(see Fig. 1. 2) by which we want to partition the given universe to facilitate our representation of perception about grouping of objects(patterns) Depending on whether the universe of discourse is continuous or discrete, we an define the fuzzy sets of the antecedent clause of an implication by two ways. In ase universe is continuous, we may go by functional definition, e.g. a bell-shaped function(see Fig. 1. 1), triangle shaped function( see Fig. 1. 2), trapezoid shaped function or any arbitrary shaped function. In case of discrete universe, we may go by numerical definition. In this case, the grade of membership function of a fuzzy et is represented by a one-dimensional array of numbers. The length of the array depends on the degree of discretization. Discretization of a universe of discourse is frequently referred to as quantiza- on. In effect quantization discretizes a universe into a certain number of segments (quantization levels). Each segment is labeled as a generic element of a discrete niverse. A fuzzy set is then defined over the said discrete universe by assigning grade of membership values to each generic element of the discrete universe. The consequent clause of an implication basically represents different classes of objects(patterns) existing over the finite range of the pattern space as shown in Fig. 1.3. The possibility of occurrence of different classes of patterns in the pattern space under a particular observation(it may be imprecise observation, like feature FI is high and feature F2 is low, etc ) may be represented by a fuzzy set defined over the pattern space which is treated as universe of different classes of patterns Now, we try to give a more meaningful discussion on the correspondence between conventional approach to pattern classification and soft computin approach to pattern classification. Note that recognition of the occluded scene consists of model objects is based on the features of few dominant points of the occluded and model objects. The features of patterns and objects are un fiedly treated on a feature(pattern) space spanned by the individual feature axis Further note that the basic concept of supervised approach to pattern classification
fuzzy set defined over a finite universe of different classes. The cardinality of the term set of each antecedent clause of an implication determines the number of rules that can be generated. For instance, if we have two antecedent clauses in an implication each having the cardinality 3 (say, low, medium and high), then we will have total 3 9 3 = 9 rules. Note, that the cardinality of a term set defined over a given universe is not unique. Depending upon the need of the problem it is determined. It simply indicates the granularity (see Fig. 1.2) by which we want to partition the given universe to facilitate our representation of perception about grouping of objects (patterns). Depending on whether the universe of discourse is continuous or discrete, we can define the fuzzy sets of the antecedent clause of an implication by two ways. In case universe is continuous, we may go by functional definition, e.g. a bell-shaped function (see Fig. 1.1), triangle shaped function (see Fig. 1.2), trapezoid shaped function or any arbitrary shaped function. In case of discrete universe, we may go by numerical definition. In this case, the grade of membership function of a fuzzy set is represented by a one-dimensional array of numbers. The length of the array depends on the degree of discretization. Discretization of a universe of discourse is frequently referred to as quantization. In effect quantization discretizes a universe into a certain number of segments (quantization levels). Each segment is labeled as a generic element of a discrete universe. A fuzzy set is then defined over the said discrete universe by assigning grade of membership values to each generic element of the discrete universe. The consequent clause of an implication basically represents different classes of objects (patterns) existing over the finite range of the pattern space as shown in Fig. 1.3. The possibility of occurrence of different classes of patterns in the pattern space under a particular observation (it may be imprecise observation, like feature F1 is high and feature F2 is low, etc.) may be represented by a fuzzy set defined over the pattern space which is treated as universe of different classes of patterns. Now, we try to give a more meaningful discussion on the correspondence between conventional approach to pattern classification and soft computing approach to pattern classification. Note that recognition of the occluded scene consists of model objects is based on the features of few dominant points of the occluded scene and model objects. The features of patterns and objects are uni- fiedly treated on a feature (pattern) space spanned by the individual feature axis. Further note that the basic concept of supervised approach to pattern classification Fig. 1.1 Overlapped partition of the universe of discourse by the elements of the term set 1.1 Introduction 3
I Soft Computing Approach to Pattern Classification (a) Fig. 1. 2 Granularity of partition of universe, a coarse. b Fine F2tleoture axis pottern space (U1×Uz) Pt(pattern) P2(pattem) operating range of F2= U2 E,弓 Fr(teature oxis) feature vector/ patem vector operating range ofF1≡U1 Fig. 1.3 Representation of different classes of patterns on R- by feature vectors/pattern vectors and model based approach to occluded object recognition are similar in nature. Hence both the approaches are treated on a unified platform, i.e. either based on conventional interpretation of MFI or based on new interpretation of MFI (Sugeno and Takagi 1983; Tsukamoto 1979) In the supervised approach to pattern classification and model based approach to occluded object recognition, the declarative form of knowledge representation about the features of the training patterns or model objects is performed through fuzzy IF-THEN rules. The antecedent part of the rules represent the feature values in terms of primary fuzzy sets and the consequent part of the rules represent the class belongingness of the known patterns or objects. Thus feature values of training patterns or model objects are related to different classes formed by training patterns or model objects. To estimate the said relations is the primary task of classifier design(for patterns)or recognizer design(for scene consists of model
and model based approach to occluded object recognition are similar in nature. Hence both the approaches are treated on a unified platform, i.e. either based on conventional interpretation of MFI or based on new interpretation of MFI (Sugeno and Takagi 1983; Tsukamoto 1979). In the supervised approach to pattern classification and model based approach to occluded object recognition, the declarative form of knowledge representation about the features of the training patterns or model objects is performed through fuzzy IF–THEN rules. The antecedent part of the rules represent the feature values in terms of primary fuzzy sets and the consequent part of the rules represent the class belongingness of the known patterns or objects. Thus feature values of training patterns or model objects are related to different classes formed by training patterns or model objects. To estimate the said relations is the primary task of classifier design (for patterns) or recognizer design (for scene consists of model Fig. 1.3 Representation of different classes of patterns on R2 by feature vectors/pattern vectors Fig. 1.2 Granularity of partition of universe, a coarse. b Fine 4 1 Soft Computing Approach to Pattern Classification
LI Introduction 5 objects). Usually, for inferencing(reasoning/decision making) from a set of fuzzy IF-THEN rules, multivalued concept of fuzzy logic is used (Mizumoto 198 Zadeh 1970). But in this research monograph, for inferencing(reasoning/decision making) we replace logic by the concept of learning, simply because the logical pproach to fuzzy reasoning depends upon suitable choice of interpretation, viz. Mamdanis min operator, Zadeh's arithmatic rules etc.(Mizumoto 1985: Zadeh 1970)to construct a relation between antecent clause(s)and consequent clause. To design a classifier(for patterns) or recognizer(for occluded scene consists of model objects) we adopt the concept of learning to estimate the relation between antecedent clause(s) and consequent clause 1. 2 Passage Between Conventional Approach to Pattern Classification(Object Recognition) and Soft Computing Approach to Pattern Classification(Object Recognition) For simplicity of discussion and/or demonstration, let us restrict ourselves to the problem of supervised pattern classification(model based object recognition)on R-. Without lack of any generality, all the discussions and/or demonstrations will be valid for any problem of supervised pattern classification(model based object recognition)on R,n> 2. In case of supervised pattern classification(model based object recognition) we basically have two stages; viz. learning(training stage and testing(verification) stage. At the learning(training) stage we design the classifier(recognizer) based on the training data(model objects)set. Subsequently, at the testing(verification) stage we test the performance of the designed classifier (recognizer) based on the test data set In conventional approach to pattern classification(object recognition) on R- (Tou and Gonzalez 1974; Koch and Kashyap 1987: Ray and DuttaMazumder 989), we usually represent the training data set by two dimensional feature vector (pattern vector)having two feature axes(say FI and F2). Depending upon the limit of the operating range of features, we obtain a finite range of pattern space formed by the finite length of each feature axis(see Fig. 1.3). Looking at diagram of the training data set we get an idea of how the training data are grouped together and accordingly each group of training data set is labeled by a particular class say Ci, i= 1, 2,? Each data(pattern) of each class is represented by a pattern vector/feature vector as stated earlier. For instance, any data(pattern) Pi on the pattern space is represented by the tip of the following pattern vector/feature vector(see Fig. 1.3) Fc F The task of the conve pproach to pattern classification is to classify each vector F to one of the classes C; on R2
objects). Usually, for inferencing (reasoning/decision making) from a set of fuzzy IF–THEN rules, multivalued concept of fuzzy logic is used (Mizumoto 1985; Zadeh 1970). But in this research monograph, for inferencing (reasoning/decision making) we replace logic by the concept of learning, simply because the logical approach to fuzzy reasoning depends upon suitable choice of interpretation, viz. Mamdani’s min operator, Zadeh’s arithmatic rules etc. (Mizumoto 1985; Zadeh 1970) to construct a relation between antecent clause(s) and consequent clause. To design a classifier (for patterns) or recognizer (for occluded scene consists of model objects) we adopt the concept of learning to estimate the relation between antecedent clause(s) and consequent clause. 1.2 Passage Between Conventional Approach to Pattern Classification (Object Recognition) and Soft Computing Approach to Pattern Classification (Object Recognition) For simplicity of discussion and/or demonstration, let us restrict ourselves to the problem of supervised pattern classification (model based object recognition) on R2 . Without lack of any generality, all the discussions and/or demonstrations will be valid for any problem of supervised pattern classification (model based object recognition) on Rn , n C 2. In case of supervised pattern classification (model based object recognition) we basically have two stages; viz. learning (training) stage and testing (verification) stage. At the learning (training) stage we design the classifier (recognizer) based on the training data (model objects) set. Subsequently, at the testing (verification) stage we test the performance of the designed classifier (recognizer) based on the test data set. In conventional approach to pattern classification (object recognition) on R2 (Tou and Gonzalez 1974; Koch and Kashyap 1987; Ray and DuttaMazumder 1989), we usually represent the training data set by two dimensional feature vector (pattern vector) having two feature axes (say F1 and F2). Depending upon the limit of the operating range of features, we obtain a finite range of pattern space formed by the finite length of each feature axis (see Fig. 1.3). Looking at the scatter diagram of the training data set we get an idea of how the training data are grouped together and accordingly each group of training data set is labeled by a particular class say eci; i = 1, 2,? Each data (pattern) of each class is represented by a pattern vector/feature vector as stated earlier. For instance, any data (pattern) Pi on the pattern space is represented by the tip of the following pattern vector/feature vector (see Fig. 1.3). ~F~ci ¼ Fi 11 Fi 22 ; i ¼ 1; 2; ... ð1:1Þ The task of the conventional approach to pattern classification is to classify each vector F ! eci to one of the classes eci on R2 . 1.1 Introduction 5