Contents

1 Introduction  1
  1.1 Overview  1
  1.2 Classification  3
  1.3 Organization of the Book  6
  1.4 Exercises  6
  References  7

2 Classification  9
  2.1 The Classification Process  9
  2.2 Features  11
  2.3 Training and Learning  16
  2.4 Supervised Learning and Algorithm Selection  17
  2.5 Approaches to Classification  18
  2.6 Examples  21
    2.6.1 Classification by Shape  21
    2.6.2 Classification by Size  22
    2.6.3 More Examples  23
    2.6.4 Classification of Letters  25
  2.7 Exercises  25
  References  26

3 Nonmetric Methods  27
  3.1 Introduction  27
  3.2 Decision Tree Classifier  27
    3.2.1 Information, Entropy, and Impurity  29
    3.2.2 Information Gain  31
    3.2.3 Decision Tree Issues  35
    3.2.4 Strengths and Weaknesses  38
  3.3 Rule-Based Classifier  39
  3.4 Other Methods  39
  3.5 Exercises  40
  References  41
4 Statistical Pattern Recognition  43
  4.1 Measured Data and Measurement Errors  43
  4.2 Probability Theory  43
    4.2.1 Simple Probability Theory  43
    4.2.2 Conditional Probability and Bayes' Rule  46
    4.2.3 Naïve Bayes Classifier  53
  4.3 Continuous Random Variables  54
    4.3.1 The Multivariate Gaussian  57
    4.3.2 The Covariance Matrix  59
    4.3.3 The Mahalanobis Distance  69
  4.4 Exercises  72
  References  74

5 Supervised Learning  75
  5.1 Parametric and Non-parametric Learning  75
  5.2 Parametric Learning  75
    5.2.1 Bayesian Decision Theory  75
    5.2.2 Discriminant Functions and Decision Boundaries  87
    5.2.3 MAP (Maximum A Posteriori) Estimator  94
  5.3 Exercises  96
  References  98

6 Nonparametric Learning  99
  6.1 Histogram Estimator and Parzen Windows  99
  6.2 k-Nearest Neighbor (k-NN) Classification  100
  6.3 Artificial Neural Networks  104
  6.4 Kernel Machines  117
  6.5 Exercises  120
  References  121

7 Feature Extraction and Selection  123
  7.1 Reducing Dimensionality  123
    7.1.1 Preprocessing  124
  7.2 Feature Selection  124
    7.2.1 Inter/Intraclass Distance  124
    7.2.2 Subset Selection  126
  7.3 Feature Extraction  127
    7.3.1 Principal Component Analysis  127
    7.3.2 Linear Discriminant Analysis  135
  7.4 Exercises  140
  References  141
8 Unsupervised Learning  143
  8.1 Clustering  143
  8.2 k-Means Clustering  145
    8.2.1 Fuzzy c-Means Clustering  148
  8.3 (Agglomerative) Hierarchical Clustering  150
  8.4 Exercises  154
  References  155

9 Estimating and Comparing Classifiers  157
  9.1 Comparing Classifiers and the No Free Lunch Theorem  157
    9.1.1 Bias and Variance  159
  9.2 Cross-Validation and Resampling Methods  160
    9.2.1 The Holdout Method  161
    9.2.2 k-Fold Cross-Validation  162
    9.2.3 Bootstrap  163
  9.3 Measuring Classifier Performance  164
  9.4 Comparing Classifiers  169
    9.4.1 ROC Curves  169
    9.4.2 McNemar's Test  169
    9.4.3 Other Statistical Tests  169
    9.4.4 The Classification Toolbox  171
  9.5 Combining Classifiers  174
  References  176

10 Projects  177
  10.1 Retinal Tortuosity as an Indicator of Disease  177
  10.2 Segmentation by Texture  181
  10.3 Biometric Systems  183
    10.3.1 Fingerprint Recognition  184
    10.3.2 Face Recognition  187
  References  187

Index  189
Chapter 1
Introduction

1.1 Overview

Humans are good at recognizing objects (or patterns, to use the generic term). We are so good that we take this ability for granted, and find it difficult to analyze the steps in the process. It is generally easy to distinguish the sound of a human voice from that of a violin; a handwritten numeral "3" from an "8"; and the aroma of a rose from that of an onion. Every day, we recognize faces around us, but we do it unconsciously and, because we cannot explain our expertise, we find it difficult to write a computer program to do the same.

Each person's face is a pattern composed of a particular combination of structures (eyes, nose, mouth, ...) located in certain positions on the face. By analyzing sample images of faces, a program should be able to capture the pattern specific to a face and identify (or recognize) it as a face (as a member of a category or class we already know); this would be pattern recognition. There may be several categories (or classes) and we have to sort (or classify) a particular face into a certain category (or class); hence the term classification.

Note that in pattern recognition, the term pattern is interpreted widely and does not necessarily imply a repetition; it is used to include all objects that we might want to classify, e.g., apples (or oranges), speech waveforms, and fingerprints. A class is a collection of objects that are similar, but not necessarily identical, and which is distinguishable from other classes. Figure 1.1 illustrates the difference between classification where the classes are known beforehand and classification where classes are created after inspecting the objects.

Interest in pattern recognition and classification has grown due to emerging applications, which are not only challenging but also computationally demanding. These applications include:

• Data mining (sifting through a large volume of data to extract a small amount of relevant and useful information, e.g., fraud detection, financial forecasting, and credit scoring)
• Biometrics (personal identification based on physical attributes of the face, iris, fingerprints, etc.)
• Machine vision (e.g., automated visual inspection in an assembly line)
• Character recognition [e.g., automatic mail sorting by zip code, automated check scanners at ATMs (automated teller machines)]
• Document recognition (e.g., recognizing whether an e-mail is spam or not, based on the message header and content)
• Computer-aided diagnosis [e.g., helping doctors make diagnostic decisions based on interpreting medical data such as mammographic images, ultrasound images, electrocardiograms (ECGs), and electroencephalograms (EEGs)]
• Medical imaging [e.g., classifying cells as malignant or benign based on the results of magnetic resonance imaging (MRI) scans, or classifying different emotional and cognitive states from images of brain activity in functional MRI]
• Speech recognition (e.g., helping handicapped patients to control machines)
• Bioinformatics (e.g., DNA sequence analysis to detect genes related to particular diseases)
• Remote sensing (e.g., land use and crop yield)
• Astronomy (classifying galaxies based on their shapes, or automated searches such as the Search for Extra-Terrestrial Intelligence (SETI), which analyzes radio telescope data in an attempt to locate signals that might be artificial in origin)

The methods used have been developed in various fields, often independently. In statistics, going from particular observations to general descriptions is called inference, learning [i.e., using example (training) data] is called estimation, and classification is known as discriminant analysis (McLachlan 1992). In engineering, classification is called pattern recognition and the approach is nonparametric and much more empirical (Duda et al. 2001). Other approaches have their origins in machine learning (Alpaydin 2010), artificial intelligence (Russell and Norvig 2002), artificial neural networks (Bishop 2006), and data mining (Han and Kamber 2006). We will incorporate techniques from these different emphases to give a more unified treatment (Fig. 1.2).

Fig. 1.1 Classification when the classes are (a) known and (b) unknown beforehand
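As a concrete, if simplified, illustration of the distinction drawn in Fig. 1.1, the short sketch below contrasts the two settings in code. It is not taken from the book: it assumes Python with the scikit-learn library, and the two-feature values are made-up toy data chosen only for illustration.

```python
# Minimal sketch (assumed Python/scikit-learn, toy data) of the two settings in Fig. 1.1:
# (a) classes known beforehand, so a classifier learns from labeled examples;
# (b) classes unknown, so clustering forms the groups by inspecting the objects.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

# Six toy objects described by two features, e.g. (size, elongation)
X = np.array([[1.0, 0.2], [1.1, 0.3], [0.9, 0.25],   # small, roundish objects
              [3.0, 1.5], [3.2, 1.4], [2.9, 1.6]])   # large, elongated objects

# (a) Known classes: labels are supplied with the training data
y = np.array([0, 0, 0, 1, 1, 1])                     # class labels given in advance
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(clf.predict([[1.05, 0.22]]))                   # a new object is assigned to class 0

# (b) Unknown classes: no labels; clustering creates the groups from the data alone
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)                                    # group membership discovered by the algorithm
```

In the first case the class memberships of the training examples are provided and the task is to assign new objects to those known classes (supervised learning, Chaps. 5 and 6); in the second, no labels are available and the groups themselves must be discovered (unsupervised learning, Chap. 8).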