Geoff Dougherty
Pattern Recognition and Classification
An Introduction
Extra materials: extras.springer.com
Springer
Geoff Dougherty
Applied Physics and Medical Imaging
California State University, Channel Islands
Camarillo, CA, USA
Please note that additional material for this book can be downloaded from http://extras.springer.com
ISBN 978-1-4614-5322-2 ISBN 978-1-4614-5323-9 (eBook)
DOI 10.1007/978-1-4614-5323-9
Springer New York Heidelberg Dordrecht London
Library of Congress Control Number: 2012949108
© Springer Science+Business Media New York 2013
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein. Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
Preface
The use of pattern recognition and classification is fundamental to many of the automated electronic systems in use today. Its applications range from military defense to medical diagnosis, from biometrics to machine learning, from bioinformatics to home entertainment, and more. However, despite the existence of a number of notable books in the field, the subject remains very challenging, especially for the beginner.
We have found that the current textbooks are not completely satisfactory for our students, who are primarily computer science students but also include students from mathematics and physics backgrounds and those from industry. Their mathematical and computer backgrounds are considerably varied, but they all want to understand and absorb the core concepts with a minimal time investment to the point where they can use and adapt them to problems in their own fields. Texts with extensive mathematical or statistical prerequisites were daunting and unappealing to them. Our students complained of “not seeing the wood for the trees,” which is rather ironic for textbooks in pattern recognition. It is crucial for newcomers to the field to be introduced to the key concepts at a basic level in an ordered, logical fashion, so that they appreciate the “big picture”; they can then handle progressively more detail, building on prior knowledge, without being overwhelmed. Too often our students have dipped into various textbooks to sample different approaches but have ended up confused by the different terminologies in use.
We have noticed that the majority of our students are very comfortable with and respond well to visual learning, building on their often limited entry knowledge, but focusing on key concepts illustrated by practical examples and exercises. We believe that a more visual presentation and the inclusion of worked examples promote a greater understanding and insight and appeal to a wider audience.
This book began as notes and lecture slides for a senior undergraduate course and a graduate course in Pattern Recognition at California State University Channel Islands (CSUCI). Over time it grew and approached its current form, which has been class tested over several years at CSUCI. It is suitable for a wide range of students at the advanced undergraduate or graduate level. It assumes only a modest
background in statistics and mathematics, with the necessary additional material integrated into the text so that the book is essentially self-contained.
The book is suitable both for individual study and for classroom use for students in physics, computer science, computer engineering, electronic engineering, biomedical engineering, and applied mathematics taking senior undergraduate and graduate courses in pattern recognition and machine learning. It presents a comprehensive introduction to the core concepts that must be understood in order to make independent contributions to the field. It is designed to be accessible to newcomers from varied backgrounds, but it will also be useful to researchers and professionals in image and signal processing and analysis, and in computer vision. The goal is to present the fundamental concepts of supervised and unsupervised classification in an informal, rather than axiomatic, treatment so that the reader can quickly acquire the necessary background for applying the concepts to real problems. A final chapter indicates some useful and accessible projects which may be undertaken.
We use ImageJ (http://rsbweb.nih.gov/ij/) and the related distribution, Fiji (http://fiji.sc/wiki/index.php/Fiji), in the early stages of image exploration and analysis, because of their intuitive interface and ease of use. We then tend to move on to MATLAB for its extensive capabilities in manipulating matrices and its image processing and statistics toolboxes. We recommend using an attractive GUI called DipImage (from http://www.diplib.org/download) to avoid much of the command line typing when manipulating images. There are also classification toolboxes available for MATLAB, such as Classification Toolbox (http://www.wiley.com/WileyCDA/Section/id-105036.html, which requires a password obtainable from the associated computer manual) and PRTools (http://www.prtools.org/download.html). We use the Classification Toolbox in Chap. 8 and recommend it highly for its intuitive GUI. Some of our students have explored Weka, a collection of machine learning algorithms for solving data mining problems implemented in Java and open sourced (http://www.cs.waikato.ac.nz/ml/weka/index_downloading.html).
There are a number of additional resources, which can be downloaded from the companion Web site for this book at http://extras.springer.com/, including several useful Excel files and data files. Lecturers who adopt the book can also obtain access to the end-of-chapter exercises.
In spite of our best efforts at proofreading, it is still possible that some typos may have survived. Please notify me if you find any.
I have very much enjoyed writing this book; I hope you enjoy reading it!
Camarillo, CA
Geoff Dougherty
Acknowledgments
I would like to thank my colleague Matthew Wiers for many useful conversations and for helping with several of the Excel files bundled with the book. And thanks to all my previous students for their feedback on the courses which eventually led to this book; especially to Brandon Ausmus, Elisabeth Perkins, Michelle Moeller, Charles Walden, Shawn Richardson, and Ray Alfano.
I am grateful to Chris Coughlin at Springer for his support and encouragement throughout the process of writing the book and to various anonymous reviewers who have critiqued the manuscript and trialed it with their classes.
Special thanks go to my wife Hajijah and family (Daniel, Adeline, and Nadia) for their patience and support, and to my parents, Maud and Harry (who passed away in 2009), without whom this would never have happened.