CiteSeer: http://citeseer.ist.psu.edu/
NIPS: http://books.nips.cc/

I enjoyed reading papers with innovative ideas. In understanding these papers and incorporating them into my book, they were especially helpful when they were well written: when the abstract and the introduction explained the proposed methods by their approaches and ideas rather than by their characteristics, and when the main text explained the ideas behind the algorithms before describing them in detail.

Kobe, October 2004, October 2009
Shigeo Abe

References

1. V. N. Vapnik. The Nature of Statistical Learning Theory. Springer-Verlag, New York, 1995.
2. V. N. Vapnik. Statistical Learning Theory. John Wiley & Sons, New York, 1998.
3. R. Herbrich. Learning Kernel Classifiers: Theory and Algorithms. MIT Press, Cambridge, MA, 2002.
4. B. Schölkopf and A. J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, Cambridge, MA, 2002.
5. S. Young and T. Downs. CARVE—A constructive algorithm for real-valued examples. IEEE Transactions on Neural Networks, 9(6):1180–1190, 1998.
6. S. Abe. Pattern Classification: Neuro-Fuzzy Methods and Their Comparison. Springer-Verlag, London, UK, 2001.
Contents

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
  1.1 Decision Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
    1.1.1 Decision Functions for Two-Class Problems . . . . . . . . . . . . 2
    1.1.2 Decision Functions for Multiclass Problems . . . . . . . . . . . . 4
  1.2 Determination of Decision Functions . . . . . . . . . . . . . . . . . . 8
  1.3 Data Sets Used in the Book . . . . . . . . . . . . . . . . . . . . . . . 9
  1.4 Classifier Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . 13
  References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2 Two-Class Support Vector Machines . . . . . . . . . . . . . . . . . . . . . 21
  2.1 Hard-Margin Support Vector Machines . . . . . . . . . . . . . . . . . . 21
  2.2 L1 Soft-Margin Support Vector Machines . . . . . . . . . . . . . . . . . 28
  2.3 Mapping to a High-Dimensional Space . . . . . . . . . . . . . . . . . . 31
    2.3.1 Kernel Tricks . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
    2.3.2 Kernels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
    2.3.3 Normalizing Kernels . . . . . . . . . . . . . . . . . . . . . . . . 43
    2.3.4 Properties of Mapping Functions Associated with Kernels . . . . . 44
    2.3.5 Implicit Bias Terms . . . . . . . . . . . . . . . . . . . . . . . . 47
    2.3.6 Empirical Feature Space . . . . . . . . . . . . . . . . . . . . . . 50
  2.4 L2 Soft-Margin Support Vector Machines . . . . . . . . . . . . . . . . . 56
  2.5 Advantages and Disadvantages . . . . . . . . . . . . . . . . . . . . . . 58
    2.5.1 Advantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
    2.5.2 Disadvantages . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
  2.6 Characteristics of Solutions . . . . . . . . . . . . . . . . . . . . . . 60
    2.6.1 Hessian Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . 60
    2.6.2 Dependence of Solutions on C . . . . . . . . . . . . . . . . . . . 62
    2.6.3 Equivalence of L1 and L2 Support Vector Machines . . . . . . . . . 67
    2.6.4 Nonunique Solutions . . . . . . . . . . . . . . . . . . . . . . . . 70
    2.6.5 Reducing the Number of Support Vectors . . . . . . . . . . . . . . 78
    2.6.6 Degenerate Solutions . . . . . . . . . . . . . . . . . . . . . . . 81
    2.6.7 Duplicate Copies of Data . . . . . . . . . . . . . . . . . . . . . 83
    2.6.8 Imbalanced Data . . . . . . . . . . . . . . . . . . . . . . . . . . 85
    2.6.9 Classification for the Blood Cell Data . . . . . . . . . . . . . . 85
  2.7 Class Boundaries for Different Kernels . . . . . . . . . . . . . . . . . 88
  2.8 Developing Classifiers . . . . . . . . . . . . . . . . . . . . . . . . . 93
    2.8.1 Model Selection . . . . . . . . . . . . . . . . . . . . . . . . . . 93
    2.8.2 Estimating Generalization Errors . . . . . . . . . . . . . . . . . 93
    2.8.3 Sophistication of Model Selection . . . . . . . . . . . . . . . . . 97
    2.8.4 Effect of Model Selection by Cross-Validation . . . . . . . . . . . 98
  2.9 Invariance for Linear Transformation . . . . . . . . . . . . . . . . . . 102
  References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

3 Multiclass Support Vector Machines . . . . . . . . . . . . . . . . . . . . . 113
  3.1 One-Against-All Support Vector Machines . . . . . . . . . . . . . . . . 114
    3.1.1 Conventional Support Vector Machines . . . . . . . . . . . . . . . 114
    3.1.2 Fuzzy Support Vector Machines . . . . . . . . . . . . . . . . . . . 116
    3.1.3 Equivalence of Fuzzy Support Vector Machines and Support Vector
          Machines with Continuous Decision Functions . . . . . . . . . . . 119
    3.1.4 Decision-Tree-Based Support Vector Machines . . . . . . . . . . . . 122
  3.2 Pairwise Support Vector Machines . . . . . . . . . . . . . . . . . . . . 127
    3.2.1 Conventional Support Vector Machines . . . . . . . . . . . . . . . 127
    3.2.2 Fuzzy Support Vector Machines . . . . . . . . . . . . . . . . . . . 128
    3.2.3 Performance Comparison of Fuzzy Support Vector Machines . . . . . . 129
    3.2.4 Cluster-Based Support Vector Machines . . . . . . . . . . . . . . . 132
    3.2.5 Decision-Tree-Based Support Vector Machines . . . . . . . . . . . . 133
    3.2.6 Pairwise Classification with Correcting Classifiers . . . . . . . . 143
  3.3 Error-Correcting Output Codes . . . . . . . . . . . . . . . . . . . . . 144
    3.3.1 Output Coding by Error-Correcting Codes . . . . . . . . . . . . . . 145
    3.3.2 Unified Scheme for Output Coding . . . . . . . . . . . . . . . . . 146
    3.3.3 Equivalence of ECOC with Membership Functions . . . . . . . . . . . 147
    3.3.4 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . 147
  3.4 All-at-Once Support Vector Machines . . . . . . . . . . . . . . . . . . 149
  3.5 Comparisons of Architectures . . . . . . . . . . . . . . . . . . . . . . 152
    3.5.1 One-Against-All Support Vector Machines . . . . . . . . . . . . . . 152
    3.5.2 Pairwise Support Vector Machines . . . . . . . . . . . . . . . . . 152
    3.5.3 ECOC Support Vector Machines . . . . . . . . . . . . . . . . . . . 153
    3.5.4 All-at-Once Support Vector Machines . . . . . . . . . . . . . . . . 153
    3.5.5 Training Difficulty . . . . . . . . . . . . . . . . . . . . . . . . 153
    3.5.6 Training Time Comparison . . . . . . . . . . . . . . . . . . . . . 157
  References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158
4 Variants of Support Vector Machines . . . . . . . . . . . . . . . . . . . . 163
  4.1 Least-Squares Support Vector Machines . . . . . . . . . . . . . . . . . 163
    4.1.1 Two-Class Least-Squares Support Vector Machines . . . . . . . . . . 164
    4.1.2 One-Against-All Least-Squares Support Vector Machines . . . . . . . 166
    4.1.3 Pairwise Least-Squares Support Vector Machines . . . . . . . . . . 168
    4.1.4 All-at-Once Least-Squares Support Vector Machines . . . . . . . . . 169
    4.1.5 Performance Comparison . . . . . . . . . . . . . . . . . . . . . . 170
  4.2 Linear Programming Support Vector Machines . . . . . . . . . . . . . . . 174
    4.2.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
    4.2.2 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . 178
  4.3 Sparse Support Vector Machines . . . . . . . . . . . . . . . . . . . . . 180
    4.3.1 Several Approaches for Sparse Support Vector Machines . . . . . . . 181
    4.3.2 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
    4.3.3 Support Vector Machines Trained in the Empirical Feature Space . . 184
    4.3.4 Selection of Linearly Independent Data . . . . . . . . . . . . . . 187
    4.3.5 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . 189
  4.4 Performance Comparison of Different Classifiers . . . . . . . . . . . . 192
  4.5 Robust Support Vector Machines . . . . . . . . . . . . . . . . . . . . . 196
  4.6 Bayesian Support Vector Machines . . . . . . . . . . . . . . . . . . . . 197
    4.6.1 One-Dimensional Bayesian Decision Functions . . . . . . . . . . . . 199
    4.6.2 Parallel Displacement of a Hyperplane . . . . . . . . . . . . . . . 200
    4.6.3 Normal Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
  4.7 Incremental Training . . . . . . . . . . . . . . . . . . . . . . . . . . 201
    4.7.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
    4.7.2 Incremental Training Using Hyperspheres . . . . . . . . . . . . . . 204
  4.8 Learning Using Privileged Information . . . . . . . . . . . . . . . . . 213
  4.9 Semi-Supervised Learning . . . . . . . . . . . . . . . . . . . . . . . . 216
  4.10 Multiple Classifier Systems . . . . . . . . . . . . . . . . . . . . . . 217
  4.11 Multiple Kernel Learning . . . . . . . . . . . . . . . . . . . . . . . 218
  4.12 Confidence Level . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
  4.13 Visualization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
  References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220

5 Training Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
  5.1 Preselecting Support Vector Candidates . . . . . . . . . . . . . . . . . 227
    5.1.1 Approximation of Boundary Data . . . . . . . . . . . . . . . . . . 228
    5.1.2 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . 230
  5.2 Decomposition Techniques . . . . . . . . . . . . . . . . . . . . . . . . 231
  5.3 KKT Conditions Revisited . . . . . . . . . . . . . . . . . . . . . . . . 234
  5.4 Overview of Training Methods . . . . . . . . . . . . . . . . . . . . . . 239
  5.5 Primal–Dual Interior-Point Methods . . . . . . . . . . . . . . . . . . . 242
    5.5.1 Primal–Dual Interior-Point Methods for Linear Programming . . . . . 242
    5.5.2 Primal–Dual Interior-Point Methods for Quadratic Programming . . . 246
    5.5.3 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . 248
  5.6 Steepest Ascent Methods and Newton's Methods . . . . . . . . . . . . . . 252
    5.6.1 Solving Quadratic Programming Problems Without Constraints . . . . 252
    5.6.2 Training of L1 Soft-Margin Support Vector Machines . . . . . . . . 254
    5.6.3 Sequential Minimal Optimization . . . . . . . . . . . . . . . . . . 259
    5.6.4 Training of L2 Soft-Margin Support Vector Machines . . . . . . . . 260
    5.6.5 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . 261
  5.7 Batch Training by Exact Incremental Training . . . . . . . . . . . . . . 262
    5.7.1 KKT Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . 263
    5.7.2 Training by Solving a Set of Linear Equations . . . . . . . . . . . 264
    5.7.3 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . 272
  5.8 Active Set Training in Primal and Dual . . . . . . . . . . . . . . . . . 273
    5.8.1 Training Support Vector Machines in the Primal . . . . . . . . . . 273
    5.8.2 Comparison of Training Support Vector Machines in the Primal
          and the Dual . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
    5.8.3 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . 279
  5.9 Training of Linear Programming Support Vector Machines . . . . . . . . . 281
    5.9.1 Decomposition Techniques . . . . . . . . . . . . . . . . . . . . . 282
    5.9.2 Decomposition Techniques for Linear Programming Support Vector
          Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
    5.9.3 Computer Experiments . . . . . . . . . . . . . . . . . . . . . . . 297
  References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299

6 Kernel-Based Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
  6.1 Kernel Least Squares . . . . . . . . . . . . . . . . . . . . . . . . . . 305
    6.1.1 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
    6.1.2 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . 308
  6.2 Kernel Principal Component Analysis . . . . . . . . . . . . . . . . . . 311
  6.3 Kernel Mahalanobis Distance . . . . . . . . . . . . . . . . . . . . . . 314
    6.3.1 SVD-Based Kernel Mahalanobis Distance . . . . . . . . . . . . . . . 315
    6.3.2 KPCA-Based Mahalanobis Distance . . . . . . . . . . . . . . . . . . 318
  6.4 Principal Component Analysis in the Empirical Feature Space . . . . . . 319
  6.5 Kernel Discriminant Analysis . . . . . . . . . . . . . . . . . . . . . . 320
    6.5.1 Kernel Discriminant Analysis for Two-Class Problems . . . . . . . . 321
    6.5.2 Linear Discriminant Analysis for Two-Class Problems in the
          Empirical Feature Space . . . . . . . . . . . . . . . . . . . . . . 324
    6.5.3 Kernel Discriminant Analysis for Multiclass Problems . . . . . . . 325
  References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327