An introduction to neural networks

Kevin Gurney
University of Sheffield

Routledge
London and New York

© Kevin Gurney 1997

This book is copyright under the Berne Convention. No reproduction without permission. All rights reserved.

First published in 1997 by UCL Press
UCL Press Limited
11 New Fetter Lane
London EC4P 4EE
Contents

Preface

1 Neural networks—an overview
1.1 What are neural networks?
1.2 Why study neural networks?
1.3 Summary
1.4 Notes

2 Real and artificial neurons
2.1 Real neurons: a review
2.2 Artificial neurons: the TLU
2.3 Resilience to noise and hardware failure
2.4 Non-binary signal communication
2.5 Introducing time
2.6 Summary
2.7 Notes

3 TLUs, linear separability and vectors
3.1 Geometric interpretation of TLU action
3.2 Vectors
3.3 TLUs and linear separability revisited
3.4 Summary
3.5 Notes
4 Training TLUs: the perceptron rule
4.1 Training networks
4.2 Training the threshold as a weight
4.3 Adjusting the weight vector
4.4 The perceptron
4.5 Multiple nodes and layers
4.6 Some practical matters
4.7 Summary
4.8 Notes

5 The delta rule
5.1 Finding the minimum of a function: gradient descent
5.2 Gradient descent on an error
5.3 The delta rule
5.4 Watching the delta rule at work
5.5 Summary

6 Multilayer nets and backpropagation
6.1 Training rules for multilayer nets
6.2 The backpropagation algorithm
6.3 Local versus global minima
6.4 The stopping criterion
6.5 Speeding up learning: the momentum term
6.6 More complex nets
6.7 The action of well-trained nets
6.8 Taking stock
6.9 Generalization and overtraining
6.10 Fostering generalization
6.11 Applications
6.12 Final remarks
6.13 Summary
6.14 Notes

7 Associative memories: the Hopfield net
7.1 The nature of associative memory
7.2 Neural networks and associative memory
7.3 A physical analogy with memory
7.4 The Hopfield net
7.5 Finding the weights
7.6 Storage capacity
7.7 The analogue Hopfield model
7.8 Combinatorial optimization
7.9 Feedforward and recurrent associative nets
7.10 Summary
7.11 Notes

8 Self-organization
8.1 Competitive dynamics
8.2 Competitive learning
8.3 Kohonen's self-organizing feature maps
8.4 Principal component analysis
8.5 Further remarks
8.6 Summary
8.7 Notes

9 Adaptive resonance theory: ART
9.1 ART's objectives
9.2 A hierarchical description of networks
9.3 ART1
9.4 The ART family
9.5 Applications
9.6 Further remarks
9.7 Summary
9.8 Notes

10 Nodes, nets and algorithms: further alternatives
10.1 Synapses revisited
10.2 Sigma-pi units
10.3 Digital neural networks
10.4 Radial basis functions
10.5 Learning by exploring the environment