• hat-feature: this binary feature indicates whether a delayed stroke has been removed at the same horizontal position as the considered point.
• speed: the velocity is computed before resampling and then interpolated.
• x-coordinate: the x-position is taken after high-pass filtering, i.e., after subtracting a moving average from the real horizontal position.
• y-coordinate: this feature represents the vertical position of the point after normalization.
• writing direction: here we have a pair of features, given by the cosine and sine of the angle between the line segment starting at the point and the x-axis.
• curvature: similarly to the writing direction, this is a pair of features, given by the cosine and sine of the angle between the lines to the previous and the next point.
• vicinity aspect: this feature is equal to the aspect of the trajectory in the vicinity of the point (see Fig. 2).
• vicinity slope: this pair of features is given by the cosine and sine of the angle of the straight line from the first to the last vicinity point (see Fig. 2).
• vicinity curliness: this feature is defined as the length of the trajectory in the vicinity divided by max(x(t); y(t)) (see Fig. 2).
• vicinity linearity: here we use the average squared distance d² of each point in the vicinity to the straight line from the first to the last vicinity point (see Fig. 2).

A sketch of how some of these point-based features can be computed is given below.
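As a concrete illustration of the point-based features above, the following sketch shows how the writing direction, curvature, vicinity slope, vicinity curliness, and vicinity linearity could be computed for one point of the resampled trajectory. It is a minimal reading of the definitions, not the original implementation; the vicinity size and the interpretation of max(x(t); y(t)) as the larger side of the vicinity's bounding box are assumptions.

```python
import numpy as np

def point_features(points, i, vicinity=5):
    """Illustrative computation of some point-based on-line features for an
    interior point i of a resampled trajectory (points: N x 2 array of x, y)."""
    p_prev, p, p_next = points[i - 1], points[i], points[i + 1]

    # writing direction: cosine/sine of the angle between the segment
    # leaving the current point and the x-axis
    dx, dy = p_next - p
    seg = np.hypot(dx, dy) + 1e-8
    cos_dir, sin_dir = dx / seg, dy / seg

    # curvature: cosine/sine of the angle between the segments to the
    # previous and to the next point
    v1, v2 = p - p_prev, p_next - p
    n1, n2 = np.hypot(*v1) + 1e-8, np.hypot(*v2) + 1e-8
    cos_curv = (v1 @ v2) / (n1 * n2)
    sin_curv = (v1[0] * v2[1] - v1[1] * v2[0]) / (n1 * n2)

    # vicinity of the current point (window size is an assumption)
    vic = points[max(0, i - vicinity): i + vicinity + 1]
    chord = vic[-1] - vic[0]                    # first to last vicinity point
    c_norm = np.hypot(*chord) + 1e-8
    cos_slope, sin_slope = chord / c_norm       # vicinity slope

    # vicinity curliness: trajectory length divided by the larger side
    # of the vicinity's bounding box
    traj_len = np.sum(np.hypot(*np.diff(vic, axis=0).T))
    curliness = traj_len / (max(np.ptp(vic[:, 0]), np.ptp(vic[:, 1])) + 1e-8)

    # vicinity linearity: mean squared distance of the vicinity points
    # to the straight line from the first to the last vicinity point
    normal = np.array([-chord[1], chord[0]]) / c_norm
    linearity = np.mean(((vic - vic[0]) @ normal) ** 2)

    return [cos_dir, sin_dir, cos_curv, sin_curv,
            cos_slope, sin_slope, curliness, linearity]
```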
Fig. 3 Pseudo offline features

The features of the second class are all computed using a two-dimensional matrix B representing the offline version of the data. For each position the number of points on the trajectory of the strokes is stored. This can be seen as a low-resolution image of the handwritten data. The following features are used:

• ascenders/descenders: these two features count the number of points above the corpus line (ascenders) and below the baseline (descenders). Only points which have an x-coordinate in the vicinity of the current point are considered. Additionally, the points must have a minimal distance to the lines to be considered as part of an ascender or descender. The corresponding distances are set to a predefined fraction of the corpus height.
• context map: the two-dimensional vicinity of the current point is transformed to a 3×3 map. The number of black points in each region is taken as a feature value. So we obtain altogether nine features of this type.

2.3 Our Offline System

To extract the feature vectors from the offline images, a sliding window approach is used. The width of the window is one pixel, and nine geometrical features are computed at each window position. Each text line image is therefore converted to a sequence of 9-dimensional vectors. The nine features are as follows:

• The mean gray value of the pixels
• The center of gravity of the pixels
• The second order vertical moment of the center of gravity
• The positions of the uppermost and lowermost black pixels
• The rate of change of these positions (with respect to the neighboring windows)
• The number of black-white transitions between the uppermost and lowermost pixels
• The proportion of black pixels between the uppermost and lowermost pixels

For a more detailed description of the offline features, see [17]. A simplified sketch of this sliding-window extraction is given below.
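The following is a minimal sketch of how such a one-pixel-wide sliding window could be turned into a per-column feature vector. It computes a representative subset of the nine features (mean gray value, center of gravity and its second-order vertical moment, uppermost and lowermost black pixels, black-white transitions, and black-pixel proportion); the binarization threshold and all names are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def sliding_window_features(image, black_threshold=128):
    """Illustrative per-column features for a grayscale text-line image
    (2-D array, 0 = black ink, 255 = white background)."""
    height, width = image.shape
    rows = np.arange(height)
    features = []
    for x in range(width):
        col = image[:, x].astype(float)
        black = col < black_threshold            # binary mask of 'ink' pixels

        mean_gray = col.mean()
        if black.any():
            # center of gravity of the ink and its second-order vertical moment
            weights = black.astype(float)
            cog = (rows * weights).sum() / weights.sum()
            moment2 = ((rows - cog) ** 2 * weights).sum() / weights.sum()
            # uppermost/lowermost black pixels and the profile between them
            top, bottom = rows[black][0], rows[black][-1]
            span = col[top:bottom + 1] < black_threshold
            transitions = np.count_nonzero(np.diff(span.astype(int)) != 0)
            proportion = span.mean()
        else:
            cog = moment2 = transitions = proportion = 0.0
            top, bottom = 0, 0

        features.append([mean_gray, cog, moment2, top, bottom,
                         transitions, proportion])
    return np.array(features)
```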
In the next phase, indicated in Fig. 1, a classification system is applied which generates a list of candidates or even a recognition lattice. This step and the last step, the postprocessing, are described in the next section.

3 Neural Network Based Recognition

The main focus of this chapter is the recently introduced neural network classifier based on CTC combined with Bidirectional or Multidimensional LSTM. This section describes the different aspects of the architecture and gives brief insights into the algorithms behind it.

3.1 Recurrent Neural Networks (RNNs)

Recurrent neural networks (RNNs) are a connectionist model containing a self-connected hidden layer. One benefit of the recurrent connection is that a 'memory' of previous inputs remains in the network's internal state, allowing it to make use of past context. Context plays an important role in handwriting recognition, as illustrated in Figure 4. Another important advantage of recurrency is that the rate of change of the internal state can be finely modulated by the recurrent weights, which builds in robustness to localized distortions of the input data.

Fig. 4 Importance of context. The characters "ur" would be hard to recognize without the context of the word "entourage".

3.2 Long Short-Term Memory (LSTM)

Unfortunately, the range of contextual information that standard RNNs can access is quite limited. The problem is that the influence of a given input on the hidden layer, and therefore on the network output, either decays or blows up exponentially as it cycles around the network's recurrent connections and is repeatedly scaled by the connection weights. In practice this shortcoming (referred to in the literature as the vanishing gradient problem) makes it hard for an RNN to bridge gaps of more than about 10 time steps between relevant input and target events.
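To make the recurrence concrete, here is a minimal sketch (an illustration only; the layer sizes, names, and the choice of tanh as squashing function are assumptions) of a single self-connected hidden layer. At every time step the activation depends on the current input and, through the recurrent weights, on all earlier inputs; this repeated scaling by the recurrent weights is exactly what gives rise to the vanishing gradient problem discussed above.

```python
import numpy as np

def rnn_forward(X, W_i, W_h, f=np.tanh):
    """Minimal recurrent hidden layer.
    X:   (T, n_in) input sequence
    W_i: (n_hidden, n_in) input weights
    W_h: (n_hidden, n_hidden) recurrent weights"""
    T = X.shape[0]
    n_hidden = W_h.shape[0]
    B = np.zeros((T, n_hidden))          # hidden-layer outputs b^t
    b_prev = np.zeros(n_hidden)
    for t in range(T):
        a = W_i @ X[t] + W_h @ b_prev    # a^t = W_i x^t + W_h b^{t-1}
        B[t] = f(a)                      # b^t = f(a^t)
        b_prev = B[t]
    return B
```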
Long Short-Term Memory (LSTM) is an RNN architecture specifically designed to address the vanishing gradient problem. An LSTM hidden layer consists of multiple recurrently connected subnets, known as memory blocks. Each block contains a set of internal units, or cells, whose activation is controlled by three multiplicative units: the input gate, the forget gate, and the output gate. Figure 5 provides a detailed illustration of an LSTM memory block with a single cell.

The effect of the gates is to allow the cells to store and access information over long periods of time. For example, as long as the input gate remains closed (i.e., has an activation close to 0), the activation of the cell will not be overwritten by the new inputs arriving in the network. Similarly, the cell activation is only available to the rest of the network when the output gate is open, and the cell's recurrent connection is switched on and off by the forget gate.

Fig. 5 LSTM memory block with one cell

The mathematical background of LSTM is described in depth in [40, 41, 34]. A short description follows. A conventional recurrent multi-layer perceptron (MLP) network contains a hidden layer in which all neurons are fully connected with all neurons of the same layer (the recurrent connections). The activation of a single cell at timestamp t is a weighted sum of the inputs x_i^t plus the weighted sum of the outputs of the previous timestamp, b_h^{t-1}. This can be expressed as follows (or, on the right, in matrix form):

a^t = \sum_i w_i x_i^t + \sum_h w_h b_h^{t-1} \qquad\Longleftrightarrow\qquad A^t = W_i \cdot X^t + W_h \cdot B^{t-1}

b^t = f(a^t) \qquad\Longleftrightarrow\qquad B^t = f(A^t)

Since the outputs of the previous timestamp are themselves obtained by applying the squashing function to the corresponding cell activations, the influence of the network input from the previous timestamp can be considered smaller, as it has already been weighted a second time. Thus the overall network activation can be roughly rewritten as:

A^t = g\left(X^t,\, W_h X^{t-1},\, W_h^2 X^{t-2},\, \ldots,\, W_h^n X^{t-n}\right)

where X^t is the overall net input at timestamp t and W_h is the weight matrix of the hidden layer. Note that for the sake of clarity we use this abbreviated form of the complete formula, in which the input weights do not appear explicitly (they are hidden in the function g(...)). This formula reveals that the influence of earlier timestamps t-n vanishes rapidly, as the time difference n appears in the exponent of the weight matrix. Since all values of the weight matrix W_h are smaller than 1 in absolute value, the n-th power of W_h is close to zero.
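The following small numeric sketch (an illustration, not taken from the chapter; the matrix size and the range of the random weights are assumptions) shows this effect directly: for a recurrent weight matrix with small entries, the factor W_h^n that scales the contribution of an input lying n steps in the past quickly approaches zero.

```python
import numpy as np

rng = np.random.default_rng(0)
W_h = rng.uniform(-0.5, 0.5, size=(4, 4))   # recurrent weights with |w| < 1

# The norm of W_h^n is (roughly) the factor scaling the contribution of the
# input from n time steps in the past, cf. A^t = g(..., W_h^n X^{t-n}).
for n in (1, 5, 10, 20):
    print(n, np.linalg.norm(np.linalg.matrix_power(W_h, n)))
# For such a matrix the printed norms shrink roughly geometrically with n,
# so inputs more than a few steps in the past barely influence the
# current activation.
```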
Introducing the LSTM cell brings in three new gating units, which all receive the weighted sum of the outputs of the hidden layer at the previous timestamp as an input. For the input gate:

a_\iota^t = W_{i,\iota} \cdot X^t + W_{h,\iota} \cdot B^{t-1} + w_{c,\iota}\, s_c^{t-1}

where s_c^{t-1} is the cell state of the previous timestamp and W_{i,\iota} and W_{h,\iota} are the weights for the current net input and for the hidden-layer output of the previous timestamp, respectively. The activation of the forget gate is:

a_\theta^t = W_{i,\theta} \cdot X^t + W_{h,\theta} \cdot B^{t-1} + w_{c,\theta}\, s_c^{t-1}

which is the same formula, just with other weights (those trained for the forget gate). The cell activation is usually calculated by:

a_c^t = W_{i,c} \cdot X^t + W_{h,c} \cdot B^{t-1}

However, the cell state is then weighted with the outputs of the two gate cells:

s_c^t = \sigma(a_\iota^t)\, g(a_c^t) + \sigma(a_\theta^t)\, s_c^{t-1}

where \sigma indicates that the sigmoid function is used as a squashing function for the gates and g(\cdot) is the cell's activation function. As the sigmoid function often returns a value close to zero or one, the formula can be interpreted as:

s_c^t = [0\ \text{or}\ 1]\, g(a_c^t) + [0\ \text{or}\ 1]\, s_c^{t-1}

or, in words: the cell state depends either on the input activation (if the input gate opens, i.e., the first factor is close to 1) or on the previous cell state (if the forget gate opens, i.e., the second factor is close to one). This particular property enables the LSTM cell to bridge over long time periods. The value of the output gate is calculated similarly to the other gates, i.e.:

a_\omega^t = W_{i,\omega} \cdot X^t + W_{h,\omega} \cdot B^{t-1} + w_{c,\omega}\, s_c^t

and the final cell output is:

b_c^t = \sigma(a_\omega^t)\, h(s_c^t)

which again is either close to zero or the usual output of the cell h(\cdot).
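To tie the above formulas together, the following sketch implements one forward step of a single-cell LSTM memory block with peephole weights, reading the equations literally. It is an illustrative interpretation rather than the authors' implementation; the choice of tanh for both g(·) and h(·), the weight names, and the dimensions in the usage example are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x, b_prev, s_prev, p):
    """One timestep of a single-cell LSTM memory block.
    x: net input X^t, b_prev: hidden-layer output B^{t-1},
    s_prev: previous cell state s_c^{t-1}, p: dictionary of weights."""
    # input gate:  a_iota^t  = W_{i,iota} X^t + W_{h,iota} B^{t-1} + w_{c,iota} s_c^{t-1}
    a_in = p['W_xi'] @ x + p['W_hi'] @ b_prev + p['w_ci'] * s_prev
    # forget gate: a_theta^t = same form, with its own trained weights
    a_fg = p['W_xf'] @ x + p['W_hf'] @ b_prev + p['w_cf'] * s_prev
    # cell input:  a_c^t = W_{i,c} X^t + W_{h,c} B^{t-1}
    a_c = p['W_xc'] @ x + p['W_hc'] @ b_prev
    # cell state:  s_c^t = sigma(a_iota^t) g(a_c^t) + sigma(a_theta^t) s_c^{t-1}
    s = sigmoid(a_in) * np.tanh(a_c) + sigmoid(a_fg) * s_prev
    # output gate: its peephole weight sees the *current* cell state s_c^t
    a_out = p['W_xo'] @ x + p['W_ho'] @ b_prev + p['w_co'] * s
    # final cell output: b_c^t = sigma(a_omega^t) h(s_c^t)
    b = sigmoid(a_out) * np.tanh(s)
    return b, s

# Purely illustrative usage with random weights (sizes are arbitrary):
rng = np.random.default_rng(1)
n_in, n_hidden = 3, 4
p = {k: rng.standard_normal(n_in) for k in ('W_xi', 'W_xf', 'W_xc', 'W_xo')}
p.update({k: rng.standard_normal(n_hidden) for k in ('W_hi', 'W_hf', 'W_hc', 'W_ho')})
p.update({k: rng.standard_normal() for k in ('w_ci', 'w_cf', 'w_co')})
b, s = lstm_cell_step(rng.standard_normal(n_in), rng.standard_normal(n_hidden), 0.0, p)
```

When the input gate activation sigmoid(a_in) is close to 0, the new cell input contributes almost nothing to s, so the stored value is preserved, exactly as described in the text above.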