"E-business" course reading: Application of Spreading Activation Techniques in Information Retrieval

APPLICATION OF SPREADING ACTIVATION TECHNIQUES IN IR 459

[…]tion that there exist statistically determinable relations among terms, among documents, and between documents and terms. These associations can be represented in a similarity matrix. Quantitative evaluations of similarity between terms, for example, can be obtained by statistical analysis of term co-occurrence in documents. Associations between documents, based on a quantitative evaluation of their similarity, can be obtained by evaluating similarities in the term assignments to documents, or by means of citations and other bibliographic indicators.

This model rests on many heavy assumptions, and more recent studies (Preece, 1981; Salton and Buckley, 1988) have led to the conclusion that effective term expansion methods valid for a variety of different collections are difficult to generate. Moreover, IR systems based on this approach have not shown consistent improvements in effectiveness. There are several possible reasons for this result. First, the similarities statistically derived from some pairs of documents, or some pairs of terms, may be valid only locally, in the particular "environment" (or application domain) from which they are obtained. Second, most practical methods for computing the document associations assume that the terms or the documents are originally uncorrelated, i.e. independent of each other. Such an assumption is no longer accepted in many of the new research directions of IR.

Recently, these models of associative retrieval have been revised using the so-called Spreading Activation (SA) Model, which is based on supposed mechanisms of human memory operations. Originating from psychological studies (see for example (Rumelhart and Norman, 1983)), it was first introduced in Computing Science in the area of Artificial Intelligence to provide a processing framework for Semantic Networks. Its use has been praised and criticised, but it is currently adopted in many different areas, such as Cognitive Science, Databases, Artificial Intelligence, Psychology, Biology, and lately IR. The basic SA model has, however, been subject to various enhancements to make it more suitable to various application areas, and the way it is used in IR is quite different from the original formulation in the area of psychology.

In the following three sections the SA model will be described in depth. Section 4.1 will describe the "pure" SA model, which consists in the sole use of the associative nature of a network representation as a search controlling structure. In Section 4.2 some more search controlling structures will be added in order to give preference to particular associations. Section 4.3 will show how the search controlling structure can be used in an interactive way using external feedback.
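As an illustration of the term-term similarity matrix mentioned at the start of this excerpt, the following minimal sketch derives similarities from term co-occurrence in documents. The cosine-style normalisation, the function name `cooccurrence_similarity`, and the toy documents are assumptions made for the example; the text does not prescribe a particular similarity measure.

```python
from itertools import combinations
from math import sqrt


def cooccurrence_similarity(docs):
    """Return {(term_a, term_b): similarity} from a list of term lists.

    Similarity is the number of documents containing both terms, divided
    by the geometric mean of the two document frequencies (an assumed,
    cosine-style normalisation, not one taken from the text).
    """
    doc_freq = {}   # term -> number of documents containing it
    pair_freq = {}  # (term_a, term_b) -> number of documents containing both
    for terms in docs:
        unique = sorted(set(terms))
        for term in unique:
            doc_freq[term] = doc_freq.get(term, 0) + 1
        for a, b in combinations(unique, 2):
            pair_freq[(a, b)] = pair_freq.get((a, b), 0) + 1
    return {
        (a, b): n / sqrt(doc_freq[a] * doc_freq[b])
        for (a, b), n in pair_freq.items()
    }


# Hypothetical toy collection: each document is a list of index terms.
docs = [
    ["retrieval", "network", "activation"],
    ["retrieval", "network"],
    ["activation", "memory"],
]
sim = cooccurrence_similarity(docs)
```

Keys are alphabetically ordered term pairs, and pairs that never co-occur are simply absent from the result, which corresponds to a zero entry in the matrix. Because the values depend entirely on the collection supplied, the sketch also makes the "valid only locally" caveat concrete.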

462 F. CRESTANI

The input and the weight are usually real numbers; however, their numerical type is determined by the specific requirements of the application to be modelled. In particular, they can be binary values (0 or 1), excitatory/inhibitory values (+1 or -1), or real values indicating the strength of the relation between nodes. Usually the first two options are used with networks that have labelled links, such as Semantic Networks, where the semantic value of the relation represented by the link determines, in the context of the application, the value to be associated with the link. The last option is mainly used for Associative Networks, where there is only one generic type of association that needs to be weighted.

After a node has computed its input value, its output value must be determined. The numerical type of the output of a node is also determined by the requirements of the application. The two most common cases are the binary active/non-active type (0 or 1) and the real-valued type. In SA models there is usually no distinction between the "activation" and the "output" of a unit: the activation level of a unit is its output value. This is usually computed as a function of the input value:

$$O_j = f(I_j)$$

Many different functions can be used to evaluate the output; some examples are depicted in Figure 6. The most commonly used function in pure SA models is the threshold function, which determines whether node j is to be considered active or not. Applying the threshold function to the above formula in the case of binary-valued units gives:

$$O_j = \begin{cases} 0 & \text{if } I_j < k_j \\ 1 & \text{if } I_j > k_j \end{cases}$$

where $k_j$ is the threshold value for unit j. The threshold value of the activation function is application dependent and can vary from node to node; hence the notation $k_j$ for the unit threshold.
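The threshold function above can be sketched in a few lines. The function name `threshold_output` and the sample values are hypothetical. The formula leaves the case where the input equals the threshold unspecified, so the sketch assumes such a node stays inactive.

```python
def threshold_output(input_value, threshold):
    """Binary output O_j: 1 if the input I_j exceeds the threshold k_j, else 0."""
    # Assumption: I_j == k_j counts as inactive; the formula does not cover it.
    return 1 if input_value > threshold else 0


# Hypothetical per-node thresholds k_j and inputs I_j.
thresholds = {"a": 0.5, "b": 1.0}
inputs = {"a": 0.7, "b": 0.7}
outputs = {j: threshold_output(inputs[j], thresholds[j]) for j in inputs}
```

The same input activates node `a` but not node `b`, which is why the threshold carries the node index.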
After the node has computed its output value, it fires it to all the nodes connected to it, usually sending the same value to each of them. Pulse after pulse, the activation spreads over the network, reaching nodes that are far from the initially activated ones. After a predetermined number of pulses has been fired, a termination condition is checked. If the condition is satisfied, then the SA process stops; otherwise it goes on for another series of pulses. SA is therefore iterative, consisting of a sequence of pulses and termination checks. The result of the SA process is the activation level of nodes reached at termination time. The interpretation of the level of activation of each node […]
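The pulse-and-check cycle described above might look like the following minimal sketch. Several details are assumptions, because this excerpt does not specify them: the input of a node is taken to be the weighted sum of the outputs of the nodes linked to it, an active node is assumed to stay active, and the termination condition is assumed to be "the last pulse changed nothing". The function name `spread` and the toy network are hypothetical.

```python
def spread(weights, thresholds, initial, pulses_per_check=1, max_checks=10):
    """Run binary spreading activation and return {node: 0 or 1}.

    weights[i][j] is the weight of the link from node i to node j.
    thresholds[j] is the threshold k_j; it must list every node.
    initial maps the initially activated nodes to 1.
    """
    output = {j: initial.get(j, 0) for j in thresholds}
    for _ in range(max_checks):
        changed = False
        for _ in range(pulses_per_check):
            # Assumed input rule: weighted sum of the outputs of linked nodes.
            inputs = {j: 0.0 for j in thresholds}
            for i, links in weights.items():
                for j, w in links.items():
                    inputs[j] += output[i] * w
            # Threshold function; active nodes are assumed to remain active.
            new_output = {
                j: max(output[j], 1 if inputs[j] > thresholds[j] else 0)
                for j in thresholds
            }
            changed = new_output != output
            output = new_output
        # Assumed termination condition: the last pulse changed no node.
        if not changed:
            break
    return output


# Hypothetical network: a query node q linked to documents d1 and d2,
# which are in turn linked to document d3.
weights = {
    "q": {"d1": 1.0, "d2": 0.3},
    "d1": {"d3": 0.8},
    "d2": {"d3": 0.1},
}
thresholds = {"q": 0.5, "d1": 0.5, "d2": 0.5, "d3": 0.5}
result = spread(weights, thresholds, initial={"q": 1})
```

In this toy run the first pulse activates `d1`, the second reaches `d3` through `d1`, and the third changes nothing, so the process stops; `d2` never exceeds its threshold.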
