458 E CRESTANI Figure 3. An example of a Semantic Network. The term Semantic Networks has been used in a far more general sense in the knowledge representation literature than what has been described above, and for what concerns IR, researchers have often used the term Semantic Network to refer to an Associative Network. This is a generic network of information items in which information items are represented by nodes, and links express sometimes undefined and unlabeled associative relations among information items. In modern IR, where statistical techniques are used in the indexing phase to associate weights to index terms, the relationships among information items are sometimes weighted, so adding to the network a measure of the strength of associations. In Section 5 we will report a few examples of the kind of Associative Networks used in IR. Semantic or Associative Networks are usually processed by means of a technique called Spreading Activation that will be explained in the next 4. Associative Retrieval using Spreading Activation Historically speaking, SA was not the first associative processing paradigm to be used in IR. Studies on Associative Retrieval date as early as the 60s and had their origins in statistical studies of associations among terms and/or documents in a collection. The "associative linear retrieval model"is one of these earliest models based on the concept of associative retrieval. Thi model, in its basic idea(Salton, 1968), consists of expanding the orig- inal query using statistically determined term-term, term-document, and document-document associations. This technique is based on the assump-
458 F. CRESTANI Figure 3. An example of a Semantic Network. The term Semantic Networks has been used in a far more general sense in the knowledge representation literature than what has been described above, and for what concerns IR, researchers have often used the term Semantic Network to refer to an Associative Network. This is a generic network of information items in which information items are represented by nodes, and links express sometimes undefined and unlabeled associative relations among information items. In modern IR, where statistical techniques are used in the indexing phase to associate weights to index terms, the relationships among information items are sometimes weighted, so adding to the network a measure of the strength of associations. In Section 5 we will report a few examples of the kind of Associative Networks used in IR. Semantic or Associative Networks are usually processed by means of a technique called Spreading Activation that will be explained in the next section. 4. Associative Retrieval using Spreading Activation Historically speaking, SA was not the first associative processing paradigm to be used in IR. Studies on Associative Retrieval date as early as the 60s and had their origins in statistical studies of associations among terms and/or documents in a collection. The “associative linear retrieval model” is one of these earliest models based on the concept of associative retrieval. This model, in its basic idea (Salton, 1968), consists of expanding the original query using statistically determined term–term, term–document, and document–document associations. This technique is based on the assump-
APPLICATION OF SPREADING ACTIVATION TECHNIQUES IN IR 459 tion that there exists statistically determinable relations among terms, amon documents. and among documents and terms. These associations can be represented in a similarity matrix. Quantitative evaluations of similarity between terms, for example, can be obtained by means of statistical analy of term co-occurrence in documents. Associations between documents based on a quantitative evaluation of their respective similarity, can be obtained evaluating similarities in the terms assignments to documents or by means of citations and other bibliographic indicators. There are many heavy assumptions on this model and more recent studies(Preece, 1981 Salton and Buckley, 1988) have lead to the conclusion that effective term expansion methods valid for a variety of different collections are difficult to generate. Moreover, IR systems based on this approach have shown a lack of consistent improvements in the effectiveness. This result can have various motivations. First, the similarities statistically derived from some pairs of documents, or some pairs of terms, may be valid only locally in the partic- ular" environment(or application domain) from which they are obtained. Second, most practical methods for computing the document associations are based on the assumption that the terms or the documents are originally uncor related, i.e. independent of each other. Such assumption is no more accepted in many of the new research directions of IR. Recently, these models of associative retrieval has been revised using the SO-called Spreading Activation Model, which is based on supposed mecha nisms of human memory operations. Originated from psychological studies (see for example(Rumelhart and Norman, 1983)) it was first introduced in Computing Science in the area of Artificial Intelligence to provide a process- ing framework for Semantic Networks. Its use has been praised and criticised but it is currently adopted in many different areas such as: Cognitive Science, Databases, Artificial Intelligence, Psychology, Biology, and lately to IR. The basic SA model has, however, been subject to various enhancements in order to make it more suitable to various application areas and the way it is used in IR is quite different from the original formulation in the area of psychology In the following three sections the SA model will be described in deptl ection 4. 1 will describe the"pure"SA model, which consists in the sole use of the associative nature of a network representation as a search controlling structure In Section 4.2 some more search controlling structures will be added in order to give preference to particular associations. Section 4.3 will show how the search controlling structure can be used in a interactive way using external feedback
APPLICATION OF SPREADING ACTIVATION TECHNIQUES IN IR 459 tion that there exists statistically determinable relations among terms, among documents, and among documents and terms. These associations can be represented in a similarity matrix. Quantitative evaluations of similarity between terms, for example, can be obtained by means of statistical analysis of term co-occurrence in documents. Associations between documents, based on a quantitative evaluation of their respective similarity, can be obtained evaluating similarities in the terms assignments to documents or by means of citations and other bibliographic indicators. There are many heavy assumptions on this model and more recent studies (Preece, 1981; Salton and Buckley, 1988) have lead to the conclusion that effective term expansion methods valid for a variety of different collections are difficult to generate. Moreover, IR systems based on this approach have shown a lack of consistent improvements in the effectiveness. This result can have various motivations. First, the similarities statistically derived from some pairs of documents, or some pairs of terms, may be valid only locally in the particular “environment” (or application domain) from which they are obtained. Second, most practical methods for computing the document associations are based on the assumption that the terms or the documents are originally uncorrelated, i.e. independent of each other. Such assumption is no more accepted in many of the new research directions of IR. Recently, these models of associative retrieval has been revised using the so-called Spreading Activation Model, which is based on supposed mechanisms of human memory operations. Originated from psychological studies (see for example (Rumelhart and Norman, 1983)) it was first introduced in Computing Science in the area of Artificial Intelligence to provide a processing framework for Semantic Networks. Its use has been praised and criticised, but it is currently adopted in many different areas such as: Cognitive Science, Databases, Artificial Intelligence, Psychology, Biology, and lately to IR. The basic SA model has, however, been subject to various enhancements in order to make it more suitable to various application areas and the way it is used in IR is quite different from the original formulation in the area of psychology. In the following three sections the SA model will be described in depth. Section 4.1 will describe the “pure” SA model, which consists in the sole use of the associative nature of a network representation as a search controlling structure. In Section 4.2 some more search controlling structures will be added in order to give preference to particular associations. Section 4.3 will show how the search controlling structure can be used in a interactive way using external feedback
460 E CRESTANI Figure 4. The network structure of a SA model 4. 1. The pure Spreading Activation model The SAmodel in its"pure" form is quite simple. It is made up ofa conceptually simple processing technique on a network data structure The network data structure consists of nodes connected by links, as depicted in Figure 4 Nodes model objects or features of objects of the"real world"to be represented They are usually labelled with the name of the objects they intend to represent. Links model relationships between nodes and they can be labelled and/or weighted. The connectivity pattern reflects the relation- ships between objects and/or features of objects of the "real world"to be represented. a links usually has a direction, a label, and/or a weight assigned according to a specific direction. This representation structure is very similar to a Semantic Network, but it is more general than the definition of Semantic Network given in Section 3. It could represent a Semantic Network, but also a more generic Associative Network The processing technique is defined by a sequence of iterations like the one hematically described in Figure 5. Each iteration is followed by another iteration until halted by the user or by the triggering of some termination condition. an iteration consists of 2. termination check What distinguish the pure Sa model from other more complex models is the sequence of actions which composes the pulse. a pulse is made up of three phases
460 F. CRESTANI Figure 4. The network structure of a SA model. 4.1. The pure Spreading Activation model The SA model in its “pure” form is quite simple. It is made up of a conceptually simple processing technique on a network data structure. The network data structure consists of nodes connected by links, as depicted in Figure 4. Nodes model objects or features of objects of the “real world” to be represented. They are usually labelled with the name of the objects they intend to represent. Links model relationships between nodes and they can be labelled and/or weighted. The connectivity pattern reflects the relationships between objects and/or features of objects of the “real world” to be represented. A links usually has a direction, a label, and/or a weight assigned according to a specific direction. This representation structure is very similar to a Semantic Network, but it is more general than the definition of Semantic Network given in Section 3. It could represent a Semantic Network, but also a more generic Associative Network. The processing technique is defined by a sequence of iterations like the one schematically described in Figure 5. Each iteration is followed by another iteration until halted by the user or by the triggering of some termination condition. An iteration consists of: 1. one or more pulses; 2. termination check. What distinguish the pure SA model from other more complex models is the sequence of actions which composes the pulse. A pulse is made up of three phases: 1. preadjustment;
APPLICATION OF SPREADING ACTIVATION TECHNIQUES IN IR pre-adjustment satisfied Figure 5. The pure SA model In the preadjustment and postadjustement phases, which are optional, some form of activation decay can be applied to the active nodes. These phases are used to avoid retention of activation from pre ol both activation of single nodes and the overall activation of the network They implement a form of loss of interest" in nodes that are not continually ctivated The spreading phase consists on a number of passages of activation weaves from one node to all other nodes connected to it. There are many ways of spreading the activation over a network(for a overview see(Preece, 1981)) In its more simple form, on a single unit level, SA consists first in the computation of the unit input using this formula J1=∑O where Ii is the total input of node j O; is the output of unit i connected to node wij is a weight associated to the link connecting node i to node
APPLICATION OF SPREADING ACTIVATION TECHNIQUES IN IR 461 Figure 5. The pure SA model. 2. spreading; 3. postadjustment. In the preadjustment and postadjustement phases, which are optional, some form of activation decay can be applied to the active nodes. These phases are used to avoid retention of activation from previous pulses, enabling to control both activation of single nodes and the overall activation of the network. They implement a form of “loss of interest” in nodes that are not continually activated. The spreading phase consists on a number of passages of activation weaves from one node to all other nodes connected to it. There are many ways of spreading the activation over a network (for a overview see (Preece, 1981)). In its more simple form, on a single unit level, SA consists first in the computation of the unit input using this formula: Ij = X i Okwij where: Ij is the total input of node j; Oi is the output of unit i connected to node j; wij is a weight associated to the link connecting node i to node j
E CRESTANI The input and the weight are usually real numbers, however their numer- ical type is determined by the specific requirements of the application to be modelled. In particular, they can be binary values(0 or 1), excitatory/ inhibitory values(+l or-1), or they can be real values indicating the strength of the relation between nodes. Usually the first two of these options are used in connection with networks with labelled links like for examples Semantic Networks, where the semantic value of the relation represented by the link determines, in the context of the application, the value to be associated to the link. The last option is mainly used for Associative Networks, where there is only one generic type of association that need to be weighted After a node has computed its input value, its output value must be deter- mined. The numerical type of the output of a node is also determined by the requirements of the application. The two most used cases being the binary active/non-active type(0 or 1)and the real value type. In Sa models there is usually no distinction between"activation"or"output of a unit. The activa tion level of a unit is its output value. This is usually computed as a function of the input value O;=f(1) There are many different functions that can be used in the evaluation of the output, some examples are depicted in Figure 6. The most commonly used function to the above formula in the case of binary value units give. esol 'o function in pure SA models is the threshold function. It is used to determin the node has to be considered active or not. The application of the threshold 01<kj lj>kj where kj is the threshold value for unit The threshold value of the activation function is application dependent and can vary from node to node, therefore the notation k; for the unit threshold has been used After the node has computed its output value, it fires it to all the nodes connected to it, usually sending the same value to each of them Pulse after pulse, the activation spreads over the network reaching nodes that are far from the initially activated ones. After a determined number of pulses has been fired a termination condition is checked. If the condition is verified than the sa process stops, otherwise it goes on for another series of pulses. SA is therefore iterative, consisting of a sequence of pulses and termination checks The result of the sa process is the activation level of nodes reached at termination time. The interpretation of the level of activation of each node
462 F. CRESTANI The input and the weight are usually real numbers, however their numerical type is determined by the specific requirements of the application to be modelled. In particular, they can be binary values (0 or 1), excitatory/ inhibitory values (+1 or -1), or they can be real values indicating the strength of the relation between nodes. Usually the first two of these options are used in connection with networks with labelled links like for examples Semantic Networks, where the semantic value of the relation represented by the link determines, in the context of the application, the value to be associated to the link. The last option is mainly used for Associative Networks, where there is only one generic type of association that need to be weighted. After a node has computed its input value, its output value must be determined. The numerical type of the output of a node is also determined by the requirements of the application. The two most used cases being the binary active/non-active type (0 or 1) and the real value type. In SA models there is usually no distinction between “activation” or “output” of a unit. The activation level of a unit is its output value. This is usually computed as a function of the input value: Oj = f (Ij ) There are many different functions that can be used in the evaluation of the output, some examples are depicted in Figure 6. The most commonly used function in pure SA models is the threshold function. It is used to determine if the node j has to be considered active or not. The application of the threshold function to the above formula in the case of binary value units gives: Oj = 0 Ij < kj 1 Ij > kj where kj is the threshold value for unit j. The threshold value of the activation function is application dependent and can vary from node to node, therefore the notation kj for the unit threshold has been used. After the node has computed its output value, it fires it to all the nodes connected to it, usually sending the same value to each of them. Pulse after pulse, the activation spreads over the network reaching nodes that are far from the initially activated ones. After a determined number of pulses has been fired a termination condition is checked. If the condition is verified than the SA process stops, otherwise it goes on for another series of pulses. SA is therefore iterative, consisting of a sequence of pulses and termination checks. The result of the SA process is the activation level of nodes reached at termination time. The interpretation of the level of activation of each node