Where are the Semantics in the Semantic Web?✤

Michael Uschold
The Boeing Company
PO Box 3707 MS 7L-40
Seattle, WA 98124 USA
+1 425 865-3605
michael.f.uschold@boeing.com

ABSTRACT

The most widely accepted defining feature of the Semantic Web is machine-usable content. By this definition, the Semantic Web is already manifest in shopping agents that automatically access and use Web content to find the lowest air fares or book prices. But where are the semantics? Most people regard the Semantic Web as a vision, not a reality, so shopping agents should not “count”. To use Web content, machines need to know what to do when they encounter it. This, in turn, requires the machine to “know” what the content means (i.e. its semantics). The challenge of developing the Semantic Web is how to put this knowledge into the machine. The manner in which this is done is at the heart of the confusion about the Semantic Web. The goal of this paper is to clear up some of this confusion. We proceed by describing a variety of meanings of the term “semantics”, noting various things that can be said to have semantics of various kinds. We introduce a semantic continuum ranging from implicit semantics, which are only in the heads of the people who use the terms, to formal semantics for machine processing. We list some core requirements for enabling machines to use Web content, and we consider various issues such as hardwiring, agreements, clarity of semantics specifications, and public declarations of semantics. In light of these requirements and issues in conjunction with our semantic continuum, it is useful to collectively regard shopping agents as a degenerate case of the Semantic Web. Shopping agents work in the complete absence of any explicit account of the semantics of Web content because the meaning of the Web content that the agents are expected to encounter can be determined by the human programmers who hardwire it into the Web application software. We note various shortcomings of this approach, which give rise to some ideas about how the Semantic Web should evolve. We argue that this evolution will take place by (1) moving along the semantic continuum from implicit semantics to formal semantics for machine processing, (2) reducing the amount of Web content semantics that is hardwired, (3) increasing the amount of agreements and standards, and (4) developing semantic mapping and translation capabilities where differences remain.

Keywords
Semantic Web, Software Agents, Semantic Heterogeneity, Ontologies

✤ The content of this paper was first presented as an invited talk at the Ontologies in Agent Systems workshop held at the Autonomous Agents Conference in Montreal, June 2001. This paper is a significantly revised and extended version of a short paper that appeared in a special issue of the Knowledge Engineering Review for papers from that workshop.
Final Draft Submitted to AI Magazine

1 Introduction

The current evolution of the Web can be characterized from various perspectives [Jasper & Uschold 2001]:

• Locating Resources: The way people find things on the Web is evolving from simple free text and keyword search to more sophisticated semantic techniques both for search and navigation.
• Users: Web resources are evolving from being primarily intended for human consumption to being intended for use both by humans and machines.
• Web Tasks and Services: The Web is evolving from being primarily a place to find things to being a place to do things as well [Smith 2001].

All of these new capabilities for the Web depend in a fundamental way on the idea of semantics. This gives rise to a fourth perspective along which the Web evolution may be viewed:

• Semantics: The Web is evolving from containing information resources that have little or no explicit semantics to having a rich semantic infrastructure.

Despite the widespread use of the term “Semantic Web,” it does not yet exist except in isolated environments, mainly in research labs. In the W3C Semantic Web Activity Statement we are told that:

“The Semantic Web is a vision: the idea of having data on the Web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications.” [W3C 2001] [emphasis mine]

As envisioned by Tim Berners-Lee:

“The Semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” [Berners-Lee et al 2001] [emphasis mine]

“[S]omething has semantics when it can be ‘processed and understood by a computer,’ such as how a bill can be processed by a package such as Quicken.” [Trippe 2001]

There is no widespread agreement on exactly what the Semantic Web is, nor exactly what it is for.
From the above descriptions, there is clear emphasis on the information content of the Web being:

• machine usable, and
• associated with more meaning.

Note that “machine” refers to computers (or computer programs) that perform tasks on the Web. These programs are commonly referred to as software agents, or softbots, and are found in Web applications.

Machine usable content presumes that the machine knows what to do with information on the Web. One way for this to happen is for the machine to read and process a machine-sensible specification of the semantics of the information. This is a robust and very challenging approach, and largely beyond the current state of the art. A much simpler alternative is for the human Web application developers to hardwire the knowledge into the software so that when the machine runs the software, it does the correct thing with the information. In this second situation, machines already use information on the Web. There are electronic broker agents in routine use that make use of the meaning associated with Web content words such as “price,” “weight,” “destination,” and “airport,” to name a few. Armed with a built-in “understanding” of these terms, these so-called shopping agents automatically peruse the Web to find sites with the lowest price for a book or the lowest air fare between two given cities. So, we still lack an adequate characterization of what distinguishes the future Semantic Web from what exists today.

Because RDF (Resource Description Framework) [W3C 1999] is hailed by the W3C as a Semantic Web language, some people seem to have the view that if an application uses RDF, then it is a Semantic Web application. This is reminiscent of the “If it is programmed in Lisp or Prolog, then it must be AI” sentiment
that was sometimes evident in the early days of Artificial Intelligence. There is also confusion about what constitutes a legitimate Semantic Web application. Some seem to have the view that an RDF tool such as CWM¹ is one. This is true only in the same sense that KEE and ART were AI applications. They were certainly generating income for the vendors, but that is different from the companies using the tools to develop applications that help their bottom line.

The lack of an adequate definition of the Semantic Web, however, is no reason to stop pursuing its development any more than an inadequate definition of AI was a reason to cease AI research. Quite the opposite: new ideas always need an incubation period.

The research community, industrial participants, and software vendors are working with the W3C to make the Semantic Web vision a reality ([Berners-Lee et al 2001], [DAML 2001], [W3C 2001]). It will be layered, extensible, and composable. A major part of this will entail representing and reasoning with semantic metadata, and/or providing semantic markup in the information resources. Fundamental to the semantic infrastructure are ontologies, knowledge bases, and agents along with inference, proof, and sophisticated semantic querying capability.

The main intent of the Semantic Web is to give machines much better access to information resources so they can be information intermediaries in support of humans. According to the vision described in [Berners-Lee et al 2001], agents will be pervasive on the Web, carrying out a multitude of everyday tasks. Hendler describes many of the important technical issues that this entails, emphasizing the interdependence of agent technology and ontologies [Hendler 2001]. In order to carry out their required tasks, intelligent agents must communicate and understand meaning.
They must advertise their capabilities, and recognize the capabilities of other agents. They must locate meaningful information resources on the Web and combine them in meaningful ways to perform tasks. They need to recognize, interpret, and respond to communication acts from other agents.

In other words, when agents communicate with each other, there needs to be some way to ensure that the meaning of what one agent “says” is accurately conveyed to the other agent. There are two extremes, in principle, for handling this problem. The simplest (and perhaps the most common) approach is to ignore the problem altogether. That is, just assume that all agents are using the same terms to mean the same things. In practice, this will usually be an assumption built into the application. The assumption could be implicit and informal, or it could be an explicit agreement among all parties to commit to using the same terms in a pre-defined manner. This only works, however, when one has full control over what agents exist and what they might communicate. In reality, agents need to interact in a much wider world, where it cannot be assumed that other agents will use the same terms, or if they do, it cannot be assumed that the terms will mean the same thing.

The moment we accept the problem and grant that agents may not use the same terms to mean the same things, we need a way for an agent to discover what another agent means when it communicates. In order for this to happen, every agent will need to publicly declare exactly what terms it is using and what they mean. This specification is commonly referred to as the agent’s ontology [Gruber 1993]. If it were written only for people to understand, this specification could be just a glossary. However, meaning must be accessible to other software agents. This requires that the meaning be encoded in some kind of formal language.
This will enable a given agent to use automated reasoning to accurately determine the meaning of other agents’ terms. For example, suppose Agent 1 sends a message to Agent 2 and in this message is a pointer to Agent 1’s ontology. Agent 2 can then look in Agent 1’s ontology to see what the terms mean, the message is successfully communicated, and the agent’s task is successfully performed.

At least this is the theory. The holy grail is for this to happen consistently, reliably, and fully automatically. In practice there is a plethora of difficulties, most of which arise from various sources of heterogeneity. For example, there are many different ontology representation languages, different modeling styles, and inconsistent use of terminology, to name a few. This is explored further in section 3.

¹ Closed World Machine: http://infomesh.net/2001/cwm/
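The exchange between Agent 1 and Agent 2 described above can be illustrated with a deliberately simplified sketch. It is not taken from the paper: the ontology URIs, the terms, the concept identifiers, and the `interpret` function are all hypothetical, and a dictionary lookup stands in for the automated reasoning over a formal language that a real agent would need. The sketch also assumes that both ontologies refer to a common set of concept identifiers, which is exactly the kind of agreement that heterogeneity undermines in practice.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the Web: ontology URI -> publicly declared ontology.
# Each ontology maps an agent's local term to a shared concept identifier.
# All URIs, terms, and concept names below are invented for illustration.
PUBLISHED_ONTOLOGIES = {
    "http://example.org/agent1/ontology": {
        "price": "concept:PriceIncludingTax",
        "dest": "concept:DestinationAirport",
    },
    "http://example.org/agent2/ontology": {
        "price": "concept:PriceExcludingTax",
        "total": "concept:PriceIncludingTax",
        "destination": "concept:DestinationAirport",
    },
}


@dataclass
class Message:
    ontology_uri: str  # pointer to the sender's ontology
    content: dict      # sender's term -> value


def interpret(message: Message, receiver_ontology_uri: str) -> dict:
    """Re-express a message in the receiver's terms via the shared concepts."""
    sender = PUBLISHED_ONTOLOGIES[message.ontology_uri]
    receiver = PUBLISHED_ONTOLOGIES[receiver_ontology_uri]
    # Invert the receiver's ontology: concept -> receiver's term.
    concept_to_term = {concept: term for term, concept in receiver.items()}
    translated = {}
    for term, value in message.content.items():
        concept = sender[term]                         # what the sender means
        translated[concept_to_term[concept]] = value   # receiver's word for it
    return translated


# Agent 1 says "price" and means the tax-inclusive price; Agent 2 calls that "total".
msg = Message("http://example.org/agent1/ontology", {"price": 120.0, "dest": "SEA"})
result = interpret(msg, "http://example.org/agent2/ontology")
print(result)  # {'total': 120.0, 'destination': 'SEA'}
```

Note that both agents use the term "price" but mean different things by it. Had Agent 2 simply assumed shared terminology, it would have read a tax-inclusive figure as a tax-exclusive one; consulting the declared ontology avoids that mistake, but only because the two ontologies happen to share concept identifiers.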
2 Semantics: A Many-Splendored Thing

The core meaning of the word “semantics” is: meaning itself. Yet there is no agreement as to how this applies to the term “Semantic Web.” In what follows, we characterize the many things that one might mean when talking about semantics as it pertains to the Semantic Web. It is not our intention to define the term, but rather to make some important distinctions that people can use to communicate more clearly when talking about the Semantic Web.

In the context of achieving successful communication among agents on the Web, we are talking about the need for agents to understand the meaning of the information being exchanged between agents, and the meaning of the content of various information sources that agents require in order to perform their tasks. We focus attention on the questions of what kinds of semantics there are, what kinds of things have semantics, where the semantics are and how they are used. We identify a kind of semantic continuum ranging from the kind of semantics that exist on the Web today to a rich semantic infrastructure on the Semantic Web of the future.

Real World Semantics: Real world semantics² are concerned with the “mapping of objects in the model or computational world onto the real world … [and] issues that involve human interpretation, or meaning and use of data or information.” [Ouksel & Sheth 1999] In this context, we talk about the semantics of an “item”, which might be a tag or a term, or possibly a complex expression in some language. We may also speak of the semantics of a possibly large set of expressions, which collectively are intended to represent some real world domain. The real world semantics correspond to the concepts in the real world that the items or sets of items refer to.
Agent Communication Language Performatives: In the context of the Semantic Web, there are special items that require semantics to ensure that agents communicate effectively. These are performatives such as request or inform in agent communication languages [Smith et al. 98].

Axiomatic Semantics: An axiomatic semantics for a language specifies “a mapping of a set of descriptions in [that] language into a logical theory expressed in first-order predicate calculus.” The basic idea is that “the logical theory produced by the mapping … of a set of such descriptions is logically equivalent to the intended meaning of that set of descriptions” [Fikes & McGuinness 2001]. Axiomatic semantics have been given for the Resource Description Framework (RDF), RDF Schema (RDF-S), and DAML+OIL. The axiomatic semantics for a language helps to ascribe a real world semantics to expressions in that language, in that it limits the possible models or interpretations that the set of axioms may have.

Model-Theoretic Semantics: “A model-theoretic semantics for a language assumes that the language refers to a 'world', and describes the minimal conditions that a world must satisfy in order to assign an appropriate meaning for every expression in the language.” [W3C 2002a] It is used as a technical tool for determining when proposed operations on the language preserve meaning. In particular, it characterizes what conclusions can validly be drawn from a given set of expressions, independently from what the symbols mean.

Intended vs. Actual Meaning: A key to the successful operation of the Semantic Web is that the intended meaning of Web content be accurately conveyed to potential users of that content. In the case of shopping agents, the meaning of terms like “price” is conveyed based on human consensus. However, mistakes are always possible, due to inconsistency of natural language usage.
When formal languages are used, an author attempts to communicate meaning by specifying axioms in a logical theory. In this case we can talk about intended versus actual models of the theory. There is normally just one intended model. It corresponds to what the author wanted the axioms to represent. The actual models correspond to what the author actually has represented. They consist of all the objects and relationships, etc., in the real world that 2 This term is commonly used in the literature on semantic integration of databases
are consistent with the axioms. The goal is to create a set of axioms such that the actual models only include the intended model(s).

We believe that the idea of real world semantics, as described above, captures the essence of the main use of the term “semantics” in a Semantic Web context. However, it is only loosely defined. The ideas of axiomatic and model-theoretic semantics are being used to make the idea of real world semantics for the Semantic Web more concrete.

From this discussion, it is clear that several things have semantics:
1. Terms or expressions, referring to the real world subject matter of Web content (e.g., semantic markup);
2. Terms or expressions in an agent communication language (e.g., inform);
3. A language for representing the above information (e.g., the semantics of DAML+OIL or RDF).

2.1 A semantic continuum

We ask three questions about how semantics may be specified:
1. Are the semantics explicit or implicit?
2. Are the semantics expressed informally or formally?
3. Are the semantics intended for human processing, or machine processing?

These give rise to four kinds of semantics:
1. Implicit;
2. Explicit and informal;
3. Explicit and formal for human processing;
4. Explicit and formal for machine processing.

We define these to be four somewhat arbitrary points along a semantic continuum (see Figure 1). At one extreme, there are no semantics at all, except what is in the minds of the people who use the terms. At the other extreme, we have formal and explicit semantics that are fully automated. The further we move along the continuum, the less ambiguity there is and the more likely we are to have robust, correctly functioning and easy-to-maintain Web applications. We consider these four points on our semantic continuum, in turn. Note that there are likely to be many cases that are not clear cut and thus arguably may fall somewhere in between.
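As a rough illustration of why three yes/no questions yield four kinds of semantics rather than eight, the following Python sketch maps answers to the three questions onto the four points of the continuum. The names `SemanticsKind`, `classify` and `REACHABLE_KINDS` are hypothetical and introduced only for this illustration. The sketch also assumes that each question only becomes relevant once the previous one has been answered "explicit" and "formal" respectively; this is one reading of the list above, not a definition taken from it.

```python
from enum import Enum
from itertools import product


class SemanticsKind(Enum):
    """The four points on the semantic continuum, in order."""
    IMPLICIT = 1
    EXPLICIT_INFORMAL = 2
    EXPLICIT_FORMAL_HUMAN = 3
    EXPLICIT_FORMAL_MACHINE = 4


def classify(explicit: bool, formal: bool = False,
             for_machine: bool = False) -> SemanticsKind:
    """Map answers to the three questions onto one of the four kinds.

    Assumption of this sketch: 'formal' is ignored for implicit semantics,
    and 'for_machine' is ignored unless the semantics are formal.
    """
    if not explicit:
        return SemanticsKind.IMPLICIT
    if not formal:
        return SemanticsKind.EXPLICIT_INFORMAL
    if not for_machine:
        return SemanticsKind.EXPLICIT_FORMAL_HUMAN
    return SemanticsKind.EXPLICIT_FORMAL_MACHINE


# Enumerate all eight combinations of answers; under the assumption above
# they collapse onto the four points of the continuum.
REACHABLE_KINDS = {
    classify(explicit, formal, for_machine)
    for explicit, formal, for_machine in product([False, True], repeat=3)
}
```

Under this reading, XML tags whose meaning is stated nowhere would be classified with `classify(False)`, and a glossary with `classify(True, formal=False)`. Borderline cases that fall between two points are deliberately not modelled here.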
2.1.1 Implicit Semantics

In the simplest case, the semantics are implicit only. Meaning is conveyed based on a shared understanding derived from human consensus. A common example of this case is the typical use of XML tags, such as price, address, or delivery date. Nowhere in an XML document, or DTD or Schema, does it say what these tags mean [Cover 98]. However, if there is an implicit shared consensus about what the tags mean, then people can hardwire these implicit semantics into Web application programs, using screen-scrapers and wrappers. This is how one implements shopping agents that search Web sites for the best deals. From the perspective of mature commercial applications that automatically use Web content as conceived by Semantic Web visionaries, this is at or near the current state of the art.

The disadvantage of implicit semantics is that they are rife with ambiguity. People often disagree about the meaning of a term. For example, prices come in different currencies and they may or may not include various taxes or shipping costs. The removal of ambiguity is the major motivation for the specialized language used in legal contracts. The costs of identifying and removing ambiguity are very high.

2.1.2 Informal Semantics

At a further point along the continuum, the semantics are explicit and are expressed in an informal manner, e.g., a glossary or a text specification document. Given the complexities of natural language, machines have an extremely limited ability to make direct use of informally expressed semantics. Such semantics are mainly for humans. There are many examples of informal semantics, usually found in text specification documents.

• The meaning of tags in HTML such as <h2>, which means second level header;