1.3 Semantic Web Technologies

<company>
  <treatmentOffered>Physiotherapy</treatmentOffered>
  <companyName>Agilitas Physiotherapy Centre</companyName>
  <staff>
    <therapist>Lisa Davenport</therapist>
    <therapist>Steve Matthews</therapist>
    <secretary>Kelly Townsend</secretary>
  </staff>
</company>

This representation is far more easily processable by machines. The term metadata refers to such information: data about data. Metadata capture part of the meaning of data, thus the term semantic in Semantic Web.

In our example scenarios in section 1.2 there seemed to be no barriers to accessing information in Web pages: therapy details, calendars and appointments, prices and product descriptions. It seemed as though all this information could be directly retrieved from existing Web content. But, as we explained, this will not happen using text-based manipulation of information but rather by taking advantage of machine-processable metadata.

As with the current development of Web pages, users will not have to be computer science experts to develop Web pages; they will be able to use tools for this purpose. Still, the question remains why users should care, why they should abandon HTML for Semantic Web languages. Perhaps we can give an optimistic answer if we compare the situation today to the beginnings of the Web. The first users decided to adopt HTML because it had been adopted as a standard and they were expecting benefits from being early adopters. Others followed when more and better Web tools became available. And soon HTML was a universally accepted standard.

Similarly, we are currently observing the early adoption of XML. While not sufficient in itself for the realization of the Semantic Web vision, XML is an important first step. Early users, perhaps some large organizations interested in knowledge management and B2B e-commerce, will adopt XML and RDF, the current Semantic Web-related W3C standards.
And the momentum will lead to more and more tool vendors’ and end users’ adopting the technology. This will be a decisive step in the Semantic Web venture, but it is also a challenge. As we mentioned, the greatest current challenge is not scientific but rather one of technology adoption.
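
To make the claim about machine processability concrete, the following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module. It reads the company record shown earlier in this section and groups the staff by role. The function name summarize_company and the dictionary layout are our own illustrative choices, not part of any Semantic Web standard.

```python
import xml.etree.ElementTree as ET

# The company record from the text, held in a string for illustration.
COMPANY_XML = """
<company>
  <treatmentOffered>Physiotherapy</treatmentOffered>
  <companyName>Agilitas Physiotherapy Centre</companyName>
  <staff>
    <therapist>Lisa Davenport</therapist>
    <therapist>Steve Matthews</therapist>
    <secretary>Kelly Townsend</secretary>
  </staff>
</company>
"""


def summarize_company(xml_text):
    """Return the company name, treatment, and staff grouped by role.

    Hypothetical helper: the tags act as metadata, so no text matching
    or guessing about page layout is needed.
    """
    root = ET.fromstring(xml_text)
    staff_by_role = {}
    for member in root.find("staff"):
        # The tag name (therapist, secretary) tells us the role directly.
        staff_by_role.setdefault(member.tag, []).append(member.text)
    return {
        "name": root.findtext("companyName"),
        "treatment": root.findtext("treatmentOffered"),
        "staff": staff_by_role,
    }


summary = summarize_company(COMPANY_XML)
print(summary["name"])
print(summary["staff"]["therapist"])
```

The program never inspects the wording of the content; it relies only on the tags. Note, however, that it still has to know in advance what a tag such as therapist means, which is why XML alone is not sufficient for the Semantic Web vision.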
1 The Semantic Web Vision

1.3.2 Ontologies

The term ontology originates from philosophy. In that context, it is used as the name of a subfield of philosophy, namely, the study of the nature of existence (the literal translation of the Greek word Οντολογία), the branch of metaphysics concerned with identifying, in the most general terms, the kinds of things that actually exist, and how to describe them. For example, the observation that the world is made up of specific objects that can be grouped into abstract classes based on shared properties is a typical ontological commitment.

However, in more recent years, ontology has become one of the many words hijacked by computer science and given a specific technical meaning that is rather different from the original one. Instead of “ontology” we now speak of “an ontology”. For our purposes, we will use T.R. Gruber’s definition, later refined by R. Studer: An ontology is an explicit and formal specification of a conceptualization.

In general, an ontology describes formally a domain of discourse. Typically, an ontology consists of a finite list of terms and the relationships between these terms. The terms denote important concepts (classes of objects) of the domain. For example, in a university setting, staff members, students, courses, lecture theaters, and disciplines are some important concepts.

The relationships typically include hierarchies of classes. A hierarchy specifies a class C to be a subclass of another class C′ if every object in C is also included in C′. For example, all faculty are staff members. Figure 1.1 shows a hierarchy for the university domain.
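
The subclass definition above can be illustrated with a minimal sketch in Python. Only the statement that all faculty are staff members comes from the text; the remaining class names, the individuals, and the function names are hypothetical and chosen purely for illustration. The sketch also assumes that each class has at most one direct superclass, which real ontology languages do not require.

```python
# Direct subclass links: child class -> parent class.
# Only "faculty is a subclass of staff" is taken from the text;
# the other entries are illustrative assumptions.
SUBCLASS_OF = {
    "professor": "faculty",
    "faculty": "staff",
    "staff": "person",
    "student": "person",
}

# Each object is asserted to belong to exactly one class (an assumption).
INSTANCE_OF = {"michael": "professor", "anna": "student"}


def superclasses(cls):
    """Return cls followed by all of its ancestors, nearest first."""
    chain = [cls]
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_subclass(c, c_prime):
    """True if every object in c must also be in c_prime."""
    return c_prime in superclasses(c)


def is_instance(obj, cls):
    """True if obj belongs to cls directly or via the hierarchy."""
    return cls in superclasses(INSTANCE_OF[obj])


print(is_subclass("faculty", "staff"))
print(is_instance("michael", "staff"))
```

Because membership is inherited along the hierarchy, an object asserted to be a professor is also found when a program asks for staff, without that fact being stored explicitly.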
Apart from subclass relationships, ontologies may include information such as

• properties (X teaches Y)
• value restrictions (only faculty members can teach courses)
• disjointness statements (faculty and general staff are disjoint)
• specification of logical relationships between objects (every department must include at least ten faculty members)

In the context of the Web, ontologies provide a shared understanding of a domain. Such a shared understanding is necessary to overcome differences in terminology. One application’s zip code may be the same as another application’s area code. Another problem is that two applications may use the same
term with different meanings. In university A, a course may refer to a degree (like computer science), while in university B it may mean a single subject (CS 101). Such differences can be overcome by mapping the particular terminology to a shared ontology or by defining direct mappings between the ontologies. In either case, it is easy to see that ontologies support semantic interoperability.

[Figure 1.1 A hierarchy. Node labels: university people; staff; students; academic staff; administration staff; technical support staff; regular faculty staff; research staff; visiting staff; undergraduate; postgraduate]

Ontologies are useful for the organization and navigation of Web sites. Many Web sites today expose on the left-hand side of the page the top levels of a concept hierarchy of terms. The user may click on one of them to expand the subcategories.

Also, ontologies are useful for improving the accuracy of Web searches. The search engines can look for pages that refer to a precise concept in an ontology instead of collecting all pages in which certain, generally ambiguous, keywords occur. In this way, differences in terminology between Web pages and the queries can be overcome.

In addition, Web searches can exploit generalization/specialization information. If a query fails to find any relevant documents, the search engine may suggest to the user a more general query. It is even conceivable for the engine to run such queries proactively to reduce the reaction time in case the
user adopts a suggestion. Or if too many answers are retrieved, the search engine may suggest to the user some specializations.

In Artificial Intelligence (AI) there is a long tradition of developing and using ontology languages. It is a foundation Semantic Web research can build upon. At present, the most important ontology languages for the Web are the following:

• XML provides a surface syntax for structured documents but imposes no semantic constraints on the meaning of these documents.
• XML Schema is a language for restricting the structure of XML documents.
• RDF is a data model for objects (“resources”) and relations between them; it provides a simple semantics for this data model; and these data models can be represented in an XML syntax.
• RDF Schema is a vocabulary description language for describing properties and classes of RDF resources, with a semantics for generalization hierarchies of such properties and classes.
• OWL is a richer vocabulary description language for describing properties and classes, such as relations between classes (e.g., disjointness), cardinality (e.g., “exactly one”), equality, richer typing of properties, characteristics of properties (e.g., symmetry), and enumerated classes.

1.3.3 Logic

Logic is the discipline that studies the principles of reasoning; it goes back to Aristotle. In general, logic offers, first, formal languages for expressing knowledge. Second, logic provides us with well-understood formal semantics: in most logics, the meaning of sentences is defined without the need to operationalize the knowledge. Often we speak of declarative knowledge: we describe what holds without caring about how it can be deduced. And third, automated reasoners can deduce (infer) conclusions from the given knowledge, thus making implicit knowledge explicit. Such reasoners have been studied extensively in AI.

Here is an example of an inference.
Suppose we know that all professors are faculty members, that all faculty members are staff members, and that Michael is a professor. In predicate logic the information is expressed as follows:

    prof(X) → faculty(X)
    faculty(X) → staff(X)
    prof(michael)

Then we can deduce the following:

    faculty(michael)
    staff(michael)
    prof(X) → staff(X)

Note that this example involves knowledge typically found in ontologies. Thus logic can be used to uncover ontological knowledge that is implicitly given. By doing so, it can also help uncover unexpected relationships and inconsistencies.

But logic is more general than ontologies. It can also be used by intelligent agents for making decisions and selecting courses of action. For example, a shop agent may decide to grant a discount to a customer based on the rule

    loyalCustomer(X) → discount(5%)

where the loyalty of customers is determined from data stored in the corporate database.

Generally there is a trade-off between expressive power and computational efficiency. The more expressive a logic is, the more computationally expensive it becomes to draw conclusions. And drawing certain conclusions may become impossible if noncomputability barriers are encountered. Luckily, most knowledge relevant to the Semantic Web seems to be of a relatively restricted form. For example, our previous examples involved rules of the form, “If conditions, then conclusion,” and only finitely many objects needed to be considered. This subset of logic is tractable and is supported by efficient reasoning tools.

An important advantage of logic is that it can provide explanations for conclusions: the series of inference steps can be retraced. Moreover, AI researchers have developed ways of presenting an explanation in a human-friendly way, by organizing a proof as a natural deduction and by grouping a number of low-level inference steps into metasteps that a person will typically consider a single proof step. Ultimately an explanation will trace an answer back to a given set of facts and the inference rules used.
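
The inference about Michael and the idea of retracing inference steps can be illustrated with a minimal sketch in Python. It assumes the restricted rule form discussed above, with a single condition and a single unary predicate per rule, and it uses naive forward chaining, which is one possible reasoning strategy among many. The names forward_chain and explain are hypothetical.

```python
# Rules of the form premise(X) -> conclusion(X), as (premise, conclusion).
RULES = [("prof", "faculty"), ("faculty", "staff")]

# Given facts, as (predicate, object) pairs.
FACTS = {("prof", "michael")}


def forward_chain(facts, rules):
    """Apply the rules until no new fact appears.

    Returns a dict mapping every known fact to its justification:
    the string "given", or a pair (source fact, rule text).
    """
    derived = {fact: "given" for fact in facts}
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for pred, obj in list(derived):
                new_fact = (conclusion, obj)
                if pred == premise and new_fact not in derived:
                    rule_text = f"{premise}(X) -> {conclusion}(X)"
                    derived[new_fact] = ((premise, obj), rule_text)
                    changed = True
    return derived


def explain(fact, derived):
    """Return the proof steps for fact, starting from the given fact."""
    steps = []
    while derived[fact] != "given":
        source, rule = derived[fact]
        steps.append(
            f"{fact[0]}({fact[1]}) from {source[0]}({source[1]}) by {rule}"
        )
        fact = source
    steps.append(f"{fact[0]}({fact[1]}) is given")
    return list(reversed(steps))


knowledge = forward_chain(FACTS, RULES)
for step in explain(("staff", "michael"), knowledge):
    print(step)
```

The loop terminates because there are only finitely many objects and predicates, which reflects the tractability remark above. The sketch derives only ground facts such as staff(michael); deriving a new rule such as prof(X) → staff(X) would require composing rules, which is not shown here.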
Explanations are important for the Semantic Web because they increase users’ confidence in Semantic Web agents (see the physiotherapy example in