1 The Semantic Web Vision

Most information is currently available in a weakly structured form, for example, text, audio, and video. From the knowledge management perspective, the current technology suffers from limitations in the following areas:

• Searching information. Companies usually depend on keyword-based search engines, the limitations of which we have outlined.

• Extracting information. Human time and effort are required to browse the retrieved documents for relevant information. Current intelligent agents are unable to carry out this task in a satisfactory fashion.

• Maintaining information. Currently there are problems, such as inconsistencies in terminology and failure to remove outdated information.

• Uncovering information. New knowledge implicitly existing in corporate databases is extracted using data mining. However, this task is still difficult for distributed, weakly structured collections of documents.

• Viewing information. Often it is desirable to restrict access to certain information to certain groups of employees. “Views”, which hide certain information, are known from the area of databases but are hard to realize over an intranet (or the Web).

The aim of the Semantic Web is to allow much more advanced knowledge management systems:

• Knowledge will be organized in conceptual spaces according to its meaning.

• Automated tools will support maintenance by checking for inconsistencies and extracting new knowledge.

• Keyword-based search will be replaced by query answering: requested knowledge will be retrieved, extracted, and presented in a human-friendly way.

• Query answering over several documents will be supported.

• Defining who may view certain parts of information (even parts of documents) will be possible.
1.2 From Today’s Web to the Semantic Web: Examples

1.2.2 Business-to-Consumer Electronic Commerce

Business-to-consumer (B2C) electronic commerce is the predominant commercial experience of Web users. A typical scenario involves a user visiting one or several online shops, browsing their offers, and selecting and ordering products.

Ideally, a user would collect information about prices, terms, and conditions (such as availability) of all, or at least all major, online shops and then proceed to select the best offer. But manual browsing is too time-consuming to be conducted on this scale. Typically a user will visit one or very few online stores before making a decision.

To alleviate this situation, tools for shopping around on the Web are available in the form of shopbots, software agents that visit several shops, extract product and price information, and compile a market overview. Their functionality is provided by wrappers, programs that extract information from an online store. One wrapper per store must be developed. This approach suffers from several drawbacks.

The information is extracted from the online store site through keyword search and other means of textual analysis. This process makes use of assumptions about the proximity of certain pieces of information (for example, the price is indicated by the word price followed by the symbol $ followed by a positive number). This heuristic approach is error-prone; it is not always guaranteed to work.

Because of these difficulties only limited information is extracted. For example, shipping expenses, delivery times, restrictions on the destination country, level of security, and privacy policies are typically not extracted. But all these factors may be significant for the user’s decision making. In addition, programming wrappers is time-consuming, and changes in an online store’s layout require costly reprogramming.
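The price heuristic described above (the word price, then the symbol $, then a positive number) can be sketched in a few lines of Python. This is only an illustration of the kind of proximity assumption a wrapper might rely on, not the code of any actual shopbot; the pattern, the 20-character proximity window, and the function name `extract_price` are all assumptions made for this example.

```python
import re
from typing import Optional

# Assumed heuristic: the word "price", at most 20 non-digit characters,
# then "$" and a positive number. The window size is an arbitrary choice.
PRICE_PATTERN = re.compile(
    r"price\D{0,20}?\$\s*(\d+(?:\.\d{1,2})?)",
    re.IGNORECASE,
)


def extract_price(page_text: str) -> Optional[float]:
    """Return the first amount matching the heuristic, or None if none matches."""
    match = PRICE_PATTERN.search(page_text)
    return float(match.group(1)) if match else None


# The heuristic works on the layout it was written for...
print(extract_price("Price: $19.99"))
# ...but picks up the outdated amount when the page wording changes...
print(extract_price("Our price was $25.00, now only $19.99"))
# ...and finds nothing when the store writes the currency differently.
print(extract_price("Price: 19.99 USD"))
```

The second and third calls show why the approach is error-prone: the pattern encodes assumptions about one store's wording, so a small change on the page silently yields a wrong price or no price at all.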
The Semantic Web will allow the development of software agents that can interpret the product information and the terms of service.

• Pricing and product information will be extracted correctly, and delivery and privacy policies will be interpreted and compared to the user requirements.

• Additional information about the reputation of online shops will be retrieved from other sources, for example, independent rating agencies or consumer bodies.

• The low-level programming of wrappers will become obsolete.
• More sophisticated shopping agents will be able to conduct automated negotiations, on the buyer’s behalf, with shop agents.

1.2.3 Business-to-Business Electronic Commerce

Most users associate the commercial part of the Web with B2C e-commerce, but the greatest economic promise of all online technologies lies in the area of business-to-business (B2B) e-commerce.

Traditionally businesses have exchanged their data using the Electronic Data Interchange (EDI) approach. However, this technology is complicated and understood only by experts. It is difficult to program and maintain, and it is error-prone. Each B2B communication requires separate programming, so such communications are costly. Finally, EDI is an isolated technology. The interchanged data cannot be easily integrated with other business applications.

The Internet appears to be an ideal infrastructure for business-to-business communication. Businesses have increasingly been looking at Internet-based solutions, and new business models such as B2B portals have emerged. Still, B2B e-commerce is hampered by the lack of standards. HTML (hypertext markup language) is too weak to support the outlined activities effectively: it provides neither the structure nor the semantics of information. The new standard of XML is a big improvement but can still support communications only in cases where there is a priori agreement on the vocabulary to be used and on its meaning.

The realization of the Semantic Web will allow businesses to enter partnerships without much overhead. Differences in terminology will be resolved using standard abstract domain models, and data will be interchanged using translation services. Auctioning, negotiations, and drafting contracts will be carried out automatically (or semiautomatically) by software agents.

1.2.4 Personal Agents: A Future Scenario

Michael had just had a minor car accident and was feeling some neck pain.
His primary care physician suggested a series of physical therapy sessions. Michael asked his Semantic Web agent to work out some possibilities.

The agent retrieved details of the recommended therapy from the doctor’s agent and looked up the list of therapists maintained by Michael’s health insurance company. The agent checked for those located within a radius of 10 km from Michael’s office or home, and looked up their reputation according
to trusted rating services. Then it tried to match available appointment times with Michael’s calendar.

In a few minutes the agent returned two proposals. Unfortunately, Michael was not happy with either of them. One therapist had offered appointments in two weeks’ time; for the other Michael would have to drive during rush hour. Therefore, Michael decided to set stricter time constraints and asked the agent to try again.

A few minutes later the agent came back with an alternative: A therapist with an excellent reputation who had available appointments starting in two days. However, there were a few minor problems. Some of Michael’s less important work appointments would have to be rescheduled. The agent offered to make arrangements if this solution were adopted. Also, the therapist was not listed on the insurer’s site because he charged more than the insurer’s maximum coverage. The agent had found his name from an independent list of therapists and had already checked that Michael was entitled to the insurer’s maximum coverage, according to the insurer’s policy. It had also negotiated with the therapist’s agent a special discount. The therapist had only recently decided to charge more than average and was keen to find new patients.

Michael was happy with the recommendation because he would have to pay only a few dollars extra. However, because he had installed the Semantic Web agent a few days ago, he asked it for explanations of some of its assertions: how was the therapist’s reputation established, why was it necessary for Michael to reschedule some of his work appointments, how was the price negotiation conducted? The agent provided appropriate information.

Michael was satisfied. His new Semantic Web agent was going to make his busy life easier. He asked the agent to take all necessary steps to finalize the task.
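The first filtering step in the scenario (therapists within a radius of 10 km of Michael’s office or home, ordered by reputation) is easy to sketch once the data is available in machine-readable form. The following Python fragment is a minimal illustration under stated assumptions: the `Therapist` record, its fields, the numeric `rating`, and the function names are hypothetical, and the distance is computed with the standard haversine formula. The scenario itself does not say how the agent does this.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class Therapist:
    # Hypothetical record; a real agent would obtain such data from the
    # insurer's list and from rating services.
    name: str
    lat: float
    lon: float
    rating: float  # assumed reputation score, higher is better


def distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearby_therapists(therapists, anchors, radius_km=10.0):
    """Keep therapists within radius_km of any anchor point, best rating first."""
    close = [
        t for t in therapists
        if any(distance_km(t.lat, t.lon, lat, lon) <= radius_km
               for lat, lon in anchors)
    ]
    return sorted(close, key=lambda t: t.rating, reverse=True)
```

Here `anchors` would hold the coordinates of Michael’s office and home. The point of the sketch is that the filtering logic is trivial; the hard part, which the technologies in the following sections address, is obtaining the locations, ratings, and appointment data in a form a program can interpret.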
1.3 Semantic Web Technologies

The scenarios outlined in section 1.2 are not science fiction; they do not require revolutionary scientific progress to be achieved. We can reasonably claim that the challenge is one of engineering and technology adoption rather than a scientific one: partial solutions to all important parts of the problem exist. At present, the greatest needs are in the areas of integration, standardization, development of tools, and adoption by users. But, of course, further technological progress will lead to a more advanced Semantic Web than can, in principle, be achieved today.
In the following sections we outline a few technologies that are necessary for achieving the functionalities previously outlined.

1.3.1 Explicit Metadata

Currently, Web content is formatted for human readers rather than programs. HTML is the predominant language in which Web pages are written (directly or using tools). A portion of a typical Web page of a physical therapist might look like this:

    <h1>Agilitas Physiotherapy Centre</h1>
    Welcome to the home page of the Agilitas Physiotherapy Centre.
    Do you feel pain? Have you had an injury? Let our staff
    Lisa Davenport, Kelly Townsend (our lovely secretary)
    and Steve Matthews take care of your body and soul.
    <h2>Consultation hours</h2>
    Mon 11am - 7pm<br>
    Tue 11am - 7pm<br>
    Wed 3pm - 7pm<br>
    Thu 11am - 7pm<br>
    Fri 11am - 3pm<p>
    But note that we do not offer consultation
    during the weeks of the
    <a href=". . .">State Of Origin</a> games.

For people the information is presented in a satisfactory way, but machines will have their problems. Keyword-based searches will identify the words physiotherapy and consultation hours. And an intelligent agent might even be able to identify the personnel of the center. But it will have trouble distinguishing therapists from the secretary, and even more trouble with finding the exact consultation hours (for which it would have to follow the link to the State Of Origin games to find when they take place).

The Semantic Web approach to solving these problems is not the development of superintelligent agents. Instead it proposes to attack the problem from the Web page side. If HTML is replaced by more appropriate languages, then the Web pages could carry their content on their sleeve. In addition to containing formatting information aimed at producing a document for human readers, they could contain information about their content. In our example, there might be information such as