Ten-Year Best Paper
Querying Heterogeneous Information Sources Using Source Descriptions. VLDB '96
Alon Halevy: a principal member of technical staff at AT&T Bell Laboratories, and then at AT&T Laboratories.
• Main idea: the Information Manifold
• Led to tremendous progress on data integration and to quite a few commercial data integration products
The Information Manifold
◼ An implemented data integration system
◼ Goal: provide a uniform query interface to a heterogeneous collection of Web data sources
◼ Main contribution: the way it described the contents of the data sources it knew about
◼ IM contains declarative descriptions of the contents and capabilities of the information sources (Source Descriptions)
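A minimal sketch of what a declarative source description could look like, and how a mediator might use it to pick usable sources. The class, field names, catalog entries, and matching rule are assumptions made for illustration; they are not the Information Manifold's actual description language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceDescription:
    """Hypothetical record describing one source's contents and capabilities."""
    name: str
    relation: str               # mediated-schema relation the source contributes to
    attributes: frozenset       # contents: attributes the source can return
    required_inputs: frozenset  # capabilities: attributes that must be bound to call it


def usable_sources(descriptions, relation, wanted, bound):
    """Names of sources that cover `relation`, return every attribute in
    `wanted`, and whose required inputs are all available in `bound`."""
    return [
        d.name
        for d in descriptions
        if d.relation == relation
        and wanted <= d.attributes
        and d.required_inputs <= bound
    ]


# Illustrative catalog (invented entries).
CATALOG = [
    SourceDescription("movie_db", "Movie",
                      frozenset({"title", "director", "actor"}), frozenset()),
    SourceDescription("showtimes", "Playing",
                      frozenset({"title", "theater", "city"}), frozenset({"city"})),
    SourceDescription("reviews", "Review",
                      frozenset({"title", "review"}), frozenset({"title"})),
]

# The showtimes source can only be called once a city is known.
print(usable_sources(CATALOG, "Playing", frozenset({"title"}), frozenset({"city"})))
print(usable_sources(CATALOG, "Playing", frozenset({"title"}), frozenset()))
```

Because the descriptions are plain data, adding a source means adding an entry to the catalog; the selection logic does not change.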
An example of a complex query
Find reviews of movies directed by Woody Allen that are playing in my area.
Answering it requires a join across three Web sites:
1. A movie site containing actor and director information (IMDB)
2. Movie playing sources (e.g., 777film.com)
3. Movie review sites (e.g., a newspaper)
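The three-way join above can be sketched with small in-memory tables standing in for the three sites. All rows are invented placeholders, and the function name is hypothetical; a real system would fetch these tuples from the sources at run time.

```python
# Placeholder data standing in for the three sources (invented rows).
movies = [  # movie site: (title, director)
    ("Film A", "Woody Allen"),
    ("Film B", "Another Director"),
    ("Film C", "Woody Allen"),
]
playing = [  # movie playing source: (title, city)
    ("Film A", "Seattle"),
    ("Film B", "Seattle"),
    ("Film C", "Boston"),
]
reviews = [  # review site: (title, review text)
    ("Film A", "placeholder review 1"),
    ("Film A", "placeholder review 2"),
    ("Film C", "placeholder review 3"),
]


def reviews_of_local_movies(director, city):
    """Join the three sources on title: movies by `director`,
    playing in `city`, together with their reviews."""
    directed = {title for title, d in movies if d == director}
    local = {title for title, c in playing if c == city}
    titles = directed & local
    return [(title, text) for title, text in reviews if title in titles]


print(reviews_of_local_movies("Woody Allen", "Seattle"))
```

The user states one query; the join order and the choice of sources are left to the system.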
Data integration architecture (figure)
Design time: Mediated Schema, Semantic mappings
Run time: query reformulation, optimization & execution
Data sources are accessed through wrappers (five shown in the figure)
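A rough sketch of how the pieces of this architecture might fit together: wrappers translate source records into mediated-schema tuples, and a mediator routes a query to the registered wrappers. The class and method names are hypothetical, and the optimization step is omitted; this is an illustration of the layering, not the system's implementation.

```python
class Wrapper:
    """Adapts one source: fetches native records and renames their
    fields to mediated-schema attribute names."""

    def __init__(self, fetch, field_map):
        self.fetch = fetch          # callable returning a list of dicts in source format
        self.field_map = field_map  # source field name -> mediated attribute name

    def tuples(self):
        return [
            {self.field_map[k]: v for k, v in record.items() if k in self.field_map}
            for record in self.fetch()
        ]


class Mediator:
    def __init__(self):
        # Design time: which wrappers serve which mediated relation.
        self.wrappers = {}

    def register(self, relation, wrapper):
        self.wrappers.setdefault(relation, []).append(wrapper)

    def query(self, relation, predicate):
        # Run time: reformulate (pick the wrappers for the relation),
        # then execute and filter. No optimization in this sketch.
        plan = self.wrappers.get(relation, [])
        return [t for w in plan for t in w.tuples() if predicate(t)]


mediator = Mediator()
mediator.register("Movie", Wrapper(
    lambda: [{"name": "Film A", "dir": "Woody Allen", "internal_id": 1}],
    {"name": "title", "dir": "director"},
))
mediator.register("Movie", Wrapper(
    lambda: [{"t": "Film B", "d": "Another Director"}],
    {"t": "title", "d": "director"},
))
print(mediator.query("Movie", lambda t: t["director"] == "Woody Allen"))
```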
Semantic Mappings
Mediated Schema
CD: ASIN, Title, Genre, …
Artist: ASIN, name, …
Mapping logic
Information sources
Books: Title, ISBN, Price, DiscountPrice, Edition
Authors: ISBN, FirstName, LastName
BookCategories: ISBN, Category
CDs: Album, ASIN, Price, DiscountPrice, Studio
Artists: ASIN, ArtistName, GroupName
CDCategories: ASIN, Category
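One way to read the mapping logic is as a view that builds mediated-schema tuples from the source tables. The sketch below assumes that Album corresponds to Title, Category to Genre, and ArtistName to name; these correspondences are guesses from the attribute names, and the function names and sample rows are invented.

```python
def cd_view(cds, cd_categories):
    """Build mediated CD(ASIN, Title, Genre) tuples by joining the
    CDs and CDCategories source tables on ASIN."""
    genres_by_asin = {}
    for row in cd_categories:
        genres_by_asin.setdefault(row["ASIN"], []).append(row["Category"])
    return [
        {"ASIN": cd["ASIN"], "Title": cd["Album"], "Genre": genre}
        for cd in cds
        for genre in genres_by_asin.get(cd["ASIN"], [])
    ]


def artist_view(artists):
    """Build mediated Artist(ASIN, name) tuples from the Artists source table."""
    return [{"ASIN": a["ASIN"], "name": a["ArtistName"]} for a in artists]


# Invented sample rows in the source schemas.
cds = [{"Album": "Album X", "ASIN": "A1", "Price": 15.0,
        "DiscountPrice": 12.0, "Studio": "Studio S"}]
cd_categories = [{"ASIN": "A1", "Category": "Jazz"}]
artists = [{"ASIN": "A1", "ArtistName": "Artist Y", "GroupName": "Group Z"}]

print(cd_view(cds, cd_categories))
print(artist_view(artists))
```

Source-only attributes such as Price and Studio are simply not exposed unless the mediated schema declares them.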