MSCIT 5210/MSCBD 5002: Knowledge Discovery and Data Mining Acknowledgement: Slides modified by dr Lei chen based on the slides provided by jiawei Han micheline Kamber and Jian pei @2012 Han, Kamber pei. all rights reserved
1 1 MSCIT 5210/MSCBD 5002: Knowledge Discovery and Data Mining Acknowledgement: Slides modified by Dr. Lei Chen based on the slides provided by Jiawei Han, Micheline Kamber, and Jian Pei © 2012 Han, Kamber & Pei. All rights reserved
Chapter 4: Data Warehousing, On-line Analytical Processing and Data Cube Data Warehouse Basic Concepts a Data Warehouse Modeling: Data Cube and OLAP Data Cube Computation: Preliminary Concepts Data Cube Computation Methods Summary
2 Chapter 4: Data Warehousing, On-line Analytical Processing and Data Cube ◼ Data Warehouse: Basic Concepts ◼ Data Warehouse Modeling: Data Cube and OLAP ◼ Data Cube Computation: Preliminary Concepts ◼ Data Cube Computation Methods ◼ Summary
Aspects of SQL Most common Query language -used in all commercial systems Discussion is based on the SQL92 Standard Commercial products have different features of SQL, but the basic structure is the same Data Manipulation Language Data Definition Language Constraint Specification Embedded SQL Transaction Management Security Management COMP3311 Fall 2011 CSE, HKUST Slide 3
COMP3311 Fall 2011 CSE, HKUST Slide 3 Aspects of SQL ▪ Most common Query Language – used in all commercial systems • Discussion is based on the SQL92 Standard. Commercial products have different features of SQL, but the basic structure is the same ▪ Data Manipulation Language ▪ Data Definition Language ▪ Constraint Specification ▪ Embedded SQL ▪ Transaction Management ▪ Security Management
Basic structure SQL is based on set and relational operations with certain modifications and enhancements a typical SQL query has the form select a1,A2,…,An from r1,R,…R where p A represent attributes R represent relations P is a predicate This query is equivalent to the relational algebra expression: A1.2.An(op(R1×R2×…×Rm) The result of an sQl query is a relation(but may contain duplicates). SQL statements can be nested COMP3311 Fall 2011 CSE, HKUST Slide 4
COMP3311 Fall 2011 CSE, HKUST Slide 4 Basic Structure • SQL is based on set and relational operations with certain modifications and enhancements • A typical SQL query has the form: select A1 , A2 , …, An from R1 , R2 , …, Rm where P - Ai represent attributes - Ri represent relations - P is a predicate. • This query is equivalent to the relational algebra expression: A1, A2, …, An(P (R1 R2 … Rm)) • The result of an SQL query is a relation (but may contain duplicates). SQL statements can be nested
Projection The select clause corresponds to the projection operation of the relational algebra. It is used to list the attributes desired in the result of a query Find the names of all branches in the loan relation select branch-name from loan Equivalent to: I lbranch-name (loan) An asterisk in the select clause denotes all attributes select from loan Note: for our examples we use the tables: Branch(branch-name, branch-city, assets) Customer(customer-name, customer-street, customer-city) Loan(loan- number, amount, branch-name) Account(account-number, balance branch-name Borrower(customer-name, loan-number) Depositor(customer-name account-number COMP3311 Fall 2011 CSE, HKUST Slide 5
COMP3311 Fall 2011 CSE, HKUST Slide 5 Projection • The select clause corresponds to the projection operation of the relational algebra. It is used to list the attributes desired in the result of a query. • Find the names of all branches in the loan relation select branch-name from loan Equivalent to: branch-name(loan) • An asterisk in the select clause denotes “all attributes” select * from loan • Note: for our examples we use the tables: – Branch (branch-name, branch-city, assets) – Customer (customer-name, customer-street, customer-city) – Loan (loan-number, amount, branch-name) – Account (account-number, balance, branch-name) – Borrower (customer-name, loan-number) – Depositor (customer-name, account-number)