PRO_INFO suppler_name, product_name, Evans 60099 UP-_ LOC ( suppller_name, location Marti Evans FIGURE 94.4 Decomposition of PRODUCT into PRO_INFO and SUP_LOC. SUPPLIES(clientname, supplier_name, location Krad Martin hengru Greene Evans FIGURE 94.5 Instance of suPPlIeS UP-_CLI( client_name, supplier_name SUP- LOC (supplier_name, location Shangyu Martin Metairie Rudd FIGURE 94.6 mposition of SUPPLIES into SUP_CLI and SUP_LOC. The relation of Fig. 94.5, which is in 2NF, highlights update anomalies and data redundancy due to the transitive dependency of a nonprime attribute on a key. The relation gives the name of a client(client_name) the corresponding supplier(supplier_name), and the supplier's location. Each client is assumed to have one supplier. The relations key is client_name, and each supplier has only one location. A supplier and his location cannot be inserted in SUPPLIES unless the supplier has at least one client. In addition, the relation ha deletion anomaly since if Tillis is no longer a client of Rudd, the information about Rudd as a supplier and his location is lost. A change to a supplier's location may updating the location attribute name of several tuples in the relation. Also, although each supplier has only one location, such a location is sometimes repeated several time unnecessarily, leading to data redundancy. The relation exhibits the following transitive dependency: client_name-supplier_name, supplier_name >location(but not the inverse). The relation CLIENT is clearly in 2NF, but because of the transitive dependency of the nonprime attribute location on the key, it is not in 3NE. This is the cause of the anomalies mentioned above. Eliminating this transitive dependency by splitting the schema into two components will remove these anomalies. Clearly, the resulting two relations SUP_CLI and SUP_LOC are in 3NF(see Fig 94.6 Each partial dependency of a nonprime attribute on a key can be expressed as a transitive dependency of a onprime attribute on a key. Therefore, a scheme in NF is also in 2NF BCNF is a stricter form of 3Nf where a relation r on a schema r is in bcnf if whenever a functional dependency X-Yexists in r(R), then X is a superkey of R. The condition of 3NE, which allows Y to be prime if X is not a superkey, does not exist in BCNE. Thus, every scheme in BCNF is also in 3NF, but the opposite is not always true c2000 by CRC Press LLC
© 2000 by CRC Press LLC The relation of Fig. 94.5, which is in 2NF, highlights update anomalies and data redundancy due to the transitive dependency of a nonprime attribute on a key. The relation gives the name of a client (client_name), the corresponding supplier (supplier_name), and the supplier’s location. Each client is assumed to have one supplier. The relation’s key is client_name, and each supplier has only one location. A supplier and his location cannot be inserted in SUPPLIES unless the supplier has at least one client. In addition, the relation has a deletion anomaly since if Tillis is no longer a client of Rudd, the information about Rudd as a supplier and his location is lost. A change to a supplier’s location may require updating the location attribute name of several tuples in the relation. Also, although each supplier has only one location, such a location is sometimes repeated several time unnecessarily, leading to data redundancy. The relation exhibits the following transitive dependency: client_name Æ supplier_name, supplier_name Æ location (but not the inverse). The relation CLIENT is clearly in 2NF, but because of the transitive dependency of the nonprime attribute location on the key, it is not in 3NF. This is the cause of the anomalies mentioned above. Eliminating this transitive dependency by splitting the schema into two components will remove these anomalies. Clearly, the resulting two relations SUP_CLI and SUP_LOC are in 3NF (see Fig. 94.6). Each partial dependency of a nonprime attribute on a key can be expressed as a transitive dependency of a nonprime attribute on a key. Therefore, a scheme in 3NF is also in 2NF. BCNF is a stricter form of 3NF, where a relation r on a schema R is in BCNF if whenever a functional dependency X Æ Y exists in r(R), then X is a superkey of R. The condition of 3NF, which allows Y to be prime if X is not a superkey, does not exist in BCNF. Thus, every scheme in BCNF is also in 3NF, but the opposite is not always true. FIGURE 94.4 Decomposition of PRODUCT into PRO_INFO and SUP_LOC. FIGURE 94.5 Instance of SUPPLIES. FIGURE 94.6 Decomposition of SUPPLIES into SUP_CLI and SUP_LOC
detailed discussion of higher level normalizations, such as 4NF and 5NF, which are based on other forms of dependencies, can be found in [Elmasri and Navathe, 1994 Data Definition and Manipulation in Relational Databases Upon completion of the relational database design, a descriptive language, usually referred to as Data Definition Language(DDL), is used to define the designed schemes and their relationships. The dDl can be used to create w schemes or modify existing ones, but it cannot be used to query the database. Once DDL statements are compiled, they are stored in the data dictionary. a data dictionary is a repository where information about database schemas, such as attribute names, indexes, and integrity constraints are stored. Data dictionaries also ontain other information about databases, such as design decisions, usage standards, application program descriptions, and user information. During the processing of a query, the DBMS usually checks the data dictionary. The data dictionary can be seen as a relational database of its own. As a result, data manipulation languages that are used to manipulate databases can also be used to query the data dictionary An important function of a DBMS is to provide a Data Manipulation Language(DML) with which a user can retrieve, change, insert, and delete data from the database DMls are classified into two type and nonprocedural. The main difference between the two types is that in procedural DMLs, a user has to specify the desired data and how to obtain it, while in nonprocedural DMLs, a user has only to describe the desired data. Because they impose less burden on the user, nonprocedural DMls are normally easier to learn and use The component of a DMl that deals with data retrieval is referred to as query language. A an be used interactively in a stand-alone manner, or it can be embedded in a general-purpose programming language such as C and Cobol. One of the most popular query languages is SQL (Structured Query Language). SQL is a query language based to a large extent on Codds relational algebra. SQL has additional features for data definition and update. Therefore, SQL is a comprehensive relational database language that includes both a DDL and DML. QL includes the following commands for data definition: CREATE TABLE, DROP TABLE, and ALTER TABLE. The CREATE TABlE is used to create and describe a new relation the two relations of fig 94.4 can be created in the following manner: CREATE TABLE PRO INFo supplier name VARCHAR(12 NOT NULL, product name VARCHAR (8 NOT NULL price DECIMAL (, 2))i CREATE TABLE SUP LOC supplier name VARCHAR(12 NOT NULL, VARCHAR(10)); The CREATE TABLE command specifies all the attribute names of a relation and their data types(e.g INTEGER, DECIMAL, fixed length character"CHAR", variable length character"VARCHAR, DATE).The constraint NOT NULL is usually specified for those attributes that cannot have null values. The primary key of each relation in the database is usually required to have a nonnull value. If a relation is created incorrectly, it can be deleted using the DROP TABLE command. The command is DROP TABLE followed by the name of the relation to be deleted. A variation of DROP command, DROP SCHEMA, is used if the whole schema is no longer needed. The ALTER TABLE is used to add new attribute names to an existing relation, as follows: ALTER TABLE SUP LOC ADD zip code CHAR(5); The SUP_LOC relation now contains an extra attribute name, zip_code. In most DBMSs, the zip of existing tuples will automatically be assigned a null value. Other DBMSs allow for the assignment of initial value to a newly added attribute name. Also, definitions of attributes can be changed and new constraints can be added, or current constraints can be dropped. The DML component of SQL has one basic query statement, sometimes called a mapping, that has the following structure SELECT <attribute name list> FROM <relation list> WHERE <restriction> e 2000 by CRC Press LLC
© 2000 by CRC Press LLC A detailed discussion of higher level normalizations, such as 4NF and 5NF, which are based on other forms of dependencies, can be found in [Elmasri and Navathe, 1994]. Data Definition and Manipulation in Relational Databases Upon completion of the relational database design, a descriptive language, usually referred to as Data Definition Language (DDL), is used to define the designed schemes and their relationships. The DDL can be used to create new schemes or modify existing ones, but it cannot be used to query the database. Once DDL statements are compiled, they are stored in the data dictionary. A data dictionary is a repository where information about database schemas, such as attribute names, indexes, and integrity constraints are stored. Data dictionaries also contain other information about databases, such as design decisions, usage standards, application program descriptions, and user information. During the processing of a query, the DBMS usually checks the data dictionary. The data dictionary can be seen as a relational database of its own. As a result, data manipulation languages that are used to manipulate databases can also be used to query the data dictionary. An important function of a DBMS is to provide a Data Manipulation Language (DML) with which a user can retrieve, change, insert, and delete data from the database. DMLs are classified into two types: procedural and nonprocedural. The main difference between the two types is that in procedural DMLs, a user has to specify the desired data and how to obtain it, while in nonprocedural DMLs, a user has only to describe the desired data. Because they impose less burden on the user, nonprocedural DMLs are normally easier to learn and use. The component of a DML that deals with data retrieval is referred to as query language. A query language can be used interactively in a stand-alone manner, or it can be embedded in a general-purpose programming language such as C and Cobol. One of the most popular query languages is SQL (Structured Query Language). SQL is a query language based to a large extent on Codd’s relational algebra. SQL has additional features for data definition and update. Therefore, SQL is a comprehensive relational database language that includes both a DDL and DML. SQL includes the following commands for data definition: CREATE TABLE, DROP TABLE, and ALTER TABLE. The CREATE TABLE is used to create and describe a new relation. The two relations of Fig. 94.4 can be created in the following manner: CREATE TABLE PRO_INFO (supplier_name VARCHAR(12) NOT NULL, product_name VARCHAR(8) NOT NULL, price DECIMAL(6,2)); CREATE TABLE SUP_LOC ( supplier_name VARCHAR(12) NOT NULL, location VARCHAR(10)); The CREATE TABLE command specifies all the attribute names of a relation and their data types (e.g., INTEGER, DECIMAL, fixed length character “CHAR”, variable length character “VARCHAR”, DATE). The constraint NOT NULL is usually specified for those attributes that cannot have null values. The primary key of each relation in the database is usually required to have a nonnull value. If a relation is created incorrectly, it can be deleted using the DROP TABLE command. The command is DROP TABLE followed by the name of the relation to be deleted. A variation of DROP command, DROP SCHEMA, is used if the whole schema is no longer needed. The ALTER TABLE is used to add new attribute names to an existing relation, as follows: ALTER TABLE SUP_LOC ADD zip_code CHAR(5); The SUP_LOC relation now contains an extra attribute name, zip_code. In most DBMSs, the zip_code value of existing tuples will automatically be assigned a null value. Other DBMSs allow for the assignment of an initial value to a newly added attribute name. Also, definitions of attributes can be changed and new constraints can be added, or current constraints can be dropped. The DML component of SQL has one basic query statement, sometimes called a mapping, that has the following structure: SELECT <attribute_name list> FROM <relation_list> WHERE <restriction>