Translation of ER Model to Relational Schema
Remember, an ER Model is intended as a description of real-world entities. Although it is constructed in such a way as to allow easy translation to the relational model, this is not an entirely trivial process. The ER diagram represents the conceptual level of database design. A relational schema is at the logical level of database design.
The textbook (Connolly and Begg) devotes an entire chapter to this process of translation, and we spent a couple of sessions in class going over it. Before attempting the second assignment or completing this part of the project you should re-read chapter 8 (pages 242-267). Chapter 11 (pages 327-351) is a worked example of this process.
The notes below therefore focus on two supplemental areas. One is to summarize the key points. Another is to point out places where the methodology given in the textbooks differs from what will be required in this course.
The textbook tells you to redraw your conceptual ERM in a simplified form as a Logical Data Model. This uses ER notation, but employs only the limited set of conventions that map directly onto the relational model. Then you can translate this logical data model directly to a relational schema. You will find this a helpful thing to do – at least for the parts of the diagram that have to change. However, there is no need to hand in this logical data model. Just use it for your own convenience.
What you should hand in is a relational schema in the format used on pages 250 and 251. This looks like the following:
Relation_Name
(Attribute_1, Attribute_2, …. Attribute_N)
Primary_Key
(Key Attribute_1) <– remember, the key may be compound!
Foreign_Key(Attribute_In_This_Relation)
references Other_Relation.Attribute_In_Other_Table
The book spends some time discussing how to turn
a bunch of different local logical data models into a single global logical
data model. We are not covering this in the course, and you are now being made
to implement it in the projects or assignments. This technique fills pages
260-264 in chapter 8 and much of chapter 11. However, it is something you would
have to do in any real database project of appreciable scope, so you may wish
to study this material.
(NB: For clarity, to avoid phrases like “related relation” and “relationship between two relations” I am using the phrase “table” below – even though relation is technically appropriate until you reach the stage of actually implementing the schema in Oracle. For consistency I also use “column” rather than attribute. Be aware that questions in the final will require use of the correct (formal) terms. If you would like the pure version, run a search and replace!)
An entity type turns into a table. Give it a sensible name – avoid Oracle reserved words. Remember that Oracle does not allow spaces in names and is not case sensitive. Follow a consistent set of conventions.
Each attribute turns into a column in the table. Keep the attribute domains handy – you will need them later when you implement the schema in Oracle.
The primary key of the entity is the primary key of the table. It can be composite if required. It can never be null.
There is no such thing as a multi-valued attribute in a relational database. If you have a multi-valued attribute, take the attribute and turn it into a new entity of its own. Then make a 1:M relationship between the new entity and the existing one. Then convert as normal.
This is harder. Remember, to truly represent a relationship you need to make sure that the appropriate foreign keys are in place in the schema to allow joins between tables. However, every time you construct a new query involving these tables you will need to specify the appropriate join conditions. Even if you define foreign keys, Oracle will not figure out the joins for you. This makes the relational model very flexible, but also makes queries hard to write for the novice.
All relationships turn into foreign keys. Some of them also turn into an extra table. They may also turn into “not null” constraints.
To represent a 1:M relationship, take the primary key of the table on the “1” side and insert it as a foreign key into the table on the “M” side. This is the most basic use of a foreign key. It may make sense to rename the foreign key to reflect its relationship to the table you are inserting it into.
To represent a 1:1 relationship you have a choice. Ask yourself whether it makes more sense to leave this as two separate tables, or to join them together to make one big table. This will depend on whether records will usually exist in both tables, how data will be accesses, if joins will be made constantly or only occasionally when data is queried, etc. If the relationship is mandatory on both sides, there is probably little point in representing it as two tables.
Assuming you chose to retain two tables for a 1:1 relationship, you must chose which table will receive the primary key of the other as a foreign key. If the relationship is mandatory for one entity but not the other, the put foreign key into the table for which participation is mandatory. If it is mandatory for neither, ask which is more likely to exist or to be queried. Put the foreign key in the table that will be accessed less frequently – this minimizes the number of joins required when querying.
If the 1:1 or 1:M relationship is mandatory, you will need a “not null” constraint on the foreign key. If the foreign key can never be null, this means it must always refer to a row in the related table – and therefore means that a row cannot exist in this table without a related row on the other side of the relationship.
A weak entity will have a foreign key as part of its own primary key. If it has a 1:M relationship with the strong entity that defines it, it will also have a partial key of its own, to tell it apart from other weak entities associated with the same strong entity. (For example, different instances of the same flight number). This partial key, together with the primary key of the defining entity inserted as a foreign key, together make up the primary key of the table for the weak entity.
There is no direct representation of a M:N relationship in the relational model. You will need to turn each M:N relationship between two entities into a separate relation (table) of its own. This relation will usually have as its own primary key the combination of two foreign keys – each of these will be the primary key of one of the relations involved in this relationship. (See page 245 for this technique).
If a relationship has attributes then they need to go into a table. Where to put them depends on the type of the relationship. In a 1:1 or 1:M relationship, put them the same place the foreign key goes (on the M side in 1:M). In a M:N relationship, put them in the new table you create for the relationship.
I advise you to avoid n-ary relationships in the first place (a single relationship including three or more entities). They can usually be better represented by using an additional entity and a set of binary relationships. Every n-ary relationship so far handed into me in an exam or assignment has been produced in error! (See page 245-6 for details). If you had an n-ary relationship on your ERM then follow these steps to simplify them – then translate the simpler relationships and the new entity as normal.
Generalization and specialization allow you to show that some entities in a particular entity type are also examples of more specialized subtypes, and have different attributes from other entities of the same overall entity type. The techniques to use are discussed on page 253.
You cannot exactly do this in a relational schema. However, you have a choice. You can either model the supertype and all its subtypes as a single table (and leave the null values that don’t apply), or you can turn each subtype into its own table and set up 1:1 relationships between them.
The “single big table” approach is usually simpler. Which of these is the appropriate strategy depends on many factors: will the subtype attributes be needed in most queries, or hardly ever? Will most entities be part of one or more subtypes, or are subtypes quite rare? These factors influence how many additional joins would be demanded to query the additional tables. You would also consider how big the overall table was likely to be, relative to the available resources. If the entire table would only include a few thousand rows, the additional time and space involved in holding a few columns with null values would be much less consequential than if it held hundreds of millions of rows, and entire drives might be needed to store the extra data.
Either approach will receive full credit in this class if correctly implemented.
To implement as multiple tables: create a table for each subtype. Each will have a 1:1 relationship with the super-type. This means that you insert the primary key of the super-class as a foreign key for each of the sub-class tables. This will be the entire primary key of each new table.
To implement as a single big table, add the extra attributes from each of the subtype entities to the table representing the supertype. They will remain null in rows that do not represent an entity of that subtype. However, you will also need at least one additional column to record what type this entity is. If the relationship is disjoint (d) then a single “type” column will be fine – just store a simple code here. If the relationship is (o) then you will need a column for each and every subtype. Say the subtypes are “pilot” and “attendant” then you might call one of these fields “is_attendant_flag” and the other “is_pilot_flag”. These would hold true or false values according to whether this row represented someone in the appropriate subtype.