Translation of ER Model to Relational Schema
Remember, an ER Model is intended as a description of real-world entities. Although it is constructed in such a way as to allow easy translation to the relational model, this is not an entirely trivial process. The ER diagram represents the conceptual level of database design. A relational schema is at the logical level of database design.
The textbook (Connolly and Begg) devotes an entire chapter to this process of translation, and we spent a couple of sessions in class going over it. Before attempting the second assignment or completing this part of the project you should re-read chapter 8 (pages 242-267). Chapter 11 (pages 327-351) is a worked example of this process.
The notes below therefore focus on two supplemental areas. One is to summarize the key points. Another is to point out places where the methodology given in the textbooks differs from what will be required in this course.
The textbook tells you to redraw your conceptual
ERM in a simplified form as a Logical Data Model. This uses ER notation, but
employs only the limited set of conventions that map directly onto the
relational model. Then you can translate this logical data model directly to a
relational schema. You will find this a helpful thing to do – at least for the
parts of the diagram that have to change. However, there is no need to hand in
this logical data model. Just use it for your own convenience.
What
you should hand in is a relational schema in the format used on pages 250 and
251. This looks like the following:
Relation_Name
(Attribute_1, Attribute_2, …. Attribute_N)
Primary_Key
(Key Attribute_1) <– remember, the key may be compound!
Foreign_Key(Attribute_In_This_Relation)
references Other_Relation.Attribute_In_Other_Table
The book spends some time discussing how to turn
a bunch of different local logical data models into a single global logical
data model. We are not covering this in the course, and you are now being made
to implement it in the projects or assignments. This technique fills pages
260-264 in chapter 8 and much of chapter 11. However, it is something you would
have to do in any real database project of appreciable scope, so you may wish
to study this material.
(NB: For clarity, to avoid phrases like “related relation” and “relationship between two relations” I am using the phrase “table” below – even though relation is technically appropriate until you reach the stage of actually implementing the schema in Oracle. For consistency I also use “column” rather than attribute. Be aware that questions in the final will require use of the correct (formal) terms. If you would like the pure version, run a search and replace!)
An entity type turns into a table. Give
it a sensible name – avoid Oracle reserved words. Remember that Oracle does not
allow spaces in names and is not case sensitive. Follow a consistent set of
conventions.
Each attribute turns into a column in the
table. Keep the attribute domains handy – you will need them later when you
implement the schema in Oracle.
The primary key of the entity is the primary
key of the table. It can be composite if required. It can never be null.
There is no such thing as a multi-valued
attribute in a relational database. If you have a multi-valued attribute,
take the attribute and turn it into a new entity of its own. Then make a
1:M relationship between the new entity and the existing one. Then convert as
normal.
This is harder. Remember, to truly represent a relationship you need to make sure that the appropriate foreign keys are in place in the schema to allow joins between tables. However, every time you construct a new query involving these tables you will need to specify the appropriate join conditions. Even if you define foreign keys, Oracle will not figure out the joins for you. This makes the relational model very flexible, but also makes queries hard to write for the novice.
All relationships turn into foreign keys. Some
of them also turn into an extra table. They may also turn into “not null”
constraints.
To represent a 1:M relationship, take the
primary key of the table on the “1” side and insert it as a foreign key into
the table on the “M” side. This is the most basic use of a foreign key. It may
make sense to rename the foreign key to reflect its relationship to the table
you are inserting it into.
To represent a 1:1 relationship you have
a choice. Ask yourself whether it makes more sense to leave this as two
separate tables, or to join them together to make one big table. This will
depend on whether records will usually exist in both tables, how data will be
accesses, if joins will be made constantly or only occasionally when data is
queried, etc. If the relationship is mandatory on both sides, there is probably
little point in representing it as two tables.
Assuming you chose to retain two tables for a 1:1
relationship, you must chose which table will receive the primary key of
the other as a foreign key. If the relationship is mandatory for one entity but
not the other, the put foreign key into the table for which participation is
mandatory. If it is mandatory for neither, ask which is more likely to exist or
to be queried. Put the foreign key in the table that will be accessed less
frequently – this minimizes the number of joins required when querying.
If the 1:1 or 1:M relationship is mandatory,
you will need a “not null” constraint on the foreign key. If the foreign
key can never be null, this means it must always refer to a row in the related
table – and therefore means that a row cannot exist in this table without a
related row on the other side of the relationship.
A weak entity will have a foreign key as part
of its own primary key. If it has a 1:M relationship with the strong entity
that defines it, it will also have a partial key of its own, to tell it apart
from other weak entities associated with the same strong entity. (For example,
different instances of the same flight number). This partial key, together with
the primary key of the defining entity inserted as a foreign key, together make
up the primary key of the table for the weak entity.
There is no direct representation of a M:N
relationship in the relational model. You will need to turn each M:N
relationship between two entities into a separate relation (table) of its own. This
relation will usually have as its own primary key the combination of two
foreign keys – each of these will be the primary key of one of the relations
involved in this relationship. (See page 245 for this technique).
If a relationship has attributes then they need
to go into a table. Where to put them depends on the type of the relationship.
In a 1:1 or 1:M relationship, put them the same place the foreign key goes (on
the M side in 1:M). In a M:N relationship, put them in the new table you create
for the relationship.
I advise you to avoid n-ary relationships
in the first place (a single relationship including three or more entities).
They can usually be better represented by using an additional entity and a set
of binary relationships. Every n-ary relationship so far handed into me in an
exam or assignment has been produced in error! (See page 245-6 for details). If
you had an n-ary relationship on your ERM then follow these steps to simplify
them – then translate the simpler relationships and the new entity as normal.
Generalization and specialization allow you to
show that some entities in a particular entity type are also examples of more
specialized subtypes, and have different attributes from other entities of the
same overall entity type. The techniques to use are discussed on page 253.
You cannot exactly do this in a relational
schema. However, you have a choice. You can either model the supertype and
all its subtypes as a single table (and leave the null values that don’t
apply), or you can turn each subtype into its own table and set up 1:1
relationships between them.
The “single big table” approach is usually
simpler. Which of these is the appropriate strategy depends on many factors:
will the subtype attributes be needed in most queries, or hardly ever? Will
most entities be part of one or more subtypes, or are subtypes quite rare?
These factors influence how many additional joins would be demanded to query
the additional tables. You would also consider how big the overall table was
likely to be, relative to the available resources. If the entire table would
only include a few thousand rows, the additional time and space involved in
holding a few columns with null values would be much less consequential than if
it held hundreds of millions of rows, and entire drives might be needed to
store the extra data.
Either approach will receive full credit in this
class if correctly implemented.
To implement as multiple tables: create a
table for each subtype. Each will have a 1:1 relationship with the super-type.
This means that you insert the primary key of the super-class as a foreign key
for each of the sub-class tables. This will be the entire primary key of each
new table.
To implement as a single big table, add
the extra attributes from each of the subtype entities to the table
representing the supertype. They will remain null in rows that do not represent
an entity of that subtype. However, you will also need at least one additional
column to record what type this entity is. If the relationship is disjoint (d)
then a single “type” column will be fine – just store a simple code here. If
the relationship is (o) then you will need a column for each and every subtype.
Say the subtypes are “pilot” and “attendant” then you might call one of these
fields “is_attendant_flag” and the other “is_pilot_flag”. These would hold true
or false values according to whether this row represented someone in the
appropriate subtype.