Entity Relational Diagram

Conceptual Database Design

Conceptual database design involves modeling the collected information at a high-level of abstraction without using a particular data model or DBMS. Since conceptual database design occurs independently from a particular DBMS or data model, we need high-level modeling languages to perform conceptual design. The entity-relationship (ER) model was originally proposed by Peter Chen in 1976 for conceptual design. We will perform ER modeling using Unified Modeling Language (UML) syntax.

Entity-Relationship Modeling

Entity-relationship modeling is a top-down approach to database design that models the data as entities, attributes, and relationships. The ER model refines entities and relationships by including properties of entities and relationships called attributes, and by defining constraints on entities, relationships, and attributes. The ER model conveys knowledge at a high-level (conceptual level) which is suitable for interaction with technical and non- technical users. Since the ER model is data model independent, it can later be converted into the desired logical model (e.g. relational model).

Example Relation Instances

Emp Relation
WorksOn Relation

Entity Types

A set of objects with the same qualities that are designated as having their own existence is referred to as an entity type.
The ER model’s primary notion is an entity type, which represents a collection of real-world objects with attributes.
It’s worth noting that an entity type doesn’t have to be a tangible real-world thing like a person or department; it might be be an abstract idea like a project or job. An entity instance is a specific instance or occurrence of a given entity type.
Employee is an example of an entity type. ‘E1 – John Doe’ is an entity instance. A set of entity instances is referred to as an entity set.

Representing Entity Types

Rectangles with the name of the entity type inside them are used to represent entity kinds.

The name of an entity type is usually a singular noun. That is, use Person rather than People, Project rather than Projects, and so on. Each word in the entity name has its initial letter capitalized.

Relationship Types

A collection of relationships between entity types is referred to as a relationship type. Each form of relationship has a name that reflects what it does. A connection instance is a specific instance of a relationship type that links entity instances together. WorksOn, for example, is a relationship type. For example, ‘E1’ may work on project ‘P1’ or vice versa (E1,P1). A set of relationship instances is referred to as a relationship set. Between two entity types, there can be several relationships.

Representing Relationship Types

A designated edge between the two entity types represents the relationship type. The label is only used once in a single direction. As a result, an arrow shows the proper reading direction.

EmployeeWorksOn >Project
   
  • A relationship type name is normally a verb or verb phrase.
  • The first letter of each word in the name is capitalized.
  • Do not put arrows on either end of the line.

Relationship Degree

The number of entity types involved in a relationship is the degree of the relationship. WorksOn, for example, is a two-degree connection since the two participating entity types are Employee and Project.
Note that this is not the same as relation degree, which refers to the number of characteristics in a relationship. Degree two relationships are binary, degree three relationships are ternary, and degree four relationships are quaternary. The term n-ary refers to relationships of any degree N. Use a diamond to illustrate connections with more than two degrees.

Recursive Relationship

A recursive connection is one in which the same entity type engages in many roles at the same time. An employee, for example, has a supervisor. The supervisor is an employee as well. There is a name for each position.

A recursive connection is one in which the same entity type engages in many roles at the same time. An employee, for example, has a supervisor. The supervisor is an employee as well. There is a name for each position.

Attributes

An attribute is a property of an entity or a relationship type. For instance, entity type is a property of an entity. Employees have qualities such as name, pay, and title.

Some guidelines: attribute names should start with a lower case letter, and each attribute has a domain, which is the set of values that can be assigned to the attribute.
Different attributes can share the same domain, but each attribute can only have one domain.

Simple and Complex Attributes

If an attribute comprises only one component with its own existence, it is called a simple attribute.
Salary, for instance, is a straightforward characteristic.
Atomic attributes are simple traits that are frequently used in programming.

If an attribute has many components, each having their own existence, it is called a composite attribute.

For instance, because it consists of street, city, and state components, an address is a complicated property (subattributes).

Single and Multi-valued Attributes

An attribute is a single-valued attribute if it consists of a single value for each entity instance.

  • For example, salary is a single-valued attribute.

An attribute is a multi-valued attribute if it may have multiple values for a single entity instance.

  • For example, a telephone number attribute for a person may be multivalued as people may have different phone numbers (home phone number, cell phone number, etc.)

A derived attribute is an attribute whose value is calculated from other attributes but is not physically stored.

  • The calculation may involve attributes within the entity type of the derived attribute and attributes in other entity types.

Keys

A candidate key is a minimal set of attributes that uniquely identifies each instance of an entity type.

  • For example, the number attribute uniquely identifies an Employee and is a candidate key for the Employee entity type.

A primary key is a candidate key that is selected to identify each instance of an entity type.

  • The primary key is chosen from a set of candidate keys. For instance, an employee may also have SSN as an attribute.  The primary key may be either SSN or number as both are candidate keys.

A composite key is a key that consists of two or more attributes.

  • For example, a course is uniquely identified only by the department code (COSC) and the course number within the department (304)

Representing Keys

In UML attributes are listed in the rectangle for their entity.

Tags are used to denote any special features of the attributes.

  • primary key: {PK}, partial primary key: {PPK}, alternate key:

{AK}

  • derived attribute:  /attributeName  (e.g. /totalEmp)
    • multi-valued attribute: attributeName [minVals..maxVals]
  • e.g. phoneNumber [1..3]

Attributes on Relationships

An attribute may be associated with a relationship type.

For example, the WorksOn relationship type has two attributes: responsibility and hours.

Note that these two attributes belong to the relationship and cannot belong to either of the two entities individually (as they would not exist without the relationship).

Relationship attributes are represented as a separate box connected to the relationship using a dotted line.

Relationship Cardinalities

Relationship cardinalities or multiplicities are used to restrict how entity types participate in relationships in order to model real-world constraints.

The multiplicity is the number of possible occurrences of an entity type that may relate to a single occurrence of an associated entity type through a particular relationship.

For binary relationships, there are three common types:

  • one-to-one (1:1)
    • one-to-many (1:* or 1:N)
    • many-to-many (*:* or N:M)

One-to-One Relationships

In a one-to-one relationship, each instance of an entity class E1 can be associated with at most one instance of another entity class E2 and vice versa.

Example: A department may have only one manager, and a manager may manage only one department.

One-to-Many Relationships

In a one-to-many relationship, each instance of an entity class E1 can be associated with more than one instance of another entity class E2. However, E2 can only be associated with at most one instance of entity class E1.

Example: A department may have multiple projects, but a project may have only one department.

Many-to-Many Relationships

In a many-to-many relationship, each instance of an entity class E1 can be associated with more than one instance of another entity class E2 and vice versa.

Example: An employee may work on multiple projects, and a project may have multiple employees working on it.

Participation Constraints

Cardinality is the maximum number of relationship instances for an entity participating in a relationship type.

Participation is the minimum number of relationship instances for an entity participating in a relationship type.

  • Participation can be optional (zero) or mandatory (1 or more).

If an entity’s participation in a relationship is mandatory (also called total participation), then the entity’s existence depends on the relationship.

  • Called an existence dependency.

Example: A project is associated with one department, and a department may have zero or more projects.

Multiplicity of Non-Binary Relationships

The multiplicity in a complex relationship of an entity type is the number of possible occurrences of that entity-type in the n-ary relationship when the other (n-1) values are fixed.

Example: A supplier may provide zero or more parts to a project.  A project may have zero or more suppliers, and each supplier may provide to the project zero or more parts.

Challenges

The participation constraint for N-ary relationships (minimum cardinality) is ambiguous in UML notation. Example:

  • Constraint: A project has at least 1 supplier per part.
    • Initial idea: Multiplicity 1..* beside Supplier should force their to be a Supplier for each Part/Project combination.
    • Problem: Two different interpretations of Part/Project combination results in unforeseen consequences:

Actual tuple: All entities must always participate (as have actual tuples) so minimum values for all entities would be 1.

Potential tuple: 1 beside Supplier implies always a Supplier for every Part/Project combination. Not true in practice.

Bottom line: We will avoid problem of specifying participation constraints for N-ary relationships. One way to avoid it is convert a relationship into an entity with N binary relationships.

Strong and Weak Entity Types

A strong entity type is an entity type whose existence dependent on another entity type.

  • A strong entity type always has a primary key of its own attributes that uniquely identifies its instances.

is not

A weak entity type is an entity type whose existence is dependent on another entity type.

  • A weak entity type does not have a set of its own attributes that uniquely identifies its instances.

A common example of strong and weak entity types are employees and their dependents:

  • An employee is a strong entity because it has an employee number to identify its instances.
    • A dependent (child) is a weak entity because the database does not store a key for each child, but rather they are identified by the parent’s employee number and their name.

Problems with ER Models

When modeling a Universe of Discourse (UoD) using ER models, there are several challenges that you have.

The first, basic challenge is knowing when to model a concept as an entity, a relationship, or an attribute.

In general:

  • Entities are nouns.

You should be able to identify a set of key attributes for an entity.

  • Attributes are properties and may be nouns or adjectives.

Use an attribute if it relates to one entity and does not have its own key.

Use an entity if the concept may be shared by entities and has a key.

  • Relationships should generally be binary.

Note that non-binary relationships can be modeled as an entity instead.

Modelling Traps

There are several different “modeling traps” that you can fall into when designing your ER model.

Two connection traps that we will look at are:

  • Fan traps
  • Chasm traps

Fan Traps

A fan trap is when a model represents a relationship between entity types, but the pathway between certain entity instances is ambiguous.

  • Often occurs when two or more one-to-many relationships fan out (come from) the same entity type.

Example: A department has multiple employees, a department has multiple projects, and each project has multiple employees.

Chasm Traps

A chasm trap occurs when a model suggests that a relationship between entity types should be present, but the relationship does not actually exist.(missing relationship)

  • May occur when there is a path of optional relationships between entities.

Example: A department has multiple employees, a department has multiple projects, and each project has multiple employees.

Good Design Pracrices

When designing ER models, there are several things that you should consider:

  • Avoid redundancy – do not store the same fact more than once in the model.
  • Do not use an entity when you can use an attribute instead.
  • Limit the use of weak entity sets.

Weak Entity Sets

Avoid the use of weak entity sets in modeling. Although at first glance there are very few natural keys based on the properties of objects (name, age, etc.), there are many good human-made keys that can be used (SSN, student#, etc.)

Whenever possible use a global, human-made key instead of a weak entity. Note that sometimes a weak entity is required if no such global key exists.

  • For example, a database storing the students of multiple universities could not use a single student# as a key as it is unlikely that universities could agree on a global numbering system. Rather, student becomes a weak entity with partial key student# and identifying entity the university the student attends.

Simplifications

Two common relationship simplifications:

  • Many-to-many relationship –>Two one-to-many relationships
  • Higher order relationships  –> binary relationships

Conclusion

Conceptual design is performed at a high-level of abstraction involving entities, relationships, and attributes.

  • An entity type is a group of entities with the same properties.

Entities may be strong (have unique key) or weak (no unique key).

  • A relationship type is an association between entities.

A relationship may involve two or more entities and may be recursive.

A relationship has two types of constraints:

  • Participation – minimum # of times an entity must be involved in relationship
  • Cardinality – maximum # of times an entity can be involved in relationship

Common relationship multiplicities are: 1:1, 1:*, *:*.

  • Attributes are properties of entities or relationships.

The ER model using UML syntax is used for database design. It is possible to fall into design traps, so practice is necessary to become a good conceptual designer.

+ posts