Database Development Process

ANSI/SPARC Architecture

Data abstraction is one of the primary benefits of database systems. Data abstraction is accomplished by defining various data views. Each view separates higher-level perspectives from the specifics of data representation.

The ANSI/SPARC architecture, which was developed in 1975, has three views:

Internal View

The computer’s physical representation of the database.
The manner in which the data is stored.

Conceptual View

The database’s logical structure, which specifies what data is stored and how it is related.

External View

The user’s view of the database, which shows the user only the parts of the database that are important to them.

[Figure: 3-schema database architecture]

Benefits of 3-Schema Architecture

External Level: Each user accesses the database through a view of the data that is separate from other users’ views. This level also provides logical data independence, which means that modifications to the conceptual schema need not affect external views.
Conceptual Level: A single shared data representation for all applications and users that is independent of physical data storage. Users do not need to understand the details of the physical data representation, and the DBA can change storage structures without affecting users or applications. This is physical data independence: physical changes, such as adding indexes or distributing data, have no effect on the conceptual schema.
Internal (Physical) Level: Provides standard methods for interfacing with the operating system in order to allocate space and manipulate files.
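
The three levels can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumed names: the `employee` table, the `staff_directory` view, and the index name are hypothetical and do not come from the ANSI/SPARC report. The table plays the role of the conceptual schema, the view is an external view, and the index is a change at the internal level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: one shared logical table (hypothetical schema).
conn.execute(
    "CREATE TABLE employee (eno INTEGER PRIMARY KEY, ename TEXT, title TEXT, pay REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Joe", "Engineer", 5000.0), (2, "Steve", "Analyst", 4000.0)],
)

# External level: a view that exposes only the columns one user group needs.
conn.execute("CREATE VIEW staff_directory AS SELECT eno, ename, title FROM employee")

query = "SELECT * FROM staff_directory ORDER BY eno"
before = conn.execute(query).fetchall()

# Internal level: a physical change (adding an index) that users never see.
conn.execute("CREATE INDEX idx_employee_title ON employee (title)")

after = conn.execute(query).fetchall()
print(before == after)  # the external view is unaffected by the index
```

The view returns the same rows before and after the index is created, which is the behavior physical data independence describes.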

Relational Model

History

Edgar Frank Codd proposed the relational model in 1970.
System R, one of the earliest relational database systems, was built by IBM and led to several significant breakthroughs:
  • extensive research on concurrency control, transaction management, and query processing and optimization
  • the first version of SQL
  • various commercial products such as Oracle and DB2

In the late 1970s and early 1980s, commercial implementations (RDBMSs) arose. The relational model is currently used as the foundation for the bulk of commercial database systems.

Definitions

A table containing columns and rows is referred to as a relation.

A relation’s named column is called an attribute.

A row of a relation is referred to as a tuple.

A domain is the set of allowable values for one or more attributes.

The degree of a relation is the number of attributes it contains.

A relation’s cardinality is the number of tuples it contains.

A relational database is a collection of normalized relations with distinct relation names.

A relation’s intension is the structure of the relation, including its domains.

A relation’s extension is the current collection of tuples in the relation.
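
As a rough illustration of these terms, a relation can be modeled in Python as a set of tuples. The `Person` data is borrowed from the example later in this article; representing domains as Python types is an assumption made only for this sketch.

```python
# Intension: the structure of the relation (attribute names and their domains).
# Using Python types as domains is a simplification for illustration.
person_schema = {"id": int, "firstName": str, "lastName": str}

# Extension: the current set of tuples in the relation.
person = {
    (1, "Joe", "Jones"),
    (2, "Steve", "Perry"),
}

degree = len(person_schema)  # number of attributes
cardinality = len(person)    # number of tuples

print("degree:", degree)
print("cardinality:", cardinality)
```

Adding or removing a tuple changes the extension and the cardinality, while the intension and the degree stay the same.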

[Figure: Description of certain terms]

Relational Model Formal Definition

Although tables and fields are commonly used to represent the relational model, it is technically described in terms of sets and set operations.

A relation schema R with attributes A = <A1, A2, …, An> is written R(A1, A2, …, An), where each Ai is an attribute name that ranges over a domain Di, denoted dom(Ai).

Example: Product (id, name, supplierId, categoryId, price)

R = Product (relation name)

dom(price) is the set of all possible positive monetary values

dom(name) is the set of all possible strings that represent product names.

Relation Schemas and Instances

A relation schema defines a single relation and expresses the relation’s intension. A relational database schema is a set of relation schemas (modeling a particular domain). A relation instance r(R) over a relation schema R(A1, A2, …, An) is a set of n-tuples <d1, d2, …, dn>, where each di is an element of dom(Ai) or is null. The relation instance is the extension of the relation. A null value denotes a value that is missing or unknown.

Cartesian Product

D1 x D2 is a set operation that accepts two sets D1 and D2 and returns the set of all ordered pairs where the first element is a member of D1 and the second element is a member of D2.

D1 = {1, 2, 3}

D2 = {A, B}

D1 x D2 = {(1,A), (2,A), (3,A), (1,B), (2,B), (3,B)}
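
The same product can be computed with `itertools.product` from the Python standard library. This is a small sketch; the variable names are chosen here for illustration.

```python
from itertools import product

d1 = {1, 2, 3}
d2 = {"A", "B"}

# Every ordered pair whose first element is in d1 and second element is in d2.
cartesian = set(product(d1, d2))

for pair in sorted(cartesian):
    print(pair)
```

The size of the result is the product of the sizes of the inputs, here 3 x 2 = 6 pairs.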

Relation Instance

A relation instance r(R) can be defined as a subset of the Cartesian product of the domains of all attributes in the relation schema: r(R) ⊆ dom(A1) x dom(A2) x … x dom(An)
Example: 

  • R = Person(id, firstName, lastName )
  • dom(id) = {1,2}, dom(firstName) = {Joe, Steve}, dom(lastName ) = {Jones, Perry}
  • dom(id) x dom(firstName) x dom(lastName ) = { (1,Joe,Jones), (1,Joe,Perry), (1,Steve,Jones), (1,Steve,Perry), (2,Joe,Jones), (2,Joe,Perry), (2,Steve,Jones), (2,Steve,Perry)}
  • Assuming that our database contains Joe Jones and Steve Perry, r(R) = {(1,Joe,Jones), (2,Steve,Perry)}.
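
The Person example can be checked mechanically. The sketch below builds the full Cartesian product of the three domains and tests whether a candidate instance is a subset of it; the variable names are illustrative.

```python
from itertools import product

# Attribute domains from the Person example, in schema order.
domains = {
    "id": {1, 2},
    "firstName": {"Joe", "Steve"},
    "lastName": {"Jones", "Perry"},
}

# dom(id) x dom(firstName) x dom(lastName)
all_tuples = set(product(*domains.values()))

# The relation instance holding Joe Jones and Steve Perry.
r = {(1, "Joe", "Jones"), (2, "Steve", "Perry")}

# A valid instance is a subset of the Cartesian product of the domains.
is_valid_instance = r <= all_tuples
print(len(all_tuples), is_valid_instance)
```

A tuple such as (3, Joe, Jones) would fail this check, because 3 is not an element of dom(id).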

Properties of Relation

  • Each relation has a distinct name. (No two relations share the same name.)
  • Each cell of a relation (the value of an attribute in a tuple) contains exactly one atomic (single) value.
  • Each attribute of a relation has a distinct name.
  • The values of an attribute are all from the same domain.
  • Each tuple is distinct; there are no duplicate tuples. (This is because relations are sets. In SQL, relations are bags.)
  • The order of the attributes is not important. This differs from a mathematical relation and from the definitions above, which both specify an ordered tuple. The reason is that attribute names identify each value and its domain, so the attributes can be reordered.
  • The order of the tuples is irrelevant.

Relational Keys

  • Keys are used to uniquely identify a tuple in a relation. Note that keys are a property of the relation schema, not of a relation instance. That is, you cannot tell whether a set of attributes is a key just by looking at the current data.
  • A key is a minimal set of attributes that uniquely identifies a tuple in a relation.
  • A superkey is a set of attributes that uniquely identifies a tuple in a relation.
  • A candidate key is one of the possible keys of a relation.
  • A primary key is the candidate key designated as the distinguishing key of a relation.
  • A foreign key is a set of attributes in one relation that refers to the primary key of another relation. Foreign keys make it possible to enforce referential integrity.
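
The difference between a superkey and a (minimal) key can be explored with a small sketch. The helper names and the sample rows below are hypothetical. Keep in mind the caveat above: a check against one instance can only rule a key out, never prove it, because keys are a property of the schema.

```python
from itertools import combinations


def is_unique_in_instance(rows, attrs):
    """Return True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_minimal_unique(rows, attrs):
    """Return True if attrs is unique in rows and no proper subset is."""
    if not is_unique_in_instance(rows, attrs):
        return False
    return not any(
        is_unique_in_instance(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )


# Hypothetical sample data.
employees = [
    {"eno": 1, "ename": "Joe", "title": "Engineer"},
    {"eno": 2, "ename": "Steve", "title": "Engineer"},
]

print(is_unique_in_instance(employees, ("title",)))        # title repeats
print(is_unique_in_instance(employees, ("eno", "ename")))  # superkey-like
print(is_minimal_unique(employees, ("eno", "ename")))      # not minimal
print(is_minimal_unique(employees, ("eno",)))              # minimal
```

In this sample, `ename` alone also happens to be unique, yet two employees could share a name in the future. That is why the designer, not the data, decides which attribute sets are keys.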

Relations Example

Employee Database: Each employee has a unique number, a name, a title, and a salary.
Each project has a unique number, a name, and a budget.
An employee may work on several projects, and a project may have several employees. An employee on a project has a certain role and is assigned to it for a specific amount of time.
Employee (eno, ename, title, pay)
Project (pno, pname, budget)

[Figure: Relations example]

Relational Integrity

Integrity rules are used to ensure that the data is accurate.
Constraints are rules or restrictions that apply to the database and limit the data values it may store.
Constraint types include:
  • Domain constraint: Every value of an attribute must be an element of the attribute’s domain or be null. Null represents a value that is currently unknown or not applicable; it is not the same as zero or an empty string.
  • Entity integrity constraint: In a base relation, no attribute of a primary key can be null.
  • Referential integrity constraint: If a foreign key exists in a relation, then the foreign key value must match the primary key value of a tuple in the referenced relation, or the foreign key must be null.
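
Entity and referential integrity can be observed in practice with Python's sqlite3 module. This is a minimal sketch: the `works_on` table, its columns, and the sample values are assumptions added here to link employees to projects, and SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE employee (eno INTEGER NOT NULL PRIMARY KEY, ename TEXT)")
conn.execute("CREATE TABLE project (pno TEXT NOT NULL PRIMARY KEY, pname TEXT)")
# Hypothetical linking table: each row refers to one employee and one project.
conn.execute(
    """
    CREATE TABLE works_on (
        eno INTEGER NOT NULL REFERENCES employee (eno),
        pno TEXT NOT NULL REFERENCES project (pno),
        PRIMARY KEY (eno, pno)
    )
    """
)

conn.execute("INSERT INTO employee VALUES (1, 'Joe')")
conn.execute("INSERT INTO project VALUES ('P1', 'Website')")
conn.execute("INSERT INTO works_on VALUES (1, 'P1')")  # both references exist


def violates_integrity(sql):
    """Run sql and report whether the DBMS rejected it as a constraint violation."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Referential integrity: employee 99 does not exist.
fk_rejected = violates_integrity("INSERT INTO works_on VALUES (99, 'P1')")
# Entity integrity: a primary key attribute may not be null.
null_pk_rejected = violates_integrity("INSERT INTO project VALUES (NULL, 'Orphan')")

print(fk_rejected, null_pk_rejected)
```

Both invalid inserts are rejected, so the tables keep only rows that satisfy the constraints.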

[Figure: Foreign key example]

General Constraints

Some DBMSs can enforce more general constraints. These are sometimes called enterprise constraints or semantic integrity constraints.

For instance:
  • An employee cannot work on more than two projects at the same time.
  • An employee cannot earn more money than their manager.
  • Each employee must be assigned to at least one project.

Using triggers, it is typically possible to ensure that the database respects these rules.
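
As one possible illustration, the rule that an employee cannot work on more than two projects could be written as an SQLite trigger and exercised from Python. The `works_on` table, the trigger name, and the error message are hypothetical, and trigger syntax differs between DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table recording which employee works on which project.
conn.execute(
    "CREATE TABLE works_on (eno INTEGER NOT NULL, pno TEXT NOT NULL, PRIMARY KEY (eno, pno))"
)

# Reject an insert when the employee already has two project assignments.
conn.execute(
    """
    CREATE TRIGGER max_two_projects
    BEFORE INSERT ON works_on
    WHEN (SELECT COUNT(*) FROM works_on WHERE eno = NEW.eno) >= 2
    BEGIN
        SELECT RAISE(ABORT, 'employee already works on two projects');
    END
    """
)

conn.execute("INSERT INTO works_on VALUES (1, 'P1')")
conn.execute("INSERT INTO works_on VALUES (1, 'P2')")

try:
    conn.execute("INSERT INTO works_on VALUES (1, 'P3')")
    third_rejected = False
except sqlite3.DatabaseError as exc:
    third_rejected = True
    print("rejected:", exc)
```

The first two assignments succeed and the third is aborted by the trigger, leaving two rows for employee 1.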

Relational Algebra Operators

Selection Operator

The selection operation is a unary operation that takes a relation as input and outputs a new relation that contains a subset of the original relation’s tuples, namely those that satisfy a given condition (predicate).
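
A minimal sketch of selection in Python, modeling a relation as a set of named tuples. The `Employee` type, the `select` helper, and the sample rows are assumptions made for this illustration.

```python
from typing import NamedTuple


class Employee(NamedTuple):
    eno: int
    ename: str
    title: str
    pay: float


def select(relation, predicate):
    """Return the tuples of relation for which predicate is true."""
    return {t for t in relation if predicate(t)}


employees = {
    Employee(1, "Joe", "Engineer", 5000.0),
    Employee(2, "Steve", "Analyst", 4000.0),
    Employee(3, "Ann", "Engineer", 6000.0),
}

# Keep only the tuples whose title is "Engineer".
engineers = select(employees, lambda t: t.title == "Engineer")
print(sorted(t.ename for t in engineers))
```

The result has the same attributes as the input relation; only the number of tuples can shrink.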

[Figure: Selection operator example]

Projection Operator

The projection operation is a unary operation that takes a relation as input and produces a new relation with a subset of the input relation’s attributes and all non-duplicate tuples as output.
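
Projection can be sketched in the same spirit. The `project` helper and the sample rows below are hypothetical; building the result as a Python set is what removes duplicate tuples.

```python
def project(relation, attributes):
    """Keep only the given attributes of each row; the set drops duplicates."""
    return {tuple(row[a] for a in attributes) for row in relation}


# Hypothetical sample data: two employees share the same title.
employees = [
    {"eno": 1, "ename": "Joe", "title": "Engineer"},
    {"eno": 2, "ename": "Steve", "title": "Analyst"},
    {"eno": 3, "ename": "Ann", "title": "Engineer"},
]

titles = project(employees, ["title"])
names_and_titles = project(employees, ["ename", "title"])

print(sorted(titles))
print(len(names_and_titles))
```

Projecting on `title` yields two tuples from three input rows, because the duplicate `Engineer` value collapses into one.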

[Figure: Projection operator example]
