DB LECTURE 2 (4)
DB LECTURE 2 (4)
1
2.1. Data Models, Schemas and Instances
• Data Model: A set of concepts to describe the structure of a
database, and certain constraints that the database should obey.
• Database Schema: The description of a database. Includes
descriptions of the database structure and the constraints that
should hold on the database.
• A database schema refers to the logical and visual configuration of
the entire relational database. The database objects are often
grouped and displayed as tables, functions, and relations. A
schema describes the organization and storage of data in a
database and defines the relationship between various tables.
• Schema Diagram: A diagrammatic display of (some aspects of) a
database schema.
• Schema Construct: A component of the schema or an object
within the schema, e.g., STUDENT, COURSE.
• Database Instance: The actual data stored in a database at a
particular moment in time. Also called database state (or
occurrence).
2
Cont.
What is a Data Model?
•A data model determines the structure of data elements within an
information system. A data model documents the relationships between
data elements and how data is retrieved and stored.
•Data models often display the flow of data through a graph or data
model diagram.
•This visual representation helps facilitate communication between
software and business teams: business teams can identify the data and
data formats needed for business functions, and software teams can build
the responses needed for those requests.
•A data model refers to an abstract representation of data structures that
are used to organize and manage data in a database or information
system. It defines the relationship between the data elements
(representing real-world objects) and how they are organized, stored, and
retrieved.
3
Cont.
It can be helpful to imagine a building plan beneath/under/ the data
model, upon which the entire system will be built.
Data models are used to create databases and data warehouses,
manage data for analytical processing, and implement applications that
enable users to access information in meaningful ways.
To answer the question “what is a data model,” it’s helpful to
understand a few key terms used in the data model definition.
A data model identifies:
• Entities: the data components, including associated metadata, raw data
and processed data.
• Attributes appear as columns in specific tables.
• Records are shown in rows in each table.
• Relationships define the associations between entities.
4
Cont.
• Requirements: the anticipated uses of the data, especially future uses.
• Technology assessment: the strengths and weaknesses of the
hardware and software used in the project.
• Cardinality refers to the numerical relationship (1:1,1:M,M:1,M:M)
between two entities in a database. The number of instances
participating or associated with a single instance from an entity in a
relationship is called the CARDINALITY of the relationship. E.g.
Employee – Department (M:1), Department-Student(1:M), Student –
Course (M:M), Building – Location (1:1).
• Primary key is a unique identifier of each record in a current table.
• Foreign key is a column or a group of columns that points towards a
primary key in a different table
Data Modeling: is the process of creating a data model for an information
system by applying certain formal techniques.
5
The components of a data model Cont.
6
An example of Database Schema Cont.
8
2.2. Overview of Data Models
• The main purpose of a data model is to represent the data in an
understandable way.
Categories of data models:
• Conceptual (high-level, semantic) data models: Provide concepts that
are close to the way many users perceive data. (Also called entity-
based or object-based data models). It provide concepts for
presenting data in ways that are close to the way people perceive data.
A typical example is the entity relationship model, which uses main
concepts like entities, attributes and relationships.
• Logical data model: a logical data model establishes the structure of
data elements and the relationships among them. It is independent of
the physical database that details how the data will be implemented.
The logical data model serves as a blueprint for used data. The logical
data model takes the elements of conceptual data modeling a step
further by adding more information to them.
• Physical (low-level, internal) data models: Provide concepts that
9
describe details of how data is stored in the computer.
Cont.
• A conceptual data model identifies the entities that describe the data
and relationships between them. Conceptual data models only show
the highest-level relationships between entities, not attributes or
primary keys within the data model. Business stakeholders and data
architects are typically the ones who create conceptual data models
with the intent to organize and define various business concepts and
rules and to set the parameters or scope the database.
• A logical data model serves to define how a system has to be
implemented regardless of the database management system being
used. Data architects and business analysts are usually the creators of
a logical data model. The goal of creating a logical data model is to
develop a highly technical map of underlying rules and data structures.
N:B: Generally speaking, both conceptual data modeling and logical data
modeling are “requirements analysis” types of activities, while physical data
modeling is considered to be a design activity.
10
•
Cont.
A physical data model is a low-level representation of your data
that specifies how the data will be stored, accessed, and
manipulated in a specific database system.
• A physical data model identifies the table structures that will be
built in the database, including all tables, columns, primary keys
and foreign keys used to identify the relationships between tables.
• The physical data model, captures all of the implemented tables and
views in the current database and includes data types and
cardinality. It is also filled with data, which users can view via
SELECT query or manage with INSERT, UPDATE, or DELETE.
• The physical data model pertains to how the system will be
implemented, and factors in the specific databases management
system. This model is typically created by developers. The idea is
more to define how the actual database will be used or
implemented for business purposes. 11
Cont.
Example of logical
data model
Example of physical
data model
12
Cont.
Example of conceptual
data model
14
Cont.
16
Cont.
•Network Model: It is the advance version of the hierarchical data model.
To organize data it uses directed graphs instead of the tree-structure.
-In this child can have more than one parent. It uses the concept of the two data
structures i.e. Records and Sets. Allows record types to have more than one parent
unlike hierarchical model.
–A network data models sees records as set members.
–Each set has an owner and one or more members.
–Allow many to many relationship between entities
–Like hierarchical model network model is a collection of physically linked records.
–Allow member records to have more than one owner.
17
Cont.
– Advantages of Network Data Model:
• Network Model is able to model complex relationships and
represents semantics of add/delete on the relationships.
• Can handle most situations for modeling using record types and
relationship types.
• Language is navigational; uses constructs like FIND, FIND member,
FIND owner, FIND NEXT within set, GET etc. Programmers can do
optimal navigation through the database.
– Disadvantages of Network Data Model:
• Navigational and procedural nature of processing.
• Database contains a complex array of pointers that thread/continue/
through a set of records.
• Little scope for automated "query optimization”.
18
Cont.
• Relational Data Model
– Developed by Dr. Edgar Frank Codd in 1970 (famous paper,
'A Relational Model for Large Shared Data Banks').
– Terminologies originates from the branch of mathematics called
set theory and relation.
– Can define more flexible and complex relationship.
– Viewed as a collection of tables called “Relations” equivalent to
collection of record types.
– Relation: Two dimensional table.
– Stores information or data in the form of tables rows and
columns.
– A row of the table is called tuple equivalent to record.
– A column of a table is called attribute equivalent to fields.
– Data value is the value of the Attribute.
– Records are related by the data stored jointly in the fields of
records in two tables or files. The related tables contain
information that creates the relation.
19
Cont.
• The tables seem to be independent but are related some how.
• No physical consideration of the storage is required by the user .
• Many tables are merged together to come up with a new virtual
view of the relationship.
Alternative terminologies
20
Cont.
• The rows represent records (collections of information about
separate items).
• The columns represent fields (particular attributes of a record).
• Conducts searches by using data in specified columns of one table
to find additional data in another table.
• In conducting searches, a relational database matches information
from a field in one table with information in a corresponding field
of another table to produce a third table that combines requested
data from both tables.
21
Cont.
22
2.3. Architecture and Data Independence
• Defines DBMS schemas at three levels:
• Internal schema at the internal level to describe physical storage
structures and access paths. Typically uses a physical data model.
• Conceptual schema at the conceptual level to describe the structure
and constraints for the whole database for a community of users.
Uses a conceptual or an implementation data model (logical data
model).
• External schemas at the external level to describe the various user
views. Usually uses the same data model as the conceptual level.
23
Architecture and Data Independence …
24
Architecture and Data Independence …
Mappings among schema levels are needed to transform requests and
data. Programs refer to an external schema, and are mapped by the
DBMS to the internal schema for execution.
Data Independence
• Logical Data Independence: The capacity to change the conceptual
schema without having to change the external schemas and their
application programs.
• Physical Data Independence: The capacity to change the internal
schema without having to change the conceptual schema.
When a schema at a lower level is changed, only the mappings
between this schema and higher-level schemas need to be changed in a
DBMS that fully supports data independence. The higher-level
schemas themselves are unchanged. Hence, the application programs
need not be changed since they refer to the external schemas.
25
2.4. Database Languages and Interfaces
• A variety of users are supported by DBMS.
• The DBMS must provide appropriate languages and interfaces for
each category of users.
Database Languages
– Data Definition Language (DDL)
• Allows DBA or user to describe and name entities, attributes
and relationships required for the application.
• Specification notation for defining the database schema.
26
Database Languages and Interfaces …
– Data Manipulation Language (DML)
• Provides basic data manipulation operations on data held in the database.
• Language for accessing and manipulating the data organized by the
appropriate data model.
• DML also known as query language.
• There are two main types of DMLs
– Procedural DML: user specifies what data is required and how to get the
data. Examples of Procedural languages: FORTRAN, COBOL, ALGOL,
BASIC, C and Pascal.
– Non-Procedural DML: user specifies what data is required but not how
it is to be retrieved. SQL is an example of Non-Procedural DML, where
developers can simply write queries to retrieve, insert, update or delete
data.
27
Database Languages and Interfaces …
DBMS Interfaces
• User friendly interfaces provided by DBMSs include:
– Menu-Based Interfaces for Web Clients or Browsing.
– Forms-Based Interfaces.
– Graphical User Interfaces.
– Natural Language Interfaces.
– Speech Input and Output.
– Interfaces for Parametric Users (Naive Users): the unsophisticated
who don't have any DBMS knowledge but they frequently use the
database applications in their daily life to get the desired results.
For examples, Railway's ticket booking users are naive users.
– Interfaces for the DBA
28
2.5.The data base system environment
• Properties of Relational Databases
– Each row of a table is uniquely identified by a PRIMARY KEY
composed of one or more columns.
– Each tuple in a relation must be unique.
– ENTITY INTEGRITY RULE of the model states that no
component of the primary key may contain a NULL value.
– A column or combination of columns that matches the primary
key of another table is called a FOREIGN KEY used to cross-
reference tables.
– REFERENTIAL INTEGRITY RULE of the model states that,
for every foreign key value in a table there must be a
corresponding primary key value in another table in the database
or it should be NULL.
29
Cont.
– All tables are LOGICAL ENTITIES.
– A table is either a BASE TABLE (Named Relation) or VIEW
(Unnamed Relation).
– Only Base Tables physically store data.
– VIEWS are derived from BASE TABLES with SQL instructions
like: [SELECT .. FROM .. WHERE .. ORDER BY].
– Order of rows and columns is immaterial.
– Entries with repeating groups are said to be un-normalized.
– Entries are single-valued.
– Each column (field or attribute) has a distinct name.
– All values in a column represent the same attribute and have the
same data format.
30
Cont.
• The building blocks of the relational data model are:
– Entities: real world physical or logical object.
– Attributes: properties used to describe each entity or real world
object.
– Relationships: the associations between entities.
– Constraints: rules that should be obeyed while manipulating the
data.
31
2.6. Classification of DBMS
• Based on the Data Model on which the DBMS based. The
data models include:
– Relational Data Model - RDBMS
– Object Oriented Data Model – OO-DBMS
– Network Data Model – Network DBMS
– Hierarchical Data Model – Hierarchical DBMS
– Mix of Object Oriented and Relational Data model – Object-
Relational DBMS.
• Based on the number of users supported in the system:
– Single user Systems – support only one user at a time and
mostly are used with PCs.
– Multi User Systems – Supports multiple users.
Cont.
• Based on the Number of Sites over which the DB is distributed
– Centralized
• The DB and the DBMS exists on a single computer.
• May support multiple users.
– Distributed
• The DB and the DBMS can be distributed over multiple
sites.
33
Thank you!
34