Chapter 1 – Database Systems: Architecture and Components
Chapter 1 Objectives
After completing this chapter, the student will understand:
• The difference between data, metadata, and information, and how metadata serves
as the lens through which data becomes information
• How data management is a discipline that focuses on the proper acquisition, storage,
maintenance, and retrieval of data
• The characteristics of file-processing systems and their limitations
• How the ANSI/SPARC Three-Schema Architecture addresses the problems
plaguing file-processing systems
• What constitutes a database, a database management system, and a database system
• The difference between a model and a data model
• The role of data models in database design
• The role of the three data models (conceptual, logical, and physical) in the database design
life cycle portrayed in Figure 1.7
Chapter 1 Overview
This chapter begins with an introduction to the rudimentary concepts of data and how
information emerges from data when viewed through the lens of metadata. Next, the discussion
addresses data management, contrasting file-processing systems with database systems. This is
followed by brief examples of desktop, workgroup, and enterprise databases. The chapter then
presents a framework for database design in Figure 1.7 that describes the conceptual, logical, and
physical tiers of data modeling and their roles in the database design life cycle. This framework
serves as the roadmap to guide the reader through the remainder of the book. Finally, an
example walks the reader through the cradle-to-grave life cycle of data modeling and database
design in a nutshell.
Data Modeling and Database Design 1-2
Chapter 1 Key Terms
Data Unorganized facts about things, events, activities,
and transactions.
Information Data that has been organized into a specific
context such that it has value to its recipient.
Metadata A lens through which data takes on specific
meaning and yields information.
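The relationship among the first three terms can be illustrated with a minimal sketch. The values, field names, and units below are hypothetical and chosen only for illustration; the point is that the same raw values carry no meaning until descriptors are attached to them.

```python
# Data: unorganized facts with no stated meaning.
data = ["4417", "72", "2024-03-01"]

# Metadata (hypothetical): a name and, where relevant, a unit for each value.
metadata = [
    {"name": "patient_id", "unit": None},
    {"name": "resting_heart_rate", "unit": "beats per minute"},
    {"name": "visit_date", "unit": None},
]

def to_information(values, descriptors):
    """Pair each raw value with its descriptor so the value carries meaning."""
    info = {}
    for value, d in zip(values, descriptors):
        info[d["name"]] = f"{value} {d['unit']}" if d["unit"] else value
    return info

information = to_information(data, metadata)
```

Viewed without the metadata, "72" could be an age, a temperature, or a count; viewed through it, the value reads as a resting heart rate.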
Data element The smallest unit of data.
Record type A group of related data elements treated as a unit.
Record A set of values for the data elements constituting
a record type.
File A collection of records.
Data set Another term for a file.
Sequential access An access approach in which reaching the nth
record in a data set requires passing through the
previous n-1 records in the data set.
Direct access An access approach in which the nth record in a
data set can be reached without passing through
the previous n-1 records in the data set.
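The contrast between sequential and direct access can be sketched with an in-memory file of fixed-length records. The record layout (a 4-byte integer followed by a 12-byte name) and the function names are assumptions made for this illustration; direct access is shown here by computing a byte offset, which is only one of several ways it may be achieved.

```python
import io
import struct

# Hypothetical fixed-length record layout: 4-byte integer id + 12-byte name.
RECORD = struct.Struct("<i12s")

def build_file(n):
    """Return an in-memory data set holding n fixed-length records."""
    buf = io.BytesIO()
    for i in range(1, n + 1):
        buf.write(RECORD.pack(i, f"name{i}".encode()))
    return buf

def read_sequential(f, n):
    """Reach record n by first passing through records 1..n-1."""
    f.seek(0)
    for _ in range(n - 1):
        f.read(RECORD.size)  # pass through one preceding record
    return RECORD.unpack(f.read(RECORD.size))

def read_direct(f, n):
    """Reach record n by computing its byte offset and jumping to it."""
    f.seek((n - 1) * RECORD.size)
    return RECORD.unpack(f.read(RECORD.size))

data_set = build_file(100)
seq = read_sequential(data_set, 42)
direct = read_direct(data_set, 42)
```

Both calls return the same record; the difference is that the sequential version performs n-1 reads before reaching it, while the direct version performs none.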
File-processing system The predecessor of a database system where
records were stored in separate non-integrated
files.
Data integrity Ensures that data is correct, consistent, complete,
and current.
ANSI/SPARC three-schema architecture A collection of three separate schemas or views
for describing data in a database: (a) external
schema (or application view), (b) conceptual
schema (or logical view) and (c) internal schema
(or physical view).
Conceptual schema Represents the global conceptual view of the
structure of the entire database for the community
of users. It is independent of any particular data
structure or data representation.
External schema Consists of a number of different user views or
subschemas, each describing portions of the
database of interest to a particular user or group
of users. The external schema describes the data
corresponding to part of the conceptual schema as
seen by one or more users or programs.
Internal schema Describes the physical structure of the stored data
and the mechanism used to implement the access
strategy. As opposed to the conceptual schema
and external schema, which are technology-
independent, the internal schema is technology-
dependent.
Data independence The ability to modify a schema definition in one
level without affecting a schema definition in a
higher level. For example, the conceptual schema
insulates user views in the external schema from
changes in the physical storage structure of the
data in the internal schema.
Physical data independence The ability to modify the internal schema without
causing the application program in the external
schema to be rewritten.
Logical data independence The immunity of a user view from changes in the
other user views.
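The two forms of data independence defined above can be approximated with Python's built-in sqlite3 module. This is a loose analogy, not a full three-schema implementation: the base table stands in for the conceptual schema, views stand in for user views in the external schema, and an index stands in for an internal-schema access structure. All table, view, and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level (by analogy): a base table describing employees.
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ada", "IT", 90000.0), (2, "Bob", "HR", 60000.0)],
)

# External level (by analogy): two user views over the same base table.
conn.execute("CREATE VIEW directory AS SELECT emp_id, name, dept FROM employee")
conn.execute("CREATE VIEW payroll AS SELECT emp_id, salary FROM employee")

query = "SELECT name, dept FROM directory ORDER BY emp_id"
before = conn.execute(query).fetchall()

# Internal-level change: add an index, a physical access structure.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")
after_index = conn.execute(query).fetchall()

# Change to another user view: redefine payroll with an extra column.
conn.execute("DROP VIEW payroll")
conn.execute("CREATE VIEW payroll AS SELECT emp_id, name, salary FROM employee")
after_view_change = conn.execute(query).fetchall()
```

The query against the directory view is never rewritten, and its result is the same after the index is added (physical data independence) and after the payroll view is redefined (logical data independence as defined above).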
Database A self-describing collection of integrated files
consisting of (1) users’ data, (2) metadata, and (3)
overhead data.
Database management system A collection of general-purpose software that
facilitates the processes of defining, constructing,
and manipulating a database for various
applications.
Distributed database A collection of multiple logically interrelated
databases that may be geographically dispersed
over a computer network.
Distributed database management system Software that manages a distributed database
while rendering the geographical distribution of
the data transparent to the user community.
Data warehouse A collection of data designed to support
management decision making. A data warehouse
contains a wide variety of data that present a
coherent picture of business conditions at a single
point in time.
Data definition language (DDL) The component of a database management
system used to create the structure of database
objects such as tables, views, assertions, domains,
schemas, etc.
Data control language (DCL) The component of a database management
system used to control user access, facilitate
backup and recovery from failures, and ensure
that users access only the data they are authorized
to use.
Data manipulation language (DML) The component of a database management
system product that facilitates the retrieval,
insertion, deletion, and modification of data in a
database.
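The division of labor among DDL, DCL, and DML can be sketched with SQL statements issued through Python's sqlite3 module. The table and its contents are hypothetical. SQLite is used only because it ships with Python; it does not implement GRANT or REVOKE, so the DCL statement appears as text rather than being executed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: create the structure of a database object (a hypothetical table).
conn.execute(
    "CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT, credits INTEGER)"
)

# DML: insertion, modification, deletion, and retrieval of data.
conn.execute("INSERT INTO course VALUES ('DB101', 'Intro to Databases', 3)")
conn.execute("INSERT INTO course VALUES ('DB201', 'Data Modeling', 4)")
conn.execute("UPDATE course SET credits = 4 WHERE code = 'DB101'")
conn.execute("DELETE FROM course WHERE code = 'DB201'")
rows = conn.execute("SELECT code, title, credits FROM course").fetchall()

# DCL: access control. Many server DBMS products accept statements like this one,
# but SQLite does not, so it is kept as a string and never executed here.
dcl_example = "GRANT SELECT ON course TO registrar"  # 'registrar' is a hypothetical user
```

After the four DML statements run, the table holds a single row for DB101 with its updated credit value.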
Data dictionary The component of a database system that stores
metadata that provides such information as the
definitions of data items and their relationships,
authorizations, and usage statistics.
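A small-scale analogue of a data dictionary is the catalog that SQLite keeps inside every database file. The sketch below, which uses a hypothetical student table, reads that catalog back. SQLite's catalog covers object definitions only; it does not record authorizations or usage statistics, so it illustrates just part of the definition above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# sqlite_master is SQLite's built-in catalog table: metadata stored with the user data.
catalog = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]
```

Because the definitions live in the database itself, a program can discover the table and its columns without any external description, which is the sense in which a database is self-describing.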
Data repository A collection of metadata about data models and
application program interfaces.
Data model A representation of a real-world phenomenon that
makes use of descriptors.
Universe of interest The aspect of the real world represented by the
database.
Requirements specification The initial step in the database design process
where existing documents and systems are
reviewed and prospective users are interviewed in
an effort to identify the objectives to be supported
by the database system.
Business rules User-specified restrictions on the organization’s
activities (business processes) that must be
reflected in the database or database applications.
Business rule (from Chapter 2) A short statement of a specific condition or
procedure relevant to the universe of interest
being modeled expressed in a precise,
unambiguous manner.
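One way a business rule may be reflected in the database is as a declarative constraint. The rule below ("an employee's salary must be positive and must not exceed 200000") is hypothetical, as are the table and column names; the sketch uses a CHECK constraint in SQLite, which is only one of several mechanisms a DBMS may offer for this purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical business rule expressed as a CHECK constraint on the salary column.
conn.execute(
    """
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary REAL CHECK (salary > 0 AND salary <= 200000)
    )
    """
)
conn.execute("INSERT INTO employee VALUES (1, 'Ada', 90000)")

# An insert that violates the rule is refused by the DBMS itself.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Bob', 250000)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

Placing the rule in the table definition means every application that writes to the table is subject to it, rather than each program having to enforce it separately.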
Conceptual data modeling Involves describing the structure of the data to be
stored in the database without specifying how it
will be physically stored.
Logical data modeling Involves refining the conceptual data model (a) to
the point where it is more compatible with the
technology intended for implementation and (b) to
eliminate data redundancy problems.
Physical data modeling Involves transforming the logical data model into
a form that can be implemented by some DBMS
product.