WGU C175 - Data Management - Foundations
Data Management Foundations (Western Governors University)
Studocu is not sponsored or endorsed by any college or university
Downloaded by john gatheca ()
Unit 1: Introduction to Databases, Information and Data
Data is
Ubiquitous (abundant, global, everywhere) and pervasive (inescapable, prevalent, persistent)
Consists of raw facts
Must be formatted for storage, processing and presentation.
Foundation of information, which is the bedrock of knowledge.
Verifiable if the data always yields consistent results.
Information is
The result of processing raw data to reveal its meaning.
Can be used as the foundation for decision making.
To reveal meaning, information requires context.
NOTE:
Data and information are not the same thing.
Raw data must be properly formatted for storage, processing and presentation.
Most data you encounter is best classified as semi-structured.
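The difference between data and information can be illustrated with a minimal sketch. The survey scores, the `summarize` function, and the "satisfied means 4 or higher" rule are all hypothetical and chosen only for illustration; the point is that the raw list means little until it is processed and given context.

```python
from statistics import mean

# Hypothetical raw data: survey scores (1-5) with no context attached.
raw_scores = [4, 5, 3, 5, 4, 2, 5]

def summarize(scores):
    """Process raw facts into information a decision maker can use."""
    return {
        "responses": len(scores),
        "average": round(mean(scores), 2),
        # Assumed rule for this example: a score of 4 or 5 counts as satisfied.
        "share_satisfied": round(sum(s >= 4 for s in scores) / len(scores), 2),
    }

summary = summarize(raw_scores)
print(summary)
```

The same numbers could support a different summary under a different context, which is why information depends on context as well as on the data.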
Knowledge
The body of information and facts about a specific subject. Knowledge implies familiarity,
awareness, and understanding of information as it applies to an environment. A key
characteristic is that new knowledge can be derived from old knowledge.
Application: A program that works with the data. It might be written by a programmer or created through a DBMS utility program.
Key Points
Data constitutes the building blocks of information.
Information is produced by processing data.
Information is used to reveal the meaning of data.
Accurate, relevant, and timely information is the key to good decision making.
Good decision making is the key to organizational survival in a global environment.
File Structure
The format in which data is arranged and stored in a file.
Types of Files
Flat files: Files having no internal hierarchy
Heap files: Files containing an unsorted set of records. Each record is uniquely identified by a record id, which allows records to be inserted or deleted using that id.
Index files: Files that store a list of lookup field values from a data file, along with the location (address) in the data file of the corresponding record. Because the lookup field is much smaller than the entire record, the whole index will usually fit in main memory for quick lookup. Once the address of a record is obtained from the index, the record can be accessed directly in the data file, instead of reading the data file record by record to locate the desired one.
Hashed files: Files that use a hash function to decide where a record should be placed on disk. This allows for faster data lookup without the use of an index file.
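The lookup behavior of heap, index, and hashed files can be compared in a small in-memory sketch. Nothing here touches a real disk: the list `heap_file`, the dictionary `index_file`, the bucket count, and the modulo hash function are assumptions made only to show the idea.

```python
# Heap file: unsorted records; the list position stands in for the record address.
heap_file = [
    {"id": 103, "name": "Cruz"},
    {"id": 101, "name": "Abel"},
    {"id": 102, "name": "Bose"},
]

def heap_lookup(records, key):
    """Scan record by record until the key is found (no index available)."""
    for record in records:
        if record["id"] == key:
            return record
    return None

# Index file: lookup field value -> address of the record in the data file.
index_file = {record["id"]: address for address, record in enumerate(heap_file)}

def index_lookup(records, index, key):
    """Get the address from the small index, then access the record directly."""
    address = index.get(key)
    return None if address is None else records[address]

# Hashed file: a hash function decides which bucket holds a record.
NUM_BUCKETS = 4  # assumed bucket count for this example

def bucket_for(key):
    return key % NUM_BUCKETS  # toy hash function, for illustration only

hashed_file = [[] for _ in range(NUM_BUCKETS)]
for record in heap_file:
    hashed_file[bucket_for(record["id"])].append(record)

def hash_lookup(buckets, key):
    """Compute the bucket from the key; only that bucket is searched."""
    for record in buckets[bucket_for(key)]:
        if record["id"] == key:
            return record
    return None

print(heap_lookup(heap_file, 102))
print(index_lookup(heap_file, index_file, 101))
print(hash_lookup(hashed_file, 103))
```

The heap lookup may read every record, the index lookup reads one index entry and one record, and the hashed lookup reads only the records in one bucket.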
Data Management
A process that focuses on data collection, storage, and retrieval. Common data management
functions include addition, deletion, modification, and listing.
Database is
A shared, integrated computer structure that houses a collection of related data. A database
contains two types of data:
End-user data (raw facts)
Metadata (data about data), which describes the characteristics and relationships of the end-user data
Databases make data persistent and shareable in a secure way.
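The two kinds of data can be seen side by side in a short sketch using Python's built-in `sqlite3` module. The `customer` table and its columns are hypothetical; SQLite is used only because it needs no setup, not because the notes prescribe it.

```python
import sqlite3

# In-memory database with one hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO customer (name) VALUES ('Ada')")
conn.commit()

# End-user data: the raw facts stored in the table.
end_user_data = conn.execute("SELECT id, name FROM customer").fetchall()

# Metadata: data about the data. PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
metadata = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")]

print(end_user_data)
print(metadata)
```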
Database Management System (DBMS)
The collection of programs that manages the database structure and controls access to the data
stored in the database. The database structure in a DBMS is stored as a collection of files.
NOTE: A DBMS responds to a query with a query result set.
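As a rough illustration of a query and its result set, the sketch below again uses Python's `sqlite3` module. The `product` table, its rows, and the price threshold are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("pen", 1.5), ("desk", 120.0), ("lamp", 30.0)],
)

# The query asks a question; the DBMS answers with a result set (a list of rows).
result_set = conn.execute(
    "SELECT name, price FROM product WHERE price > ? ORDER BY price",
    (10,),
).fetchall()

print(result_set)
```

The application never opens the underlying files itself; it sends the query and the DBMS decides how to locate the matching rows.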
Advantages of DBMS
Improved data sharing
Improved data security
Better data integration
Minimized data inconsistency
Data inconsistency: A condition in which different versions of the same data yield different
(inconsistent) results.
Improved data access
Makes ad hoc queries possible. An ad hoc query is a spur-of-the-moment or quick question.
Improved decision making
A DBMS does not guarantee data quality, but it provides a framework to facilitate data quality initiatives.
Increased end-user productivity
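Data inconsistency, defined in the list above, can be sketched with two plain dictionaries standing in for separately maintained files. The department names, the customer key, and the addresses are hypothetical.

```python
# Two departments each keep their own copy of the same customer's address.
sales_file = {"C001": "12 Oak St"}
billing_file = {"C001": "12 Oak St"}

# An update reaches only one of the copies.
sales_file["C001"] = "98 Elm Ave"

def is_inconsistent(key):
    """True when the two copies of the same data give different answers."""
    return sales_file[key] != billing_file[key]

# With a single shared copy, every reader sees the same value after an update.
shared_db = {"C001": "12 Oak St"}
shared_db["C001"] = "98 Elm Ave"
sales_view = shared_db["C001"]
billing_view = shared_db["C001"]

print(is_inconsistent("C001"))
print(sales_view == billing_view)
```

A shared, integrated database reduces this risk because there is one copy to update, though as noted above it does not by itself guarantee that the stored value is correct.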
Types of Databases
By Number of Users
Single-user Database: Supports only one user at a time.
Desktop Database: A single-user database that runs on a personal computer.
Multiuser Database: A database that supports multiple concurrent users.
Workgroup Database: A multiuser database that usually supports fewer than 50 users or is used
for a specific department in an organization.
Enterprise Database: The overall company data representation, which provides support for
present and expected future needs.
By Location
Centralized Database: A database located in a single site.
Distributed Database: A logically related database that is stored in two or more physically
independent sites.
Cloud Database: A database that is created and maintained using cloud services, such as Microsoft Azure or Amazon AWS.
By Type of Data Stored