
Summary Data wrangling and data analysis

Pages
102
Uploaded on
18-11-2021
Written in
2021/2022

Applied Data Science, Utrecht University (UU): data handling and preparation, supervised and unsupervised machine learning, using SQL, Python, and R.

Silberschatz Et Al. 2019 – Database Systems Concepts
1.1. Database-System Applications
Database-management system (DBMS): collection of interrelated data and a set of
programs to access those data. The goal of a DBMS is information storage and manipulation.

- Back-office: database used internally by an organisation
- End-users: direct interaction between users and the organisation's database

Two modes of database usage:

- Online transaction processing: a large number of users use the database, with
each user retrieving relatively small amounts of data and performing small updates
- Data analytics: the processing of data to draw conclusions, and infer rules or decision
procedures, which are then used to drive business decisions
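
The contrast between the two modes can be sketched with Python's built-in `sqlite3` module. This is only an illustration: the `account` table, its columns, and the rows are hypothetical and do not come from the textbook.

```python
import sqlite3

# Hypothetical schema and data, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, branch TEXT, balance REAL)")
conn.executemany(
    "INSERT INTO account (id, branch, balance) VALUES (?, ?, ?)",
    [(1, "Utrecht", 100.0), (2, "Utrecht", 250.0), (3, "Amsterdam", 400.0)],
)

# Transaction-processing style: one user reads and updates a single row.
conn.execute("UPDATE account SET balance = balance - 20.0 WHERE id = 1")
balance_after = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

# Analytics style: one query scans many rows to produce a summary.
totals = dict(conn.execute("SELECT branch, SUM(balance) FROM account GROUP BY branch"))
conn.commit()
```

The first query touches one row, the second aggregates over the whole table, which is the essential difference in workload between the two modes.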

The field of data mining combines knowledge-discovery techniques invented by artificial
intelligence researchers and statistical analysts with efficient implementation techniques that
enable them to be used on extremely large databases

1.2. Purpose of Database Systems
File-processing system: stores permanent records in various files and needs different
application programs to extract records from, and add records to, the appropriate files.
Disadvantages of keeping organisational information in a file-processing system:

- Data redundancy and inconsistency: files and programs are written by different
programmers, in different structures and programming languages, so the same data
may be duplicated across several files
o Redundancy leads to higher storage and costs
o Inconsistency leads to disagreement of data
- Difficulty in accessing data: conventional file-processing environments do not
allow needed data to be retrieved in a convenient and efficient manner. More
responsive data-retrieval systems are required for general use
- Data isolation: because data is scattered in various files, and files may be in different
formats, writing new application programs to retrieve the appropriate data is difficult
- Integrity problems: data values stored in the database must satisfy certain types of
consistency constraints. In a file-processing system these constraints are buried in
application code, so adding or changing them is difficult
- Atomicity problems: a computer system is subject to failure; data transfer must be
atomic — it must happen in its entirety or not at all
- Concurrent access anomalies: systems must allow multiple users to update data
simultaneously, so the system must supervise concurrent updates to keep data consistent
- Security problems: not every user of the database system should be able to access
all the data
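
The atomicity requirement in the list above can be made concrete with a small sketch. It assumes a hypothetical `account` table and a hypothetical `transfer` helper. The `CHECK` constraint stands in for a failure halfway through a transfer, and SQLite's transaction rollback undoes the partial update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 50.0), (2, 10.0)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move `amount` between two accounts as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, target)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, source)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balance of every account as {id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM account"))


ok = transfer(conn, 1, 2, 30.0)       # both updates succeed and are committed
failed = transfer(conn, 1, 2, 500.0)  # the debit violates the CHECK, so the credit is undone too
```

After the failed call the balances are unchanged from the state left by the successful one: the credit that had already been applied is rolled back together with the rejected debit.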



1.3. View of Data
The data models can be classified into four different categories:

- Relational model: collection of tables to represent both data and the relationships
among those data (record-based model; matrix / excel sheet)
- Entity-relationship (E-R) model: collection of basic objects, called entities, and
relationships among these objects
- Semi-structured data model: permit the specification of data where individual
data items of the same type may have different sets of attributes (JSON / XML)
- Object-based data model: extends the relational model with notions of encapsulation,
methods, and object identity from object-oriented programming (Java, C++, or C#);
procedures can be stored in the database system and executed by it
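
As a rough illustration of the difference between the relational and the semi-structured model, the sketch below stores the same kind of item both ways. All names and values are hypothetical.

```python
import json

# Relational model: every row has the same attributes (fixed columns).
instructor_columns = ("id", "name", "dept_name")
instructor_rows = [
    (1, "Ada", "Comp. Sci."),
    (2, "Max", "Physics"),
]

# Semi-structured model: items of the same type may carry different attributes.
instructor_docs = json.loads("""
[
  {"id": 1, "name": "Ada", "dept_name": "Comp. Sci.", "office": "B2.14"},
  {"id": 2, "name": "Max", "phones": ["030-1234", "030-5678"]}
]
""")

# Every relational row has the same number of fields ...
relational_shapes = {len(row) for row in instructor_rows}
# ... while each JSON object has its own set of keys.
document_shapes = [sorted(doc.keys()) for doc in instructor_docs]
```

Here the second JSON object has no `dept_name` at all and adds a list-valued `phones` attribute, something a single relational row with fixed columns cannot express directly.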

Many database-system users are not computer trained, so developers hide the complexity from
users through several levels of data abstraction, to simplify users’ interactions with the system:

- Physical level: lowest level of abstraction describes how the data are stored. The
physical level describes complex low-level data structures in detail
- Logical level: next-higher level of abstraction describes what data are stored in the
database, and what relationships exist among those data. The logical level thus
describes the entire database in terms of a small number of relatively simple structures
- View level: highest level of abstraction describes only part of the entire database.
Even though the logical level uses simpler structures, complexity remains because of
the variety of information stored in a large database
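
A minimal sketch of the logical and view levels, assuming a hypothetical `instructor` table: the table definition plays the role of the logical level, and a SQL view exposes only part of it. The physical level (how SQLite lays out pages on disk) stays hidden from both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full table, including a sensitive salary column.
conn.execute(
    "CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, dept_name TEXT, salary REAL)"
)
conn.execute("INSERT INTO instructor VALUES (1, 'Ada', 'Comp. Sci.', 5000.0)")

# View level: expose only part of the database to a group of users.
conn.execute(
    "CREATE VIEW instructor_public AS SELECT id, name, dept_name FROM instructor"
)

cursor = conn.execute("SELECT * FROM instructor_public")
view_columns = [col[0] for col in cursor.description]
view_rows = cursor.fetchall()
```

Users who query `instructor_public` see names and departments but never the `salary` column.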

Instance: collection of information stored in the database at a particular moment

Schema: overall design of the database (physical schema; logical schema; view-level subschemas)
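
The distinction between schema and instance can be sketched as follows, assuming a hypothetical `department` table. SQLite keeps the table definitions in its `sqlite_master` catalog, which serves here as the schema. The rows returned by a query are the instance at that moment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY, building TEXT)")


def schema(conn):
    """Return the stored table definitions (the design)."""
    rows = conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in rows]


def instance(conn):
    """Return the rows stored right now (the content at this moment)."""
    query = "SELECT dept_name, building FROM department ORDER BY dept_name"
    return conn.execute(query).fetchall()


schema_before = schema(conn)
instance_before = instance(conn)
conn.execute("INSERT INTO department VALUES ('Physics', 'Watson')")
schema_after = schema(conn)
instance_after = instance(conn)
```

Inserting a row changes the instance but leaves the schema untouched.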

1.4. Database Languages
Database systems provide a data-definition language (DDL) to specify database schema
and a data-manipulation language (DML) to express database queries and updates (SQL)
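
A short sketch of the two language roles, run through Python's `sqlite3` module. The `course` table and its rows are hypothetical. `CREATE TABLE` is a DDL statement, while `INSERT`, `UPDATE`, `DELETE`, and `SELECT` are DML statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema.
conn.execute(
    "CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT NOT NULL, credits INTEGER)"
)

# DML: insert, update, delete, and query the data.
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [
        ("DS-101", "Data wrangling", 7),
        ("DS-102", "Databases", 5),
        ("DS-103", "Statistics", 5),
    ],
)
conn.execute("UPDATE course SET credits = 8 WHERE course_id = 'DS-101'")
conn.execute("DELETE FROM course WHERE course_id = 'DS-103'")
result = conn.execute("SELECT title, credits FROM course ORDER BY course_id").fetchall()
conn.commit()
```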

Database systems implement only those integrity constraints that can be tested with minimal overhead:

- Domain constraints: domain of possible values must be associated with every
attribute (for example, integer types, character types, date/time types)
- Referential integrity: ensure that a value that appears in one relation for a given set
of attributes also appears in a certain set of attributes in another relation
- Authorisation: differentiate among users by the type of access they are permitted
on various data values in the database: read / insert / update / delete authorisation
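
The first two constraint types above can be sketched in SQLite. The tables and the `try_insert` helper are hypothetical, and a `CHECK` clause is used here to approximate a domain restriction. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`. Authorisation is left out because SQLite has no `GRANT` / `REVOKE` statements; multi-user systems such as PostgreSQL or MySQL provide them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY)")
conn.execute(
    """
    CREATE TABLE instructor (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        salary REAL CHECK (salary > 0),                   -- restricts allowed values
        dept_name TEXT REFERENCES department (dept_name)  -- referential integrity
    )
    """
)
conn.execute("INSERT INTO department VALUES ('Physics')")
conn.commit()


def try_insert(conn, row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute("INSERT INTO instructor VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


valid = try_insert(conn, (1, "Ada", 5000.0, "Physics"))
bad_domain = try_insert(conn, (2, "Max", -10.0, "Physics"))      # negative salary
bad_reference = try_insert(conn, (3, "Eve", 4000.0, "History"))  # unknown department
stored = conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]
```

Only the first row is stored. The other two are rejected by the database itself, without any checking code in the application.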

Data-definition language: SQL provides a rich DDL that allows one to define tables with
data types and integrity constraints


Samme Universiteit Utrecht