
Summary Data wrangling and data analysis

Pages: 102
Uploaded on: 18-11-2021
Written in: 2021/2022

Applied Data Science Utrecht University (UU): Data handling and preparation, supervised & non-supervised machine learning, using SQL, Python, and R.












Silberschatz et al. 2019 – Database System Concepts
1.1. Database-System Applications
Database-management system (DBMS): collection of interrelated data and a set of
programs to access those data. The goal of a DBMS is information storage and manipulation

- Back-office: the database is used internally within an organisation
- End-users: users within the organisation interact with the database

Two modes of database usage:

- Online transaction processing: a large number of users use the database, with
each user retrieving relatively small amounts of data, and performing small updates
- Data analytics: the processing of data to draw conclusions, and infer rules or decision
procedures, which are then used to drive business decisions
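
The two modes above can be contrasted with a minimal sketch. It assumes Python's built-in sqlite3 module and a hypothetical `sales` table with invented values; it only illustrates the difference in access pattern, not how a production system would be set up.

```python
import sqlite3

# Hypothetical in-memory table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 10.0), ("north", 20.0), ("south", 5.0)],
)

# Transaction-processing style: touch a single row and make a small update.
with conn:
    conn.execute("UPDATE sales SET amount = amount + 1.0 WHERE id = ?", (1,))
one_row = conn.execute("SELECT amount FROM sales WHERE id = ?", (1,)).fetchone()

# Analytics style: scan many rows and summarise them to support a decision.
totals = dict(conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region"))
```

The first query reads and changes one record; the second aggregates over the whole table, which is the kind of workload data analytics relies on.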

The field of data mining combines knowledge-discovery techniques invented by artificial
intelligence researchers and statistical analysts with efficient implementation techniques that
enable them to be used on extremely large databases.

1.2. Purpose of Database Systems
File-processing system: stores permanent records in various files, and needs different
application programs to extract records from, and add records to, the appropriate files.
Disadvantages of keeping organizational information in a file-processing system:

- Data redundancy and inconsistency: files and programs are created by different
programmers, with different structures and programming languages, and the same
data may be duplicated per identifier across different groups
o Redundancy leads to higher storage and access costs
o Inconsistency means that copies of the same data no longer agree
- Difficulty in accessing data: conventional file-processing environments do not
allow needed data to be retrieved in a convenient and efficient manner. More
responsive data-retrieval systems are required for general use
- Data isolation: because data is scattered in various files, and files may be in different
formats, writing new application programs to retrieve the appropriate data is difficult
- Integrity problems: data values stored in the database must satisfy certain types of
consistency constraints, which are hard to enforce when new data and software are dissimilar
- Atomicity problems: a computer system is subject to failure, so a data transfer must be
atomic: it must happen in its entirety or not at all
- Concurrent access anomalies: systems must allow multiple users to update data
simultaneously. The system must maintain some form of supervision so that
concurrent updates do not leave the data inconsistent
- Security problems: not every user of the database system should be able to access
all the data
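
The atomicity requirement in the list above can be sketched as follows. This assumes Python's built-in sqlite3 module; the `account` table, the `transfer` helper and the balances are hypothetical, chosen only to show a transfer that either happens completely or not at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the CHECK keeps balances from going negative.
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates are applied or neither is."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "A", "B", 30)        # both updates are committed
failed = transfer(conn, "A", "B", 500)   # debit violates the CHECK; credit is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

In the failed transfer the credit to B has already been executed when the debit is rejected; the rollback undoes it, so the data never shows a half-finished transfer.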


1.3. View of Data
The data models can be classified into four different categories:

- Relational model: collection of tables to represent both data and the relationships
among those data (record-based model; matrix / excel sheet)
- Entity-relationship (E-R) model: collection of basic objects, called entities, and
relationships among these objects
- Semi-structured data model: permit the specification of data where individual
data items of the same type may have different sets of attributes (JSON / XML)
- Object-based data model: extends the relational model with object-oriented notions
(as in Java, C++, or C#); database systems allow procedures to be stored in, and
executed by, the database system
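
The difference between the relational and the semi-structured model in the list above can be sketched with plain Python data. The records and attribute names are invented for illustration; only the standard json module is used.

```python
import json

# Relational style: every row has the same fixed set of attributes (columns).
columns = ("id", "name", "dept")
rows = [(1, "Ann", "Physics"), (2, "Bob", "History")]

# Semi-structured style (JSON): items of the same type may carry different attributes.
doc = json.loads(
    '[{"id": 1, "name": "Ann", "phones": ["555-0100"]},'
    ' {"id": 2, "name": "Bob", "office": "B12"}]'
)

# Every relational row matches the column list; the JSON items do not share one shape.
same_width = all(len(row) == len(columns) for row in rows)
attribute_sets = [sorted(item) for item in doc]
```

Here `attribute_sets` differs per item, which a single fixed table schema could not express without empty columns or extra tables.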

Many database-system users are not computer trained, so developers hide the complexity
from users through several levels of data abstraction, to simplify users’ interactions with the system:

- Physical level: lowest level of abstraction describes how the data are stored. The
physical level describes complex low-level data structures in detail
- Logical level: next-higher level of abstraction describes what data are stored in the
database, and what relationships exist among those data. The logical level thus
describes the entire database in terms of a small number of relatively simple structures
- View level: highest level of abstraction describes only part of the entire database.
Even though the logical level uses simpler structures, complexity remains because of
the variety of information stored in a large database
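
A minimal sketch of the logical level versus the view level, assuming Python's built-in sqlite3 module. The `instructor` table and the `instructor_public` view are hypothetical names; how the rows are laid out on disk (the physical level) is handled by SQLite and is not visible in this code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full table, including a column not every user should see.
conn.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("INSERT INTO instructor VALUES (1, 'Ann', 90000.0)")

# View level: expose only part of the database to a group of users.
conn.execute("CREATE VIEW instructor_public AS SELECT id, name FROM instructor")

cur = conn.execute("SELECT * FROM instructor_public")
view_columns = [d[0] for d in cur.description]
view_rows = cur.fetchall()
```

A user who only queries `instructor_public` works with a simpler structure and never sees the salary column.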

Instance: collection of information stored in the database at a particular moment

Schema: overall design of the database (physical schema; logical schema; view-level subschemas)
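
The distinction between schema and instance can be sketched as follows, assuming Python's built-in sqlite3 module. The `course` table and the helper functions `schema` and `instance` are hypothetical; SQLite keeps table definitions in its `sqlite_master` catalogue table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT)")

def schema(conn):
    """Return the table definitions: the design of the database."""
    query = "SELECT sql FROM sqlite_master WHERE type = 'table'"
    return [row[0] for row in conn.execute(query)]

def instance(conn):
    """Return the rows stored right now: the content at this moment."""
    return conn.execute("SELECT code, title FROM course ORDER BY code").fetchall()

schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO course VALUES ('DB1', 'Databases')")
schema_after, instance_after = schema(conn), instance(conn)
```

Inserting a row changes the instance but leaves the schema untouched.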

1.4. Database Languages
Database systems provide a data-definition language (DDL) to specify the database schema
and a data-manipulation language (DML) to express database queries and updates; SQL covers both.
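
A minimal sketch of DDL versus DML statements, assuming Python's built-in sqlite3 module and a hypothetical `student` table. The grouping into DDL and DML follows the definitions above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: statements that define or change the schema.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN credits INTEGER DEFAULT 0")

# DML: statements that query or update the stored data.
conn.execute("INSERT INTO student (id, name) VALUES (1, 'Ann')")
conn.execute("UPDATE student SET credits = credits + 5 WHERE id = 1")
result = conn.execute("SELECT id, name, credits FROM student").fetchall()
conn.execute("DELETE FROM student WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```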

Database systems implement only integrity constraints testable with minimal overhead:

- Domain constraints: domain of possible values must be associated with every
attribute (for example, integer types, character types, date/time types)
- Referential integrity: ensure that a value that appears in one relation for a given set
of attributes also appears in a certain set of attributes in another relation
- Authorisation: differentiate among users by the type of access they are permitted
on various data values in the database: read / insert / update / delete authorisation
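
The first two constraint types in the list above can be sketched with Python's built-in sqlite3 module. The tables, the `try_insert` helper and the values are hypothetical. Two SQLite-specific assumptions apply: foreign keys are only enforced after `PRAGMA foreign_keys = ON`, and a `typeof` check is used here to approximate a strict domain constraint. Authorisation is not shown, since SQLite has no user accounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (name TEXT PRIMARY KEY)")
conn.execute("""
    CREATE TABLE instructor (
        id INTEGER PRIMARY KEY,
        salary INTEGER NOT NULL CHECK (typeof(salary) = 'integer'),  -- domain constraint
        dept TEXT REFERENCES department(name)                        -- referential integrity
    )
""")
conn.execute("INSERT INTO department VALUES ('Physics')")

def try_insert(row):
    """Insert one instructor row and report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute("INSERT INTO instructor VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert((1, 90000, "Physics")),    # valid row
    try_insert((2, "a lot", "Physics")),  # salary is outside the integer domain
    try_insert((3, 80000, "Biology")),    # refers to a department that does not exist
]
```

The checks are declared once in the schema, so every application program that inserts data is subject to them without repeating the validation code.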

Data-definition language: SQL provides a rich DDL that allows one to define tables with
data types and integrity constraints

