Summary

Summary Advanced Database Management

Rating

Sold

Pages

Uploaded on

20-12-2025

Written in

2025/2026

Full summary Advanced Database and Control Management by prof. Wilfried Lemahieu, slides by prof. Bart Baesens. Treats every chapter that is treated in the classes. Large parts from some chapters like the last two, that we do not need to know for the exam were left out. If something is not included in the summary (like examples or figures), I always refer back to the specific slides. Difficult concepts are described in an intuitive way, like I have done in my PDBM summary.

Show more Read less

Institution

Course

Whoops! We can’t load your doc right now. Try again or contact support.

Report Copyright Violation

Connected book

Wilfried Lemahieu, Seppe Vanden Broucke Principles of Database Management

Edition:Unknown
ISBN:9781107186125
Edition:Unknown

Written for

Institution: Katholieke Universiteit Leuven (KU Leuven)
Study: Master Business & Information Systems Engineering
Course: Advanced Database Management ADBM (D0I72A)

All documents for this subject (1)

Document information

Summarized whole book?: No
Which chapters are summarized?: 8 t.e.m. 13, 16, 18
Uploaded on: December 20, 2025
Number of pages: 73
Written in: 2025/2026
Type: Summary

Subjects

database management
adbm
indexing
storage hierarchy
physical db organization
databases
file organization
db access methods
business continuity
physical file organization
enterprise storage subsystems

Content preview

Advanced Database Management and Content Management

2025 – 2026

Contents

I Physical File Organization and Indexing 4
1 Storage Hardware and Physical DB Design 4
1.1 The Storage Hierarchy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Internals of Hard Drives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3 From Logical Concepts to Physical Constructs (3-layer architecture) . . . . . . . . . . . . 6

2 Record Organization 7

3 File Organization 8
3.1 Heap File Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.2 Sequential File Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.3 Random File Organization (Hashing) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
3.4 Indexed Sequential File Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.5 List Data Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.6 Secondary Indexes and Inverted Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.7 B-trees and B+ -trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

II Physical DB Organization 21
4 Physical DB Organization and DB Access Methods 21
4.1 From DB to Tablespace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
4.2 Index Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4.3 DB Access Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.3.1 Functioning of Query Optimizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4.3.2 Index Search (with Atomic Search Key) . . . . . . . . . . . . . . . . . . . . . . . . 24
4.3.3 Multiple Index and Multicolumn Index Search . . . . . . . . . . . . . . . . . . . . 25
4.3.4 Index Only Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.3.5 Full Table Scan . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.4 Join Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.4.1 Nested-Loop join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.4.2 Sort-Merge join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.4.3 Hash join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

5 Enterprise Storage Subsystems and Business Continuity 29
5.1 Disk Arrays and RAID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
5.1.1 Techniques Applied in RAID . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
5.2 Enterprise Storage Subsystems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
5.2.1 DAS Directly Attached Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.2.2 SAN Storage Area Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.2.3 NAS Network Attached Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
5.2.4 NAS Gateway . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.2.5 iSCSI / storage over IP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.3 Business Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

,III Extended Relational Databases 37
6 Active RDBMS Extensions 38
6.1 Triggers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
6.2 Stored Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

7 Object-Relational RDBMS Extensions 39

8 Recursive SQL queries 43

IV Object Oriented Databases and Object Persistence 43
9 Advanced Concepts of OO 44
9.1 Method Overloading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
9.2 Inheritance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
9.3 Method Overriding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
9.4 Polymorphism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
9.5 Dynamic Binding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

10 Basic Principles of Object Persistence 45

11 OODBMS 47
11.1 Components of ODMG Standards . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
11.1.1 Object Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
11.1.2 Object Definition Language ODL . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
11.1.3 Object Query Language OQL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
11.1.4 Language Bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
11.2 Evaluation of OODBMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

V XML Databases 50
12 Extensible Markup Language XML 50
12.1 Document Type Definitions DTD and XML Schema . . . . . . . . . . . . . . . . . . . . . 51
12.2 Extensible Stylesheet Language . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
12.3 Namespaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
12.4 XPath . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

13 Processing XML Documents 53
13.1 DOM API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
13.2 SAX API . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

14 Storage of XML Documents 54

15 Differences between XML and Relational Data 55

16 Mappings Between XML Documents and (Object-) Relational Data 55
16.1 Table-Based Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
16.2 Schema-Oblivious Mapping (Schredding) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
16.3 Schema-Aware Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
16.4 SQL/XML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

17 Searching XML Data 58
17.1 Full-Text Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
17.2 Keyword-Based Search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
17.3 Structured Search with Xquery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

Advanced Database Management and Content Management

2/73

,18 XML for Information Exchange 59
18.1 Message Oriented Middleware (MOM) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
18.2 SOAP-Based Web Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
18.3 REST-Based Web Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
18.4 Web Services and Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

19 Other Data Representation Formats 60
19.1 JSON (JavaScript Object Notation) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
19.2 YAML (YAML Ain’t a Markup Language) . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

VI NoSQL Databases 61
20 The NoSQL Movement 61

21 Key-Value Stores 62

22 Tuple and Document Stores 65
22.1 Items with Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

23 Column-Oriented DBs 65

24 Graph Based DBs 66

25 Other NoSQL Categories 67

VII Data Distribution and Distributed Transaction Management 67
26 Distributed Systems and Distributed DBs 67

27 Architectural Implications of Distributed DBs 67

28 Fragmentation, Allocation and Replication 68

29 Transparancy 69

VIII Data Intergration, Data Quality and Data Governance 69
30 Data and Process Integration 69
30.1 Convergence of Analytical and Operational Data Needs . . . . . . . . . . . . . . . . . . . 69
30.2 Data Integration and Data Integration Patterns . . . . . . . . . . . . . . . . . . . . . . . . 70

Advanced Database Management and Content Management

3/73

, Quizzes and questions to prepare for the exam → https://www.pdbmbook.com

Legend Code Highlightings
SQL keywords are highlighted in blue
Java keywords are highlighted in red

ODL keywords are highlighted in purple

Part I
Physical File Organization and Indexing
1 Storage Hardware and Physical DB Design
1.1 The Storage Hierarchy

Figure 1: Computer memory hierarchy
→ High speed memory, expensive and limited in cap. at the top
→ Slower memory, relatively cheap and larger in size at the bottom

1. Primary Storage (volatile memory)
∗ CPU → executes mathematical and logical processor operations
∗ Cache memory opeates at nearly the same speed as CPU
∗ Central storage (internal-/ main memory) → consists of memory chips == RAM, performance
expressed in nanoseconds
∗ Contains DB buffer and runtime code of the applications and DBMS

2. Secondary Storage
∗ Persistent storage media
∗ HDD Hard disk drive and SSD solid state drives based on flash memory
∗ Contains physical DB files

Exchange of data between secondary and primary storage is called I/O (Input/Output) and is
supervised by the operating system

Advanced Database Management and Content Management

4/73

,1.2 Internals of Hard Drives
HDD stores data on circular platters, covered with magnetic particles. Platters are secured on a
spindle, which rotates at a constant speed. Magnetic particles on platters are organized in concentric
circular tracks, each track consisting of sectors1 , which are the smalles addressable unit on HDD. A set
of tracks with the same diameter are called cylinders.
∗ HDD also contains a hard disk controller
∗ HDDs are directly accessible storage devices (DASDs)
∗ Disk blocks (clusters, pages, allocation units) consist of 2 or more physically adjacent sectors

Figure 2: Internals of HDD

Operations

BS: block size, ROT: Rotation time, TR: Transfer rate

→ Reading from a block, or writing to a block implies
∗ Positioning the actuator (seek time)
∗ Wait until the desired sector has rotated under the read/write head (rotational delay/ latency)
Transfer time depends on block size, density of magnetic particles and rotation speed of disks
Response time
service time + queueing time
Service time
seek time + rotational delay + transfer time
−→ Minimize expected seek time and rotational delay
Trba = Seek + ROT/2 + BS/TR
→ Trba expected time to retrieve/write disk block that’s situated randomly on the platter, independently
of previous read/write (rba = random block access)
Tsba = ROT/2 + BS/TR
→ Tsba expected time to sequentially retrieve disk block with R/W head already in correct position, thus
when the R/W can just go on sequentially, not having to search for the block’s position (sba = sequential
block access)
1 Traditionally 512 bytes, more recently 4096 bytes

Advanced Database Management and Content Management

5/73

,1.3 From Logical Concepts to Physical Constructs (3-layer architecture)

Figure 3: 3-Layer-Architecture

The Layers can also be defined as followed:
1. Presentation Layer (external) e.g. tables etc.
Views are the visual components responsible for displaying information to a user
2. Application Layer (conceptual/logical)

3. Data Layer (internal) stores and manages all the data

Catalog is the heart of the DBMS and contains the data definition or metadata of de DB application.
– Stores the definitions of the views, logical and internal DM and synchronizes these 3 DM to guarantee
their consistency.

Advanced Database Management and Content Management

6/73

,2 Record Organization
→ Record organization refers to organization of data items into stored records. Physical
implementation of a data item is a series of bits. Common techniques:

Relative Location simplest and most widespread
∗ Data items that represent attributes of same entity are stored on physically adjacent addresses
∗ Attribute types determined by relative ordering

→ Here is a record a horizontal row in the table that represents a single, complete entry of related data.
Embedded Identification data items representing attributes always preceded by attribute type
∗ Only non-empty attributes of record included
∗ Missing attributes no problem and no need to store attributes in fixed order to identify
∗ Similar to XML and JSON

Advanced Database Management and Content Management

7/73

,Pointers and Lists ideal for dealing with variable length records (due to e.g. variable length data type
multivalued attribute type, optional attribute type. . . )

BS: Block size, ROT: Rotation time, TR: Transfer rate, BF: Blocking factor = ⌊BS/RS⌋

Blocking Factor indicates how many records are stored in a single disk block. Determines how many
records are retrieved with a single read operation

3 File Organization
Introductory Concepts
Search Key single attribute type, or set of attribute types, whose values determine criteria according
to which records are retrieved
∗ Can be PK, Alternative key, or one or more non-key attribute types
∗ Can be composite (e.g. country, gender)
∗ can also be used to specify range queries, e.g. YearOfBearth btw 1980 and 1990
Primary file organization methods determine physical positioning of stored records on storage medium
→ e.g. heap files, random file organization, indexed seq. file organiz.
∗ Can only be applied once
Secondary file organization methods provide constructs to efficiently retrieve records according to
search key that was not used for primary file organiz. Based on secondary index
Linear search each record in file is retrieved and assessed against search key
Hashing and indexing primary techniques that specify relationship btw record’s search key and physical
location

3.1 Heap File Organization
Basic (and simplest) primary file organization method

∗ New records inserted at end of file
∗ No relationship between record’s attributes and physical location
∗ Only option for record retrieval is linear search
∗ For a file with NBLK blocks, it takes on average NBLK/2 sba to find record according to unique
search key
∗ Searching records according to non-unique search key requires scanning entire file

→ Always adding records at the end results in an unstructured sequence. When retrieving or updating
a record, searching the record means checking every other record as well. Efficient for inserting,
inefficient for retrieval.

Ppt does not say anything about the specifics and details of heap file organization.
Advanced Database Management and Content Management

8/73

,3.2 Sequential File Organization
∗ Records stored in ascending/descending order of search key
∗ Efficient to retrieve records in order determined by search key
∗ Records can still be retrieved by means of linear search, but now a more effective stopping criterion
can be used, i.e. once first higher/lower key value than required one is found

Binary search technique can be used, for unique search key K, with values Ki : algorithm to retrieve
record with key value Kµ :

Expected number of block accesses to retrieve record according to PK by means of:
∗ Linear search: NBLK/2 sba
∗ Binary search: log2 (NBLK) rba

Figure 4: Example: Sequential File Organization

Updating sequential file is more cumbersome than updating heap file → often done in batch.
Sequential files often combined with one or more indexes, see chapter 3.4

3.3 Random File Organization (Hashing)
Random file organization / direct file organization / hash file organization

assumes direct relationship between value of search key and physical location. Hashing algorithm defines
key-to-address transformation → generated addressess pertain to a bucket (bucket is simply a storage
area that can hold one or more records)

Advanced Database Management and Content Management

9/73

, Figure 5: Hashing

Logically, most effective when using PK or other candidate key as search key.
Hashing cannot guarantee that all keys are mapped to different hash values, hence bucket addresses →
collision occurs when several records are assigned to same bucket, also called synonyms.
∗ If more synonyms than slots for a bucket, bucket is in overflow → additional block accesses needed
to retrieve overflow records
∗ Hashing algorithm should distribute keys as evenly as possible over the respective bucket addresses
→ Popular hashing technique is division
address(keyi ) = keyi mod M
M is often a prime number close to, but a bit larger than, the number of available addresses

Why M is typically chosen a prime number:
it minimizes repetitive mappings and ensures
a more uniform distribution of records among
Efficiency
buckets. A non-prime divisor tends to cause
clustering, increasing the risk of collisions and
overflows.

of hashing algorithm measured by expected number of rba and sba → After using the algorithm and
retrieving the bucket, there is 1 rba to the first block of the bucket. Possibly followed by sba’s if the
bucket has overflow records.

Percentage of overflow records depend on hashing algorithm and key set → aim to achieve uniform
distribution, spreading the set of records evenly over the set of available buckets. Required number of
buckets NB:
NR
NB = ⌈ ⌉
BS × LF
Advanced Database Management and Content Management

10/73

$13.70

Get access to the full document:

100% satisfaction guarantee

Immediately available after payment

Both online and in PDF

No strings attached

Get to know the seller

SmwBoy123

4.4

(5)

Get to know the seller

SmwBoy123 Katholieke Universiteit Leuven

View profile

Sold

Member since

3 year

Number of followers

Documents

Last sold

1 hour ago

4.4

5 reviews

Why students choose Stuvia

Created by fellow students, verified by reviews

Quality you can trust: written by students who passed their tests and reviewed by others who've used these notes.

Didn't get what you expected? Choose another document

No worries! You can instantly pick a different document that better fits what you're looking for.

Pay as you like, start learning right away

No subscription, no commitments. Pay the way you're used to via credit card and download your PDF document instantly.

“Bought, downloaded, and aced it. It really can be that simple.”

Alisha Student

Frequently asked questions

What do I get when I buy this document?

You get a PDF, available immediately after your purchase. The purchased document is accessible anytime, anywhere and indefinitely through your profile.

Satisfaction guarantee: how does it work?

Our satisfaction guarantee ensures that you always find a study document that suits you well. You fill out a form, and our customer service team takes care of the rest.

Who am I buying these notes from?

Stuvia is a marketplace, so you are not buying this document from us, but from seller SmwBoy123. Stuvia facilitates payment to the seller.

Will I be stuck with a subscription?

No, you only buy these notes for $13.70. You're not tied to anything after your purchase.

Can Stuvia be trusted?

4.6 stars on Google & Trustpilot (+1000 reviews) 43863 documents were sold in the last 30 days Founded in 2010, the go-to place to buy study notes for 15 years now

Summary Advanced Database Management

Connected book

Written for

Document information

Subjects

Content preview

Get to know the seller

Recently viewed by you

Why students choose Stuvia

Created by fellow students, verified by reviews

Didn't get what you expected? Choose another document

Pay as you like, start learning right away

Frequently asked questions

What do I get when I buy this document?

Satisfaction guarantee: how does it work?

Who am I buying these notes from?

Will I be stuck with a subscription?

Can Stuvia be trusted?