⛏️
Process Mining
Created @April 23, 2025 1:22 PM
Class INFOMPROM
Week 1
Lecture 1 - Introduction to Process Mining
Big Data and Data Science
based on data: what happened?
root cause analysis: why did it happen?
focus on future: what will happen?
what is the best that can happen?
Why Process Mining?
Gaining knowledge of, and insight into, (business) processes by analysing the event data recorded during execution of the process.
Why Process Mining?
Understand the behavior as it actually happens
Then identify, remove, predict, prevent
bottlenecks: slow steps, unnecessary delays
inefficiencies: unnecessary rework, unnecessary additional steps
costly/illegal variations: costly steps, short-cuts, variants leading to failure
Then improve, recommend, automate
Different Tasks in Process Mining
Case: A single instance of a process (e.g., one customer order with number 9901).
Event: A recorded occurrence of an activity for a specific case at a specific time (e.g., "Order Received" for order 9901).
Activity: A specific task or step in the process (e.g., "Ship Item"); multiple events can refer to the same activity.
Trace: The time-ordered sequence of events for one case (e.g., ["Receive Order", "Pack", "Ship"]).
Variant: A distinct sequence of activities (trace pattern); one or more cases can follow the same variant.
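To make the terms concrete, here is a minimal sketch that groups events into traces and counts variants. The event log, the case ids, and the helper name build_traces are made up for illustration; real logs usually carry more attributes per event.

```python
from collections import Counter, defaultdict

# Hypothetical event log: one (case id, activity, timestamp) tuple per event.
events = [
    ("9901", "Receive Order", "2025-04-01T09:00"),
    ("9902", "Receive Order", "2025-04-01T09:05"),
    ("9901", "Pack", "2025-04-01T10:30"),
    ("9902", "Pack", "2025-04-01T11:00"),
    ("9901", "Ship", "2025-04-02T08:00"),
    ("9903", "Receive Order", "2025-04-02T09:00"),
    ("9902", "Ship", "2025-04-02T09:30"),
    ("9903", "Ship", "2025-04-02T12:00"),
]

def build_traces(events):
    """Group events by case and order each group by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, timestamp in events:
        by_case[case_id].append((timestamp, activity))
    # ISO 8601 strings in one consistent format sort chronologically.
    return {
        case_id: tuple(activity for _, activity in sorted(case_events))
        for case_id, case_events in by_case.items()
    }

traces = build_traces(events)        # case id -> trace (tuple of activities)
variants = Counter(traces.values())  # variant -> number of cases following it

for variant, n_cases in variants.most_common():
    print(n_cases, " -> ".join(variant))
```

In this toy log, cases 9901 and 9902 share one variant, while case 9903 skips "Pack" and forms a second variant.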
Event Data May Come From
a database system (e.g., patient data in a hospital),
a comma-separated values (CSV) file or spreadsheet,
a transaction log (e.g., a trading system),
a business suite/ERP system (SAP, Oracle, etc.),
a message log (e.g., from IBM middleware),
an open API providing data from websites or social media, …
Formal notations
Scenarios
Directly Follows Graphs (Process Maps)
Most widely used process discovery technique.
Create a graph with one node per activity (special start/end).
For each case, iterate through its sequence of events:
connect activities a and b if a is followed directly by b (in the same case).
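The construction above can be sketched in a few lines. This is only an illustration on made-up traces, not a reference implementation; the function name discover_dfg is an assumption, and the special start/end nodes are left out for brevity.

```python
# Hypothetical traces: one time-ordered sequence of activities per case.
traces = {
    "9901": ["Receive Order", "Pack", "Ship"],
    "9902": ["Receive Order", "Pack", "Ship"],
    "9903": ["Receive Order", "Ship"],
}

def discover_dfg(traces):
    """Return (nodes, arcs) of the directly-follows graph of the given traces."""
    nodes = set()
    arcs = set()
    for trace in traces.values():
        nodes.update(trace)  # one node per activity
        # zip pairs each activity with its direct successor in the same case.
        for a, b in zip(trace, trace[1:]):
            arcs.add((a, b))
    return nodes, arcs

nodes, arcs = discover_dfg(traces)
for a, b in sorted(arcs):
    print(f"{a} -> {b}")
```

Because arcs are only created from pairs inside one trace, events of different cases are never connected, even if they are adjacent in the raw log.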
Advanced version:
Count how often each directly-follows relation occurs.
Keep track of the timestamps (e.g., to compute the time between two activities).
Add dummy start and end activities.
Keep track of any other attribute of interest.
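A possible sketch of this annotated version follows, assuming each case is already a time-ordered list of (activity, timestamp) pairs. The log, the START/END labels, and the function name are hypothetical; the mean time between two activities is just one example of an attribute that could be tracked.

```python
from collections import Counter, defaultdict
from datetime import datetime

START, END = "START", "END"  # hypothetical labels for the dummy activities

# Hypothetical log: per case, a time-ordered list of (activity, timestamp).
log = {
    "9901": [("Receive Order", "2025-04-01T09:00"),
             ("Pack", "2025-04-01T10:00"),
             ("Ship", "2025-04-01T13:00")],
    "9902": [("Receive Order", "2025-04-01T09:30"),
             ("Pack", "2025-04-01T12:30"),
             ("Ship", "2025-04-01T14:30")],
    "9903": [("Receive Order", "2025-04-02T08:00"),
             ("Ship", "2025-04-02T09:00")],
}

def discover_annotated_dfg(log):
    """Return arc frequencies and the mean time (in seconds) per arc."""
    counts = Counter()
    durations = defaultdict(list)  # arc -> seconds between the two events
    for events in log.values():
        activities = [activity for activity, _ in events]
        times = [datetime.fromisoformat(ts) for _, ts in events]
        # Dummy start/end activities make entry and exit points explicit.
        padded = [START] + activities + [END]
        for a, b in zip(padded, padded[1:]):
            counts[(a, b)] += 1
        # Durations exist only between real events, not for the dummy arcs.
        for i in range(len(events) - 1):
            arc = (activities[i], activities[i + 1])
            durations[arc].append((times[i + 1] - times[i]).total_seconds())
    mean_seconds = {arc: sum(v) / len(v) for arc, v in durations.items()}
    return counts, mean_seconds

counts, mean_seconds = discover_annotated_dfg(log)
for arc, n in counts.items():
    print(arc, n, mean_seconds.get(arc))
```

The counts are what process maps typically show as arc labels or arc thickness; the durations are the basis for spotting slow steps.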
Pruning directly follows graph – (in)frequent activities and arcs
Remove infrequent activities from the event log (projection).
Remove low-frequency arcs.
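Both pruning steps can be sketched as below. The traces, the thresholds, and the function names are assumptions chosen for illustration; how to pick a sensible threshold is not covered in these notes.

```python
from collections import Counter

# Hypothetical traces; "Check Stock" occurs only once in the whole log.
traces = [
    ["Receive Order", "Pack", "Ship"],
    ["Receive Order", "Pack", "Ship"],
    ["Receive Order", "Check Stock", "Pack", "Ship"],
    ["Receive Order", "Ship"],
]

def project_on_frequent_activities(traces, min_count):
    """Drop every event whose activity occurs fewer than min_count times."""
    freq = Counter(activity for trace in traces for activity in trace)
    keep = {activity for activity, n in freq.items() if n >= min_count}
    return [[a for a in trace if a in keep] for trace in traces]

def count_arcs(traces):
    """Count each directly-follows pair over all traces."""
    return Counter((a, b) for trace in traces for a, b in zip(trace, trace[1:]))

def prune_arcs(arc_counts, min_count):
    """Keep only arcs observed at least min_count times."""
    return {arc: n for arc, n in arc_counts.items() if n >= min_count}

projected = project_on_frequent_activities(traces, min_count=2)
arc_counts = count_arcs(projected)
pruned = prune_arcs(arc_counts, min_count=2)
print(pruned)
```

Note that projection happens on the event log, before the graph is built: once "Check Stock" is removed, "Receive Order" is directly followed by "Pack" in the third trace, so that arc's count goes up. Arc pruning, in contrast, only removes edges from the graph and leaves the log untouched.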
Pruning directly follows graph – (in)frequent variants
Remove infrequent trace variants from event log.
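Variant-based pruning could look like the following sketch, which keeps only cases whose variant is followed by at least min_cases cases. The traces, the threshold, and the function name are hypothetical.

```python
from collections import Counter

# Hypothetical traces, stored as tuples so they can be counted as variants.
traces = {
    "9901": ("Receive Order", "Pack", "Ship"),
    "9902": ("Receive Order", "Pack", "Ship"),
    "9903": ("Receive Order", "Ship"),
    "9904": ("Receive Order", "Pack", "Ship"),
}

def filter_infrequent_variants(traces, min_cases):
    """Keep only the cases whose variant occurs in at least min_cases cases."""
    variant_counts = Counter(traces.values())
    return {
        case_id: trace
        for case_id, trace in traces.items()
        if variant_counts[trace] >= min_cases
    }

filtered = filter_infrequent_variants(traces, min_cases=2)
print(sorted(filtered))
```

A directly follows graph built from the filtered log contains only behavior from complete, frequent traces, so every remaining arc is backed by at least one kept case from start to end.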
Advantages of DFG:
Simple
Limitations of DFG:
Ambiguous semantics