Lecture 1
There are 6 assignments; they are due on Friday
The clips are mandatory learning material; we can watch them as many times as we like
→ there are self-tests, which do not count towards the final grade
→ all the clips and articles are mandatory learning material!
Web clip 1.1: What are big data?
Observing behavior:
Digital traces (or exhaust): a record created and stored of some behavior
- Click on a website
- Call or location on a phone
- Buy with a credit card
- Like or share
- Watch
- Internet of Things (sensor data)
- Image
- Text
-…
➔ Explosion in usage of data
➔ Total number of variables and total number of observations!
Means ‘n is much larger than p’ (many observations n, few variables p)
→ = Tall Data
Means few observations and many variables (‘p is much larger than n’)
→ = Wide Data
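The tall/wide distinction can be sketched in a few lines of Python. This is only an illustration: the function name `describe_shape` and the cutoff `ratio=10` are assumptions made here, since the clip gives no fixed threshold for "much larger", and the two example datasets are hypothetical.

```python
def describe_shape(n_observations, n_variables, ratio=10):
    """Label a dataset by comparing n (observations) with p (variables).

    `ratio` is an assumed cutoff for "much larger"; it is not from the lecture.
    """
    if n_observations >= ratio * n_variables:
        return "tall"   # n is much larger than p
    if n_variables >= ratio * n_observations:
        return "wide"   # p is much larger than n
    return "neither"    # n and p are of similar size


# Hypothetical click log: one row per click, a handful of columns
print(describe_shape(n_observations=1_000_000, n_variables=5))

# Hypothetical sensor study: few units, thousands of measurements each
print(describe_shape(n_observations=200, n_variables=20_000))
```

The first call falls under Tall Data and the second under Wide Data; a dataset with roughly as many observations as variables gets neither label under this assumed cutoff.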
➔ So for us: big data is, at the least, a label for what we are interested in. In summary, it is not
just about the science of the data; it is also about the tools and the models that we use to extract
insights from that data
Primary data vs. secondary data:
➔ The teacher disagrees with the book in a sense, even though a lot of big data applications are
(or have to be) repurposed and have this readymade quality about them (his reasoning is above)
Web clip 1.2: Uses of big data: