Summary Big Data — Review sheet (L3 — SKEMA Business School)

Pages
8
Uploaded on
14-04-2025
Written in
2024/2025

Big Data refers to the massive volumes of data generated continuously by human and digital activity. It is characterized by the 5 Vs: Volume (large quantities of data), Velocity (speed of generation and processing), Variety (diversity of formats: text, image, video...), Veracity (quality and reliability of data) and Value (ability to produce useful information). In a business context, Big Data helps companies understand their customers better, optimize their processes and base strategic decisions on concrete data. Data sources include social networks, IoT sensors, online transactions, and web browsing. Key technologies include Hadoop, Spark and NoSQL databases, as well as visualization tools such as Tableau or Power BI. Big Data is also closely linked to artificial intelligence and predictive analytics. The main challenges are ethical (protection of personal data), technological (storage and processing capacity) and organizational (data science skills, digital transformation).

Content preview

Big data : CM1

3 types of analysis techniques
Descriptive : what has happened, identify problems and solutions
Predictive : what could happen, historical data, estimate, etc.
Prescriptive : what should we do, optimize and simulate, explore, build, etc.

Big data : massive volumes of both structured and unstructured data that are difficult to manage, process, and analyze using traditional data-processing tools.


3 Vs :
Volume
Velocity
Variety

Statistical data types :
Cross-sectional data : for a given sort of entity for a single period of time
Time-series data : for a single entity for multiple periods of time
Panel data : for multiple entities for multiple periods of time
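
The three data types can be seen as different slices of the same table. The sketch below is only an illustration: the store names, months and sales figures are invented, and the helper functions `cross_section` and `time_series` are hypothetical names, not part of the course.

```python
# Hypothetical monthly sales, keyed by (entity, period). Invented for illustration.
# Several entities over several periods: this whole table is panel data.
panel = {
    ("store_A", "2024-01"): 120,
    ("store_A", "2024-02"): 135,
    ("store_B", "2024-01"): 80,
    ("store_B", "2024-02"): 95,
}


def cross_section(data, period):
    """Cross-sectional data: the entities observed in one single period."""
    return {entity: value for (entity, p), value in data.items() if p == period}


def time_series(data, entity):
    """Time-series data: one single entity observed over multiple periods."""
    return [(p, value) for (e, p), value in sorted(data.items()) if e == entity]


print(cross_section(panel, "2024-01"))  # {'store_A': 120, 'store_B': 80}
print(time_series(panel, "store_A"))    # [('2024-01', 120), ('2024-02', 135)]
```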

The linear regression model
Def : postulates that the relationship between the dependent and independent variables is linear
Dependent variable = y
Independent variables = X1, X2, ..., Xn


A regression model treats all independent variables as numerical



Dummy variable : used to describe 2 categories of a categorical variable, d
d = 1 for 1 of the categories
d = 0 for the other(s)
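
As a minimal sketch of this coding rule, a two-category variable can be turned into a 0/1 column so that a regression can treat it as numerical. The column values and the helper name `dummy` are assumptions made for the example.

```python
def dummy(values, coded_as_one):
    """Return d = 1 for rows in the chosen category, d = 0 for the other(s)."""
    return [1 if v == coded_as_one else 0 for v in values]


# Hypothetical categorical column, invented for illustration.
contract = ["full-time", "part-time", "full-time", "part-time", "part-time"]

d = dummy(contract, "full-time")
print(d)  # [1, 0, 1, 0, 0]
```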


Simple regression model :




Multiple regression model :




y^ : predicted value of the dependent variable

This equation is the model. It allows us to calculate the predicted value of the dependent variable for any given values of the independent variables.
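
For reference, the usual textbook form of the two models is sketched here. The coefficient symbols b_0, ..., b_n are an assumption, since the notation used in the notes themselves is not shown.

```latex
\begin{align*}
\hat{y} &= b_0 + b_1 X_1 && \text{(simple regression)} \\
\hat{y} &= b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n && \text{(multiple regression)}
\end{align*}
```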


The difference between the observed and the predicted values is the residual : e = y - y^
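
A small worked example of predicted values and residuals for a simple regression. It assumes the coefficients are estimated by ordinary least squares, which the notes do not state explicitly. The data points and the names `fit_simple_regression`, `b0` and `b1` are invented for the illustration.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares estimates (b0, b1) for y^ = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx              # slope
    b0 = mean_y - b1 * mean_x   # intercept
    return b0, b1


# Hypothetical observations, invented for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_simple_regression(x, y)
y_hat = [b0 + b1 * xi for xi in x]                  # predicted values y^
residuals = [yi - yh for yi, yh in zip(y, y_hat)]   # e = y - y^

print(round(b0, 2), round(b1, 2))  # 2.2 0.6
```

With least squares and an intercept, the residuals sum to zero (up to rounding), which is a quick check that the fit was computed correctly.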




Seller : capucinetribondeau (SKEMA Nice)