Summary Data Mining for Business & Governance

Pages: 65
Uploaded: 6 October 2021
Written in: 2020/2021

Summary Data Mining for Business & Governance, written in the spring semester of 2021 for Data Science & Society, Tilburg University.

Recap before midterm
What is data mining?
(slides) Data mining is the computational process of discovering patterns
in large data sets involving methods at the intersection of artificial
intelligence, machine learning, statistics and database systems.
(google) Data mining is searching for patterns in data. More precisely,
data mining is the actual extraction of knowledge from data via technologies
that incorporate these principles.
(slides Chris) Data mining is a concept to unify statistics, data analysis
and their related methods in order to understand and analyze actual
phenomena with data.
With data mining, we want to prove that something can be predicted
better than the baseline, or that a certain method works better than a
method that has been explored before.
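
A minimal sketch of what "better than the baseline" can mean in practice. The churn data, the threshold rule and the function names are all made up for illustration; the baseline used here is the common majority-class baseline, which always predicts the most frequent training label.

```python
from collections import Counter

def majority_baseline(train_labels):
    """Return a predictor that always outputs the most common training label."""
    most_common_label = Counter(train_labels).most_common(1)[0][0]
    return lambda x: most_common_label

def threshold_model(x):
    """Hypothetical model: predict 'churn' when monthly usage is below 10 hours."""
    return "churn" if x < 10 else "stay"

def accuracy(predict, xs, ys):
    """Fraction of examples for which the prediction equals the true label."""
    correct = sum(1 for x, y in zip(xs, ys) if predict(x) == y)
    return correct / len(ys)

# Made-up toy data: x is monthly usage in hours, y is the customer outcome.
train_y = ["stay", "stay", "churn", "stay", "stay", "churn", "stay"]
test_x = [3, 25, 40, 8, 15, 30, 5, 22]
test_y = ["churn", "stay", "stay", "churn", "stay", "stay", "stay", "stay"]

baseline = majority_baseline(train_y)
baseline_acc = accuracy(baseline, test_x, test_y)
model_acc = accuracy(threshold_model, test_x, test_y)

print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"model accuracy:    {model_acc:.3f}")
```

The model only counts as an improvement if its score on held-out data is higher than the baseline score on the same data.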


What are the related disciplines?
The related disciplines that overlap with data mining are:
1. Artificial Intelligence (AI): interdisciplinary field aiming to develop
intelligent machines
2. Machine Learning (ML): branch of computer science studying
learning from data
3. Statistics: branch of mathematics focused on data
4. Information retrieval/knowledge discovery in databases
Others are;

What are the applications?
In companies, data mining is applied as business intelligence (market
analysis and management).
In science, data mining is applied as knowledge discovery (scientific
discovery in large data). Science also uses text mining (natural language
processing), which goes from unstructured text to structured
knowledge.
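
To illustrate the step from unstructured text to a structured form, here is a small sketch of a bag-of-words table. The two sentences and the function name are assumptions for the example; real text mining pipelines do much more (stemming, stop words, weighting).

```python
import re
from collections import Counter

def bag_of_words(text):
    """Lowercase the text, split it into alphabetic tokens and count them."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)

# Made-up example documents (unstructured text).
documents = [
    "Data mining finds patterns in data.",
    "Text mining turns text into structured knowledge.",
]

# Structured result: one column per word, one row per document.
vocabulary = sorted(set(word for doc in documents for word in bag_of_words(doc)))
table = [[bag_of_words(doc)[word] for word in vocabulary] for doc in documents]

print(vocabulary)
for row in table:
    print(row)
```

Each row is now a numerical feature vector that a learning algorithm can work with.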


What is big data?
(slides) Big data is characterized by three properties:
1. Volume: data that is too big for manual analysis, too big to fit in
RAM and too big to store on disk.
2. Variety: big data has high ranges of values (variance), has outliers,
confounders and noise, and consists of different data types.
3. Velocity: big data changes quickly (results are required before the
data changes) and arrives as streaming data (no storage).
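
The "no storage" point under velocity can be sketched with a running mean and variance that see each value once and then discard it. This uses Welford's online algorithm, which is not named in the slides; the sensor readings and the class name are made up.

```python
class RunningStats:
    """Mean and sample variance of a stream, updated one value at a time."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0

def sensor_stream():
    """Hypothetical stream: values are yielded one by one and never stored."""
    for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        yield reading

stats = RunningStats()
for value in sensor_stream():
    stats.update(value)

print(stats.n, stats.mean, stats.variance())
```

Memory use stays constant no matter how long the stream runs, because only three numbers are kept.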
(readings) Big data means datasets that are too large for traditional
data-processing systems and that therefore require new technology. In big
data 1.0, businesses got the basic internet technologies in place so that
they could establish a web presence, build electronic commerce capability
and improve operating efficiency. With big data 2.0, new systems and
companies started to exploit the interactive nature of the web. The
changes brought on by this shift in thinking are extensive and pervasive;
the most obvious are the incorporation of social-networking components
and the rise of the ‘voice’ of the individual consumer and citizen.


Different types of learning: supervised and unsupervised
Supervised learning (classification, regression) is done using a ground
truth; we have prior knowledge of what the output values of our samples
should be. The goal of supervised learning is to learn a function that,
given a sample of data and desired outputs, best approximates the
relationship between input and output observable in the data. Supervised
learning means that the data is labeled. In supervised learning, you know
x and y.
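
A small sketch of supervised learning as "learning a function from x and y", using ordinary least squares for a straight line. The data points and function names are invented for the example.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to labeled examples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up labeled data: every input x comes with a known output y.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(xs, ys)

def predict(x):
    """The learned function, applied to a new input."""
    return slope * x + intercept

print(f"y = {slope:.2f} * x + {intercept:.2f}")
print(f"prediction for x = 6: {predict(6.0):.2f}")
```

The labels ys are the ground truth here; without them there would be nothing to fit the line against.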
Unsupervised learning (clustering, dimensionality reduction) does not
have labeled outputs, so its goal is to infer the natural structure present
within a set of data points. Unsupervised learning means that the data is
not labeled; we want to find patterns within the data. In unsupervised
learning, you know only x (you do not yet know what you are looking for). In
short, unsupervised learning can be defined as data mining algorithms
that infer patterns from a dataset without reference to outcomes or
decisions.
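
As a sketch of finding structure with only x available, here is a bare-bones k-means clustering. The points are made up, and the centroids are initialized with the first k points to keep the run deterministic, which is a simplification; in practice random or k-means++ initialization is more common.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two 2D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans(points, k, iterations=10):
    """Alternate between assigning points to centroids and moving centroids."""
    centroids = list(points[:k])  # simplification: first k points as start
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, assignments

# Made-up unlabeled data: only x, no y.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, k=2)

print(assignments)
print(centroids)
```

No labels are used anywhere; the two groups come purely from how the points lie relative to each other.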
Semi-supervised classification is a combination of both. We have a small
amount of labeled data together with a large amount of unlabeled data, and
the unlabeled instances still have to be assigned to the decision classes.


Examples of supervised and unsupervised learning (regression,
classification, clustering, dimensionality reduction)
Supervised: regression, classification (3 parts: input, output and function)
Unsupervised: clustering, dimensionality reduction


Workflow of supervised learning
1. Collect data
2. Label examples
3. Choose representation (features are numerical or categorical,
possibly convert to feature vector)
4. Train models (use a training set for learning and a validation
set for tuning. Hyperparameters are settings of learning
algorithms. For each hyperparameter value, apply the
algorithm to the training set to learn, check performance on the
validation set and find the best-performing setting)
5. Evaluate (check performance of the tuned model on the test set. You
want to estimate how well your model will do in the real
world).
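
The workflow above can be sketched with a tiny k-nearest-neighbours classifier, where k is the hyperparameter. All data, the split into three sets and the function names are made up for illustration; the notes do not prescribe a particular algorithm.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Predict the majority label among the k training points nearest to x."""
    neighbours = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

def accuracy(train, data, k):
    """Fraction of examples in data that are classified correctly."""
    hits = sum(1 for x, y in data if knn_predict(train, x, k) == y)
    return hits / len(data)

# Steps 1-3: collected, labeled examples with one numerical feature each.
# Two training labels are deliberately noisy (4.0 and 6.0).
train = [(1.0, "low"), (2.0, "low"), (3.0, "low"), (4.0, "high"),
         (6.0, "low"), (7.0, "high"), (8.0, "high"), (9.0, "high")]
validation = [(3.8, "low"), (6.2, "high"), (2.4, "low"), (8.4, "high")]
test = [(1.4, "low"), (7.6, "high"), (3.4, "low"), (6.4, "high")]

# Step 4: learn on the training set, tune the hyperparameter k on validation.
candidates = [1, 3, 5]
validation_scores = {k: accuracy(train, validation, k) for k in candidates}
best_k = max(candidates, key=lambda k: validation_scores[k])

# Step 5: evaluate the tuned model once on the untouched test set.
test_accuracy = accuracy(train, test, best_k)

print(validation_scores)
print(f"best k: {best_k}, test accuracy: {test_accuracy:.2f}")
```

The test set is used only once, after tuning, so that its score is an honest estimate of performance on unseen data.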
Seller: xtessaroes, Tilburg University