
Summary: Computational Analysis Exam

Pages: 35
Uploaded on: 07-02-2022
Written in: 2021/2022

A summary of the lectures and tutorials of the Computational course given in the master's programme in Communication Science

Key points for the CADC exam
Definition of computational communication science (CCS)
CCS is the label applied to the emerging subfield that investigates the
use of computational algorithms to gather and analyze big and often semi-
or unstructured data sets to develop and test communication science
theories.

R produces publication-ready figures and visualizations. It allows us to
combine analyses and writing to produce diverse output formats.
R allows flexible and comprehensive programming:
- Complex data management
- Advanced analyses with large/messy data

10 characteristics of Big Data
1. Big: The scale or volume of some current datasets is often
impressive, but big datasets are not an end in themselves.
2. Always on: Many big data systems are constantly collecting data
and thus make it possible to study unexpected events and allow for
real-time measurement. Data is valuable when collection is always on;
sometimes analyses can even be made of things that have not yet happened.
3. Non-reactive: Participants are generally not aware that their data is
being captured, or they have become so accustomed to this data
collection that it no longer changes their behaviour. Facebook does not
share with us which data it stores.
4. Incomplete: Most big data sources are incomplete in the sense that
they don't have the information that you want for your research,
because the data was collected for another purpose.
5. Inaccessible: Data held by companies is difficult for researchers to
access. Data is sometimes managed by companies or ministries.
6. Nonrepresentative: Most big datasets are not representative of
certain populations, so out-of-sample generalizations are difficult
or impossible.
7. Drifting: Many big data systems are changing constantly, which makes
it difficult to study long-term trends.
8. Algorithmically confounded: Behaviour in big data systems is not
natural; it is driven by the engineering goals of the systems.
9. Dirty: Big data often includes a lot of noise (spam, junk, etc.).
10. Sensitive: Some information held by companies and governments
is sensitive.

Is big data always a good idea?

Big data is not the solution to all methodological problems and has
limitations:
- Big data is found, while survey data is made by the researcher
- Big data is not always representative of a certain population; it is
often simply picked up from wherever it happens to be available
- Significance (p-values) is less meaningful as a measure of validity
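
The point about p-values can be illustrated with a small calculation. The sketch below is not from the course material (which works in R); it is a Python illustration that assumes a simple two-group z-test with equal group sizes and a standard deviation of 1 in both groups. The function name `two_sided_p` and the effect size of 0.01 are hypothetical choices made for the example.

```python
import math

def two_sided_p(effect_sd, n_per_group):
    """Two-sided p-value of a z-test for a difference in means.

    Assumes two groups of equal size with standard deviation 1,
    so effect_sd is the difference expressed in SD units.
    """
    standard_error = math.sqrt(2.0 / n_per_group)
    z = effect_sd / standard_error
    # erfc(|z| / sqrt(2)) is the two-sided tail area of the standard normal
    return math.erfc(abs(z) / math.sqrt(2.0))

tiny_effect = 0.01  # hypothetical: one hundredth of a standard deviation

p_survey = two_sided_p(tiny_effect, 100)        # survey-sized sample
p_big = two_sided_p(tiny_effect, 1_000_000)     # big-data-sized sample

print(f"n = 100 per group:       p = {p_survey:.3f}")
print(f"n = 1,000,000 per group: p = {p_big:.2e}")
```

The same negligible effect is far from significant in the small sample and highly significant in the large one, which is why significance alone says little about whether a big data finding matters.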

Example study: Kramer et al., Facebook

Much about this study is cool, but even more is not cool
- No informed consent
- Not replicable
- Low internal validity: is the sentiment of posts indicative of mood? And
does a change in sentiment originate in contagion of mood?
- Low measurement accuracy: are word counts indicative of sentiment?
- Overt manipulation of people's lives
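
The measurement accuracy concern can be made concrete with a minimal word-count sentiment scorer. This is a Python sketch under stated assumptions, not the procedure used in the study: the word lists `POSITIVE` and `NEGATIVE` and the function `sentiment_score` are hypothetical and deliberately tiny.

```python
import re

# Hypothetical mini-dictionaries; real sentiment dictionaries are far larger.
POSITIVE = {"happy", "great", "love"}
NEGATIVE = {"sad", "awful", "hate"}

def sentiment_score(post):
    """Count positive words minus negative words in a post."""
    words = re.findall(r"[a-z']+", post.lower())
    positive_hits = sum(word in POSITIVE for word in words)
    negative_hits = sum(word in NEGATIVE for word in words)
    return positive_hits - negative_hits

print(sentiment_score("I love this great day"))   # 2
print(sentiment_score("I am not happy at all"))   # 1
```

The second post is scored as positive even though it expresses the opposite, because counting words ignores negation and context. That is one reason word counts may be a weak indicator of sentiment, and sentiment of mood.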
Ethical problems
- Respect for persons
- Beneficence: understanding and improving the risk/benefit profile of a
study
- Justice: risk and benefits should be evenly distributed
- Respect for law and public interest

Typical computational research strategies
1. Counting things: in the age of big data, researchers can 'count'
more than ever
2. Forecasting and nowcasting: big data allow for more accurate
predictions both in the present and future
3. Approximating experiments: computational methods provide
opportunities to conduct 'natural experiments'

Promises of computational communication research
The recent acceleration in the use of computational methods for
communication science is primarily fueled by the confluence of at least
three developments:
- vast amounts of digitally available data, ranging from social media
messages and other "digital" traces to web archives and newly digitized
newspaper and other historical archives
- improved tools to analyze this data, including network analysis
methods and automatic text analysis methods such as supervised
text classification, topic modelling, word embeddings and
syntactic methods
- powerful and cheap processing power, and easy-to-use computing
infrastructure for processing these data, including scientific and
commercial cloud computing, sharing platforms such as GitHub and
Dataverse, and crowd coding platforms such as Amazon MTurk and
Crowdflower

Challenges of computational communication science
- Data-driven research questions might not be theoretically
interesting
- Proprietary data threatens accessibility and reproducibility
- Found data is not always representative, threatening external
validity
- Bias and noise in computational methods threaten accuracy and
internal validity
- Inadequate ethical standards/procedures

Promises of Computational communication research
- Vast amounts of digitally available data
- Improved tools to analyse the data
- Powerful and cheap processing power and easy-to-use computing
infrastructure

Advantages & Disadvantages of Computational Methods
Advantages
- From self-report to real behaviour: real behaviour can be measured
without self-reported attitudes or intentions getting in the way. This
can help with social desirability problems and does not depend on
people's desires and intentions. Underlying human communication also
comes to the surface.
- Social context vs lab setting: people's reactions can be observed in a
real environment/daily life instead of in a lab setting.
- Small N to large N: more people in a study also makes it possible to
explain more subtle relationships or effects in smaller subpopulations.
- From solitary to collaborative: digital data and computer tools make it
easier to share and reuse resources.
Disadvantages
- Techniques often complicated
- Data often proprietary: data is often only available to certain people
- Samples often biased
- Insufficient metadata

RR 1: When communication meets computation: opportunities,
challenges and pitfalls in computational communication science
Wouter van Atteveldt & Tai-Quan Peng

The role of computational methods in communication science
The recent acceleration in their use is primarily driven by the confluence
of three developments: a lot of available data, improved analysis tools,
and powerful and cheap processing power.
1. A deluge of digitally available data, ranging from social media
messages and other "digital traces" to web archives and newly digitized
newspaper and other historical archives
2. Improved tools to analyze this data, including network analysis
methods and automatic text analysis methods such as supervised text
classification, topic modelling, word embeddings and syntactic methods
3. The emergence of powerful and cheap processing power, and
easy-to-use computing infrastructure for processing these data,
including scientific and commercial cloud computing,
sharing platforms such as GitHub and Dataverse, and crowd coding
platforms such as Amazon MTurk and Crowdflower

In general, studies using computational communication methods
involve the following:
1. Large and complex data sets
2. Consisting of digital traces and other naturally occurring data
3. Requiring algorithmic solutions to analyse
4. Allowing the study of human communication by applying and testing
communication theory

Week 2 Data Wrangling & Data visualization
A general model of data science
Import -> tidy -> transform -> visualize -> model -> communicate
1. Import data
- Data comes in different forms (two- or
multidimensional, text or numbers...) and formats
(.csv, .txt, .sav, .stata, .html...)
- First, we must find a way to import this data into R
- This typically means that you take data stored in a
file, database, or web application programming
interface (API), and load it into a data frame in R
- Imagine we had found the following table on Wikipedia and
wanted to get it into R...
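
The import step can be sketched as follows. The course itself uses R; this is a Python stand-in using only the standard library, and the CSV content, the column names `name` and `score`, and the function `import_csv` are hypothetical placeholders, not the Wikipedia table mentioned above.

```python
import csv
import io

# Hypothetical CSV text standing in for a file on disk or a web download.
raw = "name,score\nA,1.5\nB,2.0\n"

def import_csv(text):
    """Read CSV text into a column-oriented table (a dict of lists)."""
    rows = list(csv.DictReader(io.StringIO(text)))
    table = {column: [row[column] for row in rows] for column in rows[0]}
    # Everything arrives as text; convert the numeric column explicitly.
    table["score"] = [float(value) for value in table["score"]]
    return table

table = import_csv(raw)
print(table)
```

The dict of lists plays the role of a data frame here: one entry per column, with all columns the same length. The explicit type conversion shows why importing is usually followed by a tidying step.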
Seller: teddievdstaak1, Radboud Universiteit Nijmegen