
Lecture notes Computational Analysis of Digital Communication (S_CADC)

Pages
51
Uploaded on
27-11-2023
Written in
2023/2024

Lecture 1: Introduction to Computational Methods in Communication Science
Lecture 2: Automated Text Analysis and Dictionary Approaches
Lecture 3: Text Classification and Classic Machine Learning
Lecture 4: Word Embeddings, Transformers, and Large Language Models


Document information

Type
Lecture notes
Lecturer(s)
Dr. Philipp K. Masur
Contains
All lectures


Preview of the content

Computational Analysis of Digital Communication
Lecture 1 | Introduction
Readings lecture 1
Kramer et al. (2014)
Van Atteveldt & Peng (2018)




Increasing amount of data available online
What can we learn from this much data?
Timeline of natural language processing

Much of what we know about human behavior is based on what people tell us:
- In self-report measures in surveys
- In responses in experimental research
- In qualitative interviews
Note: although valuable, such measurements can be biased (Scharkow, 2013; Parry et al., 2021)

But a lot of (mass) communication looks like this: [figure not preserved]

Or is based on user-generated content:
- TikTok
- Instagram
- Etc.

What is computational social science, and why should we care?
Field of social science that uses algorithmic tools and large/unstructured data to understand human and social behavior
Complements rather than replaces traditional methodologies: methods are not the goal, but contribute to data generation
Includes methods such as:
- Data mining (e.g. scraping and gathering of large data sets)
- Software development for social science experiments
- Automated text analysis (e.g. sentiment analysis, keyword extraction, dictionary approaches)
- Image classification (e.g. face recognition, visual topic modeling)
- Machine learning approaches (e.g. for classification, prediction, topic modelling)
- Actor-based modelling (e.g. simulation of social behavior, spreading of information)

Why is this important now?
Vast amount of digitally available data, ranging from social media messages and other digital traces to web archives and newly digitized newspapers and other historical archives
Large-scale records (big data) of persons or businesses are created constantly
Powerful and comparatively cheap processing power and easy-to-use computing infrastructure for processing these data
Improved tools to analyze this data, including network analysis methods and automatic text analysis methods such as supervised text classification, topic modelling, word embeddings, as well as large language models

10 characteristics of big data
1. Big
The scale or volume of some current data sets is often impressive. However, big data sets are not an end in themselves, but they can enable certain kinds of research, including the study of rare events, the estimation of heterogeneity, and the detection of small differences
2. Always-on
Many big data systems are constantly collecting data and thus make it possible to study unexpected events and allow for real-time measurement
3. Nonreactive
Participants are generally not aware that their data are being captured, or they have become so accustomed to this data collection that it no longer changes their behavior
4. Incomplete
Most big data sources are incomplete, in the sense that they don’t have the information that you will want for your research. This is a common feature of data that were created for purposes other than research
5. Inaccessible
Data held by companies and governments are difficult for researchers to access
6. Nonrepresentative
Most big data are not representative of certain populations. Out-of-sample generalizations are hence difficult or impossible
7. Drifting
Many big data systems are changing constantly, thus making it difficult to study long-term trends
8. Algorithmically confounded
Behavior in big data systems is not natural; it is driven by the engineering goals of the systems
9. Dirty
Big data often includes a lot of noise (e.g. junk, spam, spurious data points)
10. Sensitive
Some of the information that companies and governments have is sensitive

Pros and cons of computational methods
Pros
- We can study actual behavior instead of simply self-reports
- We can study human beings in their social context instead of in an artificial lab setting
- We can increase our N (higher power)
- Potential to uncover patterns and insights that we couldn’t investigate before
Cons
- Techniques are often (rather) complicated
- Data is often proprietary (not shared openly)
- Samples are often biased
- Data often have insufficient metadata

Definition
“Computational communication science (CCS) is the label applied to the emerging subfield that investigates the use of computational algorithms to gather and analyse big data and often semi- or unstructured data sets to develop and test communication science theories”
– Van Atteveldt & Peng, 2018

Typical research areas
Studies involve:
- Large and complex data
- Consisting of digital traces and other “naturally occurring” data
- Requiring algorithmic solutions to analyse
- Allowing the study of human communication by applying and testing communication theory
Political communication
- Democratization and polarization
- Hate speech
Social media use
- Tracking of actual social media use
- Spreading of behavior, information, or emotions
Health communication
- Prevalence of health information online
(Online) journalism
- News coverage across decades
- Gender equality
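
One of the pros listed in these notes is that a larger N gives higher power, which ties in with the "Big" characteristic (detection of small differences). The sketch below illustrates that link with a normal-approximation power calculation for a two-group comparison. The function name `approx_power`, the effect size of 0.02, and the group sizes are assumptions chosen for illustration and are not taken from the lecture.

```python
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    d is a standardized mean difference (hypothetical input),
    n_per_group is the number of observations in each group.
    """
    z = NormalDist()  # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the true effect is d
    shift = d * (n_per_group / 2) ** 0.5
    # Probability of landing in either rejection region
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


# Illustrative values only: a very small effect at three sample sizes
for n in (100, 10_000, 350_000):
    print(n, round(approx_power(0.02, n), 3))
```

With a small effect, power stays close to the significance level for small groups and approaches 1 only when each group has hundreds of thousands of observations, which is the scale that big data sources can offer.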

Example 1: Analysing news coverage
- Analysis of the coverage of nuclear technology from 1945 to 2014 in the New York Times
- 51,528 stories
- Used LDA topic modelling to extract latent topics and analysed their occurrence over time

Example 2: Facebook data to predict personality
- 58,000 volunteers who provided their FB likes, detailed demographic profiles and the results of several psychometric tests
- One can predict a variety of personal characteristics and personality traits from simple FB likes

Example 3: Dutch Telegramsphere
- Full messaging history of 174 Dutch-language public Telegram chats/channels
- Used state-of-the-art web mining, neural topic modelling, and social network analysis techniques
- Findings raise concerns with respect to Telegram’s polarization and radicalization capacity
- Telegram users are active in and share content across different communities
- Over time, conspiracy-themed, far-right activist, and COVID-19-sceptical communities dominated

Example 4: Gender stereotypes in political news
- Gender differences in political news coverage to determine whether the media employ stereotypical traits in portrayals of 1,095 US politicians
- 5 million US news stories
- Studied the media’s attribution of gender-linked and political traits to US politicians
- All three masculine traits were more strongly associated with male politicians, but only the feminine physical traits were more strongly associated with female politicians

Example 5: Gender representation in TV
- Gender representations in over 10 years of daytime TV programming
- Used neural networks to automatically detect gender in shown faces
- Women on average remained underrepresented on TV
- This strong overall bias was mirrored across specific subsamples (news, sports, advertising)

The “Facebook mood manipulation” study | Kramer et al. 2014
- Massive online experiment (N ≈ 700k)
- Main RQ: is emotion contagious?
- Experimental groups: positive/negative/control
- Stimulus: hide (negative/positive/random) messages from FB timeline
- Measurement/dependent variables: sentiment of posts by user
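
The notes on the “Facebook mood manipulation” study name the sentiment of users' posts as the dependent variable but do not say how it was scored. One common option, and the topic of Lecture 2, is a dictionary approach: count how many words of a post appear in a positive or a negative word list. The sketch below assumes toy word lists (`POSITIVE`, `NEGATIVE`), made-up posts, and hypothetical group names; real studies rely on validated dictionaries.

```python
import re

# Hypothetical toy word lists, for illustration only
POSITIVE = {"happy", "great", "love", "good"}
NEGATIVE = {"sad", "awful", "hate", "bad"}


def sentiment_shares(post):
    """Return (share of positive words, share of negative words) in one post."""
    words = re.findall(r"[a-z']+", post.lower())
    if not words:
        return 0.0, 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos / len(words), neg / len(words)


def mean_shares(posts):
    """Average the per-post shares over all posts of one group."""
    if not posts:
        return 0.0, 0.0
    shares = [sentiment_shares(p) for p in posts]
    n = len(shares)
    return sum(s[0] for s in shares) / n, sum(s[1] for s in shares) / n


# Made-up posts and group labels, not data from the study
groups = {
    "control": ["What a good day", "I hate mondays"],
    "fewer_negative_shown": ["I love this great city", "Good coffee, good mood"],
}
for name, posts in groups.items():
    print(name, mean_shares(posts))
```

Comparing the mean shares between experimental groups would then serve as the outcome measure. Simple word counting ignores negation and context, which is a known limitation of dictionary approaches.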