Text_Analytics_Week12_NEC_Solved
Using the attached files of around 3,200 tweets per person, show a histogram (frequency distribution) of the tweets of both Dave and Julia. Use `UTC` to create the timestamp. Remember that column headers are case-sensitive. Build a dataframe of word frequencies for each of Dave and Julia, and plot the two sets of frequencies against each other. Include a red dividing line: words near the line are used with similar frequency by both, while words farther from it are used much more by one person than the other. Create a stacked chart comparing the odds ratios of the top 15 words used by each tweeter. Remove Twitter handles from the list of words. Calculate the word usage ratios (usage vs. total) and display them on a log scale. Do any interesting differences stand out?
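The word-frequency and odds-ratio steps described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the solution contained in the document. The sample tweets, the function names, the tokenizing regexes, and the add-one smoothing in the odds ratio are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical sample tweets; the real assignment uses the attached files.
dave_tweets = ["@julia loving the new data release", "data data everywhere"]
julia_tweets = ["@dave knitting and data today", "knitting again"]

HANDLE = re.compile(r"@\w+")   # Twitter handles such as @julia
WORD = re.compile(r"[a-z']+")  # crude tokenizer (an assumption)


def word_counts(tweets):
    """Count words across tweets after stripping Twitter handles."""
    counts = Counter()
    for text in tweets:
        cleaned = HANDLE.sub(" ", text.lower())
        counts.update(WORD.findall(cleaned))
    return counts


def usage_ratios(counts):
    """Usage vs. total: each word's share of all words by one person."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def log_odds_ratios(counts_a, counts_b):
    """Log of (usage by A / usage by B), with add-one smoothing (assumed)."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    ratios = {}
    for word in set(counts_a) | set(counts_b):
        share_a = (counts_a[word] + 1) / (total_a + 1)
        share_b = (counts_b[word] + 1) / (total_b + 1)
        ratios[word] = math.log(share_a / share_b)
    return ratios


dave = word_counts(dave_tweets)
julia = word_counts(julia_tweets)
ratios = log_odds_ratios(dave, julia)

# The 15 words with the largest absolute log odds ratio.
top = sorted(ratios.items(), key=lambda kv: abs(kv[1]), reverse=True)[:15]
for word, value in top:
    print(f"{word:12s} {value:+.3f}")
```

Positive values mark words used relatively more by Dave and negative values mark words used more by Julia. Working on the log scale makes the two directions symmetric around zero.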
Linked book
- 2017
- 9781491981658
- Unknown
Written for
- Institution
- Text Analytics
- Course
- Text Analytics
Document information
- Uploaded on
- 18 April 2023
- Number of pages
- 13
- Written in
- 2022/2023
- Type
- Exam (worked solutions)
- Contains
- Questions and answers
Topics
- textanalyticsweek12necsolved