03. Wrangling
Data wrangling skills are among the most important to hone.
Agenda
September 9 + 11, 2024
- Tidy data
- Data verbs
September 16, 2024
- Relational data (_join)
- pivoting
September 18, 2024
- mapping
Readings
Class notes: Data Wrangling
Wickham (2017) Data transformation and Tidy data in R for Data Science.
If you aren’t super comfortable with R yet, check out Workflow: basics in R for Data Science.
Reflection questions
- How and why is |> used? And how is it different from the layering symbol + used in ggplot()?
- What are the main data wrangling verbs?
- How do you distinguish the different _join functions? Are the _join keys formatted in the same way across the two datasets? Are the data recorded in the same way (e.g., is age birthday or age at recording)?
- What are some of the ways to distinguish a data verb from a typical function?
Ethics considerations
- What is Jan 31 plus one month? And why does it matter that every analysis we do is a series of decisions? Keep in mind that each of us might make a different decision, and all decisions have consequences.
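The Jan 31 question can be made concrete with a small sketch. It assumes the lubridate package and picks the year 2024 (a leap year) purely for illustration; the variable names are hypothetical. Each line encodes a different, defensible decision about what "plus one month" means.

```r
library(lubridate)

jan31 <- ymd("2024-01-31")

# Decision 1: strict calendar arithmetic. February 31 does not exist,
# so lubridate returns NA rather than guessing.
strict <- jan31 + months(1)

# Decision 2: roll back to the last valid day of the target month.
rolled <- jan31 %m+% months(1)

# Decision 3: treat "one month" as a fixed 30 days.
fixed <- jan31 + days(30)
```

Under these assumptions the three choices give a missing value, February 29, and March 1, respectively, so which one an analyst picks can change any result that depends on the date.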
Slides
In-class slides for both 9/9/24 and 9/11/24.
In-class slides for 9/16/24.
In-class slides for 9/18/24.
Additional Resources