─ Session info ───────────────────────────────────────────────────────────────
setting value
version R version 4.2.0 (2022-04-22)
os macOS Monterey 12.2.1
system aarch64, darwin20
ui X11
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/Chicago
date 2022-04-28
pandoc 2.18 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
quarto 0.9.294 @ /usr/local/bin/quarto
─ Packages ───────────────────────────────────────────────────────────────────
package * version date (UTC) lib source
sessioninfo * 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
[1] /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/library
──────────────────────────────────────────────────────────────────────────────
Over the past month or so, the r4ds online learning community founded by Jesse Maegan has been developing projects intended to help connect mentors and learners. One of the first projects born out of this collaboration is #TidyTuesday, a weekly social data project focused on using tidyverse packages to clean, wrangle, tidy, and plot a new dataset every Tuesday.
If you are interested in joining the r4ds online learning community, check out Jesse Maegan’s post here!
Every Monday we will release a new dataset on our GitHub that has been tamed, but does not always adhere to “tidy” data principles. This dataset will come from an article with an interesting plot. Our goal is to have you take a look at the raw data, and generate either a copy of the original plot or a novel take on the data! You can obviously use whatever techniques you feel are appropriate, but the data will be organized in a way that tidyverse tools will work well!
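As a rough illustration of what “tamed, but not tidy” can look like, the sketch below reshapes a small wide table (one column per year) into a long, tidy one. The `tuition_wide` data frame and its values are invented for this example and are not the actual weekly dataset; `pivot_longer()` comes from tidyr (version 1.0.0 or later), and older code often uses `gather()` for the same job.

```r
library(tidyr)
library(tibble)

# Hypothetical "tamed but not tidy" data: one row per state, one column per year.
# The values are made up for illustration only.
tuition_wide <- tibble(
  state     = c("Alabama", "Alaska"),
  `2014-15` = c(9000, 6000),
  `2015-16` = c(9500, 6500)
)

# Tidy it: one row per state-year observation.
tuition_long <- tuition_wide %>%
  pivot_longer(
    cols      = -state,     # every column except `state` holds a year
    names_to  = "year",     # old column names go here
    values_to = "tuition"   # cell values go here
  )

tuition_long
```

Once the data is in this long form, each variable has its own column, which is the shape most ggplot2 and dplyr functions expect.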
Why such an emphasis on the tidyverse?
The tidyverse is an “opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.” The tidyverse is at the core of the R for Data Science text written by Garrett Grolemund and Hadley Wickham. This book aims to be beginner-friendly but also deep enough to empower R experts. The framework of both the book and the tidyverse package is seen above.
We focus on the tidyverse package as the r4ds online learning community was founded “with the goal of creating a supportive and responsive online space for learners and mentors to gather and work through the R for Data Science text”. Beyond that, the tidyverse is consistent, powerful, and typically more beginner friendly. It is a good framework to get started with, and is complementary to base R (or the 1000s of other R packages).
Guidelines for TidyTuesday
To participate in TidyTuesday, you need to do a few things:
- Create and save an image of a plot from R
- Save the code used to recreate your plot (include data tidy steps!)
- Submit the plot and code on Twitter
- Use the #TidyTuesday hashtag (you can also tag me @thomas_mock)
- Browse other submissions and like/comment on their work!
However, that might seem like a lot! So at minimum please submit your plot with the hashtag #TidyTuesday.
All data will be posted on the data sets page on Monday. It will include the link to the original article (for context) and to the data set.
If you want to work on GitHub (a useful data science skill) feel free to post your code on GitHub! This will allow others to see and use your code, whereas an image of the code means they would have to re-type everything! Additionally, hosting on GitHub gives you a Data Science Portfolio to talk about/show in interviews, and allows you to access your code across different computers easily!
You can also upload your code into Carbon, a website that generates a high-quality image of your code.
Lastly, if you create your plot with the tidyverse you can save high quality ggplot2 images!
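One common way to do this is `ggsave()` from ggplot2. The sketch below is only an illustration: the plot uses the `mpg` dataset that ships with ggplot2, and the file name, dimensions, and resolution are arbitrary choices rather than TidyTuesday requirements.

```r
library(ggplot2)

# Build a simple plot from a dataset bundled with ggplot2.
p <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  labs(title = "Engine size vs. highway mileage")

# Write to a temporary folder here so the example leaves no stray files;
# in practice you would point this at your project directory.
out_file <- file.path(tempdir(), "tidytuesday_plot.png")

# Width, height, and dpi are illustrative; 300 dpi is a common print-quality choice.
ggsave(out_file, plot = p, width = 8, height = 5, units = "in", dpi = 300)
```

Passing the plot explicitly via `plot = p` avoids relying on whichever plot happened to be drawn last.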
Rules for TidyTuesday
We welcome all newcomers, enthusiasts, and experts to participate, but be mindful of a few things:
- This is NOT about criticizing the original authors. They are people like you and me and they have feelings. Focus on the data, the charts and improving your own techniques.
- This is NOT about criticizing or tearing down your fellow #RStats practitioners! Be supportive and kind to each other! Like others’ posts and help promote the #RStats community!
- The data set comes from the source article or the source that the article credits. Be mindful that the data is what it is and Tidy Tuesday is designed to help you practice data visualization and basic data wrangling.
- Use the hashtag #TidyTuesday on Twitter if you create your own version and would like to share it.
- Include a picture of the visualization when you post to Twitter.
- Include a copy of the code used to create your visualization when you post to Twitter. Comment your code wherever possible to help yourself and others understand your process!
- Focus on improving your craft, even if you end up with something simple! Make something quick, but purposeful!
- Give credit to the original data source whenever possible.
This week’s submissions!
Everyone did such a great job! I’m posting all the ones that I can find through the hashtag. You can always tag me in your post to make sure you get noticed in the future.
If you have an apple and I have an apple and we exchange these apples then you and I will still each have one apple. But if you have an idea and I have an idea and we exchange these ideas, then each of us will have two ideas.
— George Bernard Shaw
#TidyTuesday - spreading ideas!
— Thomas Mock (@thomas_mock) April 11, 2018
Umair submitted the first TidyTuesday plot EVER!!
I plotted the costs for the last 5 years in the data pic.twitter.com/jNQwHI1mqu
— Umair Durrani (@umairdurrani87) April 2, 2018
Meenakshi learned about new tidyverse functions and made a beautiful rainbow plot!
#tidytuesday done on a monday. Learnt gather and fct_reorder. Loved it!
Made my plot 🌈🌈 Thanks Thomas!! 😃 pic.twitter.com/oqKV6HlBBt
— Meenakshi Srinivasan 🐠 (@srini_meen) April 3, 2018
Son had several takes on the data, all of which helped tell the story!
#TidyTuesday Prices always go up, but if you compare it to the annual average then interesting things happen. Something happened in Arizona, Ohio, Hawai.
code: https://t.co/xJ5kD185Os pic.twitter.com/i4BMDorq3c
— Son M (@SonGeo) April 3, 2018
Bren submitted clean code and two great takes on the data!
Just having a little bit of R fun this Tuesday. Found this #TidyTuesday and thought I could give my contribution. I gather() and summarise() all the Year variables though…makes a different result.
Thanks @thomas_mock for this good idea. #rstats pic.twitter.com/q54fI9LZRl
— Brenborbs (@brenborbon) April 3, 2018
John was one of our first submissions, and made a nice geo heatmap!
10 year tuition growth per state. Second map is with Hawaii filtered out as it was bit of an outlier and darkening the entire map. #TidyTuesday #R4DS #rstats pic.twitter.com/EPE07myFCi
— John Bray (@njbuzz19) April 3, 2018
The Part Time Analyst selected a range of states that showed nice differences!
my attempt at tidy tuesday. Difficult part was getting the cost from character to numeric #TidyTuesday #rstats pic.twitter.com/1fq70OSbhZ
— The Part Time Analyst (@parttimeanalyst) April 3, 2018
Paula made a clean difference from the mean barplot!
on #tidytuesday I got this code https://t.co/nabbUZxLm3 I'm wondering if the values are right in what I did similar to yours @brenborbon but different from the original plot 🤔 pic.twitter.com/49IYNxR2EJ
— Paula Andrea (@orchid00) April 3, 2018
Nivra submitted two takes on the data that both turned out great!
Here's my first go at #TidyTuesday, working with tuition data in the United States. Nothing is quite as smooth as #tidyverse and #rstats when doing data analysis. Let me know your thoughts! pic.twitter.com/DLoP31bOXK
— Dylan McDowell (@dylanjm_ds) April 5, 2018
Rohit created a shiny app and output a gif!
Finally built my first shiny app as a part of #TidyTuesday https://t.co/G6oJJ5t2HF #rstats #ggplot2 #R4DS pic.twitter.com/pExxXPBZWn
— Kumar Rohit Malhotra (@krohitm) April 5, 2018
Raul submitted a nice, facetted graph with a swapped axis!
Hopefully not too late for a #tidytuesday submission pic.twitter.com/fqtc2YLXkh
— Raul (@raviolli77) April 4, 2018
Wire Monkey submitted a gif of the US with hex states!
Used https://t.co/fCLxYKVDTw #TidyTuesday to learn hexmap via https://t.co/CegVkyaBfs and gganimate. #rstats #maps #adaylate #r4ds pic.twitter.com/5LNS1q2bOJ
— Alyssa Goldberg (@WireMonkey) April 4, 2018
Vinicius submitted a good-looking heatmap!
Not something I would do in practice, but I used the opportunity to experiment with R. #TidyTuesday #rstats #ggplot2 pic.twitter.com/rJ3kX1C2AR
— Vinícius Félix (@H0Vinicius) April 4, 2018
Jake submitted a super clean slopegraph!
A slopegraph for #TidyTuesday, code available at: https://t.co/qS64cQ3oDJ pic.twitter.com/j34DA2DzN6
— Jake Kaupp (@jakekaupp) April 3, 2018
Robert submitted another take on Jake’s slopegraph!
I loved that approach. I wanted to see what the growth rates looked like identifiable by state, here went…. th code is https://t.co/HGqXMVnhnF #TidyTuesday pic.twitter.com/pnCf8aJqP2
— Robert Walker (@PieRatio) April 4, 2018
Frank wrote a great blogpost on his “4 hour process”!
New blog post: My first #TidyTuesday challenge, completed within a 4-hour time limit from concept to communication. #rstats #r4ds https://t.co/o82e1eOloo
— Frank Farach (@FrankFarach) April 4, 2018
Sam made a really well organized facetted map!
My #TidyTuesday submission, a first pass with geo_facet() and trying to get my head around nested data frames. Not sure the absolute ranking is as important as showing where each state is relative to the others https://t.co/Hp3wd7R4VL pic.twitter.com/K4FFY7rg74
— Sam Clifford (@samclifford) April 4, 2018
Isabella submitted a really nice beeswarm plot over time!
I love me some beeswarm plots… #TidyTuesday #rstats pic.twitter.com/76T94PU3el
— Isabella Velásquez (@ivelasq3) April 3, 2018
Miguel used some data he was already working on, with nice walk-through code!
My intent of exploratory analysis based on categorical data (factor variables). Group counting by conditions, creating two new datasets with the results and joining them by a common variable to plot the whole information. Thoughts? #TidyTuesday #rstats pic.twitter.com/NqHyHPH3qd
— Miguel Cosenza (@MiguelCos) April 3, 2018
If you made a plot and I missed it, feel free to contact me on Twitter with a link to your tweet. As #TidyTuesday grows, there may be issues where Twitter doesn’t show ALL the plots to me, so it may be helpful to tag me directly in your post to guarantee I see it for sharing!
Here’s to next week! Good luck!
I’d also like to thank the #r4ds Mentorship Pilot team for their help in conceptualizing TidyTuesday: Terence, Rosa Castillo, Andrew Macfarland, Ariel Levy, Burcukaniskan, Corrado Lanera, Jake Kaupp, Jason Baik, Jesse Maegan, Radovan Kavicky, Raul, and Shan. There are some other cool projects coming out of this group, so stay posted as they roll out over the next few months.
Other Useful Links
The R4DS Online Learning Community
The R for Data Science textbook
Carbon lets you post beautiful code directly to Twitter!
We will use the fivethirtyeight package frequently for “tame data”
GitHub lets you host raw code for free!