The R-Podcast

Giving practical advice on how to use R for powerful and innovative data analyses.

The R-Podcast Episode 12: Using Version Control with R

This is not an April Fool's joke ... The R-Podcast is back once again! In this episode, I discuss the concept of version control and how you can get started using the Git version control system (VCS) right now with your R projects. I also discuss a big batch of listener feedback, and highlight a couple of great visualization applications from the community using ggplot2. All of that and more on episode 12 of the R-Podcast!
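As a rough idea of what getting started with Git in an R project can look like, here is a minimal sketch. It is not taken from the episode itself: the directory name my-r-project, the script analysis.R, the name and email values, and the ignore entries are all placeholder assumptions to swap for your own.

```shell
# Hypothetical project directory; substitute the path to your own R project.
mkdir -p my-r-project
cd my-r-project

# Create a new, empty Git repository in this directory.
git init

# Identify yourself for this repository only (placeholder values).
git config user.name "Your Name"
git config user.email "you@example.com"

# Keep common R session artifacts out of version control.
printf '%s\n' '.Rhistory' '.RData' '.Rproj.user/' > .gitignore

# A placeholder analysis script so there is something to commit.
printf '%s\n' 'summary(cars)' > analysis.R

# Stage the files and record the first snapshot.
git add .gitignore analysis.R
git commit -m "Initial commit of analysis script"

# Review the recorded history.
git log --oneline
```

From here, each meaningful change to a script can be staged with git add and recorded with git commit, which gives you a history you can step back through when an analysis changes.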

Direct Download: [mp3 format] [ogg format]

Episode 12 Show Notes

The basics for version control and Git
Listener Feedback
R Community Roundup
Package pick
  • reports: An R package to assist in the workflow of writing academic articles and other reports (via TRinker's blog)
How to interact with the show
  • Submit your questions and comments via the R-Podcast contact page, or send an email to theRcast(at)gmail.com
  • Send in an audio comment via audio attachment to theRcast(at)gmail.com, or leave a voicemail on the R-Podcast voicemail hotline: +1-269-849-9780
  • Get show updates via our Twitter account: @theRcast
  • Follow us on our R-Podcast Google Plus page: gplus.to/thercast
  • Provide your favorite R community links at the R-Podcast subreddit: links.r-podcast.org/
Music Credits

The R-Podcast Episode 11: Reproducible Analysis Part 1 (Introduction)

Season 2 of the R-Podcast is up and running! This episode begins a multi-part series on reproducible analysis using R. In this episode I discuss the usage of Sweave and LaTeX for producing reproducible reports, an introduction to the capabilities of the knitr package (more episodes will be coming dedicated to this package), and my motivation for adopting reproducible analysis techniques and tools into my workflow. In our listener feedback segment I discuss a new means of providing feedback to the R-Podcast using our new subreddit page and introduce new segments highlighting interesting stories around the R community and useful packages. This promises to be an exciting season of the R-Podcast, and I hope you enjoy this episode!

The following resources are mentioned in this episode:

Direct Download: [mp3 format] [ogg format]

Episode 11 Time Stamps

00:00 The R-Podcast #011 Reproducible Analysis Part 1
00:40 Introduction
02:43 Reproducible Research: Introduction
08:18 Sweave overview
16:20 Knitr overview
20:20 The Duke University Research Saga
30:56 What version control can offer
38:34 Presenting results
42:18 Listener feedback
60:55 R community roundup
69:39 Package pick: plyr
72:04 Wrapping up: subscribe at www.r-podcast.org, theRcast@gmail.com, +1-269-849-9780, Twitter @theRcast, Google Plus, links.r-podcast.org
77:21 End

The R-Podcast Episode 10: Adventures in Data Munging Part 2

I'm happy to present episode 10 of the R-Podcast! Season 1 of the R-Podcast concludes with part 2 of my series on data munging, in which I discuss issues surrounding importing data sets contained in HTML tables. I share how I used the XML and RCurl packages to validate and import data from hockey-reference.com for storage into a MySQL database. Our listener feedback segment contains another installment on the Pitfalls of R contributed by listener Frans. I want to thank everyone who has provided such positive feedback throughout the season, and I'm looking forward to providing some exciting new content for season 2. I hope you enjoy the episode and check out our new contact page if you would like to provide any feedback. Thanks for listening!

The following resources are mentioned in this episode:

Direct Download: [mp3 format] [ogg format]

Episode 10 Time Stamps

00:00 The R-Podcast #010 Adventures in Data Munging Part 2
00:33 Introduction
01:50 Wrapping up season 1 ... wait, what?
03:30 RStudio team expands
05:41 R Community milestone
07:53 Discovering hockey-reference.com 
10:54 Tips for readHTMLTable
21:10 Checking for valid data first
29:23 Minor processing needed
35:18 Saving data to MySQL database
45:26 Listener Feedback: Andrew
54:58 Frans: Pitfalls of R segment 2
63:40 Wrapping up: subscribe to the podcast, theRcast@gmail.com, +1-269-849-9780, Twitter @theRcast
69:14 End

The R-Podcast Episode 9: Adventures in Data Munging Part 1

It’s great to be back with a new episode after an eventful break! This episode begins a series on my adventures in data munging, a.k.a. data processing. I discuss three issues that demonstrate the flexibility and versatility R brings for recoding messy values, importing inconsistent data files, and pinpointing problematic observations and variables. We also have an extended listener feedback segment with an audio installment of the “pitfalls” of R contributed by listener Frans. I hope you enjoy this episode and keep passing along your feedback to theRcast(at)gmail.com and stop by the forums as well!

The following resources are mentioned in this episode:

Direct Download: [mp3 format] [ogg format]

Episode 9 Time Stamps

00:00 The R-Podcast #009: Adventures in Data Munging Part 1
00:31 Introduction
01:38 Big news: +1
03:53 R 2.15.1 released
04:26 UseR! 2012
07:20 Hockey Summary Project
10:30 Dealing with empty files
15:18 Importing inconsistent data files
28:15 Recoding using car package
35:08 Useful functions for pinpointing issues
44:55 Listener Feedback
45:14 Daniel: Advice on data munging
55:01 Frans: Pitfalls of R
66:28 Wrapping up: subscribe to the podcast, theRcast@gmail.com, +1-269-849-9780, Twitter @theRcast, Google Plus
71:22 End

The R-Podcast Screencast 2: Visualization with ggplot2

Here is the second screencast episode of the R-Podcast, which accompanies episode 8: Visualization with ggplot2. In this screencast I demonstrate a real-time session of using ggplot2 to create boxplots for a visualization of hockey attendance in the NHL. The R code created in this screencast is available in our GitHub repository, and each of the online resources is linked below. I added some new tweaks to the recording of this screencast based on feedback from the first screencast episode. Please let me know what you think of this improved screencast! As always you can send your feedback via email or audio comment to theRcast(at)gmail.com, leave a voicemail on our voicemail hotline at +1-269-849-9780, or join our new forums and leave a comment for this episode! The following resources are mentioned in this episode:

The R-Podcast Screencast 2: Visualization with ggplot2 from Eric Nantz on Vimeo.