Apr 042013
 

Title

This is an R Markdown document. Markdown is a simple formatting syntax for authoring web pages (click the MD toolbar button for help on Markdown).

When you click the Knit HTML button a web page will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

summary(cars)
##      speed           dist    
##  Min.   : 4.0   Min.   :  2  
##  1st Qu.:12.0   1st Qu.: 26  
##  Median :15.0   Median : 36  
##  Mean   :15.4   Mean   : 43  
##  3rd Qu.:19.0   3rd Qu.: 56  
##  Max.   :25.0   Max.   :120

You can also embed plots, for example:

plot(cars)

plot of chunk unnamed-chunk-2

Sep 162012
 

I’m happy to present episode 10 of the R-Podcast! Season 1 of the R-Podcast concludes with part 2 of my series on data munging, in which I discuss issues surrounding importing data sets contained in HTML tables. I share how I used the XML and RCurl packages to validate and import data from hockey-reference.com for storage into a MySQL database. Our listener feedback segment contains another installment on the Pitfalls of R contributed by listener Frans. I want to thank everyone who has provided such positive feedback throughout the season, and I’m looking forward to providing some exciting new content for season 2. I hope you enjoy the episode and check out our new contact page if you would like to provide any feedback. Thanks for listening!

The following resources are mentioned in this episode:

Episode 10 Time Stamps

00:00 The R-Podcast #010 Adventures in Data Munging Part 2
00:33 Introduction
01:50 Wrapping up season 1 ... wait, what?
03:30 Rstudio team expands
05:41 R Community milestone
07:53 Discovering hockey-reference.com 
10:54 Tips for readHTMLtable
21:10 Checking for valid data first
29:23 Minor processing needed
35:18 Saving data to MySQL database
45:26 Listener Feedback: Andrew
54:58 Frans: Pitfalls of R segment 2
63:40 Wrapping up: subscribe to the podcast, theRcast@gmail.com, + 1-269-849-9780, Twitter @theRcast
69:14 End
Aug 052012
 

It’s great to be back with a new episode after an eventful break! This episode begins a series on my adventures in data munging, a.k.a data processing. I discuss three issues that demonstrate the flexibility and versatility R brings for recoding messy values, important inconsistent data files, and pinpointing problematic observations and variables. We also have an extended listener feedback segment with an audio installment of the “pitfalls” of R contributed by listener Frans. I hope you enjoy this episode and keep passing along your feedback to theRcast(at)gmail.com and stop by the forums as well!

The following resources are mentioned in this episode:

Episode 9 Time Stamps

00:00 The R-Podcast #009: Adventures in Data Munging Part 1
00:31 Introduction
01:38 Big news: +1
03:53 R 2.15.1 released
04:26 UseR! 2012
07:20 Hockey Summary Project
10:30 Dealing with empty files
15:18 Importing inconsistent data files
28:15 Recoding using car package
35:08 Useful functions for pinpointing issues
44:55 Listener Feedback
45:14 Daniel: Advice on data munging
55:01 Frans: Pitfalls of R
66:28 Wrapping up: subscribe to the podcast, theRcast@gmail.com, + 1-269-849-9780, Twitter @theRcast, Google Plus
71:22 End