rstats

LASSO regression using tidymodels and #TidyTuesday data for The Office

I’ve been publishing screencasts demonstrating how to use the tidymodels framework, from first steps in modeling to how to tune more complex models. Today, I’m using this week’s #TidyTuesday dataset on The Office to show how to build a lasso regression model and choose regularization parameters!

Preprocessing and resampling using #TidyTuesday college data

I’ve been publishing screencasts demonstrating how to use the tidymodels framework, from first getting started to how to tune machine learning models. Today, I’m using this week’s #TidyTuesday dataset on college tuition and diversity at US colleges to show some data preprocessing steps and how to use resampling!

Hyperparameter tuning and #TidyTuesday food consumption

Last week I published a screencast demonstrating how to use the tidymodels framework and specifically the recipes package. Today, I’m using this week’s #TidyTuesday dataset on food consumption around the world to show hyperparameter tuning!

#TidyTuesday hotel bookings and recipes

Last week I published my first screencast showing how to use the tidymodels framework for machine learning and modeling in R. Today, I’m using this week’s #TidyTuesday dataset on hotel bookings to show how to use one of the tidymodels packages recipes with some simple models!

#TidyTuesday and tidymodels

This week I started my new job as a software engineer at RStudio, working with Max Kuhn and other folks on tidymodels. I am really excited about tidymodels because my own experience as a practicing data scientist has shown me some of the areas for growth that still exist in open source software when it comes to modeling and machine learning.

Opioid prescribing habits in Texas

A paper I worked on was just published in a medical journal. This is quite an odd thing for me to be able to say, given my academic background and the career path I have had, but there you go!

Text Mining in R: A Tidy Approach

I spoke on approaching text mining tasks using tidy data principles at rstudio::conf yesterday. I was so happy to have the opportunity to speak and the conference has been a great experience.

A Beginner's Guide to Travis-CI for R

Have you seen all those attractive green badges on other people’s R packages and thought, “I want a lovely green badge!” Always a nice feeling when Travis manages to actually build.

Joy to the World, and also Anticipation, Disgust, Surprise...

In my previous blog post, I analyzed my Twitter archive and explored some aspects of my tweeting behavior. When do I tweet, how much do retweet people, do I use hashtags?

Ten Thousand Tweets

I started learning the statistical programming language R this past summer, and discovering Hadley Wickham’s data visualization package ggplot2 has been a joy and a revelation. When I think back to how I made all the plots for my astronomy dissertation in the early 2000s (COUGH SUPERMONGO COUGH), I feel a bit in awe of what ggplot2 can do and how easy and, might I even say, delightful it is to use.