tidymodels

Predict #TidyTuesday NYT bestsellers

Will a book be on the NYT bestseller list a long time, or a short time? We walk through how to use wordpiece tokenization for the author names, and how to deploy your model as a REST API.

Predicting viewership for #TidyTuesday Doctor Who episodes

Using a tidymodels workflow can make many modeling tasks more convenient, but sometimes you want more flexibility and control of how to handle your modeling objects. Learn how to handle resampled workflow results and extract the quantities you are interested in.

Create a custom metric with tidymodels and NYC Airbnb prices

Predict prices for Airbnb listings in NYC with a data set from a recent episode of SLICED, with a focus on two specific aspects of this model analysis: creating a custom metric to evaluate the model and combining both tabular and unstructured text data in one model.

Multinomial classification with tidymodels and #TidyTuesday volcano eruptions

Lately I’ve been publishing screencasts demonstrating how to use the tidymodels framework, from first steps in modeling to how to evaluate complex models. Today’s screencast demonstrates how to implement multiclass or multinomial classification using this week’s #TidyTuesday dataset on volcanoes. 🌋 Here is the code I used in the video, for those who prefer reading instead of or in addition to video. Explore the data: Our modeling goal is to predict the type of volcano from this week’s #TidyTuesday dataset based on other volcano characteristics like latitude, longitude, tectonic setting, etc.

Sentiment analysis with tidymodels and #TidyTuesday Animal Crossing reviews

A lot has been happening in the tidymodels ecosystem lately! There are many possible projects we on the tidymodels team could focus on next; we are interested in gathering community feedback to inform our priorities. If you are interested in sharing your opinion on next steps in tidymodels development, please take this short survey. Lately I’ve been publishing screencasts demonstrating how to use the tidymodels framework, from first steps in modeling to how to tune more complex models.

Modeling #TidyTuesday GDPR violations with tidymodels

This is an exciting week for us on the tidymodels team; we launched tidymodels.org, a new central location with resources and documentation for tidymodels packages. There is a TON to explore and learn there! 🚀 You can check out the official blog post for more details. Today, I’m publishing here on my blog another screencast demonstrating how to use tidymodels. This is a good video for folks getting started with tidymodels, using this week’s #TidyTuesday dataset on GDPR violations.

PCA and the #TidyTuesday best hip hop songs ever

Lately I’ve been publishing screencasts demonstrating how to use the tidymodels framework, from first steps in modeling to how to tune more complex models. Today, I’m exploring a different part of the tidymodels framework; I’m showing how to implement principal component analysis via recipes with this week’s #TidyTuesday dataset on the best hip hop songs of all time as determined by a BBC poll of music critics. Here is the code I used in the video, for those who prefer reading instead of or in addition to video.

Bootstrap resampling with #TidyTuesday beer production data

I’ve been publishing screencasts demonstrating how to use the tidymodels framework, from first steps in modeling to how to tune more complex models. Today, I’m using this week’s #TidyTuesday dataset on beer production to show how to use bootstrap resampling to estimate model parameters. Here is the code I used in the video, for those who prefer reading instead of or in addition to video.

Tuning random forest hyperparameters with #TidyTuesday trees data

I’ve been publishing screencasts demonstrating how to use the tidymodels framework, from first steps in modeling to how to tune more complex models. Today, I’m using a #TidyTuesday dataset from earlier this year on trees around San Francisco to show how to tune the hyperparameters of a random forest model and then use the final best model. Here is the code I used in the video, for those who prefer reading instead of or in addition to video.

LASSO regression using tidymodels and #TidyTuesday data for The Office

I’ve been publishing screencasts demonstrating how to use the tidymodels framework, from first steps in modeling to how to tune more complex models. Today, I’m using this week’s #TidyTuesday dataset on The Office to show how to build a lasso regression model and choose regularization parameters! Here is the code I used in the video, for those who prefer reading instead of or in addition to video.

Preprocessing and resampling using #TidyTuesday college data

I’ve been publishing screencasts demonstrating how to use the tidymodels framework, from first getting started to how to tune machine learning models. Today, I’m using this week’s #TidyTuesday dataset on college tuition and diversity at US colleges to show some data preprocessing steps and how to use resampling! Here is the code I used in the video, for those who prefer reading instead of or in addition to video.

Hyperparameter tuning and #TidyTuesday food consumption

Last week I published a screencast demonstrating how to use the tidymodels framework and specifically the recipes package. Today, I’m using this week’s #TidyTuesday dataset on food consumption around the world to show hyperparameter tuning! Here is the code I used in the video, for those who prefer reading instead of or in addition to video. Explore the data: Our modeling goal here is to predict which countries are Asian countries and which countries are not, based on their patterns of food consumption in the eleven categories from the #TidyTuesday dataset.

#TidyTuesday hotel bookings and recipes

Last week I published my first screencast showing how to use the tidymodels framework for machine learning and modeling in R. Today, I’m using this week’s #TidyTuesday dataset on hotel bookings to show how to use one of the tidymodels packages, recipes, with some simple models! Here is the code I used in the video, for those who prefer reading instead of or in addition to video.

#TidyTuesday and tidymodels

This week I started my new job as a software engineer at RStudio, working with Max Kuhn and other folks on tidymodels. I am really excited about tidymodels because my own experience as a practicing data scientist has shown me some of the areas for growth that still exist in open source software when it comes to modeling and machine learning. Almost nothing has had the kind of dramatic impact on my productivity that the tidyverse and other RStudio investments have had; I am enthusiastic about contributing to that kind of user-focused transformation for modeling and machine learning.