Watch every talk from the 2021 New York R Conference: Lander Analytics YouTube channel
Originating in 2015, the New York R Conference started off with five incredible years in person in New York City, followed by a more remote experience over the last two. No matter: the 2021 R Conference was as good as any, helped by our incredible lineup of world-class speakers and our supportive R community.
The purpose of the R conference is to gather, share and inspire ideas. And while we could not gather as a community, we could certainly share and, hopefully, inspire.
A special thanks to our amazing speakers:
Pictured above from left to right (Top): Megan Robertson, Jeroen Janssens, Danielle Oberdier, Jared P. Lander, Alexa Fredston (Middle): Jonathan Bratt, Caitlin Hudon, Mike Band, Asmae Toumi (Bottom): Adam Chekroud, Reenah Nahum Muldavski, Sarah Catanzaro, Rachael Tatman, Daniel Chen
Pictured above from left to right (Top): Max Kuhn, COL Krista Watts, Wes McKinney, David Robinson (Middle): Chrys Wu, Andrew Gelman, Sonia Ang (Bottom): Igor Skokan, Bernardo Lares, Mayari Montes de Oca, Jonathan Hersh
Here are just a few highlights from so many fantastic talks:
Andrew Gelman is wrong again!
Wrong Again! 30+ Years of Statistical Mistakes was the title of Professor Gelman’s talk. It did not disappoint. Gelman noted three lessons that will help you become a better data scientist:
1) Put your work in the open for others to review
2) Understand your mistakes
3) Fix your process to avoid repeating them
Perhaps his most memorable quote of them all:
"I want to inspire you to come to terms with your mistakes. You have it within yourself to become a better statistician or data scientist."
David Robinson introduces dbcooper, which turns any database into an R package
The headline speaks for itself. Every year, it seems like David comes to the R conference with the next game-changing package for professionals and R enthusiasts alike. This year, it was the introduction of the dbcooper package, which generates functions to more efficiently query data directly from a database in R.
And is it really a New York R Conference if David isn’t live-coding during his 20-minute talk? Of course he pulled it off, like always.
Dr. Rachael Tatman helps us all with natural language processing
NLP is all the rage in the world of data science. Dr. Rachael Tatman, a senior developer advocate at Rasa HQ, breaks down five common mistakes when dealing with language data, and how to recover from them.
Among the five common mistakes:
1. Transcribed speech != text
2. Not expecting variation
3. Doing too much text cleaning
4. Not using metadata
Find out how to recover from these common issues in Rachael’s talk slides here: 5 mistakes you'll probably make with language data (and how to recover)
Max Kuhn shows us how the pandemic “ruined his favorite dataset”
There may not be a single time series dataset that was not affected by the global shutdown that took place in March of last year.
How about Max's favorite time series dataset, from Chicago's Clark and Lake metro station, better known as the stop by the Chicago Bears' Soldier Field? How will Max (and the world) deal with the confounding effect of the pandemic on the seasonality of time series data?
Here is Max’s GitHub repository of his slides, code, and dataset from his talk: The Global Pandemic Ruined My Favorite Data Set
The virtual happy hour was also back!
Happy hour is the best hour with our friends Jeff and Kayla from Remy Cointreau. We mixed and shook up a few of their signature cocktails with Cointreau and The Botanist gin. Attendees learned all about the different liquors and saw, step by step, how to make a Bee’s Knees and a Cosmopolitan.
All proceeds from the A(R)T Auction went to the R Foundation, again!
Our famous A(R)T Auction took place again! We featured pieces by artists in the R Community, and all proceeds were donated to the R Foundation. The highest-selling piece at auction was the ggplot plate by Selina Carter. The second-highest-selling piece was Data and Goliath by Jacqueline Nolis. The third-highest-selling piece was Cross Street by DiKayo Data.
Live Podcast Recording of SuperDataScience
At the end of the second day of the conference, Jon Krohn recorded a special live session of the SuperDataScience Podcast in which he interviewed Drew Conway. They discussed many topics, including how Drew shepherded the New York Open Statistical Programming meetup through its early years and the genesis of the Data Science Venn Diagram, which has become a staple of data science conferences everywhere. Drew has long been a driving force of the community, so it was great to have him return and talk with Jon, whom we first met at the meetup!
The return of conference workshops
Back by popular demand, our series of workshops was held on September 1st, one week prior to the two-day conference. An inspiring group of instructors led the following 6 workshops:
Machine Learning in R with Max Kuhn
Geospatial Statistics and Mapping in R with Kaz Sakamoto
Git for Data Science with Daniel Chen
Exploring Data with the Tidyverse with David Robinson
Data Wrangling with Unix Power Tools with Jeroen Janssens
Pictured above from left to right (Top): Max Kuhn, Kaz Sakamoto (Middle): Jeroen Janssens, Daniel Chen, David Robinson (Bottom): Lucy D’Agostino McGowan, Malcolm Barrett
Recreating the In-Person Experience
We recreated as much of the in-person experience as possible with attendee networking sessions, the speaker walk-on songs and fun facts, prize giveaways, an art auction, happy hour and a live recording of the SuperDataScience podcast. In addition to all of this, we mailed conference information, hex stickers, and other swag to each attendee (in the U.S.).
Thank you to our sponsors
This conference couldn’t have happened without all our wonderful sponsors. We appreciate all of you! Thank you, Spring Health, RStudio, R Consortium, Visiting Nurse Service of New York, Pearson, Springer, Manning, Chapman & Hall/CRC, Cointreau and The Botanist Gin.
Shout out to the Lander Analytics Team!
Even though it was virtual, there was a lot of work that went into the conference, and I want to thank my amazing team at Lander Analytics along with our producer, Bill Prickett, for making it all come together.
A SPECIAL thanks to Jonathan Hersh for this one...
Jon debuted this in the middle of his talk. A loyal friend and true. SupeRman. I love it. Thanks, man.
Looking forward to a future in-person conference!
We were hoping the 2021 conference would be in-person. We had a venue lined up and everything. It was not an easy decision, but we felt it was the right thing to do to go virtual. This conference is all about the community, and when the timing is right, in-person events will return. Virtual or in-person, next year’s conference will take place June 8-10! See you there!
Government & Public Sector R Conference
The Government & Public Sector R Conference returns virtually this year December 8-10. There will be two days of talks and a day of workshops, all focused on data science for public service. This conference started as the DC R Conference and morphed into a focused conference for those working in government, NGOs and similar fields. We already have a great slate of speakers with more being added every week.
Jared P. Lander
Lander Analytics Chief Data Scientist