People are always curious about which programming languages they should learn — which are the most valuable? Which will get them a job? Which are easiest and hardest?
One language that’s been showing up a lot more on the “which programming languages to learn” list is R, a language focused on statistical computing (in fact, it’s #6 on IEEE Spectrum’s 2015 list of top programming languages).
Why is R getting more popular, what can it do, and where can you learn how to use it?
What is R? Who Uses It?
R is an open-source implementation of the S programming language. Unlike S, R has gained a huge amount of popularity, largely because it’s a free alternative to very powerful statistical computing software like SAS, SPSS, and Matlab, all of which are high-priced. While R can be used for a variety of things, it’s best used for data analysis.
One of the reasons that it’s so powerful is that people can create and distribute “packages” that add to the base functionality of the language. A quick look at some of the most recently posted packages reveals one for directional statistics, another for multilevel joint modeling imputation, and (in a break from most uses of the language) one for building an “attractive résumé” using a database, LaTeX, and R.
Some of the world’s biggest companies use R.
According to Revolution Analytics, Google uses it to calculate return on investment (ROI) of advertising campaigns and predict economic activity. Microsoft uses it for matchmaking on the Xbox network. The National Weather Service generates graphics with it. oDesk uses the language to analyze results from experiments. Twitter includes R as part of its Data Science toolbox.
The possibilities for R are almost limitless — and as big data becomes a more important field, the ability to efficiently analyze it is going to increase in importance as well. R is great for data analysis, and its open-source, collaborative nature makes it one of the best tools out there. If you’re interested in becoming a data scientist, you’d do well to learn it.
Of course, because R’s interface is much more bare-bones than those of apps like SAS, SPSS, and Matlab, you’re going to need to put in a lot of work to become an expert. R has a rather steep learning curve if you’re looking to move beyond the basics, so you’ll need some high-quality learning resources if you’re going to start your journey off on the right foot.
Let’s look at some of the best.
Code School’s brief introduction, Try R, is a fantastic way to learn the basics. It’s presented in an interactive format, making it more interesting and effective than some other learning methods. You’ll learn about vectors, matrices, factors, basic stats, data frames, and how to extend R using outside libraries.
Best of all, the whole course is free. For a total beginner, this is tough to beat.
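To give a flavor of the topics that course lists, here’s a minimal sketch in base R. The variable names and sample values are made up for illustration and aren’t taken from the course itself.

```r
# A numeric vector and a basic statistic
scores <- c(82, 91, 77, 70)
mean(scores)

# A 2 x 2 matrix, filled column by column
m <- matrix(1:4, nrow = 2)

# A factor stores categorical data with a fixed set of levels
group <- factor(c("a", "b", "a", "b"))
levels(group)

# A data frame combines columns of different types
results <- data.frame(group = group, score = scores)

# Mean score per group
group_means <- tapply(results$score, results$group, mean)
group_means

# Extending R with an outside library: install once, then load it.
# (Commented out here because it assumes a CRAN mirror is reachable.)
# install.packages("ggplot2")
# library(ggplot2)
```

Everything above runs in a plain R console; only the commented-out lines at the end need an internet connection.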
This course has three parts: the basics of R, exploring statistical concepts through programming, and a section in which researchers explain how they’ve used R and statistics to solve real-life scientific issues.
This course is focused on using R in the health sciences, but will be valuable for a range of people, from those who are familiar with statistics to those who are totally new to the field.
In a series of two-minute videos, you’ll go from the basics, like “What Is R?” to more advanced topics, including creating loops and running SQL commands in R to interact with databases. At the end, you’ll even learn how to make awesome coffee by timing your French press pourover with R.
If you’re looking for something a little different than the textbook-style learning of other resources, give this one a shot.
Kaggle is a website that hosts data analysis competitions that can win you a lot of money . . . but it’ll also help you get started with this introduction to machine learning with R. This is a quick, intermediate-level introduction to the relevant concepts, and it’s great if you’re interested in data analysis (and not just statistics) with R.
The primary things you’ll learn are DataCamp’s interface, decision trees, and random forests, which are great data modeling tools.
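If you’d like a feel for what a decision tree looks like in R before you start, here’s a minimal sketch. It isn’t taken from the tutorial: it assumes the rpart package (a recommended package that ships with standard R distributions) and uses the built-in iris data set purely as a stand-in.

```r
# Assumption: rpart is installed (it is bundled with standard R distributions)
library(rpart)

# Fit a classification tree that predicts species from the four measurements
fit <- rpart(Species ~ ., data = iris, method = "class")

# Predict a class for every row and compare with the true labels
pred <- predict(fit, newdata = iris, type = "class")
accuracy <- mean(pred == iris$Species)

print(fit)  # shows the split rules the tree learned
accuracy    # share of rows classified correctly
```

Keep in mind that this accuracy is measured on the same data the tree was trained on, so it’s optimistic. A random forest builds many such trees on resampled data and combines their votes; the randomForest package on CRAN is one implementation.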
On the official R website, there’s a collection of manuals that cover a variety of topics, from the basics of R to instructions on how to write your own extensions. While you could read “Introduction to R” from cover to cover, it’s probably best used as a reference manual for when you run into problems and you need to find specific information about the language itself. The other documents on the list probably won’t come in handy until you’re an R expert, but this is a great page to have bookmarked nonetheless.
Econometrics in R (PDF download), another free document available from the website, is a popular resource for learning the language. It’s a bit dense, but it contains just about everything you’ll need to know to get started.
RStudio is an integrated development environment (IDE) for R — and although you don’t need to use it to become an R expert, you might find it very helpful. The RStudio website has a number of tutorials available, as well as links to other useful pages. There are book recommendations; an introduction to Shiny, a cool way to display your data results online; and information on R Markdown, another useful tool for sharing data.
There’s a mix of free and paid resources here, but if you spend a bit of time browsing around, you’ll find some really great things you can get without paying.
A Few More Worthy Resources on R
With R’s popularity climbing, you can find a few more sites for getting to know the language.
As with any other programming language, the best way to learn is to find a problem that you’d like to solve and start designing a solution. With some determination and these resources available, you’ll be using R to analyze data sets in no time.
Are you working with R? What are your favorite R resources? Share them below so we can all learn from them!