How to Do Basic Data Analysis in Excel

Most of the time when you run statistics, you want to use statistical software. These tools are built to do calculations like t-tests, chi-square tests, correlations, and so on. Excel isn’t meant for data analysis. But that doesn’t mean you can’t do it.

Unfortunately, Excel’s statistical functions aren’t always intuitive. And they usually give you esoteric results. So instead of using stats functions, we’re going to use the go-to Excel statistics add-in: the Data Analysis Toolpak.

The Toolpak, despite its rather unfortunate spelling, includes a wide range of useful statistics functionality. Let’s see what we can do with Excel statistics.

Adding the Excel Data Analysis Toolpak

While you can do stats without the Data Analysis Toolpak, it’s much easier with it. To install the Toolpak in Excel 2016, go to File > Options > Add-ins.

Click Go next to “Manage: Excel Add-ins.”

In the resulting window, check the box next to Analysis Toolpak and then click OK.

If you correctly added the Data Analysis Toolpak to Excel, you’ll see a Data Analysis button in the Data tab, grouped into the Analysis section.

If you want even more power, be sure to check out Excel’s other add-ins.

Descriptive Statistics in Excel

No matter what statistical test you’re running, you probably want to get Excel’s descriptive statistics first. This will give you information on means, medians, variance, standard deviation and error, kurtosis, skewness, and a variety of other figures.

Running descriptive statistics in Excel is easy. Click Data Analysis in the Data tab, select Descriptive Statistics, and click OK. To set the input range, click the arrow next to the input range field, click-and-drag to select your data, and hit Enter (or click the corresponding down arrow).

Next, tell Excel whether your data has labels, whether you want the output in a new sheet or on the same one, and whether you want summary statistics and other options. Then hit OK, and Excel will generate your descriptive statistics.
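
If you want to sanity-check the Toolpak’s figures outside Excel, the core numbers are easy to reproduce. Here is a minimal Python sketch using only the standard library. The `describe` helper and the `readings` data are made up for illustration, and the sketch assumes sample (n - 1) variance and standard deviation. It covers only a few of the figures, not kurtosis or skewness.

```python
import math
import statistics


def describe(values):
    """Return a few common summary figures for a list of numbers."""
    n = len(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1)
    return {
        "count": n,
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sample_variance": statistics.variance(values),
        "standard_deviation": stdev,
        "standard_error": stdev / math.sqrt(n),  # standard error of the mean
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
    }


# Hypothetical measurements, invented for this example only.
readings = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(readings)
for name, value in summary.items():
    print(f"{name}: {value}")
```

If a figure here disagrees with Excel’s output for the same data, the first thing to check is whether both sides are using the sample or the population formula.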

Student’s t-Test in Excel

The t-test is one of the most basic statistical tests, and it’s easy to compute in Excel with the Toolpak. Click the Data Analysis button and scroll down until you see the t-test options.

You have three choices:

  • t-Test: Paired Two Sample for Means should be used when your measurements or observations were paired. Use this when you took two measurements of the same subjects, such as measuring blood pressure before and after an intervention.
  • t-Test: Two-Sample Assuming Equal Variances should be used when your measurements are independent (which usually means they were done on two different subject groups). We’ll discuss the “equal variances” part in a moment.
  • t-Test: Two-Sample Assuming Unequal Variances is also for independent measurements, but is used when your variances are unequal.

To test whether the variances of your two samples are equal, you’ll need to run an F-test. Find F-Test Two-Sample for Variances in the Analysis Tools list, select it, and click OK.

Enter your two datasets in the input range boxes. Leave the alpha value at 0.05 unless you have reason to change it. If you don’t know what that means, just leave it as is. Finally, click OK.

Excel will give you the results in a new sheet (unless you selected Output Range and a cell in your current sheet).

The figure to look at here is the P-value, which Excel reports as a one-tailed value. If it’s less than 0.05, you have unequal variances, so you should use the unequal variances option when you run the t-test.
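
The statistic behind an F-test is simply the ratio of the two sample variances. The Python sketch below shows that ratio and its degrees of freedom. It assumes the common convention of putting the larger variance in the numerator, so F is never below 1. The `variance_ratio` helper and both data sets are hypothetical, and the sketch does not compute a P-value, since the standard library has no F distribution.

```python
import statistics


def variance_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) with the larger variance on top.

    Raises ZeroDivisionError if the smaller variance is exactly zero.
    """
    var_a = statistics.variance(sample_a)  # sample variance (n - 1)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical data, invented for this example only.
group_1 = [10, 12, 11, 13, 14]
group_2 = [8, 15, 6, 17, 9]
f_stat, df_num, df_den = variance_ratio(group_1, group_2)
print(f"F = {f_stat}, df = ({df_num}, {df_den})")
```

A ratio close to 1 suggests similar variances; the further it climbs above 1, the less plausible the equal variances assumption becomes.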

To run a t-test, select the appropriate test from the Analysis Tools window and select both sets of your data in the same manner as you did for the F-test. Leave the alpha value at 0.05, and hit OK.

The results include everything you need to report for a t-test: the means, degrees of freedom (df), t statistic, and the P-values for both one- and two-tailed tests. If the P-value is less than 0.05, the difference between the two sample means is statistically significant.
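
To see where the t statistic and degrees of freedom come from, here is a rough Python sketch of the two independent-sample variants: the pooled (equal variances) version and the Welch (unequal variances) version. The function names and data are hypothetical, and the sketch stops short of P-values because the standard library has no t distribution.

```python
import math
import statistics


def pooled_t(sample_a, sample_b):
    """t statistic and df for two independent samples, equal variances assumed."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_err, df


def welch_t(sample_a, sample_b):
    """t statistic and approximate df when the variances may differ."""
    n_a, n_b = len(sample_a), len(sample_b)
    se_a = statistics.variance(sample_a) / n_a  # squared standard error, group a
    se_b = statistics.variance(sample_b) / n_b  # squared standard error, group b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation; the result is usually not a whole number.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical independent groups, invented for this example only.
group_a = [10, 12, 11, 13, 14]
group_b = [14, 16, 15, 17, 18]
print(pooled_t(group_a, group_b))
print(welch_t(group_a, group_b))
```

When the two groups have the same size and the same variance, as in this made-up data, both versions give the same answer. They diverge as the variances or group sizes drift apart.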

If you’re not sure whether to use a one- or two-tailed t-test, check out this explainer from UCLA.

ANOVA in Excel

The Excel Data Analysis Toolpak offers three types of analysis of variance (ANOVA). Unfortunately, it doesn’t give you the ability to run the necessary follow-up tests like Tukey or Bonferroni. But you can see whether the means of several groups differ.

Here are the three ANOVA tests in Excel:

  • ANOVA: Single Factor analyzes variance with one dependent variable and one independent variable. It’s preferable to using multiple t-tests when you have more than two groups.
  • ANOVA: Two-Factor with Replication involves two independent variables, with more than one observation for each combination of their levels (that’s the “replication”).
  • ANOVA: Two-Factor without Replication involves two independent variables, but only a single observation for each combination.

We’ll be going over the single-factor analysis here. In our example, we’ll be looking at three sets of numbers, labeled “Intervention 1,” “Intervention 2,” and “Intervention 3.” To run an ANOVA, click Data Analysis, then select ANOVA: Single Factor.

Select the input range and make sure to tell Excel whether your groups are in columns or rows. I’ve also selected “Labels in first row” here so that the group names are displayed in the results.

After hitting OK, Excel produces a summary table for the groups, followed by the ANOVA table.

In our example, the P-value is less than 0.05, so we have a significant result. That means there’s a significant difference between at least two of the groups in the test. But because Excel doesn’t provide tests to determine which groups differ, the best you can do is look at the averages displayed in the summary. In our example, Intervention 3 looks like it’s probably the one that differs.

This isn’t statistically sound. But if you just want to see if there’s a difference, and see which group is probably causing it, it’ll work.
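
For a feel of what the single-factor ANOVA is computing, here is a short Python sketch of the F statistic: the variation between group means divided by the variation within groups. The group names match our example, but the numbers are invented for illustration, and `one_way_anova` is a hypothetical helper. It doesn’t compute the P-value.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict mapping group name to values."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(values) * (statistics.mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of each observation around its own group mean.
    ss_within = sum(
        (v - statistics.mean(values)) ** 2
        for values in groups.values()
        for v in values
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical data, invented for this example only.
interventions = {
    "Intervention 1": [1, 2, 3],
    "Intervention 2": [2, 3, 4],
    "Intervention 3": [7, 8, 9],
}
f_stat, df_between, df_within = one_way_anova(interventions)
print(f"F = {f_stat}, df = ({df_between}, {df_within})")
for name, values in interventions.items():
    print(name, statistics.mean(values))
```

Printing the group means alongside F mirrors the eyeball check described above: a large F says some group differs, and the means hint at which one.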

Two-factor ANOVA is more complicated. If you want to learn more about when to use the two-factor method, see this video from Sophia.org and the “without replication” and “with replication” examples from Real Statistics.

Correlation in Excel

Calculating correlation in Excel is much simpler than the t-test or an ANOVA. Use the Data Analysis button to open the Analysis Tools window and select Correlation.

Select your input range, identify your groups as columns or rows, and tell Excel whether you have labels. After that, hit OK.

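You won’t get any measures of significance, but you can see how each group is correlated with the others. Values range from -1 to 1. A value of 1 is a perfect positive correlation, meaning one group’s values rise in an exact straight-line relationship with the other’s (the values themselves don’t have to be the same). A value of -1 is a perfect negative correlation, and 0 means there’s no linear relationship. The closer the value is to 1 or -1, the stronger the correlation.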
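
The number in each cell of the correlation output is a Pearson correlation coefficient. As a minimal sketch of the calculation, here is a plain Python version. The `pearson_r` helper and the data are hypothetical, made up to show the two extremes.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    if len(xs) != len(ys):
        raise ValueError("both lists must have the same length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, invented for this example only.
base = [1, 2, 3, 4, 5]
scaled = [3, 5, 7, 9, 11]   # 2 * base + 1: different values, same straight line
reversed_order = [5, 4, 3, 2, 1]

r_scaled = pearson_r(base, scaled)
r_reversed = pearson_r(base, reversed_order)
print(r_scaled, r_reversed)
```

The `scaled` list shares no values with `base` beyond coincidence, yet the two are perfectly correlated, because correlation measures how closely the points follow a straight line, not whether the numbers match.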

Regression in Excel

Regression is one of the most commonly used statistical tests in industry, and Excel packs a surprising amount of power for this calculation. We’ll run a quick multiple regression in Excel here. If you’re not familiar with regression, check out HBR’s guide to using regression for business.

Let’s say our dependent variable is blood pressure, and our two independent variables are weight and salt intake. We want to see which is a better predictor of blood pressure (or if they’re both good).

Click Data Analysis and select Regression. You need to be careful when filling out the input range boxes this time. The Input Y Range box should contain your single dependent variable. The Input X Range box can include multiple independent variables. For a simple regression, don’t worry about the rest (though remember to tell Excel if you selected labels).

After hitting OK, you’ll get a big list of results. The figures to look at are the P-values for weight and salt intake.

In our example, the P-value for weight is greater than 0.05, so there’s no significant relationship there. The P-value for salt, however, is below 0.05, indicating that it’s a significant predictor of blood pressure.
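
If you’re curious how the regression coefficients themselves are found, here is a rough Python sketch of a least-squares fit with two predictors, solved through the normal equations. Everything in it is an assumption for illustration: the `solve` and `fit_linear` helpers are hypothetical, and the weight, salt, and blood pressure numbers are invented and noise-free, so the fit recovers the formula used to build them. It doesn’t compute P-values.

```python
def solve(matrix, vector):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back-substitution on the resulting upper-triangular system.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def fit_linear(x_rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + list(row) for row in x_rows]  # leading 1 gives the intercept
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * target for r, target in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical (weight, salt intake) rows, invented for this example only.
predictors = [(60, 4), (70, 6), (80, 5), (90, 9), (100, 7)]
# Made-up outcome built from a known formula, with no noise added.
blood_pressure = [100 + 0.2 * w + 3 * s for w, s in predictors]

intercept, b_weight, b_salt = fit_linear(predictors, blood_pressure)
print(intercept, b_weight, b_salt)
```

The normal equations are used here because they keep the sketch short. For real data, a dedicated statistics package is the safer choice, since it handles numerical stability and reports the P-values as well.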

If you’re planning on presenting your regression data, remember that you can add a regression line to a scatterplot in Excel. It’s a great visual aid for this analysis.

Excel Statistics: Surprisingly Capable

While Excel isn’t known for its statistical power, it actually packs some really useful functionality, especially once you enable the Data Analysis Toolpak statistics add-in. I hope you’ve learned how to use the Toolpak, and that you can now play around on your own to figure out how to use more of its functions.

With this now under your belt, take your Excel skills to the next level with our articles on using Excel’s Goal Seek feature for more data crunching, mastering IF statements in Excel, and adding dropdown lists as cells in Excel.

I’ve also linked to other sites that have good statistics tutorials where we had to skip over confusing concepts. Be sure to check out our guide to free statistics resources, too.

  1. Esmaeil AliZadeh
    March 1, 2018 at 4:48 am

    Thanks for the article. I think there is a little mistake (probably an oversight!) in doing F-test.
    When performing an F-test in excel, you HAVE to make sure that the variance of variable 1 is greater than the variance of variable 2 due to the fact that F-value = var(X1)/var(X2).
    In your case, you have to switch the columns!!

  2. Ecko Scentauri
    December 29, 2017 at 10:35 pm

    Excellent statistical topics, and article, in Excel usage. I seem to find nothing more interesting than reading about, and practicing, the vast potentials of Excel and various programming languages. There is also the fact of making one's self more marketable, for employment, by honing these technological skills. Once considered an Excel guru, I realized that what I did not know, or what I fell out of touch with, reduced my expertise (VBA). Now I treat every reading of Excel as if to be a new beginning. Along the way I have stumbled into some pretty neat clarifications and tricks that have triggered a broad array of ideas. The quest for what I do not know is everlasting. Thank you!