Excel’s GROUPBY function lets you group and aggregate data based on certain fields in your table of data. It also offers arguments that allow you to sort and filter your data, so you can tailor the output to meet your specific needs.

Although you canuse pivot tablesto achieve similar outcomes to the GROUPBY function, the GROUPBY function refreshes automatically if your data changes or is reordered, and it also lets you embed more functions for more refined data sorting.

An Excel table containing various cuisines and dishes, with numerical representations of their sales and customer ratings.

Argumentsatocare required:

Argumentsdtohare optional, and you can learn more about these in the final section of this article:

GROUPBY in Action: Using the Required Arguments Only

If you’re overwhelmed by the large number of arguments that the GROUPBY function has, it’s important to note at this stage that the GROUPBY function works perfectly well even if you only populate argumentsa,b, andc. So, first, I’ll show you how the GROUPBY function works with these three arguments alone.

Let’s imagine you own a chain of restaurants that serve up different dishes from different cuisines, and you’ve totted up the total sales and average customer rating for each cuisine-dish combination.

Two data extraction tables in Excel that will be populated using the GROUPBY function.

While these figures are useful, maybe you’re more interested in how the data compares across different categories. Specifically, you might want to find out the total cash each cuisine brings in, and the average customer rating for each type of dish.

Because the GROUPBY function returnsspilled arrays, you can’t use formatted Excel tables for their results.

The GROUPBY function being used in Excel to aggregate total sales by cuisine.

Let me take a moment to explain why GROUPBY would be my function of choice for executing these tasks for this specific dataset. If each cuisine and each dish only appeared once in the table, you would simplyuse the filter buttonsto reorder and analyze your data. However, since cuisines and dishes are repeated, using the GROUPBY function will let you pull data from within common categories together, giving you a clearer whole-picture idea of the sales and rating distributions.

To find out the sales totals for each cuisine, in cell F2, type:

Some figures in Excel are selected, and the Accounting button is highlighted.

Since you’re looking to group the data by cuisine, select the cells containing this variable, and add a comma. In this case, because the data is in a formatted Excel table called TabFood, a structured reference to the column name is added to my formula:

​​​​

The GROUPBY function being used in Excel to aggregate average rating by dish.

Then, since you’re looking to see the total sales for each of those cuisines, select the cells containing these figures, and add another comma:

The final mandatory argument is the function to be used on the aggregating data. In this case, since you want to find out thetotalsales for each cuisine, you need to insert the SUM function, and close the parentheses:

A range of cells in Excel containing decimalized numbers is selected, and the Increase and Decrease Decimal buttons are highlighted.

As well as using straightforward functions like SUM and AVERAGE in argumentc, you can alsouse Excel’s LAMBDA toolto create a function tailored to your needs.

When you press Enter, you’ll see that Excel has aggregated the total sales for each cuisine. Since you haven’t included any of the optional arguments in the GROUPBY function, the data is sorted in alphabetical order according to the values in column F by default, and there’s a total row at the bottom of your extracted data.

Unformatted data and their headers are added to a GROUPBY formula.

As the values in column G are financial, select the data and click the “Accounting” icon in the Number group of the Home tab on the ribbon.

Now, you want to find out the average customer rating for each type of dish, and the process for doing so is very similar.

The output of a GROUPBY formula that includes the field headers argument.

In cell I2, type:

Next, select the cells containing the category by which you want to group the data. In this case, it’s the different dishes. Remember to add a comma after each argument to move to the next.

Now, select the cells containing the data to be aggregated, and add another comma:

The total depth argument in the GROUPBY function in Excel is used to display subtotals in the output data, which are formatted in blue.

Finally, because your aim on this occasion is to find out theaveragecustomer rating for each dish type, the function argument needs to be AVERAGE.

After you press Enter, Excel will average the customer ratings for each dish type. Again, in the absence of any optional arguments, the data is sorted in alphabetical order according to the values in the left-hand column by default, and there’s a handy total row at the bottom.

The output of a GROUPBY formula that sorts the data by column 2.

As the values in column J are decimalized averages, tidy up the number of decimal places on display by clicking the “Increase Decimal” and “Decrease Decimal” buttons in the Number group of the Home tab.

If you’re happy with your GROUPBY result at this stage, you may stop reading here. However, continue reading to learn about GROUPBY’s optional arguments.

GROUPBY in Action: Using the Optional Arguments

Even though the GROUPBY function having five optional arguments alongside the three required arguments makes it seem more complicated, these additional options are actually only there to help you create an output that is more tailored to your needs. What’s more, you’re able to choose which optional arguments you want to use and skip over the ones you don’t.

Below, I’ll cover each of the optional arguments, so you can see how they will affect your data when you choose to include them.

Use a comma to jump from one argument to the next. For example, if you want to include the fourth and sixth arguments, but not the fifth, type[fourth argument],,[sixth argument]. The fifth argument would have been between where the two commas are positioned, but since there’s nothing in that space, Excel knows you’ve deliberately left this argument blank.

Field Headers

In the examples I used above, I typed the output column headers manually, since they’re not included in the result by default. However, if you want your output data to include the column headers as well as the data they contain, use the field headers argument.

Start by typing your GROUPBY formula, including the first three (required) arguments. In this case, let’s assume you want to group the cuisines by average customer ratings:

Notice how the header rows are included within the selections. Indeed, when selecting your data for the first two arguments, you should think ahead about whether you want your output data to duplicate the headings in your table.

Row fields and values arguments must be the same size. If you select headers in one, you must select headers in the other.

Finally, type a comma to move to the field headers argument and type:

Here’s the result when I type:

I can now format my duplicated column headings so that they’re clearly distinguishable from the data, just like in the original table.

The Benefit of Including Field Headers

The Drawback of Including Field Headers

If you change the headings in your original table, the output headings will adopt those changes.

You can’t change the output headings if you want to make them more specific than the original table headings.

Total Depth

The total depth argument lets you decide whether you want the result to display grand totals and, if you do, whether they should sit at the top or the bottom of your data. This argument also lets you choose whether to show subtotals.

For the total depth argument, type:

The options to display subtotals (2 and -2) only work if the row fields argument contains more than one column of data (in other words, subfields).

In this example, I’ve typed:

​​​​​​which uses commas to skip over the field categories argument, and tells Excel that I want subtotals to appear beneath each category and a grand total at the bottom of the data. I’ve then applied direct formatting to the subtotal rows to make the data easier to read.

Sort Order

The sort order field lets you tell Excel whether and how you want to sort the result. Using this argument really highlights why the GROUPBY function can be more useful than using a pivot table: as soon as you change any data in your original table, the whole output data reorders according to the sort order argument, whereas pivot tables require a manual refresh.

The number you enter for this argument represents the column in the result. For example, if you type1, this will sort the result by the first column in ascending or alphabetical order. On the other hand, typing-1will sort the result by the first column in descending or reverse alphabetical order.

which sorts the second column (sales) in descending order.

Filter Array

The filter array argument is less likely than the previous optional arguments to be used, though it can come to the rescue if your original data table contains rows that could interrupt your data.

In this example, the years in cells A2, A8, and A17 interrupt the GROUPBY function’s result.

I can use the filter array argument to tell Excel to ignore any cells in column A that contain numbers through the ISNUMBER function:

Field Relationship

Finally, the field relationship argument controls how data is grouped when the row fields argument references more than one column.

In this example, when the field relationship argument contains0(which is the default if the argument is omitted), GROUPBY returns a hierarchical result table, with each column individually represented with separate rows of data.

On the other hand, when the field relationship argument contains1, GROUPBY returns a result table that ignores hierarchy and sorts each column independently. In other words, categories aren’t nested, which is why you also can’t include subtotals in the result when you choose this field relationship option.

As well as using SUM and AVERAGE in the GROUPBY function argument, you can usethe PERENTOF function, which turns the data into percentages to show the proportion a subset makes up of a whole dataset.