# Combine Multiple Excel Worksheets

## [Combine Multiple Excel Worksheets Into a Single Pandas Dataframe](https://pbpython.com/pandas-excel-tabs.html)

![enter image description here](https://i0.wp.com/www.techoffside.com/wp-content/uploads/2019/03/photographing-spreadsheet-003.jpg?resize=750%2C375\&ssl=1)

### Introduction

One of the most commonly used pandas functions is `read_excel` . This short article shows how you can read in all the tabs in an Excel workbook and combine them into a single pandas dataframe using one command.

For those of you that want the TLDR, here is the command:

```python
df = pd.concat(pd.read_excel('2018_Sales_Total.xlsx', sheet_name=None), ignore_index=True)
```

Read on for an explanation of when to use this and how it works.

### Excel Worksheets

For the purposes of this example, we assume that the Excel workbook is structured like this:

![Excel Multiple Tabs](https://pbpython.com/images/excel_tabs.png)

The process I will describe works when:

* The data is not duplicated across tabs (sheet1 is one full month and the subsequent sheets have only a single month’s worth of data)
* The columns are all named the same
* You wish to read in all tabs and combine them

### Understanding read\_excel

The `read_excel` function is a [feature packed](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html) pandas function. For this specific case, we can use the `sheet_name` parameter to streamline the reading in of **all** the sheets in our Excel file.

Most of the time, you will read in a specific sheet from an Excel file:

```python
import pandas as pd

workbook_url = 'https://github.com/chris1610/pbpython/raw/master/data/2018_Sales_Total_Tabs.xlsx'
single_df = pd.read_excel(workbook_url, sheet_name='Sheet1')
```

If you carefully look at the documentation, you may notice that if you use `sheet_name=None` , you can read in all the sheets in the workbook at one time. Let’s try it:

```python
all_dfs = pd.read_excel(workbook_url, sheet_name=None)
```

Pandas will read in all the sheets and return a `collections.OrderedDict` object. For the purposes of the readability of this article, I’m defining the full url and passing it to `read_excel` . In practice, you may decide to make this one command.

Let’s inspect the resulting `all_dfs` :

```python
all_dfs.keys()
```

```python
odict_keys(['Sheet1', 'Sheet2', 'Sheet3', 'Sheet4', 'Sheet5', 'Sheet6'])
```

If you want to access a single sheet as a dataframe:

```python
all_dfs['Sheet1'].head()
```

|   | account number | name            | sku      | quantity | unit price | ext price | date                |
| - | -------------- | --------------- | -------- | -------- | ---------- | --------- | ------------------- |
| 0 | 412290         | Jerde-Hilpert   | S2-77896 | 43       | 76.66      | 3296.38   | 2018-03-04 23:10:28 |
| 1 | 383080         | Will LLC        | S1-93683 | 28       | 90.86      | 2544.08   | 2018-03-05 05:11:49 |
| 2 | 729833         | Koepp Ltd       | S1-30248 | 13       | 44.84      | 582.92    | 2018-03-05 17:33:52 |
| 3 | 424914         | White-Trantow   | S2-82423 | 38       | 50.93      | 1935.34   | 2018-03-05 21:40:10 |
| 4 | 672390         | Kuhn-Gusikowski | S1-50961 | 34       | 48.20      | 1638.80   | 2018-03-06 11:59:00 |

If we want to join all the individual dataframes into one single dataframe, use [pd.concat](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html):

```python
df = pd.concat(all_dfs, ignore_index=True)
```

In this case, we use `ignore_index` since the automatically generated indices of `Sheet1` , `Sheet2` , etc. are not meaningful.

If your data meets the structure outlined above, this one liner will return a single pandas dataframe that combines the data in each Excel worksheet:

```python
df = pd.concat(pd.read_excel(workbook_url, sheet_name=None), ignore_index=True)
```

### Summary

This trick can be useful in the right circumstances. It also illustrates how much power there is in a pandas command that “just” reads in an Excel file. The [full notebook](https://nbviewer.jupyter.org/github/chris1610/pbpython/blob/master/notebooks/Combine_Multiple_Excel_Sheets.ipynb) is available on [github](https://github.com/chris1610/pbpython/blob/master/notebooks/Combine_Multiple_Excel_Sheets.ipynb) if you would like to try it out for yourself.

Reference : <https://pbpython.com/pandas-excel-tabs.html>


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://yo-sarawut.gitbook.io/tutorials/articles/pandas/combine-multiple-excel-worksheets.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
