Importing Data from Excel with pd.read_excel()

Alexander Hagmann
A free video tutorial from Alexander Hagmann
Data Scientist | Finance Professional | Entrepreneur
4.6 instructor rating • 6 courses • 27,751 students

Learn more from the full course

The Complete Pandas Bootcamp 2020: Data Science with Python

Pandas fully explained | 150+ Exercises | Must-have skills for Machine Learning & Finance | + Scikit-Learn and Seaborn

31:50:30 of on-demand video • Updated October 2020

  • Bring your Data Handling & Data Analysis skills to an outstanding level.
  • Learn and practice all relevant Pandas methods and workflows with Real-World Datasets
  • Learn Pandas based on NEW Version 1.0 (the days of versions 0.x are over)
  • Import, clean, and merge messy Data and prepare Data for Machine Learning
  • Master a complete Machine Learning Project A-Z with Pandas, Scikit-Learn, and Seaborn
  • Analyze, visualize, and understand your Data with Pandas, Matplotlib, and Seaborn
  • Practice and master your Pandas skills with Quizzes, 150+ Exercises, and Comprehensive Projects
  • Import Financial/Stock Data from Web Sources and analyze them with Pandas
  • Learn and master the most important Pandas workflows for Finance
  • Learn how to best transition from Versions 0.X to new Version 1.0
  • Learn the Basics of Pandas and Numpy Coding (Appendix)
  • Learn and master important Statistical Concepts with scipy
English [Auto] In this video you will learn how to import data from an excel file and it is very often the case that you have tabular data on an Excel file and you want to do more advanced operations with Python and pandas than first of all you have to import the data to pandas and convert the tabular data into a data frame. Then you make your advanced operations and after that you might want to export your data again to excel or CSP. And in this end the next video you will see how this works. So importing data from Excel actually works in pretty much the same way as with data stored and CSP files and instead of using the direct Panda's method read see a sphere. We have to apply the surprise read Excel method and actually there was a huge pen us update in January 219 from Russian point twenty three two point twenty four and read Excel method was one of the more heavily revised and updated methods. So this video shows you the brand new behavior of the Read Excel method and as a side note you need to update your Panda's version to 0 point twenty four. Otherwise the read Excel will not work in the same way as it is shown here in this video. All right so now let's start. And first of all as always we import pandas and then we will have a look here at the next set of fires. So you say it's the excel file so let's have a look here. So this is our say it's tabular data here and an excel file. So we have in total here our five sates man Mike Jim Steven Joe and Tom and we have in total four columns city countries say it's and bundles and now we want to import our data on two pandas and convert it into a panda state frame. So let's again have a look here at our Jupiter notebook. So here we are on our Jupiter notebook and that is the direct pen us method to read excel. And same message with the read CSP method you can pass here the file name. So the file name of our excel file is the states that Excel s. And by doing so we are importing our data on our excel file into a pen us data frame. And then we can save our pandas data frame and the variables say it so let's do this here and let's have a look here at say it. So here we can see on the left hand side that your BFA range index and the very first column here is actually our salesmen that actually you want to have as index of our data frame. And then we have here the four columns. And same as with the CSB method if you want to f a particular column as being the index of our data frame we can define this with the parameter index cards. So let's do this here A.A. read Excel method. We have to pass the index position of the column that we want to have as index of our data frame. So let's again have a look at our excel file. So zero based indexing also applies here and we want to fill our sets man as being the index. And here you say it's men are actually at index positions 0 so zero based. The next thing applies also here. So let's go back to our Jupiter notebook. So here we have to pass that the index position zero to the parameter index cards. So let's do this and let's have a look here. And now we have you on the left hand side. I will say it's men as the index of our sites that frame and we can also call you at the info method with uh some helpful information and also your does tries to adopt the data types that we have in our excel file. So here in the country and city columns we have actually strings. In the USA it's column we have in indigenous and in the bonus column we have float objects. All right so now let's move on and there's an additional parameter called header. So we also have header in the read CSB method. And here we can define which row of our tabular data should be the column header or the column labels. So let's again have a look at our excel file. And here we can see our desired column labels like column had us here and the very first row at an exposition 0. So city countries say it's upon us and therefore here in our case it should pass as the row. So that's the index position to the parameter header so actually the default argument for the header parameters zero. Anyway so here it's the same result as above and above. We didn't use them they had a perimeter but we can also change here the Herald had a pair of meter two flag sample one and here we use them the second row in our excel file as being the column labels or the column had us. So in this example he it's the second row here with Mike New York United States and so on and we actually fully dropped here the very first row with our actual column headers and last but not least we can also define that in our excel file. There's no appropriate row with the column header or column labels. And in this case we can pass here none to the head of parameter and by doing so panders creates here a Range Index so starting from 0 1 2 3 4 and actually the very first row where we actually had to use our column labels our column had us here actually in now the first row of our data frame. So in this case this doesn't make sense. So in our case we have a header here and it's located at the very first row at an exposition 0 0 next to this the additional parameter names and TV can change. So here we can define our desired column labels. So instead of having here a city we want to have a look city instead of half country low country and revenue instead of states and so on. And we also have to define the name for our index. So here in this example it should be named for example. So let's run the ELISA and we can see our changed column labels and also the name for our index. So here we have the names of our sites man Mike Tim Steven and so on. All right. So that's another very important parameter quality I use cause. And if we can define which columns you want to import from our excel file. So by default we are importing all columns that we have now excel file. But we could say we only want to have the columns with the Excel column had US aid to see us. So let's go back to our excel file and we can see our Excel column had us so ABC the e f g and so on. And let's assume that we only want to impart the first three columns so A B and C then we can do the c analogy with a notebook. So if you only want to impart the columns A B and C you can pass within quotation marks as a string aid to C to the used cars power meter. So let's do this here and we can see you we only have impart that the names and the city and country you and s we still have past the 0 2 the index called Pirate meter we have. Yeah I would say it's madness being the index of our data frame and actually the used cars parameters are quite versatile. So let's assume we only want to import uh the Excel columns C to E. So let's have a look again back. So the columns C to E uh these three columns here. So now we only want to impart C two years so as to this here and here we have um only the columns countries sales and panels and we can also perform more complicated things. So let's assume we want to have the Excel column A then we do not want to have B. And then we want to have C until e then we can do this here by choosing a. Then we have a comma separator and say in addition we want to have the column C until a year and also we have to pass this as a string. So let's have a look so here we have four columns. We have the statesman then countries sets and panels and actually we didn't import. The extra column B. So he has city. So he has city is missing. So let's go to our next example and we could also say that you want to import our columns from the very beginning until the C column. So this is no problem but uh. That's actually one thing that does not work. So we cannot say that we want to have our columns starting from C. and to the very last column so this gives an error message here. So now let's go on and instead of passing the exciting column had us you can also pass here as a list the index position of the columns that we want to impart. And also he has a robust indexing applies and we could say that we want to impart the columns at an exposition 0 3 and far so let's have a look at our excel file and the columns at an exposition of 0 3 and 4 are here 0. So that's I would say it's man then in exposition 1 2 and then 3 and 4. So here we should import the say it's men the names and the states and the bonus column. So let's try this out here and doesn't see exactly the case and the US version o point twenty three and earlier versions we were also able to pass an integer to the perimeter use calls. So for example 2. And by doing so we are importing all columns starting from the very first column to the column at index position to inclusive. So let's have a look here and actually it still works. So we import it. The column at index position 0 1 and 2 inclusive. But we have a future warning message that at some point in time in the future. This will not work anymore. So this has changed from pen as well as no point twenty three two point twenty four. And last but at least we could also pass a list with uh the column labels off our data frame for example city and states to the used cars per meter. And in this example this works perfectly but uh this is definitely not recommended. So my clear recommendation is that when you want to import data from Excel that you use actually here the Excel column label. So a b c d e and so on. All right. With this we are finished here with this video and I hope to see you also in the next video by.