pandas Cookbook

The goal of this cookbook (by Julia Evans) is to give you some concrete examples for getting started with pandas. These are examples with real-world data, and all the bugs and weirdness that that entails.

Here are links to the v0.1 release. For an up-to-date table of contents, see the pandas-cookbook GitHub repository. To run the examples in this tutorial, you’ll need to clone the GitHub repository and get IPython Notebook running. See How to use this cookbook.

  • A quick tour of the IPython Notebook: Shows off IPython’s awesome tab completion and magic functions.
  • Chapter 1: Reading your data into pandas is pretty much the easiest thing. Even when the encoding is wrong!
  • Chapter 2: It’s not totally obvious how to select data from a pandas dataframe. Here we explain the basics (how to take slices and get columns).
  • Chapter 3: Here we get into serious slicing and dicing and learn how to filter dataframes in complicated ways, really fast.
  • Chapter 4: Groupby/aggregate is seriously my favorite thing about pandas and I use it all the time. You should probably read this.
  • Chapter 5: Here you get to find out if it’s cold in Montreal in the winter (spoiler: yes). Web scraping with pandas is fun! Here we combine dataframes.
  • Chapter 6: Strings with pandas are great. It has all these vectorized string operations and they’re the best. We will turn a bunch of strings containing “Snow” into vectors of numbers in a trice.
  • Chapter 7: Cleaning up messy data is never a joy, but with pandas it’s easier.
  • Chapter 8: Parsing Unix timestamps is confusing at first but it turns out to be really easy.
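
To give a flavor of what Chapters 4, 6, and 8 cover, here is a minimal sketch. It is not taken from the cookbook itself: the column names and sample values are invented for illustration, and the cookbook's own notebooks work with real data sets instead.

```python
import pandas as pd

# Hypothetical weather observations; every value here is made up.
weather = pd.DataFrame({
    "timestamp": [1356998400, 1357002000, 1357084800, 1357088400],
    "description": ["Snow", "Cloudy", "Snow Showers", "Clear"],
    "temp_c": [-3.0, -1.5, -7.0, -5.0],
})

# Chapter 6 idea: a vectorized string operation turns text into numbers
# (1.0 where the description contains "Snow", 0.0 otherwise).
weather["is_snowy"] = weather["description"].str.contains("Snow").astype(float)

# Chapter 8 idea: interpret the integers as Unix timestamps in seconds.
weather["time"] = pd.to_datetime(weather["timestamp"], unit="s")

# Chapter 4 idea: group by calendar day, then aggregate each group.
daily = weather.groupby(weather["time"].dt.date).agg(
    mean_temp=("temp_c", "mean"),
    snowy_fraction=("is_snowy", "mean"),
)
print(daily)
```

The `unit="s"` argument matters here: without it, `pd.to_datetime` treats plain integers as nanoseconds since the epoch, which is a common source of the confusion Chapter 8 mentions.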

Lessons for New pandas Users

For more resources, please visit the main repository.

  • 01 - Lesson: Importing libraries, creating data sets, creating data frames, reading from CSV, exporting to CSV, finding maximums, plotting data
  • 02 - Lesson: Reading from TXT, exporting to TXT, selecting top/bottom records, descriptive statistics, grouping/sorting data
  • 03 - Lesson: Creating functions, reading from EXCEL, exporting to EXCEL, outliers, lambda functions, slice and dice data
  • 04 - Lesson: Adding/deleting columns, index operations
  • 05 - Lesson: Stack/Unstack/Transpose functions
  • 06 - Lesson: GroupBy function
  • 07 - Lesson: Ways to calculate outliers
  • 08 - Lesson: Read from Microsoft SQL databases
  • 09 - Lesson: Export to CSV/EXCEL/TXT
  • 10 - Lesson: Converting between different kinds of formats
  • 11 - Lesson: Combining data from various sources
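
The first two lessons revolve around a small create, export, read back, and inspect workflow. The sketch below shows one way that workflow might look. The data is illustrative, and an in-memory buffer stands in for a CSV file on disk (an assumption made here so the example needs no file paths).

```python
import io

import pandas as pd

# Illustrative data set; the names and counts are assumptions for this sketch.
names = ["Bob", "Jessica", "Mary", "John", "Mel"]
births = [968, 155, 77, 578, 973]
df = pd.DataFrame({"Names": names, "Births": births})

# Export to CSV and read it back. A StringIO buffer replaces a real file.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)

# Finding the maximum: locate the row with the largest "Births" value.
top = restored.loc[restored["Births"].idxmax()]
print(top["Names"], top["Births"])

# Selecting top records: sort descending and keep the first two rows.
top_two = restored.sort_values("Births", ascending=False).head(2)
print(top_two)
```

Passing `index=False` to `to_csv` keeps the row index out of the file, so reading it back does not produce an extra unnamed column.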

Practical data analysis with Python

This guide is a comprehensive introduction to the data analysis process using the Python data ecosystem and an interesting open dataset. There are four sections covering selected topics as follows:

Excel charts with pandas, vincent and xlsxwriter