Python Data Analysis

Installing and exploring pandas

The minimal set of dependencies for pandas is as follows:

  • NumPy: This is the fundamental numerical array package that we installed and covered extensively in the preceding chapters
  • python-dateutil: This is a date-handling library
  • pytz: This handles time zone definitions
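A quick way to see whether these three packages are already present is to try importing them. The following is only a minimal sketch: the helper name installed_version is hypothetical, and it assumes the usual import names, with python-dateutil being imported as dateutil.

```python
import importlib


def installed_version(module_name):
    """Return the module's version string, 'unknown', or None if it is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None  # the package is not installed
    # Not every module exposes __version__, so fall back to a placeholder
    return getattr(module, "__version__", "unknown")


# Import names of the minimal dependencies (python-dateutil imports as dateutil)
report = {name: installed_version(name) for name in ("numpy", "dateutil", "pytz")}
for name, version in report.items():
    print(name, version if version is not None else "not installed")
```

Any package reported as not installed is normally pulled in automatically when pandas is installed with pip.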

This list is the bare minimum; pandas also has a longer list of optional dependencies. We can install pandas with pip, with a binary installer, through the system package manager, or from the source by checking out the code. The binary installers can be downloaded from http://pandas.pydata.org/getpandas.html.

The command to install pandas with pip is as follows:

$ pip install pandas

You may have to prepend the preceding command with sudo if your user account doesn't have sufficient rights. For most, if not all, Linux distributions, the pandas package name is python-pandas. Please refer to the manual pages of your package manager for the correct install command. These commands should be the same as the ones summarized in Chapter 1, Getting Started with Python Libraries.

To install from the source, we need to execute the following commands from the command line:

$ git clone https://github.com/pydata/pandas.git
$ cd pandas
$ python setup.py install

This procedure requires a correctly configured compiler and other build dependencies; therefore, it is recommended only if you really need the most up-to-date version of pandas.

Once we have installed pandas, we can explore it further by adding pandas-related lines to pkg_check.py, our documentation-scanning script from the previous chapter. The program prints the following output:

pandas version 0.13.1
pandas.compat DESCRIPTION compat Cross-compatible functions for Python 2 and 3. Key items to import for 2/3 compatible code: * iterators: range(), map(),
pandas.computation 
pandas.core
pandas.io 
pandas.rpy 
pandas.sandbox 
pandas.sparse 
pandas.stats 
pandas.tests 
pandas.tools 
pandas.tseries 
pandas.util 

Unfortunately, the documentation of the pandas subpackages lacks informative descriptions; however, the subpackage names are descriptive enough to get an idea of what they are about.
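The pkg_check.py script itself is not reproduced in this section. As a rough idea of what such a documentation scanner might look like, here is a minimal sketch that lists the direct subpackages of a package along with the beginning of each docstring. The function name subpackage_summaries and the width parameter are assumptions made for illustration, not the script's actual code, and the output format differs from the listing above.

```python
import importlib
import pkgutil


def subpackage_summaries(package_name, width=80):
    """Return (name, truncated docstring) pairs for each direct subpackage."""
    try:
        package = importlib.import_module(package_name)
    except ImportError:
        return []  # the package is not installed
    summaries = []
    # Plain modules have no __path__, so they yield no subpackages
    for info in pkgutil.iter_modules(getattr(package, "__path__", [])):
        if not info.ispkg:
            continue  # skip plain modules and keep subpackages only
        full_name = package_name + "." + info.name
        try:
            doc = importlib.import_module(full_name).__doc__ or ""
        except Exception:
            doc = ""  # some subpackages may need optional dependencies
        # Collapse whitespace and truncate, so each entry fits on one line
        summaries.append((full_name, " ".join(doc.split())[:width]))
    return summaries


if __name__ == "__main__":
    try:
        import pandas
        print("pandas version", pandas.__version__)
    except ImportError:
        print("pandas is not installed")
    for name, summary in subpackage_summaries("pandas"):
        print(name, summary)
```

The subpackages reported depend on the installed pandas version, so the list may not match the 0.13.1 output shown here. An empty summary simply means that the subpackage has no docstring or could not be imported.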