CLRC offers its member constituents (including members of our five participating library systems and individual members) access to the entire lynda.com training library at no charge.

For more information on how to request an access code, please click here

This course shows how to review and derive information from datasets using Python. First, get an overview of data science and how Python’s open source libraries can be used for your data analysis needs. Then, discover how to set up labs and data interpreters. Next, learn how you can use pandas, NumPy, and SciPy for numerical processing, scientific programming, and extensive data exploration. With these options at your disposal, you’ll be ready for the following chapter, which focuses on making predictions using machine learning tools, data classifiers, and clusters. The course concludes with a look at big data and how PySpark can be used for computing.

Topics include:

  • Configuring your system
  • Setting up labs
  • Using pandas, NumPy, and SciPy
  • Building a classifier
  • Clustering data
  • Working with big data and PySpark
  • Using MLlib
  • Beginning with Spark

 

View the entire Learning Python for Data Science course and more in the lynda.com library.

CLRC’s subscription to lynda.com is a service of the Empire State Library Network and is supported by Regional Bibliographic Data Bases and Interlibrary Resources Sharing (RBDB) funds.