Free Course Materials: Harvard CS109 Data Science

In the Fall of 2013, Harvard University offered CS109 Data Science, an excellent introductory course for anyone who wants a jump start in this exciting field. Most of the class materials, including video lecture archives and slides, are freely available online. This is a fantastic way to get an Ivy League-quality education, albeit without university credit. The course is currently taught by two Harvard professors: Hanspeter Pfister (Computer Science) and Joe Blitzstein (Statistics).

This course introduces the following aspects of data science:

  • Data munging, cleaning, and sampling
  • Data management for accessing big data quickly and reliably
  • Exploratory data analysis to generate hypotheses and intuition
  • Prediction based on statistical methods such as regression and classification
  • Communication of results through visualization, stories, and summaries

The course uses Python for all programming assignments and projects. The IPython notebooks for CS109 are available at https://github.com/cs109/content