Machine Learning for the Social Sciences

QMSS S 5073: Machine Learning for the Social Sciences

MW 4:00pm-6:10pm

Join this summer course from ISERP and Columbia School of Professional Studies (SPS).

The course will run for the 6-week duration of the Columbia Summer Session D, from May 28th through July 5th, 2019.

QMSS S 5073: Machine Learning for the Social Sciences is open to the public but requires registration with SPS prior to course registration. For more information on SPS application and registration, please visit the SPS website.

Course Goals:

Social scientists need to engage fully with the machine learning approaches found in computer science, engineering, AI, tech, and industry. This course provides a comprehensive overview of machine learning as it is applied in a number of domains. Throughout, it compares and contrasts the machine learning approach with the more traditional regression-based approaches of the social sciences, and it emphasizes opportunities to synthesize the two.

The course starts with an introduction to Python, the scikit-learn package, and GitHub, followed by data exploration, visualization in matplotlib, preprocessing, feature engineering, variable imputation, and feature selection. Supervised learning methods come next: OLS models, linear models for classification, support vector machines, decision trees, random forests, and gradient boosting. The course then considers calibration, model evaluation, strategies for dealing with imbalanced datasets, non-negative matrix factorization, and outlier detection. This is followed by unsupervised techniques: PCA, discriminant analysis, manifold learning, clustering, mixture models, and cluster evaluation. Lastly, the course covers neural networks, convolutional neural networks for image classification, and recurrent neural networks.

Prerequisites are basic probability and statistics, basic linear algebra, and calculus. The course uses Python, so students who have programmed in at least one programming language will find it easier to keep up.


Instructor: Michael D. Parrott

Dr. Parrott is a Lecturer-in-Discipline in the Department of Political Science. He teaches GIS and Spatial Analysis, Theory and Methods, Data Analysis for the Social Sciences, and Data Visualization with QMSS. Prior to joining the QMSS faculty, he was a 2016-2017 American Political Science Association (APSA) Congressional Fellow. As an APSA fellow, he designed web applications to organize, centralize, and automate data collection and everyday tasks for committee and personal office staff. Before that, he was a senior research analyst, with a focus on GIS and spatial statistical analysis, for the Campaign Finance Institute, a nonpartisan nonprofit organization in Washington, DC. He holds a PhD in Political Science, with a focus on American politics and research methodology, from the University of Maryland; an MA in Political Science from Fordham University; and a BA in Philosophy, Psychology, and Political Science from the University of Texas. His research interests include American governing institutions (especially Congress), interest groups, money and politics, and quantitative methodology. His current work examines how the design of political institutions shapes who wins and who loses in the policymaking process.

