Summer Courses

Summer 2019 Courses

Six-week summer 2019 courses offered by ISERP and Columbia School of Professional Studies (SPS).

The courses are open to the public as well as all current Columbia affiliates. Participants must register through SPS to enroll in these courses. Non-affiliates will also need to apply to SPS prior to registering. For more information on SPS application and registration, please visit their website and explore your options here.

NOTE: Successful enrollment in these classes does NOT confer admission to the Quantitative Methods in the Social Sciences MA Program. For more info on this program, see their website here.

QMSS S 5019 Data Analysis with Python

Gregory Eirich

Session D: M W 10:10 am - 12:00 pm

This course provides an introduction to regression and applied statistics for the social sciences, with a strong emphasis on using the Python programming language to perform the key tasks in the data analysis workflow. The chief goal is to help students generate and interpret quantitative data in helpful and provocative ways. The hope is that by trying to measure the social world, students will see their thinking become clearer and their understanding of concepts grow more complex. They will also become competent at reading statistical results in social science publications and in other media. Only basic mathematics skills are assumed, but some more advanced math will be introduced as needed. A critical goal of this course is to teach students how to manipulate and analyze data themselves using statistical software. We will focus almost exclusively on Python (although in a few cases we will run R through Python, because R handles certain tasks more readily than Python does). There will be Python write-up assignments nearly every week, tied to hands-on data analysis lab sessions. These weekly assignments will be devoted to using Python to practice commands and to develop a paper using the General Social Survey, the World Values Survey or another dataset of the student’s choosing. Click here to learn more about this course and the application.

 

QMSS S 5072 Modern Data Structures

Michael Parrott

Session D: M W 12:10 pm - 2:00 pm

Learn how to create your own R package, work with APIs, handle JSON data in R, and more! This course is intended to provide a detailed tour of how to access, clean, “munge” and organize data, both big and small. Each session will have simple, moderate and complex examples in class, with code to follow. Students will then practice additional exercises at home. The end point of each project is to get the data organized and cleaned enough that it is in a data frame, ready for subsequent analysis and graphing. Therefore, no analysis or visualization (beyond basic tables and plots to make sure everything was correctly organized) will be taught, which will free up substantial time for the “nitty-gritty” of data wrangling. Click here to learn more about this course and the application.

 

QMSS S 5073 Machine Learning for Social Science

Michael Parrott

Session D: M W 4:00 pm - 6:10 pm

Social scientists need to fully engage with the machine learning approaches found in computer science, engineering, AI, tech and industry. This course will provide a comprehensive overview of machine learning as it is applied in a number of domains. Every effort will be made to draw comparisons and contrasts between this machine learning approach and more traditional regression-based approaches in the social sciences. Emphasis will also be placed on opportunities to synthesize these two approaches. The course will start with an introduction to Python, the scikit-learn package, and GitHub. After that, there will be some discussion of data exploration, visualization in matplotlib, preprocessing, feature engineering, variable imputation and feature selection. Supervised learning methods will be considered, including OLS models, linear models for classification, support vector machines, decision trees and random forests, and gradient boosting. Calibration, model evaluation, strategies for dealing with imbalanced datasets, non-negative matrix factorization and outlier detection will be considered next. This will be followed by unsupervised techniques: PCA, discriminant analysis, manifold learning, clustering, mixture models and cluster evaluation. Lastly, we will consider neural networks, convolutional neural networks for image classification and recurrent neural networks. Prerequisites are basic probability and statistics, basic linear algebra and calculus. The course will use Python, so students who have programmed in at least one language will find it easier to keep up. Click here to find out more about this course and the application.

Newsletter

Don't want to miss our news and updates? Make sure to join our newsletter list.


Contact us

For general questions about ISERP programs, services, and events.