Modern Methods in Statistics, Econometrics, and Machine Learning

Spring 2023

This course provides a brief introduction to a variety of topics in modern statistics, econometrics, and machine learning. Exact topics are to be determined but may include neural networks, random forests, text analysis, network analysis, empirical processes, multiple testing, randomization inference, Neyman orthogonality, and shape restrictions. The guest lecturers are listed in the schedule below.


Topics in Modern Econometrics (ECON 31760) – Spring 2023

Instructor #1: Stephane Bonhomme

Instructor #2: Azeem M. Shaikh

Teaching Assistant: Ian Pitman

Class: Wed. 3:30p-6:20p, SHFE 203

Course Description: This course will introduce students to various topics in modern econometrics through a series of lectures from distinguished visitors. The speakers will strive to make the lectures as self-contained as possible, but it will be assumed that students have adequate mathematical maturity and a background equivalent to at least the first-year Ph.D. sequence in econometrics in the economics department.

Grading: Grading will be based exclusively on attendance.

Schedule: The speakers listed below have confirmed their participation in this course. When available, titles for the lectures have been provided.

March 22: Whitney Newey, MIT

The goal of this lecture is to explain how to deploy machine learning methods, such as Lasso, neural nets, and random forests, to infer economic and causal parameters that depend on regression functions. Plugging machine learners into moment functions that identify parameters of interest can lead to poor inference due to bias from regularization and/or model selection. This talk explains how to overcome this problem using debiased machine learning and discusses some of its properties. Empirical examples are given.

March 29: Daniel Wilhelm, University College London | “Inference for Ranks”

It is often desired to rank different populations according to the value of some feature of each population. For example, it may be desired to rank neighborhoods according to some measure of intergenerational mobility or countries according to some measure of academic achievement. These rankings are invariably computed using estimates rather than the true values of these features. As a result, there may be considerable uncertainty concerning the rank of each population. This lecture covers methods for inference involving ranks, e.g. inference on the rank of a population, inference on the set of k-best populations, and inference on regressions involving ranks.

April 5: Tim Armstrong, University of Southern California | “Topics in High (and Low) Dimensional Econometrics”

This talk will cover some of my research interests on optimal estimation and inference in various settings.

April 12: Elena Manresa, NYU

April 19: Joe Romano, Stanford University | “Randomization and Permutation Inference”

Randomization and permutation methods of inference have made a resurgence in the statistics and econometrics literature, as well as in other fields such as genetics. While most methods in econometrics rely on asymptotic approximations, randomization tests offer the potential for exact inference in finite samples. However, the tests can also be invalid in large samples unless they are applied appropriately. In this lecture, I will develop some of the fundamentals of randomization tests: basic construction, optimality, asymptotic behavior, and robustness. Applications, including one- and two-sample problems (possibly high-dimensional), correlation, regression, and time series, will be discussed. Multiple hypothesis testing may also be discussed, time permitting.
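
For illustration only (this sketch is not taken from the lecture materials): the basic construction of a two-sample permutation test can be written in a few lines of Python. The function name, the choice of test statistic (absolute difference in means), and the number of random permutations are all assumptions made for this example. The test is exact under the null hypothesis that both samples come from the same distribution; because the statistic is not studentized, it is one of the cases where validity can fail if only the means are assumed equal.

```python
import random
from statistics import mean


def permutation_test(x, y, num_permutations=2000, seed=0):
    """Monte Carlo two-sample permutation test (illustrative sketch).

    Null hypothesis: x and y are drawn from the same distribution.
    Test statistic (an assumption of this example): |mean(x) - mean(y)|.
    Returns a two-sided p-value.
    """
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)
    observed = abs(mean(x) - mean(y))

    exceed = 0
    for _ in range(num_permutations):
        # Under the null, group labels are exchangeable, so reshuffle them
        rng.shuffle(pooled)
        stat = abs(mean(pooled[:n_x]) - mean(pooled[n_x:]))
        if stat >= observed:
            exceed += 1

    # Count the observed assignment itself, so the p-value is never zero
    return (exceed + 1) / (num_permutations + 1)
```

Calling `permutation_test(x, y)` on two lists of numbers returns the share of reshuffled label assignments whose statistic is at least as large as the observed one.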

April 26: Stefan Wager, Stanford University

May 3: Karun Adusumilli, University of Pennsylvania | “Sequential learning - Old results and new developments”

Sequential experiments are experimental designs in which past data informs how data will be collected in the future. While the earliest work in this field goes back to the pioneering studies of Wald (1947) and Arrow et al. (1949), recent years have seen a number of new advances and developments, primarily motivated by applications in automated decision making in tech companies (examples include ad placement, dynamic pricing, and A/B testing). These designs are also growing in popularity in economics and biostatistics; in fact, the FDA now actively recommends their use. The aim of this lecture is to give an overview of this rapidly growing field, primarily through the lens of bandit experiments, which are a canonical example of sequential learning. We will study some commonly used algorithms and discuss their advantages and shortcomings. Special emphasis will also be placed on the Bayesian learning paradigm, along with the use of various decision theory criteria. Finally, the lecture will briefly discuss some asymptotic approaches to these experiments and assess how they can aid in the design of optimal algorithms.
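
For illustration only (the abstract does not say which algorithms the lecture covers): Thompson sampling is one widely used Bayesian bandit algorithm. The sketch below assumes Bernoulli rewards and independent uniform Beta(1, 1) priors on each arm; the function name, the arm means, and the horizon are hypothetical choices made for this example.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Thompson sampling for a Bernoulli bandit (illustrative sketch).

    true_means: success probability of each arm (unknown to the algorithm,
                used here only to simulate rewards).
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    num_arms = len(true_means)
    successes = [0] * num_arms
    failures = [0] * num_arms
    pulls = [0] * num_arms

    for _ in range(horizon):
        # Draw one value from each arm's Beta(1 + successes, 1 + failures) posterior
        draws = [
            rng.betavariate(1 + successes[a], 1 + failures[a])
            for a in range(num_arms)
        ]
        # Play the arm whose posterior draw is largest
        arm = max(range(num_arms), key=lambda a: draws[a])

        # Past data (the updated posterior) informs which arm is sampled next
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls
```

For example, `thompson_sampling([0.2, 0.8], horizon=2000)` simulates a two-armed experiment; as evidence accumulates, the algorithm allocates most pulls to the arm with the higher success probability.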

May 10: Xiaohong Chen, Yale University

May 17: Andres Santos, UCLA | “An Empirical Process View of the Bootstrap”

In this lecture, we will cover bootstrap consistency from the perspective of empirical process theory. We will pay special attention to the role of the Delta Method and continuous mapping theory in explaining when we should expect the bootstrap to be consistent, and when we should expect it to fail.
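
For illustration only (a classic textbook contrast, not necessarily one used in the lecture): the nonparametric bootstrap works well for a smooth statistic such as the sample mean but fails for the sample maximum. With n distinct observations, a bootstrap resample contains the observed maximum with probability 1 - (1 - 1/n)^n, roughly 0.63 for large n, so the bootstrap distribution of the maximum has a large point mass that a continuous limiting distribution cannot have. The function names below are hypothetical.

```python
import random
from statistics import mean, stdev


def bootstrap_replicates(sample, statistic, num_replicates=2000, seed=0):
    """Nonparametric bootstrap: recompute `statistic` on resamples drawn
    with replacement from `sample` (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(num_replicates)]


def bootstrap_se_of_mean(sample, num_replicates=2000, seed=0):
    """Bootstrap standard error of the sample mean (a case where the
    bootstrap is consistent)."""
    return stdev(bootstrap_replicates(sample, mean, num_replicates, seed))


def share_equal_to_sample_max(sample, num_replicates=2000, seed=0):
    """Share of bootstrap replicates of the maximum that coincide exactly
    with the observed maximum (a symptom of bootstrap failure)."""
    reps = bootstrap_replicates(sample, max, num_replicates, seed)
    sample_max = max(sample)
    return sum(r == sample_max for r in reps) / num_replicates
```

Applied to a sample of continuous data, `bootstrap_se_of_mean` should be close to the usual standard error (sample standard deviation divided by the square root of n), while `share_equal_to_sample_max` should be close to 0.63 rather than near zero.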