Research

Publications

 

Shape-Enforcing Operators for Point and Interval Estimators (arXiv). Xi Chen, Victor Chernozhukov, Iván Fernández-Val, Scott Kostyshak, and Ye Luo (2021), Journal of Machine Learning Research 22 (220): 1-42.

Abstract: A common problem in econometrics, statistics, and machine learning is to estimate and make inference on functions that satisfy shape restrictions. For example, distribution functions are nondecreasing and range between zero and one, height growth charts are nondecreasing in age, and production functions are nondecreasing and quasi-concave in input quantities. We propose a method to enforce these restrictions ex post on generic unconstrained point and interval estimates of the target function by applying functional operators. The interval estimates could be either frequentist confidence bands or Bayesian credible regions. If an operator has reshaping, invariance, order-preserving, and distance-reducing properties, the shape-enforced point estimates are closer to the target function than the original point estimates and the shape-enforced interval estimates have greater coverage and shorter length than the original interval estimates. We show that these properties hold for six different operators that cover commonly used shape restrictions in practice: range, convexity, monotonicity, monotone convexity, quasi-convexity, and monotone quasi-convexity, with the latter two restrictions being of paramount importance. The main attractive property of the post-processing approach is that it works in conjunction with any generic initial point or interval estimate, obtained using any of parametric, semiparametric or nonparametric learning methods, including recent methods that are able to exploit either smoothness, sparsity, or other forms of structured parsimony of target functions. The post-processed point and interval estimates automatically inherit and provably improve these properties in finite samples, while also enforcing qualitative shape restrictions brought by scientific reasoning. We illustrate the results with two empirical applications to the estimation of a height growth chart for infants in India and a production function for chemical firms in China.
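As a rough illustration of the post-processing idea, the sketch below applies two simple operators to an estimate on a grid: clipping to enforce a range restriction and sorting (monotone rearrangement) to enforce monotonicity. Sorting is one well-known monotonicity operator; treating it as the operator used in the paper is an assumption, and the function names and toy numbers are hypothetical.

```python
def enforce_range(values, low=0.0, high=1.0):
    """Range operator: clip every value into [low, high]."""
    return [min(max(v, low), high) for v in values]

def enforce_monotone(values):
    """Monotone rearrangement on an ordered grid: sort the values."""
    return sorted(values)

def reshape(values):
    """Compose the two operators: nondecreasing and within [0, 1]."""
    return enforce_range(enforce_monotone(values))

def sup_distance(a, b):
    """Largest pointwise gap between two functions on the same grid."""
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical toy data: a nondecreasing target in [0, 1] and a noisy,
# unconstrained estimate of it with a fixed-width band around the estimate.
target = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
estimate = [-0.05, 0.30, 0.25, 0.70, 0.65, 1.10]
lower = [e - 0.2 for e in estimate]
upper = [e + 0.2 for e in estimate]

# Apply the same operator to the point estimate and to each band endpoint.
new_estimate = reshape(estimate)
new_lower, new_upper = reshape(lower), reshape(upper)

print(sup_distance(estimate, target), sup_distance(new_estimate, target))
print(sum(u - l for l, u in zip(lower, upper)),
      sum(u - l for l, u in zip(new_lower, new_upper)))
```

In this toy case the reshaped estimate is no farther from the target than the original, and the reshaped band still contains the target while being shorter in total width, which is the kind of behavior the abstract describes.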
 

The Three Arab Worlds on the Eve of the Arab Spring. James E. Rauch & Scott Kostyshak (2014), Handbook on Islam and Economic Life, edited by M. Kabir Hassan and Mervyn K. Lewis. Edward Elgar Publishing.

Abstract: In this chapter we measure socioeconomic progress in the Arab countries in the four decades preceding the Arab Spring by the extent to which various standard indicators did or did not converge to those of the rest of the world. We argue that measurement of this convergence has been hampered by the common practice of lumping together all Arab countries, which are too diverse to form a meaningful aggregate. Our analysis accordingly is based around a disaggregation of the Arab world into three groups of countries, which are then compared with their non-Arab counterparts.
 

The Three Arab Worlds. James E. Rauch & Scott Kostyshak (2009), Journal of Economic Perspectives 23 (3): 165-188.

Abstract: Given the attention currently focused on the Arab world in part as a result of adjustments in U.S. foreign policy, a fresh look at Arab socioeconomic performance is in order. The Arab world is defined by language rather than ethnicity. The League of Arab States, formed in 1945, consists of all countries in which (a dialect of) Arabic is the spoken language of the majority. It is useful to compare the human development diversity of the Arab world to that of Latin America, another vast geographic area defined by language and culture. Our strategy in this article is therefore to disaggregate the Arab world into Arab sub-Saharan Africa, Arab fuel-endowed economies, and a remainder we call the Arab Mediterranean, and to compare these three Arab worlds to non-Arab sub-Saharan Africa, non-Arab fuel-endowed economies, and the rest of the non-Arab world. We will evaluate Arab socioeconomic progress from 1970 to as close to the present as the data allow.
 
 

Working Papers

 

Title: Choosing and Using Information in Evaluation Decisions, with Katherine B. Coffman and Perihan Saygin.

Abstract: Most studies of gender discrimination consider how male versus female candidates are assessed given otherwise identical information about them. But, in many settings of interest, evaluators have a choice about how much information to acquire about a candidate before making a final assessment. We use a large controlled experiment to explore how this type of endogenous information acquisition amplifies discriminatory outcomes in a simulated hiring environment. Across evaluators, we vary the composition of candidate pools, exploring not only environments where men outperform women on average but also environments with no gender difference or with a female advantage. Perhaps surprisingly, we observe no gender discrimination overall: conditional on their likelihood of being qualified, male and female candidates receive indistinguishable evaluations. But, we observe important differences across candidate pools. Candidates belonging to an advantaged group – the gender with the performance advantage in the pool – receive significantly better evaluations than equally qualified candidates in pools with no gender gap in performance. Similarly, candidates belonging to a disadvantaged group – the gender with a performance disadvantage in the pool – receive significantly worse evaluations relative to equally qualified candidates in pools with no gender gap in performance. This “relative advantage” bias appears in initial assessments, influences how evaluators update their beliefs about a candidate after acquiring more information, and persists in final evaluations. This bias has a significantly larger impact on evaluations when evaluators endogenously acquire information compared to treatments where we exogenously provide it, in part because we observe significant under-acquisition of information. We show that the relative advantage bias leads to two important types of mistakes: evaluators miss out on talented candidates from disadvantaged groups and over-select less talented candidates from advantaged groups.
 

Title: Flatness-Robust Critical Bandwidth

Abstract: Critical bandwidth (CB) can be used to test the multimodality of densities and regression functions, as well as for clustering methods. This paper proposes a solution to the well-known problem that CB tests are generally inconsistent if the function of interest is constant (“flat”) over an interval. The solution, flatness-robust CB (FRCB), exploits the fact that the problem manifests only from regions consistent with the null hypothesis, and thus identifying and excluding them does not alter the null or alternative sets. I provide sufficient conditions for consistency of FRCB, and simulations of a test of regression monotonicity demonstrate the finite-sample properties of FRCB compared with CB for various regression functions. I illustrate the usefulness of FRCB with an empirical analysis of the monotonicity of the conditional mean function of radiocarbon age with respect to calendar age.
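For readers unfamiliar with critical bandwidth, the sketch below computes the classical density version: the smallest Gaussian-kernel bandwidth at which the kernel density estimate has at most a given number of modes. It is a minimal illustration of plain CB only, not of the flatness-robust variant or the regression test from the paper; the function names, grid size, and toy data are assumptions.

```python
import math

def kde(grid, data, h):
    """Gaussian kernel density estimate with bandwidth h on a grid."""
    scale = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return [scale * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)
            for x in grid]

def count_modes(values):
    """Count interior local maxima of a curve sampled on a grid."""
    modes, rising = 0, False
    for prev, cur in zip(values, values[1:]):
        if cur > prev:
            rising = True
        elif cur < prev:
            if rising:
                modes += 1
            rising = False
    return modes

def critical_bandwidth(data, max_modes=1, grid_size=400, tol=1e-4):
    """Bisect for the smallest h whose KDE has at most max_modes modes.

    Bisection relies on the mode count of a Gaussian KDE being
    nonincreasing in h; the result is only as fine as the grid.
    """
    span = max(data) - min(data)
    pad = 0.1 * span  # small margin so modes near the edges are interior
    start = min(data) - pad
    step = (span + 2.0 * pad) / (grid_size - 1)
    grid = [start + i * step for i in range(grid_size)]
    lo, hi = 0.0, span  # lo is never evaluated; h = span gives one mode
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_modes(kde(grid, data, mid)) <= max_modes:
            hi = mid
        else:
            lo = mid
    return hi

# Hypothetical toy sample with two well-separated clusters.
two_clusters = [-1.0, -0.5, 0.0, 0.5, 1.0, 9.0, 9.5, 10.0, 10.5, 11.0]
h_crit = critical_bandwidth(two_clusters)
print(h_crit)
```

A large critical bandwidth relative to the spread of the data means heavy smoothing is needed to remove the extra mode, which is the evidence against unimodality that a CB test then calibrates.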
 

Title: Down to the Last Strike: The Effect of the Jury Lottery on Conviction Rates, with Neel U. Sukhatme.

Abstract: How much does luck matter to a criminal defendant in a jury trial? We use rich data on jury selection to causally estimate how parties who are randomly assigned a less favorable jury (as proxied by whether their attorneys exhaust their peremptory strikes) fare at trial. Our novel identification strategy does not require the unrealistic exclusion restriction required by instrumental variable regression, and is unique in that it captures variation in juror predisposition from variables unobserved by the econometrician but observed by attorneys. We find that criminal defendants who lose the “jury lottery” are more likely to be convicted than their similarly-situated counterparts, with a significant effect for black defendants. For black defendants, strike exhaustion raises the chances of conviction by 16-18 percentage points. Our results are robust to alternate specifications and raise important policy questions about race and the use of peremptory strikes in the criminal justice system. In particular, our results suggest that increasing peremptory strike limits for defendants would decrease the variance in outcomes for similarly-situated black defendants.
 

Title: Non-Parametric Testing of U-Shapes, with an Application to the Midlife Satisfaction Dip

Abstract: Many theories in economics predict U-shaped relationships between variables. However, satisfactory tools to examine U-shapes are lacking. After explaining the limitations of the commonly employed quadratic specification, I propose a non-parametric test of U-shaped regression functions based on critical bandwidth. The test allows one to determine whether an inherent U-shape exists between two variables or the relationship is instead caused by correlation with other variables. I apply the test to the commonly observed U-shape of life satisfaction in age, and find that much of the U-shape can be explained by the increase in financial satisfaction that typically occurs later in life. This novel insight into a long-studied puzzle is not revealed by using a quadratic specification. A user-friendly and efficient R package is provided.

 

Work In Progress

 

  • The Partial Monotonicity Parameter: A Generalization of Monotonicity, with Ye Luo
  • Identification of Momentum in Elections, with Neel U. Sukhatme
  • Estimating Peer Effects, with an Application to Study Abroad, with Perihan Saygin