Graduate Student Researcher, University of California, Santa Barbara, United States
Abstract Information: Machine learning methods have advanced considerably over the last decade, particularly in their ability to learn from quantitative datasets (Athey & Imbens, 2015). Despite these technical advances, little is known about how machine learning models can be used for policy and program evaluation. Machine learning techniques can support the prediction of social outcomes while supplying metrics that allow evaluators, policymakers, and researchers to assess model performance. They can also inform policy decisions by offering more sophisticated ways of identifying program participants and predicting their outcomes. For example, the Fragile Families Challenge used machine learning to predict social and educational outcomes (Salganik et al., 2019). This paper covers how statistical and machine learning techniques can be applied to large government and administrative datasets to predict participant outcomes from specified model features. It also addresses the metrics and other technical steps used to evaluate model performance. The primary example applies several machine learning models to data from the Integrated Postsecondary Education Data System, administered by the U.S. Institute of Education Sciences. Metrics for evaluating model performance are discussed with a view to their use in evaluation and policymaking decisions.
Athey, S., & Imbens, G. W. (2015). Machine Learning for Estimating Heterogeneous Causal Effects. Research Papers 3350, Stanford University, Graduate School of Business.
Salganik, M. J., Lundberg, I., Kindel, A. T., & McLanahan, S. (2019). Introduction to the Special Collection on the Fragile Families Challenge. Socius: Sociological Research for a Dynamic World, 5, 237802311987158.
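The abstract refers to metrics for assessing model performance without naming them. As a minimal sketch, assuming a binary outcome (1 = outcome occurred, 0 = it did not), the following computes four commonly used classification metrics. The function name, the `ClassificationMetrics` container, and the labels are hypothetical and made up for illustration; they are not drawn from the paper or from IPEDS data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ClassificationMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate_predictions(y_true: Sequence[int], y_pred: Sequence[int]) -> ClassificationMetrics:
    """Compute common binary-classification metrics (1 is the positive class)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Tally the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or never occurs.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return ClassificationMetrics(accuracy, precision, recall, f1)


# Made-up labels for illustration only (not real administrative data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate_predictions(y_true, y_pred)
print(metrics)
```

Which metric matters most depends on the evaluation question: recall weighs missed participants more heavily, while precision weighs false alarms, so a policy application would typically report several metrics rather than accuracy alone.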
Relevance Statement: While machine learning techniques have improved substantially over the past decade (Athey & Imbens, 2015), their use in evaluation has remained limited and the methods largely inaccessible. This paper considers the use of machine learning for policy and program evaluation by applying machine learning methods to large government data in the higher education sector. Doing so opens several opportunities for evaluation practice. First, machine learning can help answer evaluation questions that may be too complex for traditional quantitative methods. Second, the paper discusses the impact that machine learning will have on evaluation design, as well as advances in quantitative methods for addressing the complex social problems that evaluators encounter. In addition, the introduction of machine learning creates opportunities to evaluate whether the method itself is a viable and sustainable tool for assessing policy and program performance. The paper is geared mainly towards advancing methodological tools for evaluation, but it also connects to building organizational capacity for evaluation. Machine learning techniques contribute to methodological standards by introducing a set of methods for predicting social outcomes that allow for more complex models and associated performance metrics. Machine learning can also increase organizational capacity for evaluation by helping to determine participant behavior and outcomes using government, administrative, and survey data. Finally, effective implementation of the method can give evaluators and organizations the ability to make data-driven decisions at the policy and program levels.
Athey, S., & Imbens, G. W. (2015). Machine Learning for Estimating Heterogeneous Causal Effects. Research Papers 3350, Stanford University, Graduate School of Business.