Working paper
Target Variable Engineering
Jessica Clark
Abstract
How does the formulation of a target variable affect performance within the ML pipeline? The experiments in this study examine numeric targets that have been binarized by comparison with a threshold. We compare the predictive performance of regression models trained to predict the numeric targets with that of classifiers trained to predict their binarized counterparts. Specifically, we make this comparison at every point of a randomized hyperparameter optimization search to understand how the computational resource budget affects the tradeoff between the two. We find that regression requires significantly more computational effort to converge on optimal performance and is more sensitive to both randomness and heuristic choices in the training process. Although classification can and does benefit from systematic hyperparameter tuning and model selection, the improvements are much smaller than for regression. This work comprises the first systematic comparison of regression and classification framed in terms of computational resource requirements. Our findings contribute to calls for greater replicability and efficiency in the ML pipeline for the sake of building more sustainable and robust AI systems.
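A minimal sketch of the comparison described above, assuming a synthetic dataset, gradient-boosted models, AUC as the shared evaluation metric, and an illustrative search space (none of these choices are taken from the paper): both routes are scored after each iteration of the same randomized hyperparameter search, so the best score found so far can be tracked as a function of the search budget.

```python
# Illustrative sketch (not the paper's code): compare a regressor predicting a
# numeric target against a classifier predicting its thresholded counterpart,
# tracking the best test AUC found after each iteration of a randomized search.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import ParameterSampler, train_test_split

X, y_num = make_regression(n_samples=2000, n_features=20, noise=10.0, random_state=0)
threshold = np.median(y_num)                    # binarize the numeric target at a threshold
y_bin = (y_num > threshold).astype(int)

X_tr, X_te, yn_tr, yn_te, yb_tr, yb_te = train_test_split(
    X, y_num, y_bin, test_size=0.3, random_state=0)

param_space = {"n_estimators": [50, 100, 200],  # assumed, illustrative search space
               "max_depth": [2, 3, 5],
               "learning_rate": [0.01, 0.05, 0.1]}
samples = list(ParameterSampler(param_space, n_iter=20, random_state=0))

best_reg, best_clf = -np.inf, -np.inf
for i, params in enumerate(samples, start=1):
    # Regression route: predict the numeric target, then score how well the
    # predictions rank instances relative to the threshold (AUC on binarized truth).
    reg = GradientBoostingRegressor(random_state=0, **params).fit(X_tr, yn_tr)
    reg_auc = roc_auc_score(yb_te, reg.predict(X_te))

    # Classification route: predict the binarized target directly.
    clf = GradientBoostingClassifier(random_state=0, **params).fit(X_tr, yb_tr)
    clf_auc = roc_auc_score(yb_te, clf.predict_proba(X_te)[:, 1])

    best_reg, best_clf = max(best_reg, reg_auc), max(best_clf, clf_auc)
    print(f"iter {i:2d}: best regression AUC={best_reg:.3f}, "
          f"best classification AUC={best_clf:.3f}")
```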
Under review
Automated Promotion? A Study of the Fairness-Economic Tradeoffs in Reducing Crowdfunding Disparities via AI/ML
Lauren Rhue, Jessica Clark
Abstract
Digital platforms have a widely documented issue with racial disparities, which can result in adverse reputational and economic consequences. Equitable promotion of projects across racial groups can mitigate these disparities. Our research explores how to determine more equitably which projects should be promoted by the platform. Platforms typically rely on their employees to decide what content to highlight, but human decisions are subject to cognitive and implicit biases. We examine whether an algorithmic approach to choosing which projects to promote can generate more equitable outcomes for people in traditionally marginalized groups while resulting in equivalent economic outcomes. We perform an observational and simulated study on more than 100,000 projects gathered from the crowdfunding platform Kickstarter.com to determine whether machine learning models would diversify the set of promoted projects.
Our analysis yields three main findings. First, machine learning models, both fairness-unaware and fairness-aware, identify a more diverse set of projects to promote than the set selected by employees. Second, promoting a more diverse set of projects diminishes but does not completely eliminate disparities between racial groups. Third, a more equitable promotion scheme does not substantially harm core business outcomes for the platform. This study contributes to the information systems literature on using machine learning to reduce racial disparities and to research examining the fairness-economic trade-off. Furthermore, this paper provides a practical path forward for digital platforms that want to increase participation from diverse groups, and for potential crowdfunding participants.
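The core comparison can be sketched as follows, with hypothetical column names and toy data rather than the study's dataset or models: the group composition of the top-k projects a trained model would promote is contrasted with the composition of the projects employees actually featured, holding the number of promoted projects fixed.

```python
# Illustrative sketch (hypothetical fields and data, not the study's code):
# compare group shares among model-promoted vs. employee-promoted projects.
import pandas as pd

def promotion_shares(df, promote_mask, group_col="creator_group"):
    """Share of each group among the promoted projects."""
    return df.loc[promote_mask, group_col].value_counts(normalize=True)

# Hypothetical project table: a model's promotion score and a staff-pick flag.
df = pd.DataFrame({
    "creator_group": ["A", "A", "B", "B", "B", "C", "A", "C"],
    "model_score":   [0.91, 0.40, 0.85, 0.88, 0.30, 0.77, 0.55, 0.82],
    "staff_pick":    [True, False, False, True, False, False, True, False],
})

k = int(df["staff_pick"].sum())                       # promote as many as employees did
model_picks = df["model_score"].rank(ascending=False) <= k

print("Employee-promoted shares:\n", promotion_shares(df, df["staff_pick"]))
print("Model-promoted shares:\n", promotion_shares(df, model_picks))
```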
Under review
Not time but place: Location vs. previous choices for prosocial crowdfunding recommendation strategies
Lauren Rhue, Atiya Avery, Jessica Clark
Abstract
Donors on prosocial crowdfunding platforms have two critical motivations: supporting local causes and supporting social connections. Platform recommendation strategies often leverage prior donor choices; however, donors’ choices may be driven by supply constraints rather than their true preferences. In these instances, a recommendation strategy based on donor attributes such as location may better reflect their true preferences. To understand the effectiveness of these two recommendation strategies, we conducted a randomized experiment with 200,000 donors in partnership with a prosocial crowdfunding platform. Donors were randomly selected to receive project recommendations that were either geographically close to their home or geographically close to their previously supported cause. We found that the local recommendation strategy increased the likelihood of clicks and donations. These results are driven by donors without social connections to the platform, indicating that social motivations supersede geographic motivations and suggesting that digital platforms should consider a hierarchical approach. We also found evidence that the local recommendation strategy yields a rich-get-richer effect. We discuss the implications of our findings for digital platforms as well as the practical implications for our research context of education in the United States.
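A minimal sketch of the two recommendation strategies compared in the experiment, assuming hypothetical donor and project coordinates (the platform's actual recommendation system is not described here): one strategy anchors recommendations on the donor's home location, the other on the location of the donor's previously supported cause.

```python
# Illustrative sketch (hypothetical data, not the platform's system): pick the
# nearest open project under each of the two recommendation strategies.
import math

def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

def nearest_project(anchor, projects):
    """Return the open project closest to the anchor location."""
    return min(projects, key=lambda p: haversine_km(anchor, p["loc"]))

donor_home = (38.99, -76.94)        # hypothetical donor home location
last_supported = (40.71, -74.01)    # location of the donor's previously supported cause
open_projects = [
    {"id": "p1", "loc": (39.29, -76.61)},
    {"id": "p2", "loc": (40.73, -73.99)},
    {"id": "p3", "loc": (34.05, -118.24)},
]

print("Local strategy recommends:", nearest_project(donor_home, open_projects)["id"])
print("Previous-choice strategy recommends:", nearest_project(last_supported, open_projects)["id"])
```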