Machine learning affords insight into what affects social and economic progress.
Decades before Thomas Piketty’s 2013 bestseller Capital in the Twenty-First Century galvanised an international conversation about increasing economic disparity in wealthy nations, development economists were probing the relationship between income inequality and socioeconomic progress. Results were mixed: some found a direct link between pronounced inequality and poor economic development; others posited that inequality was positively correlated with development; still others could find no clear connection.
Nonetheless, academics, policymakers and business leaders increasingly worry about inequality, and believe that markedly unequal societies may be riven by political problems. Unequal societies may also have difficulty attaining positive educational and health outcomes and building institutions that encourage investment, making it hard to sustain broad-based prosperity. In existing research and debates, the Gini coefficient, an overall measure of income distribution, is routinely used as a rough predictor of a given country’s economic prospects. This narrow focus on the Gini coefficient strikes us as odd, since the measure is skewed toward inequality at the top. Whopping income gaps between millionaires and billionaires affect the Gini more strongly than the comparatively meagre amounts dividing the poor from one another and from the lower middle class. Inequality as reflected by the Gini coefficient, then, arguably gives short shrift to poverty.
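To make the contrast between the two kinds of measure concrete, here is a minimal sketch that computes a Gini coefficient and a simple poverty headcount for three toy income lists. The incomes, the poverty line of 3 and the function names are illustrative assumptions, not figures or code from the working paper.

```python
def gini(incomes):
    """Gini coefficient via the sorted-rank formula (0 means perfect equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # G = 2 * sum(rank_i * x_i) / (n * sum(x)) - (n + 1) / n, ranks 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def poverty_headcount(incomes, poverty_line):
    """Share of people whose income is strictly below the poverty line."""
    return sum(1 for x in incomes if x < poverty_line) / len(incomes)


# Hypothetical incomes in arbitrary units; the poverty line is also made up.
POVERTY_LINE = 3
baseline = [1, 2, 4, 8, 100]
richer_top = [1, 2, 4, 8, 200]      # only the richest person gains
lifted_bottom = [3, 3, 4, 8, 100]   # only the two poorest people gain

for label, incomes in [("baseline", baseline),
                       ("richer top", richer_top),
                       ("lifted bottom", lifted_bottom)]:
    print(label,
          "gini=%.3f" % gini(incomes),
          "headcount=%.2f" % poverty_headcount(incomes, POVERTY_LINE))
```

In this toy setup, a large gain at the top and a small gain at the bottom move the Gini by broadly similar amounts in opposite directions, while only the second change touches the poverty headcount.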
So is it inequality at the top or the bottom that really matters for predicting outcomes such as schooling, institutional quality and per-capita income? For our recent working paper “Income Distribution and Economic Development: Insights From Machine Learning”, we used machine learning techniques to put the Gini to the test alongside dozens of other measures of inequality. Our results underscore the frequently neglected role of poverty, as opposed to overall inequality, in shaping development outcomes.
Poverty as predictor
Machine learning tools are excellent at making predictions based on existing datasets, a process which involves selecting, from among a pool of potential variables, a sparse subset that best predicts a given outcome. We hypothesised that, when furnished with real-world data about how various economies fared over time, our computer models would pinpoint which, if any, measures of income distribution were predictive.
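One common way to pick a sparse subset of predictors is the lasso, which shrinks the coefficients of weak predictors exactly to zero. The sketch below is a bare-bones coordinate-descent lasso on made-up data; it is an assumption that this resembles the selection step, since the text here does not name the specific algorithm used. The variable names, the penalty value and the synthetic outcome (built to depend only on the poverty column) are all hypothetical, so the example shows the mechanics of selection rather than evidence for any finding.

```python
import random


def standardize(column):
    """Rescale a column to mean 0 and variance 1."""
    n = len(column)
    mean = sum(column) / n
    sd = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
    return [(v - mean) / sd for v in column]


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; return 0.0 if it is within the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso(columns, y, penalty, sweeps=200):
    """Coordinate-descent lasso.

    columns: standardized predictor columns; y: centred outcome.
    Minimises (1 / 2n) * ||y - Xb||^2 + penalty * ||b||_1.
    """
    n = len(y)
    coefs = [0.0] * len(columns)
    residual = list(y)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Covariance of predictor j with the residual, adding back
            # predictor j's own current contribution.
            rho = sum(c * (r + c * coefs[j]) for c, r in zip(col, residual)) / n
            new = soft_threshold(rho, penalty)  # mean(col^2) is 1 after standardizing
            delta = new - coefs[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
                coefs[j] = new
    return coefs


# Synthetic illustration only: 93 made-up "countries", three made-up measures.
random.seed(0)
n = 93
names = ["gini", "poverty_headcount", "top10_share"]
raw = {name: [random.gauss(0, 1) for _ in range(n)] for name in names}
outcome = [-1.0 * p + random.gauss(0, 0.5) for p in raw["poverty_headcount"]]

columns = [standardize(raw[name]) for name in names]
mean_y = sum(outcome) / n
centred = [v - mean_y for v in outcome]

coefs = lasso(columns, centred, penalty=0.3)
selected = [name for name, b in zip(names, coefs) if b != 0.0]
print("selected measures:", selected)
```

A larger penalty drops more measures; in practice the penalty is usually chosen by cross-validation rather than fixed by hand as it is here.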
Using income distribution figures for 93 developing and advanced economies from 1988 (the earliest year for which data were available), as reported in household surveys, we generated a total of 37 inequality measures for each country – among them, the Gini coefficient as well as indices of absolute poverty, relative poverty (for mature economies), and a hybrid of the two. The machine learning tools allowed us to ascertain which of the 37 measures gave the best indication of how these countries would look approximately 15 years later – as reflected in data from 2002-2003 on per capita income, secondary education enrolment rates and institutional stability.
The results were clear. As we write in the working paper, “From a pure prediction perspective, it is poverty that matters more than any other distributional statistic, including Gini.”
Shedding light on causes
While supervised machine learning typically wins the prediction race, social scientists are more interested in making causal inferences. Predictive techniques tell us little about causes – for instance, our findings could have been due to some other factor affecting both poverty and development outcomes. So, for the next stage of our study, we extended the machine learning techniques to causal inference, and incorporated 67 explanatory variables that researchers have associated with long-term growth – such as schooling, demography and geographic characteristics. We found essentially the same result: the fraction of the population living in poverty emerged as significant in predicting real-world outcomes, while the Gini was not selected as a relevant predictor of either outcomes or poverty itself.
Finally, we threw in a historical factor known to increase inequality: land endowments. Many Latin American countries under colonial rule had their land cultivated for large-scale sugar plantations. Colonial overseers would compel subjugated populations to work these plantations as slave labour, creating a lingering legacy of deep inequality. Conversely, arable land in North America was more often used for wheat farms, whose relatively small scale contributed to the growth of an agricultural middle class. Wheat-to-sugar land endowment ratios, therefore, serve as a well-established proxy (or “instrument” in statistical jargon) for inequality. However, our regressions put a finer point on it: poverty proved more significant than the Gini coefficient as the channel through which land endowments affected societal outcomes.
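For readers unfamiliar with instruments, the basic logic can be sketched with a single-instrument estimator: the effect of a distributional measure on an outcome is estimated only from the variation that the instrument induces. The tiny dataset below is entirely artificial, and the names (`instrument`, `confounder`, `poverty`, `outcome`) and the true effect of -2 are assumptions chosen so the arithmetic is easy to follow; this is not the specification used in the working paper.

```python
def covariance(a, b):
    """Population covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n


def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Single-instrument IV estimate of the effect of x on y: cov(z, y) / cov(z, x)."""
    return covariance(z, y) / covariance(z, x)


# Artificial data: the instrument stands in for a land-endowment ratio.
instrument = [-1, 1, -1, 1]
# Unobserved factor, uncorrelated with the instrument by construction.
confounder = [-1, -1, 1, 1]
# Poverty is driven by both the instrument and the confounder.
poverty = [z + u for z, u in zip(instrument, confounder)]
# True effect of poverty on the outcome is -2; the confounder also hurts it.
outcome = [-2 * p - 3 * u for p, u in zip(poverty, confounder)]

print("naive OLS slope:", ols_slope(poverty, outcome))            # -3.5
print("IV slope:", iv_slope(instrument, poverty, outcome))        # -2.0
```

The naive regression overstates the effect because the hidden factor moves poverty and the outcome together, while the instrument-based estimate recovers the value built into the toy data.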
A big difference
If the distinction between poverty and inequality seems merely academic, consider that according to our estimates, reducing Bolivia’s poverty level (50 percent) to that of Uruguay (10 percent) would virtually erase the 20 percent difference in secondary education enrolment rate between the two nations and produce a roughly equivalent increase in Bolivia’s per capita GDP. We expect that our findings would be somewhat less applicable to developed economies, where poverty is more a relative concept than a question of absolute need.
Still, for policymakers, the distinction between inequality and poverty makes a big difference. Refocusing from inequality to poverty would mean deprioritising policies designed to bolster the middle class at the expense of the rich, such as U.S. Senator Bernie Sanders’s proposal to eliminate tuition at public colleges and universities. Instead, governments should concentrate on pulling people out of poverty, with the expectation that society as a whole would eventually benefit. Such a stance would not rule out redistributive policies, but it would seem that a common-sense interpretation of the Pareto principle – the rich getting richer isn’t necessarily a bad thing, as long as no one else in society suffers – may be a good guideline.
Ilia Tsetlin is a Professor of Decision Sciences and the Chair of the Decision Sciences Area at INSEAD.
In 2015, according to an annual survey published by the World Economic Forum, the widening gap between rich and poor was seen as the biggest risk facing the global economy over the next decade. The irony of the world’s richest furrowing their collective brows at the maladies of the poor, while drinking a 2000 Cheval Blanc in Swiss chalets, was not lost on The Economist. Each chalet was rented for around $700 a day, a number serendipitously close to the annual income used by the World Bank to calculate poverty (the $2 a day benchmark).