Poverty map inference using machine learning and fine-grain online data in African countries

Authors: Lisette Espín Noboa
János Kertész
Márton Karsai

Poverty maps, spatial representations of economic wealth, are essential tools that help governments and NGOs allocate infrastructure and services to the places that need them. Traditionally, such maps are inferred from census and survey data, which often provide outdated and low-resolution socioeconomic information, especially in developing countries.

Remotely sensed data combined with advanced machine learning methods have recently enabled a breakthrough in poverty map inference. However, these methods are not optimized to produce accountable results that guarantee accurate predictions for all sub-populations.

Here we propose three regression models that predict wealth from seven independent data sources, including Google, Facebook and OpenStreetMap. Our models correct for noisy, biased and sparse ground-truth data in Sierra Leone, a country characterized by extreme poverty.

- The first model is based on a convolutional neural network (CNN) architecture that learns to predict the wealth of places from daylight satellite images.

- The second model is an XGBoost regressor (XGB) that learns to predict wealth from metadata features such as mobility and infrastructure.

- The third model extends the second by adding a feature vector extracted from the third-to-last layer of the CNN model (CNN+XGB).
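The feature fusion behind the third model can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `fuse_features`, the feature values and the vector lengths are all hypothetical, and the CNN embeddings are assumed to have been extracted beforehand.

```python
from typing import List, Sequence


def fuse_features(
    metadata: Sequence[Sequence[float]],
    cnn_embeddings: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Concatenate, place by place, metadata features with CNN embeddings.

    Hypothetical helper: row i of both inputs must describe the same place.
    """
    if len(metadata) != len(cnn_embeddings):
        raise ValueError("metadata and cnn_embeddings must cover the same places")
    return [list(m) + list(e) for m, e in zip(metadata, cnn_embeddings)]


# Made-up values for two places: two metadata features each
# (for example, one mobility and one infrastructure feature).
metadata = [[0.4, 12.0], [0.9, 3.0]]
# Made-up three-dimensional embeddings standing in for CNN layer activations.
cnn_embeddings = [[0.1, 0.2, 0.3], [0.5, 0.6, 0.7]]

fused = fuse_features(metadata, cnn_embeddings)
# Each fused row now has 2 + 3 = 5 features and could be passed, together with
# the wealth labels, to a gradient-boosted regressor such as xgboost.XGBRegressor.
```

Keeping the fusion as a plain concatenation means the downstream regressor treats image-derived and metadata features uniformly.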

Using our models, we found that mobility and population features are the best predictors of wealth. However, predictive power improves further when all features are used at once. Our models explain more than 80% of the variation in wealth, which outperforms previous approaches and supports better decisions on feature selection and policy.
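Assuming that "variation explained" refers to the coefficient of determination (R²), the metric can be sketched as below. The function name `r_squared` and the numbers are illustrative only and are not taken from the study.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Return R^2 = 1 - SS_res / SS_tot for observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Total variation of the observed values around their mean.
    ss_tot = sum((y - mean_true) ** 2 for y in y_true)
    # Variation left unexplained by the predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Made-up wealth scores and predictions for four places.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(observed, predicted)  # roughly 0.98
```

Under this reading, a value above 0.8 means the predictions leave less than 20% of the observed variation unexplained.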