Print Visualizations at Data Stories 2022
Lisette Espín Noboa
Remotely sensed data combined with advanced machine learning methods have recently enabled a breakthrough in poverty map inference. However, these methods are not optimized to produce accountable results that guarantee accurate predictions for all sub-populations.
Here we propose three regression models that predict wealth from seven independent data sources such as Google, Facebook and OpenStreetMap. Our models correct for noisy, biased and sparse ground-truth data in Sierra Leone, a country characterized by extreme poverty.
- The first model is based on a convolutional neural network (CNN) architecture that learns to predict the wealth of places from daylight satellite images.
- The second model is an XGBoost regressor (XGB) that learns to predict wealth from metadata features such as mobility and infrastructure.
- The third and last model extends the second one by including a feature vector extracted from the third-to-last layer of the CNN model (CNN+XGB).
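To illustrate how the third model's input could be assembled, here is a minimal sketch in plain Python. It assumes that each place already has a list of metadata features and a CNN feature vector, and that the two are simply concatenated into one row before being passed to the regressor. The function names, the dictionary layout, and the concatenation order are hypothetical and not taken from the authors' implementation.

```python
from typing import Dict, List, Sequence


def combine_features(metadata: Sequence[float],
                     cnn_embedding: Sequence[float]) -> List[float]:
    """Concatenate metadata features and a CNN feature vector into one row.

    Assumption: metadata columns come first, embedding values second.
    """
    return [float(x) for x in metadata] + [float(x) for x in cnn_embedding]


def build_design_matrix(place_ids: Sequence[str],
                        metadata_by_place: Dict[str, Sequence[float]],
                        embedding_by_place: Dict[str, Sequence[float]]
                        ) -> List[List[float]]:
    """Build one combined feature row per place, in the order of place_ids."""
    rows = [
        combine_features(metadata_by_place[pid], embedding_by_place[pid])
        for pid in place_ids
    ]
    # Every row must have the same width, otherwise a regressor cannot use it.
    if len({len(row) for row in rows}) > 1:
        raise ValueError("inconsistent feature lengths across places")
    return rows


# Hypothetical toy values: two metadata features and a 3-dimensional embedding.
metadata = {"a": [0.2, 5.0], "b": [0.7, 1.0]}
embeddings = {"a": [0.1, 0.0, 0.9], "b": [0.4, 0.3, 0.2]}
X = build_design_matrix(["a", "b"], metadata, embeddings)
```

The resulting rows could then be given to any tabular regressor. Placing the metadata columns first keeps their column indices stable if the embedding size changes.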
Using our models, we found that mobility and population features are the best predictors of wealth. However, predictive power improves further when all features are used at once. Our models explain more than 80% of the variation in wealth, which outperforms previous approaches and helps inform better decisions on feature selection and policy.
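As a reading aid for the "explained variation" figure, the sketch below computes the coefficient of determination (R²), assuming that this is the metric behind the reported value. The abstract does not name the metric, and the numbers in the example are made up for illustration only.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Total variation of the observed values around their mean.
    ss_tot = sum((y - mean_true) ** 2 for y in y_true)
    # Variation left unexplained by the predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Made-up wealth scores and predictions, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(observed, predicted)
```

Under this definition, a value above 0.8 means that the squared prediction error is less than 20% of the total variance of the observed wealth values.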