Using machine learning and remote sensing to value property in Rwanda

Property valuation models can achieve mass valuation transparently and cheaply. This paper develops a number of property valuation models for Kigali, Rwanda, and tests them on a unique dataset that combines remote sensing, infrastructure and amenities data for properties in Kigali with sales transaction data for 2015. We use a machine learning approach, Minimum Redundancy Maximum Relevance, to select from 511 features those that minimise ten-fold cross-validated Mean Absolute Error. Cross-validated diagnostics are used to eliminate overfitting, given that our goal is to generate a model that can be used to extrapolate value estimates out of sample. The performance of Ordinary Least Squares (OLS) is compared to that of a range of spatial models. Our best model covering all taxed parcels achieves a cross-validated R² of 0.600 and a cross-validated Mean Absolute Error of 0.541. We find that locational variables relating to connectivity are most consistently important for overall property value across different models. We also attempt to develop the most accurate method of calculating building values.

Our recommendations for future property valuation in Rwanda are:
i) Given that the goal is to extrapolate the model to estimate the value of all properties outside the sample of transacted properties, it is essential to eliminate overfitting as far as possible. This can be done by optimising cross-validated diagnostics such as Mean Absolute Error and R².
ii) Spatial models are desirable in terms of out-of-sample accuracy if and only if it is possible to test a variety of them extensively alongside OLS on the basis of cross-validated diagnostics; such models often overfit in sample but do not always outperform OLS out of sample.
iii) Ideally, a Computer Assisted Mass Appraisal would help determine full property taxes or land taxes, and not only building taxes, given that it estimates full property values and land values more accurately than building values. Building values are also not directly observable, so it is impossible to assess the accuracy of our imputed building value estimates.
iv) Additional structural building data on variables such as building materials and numbers of rooms, which the Government of Rwanda plans to collect, would improve model accuracy.