Introduction
Property market appraisals
Rental market
Now, the rise of the “big data” concept may at last be setting the scene for a breakthrough for real estate… The analytical skills and experience to harness the data successfully are developing too, and market globalisation is serving to increase awareness of data best practice from different markets around the world.
Rental market valuations
Traditional models
and also alternative modelling paradigms that allow for a flexible expression of the relationship between rental value and the attributes, such as machine learning.
Everything is related to everything else, but near things are more related than distant things,
Machine learning
Case description
Property information
Neighbourhood information
Estimation methods
Experimental procedure
Regression and machine learning
Practitioner approach
Goodness of fit
Results
Attribute | N/median | Estimate | Std error | t |
---|---|---|---|---|
Intercept | 487,253 | 6.4510 | 0.0067 | 957.7*** |
Flat | 212,275 | | | |
Bungalow | 11,617 | 0.0073 | 0.0059 | 1.2 |
Detached | 31,996 | 0.0192 | 0.0037 | 5.2*** |
Semi-detached | 54,410 | − 0.0463 | 0.0032 | − 14.5*** |
Terraced | 111,087 | − 0.0185 | 0.0025 | − 7.4*** |
Unknown | 65,868 | 0.0169 | 0.0026 | 6.4*** |
1 bedroom | 94,379 | | | |
2 bedrooms | 192,236 | 0.2772 | 0.0024 | 116.8*** |
3 bedrooms | 123,546 | 0.5157 | 0.0028 | 186.7*** |
4 bedrooms | 41,505 | 0.7607 | 0.0033 | 228.6*** |
5 bedrooms | 12,558 | 1.0080 | 0.0043 | 235.7*** |
6 and more bedrooms | 7097 | 1.2650 | 0.0051 | 248.3*** |
Unknown bedrooms | 15,932 | − 0.0881 | 0.0050 | − 17.7*** |
1 bathroom | 194,157 | | | |
2 bathrooms | 45,440 | 0.1314 | 0.0026 | 50.8*** |
3 bathrooms | 6767 | 0.3343 | 0.0047 | 71.2*** |
4 bathrooms | 1150 | 0.5347 | 0.0085 | 63.3*** |
5 and more bathrooms | 622 | 0.6633 | 0.0107 | 62.0*** |
Unknown bathrooms | 239,117 | 0.1169 | 0.0024 | 48.2*** |
1 reception room | 159,999 | | | |
2 reception rooms | 41,912 | 0.0020 | 0.0030 | 0.7 |
3 reception rooms | 4921 | 0.0681 | 0.0060 | 11.4*** |
4 reception rooms | 723 | 0.2235 | 0.0113 | 19.8*** |
5 and more reception rooms | 191 | 0.3379 | 0.0189 | 17.9*** |
Unknown reception rooms | 279,507 | − 0.0333 | 0.0024 | − 13.9*** |
January | 50,988 | | | |
February | 37,309 | − 0.0220 | 0.0036 | − 6.2*** |
March | 39,601 | − 0.0179 | 0.0035 | − 5.1*** |
April | 38,037 | − 0.0098 | 0.0035 | − 2.8** |
May | 40,414 | 0.0095 | 0.0034 | 2.8** |
June | 42,095 | − 0.0090 | 0.0034 | − 2.7** |
July | 44,808 | − 0.0031 | 0.0033 | − 0.9 |
August | 39,791 | 0.0068 | 0.0035 | 2.0* |
September | 37,994 | − 0.0041 | 0.0035 | − 1.2 |
October | 43,005 | 0.0086 | 0.0034 | 2.5* |
November | 42,037 | 0.0238 | 0.0034 | 7.0*** |
December | 31,174 | 0.0042 | 0.0038 | 1.1 |
Up to 4 web site visits per day | 24,094 | | | |
5–10 web site visits per day | 14,610 | 0.0244 | 0.0055 | 4.4*** |
11–20 web site visits per day | 23,114 | − 0.0199 | 0.0050 | − 3.9*** |
21–60 web site visits per day | 39,969 | − 0.0469 | 0.0046 | − 10.3*** |
61 and more web site visits per day | 29,423 | − 0.0754 | 0.0050 | − 15.2*** |
Unknown site visits | 356,043 | 0.0230 | 0.0037 | 6.2*** |
Affluent achievers | 60,017 | | | |
Rising prosperity | 136,624 | − 0.1961 | 0.0026 | − 74.5*** |
Comfortable communities | 98,779 | − 0.2798 | 0.0028 | − 99.7*** |
Financially stretched | 92,146 | − 0.3463 | 0.0031 | − 112.9*** |
Urban adversity | 96,472 | − 0.4212 | 0.0031 | − 134.3*** |
Not private households | 3008 | − 0.0994 | 0.0090 | − 11.1*** |
ACORN not known | 207 | − 0.1028 | 0.0274 | − 3.8*** |
Distance from the City of London (logged in model) | 113.95 km | − 0.2862 | 0.00079 | − 363.2*** |
Distance from railway station (logged in model) | 1.11 km | − 0.0204 | 0.0010 | − 20.0*** |
Outstanding primary school | 91,869 | | | |
Good primary school | 308,287 | − 0.0487 | 0.0019 | − 26.2*** |
Requires improvement primary school | 79,841 | − 0.0614 | 0.0026 | − 24.0*** |
Inadequate primary school | 7256 | − 0.0972 | 0.0071 | − 13.7*** |
Outstanding secondary school | 119,014 | | | |
Good secondary school | 245,070 | − 0.0760 | 0.0018 | − 43.2*** |
Requires improvement secondary school | 96,715 | − 0.1047 | 0.0024 | − 44.6*** |
Inadequate secondary school | 26,454 | − 0.1269 | 0.0044 | − 28.9*** |
Retail health | 30.53 | 0.0025 | 0.00005 | 52.2*** |
Access health | 7.21 | − 0.0001 | 0.00008 | − 1.9 |
Environment health | 25.32 | 0.0004 | 0.00004 | 10.5*** |
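
The estimates in the table above can be read as approximate percentage effects, provided the dependent variable is the natural logarithm of rent. That is an assumption here: it is suggested by the size of the intercept and by the "logged in model" notes, but it is not stated in the table itself. Under that assumption, a dummy coefficient β implies a rent difference of exp(β) − 1 relative to the omitted category (the rows without estimates), and the coefficient on a logged distance acts as an elasticity. A minimal sketch, with coefficient values copied from the table and hypothetical helper names:

```python
import math

# Coefficients copied from the regression table above.
# ASSUMPTION: the dependent variable is the natural log of rent.
coefficients = {
    "2 bedrooms": 0.2772,        # relative to 1 bedroom
    "Semi-detached": -0.0463,    # relative to Flat
    "Urban adversity": -0.4212,  # relative to Affluent achievers
}


def pct_effect(beta: float) -> float:
    """Percentage change in rent relative to the omitted category."""
    return (math.exp(beta) - 1.0) * 100.0


effects = {name: round(pct_effect(b), 1) for name, b in coefficients.items()}

# ASSUMPTION: distance enters as a natural log, so beta is an elasticity.
# Doubling the distance then scales rent by a factor of 2 ** beta.
distance_beta = -0.2862
doubling_effect = round((2.0 ** distance_beta - 1.0) * 100.0, 1)

for name, value in effects.items():
    print(f"{name}: {value:+.1f}%")
print(f"Doubling distance from the City of London: {doubling_effect:+.1f}%")
```

On this reading, a second bedroom is associated with roughly a 32% higher rent than a one-bedroom property, all else equal, while doubling the distance from the City of London is associated with a fall of about 18%.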
Regression model
Machine learning and practitioner approach
Training | GLM | GB | SVM | Cubist | MARS | Best MLA |
---|---|---|---|---|---|---|
Jan | 0.53 | 0.57 | 0.48 | 0.59 | 0.47 | 0.65 |
Feb | 0.56 | 0.59 | 0.53 | 0.62 | 0.49 | 0.66 |
Mar | 0.51 | 0.56 | 0.47 | 0.59 | 0.45 | 0.67 |
Apr | 0.56 | 0.61 | 0.48 | 0.63 | 0.47 | 0.66 |
May | 0.56 | 0.59 | 0.50 | 0.60 | 0.49 | 0.66 |
Jun | 0.54 | 0.57 | 0.51 | 0.60 | 0.48 | 0.64 |
Jul | 0.54 | 0.56 | 0.50 | 0.59 | 0.48 | 0.64 |
Aug | 0.56 | 0.60 | 0.50 | 0.63 | 0.48 | 0.64 |
Sep | 0.54 | 0.57 | 0.50 | 0.59 | 0.47 | 0.64 |
Oct | 0.51 | 0.56 | 0.50 | 0.60 | 0.46 | 0.63 |
Nov | 0.49 | 0.52 | 0.46 | 0.54 | 0.43 | 0.64 |
Dec | 0.49 | 0.54 | 0.47 | 0.57 | 0.43 | 0.63 |
Testing | PBA | GLM | GB | SVM | Cubist | MARS | Ensemble | Best MLA |
---|---|---|---|---|---|---|---|---|
Jan | 0.55 | 0.56 | 0.62 | 0.56 | 0.65 | 0.47 | 0.67 | 0.68 |
Feb | 0.53 | 0.55 | 0.61 | 0.57 | 0.64 | 0.50 | 0.65 | 0.64 |
Mar | 0.48 | 0.49 | 0.52 | 0.48 | 0.56 | 0.43 | 0.57 | 0.58 |
Apr | 0.52 | 0.55 | 0.58 | 0.55 | 0.65 | 0.47 | 0.65 | 0.64 |
May | 0.41 | 0.44 | 0.48 | 0.44 | 0.50 | 0.39 | 0.51 | 0.52 |
Jun | 0.53 | 0.59 | 0.63 | 0.60 | 0.67 | 0.52 | 0.68 | 0.68 |
Jul | 0.55 | 0.58 | 0.66 | 0.61 | 0.66 | 0.53 | 0.69 | 0.69 |
Aug | 0.51 | 0.53 | 0.58 | 0.56 | 0.62 | 0.48 | 0.63 | 0.62 |
Sep | 0.52 | 0.57 | 0.64 | 0.57 | 0.68 | 0.51 | 0.69 | 0.68 |
Oct | 0.49 | 0.56 | 0.59 | 0.57 | 0.63 | 0.49 | 0.64 | 0.63 |
Nov | 0.52 | 0.57 | 0.63 | 0.54 | 0.64 | 0.48 | 0.66 | 0.66 |
Dec | 0.51 | 0.56 | 0.61 | 0.57 | 0.66 | 0.51 | 0.67 | 0.60 |
All | 0.51 | 0.54 | 0.59 | 0.55 | 0.63 | 0.48 | 0.64 | 0.64 |
Testing | PBA | GLM | GB | SVM | Cubist | MARS | Ensemble | Best MLA |
---|---|---|---|---|---|---|---|---|
Jan | 7.95 | 16.62 | 16.07 | 13.80 | 13.59 | 20.73 | 13.44 | 13.28 |
Feb | 8.17 | 16.55 | 15.22 | 13.30 | 13.46 | 20.66 | 13.04 | 13.02 |
Mar | 8.35 | 16.28 | 15.24 | 13.32 | 13.22 | 20.66 | 13.14 | 12.89 |
Apr | 8.47 | 15.83 | 15.00 | 13.13 | 13.31 | 20.49 | 12.95 | 13.05 |
May | 8.62 | 15.94 | 14.85 | 12.99 | 13.04 | 20.01 | 13.32 | 12.98 |
Jun | 8.82 | 16.02 | 15.07 | 13.39 | 13.36 | 19.83 | 13.04 | 13.13 |
Jul | 9.23 | 15.68 | 14.82 | 12.97 | 12.91 | 19.69 | 12.87 | 12.57 |
Aug | 9.26 | 15.70 | 14.74 | 13.02 | 12.90 | 19.92 | 12.91 | 12.74 |
Sep | 9.26 | 15.12 | 14.40 | 12.55 | 12.38 | 19.25 | 12.40 | 12.31 |
Oct | 9.80 | 16.14 | 15.17 | 13.40 | 13.39 | 19.67 | 13.39 | 13.10 |
Nov | 9.95 | 16.70 | 15.76 | 13.83 | 13.89 | 19.64 | 14.46 | 13.36 |
Dec | 9.73 | 15.77 | 14.76 | 13.20 | 12.35 | 19.36 | 13.00 | 13.03 |
All | 9.07 | 16.04 | 15.11 | 13.25 | 13.18 | 20.01 | 13.06 | 12.95 |
Discussion and evaluation
Comparative performance
Scale | Log transformed | Original | |||
---|---|---|---|---|---|
Source | PBA model | Löchl [59] | Fuss and Koller [25] | PBA Model | McCord, Davis [30] |
Location | 15 km and 12 months | Table 9, SARerr | Table 4/C, STAR | 15 km and 12 months | |
Testing data | 1 month ahead | In sample | 1 day ahead | 1 month ahead | In sample |
≤ 2% | 54.69 | 72.65 | 15.1 | 13.3 | |
≤ 5% | 83.39 | 98.02 | 37.4 | 32.2 | 33.7 |
≤ 8% | 91.85 | 99.93 | |||
≤ 10% | 94.42 | 64.8 | 53.3 | 60.9 | |
≤ 15% | 97.38 | 80.9 | 66.9 | 79.3 | |
≤ 20% | 98.66 | 89.3 |
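
The row labels in the comparison above (≤ 2%, ≤ 5%, and so on) suggest that each cell reports the percentage of predictions falling within a given tolerance of the observed value. A minimal sketch of that calculation follows, assuming absolute percentage error as the criterion. The function name `share_within` and the rents are made up for illustration and are not taken from the study.

```python
from typing import Sequence


def share_within(
    actual: Sequence[float], predicted: Sequence[float], tolerance_pct: float
) -> float:
    """Percentage of predictions with absolute percentage error <= tolerance_pct."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    hits = sum(
        abs(p - a) / abs(a) * 100.0 <= tolerance_pct
        for a, p in zip(actual, predicted)
    )
    return 100.0 * hits / len(actual)


# Made-up monthly rents for illustration only; NOT data from the study.
actual_rent = [1000.0, 1200.0, 800.0, 1500.0]
predicted_rent = [1010.0, 1300.0, 900.0, 1460.0]

bands = {
    tol: share_within(actual_rent, predicted_rent, tol)
    for tol in (2, 5, 10, 15, 20)
}

for tol, share in bands.items():
    print(f"<= {tol}%: {share:.1f}% of predictions")
```

If the errors are measured on log-transformed values, as the "Log transformed" heading suggests, the same function can be applied to the logged series. Because logged rents vary far less in relative terms, the resulting shares are not directly comparable with those computed on the original scale.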
Validation
Limitations
Extensions
Conclusions
Big data seek to combine processing power and specialist analytical skills to bring together huge, disparate and often incompatible data sets from different sources. If big data are to be “the next frontier for innovation, competition, and productivity” as the title of the McKinsey report [66] suggests, it would seem important for the real estate industry, and researchers in the sector, to identify areas where the value of harnessing big data outweighs the perceived advantages of keeping data private, and to start exploiting them.