Abstract
Big data analytics has had a tremendous impact on modern politics. Election forecasting is a notable example: it draws on large-scale heterogeneous data sources, such as polls, surveys, and social media popularity, to build prediction models using machine learning and artificial intelligence. In this article, we present a novel machine learning-based election forecasting model that predicted Pakistan’s 2018 General Election with the highest accuracy and won a nationwide competition. To capture the winning probability of individual candidates in a constituency, the model tapped an array of statistics from different data sources. Past election data were mined for the demographic trends of each party across districts, while Twitter and approval polls were used to capture current popularity levels. Using Bayesian optimization, the model combined the probabilities from the different sources, ‘rigging’ the results as a win for ten seats where the competition was expected to be one-sided. In contrast to existing models that predict only the aggregate vote share of political parties at the national level, our model also predicted the winning candidate for every National Assembly seat. The seat share of political parties in the National Assembly was predicted with 83% accuracy. In 230 of the 270 constituencies, the winner was among the top two candidates predicted by the proposed technique. Our model produced the most accurate results compared with all the opinion polls and surveys held in the country before the 2018 election. We show that big data tools and techniques, coupled with the right mixture of machine learning and artificial intelligence models, can have a significant impact on the modern political landscape.
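The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate names, the probabilities, and the weights are all hypothetical, the weights are fixed by hand here rather than tuned with Bayesian optimization, and "rigging" is read as pinning the outcome of a seat expected to be one-sided, which is only one interpretation of the abstract.

```python
from typing import Dict, List

Probs = Dict[str, float]  # candidate -> win probability from one data source


def combine_sources(sources: List[Probs], weights: List[float]) -> Probs:
    """Weighted average of per-source win probabilities, renormalized to sum to 1."""
    candidates = set().union(*sources)
    total_weight = sum(weights)
    mixed = {
        c: sum(w * s.get(c, 0.0) for s, w in zip(sources, weights)) / total_weight
        for c in candidates
    }
    norm = sum(mixed.values())
    return {c: p / norm for c, p in mixed.items()}


def rig(probs: Probs, winner: str) -> Probs:
    """Pin a seat expected to be one-sided: the named candidate gets probability 1."""
    return {c: (1.0 if c == winner else 0.0) for c in probs}


def top_two(probs: Probs) -> List[str]:
    """Return the two candidates with the highest combined probability."""
    return sorted(probs, key=probs.get, reverse=True)[:2]


# Hypothetical inputs for a single constituency (not data from the paper).
past_elections = {"A": 0.5, "B": 0.3, "C": 0.2}
twitter = {"A": 0.2, "B": 0.6, "C": 0.2}
approval_polls = {"A": 0.4, "B": 0.4, "C": 0.2}

# Hypothetical weights; in the paper these would come from Bayesian optimization.
weights = [0.5, 0.2, 0.3]

combined = combine_sources([past_elections, twitter, approval_polls], weights)
ranking = top_two(combined)
pinned = rig(combined, "B")
```

Reporting the top two candidates per seat, as `top_two` does, mirrors the abstract's evaluation, which counts how often the actual winner was among the top two predicted candidates.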
Acknowledgements
We are very thankful to Muhammad Asim, Iqra Akram and Fahad Shamshad for their insightful discussions and support in data collection.
Additional information
Code and supplementary material are available at https://awaisrauf.github.io/election_prediction.
About this article
Cite this article
Awais, M., Hassan, SU. & Ahmed, A. Leveraging big data for politics: predicting general election of Pakistan using a novel rigged model. J Ambient Intell Human Comput 12, 4305–4313 (2021). https://doi.org/10.1007/s12652-019-01378-z
DOI: https://doi.org/10.1007/s12652-019-01378-z