Published in: Review of Accounting Studies 2/2021

02-10-2020

Using machine learning to detect misstatements

Authors: Jeremy Bertomeu, Edwige Cheynel, Eric Floyd, Wenqiang Pan


Abstract

Machine learning offers empirical methods to sift through accounting datasets with a large number of variables and limited a priori knowledge about functional forms. In this study, we show that these methods help detect and interpret patterns present in ongoing accounting misstatements. We use a wide set of variables from accounting, capital markets, governance, and auditing datasets to detect material misstatements. A primary insight of our analysis is that accounting variables, while they do not detect misstatements well on their own, become important with suitable interactions with audit and market variables. We also analyze differences between misstatements and irregularities, compare algorithms, examine one-year- and two-year-ahead predictions and interpret groups at greater risk of misstatements.

Footnotes
1
Another possible solution is to fit the model using a rolling window or exclude observations from firms used to build the model. However, both of these choices severely restrict the sample effectively used for cross-validation.
 
2
In a previous version of the manuscript, we focused on all restatements, including restatements not reported in 8-K; however, many of these restatements need not include large events. We thank Andy Imdieke for this suggestion.
 
3
Audit Analytics provides the restated amount for each year only for the five most recent years impacted by the restatement. However, firms’ restatements can often impact more than five years of financial data. The impact on accounting numbers prior to the most recent five years is usually reported as a cumulative charge to retained earnings, and, in practice, firms need not retrospectively adjust all prior years. To account for this, we assume that the cumulative effect to retained earnings is distributed evenly across the misstatement span identified in the restatement filing. If the span is missing, we allocate the unexplained cumulative change to the year prior to the last year with an income effect.
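As a rough illustration of the even allocation described in this footnote, the sketch below spreads a cumulative retained-earnings effect equally across the years of a misstatement span. The function name, its arguments, and the example figures are hypothetical and not taken from the paper; the fallback for a missing span is not covered.

```python
def allocate_cumulative_effect(cumulative_effect, span_years):
    """Spread a cumulative retained-earnings effect evenly over span_years.

    Hypothetical helper: span_years is assumed to be the list of fiscal
    years in the misstatement span that have no year-specific restated amount.
    """
    if not span_years:
        raise ValueError("span_years must contain at least one year")
    per_year = cumulative_effect / len(span_years)
    return {year: per_year for year in span_years}


# Illustrative numbers only (not from the paper).
allocation = allocate_cumulative_effect(-12.0, [2001, 2002, 2003])
```

With these illustrative inputs, each of the three years receives one third of the cumulative charge.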
 
4
RUSBoost relies on random undersampling: balanced samples are constructed by randomly drawing from the sample. With a heavily imbalanced dataset, however, nonrandom undersampling may perform better than random undersampling. In untabulated results, we used the sampling method of Perols et al. (2016), but it did not perform better than RUSBoost in our dataset. Under this alternate sampling method, the AUC is 69.4%, the detection rate of restatements is 60.0%, and the detection rate of AAERs is 81.1%. This method ranks better than logistic models but slightly worse than GBRT and random undersampling in Tables 10 and 15. An important difference is that there are more material misstatements than AAERs, so the benefits of nonrandom sampling to alleviate imbalance are more muted.
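The random undersampling step mentioned in this footnote can be sketched as follows. This is a minimal illustration of the sampling step only, not of the full RUSBoost algorithm, which repeats such draws within a boosting procedure. The function name and the toy labels are assumptions made for the example.

```python
import random


def random_undersample(labels, seed=0):
    """Return sorted indices of a balanced sample.

    Keeps minority-class observations (label 1) and draws, without
    replacement, an equal number of majority-class observations (label 0).
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    k = min(len(minority), len(majority))
    # Draw k observations from each class so the result is balanced.
    return sorted(rng.sample(minority, k) + rng.sample(majority, k))


# Toy data: 5 misstatement observations out of 100 (illustrative only).
labels = [1] * 5 + [0] * 95
balanced_idx = random_undersample(labels)
```

The balanced sample here contains all five positive observations and five randomly chosen negative ones.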
 
5
We report the summary statistics of the important predictors in Table 6.
 
6
As in any multivariate descriptive analysis with multiple correlated variables, interpretation requires some caution since the method may select one variable over another for reasons that relate primarily to the fitting procedure. Later on, we list the set of important variables in other methods and observe that, while many variables are common to multiple algorithms, there are also some differences.
 
7
For variables combining market and accounting information, such as book-to-market and earnings-to-price, we allocate half weight to each category.
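One way to read the half-weight rule is as a split of each variable's importance score across the categories it belongs to. The sketch below assumes the weights are used to aggregate per-variable importance by category; the variable names, category labels, and scores are hypothetical.

```python
from collections import defaultdict


def importance_by_category(importances, categories):
    """Sum variable importance per category.

    A variable listed under several categories (e.g. book-to-market under
    both accounting and market) contributes an equal share to each.
    """
    totals = defaultdict(float)
    for variable, score in importances.items():
        variable_categories = categories[variable]
        share = score / len(variable_categories)
        for category in variable_categories:
            totals[category] += share
    return dict(totals)


# Hypothetical importance scores and category assignments.
importances = {"book_to_market": 0.2, "leverage": 0.5, "short_interest": 0.3}
categories = {
    "book_to_market": ["accounting", "market"],
    "leverage": ["accounting"],
    "short_interest": ["market"],
}
totals = importance_by_category(importances, categories)
```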
 
8
The theoretical model of Bertomeu and Marinovic (2015) also predicts this relation, as firms that endogenously retain more soft assets tend to be more credible.
 
9
We only document the results with the backward logistic model because the forward logistic and simple logistic models exhibit the same results. Backward and forward logistic models are much more sparse; that is, they use fewer variables than GBRT and simple logistic models. However, they do not appear to perform better than a simple logistic model. This finding suggests complex interactions in the entire population of potential predictors capture misstatements.
 
10
In Table 10, we report bootstrapped standard errors, obtained by retraining and testing the model 200 times on randomly drawn datasets. Differences between the performance of most models tend to be greater than two standard errors, indicating these differences are significant. In untabulated analyses, we bootstrap differences between model performance and confirm that these differences are significantly different from zero at conventional levels.
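The general idea of a bootstrapped standard error can be sketched as below. This is a simplified illustration: the paper retrains and tests the model on each randomly drawn dataset, whereas the sketch replaces that step with a generic `statistic` callable applied to each resample. The function name, the toy outcomes, and the choice of statistic are assumptions.

```python
import random
import statistics


def bootstrap_se(data, statistic, n_boot=200, seed=0):
    """Estimate the standard error of `statistic` by resampling.

    Draws n_boot samples of len(data) observations with replacement and
    returns the standard deviation of the statistic across those samples.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistic(resample))
    return statistics.stdev(estimates)


# Toy outcomes: 1 = misstatement detected, 0 = missed (illustrative only).
outcomes = [1] * 30 + [0] * 70
se = bootstrap_se(outcomes, lambda xs: sum(xs) / len(xs))
```

A difference between two models that exceeds roughly two such standard errors would be read as significant in the sense used in this footnote.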
 
11
In untabulated results, we also estimate the model by separating the restatement sample into positive and negative period income effects, under the conjecture that positive effects may reflect reversals or incentives to influence the stock price downward. See Kasznik (1999) for an extensive continuing literature. We divide restatements into three categories: negative income effects (overstatement), zero income effects, and positive income effects (understatement). We then build three models and predict the probability of overstatement, understatement, and a zero income effect separately. We do not find any notable improvement to predictive power in the test sample, likely because these alternative methods reduce the size of the dataset used to estimate the model.
 
12
In Panel B of Table 12, we obtain similar results after excluding firms with restatements in the training sample. Machine learning algorithms continue to outperform the logistic model but have lower catch rates.
 
13
In untabulated analyses, we compute the number of misstatements caught at least a year before the AAERs. The 29 misstatements caught by GBRT in the test sample relate to 20 AAER filings, and all of them are detected at least a year (often more than a year) before the AAER is filed.
 
14
We still estimate the models as in panel A using the entire population of misstatements and AAERs. One alternative would have been to estimate a model using only AAER-misstatement pairs as irregularities. However, the number of observations here becomes too small to build a model with reasonable out-of-sample performance.
 
15
In untabulated results, we find very low predictive ability when we predict the first misstatement year.
 
16
InTrees can imply redundant conditions if an inequality is repeated twice or is a subset of another inequality. In these cases, we only report the stricter condition.
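The pruning of redundant conditions described in this footnote can be sketched as follows. This is an illustrative helper, not the inTrees implementation: rules are represented as hypothetical (variable, operator, threshold) tuples, only the operators "<=" and ">" are handled, and the thresholds are made up for the example.

```python
def simplify_conditions(conditions):
    """Keep only the strictest threshold per (variable, operator) pair.

    For "<=" the strictest condition has the smallest threshold;
    for ">" it has the largest threshold.
    """
    strictest = {}
    for variable, op, threshold in conditions:
        if op not in ("<=", ">"):
            raise ValueError(f"unsupported operator: {op}")
        key = (variable, op)
        if key not in strictest:
            strictest[key] = threshold
        elif op == "<=":
            strictest[key] = min(strictest[key], threshold)
        else:
            strictest[key] = max(strictest[key], threshold)
    return [(var, op, thr) for (var, op), thr in strictest.items()]


# Hypothetical rule with repeated and nested inequalities.
rule = [("EP", "<=", 0.5), ("Soft", ">", 0.2), ("EP", "<=", 0.3), ("Soft", ">", 0.4)]
simplified = simplify_conditions(rule)
```

Here `EP <= 0.5` is implied by `EP <= 0.3`, and `Soft > 0.2` by `Soft > 0.4`, so only the stricter condition of each pair is reported.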
 
17
This result coincides with the output of the Stata package Boost, run with the command: boost Res EP Soft, distribution(logistic) train(1) bag(1) interaction(2) maxiter(1) shrink(1) predict(pred)
 
Literature
Abbasi, A., Albrecht, C., Vance, A., & Hansen, J. (2012). Metafraud: a meta-learning framework for detecting financial fraud. MIS Quarterly, 36(4), 1293–1327.
Avramov, D., Chordia, T., Jostova, G., & Philipov, A. (2009). Credit ratings and the cross-section of stock returns. Journal of Financial Markets, 12(3), 469–499.
Bao, Y., Ke, B., Li, B., Julia Yu, Y., & Zhang, J. (2020). Detecting accounting fraud in publicly traded US firms using a machine learning approach. Journal of Accounting Research, 58(1), 199–235.
Barton, J., & Simko, P.J. (2002). The balance sheet as an earnings management constraint. The Accounting Review, 77(s-1), 1–27.
Beneish, M.D. (1999). The detection of earnings manipulation. Financial Analysts Journal, 55(5), 24–36.
Bertomeu, J., & Marinovic, I. (2015). A theory of hard and soft information. The Accounting Review, 91(1), 1–20.
Blackburne, T., Kepler, J., Quinn, P., & Taylor, D. (2020). Undisclosed SEC investigations. Forthcoming in Management Science.
Cheffers, M., Whalen, D., & Usvyatsky, O. (2010). 2009 financial restatements: A nine year comparison. Audit Analytics Sales (February).
Cheynel, E., & Levine, C. (2020). Public disclosures and information asymmetry: A theory of the mosaic. The Accounting Review, 95(1), 79–99.
Dechow, P.M., & Dichev, I.D. (2002). The quality of accruals and earnings: The role of accrual estimation errors. The Accounting Review, 77(s-1), 35–59.
Dechow, P.M., Ge, W., Larson, C.R., & Sloan, R.G. (2011). Predicting material accounting misstatements. Contemporary Accounting Research, 28(1), 17–82.
DeFond, M.L., Raghunandan, K., & Subramanyam, K.R. (2002). Do non–audit service fees impair auditor independence? Evidence from going concern audit opinions. Journal of Accounting Research, 40(4), 1247–1274.
Deng, H. (2018). Interpreting tree ensembles with inTrees. International Journal of Data Science and Analytics, pp 1–11.
Ding, K., Lev, B., Peng, X., Sun, T., & Vasarhelyi, M.A. (2020). Machine learning improves accounting estimates. Review of Accounting Studies, pp 1–37.
Dutta, I., Dutta, S., & Raahemi, B. (2017). Detecting financial restatements using data mining techniques. Expert Systems with Applications, 90, 374–393.
Ettredge, M.L., Sun, L., Lee, P., & Anandarajan, A.A. (2008). Is earnings fraud associated with high deferred tax and/or book minus tax levels? Auditing: A Journal of Practice & Theory, 27(1), 1–33.
Fanning, K.M., & Cogger, K.O. (1998). Neural network detection of management fraud using published financial data. Intelligent Systems in Accounting, Finance & Management, 7(1), 21–41.
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, 27(8), 861–874.
Frankel, R.M., Johnson, M.F., & Nelson, K.K. (2002). The relation between auditors’ fees for nonaudit services and earnings management. The Accounting Review, 77(s-1), 71–105.
Friedman, J.H. (2001). Greedy function approximation: a gradient boosting machine. Annals of Statistics, pp 1189–1232.
Friedman, J., Hastie, T., & Tibshirani, R. (2001). The Elements of Statistical Learning Vol. 1. New York: Springer series in statistics.
Garfinkel, J.A. (2009). Measuring investors’ opinion divergence. Journal of Accounting Research, 47(5), 1317–1348.
Glosten, L.R., & Milgrom, P.R. (1985). Bid, ask and transaction prices in a specialist market with heterogeneously informed traders. Journal of Financial Economics, 14(1), 71–100.
Green, B.P., & Choi, J.H. (1997). Assessing the risk of management fraud through neural network technology. Auditing: A Journal of Practice & Theory, 16, 14–28.
Guelman, L. (2012). Gradient boosting trees for auto insurance loss cost modeling and prediction. Expert Systems with Applications, 39(3), 3659–3667.
Gupta, R., & Gill, N.S. (2012). A solution for preventing fraudulent financial reporting using descriptive data mining techniques. International Journal of Computer Applications.
Hribar, P., Kravet, T., & Wilson, R. (2014). A new measure of accounting quality. Review of Accounting Studies, 19(1), 506–538.
Johnson, V.E., Khurana, I.K., & Kenneth Reynolds, J. (2002). Audit-firm tenure and the quality of financial reports. Contemporary Accounting Research, 19(4), 637–660.
Kasznik, R. (1999). On the association between voluntary disclosure and earnings management. Journal of Accounting Research, 37(1), 57–81.
Kim, Y.J., Baik, B., & Cho, S. (2016). Detecting financial misstatements with fraud intention using multi-class cost-sensitive learning. Expert Systems with Applications, 62, 32–43.
Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J., & Mullainathan, S. (2017). Human decisions and machine predictions. The Quarterly Journal of Economics, 133(1), 237–293.
Kornish, L.J., & Levine, C.B. (2004). Discipline with common agency: The case of audit and nonaudit services. The Accounting Review, 79(1), 173–200.
Larcker, D.F., Richardson, S.A., & Tuna, I. (2007). Corporate governance, accounting outcomes, and organizational performance. The Accounting Review, 82(4), 963–1008.
Laux, V., & Newman, P.D. (2010). Auditor liability and client acceptance decisions. The Accounting Review, 85(1), 261–285.
Lin, J.W., Hwang, M.I., & Becker, J.D. (2003). A fuzzy neural network for assessing the risk of fraudulent financial reporting. Managerial Auditing Journal, 18(8), 657–665.
Lobo, G.J., & Zhao, Y. (2013). Relation between audit effort and financial report misstatements: Evidence from quarterly and annual restatements. The Accounting Review, 88(4), 1385–1412.
Perols, J. (2011). Financial statement fraud detection: An analysis of statistical and machine learning algorithms. Auditing: A Journal of Practice & Theory, 30(2), 19–50.
Perols, J.L., Bowen, R.M., Zimmermann, C., & Samba, B. (2016). Finding needles in a haystack: Using data analytics to improve fraud prediction. The Accounting Review, 92(2), 221–245.
Ragothaman, S., & Lavin, A. (2008). Restatements due to improper revenue recognition: a neural networks perspective. Journal of Emerging Technologies in Accounting, 5(1), 129–142.
Romanus, R.N., Maher, J.J., & Fleming, D.M. (2008). Auditor industry specialization, auditor changes, and accounting restatements. Accounting Horizons, 22(4), 389–413.
Samuels, D., Taylor, D.J., & Verrecchia, R.E. (2018). Financial misreporting: Hiding in the shadows or in plain sight?
Rijsbergen, V., & Joost, C. (2004). The geometry of information retrieval. Cambridge University Press.
Whiting, D.G., Hansen, J.V., McDonald, J.B., Albrecht, C., & Steve Albrecht, W. (2012). Machine learning methods for detecting patterns of management fraud. Computational Intelligence, 28(4), 505–527.
Zhang, Y., & Haghani, A. (2015). A gradient boosting method to improve travel time prediction. Transportation Research Part C: Emerging Technologies, 58, 308–324.
Metadata
Title
Using machine learning to detect misstatements
Authors
Jeremy Bertomeu
Edwige Cheynel
Eric Floyd
Wenqiang Pan
Publication date
02-10-2020
Publisher
Springer US
Published in
Review of Accounting Studies / Issue 2/2021
Print ISSN: 1380-6653
Electronic ISSN: 1573-7136
DOI
https://doi.org/10.1007/s11142-020-09563-8
