Despite legislative attempts to curtail financial statement fraud, it continues unabated. This study makes a renewed attempt to aid detection of this misconduct by applying linguistic analysis and data mining to the narrative sections of annual reports/10-K forms. In contrast to similar research, this paper extracts three distinct sets of features from a newly constructed corpus of narratives (408 annual reports/10-Ks, 6.5 million words) from fraud and non-fraud firms. Each of the three feature sets is put separately through a suite of classification algorithms to determine classifier performance on this binary fraud/non-fraud discrimination task. The results clearly indicate that the language deployed by management engaged in wilful falsification of firm performance is discernibly different from that of truth-tellers. For the first time, this interdisciplinary research extracts readability features at a much deeper level, attempts to draw out collocations using n-grams, and measures tone using appropriate financial dictionaries. This approach to fraud detection, which combines linguistic analysis with machine learning-driven data mining, could be used by auditors in assessing firms' financial reporting and in the early detection of possible misdemeanours.
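To make the three kinds of features named above more concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a toy tokenizer, uses mean sentence length as a crude stand-in for the deeper readability measures, counts bigrams as a simple n-gram proxy for collocations, and scores tone against two tiny hypothetical word lists (`POSITIVE_WORDS`, `NEGATIVE_WORDS`) standing in for a real financial dictionary. All function and variable names are illustrative assumptions.

```python
import re
from collections import Counter

# Hypothetical mini word lists for illustration only; a real study would
# load a full financial sentiment dictionary instead.
POSITIVE_WORDS = {"growth", "improved", "strong", "gain"}
NEGATIVE_WORDS = {"loss", "decline", "impairment", "litigation"}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bigram_counts(tokens):
    """Count adjacent word pairs (2-grams) as a simple collocation proxy."""
    return Counter(zip(tokens, tokens[1:]))


def tone_score(tokens):
    """Return (positive - negative) word count divided by total tokens."""
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE_WORDS for t in tokens)
    neg = sum(t in NEGATIVE_WORDS for t in tokens)
    return (pos - neg) / len(tokens)


def mean_sentence_length(text):
    """Average number of word tokens per sentence (crude readability proxy)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)


def extract_features(text):
    """Bundle the three illustrative feature groups for one narrative."""
    tokens = tokenize(text)
    return {
        "mean_sentence_length": mean_sentence_length(text),
        "tone": tone_score(tokens),
        "bigrams": bigram_counts(tokens),
    }


if __name__ == "__main__":
    sample = "Revenue growth was strong. Net loss widened."
    print(extract_features(sample))
```

In a setting like the one described, a feature dictionary of this kind would be computed for each report narrative and each feature group passed separately to a classifier for the fraud/non-fraud task.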
- From Spin to Swindle: Identifying Falsification in Financial Text
- Springer US