Documents are seldom created in a vacuum. Every work is shaped by some outside influence, whether cited documents, collaborators, or other documents the author has read. This influence leaves traces in the text, where it acts as a latent variable. This chapter demonstrates a novel method for quantifying these influences and representing them in a semantically understandable fashion. The model represents documents as tensors, decomposes them into sets of factors, and then searches the factors of the corpus for similar ones.
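The three steps named in the abstract (tensor representation, decomposition into factors, similarity search) can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices here are assumptions made only for illustration: each document becomes a sparse word-trigram count tensor over a shared vocabulary, the non-negative factors come from standard multiplicative updates, the rank is fixed at 2, and factors are compared by cosine similarity. All function names (`build_vocab`, `trigram_tensor`, `decompose`, `best_matches`) and the toy texts are hypothetical.

```python
import math
import random
from collections import Counter


def build_vocab(texts):
    """Map every distinct word in the corpus to a shared integer index."""
    words = sorted({w for t in texts for w in t.lower().split()})
    return {w: i for i, w in enumerate(words)}


def trigram_tensor(text, vocab):
    """Sparse 3-way tensor: key (i, j, k) holds the count of that word trigram."""
    ids = [vocab[w] for w in text.lower().split()]
    return Counter(zip(ids, ids[1:], ids[2:]))


def gram(matrix, rank):
    """Gram matrix M^T M of a factor matrix stored as a list of rows."""
    return [[sum(row[r] * row[s] for row in matrix) for s in range(rank)]
            for r in range(rank)]


def update(mode, tensor, factors, rank, eps=1e-12):
    """One multiplicative update of factors[mode]; the other two stay fixed."""
    m1, m2 = [m for m in range(3) if m != mode]
    target = factors[mode]
    # Numerator: the tensor contracted with the two fixed factor matrices.
    num = [[0.0] * rank for _ in target]
    for idx, count in tensor.items():
        for r in range(rank):
            num[idx[mode]][r] += count * factors[m1][idx[m1]][r] * factors[m2][idx[m2]][r]
    # Denominator: current rows times the elementwise product of the Gram matrices.
    g1, g2 = gram(factors[m1], rank), gram(factors[m2], rank)
    for i, row in enumerate(target):
        den = [sum(row[s] * g1[s][r] * g2[s][r] for s in range(rank)) for r in range(rank)]
        target[i] = [row[r] * num[i][r] / (den[r] + eps) for r in range(rank)]


def decompose(tensor, size, rank, iters=200, seed=0):
    """Non-negative CP decomposition; positive random start keeps entries >= 0."""
    rng = random.Random(seed)
    factors = [[[rng.random() + 0.1 for _ in range(rank)] for _ in range(size)]
               for _ in range(3)]
    for _ in range(iters):
        for mode in range(3):
            update(mode, tensor, factors, rank)
    return factors


def unit(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else v


def factor_vectors(factors, rank):
    """Flatten each rank-1 component into one vector of its three unit columns."""
    return [[x for m in factors for x in unit([row[r] for row in m])]
            for r in range(rank)]


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def best_matches(target_vecs, corpus_vecs):
    """For each target factor, find the closest factor among the source documents."""
    return [max(((name, cosine(t, s)) for name, vecs in corpus_vecs.items() for s in vecs),
                key=lambda pair: pair[1])
            for t in target_vecs]


# Toy corpus (made up for this sketch): two candidate sources and one target.
texts = {
    "source_a": "the quick brown fox jumps over the lazy dog",
    "source_b": "tensor methods factor large sparse data sets quickly",
    "target": "a quick brown fox jumps over a sleeping cat",
}
RANK = 2
vocab = build_vocab(texts.values())
vectors = {name: factor_vectors(decompose(trigram_tensor(t, vocab), len(vocab), RANK), RANK)
           for name, t in texts.items()}
target_vecs = vectors.pop("target")
matches = best_matches(target_vecs, vectors)
for r, (name, sim) in enumerate(matches):
    print(f"target factor {r}: closest to {name} (cosine {sim:.3f})")
```

Because every document is indexed against the same vocabulary, factor vectors from different documents share coordinates and can be compared directly. Normalizing each column before comparison removes the arbitrary scaling that a CP decomposition leaves between the three modes of a component.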
- Title: Using Non-negative Tensor Decomposition for Unsupervised Textual Influence Modeling
- Authors: Robert E. Lowe, Michael W. Berry
- Publisher: Springer International Publishing
- Chapter number: 4