ABSTRACT
This article describes Compus, a visualization system that assists in the exploration and analysis of structured document corpora encoded in XML. Compus was developed for and applied to a corpus of 100 French manuscript letters of the 16th century, transcribed and encoded for scholarly analysis following the recommendations of the Text Encoding Initiative. By providing a synoptic visualization of a corpus and allowing for dynamic queries and structural transformations, Compus helps researchers find regularities or discrepancies, leading to a higher-level analysis of historical sources. Compus can also be used with other richly encoded text corpora.
Index Terms
- Compus: visualization and analysis of structured documents for understanding social life in the 16th century