ABSTRACT
We describe the design and use of UpLib, a personal digital library system. The system consists of a full-text indexed repository accessed through an active agent via a Web interface. It is suitable for personal collections comprising tens of thousands of documents (including papers, books, photos, receipts, and email), and provides for ease of document entry and access as well as high levels of security and privacy. Unlike many other systems of this kind, user access to the document collection is assured even if the UpLib system is unavailable. UpLib is "universal" in the sense that documents are canonically represented as projections into the text and image domains, and it uses a predominantly visual user interface based on page images. UpLib can thus handle any document format that can be rendered as pages. Provision is made for alternative representations to exist alongside the text-domain and image-domain representations, either stored or generated on demand. The system is highly extensible through user scripting and is intended to be used as a platform for further work in document engineering. UpLib is assembled largely from open-source components (the current exception being the OCR engine, which is proprietary).
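The idea of representing each document as paired text and image projections, indexed for full-text search, might be sketched roughly as follows. This is a minimal illustration only, not UpLib's actual data model or API: the names `Document`, `Repository`, `page_images`, `page_texts`, and `alternates` are all hypothetical, and the toy in-memory inverted index merely stands in for whatever indexing engine the real system uses.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical document record (not UpLib's real schema)."""
    doc_id: str
    page_images: list[bytes]  # image-domain projection: one rendered image per page
    page_texts: list[str]     # text-domain projection: one text string per page
    # Optional alternative representations kept alongside the two projections,
    # keyed by an assumed format label such as "pdf".
    alternates: dict[str, bytes] = field(default_factory=dict)


class Repository:
    """Toy in-memory repository with a word-level inverted index."""

    def __init__(self) -> None:
        self.docs: dict[str, Document] = {}
        # Maps a lowercased word to the (doc_id, page_number) pairs containing it.
        self.index: dict[str, set[tuple[str, int]]] = defaultdict(set)

    def add(self, doc: Document) -> None:
        # Every page must exist in both domains for the document to be accepted.
        if len(doc.page_images) != len(doc.page_texts):
            raise ValueError("each page needs both an image and a text projection")
        self.docs[doc.doc_id] = doc
        for page_no, text in enumerate(doc.page_texts):
            for word in text.lower().split():
                self.index[word].add((doc.doc_id, page_no))

    def search(self, word: str) -> list[tuple[str, int]]:
        # Case-insensitive single-word lookup; results are sorted for stable output.
        return sorted(self.index.get(word.lower(), set()))
```

In this sketch a search hit identifies a page, so a visual interface could go straight to the matching entry in `page_images`. Any format that can be rendered to pages fits the same record, which is the sense of "universal" described above.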
Index Terms
- UpLib: a universal personal digital library system