Electronic document addressing: dealing with change

Published: 01 September 2000
Abstract

The management of electronic document collections is fundamentally different from the management of paper documents. The ephemeral nature of some electronic documents means that the document address (i.e., reference details of the document) can become incorrect some time after coming into use, resulting in references, such as index entries and hypertext links, failing to correctly address the document they describe. A classic case of invalidated references is on the World Wide Web—links that point to a named resource fail when the domain name, file name, or any other aspect of the addressed resource is changed, resulting in the well-known Error 404. Additionally, there are other errors which arise from changes to document collections.

This paper surveys the strategies used both in World Wide Web software and in other hypertext systems for managing the integrity of references and hence the integrity of links. Some strategies are preventative, not permitting errors to occur; others are corrective, discovering reference errors and sometimes attempting to correct them; the last strategy is adaptive, because references are calculated on a just-in-time basis, according to the current state of the document collection.
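As a rough illustration of the difference between a corrective and an adaptive strategy, the following minimal sketch models a document collection in memory. It is not taken from the paper: the `Collection` class, the `find_broken` and `resolve_adaptive` helpers, and the example addresses are all hypothetical names chosen for this example. The corrective check detects a stored reference that has gone stale after a rename, while the adaptive lookup recomputes targets from the current contents and so is unaffected by the rename.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Collection:
    """Hypothetical in-memory document collection mapping address -> text."""
    docs: Dict[str, str] = field(default_factory=dict)

    def rename(self, old: str, new: str) -> None:
        # Changing an address is the kind of change that invalidates stored references.
        self.docs[new] = self.docs.pop(old)


def find_broken(collection: Collection, links: Dict[str, str]) -> List[str]:
    """Corrective check: names of links whose stored address no longer exists."""
    return sorted(name for name, addr in links.items() if addr not in collection.docs)


def resolve_adaptive(collection: Collection, phrase: str) -> List[str]:
    """Adaptive lookup: compute targets just in time from the current contents."""
    return sorted(addr for addr, text in collection.docs.items() if phrase in text)


collection = Collection({
    "/a.html": "link integrity survey",
    "/b.html": "unrelated notes",
})
links = {"survey": "/a.html"}  # a stored (static) reference

collection.rename("/a.html", "/moved/a.html")

broken = find_broken(collection, links)                    # stored address is now stale
targets = resolve_adaptive(collection, "link integrity")  # recomputed from current state
```

In this toy setting the stored link plays the role of a URL that would now return Error 404, whereas the computed link still finds the moved document. A preventative strategy would instead refuse or intercept the rename so that the stored address never becomes invalid.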

      Reviews

      Claudiu Popescu

      This article analyzes the problem of the integrity of electronic documents, in particular, of Web sites. The main problem is that hyperlinks are frequently changed, producing the well-known Error 404. Based on the stunning fact that the average life of a WWW document is only 50 days, the article shows that solutions must be found. Link integrity is identified as the main problem. Eleven solutions for link integrity are then discussed. Each solution has its scope and degree of customer satisfaction. Each solution is presented either through examples of systems that implement it or through references to papers where it is discussed in depth. The author has tested each solution, mentioning the advantages and drawbacks. I find the topic of this paper, a very critical aspect of today's computer usage, very interesting. The paper gives a broad and interesting description of problems and solutions. It also has many good references.

      • Published in

        ACM Computing Surveys  Volume 32, Issue 3
        Sept. 2000
        135 pages
        ISSN:0360-0300
        EISSN:1557-7341
        DOI:10.1145/367701
        Copyright © 2000 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States
