Abstract
Detecting and representing changes to data is important for active databases, data warehousing, view maintenance, and version and configuration management. Most previous work in change management has dealt with flat-file and relational data; we focus on hierarchically structured data. Since in many cases changes must be computed from old and new versions of the data, we define the hierarchical change detection problem as the problem of finding a "minimum-cost edit script" that transforms one data tree to another, and we present efficient algorithms for computing such an edit script. Our algorithms make use of some key domain characteristics to achieve substantially better performance than previous, general-purpose algorithms. We study the performance of our algorithms both analytically and empirically, and we describe the application of our techniques to hierarchically structured documents.
- ACM95 S. Abiteboul, S. Cluet, and T. Milo. A database interface for file updates. In Proceedings of the A CM SIGMOD International Conference on Management of Data, 1995. Google ScholarDigital Library
- CRGMW95 S. Chawathe, A. Rajaraman, H. Garcia- Molina, and J. Widom. Change detection in hierarchlcally structured information. Available by anonymous ftp from db.stanford.edu in directory pub/chawathe/199S/, 1995.Google Scholar
- GHJ+93 S. Ghandeharizadeh, R. Hull, D. Jacobs, et al. On implementing a language for specifying active database execution models. In Proceedings of the Nineteenth International Conference on Very Large Data Bases, Dublin, Ireland, August 1993. Google ScholarDigital Library
- GM95 A. Gupta and I.S. Mumick. Maintenance of materialized views: Problems, techniques, and appfications. IEEE Data Engineering Bulletin, Special Issue on Materialized Views and Data Warehousing, 18(2):3-18, June 1995.Google Scholar
- HGMW+95 J. Hammer, H. Garcia-Molina, J. Widom, W. Labio, and Y. Zhuge. The Stanford Data Warehousing Project. IEEE Data Engineering Bulletin, Special Issue on Materialized Views and Data Warehousing, 18(2):41-48, June 1995.Google Scholar
- HKG+94 H.C. Howard, A.M. Keller, A. Gupta, K. Krishnamurthy, K.H. Law, P.M. Teicholz, S. Tiwari, and J. Ullman. Versions, configurations, and constraints in CEDB. CIFE Working Paper 31, Center for Integrated Facilities Engineering, Stanford University, April 1994.Google Scholar
- IK93 W.H. Inmon and C. Kelley. Rdb/VMS: Developing the Data Warehouse. QED Publishing Group, Boston, Massachussetts, 1993. Google ScholarDigital Library
- Kif95 M. Kifer. EDIFF-a comprehensive interface to diff for Emacs 19. Available through anonymous ftp at ftp. ca. sunysb, edu, 1995.Google Scholar
- LGM95 W. Labio and H. Garcia-Molina. Efficient algorithms to compare snapshots. Manuscript, available by anonymous ftp from db.stanford.edu in pub/labio/199S/, 1995.Google Scholar
- Mye86 E. Myers. An O(ND) difference algorithm and its variations. Algorithmica, 1(2):251-266, 1986.Google ScholarCross Ref
- PGMW95 Y. Papakonstantinou, H. Garcia-Molina, and J. Widom. Object exchange across heterogeneous information sources. In Proceedings of the Eleventh International Conference on Data Engineering, pages 251-260, Taipei, Taiwan, March 1995. Google ScholarDigital Library
- SZ90 D. Shasha and K. Zhang. Fast algorithms for the unit cost editing distance between trees. Journal of Algorithms, 11:581-621, 1990. Google ScholarDigital Library
- WC96 J. Widom and S. Ceri. Active Database Systems: Triggers and Rules for Advanced Database Processing. Morgan Kaufmann, San Francisco, California, 1996. Google ScholarDigital Library
- ZGMHW95 Y. Zhuge, H. Garcia-Molina, 3. Hammer, and J. Widom. View maintenance in a warehousing environment. In Proceedings of the A CM SIGMOD International Conference on Management of Data, pages 316- 327, San Jose, California, May 1995. Google ScholarDigital Library
- Zha95 K. Zhang. Personal communication, May 1995.Google Scholar
- ZS89 K. Zhang and D. Shasha. Simple fast algorithms for the editing distance between trees and related problems. SIAM Journal of Computing, 18(6):1245-1262, 1989. Google ScholarDigital Library
Index Terms
- Change detection in hierarchically structured information
Recommendations
Change detection in hierarchically structured information
SIGMOD '96: Proceedings of the 1996 ACM SIGMOD international conference on Management of dataDetecting and representing changes to data is important for active databases, data warehousing, view maintenance, and version and configuration management. Most previous work in change management has dealt with flat-file and relational data; we focus on ...
Managing Hierarchically Structured DTN-Like Networks
Network management in Delay/Disruption Tolerant Networks (DTNs) is an active research topic and covers topics such as system architecture, roles of actors, and management protocol. The existing solutions either expect a flat management hierarchy ...
Probabilistic reasoning with hierarchically structured variables
IJCAI'05: Proceedings of the 19th international joint conference on Artificial intelligenceMany practical problems have random variables with a large number of values that can be hierarchically structured into an abstraction tree of classes. This paper considers how to represent and exploit hierarchical structure in probabilistic reasoning. ...
Comments