ABSTRACT
The Strudel system applies concepts from database management systems to the process of building Web sites. Strudel's key idea is separating the management of the site's data, the creation and management of the site's structure, and the visual presentation of the site's pages. First, the site builder creates a uniform model of all data available at the site. Second, the builder uses this model to declaratively define the Web site's structure by applying a “site-definition query” to the underlying data. The result of evaluating this query is a “site graph”, which represents both the site's content and structure. Third, the builder specifies the visual presentation of pages in Strudel's HTML-template language. The data model underlying Strudel is a semi-structured model of labeled directed graphs.
We describe Strudel's key characteristics, report on our experience constructing several Web sites with it, and present the technical problems that arose from that experience. We also discuss the impact of potential users' requirements on Strudel's design. We address two main questions: (1) when does a declarative specification of site structure provide significant benefits, and (2) what are the main advantages provided by the semi-structured data model?
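The three-way separation described in the abstract (data, structure, presentation) can be illustrated with a small Python sketch. This is not Strudel's query language or its HTML-template language: the sample data, the function names `attributes`, `define_site`, and `render`, and the page-naming scheme are all hypothetical, chosen only to show how a labeled directed graph might be turned into a site graph and then into pages.

```python
from collections import defaultdict
from string import Template

# Step 1: a uniform data model, here a labeled directed graph stored as
# (source, label, target) edges. The data is invented for illustration.
DATA_GRAPH = [
    ("pub1", "title", "Paper A"),
    ("pub1", "year", "1997"),
    ("pub2", "title", "Paper B"),
    ("pub2", "year", "1998"),
    ("pub3", "title", "Paper C"),
    ("pub3", "year", "1997"),
]


def attributes(edges):
    """Group edges by source node: node -> {label: target}."""
    nodes = defaultdict(dict)
    for source, label, target in edges:
        nodes[source][label] = target
    return nodes


def define_site(edges):
    """Step 2: a stand-in for a site-definition query.

    Builds a site graph (page -> list of (label, target) edges) with a
    root page linking to one page per year, each listing that year's titles.
    """
    site = defaultdict(list)
    for _node, attrs in sorted(attributes(edges).items()):
        year_page = "year/" + attrs["year"]
        link = ("YearPage", year_page)
        if link not in site["root"]:
            site["root"].append(link)
        site[year_page].append(("Paper", attrs["title"]))
    return dict(site)


# Step 3: presentation is kept apart from structure in a template.
PAGE_TEMPLATE = Template("<h1>$name</h1>\n<ul>\n$items\n</ul>")


def render(site_graph, page):
    """Render one page; targets that are pages themselves become links."""
    items = "\n".join(
        f'<li><a href="{target}.html">{target}</a></li>'
        if target in site_graph
        else f"<li>{target}</li>"
        for _label, target in site_graph[page]
    )
    return PAGE_TEMPLATE.substitute(name=page, items=items)


if __name__ == "__main__":
    site_graph = define_site(DATA_GRAPH)
    for page in site_graph:
        print(render(site_graph, page))
        print()
```

In this sketch, reorganizing the site (for example, grouping by author instead of by year) would mean changing only `define_site`, while the data graph and the template stay untouched. That is the kind of benefit the abstract attributes to a declarative site definition.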
Index Terms
- Catching the boat with Strudel: experiences with a Web-site management system