The fourth SIGMOD Ph.D. Workshop on Innovative Database Research (IDAR), co-located with the 2010 ACM SIGMOD/PODS Conference, was held in Indianapolis, USA, on June 11, 2010. The workshop provides a forum for Ph.D. students working on topics related to the SIGMOD conference series to present, discuss, and receive feedback on their research.
This year's workshop features 9 paper presentations, which were selected from a total of 16 research submissions. In addition, the workshop program includes a keynote address by Divesh Srivastava.
Proceeding Downloads
Building a power-aware database management system
In today's large-scale data centers, energy costs (i.e., the electricity bill) are projected to outgrow those of hardware. Despite a long history of research in energy-saving techniques, especially low-power hardware, little work has been done to improve ...
Event sequence processing: new models and optimization techniques
Many modern applications, including online financial feeds, tag-based mass transit systems, and RFID-based supply chain management systems, transmit real-time data streams. There is a need for a special-purpose event stream processing technology to ...
Exploiting locality for query processing and compression in scientific databases
Improvements in the efficiency of scientific simulations have led to requirements for large databases. The data captured in such simulations are of large scale and pose challenges in storage, transfer, and query processing. However, the data are ...
Improved approaches to mine rare association rules in transactional databases
Rare association rules are association rules consisting of rare items. It is difficult to mine rare association rules with single-minimum-support-based approaches such as Apriori and FP-growth, as they suffer from the rare item problem. In the ...
Multiple relationship based deduplication
Deduplication refers to the task of finding instances that refer to the same entity in a given table. Several techniques have been presented based on pairwise comparison, and a typical result is the definition of three sets of records: i) pairwise ...
Put all eggs in one basket: an OLTP and OLAP database approach for traceability data
Accurate tracking and tracing of moving objects is an emerging trend in vertical industries like retail, logistics, and manufacturing. In order to monitor objects in business processes, more and more companies are deploying upcoming technologies like ...
Specification and verification of web services transactions
Research in transaction planning has recognized the evolution of Web Services into an industry standard for implementing transactional business processes. Web transactions are formed by integrating services in an ad hoc manner. Distributed transaction ...
Statistical modeling of large distribution sets
In this paper, we deal with a ubiquitous problem in data management: hierarchical model estimation for large distribution sets. This particular problem arises in many applications. Classification, top-k query processing, clustering, and outlier detection ...
Unsupervised strategies for information extraction by text segmentation
Information extraction by text segmentation (IETS) applies to cases in which data values of interest are organized in implicit semi-structured records available in textual sources (e.g. postal addresses, bibliographic information, ads). It is an ...
Weighted set similarity: queries and updates
Consider a universe of items, each of which is associated with a weight, and a database consisting of subsets of these items. Given a query set, a weighted set similarity query identifies either (i) all sets in the database whose similarity to the query ...