Abstract
It is possible to significantly reduce the average cost of information retrieval from a large shared database by partitioning data items stored within each record into a primary and a secondary record segment. An analytic model, based upon knowledge of data item lengths, transportation costs, and retrieval patterns, is developed to assist an analyst with this assignment problem. The model is generally applicable to environments in which a database resides in secondary storage, and is useful for both uniprogramming and multiprogramming systems. A computationally tractable record design algorithm has been implemented as a Fortran program and applied to numerous problems. Realistic examples are presented which demonstrate a potential for reducing total system cost by more than 65 percent.
- 1 BALINSKI, M L On a selection problem Manag Sct 17, 3 (March 1970), 230-231Google Scholar
- 2 BENNER, F H On designing generalized file records for management mformat,on systems Proc AFIPS 1967 FJCC, Vol 31, AFIPS Press, Montvale, N J , 1967, pp 291-303Google Scholar
- 3 CAN~I~G, R G Structures for future systems EDP Analyzer 12, 8 (Aug 1974)Google Scholar
- 4 EDMONDS, J , AND KARP, R M Theoretical ~mprovements m algorithmic efficiency for network flow problems J ACM 19, 2 (1972), 248-264 Google Scholar
- 5 EISNER, M j Bicntenon mathematical programs Tech Rep 238, Operations Res , Cornell U , Ithaca, N Y ,July 1974Google Scholar
- 6 FORD, L R , AND FULKERSON, D R Flows m Networks Pnnceton U Press, Pnnceton, N.J., 1962Google Scholar
- 7 GARrlNKEL, R S., AND NEMI~AUSER, G.L. Integer Programmmg Wiley, New York, 1972, Ch. 4.Google Scholar
- 8 GEOFFRION, A M Solving bl-cntenon mathematical programs Oper Res 15, 1 (Jan 1967), 39-54Google Scholar
- 9 HAHN, B A new technique for the compression of storage of data Comm ACM 17, 8 (Aug 1974), 434- 436 Google Scholar
- 10 HEAeS, H S Storage analysis of a compression coding for document data bases INFOR 10, 1 (Feb 1972), 47-61Google Scholar
- 11 HOFFER, J A A clustenng approach to the generation of subfdes for the design of a computer data base Ph D Dlss , Dep Operations Res , Cornell U , Ithaca, N Y , 1975 Google Scholar
- 12 IBb,l CORP latroduct~on to IBM d~rect-access storage dewces and organlzat~ons GD 20-1649, IBM Corp, White Pitons, N Y , 1974Google Scholar
- 13 KENNEDY, S R The use of access frequencies m data base orgamzat~on Ph D Dlss, Dep Operations Res, Cornell U , Ithaca, N Y , 1973Google Scholar
- 14 KING, W F On the Selection of Indices for a Fde Res Rep RJ1341, IBM Res Lab , San Jose, Cahf, Jan 1974Google Scholar
- 15 LUM, U Y, AND LING, H An opt~mlzat3on problem on the selection of secondary keys Proc 1971 ACM Annual Conf, pp 349-356 Google Scholar
- 16 MAXWELL, W L, AND SEVERANCE, D G Comparison of alternatwes for the representation of data ~tem values m an mformat~on system Proc. Wharton Conf on Res. on Comptrs m Orgamzauons. U. of Pennsylvanm, Phdadelphm, Pa , Oct 1973, pp 121-136Google Scholar
Index Terms
- Mathematical Techniques for Efficient Record Segmentation in Large Shared Databases
Recommendations
Query evaluation techniques for large databases
Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible ...
Record Matching over Query Results from Multiple Web Databases
Record matching, which identifies the records that represent the same real-world entity, is an important step for data integration. Most state-of-the-art record matching methods are supervised, which requires the user to provide training data. These ...
Scalable Privacy-Preserving Record Linkage for Multiple Databases
CIKM '14: Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge ManagementPrivacy-preserving record linkage (PPRL) is the process of identifying records that correspond to the same real-world entities across several databases without revealing any sensitive information about these entities. Various techniques have been ...
Comments