
2018 | Original Paper | Book chapter

Processing Missing Information in Big Data Environment

Written by: Yuxin Chen, Shun Li, Jiahui Yao

Published in: Data Mining and Big Data

Publisher: Springer International Publishing


Abstract

Handling missing information is essential to the efficiency and robustness of database systems. In big data environments, missing information tends to carry richer semantics, which leads to more complex computational logic and affects both operations and implementation. Existing methods either have limited semantic expressiveness or do not account for the influence of the big data environment. To address these problems, this paper proposes a novel method for processing missing information. Drawing on practical cases from big data environments, we classify missing information into two types, unknown values and nonexistent values, and define a four-valued logic to support logical operations. The relational algebra is systematically extended to describe the data operations. We implement our approach on the dynamic table model of Muldas, a self-developed big data management system. Experimental results on real large-scale sparse data sets show that the proposed approach offers good semantic expressiveness and computational efficiency.
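To make the idea of a four-valued logic over two kinds of missing values concrete, the following is a minimal Python sketch. It is not the paper's definition: the abstract does not give the truth tables, so the tables below are an assumption (Kleene-style rules for TRUE, FALSE and UNKNOWN, with NONEXISTENT taking precedence over UNKNOWN whenever the result is not already decided). The names `TV`, `Missing`, `eq`, `and_`, `or_`, `not_` and `select` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class TV(Enum):
    """Four truth values (illustrative; not taken from the paper)."""
    TRUE = "T"
    FALSE = "F"
    UNKNOWN = "U"        # a value exists but is not known
    NONEXISTENT = "N"    # no value exists for this attribute


class Missing(Enum):
    """Markers for the two kinds of missing cell values (hypothetical)."""
    UNKNOWN = "unknown"
    NONEXISTENT = "nonexistent"


def eq(a, b):
    """Compare two cell values and return a four-valued truth value."""
    if a is Missing.NONEXISTENT or b is Missing.NONEXISTENT:
        return TV.NONEXISTENT
    if a is Missing.UNKNOWN or b is Missing.UNKNOWN:
        return TV.UNKNOWN
    return TV.TRUE if a == b else TV.FALSE


def not_(x):
    """Negation swaps TRUE and FALSE and leaves the missing values as-is."""
    if x is TV.TRUE:
        return TV.FALSE
    if x is TV.FALSE:
        return TV.TRUE
    return x


def and_(x, y):
    """Assumed conjunction: FALSE decides; otherwise N beats U beats TRUE."""
    if TV.FALSE in (x, y):
        return TV.FALSE
    if TV.NONEXISTENT in (x, y):
        return TV.NONEXISTENT
    if TV.UNKNOWN in (x, y):
        return TV.UNKNOWN
    return TV.TRUE


def or_(x, y):
    """Assumed disjunction: TRUE decides; otherwise N beats U beats FALSE."""
    if TV.TRUE in (x, y):
        return TV.TRUE
    if TV.NONEXISTENT in (x, y):
        return TV.NONEXISTENT
    if TV.UNKNOWN in (x, y):
        return TV.UNKNOWN
    return TV.FALSE


def select(rows, predicate):
    """Selection that keeps only rows whose predicate is definitely TRUE."""
    return [row for row in rows if predicate(row) is TV.TRUE]


# Hypothetical sparse table: one known, one unknown, one nonexistent phone.
rows = [
    {"name": "a", "phone": "123"},
    {"name": "b", "phone": Missing.UNKNOWN},
    {"name": "c", "phone": Missing.NONEXISTENT},
]
matches = select(rows, lambda r: eq(r["phone"], "123"))
```

Keeping only rows that evaluate to TRUE mirrors how SQL treats its single NULL in a `WHERE` clause; a system that distinguishes the two missing types could instead expose "maybe" results for UNKNOWN while always excluding NONEXISTENT, which is one reason to keep the two values separate.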


Metadata
Title
Processing Missing Information in Big Data Environment
Written by
Yuxin Chen
Shun Li
Jiahui Yao
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-319-93803-5_60
