2019 | OriginalPaper | Book chapter

Realistic Synthetic Data Generation: The ATEN Framework

Written by: Scott McLachlan, Kudakwashe Dube, Thomas Gallagher, Jennifer A. Simmonds, Norman Fenton

Published in: Biomedical Engineering Systems and Technologies

Publisher: Springer International Publishing

Abstract

Getting access to real medical data for research is notoriously difficult. Even when data exist, they are usually incomplete and subject to restrictions due to confidentiality and privacy. Synthetic data (SD) are the best replacement for real data but must be verifiably realistic. There has been little or no investigation into systematically achieving realism in SD. This work investigates this problem and contributes the ATEN framework, which incorporates three component approaches: (1) THOTH for synthetic data generation (SDG); (2) RA for characterising realism in SD; and (3) HORUS for validating realism in SD. The framework proved promising when used to generate the realistic synthetic EHR (RS-EHR) for labour and birth. This framework is significant in guaranteeing realism in SDG projects. Future work will focus on further validation of ATEN in a controlled multi-stream SDG process.

Metadata
Title
Realistic Synthetic Data Generation: The ATEN Framework
Written by
Scott McLachlan
Kudakwashe Dube
Thomas Gallagher
Jennifer A. Simmonds
Norman Fenton
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-29196-9_25