Techniques for Computing Fitness of Use (FoU) for Time Series Datasets with Applications in the Geospatial Domain

GeoInformatica

Abstract

Time series data are widely used in many applications, including critical decision support systems. The goodness of a dataset for a given analysis, called its Fitness of Use (FoU), has a direct bearing on the quality of the information and knowledge generated, and hence on the quality of the decisions based on them. Unlike traditional data quality, which is independent of the application in which the data are used, FoU is a function of the application. As the use of geospatial time series datasets increases in many critical applications, it is important to develop formal methodologies to compute their FoU and propagate it to the derived information, knowledge, and decisions. In this paper we propose a formal framework to compute the FoU of time series datasets. We present three techniques built on the Dempster–Shafer belief theory framework. They investigate FoU by focusing on three aspects of the data: data attributes, data stability, and the impact of gap periods, respectively. The effectiveness of each approach is shown using an application to hydrological datasets that measure streamflow. While we use hydrological information analysis as our application domain in this research, the techniques can be used in many other domains as well.
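For readers unfamiliar with the Dempster–Shafer framework named in the abstract, the following is a minimal sketch of its standard combination rule (Dempster's rule), which fuses two independent bodies of evidence. It is not the paper's FoU computation: the two-element frame {"fit", "unfit"}, the names `m_attr` and `m_gap`, and all mass values are hypothetical and chosen only for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Fuse two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset (a subset of the frame of
    discernment) to its mass; the masses of each function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            combined[common] = combined.get(common, 0.0) + x * y
        else:
            # Disjoint sets contribute to the conflict mass.
            conflict += x * y
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalize so the remaining masses sum to 1.
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


def belief(m, hypothesis):
    """Total mass of all subsets contained in the hypothesis."""
    return sum(v for s, v in m.items() if s <= hypothesis)


def plausibility(m, hypothesis):
    """Total mass of all subsets that intersect the hypothesis."""
    return sum(v for s, v in m.items() if s & hypothesis)


# Hypothetical frame of discernment: the dataset is fit or unfit for use.
FIT = frozenset({"fit"})
UNFIT = frozenset({"unfit"})
EITHER = FIT | UNFIT  # mass assigned here expresses ignorance

# Hypothetical evidence from two aspects of a dataset (illustrative values).
m_attr = {FIT: 0.6, UNFIT: 0.1, EITHER: 0.3}
m_gap = {FIT: 0.5, UNFIT: 0.2, EITHER: 0.3}

fused = combine(m_attr, m_gap)
print("belief(fit)       =", round(belief(fused, FIT), 4))
print("plausibility(fit) =", round(plausibility(fused, FIT), 4))
```

The interval between belief and plausibility is what distinguishes this framework from a single probability: mass left on the whole frame (`EITHER`) represents evidence that supports neither outcome specifically.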

Author information

Correspondence to Ashok Samal.

About this article

Cite this article

Fu, L., Soh, LK. & Samal, A. Techniques for Computing Fitness of Use (FoU) for Time Series Datasets with Applications in the Geospatial Domain. Geoinformatica 12, 91–115 (2008). https://doi.org/10.1007/s10707-007-0025-0

  • DOI: https://doi.org/10.1007/s10707-007-0025-0
