Published in: GeoInformatica 1/2015

01.01.2015

Domain-driven co-location mining

Extraction, visualization and integration in a GIS

Written by: Frédéric Flouvat, Jean-François N’guyen Van Soc, Elise Desmier, Nazha Selmaoui-Folcher

Abstract

Co-location mining is a classical problem in spatial pattern mining. Given a set of boolean spatial features, the goal is to find subsets of features that are frequently located together. It has wide applications in environmental management, public safety, transportation, and tourism. In recent years, many algorithms have been proposed to extract frequent co-locations. However, most solutions perform a “data-centered knowledge discovery” instead of an “expert-centered knowledge discovery”. Successfully providing useful and interpretable patterns to experts is still an open problem. In this setting, we propose a domain-driven co-location mining approach that combines constraint-based mining and cartographic visualization. Experts can push new domain constraints into the mining algorithm, resulting in more relevant patterns and more efficient extraction. They can then visualize the solutions using a new concise and intuitive cartographic visualization of co-locations. Using this original visualization approach, they identify new interesting patterns, and use uninteresting ones to define new constraints and refine their analysis. These proposals have been integrated into a prototype based on the PostGIS geographic information system. Experiments have been conducted on a real geological dataset studying soil erosion, and the results have been validated by a domain expert.
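
To make the problem statement concrete, the following is a minimal sketch of constrained co-location mining, not the authors' algorithm. It assumes point instances, a fixed neighborhood radius, and the participation index as the prevalence measure (a measure commonly used in co-location mining; the abstract does not name one). The toy data, the function names `participation_index` and `mine`, and the `must_include` constraint are all hypothetical and chosen only for illustration.

```python
from itertools import combinations, product
from math import dist

# Hypothetical toy data (not from the paper): (feature, x, y) point instances.
INSTANCES = [
    ("erosion", 0.0, 0.0), ("erosion", 5.0, 5.0), ("erosion", 9.0, 0.0),
    ("bare_soil", 0.5, 0.2), ("bare_soil", 5.2, 4.9),
    ("steep_slope", 0.3, 0.6), ("steep_slope", 9.1, 0.4),
    ("forest", 20.0, 20.0),
]


def participation_index(pattern, instances, radius):
    """Return the minimum, over the features of `pattern`, of the fraction of
    that feature's instances that appear in at least one group containing one
    instance per feature, all pairwise within `radius`."""
    by_feature = {
        f: [(i, (x, y)) for i, (g, x, y) in enumerate(instances) if g == f]
        for f in pattern
    }
    if any(not pts for pts in by_feature.values()):
        return 0.0
    participating = {f: set() for f in pattern}
    # Brute force: try every combination of one instance per feature.
    for row in product(*(by_feature[f] for f in pattern)):
        if all(dist(p, q) <= radius for (_, p), (_, q) in combinations(row, 2)):
            for f, (idx, _) in zip(pattern, row):
                participating[f].add(idx)
    return min(len(participating[f]) / len(by_feature[f]) for f in pattern)


def mine(instances, radius, min_pi, must_include=None):
    """Enumerate feature subsets of size >= 2 and keep the prevalent ones.
    `must_include` is an assumed, simple domain constraint: candidates that
    lack the given feature are skipped before any distance is computed."""
    features = sorted({f for f, _, _ in instances})
    results = {}
    for size in range(2, len(features) + 1):
        for pattern in combinations(features, size):
            if must_include is not None and must_include not in pattern:
                continue  # constraint checked first, so the candidate costs nothing
            pi = participation_index(pattern, instances, radius)
            if pi >= min_pi:
                results[pattern] = pi
    return results


found = mine(INSTANCES, radius=1.0, min_pi=0.5, must_include="erosion")
for pattern, pi in found.items():
    print(pattern, round(pi, 2))
```

On this toy data the sketch keeps {bare_soil, erosion} and {erosion, steep_slope}, each with a participation index of 2/3. The pair {bare_soil, steep_slope} also reaches the threshold but is never evaluated because it violates the constraint, which illustrates in a very simplified way how a domain constraint can both narrow the result set and reduce the work done. A practical implementation would use a spatial index and level-wise pruning rather than exhaustive enumeration.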

Metadata
Title
Domain-driven co-location mining
Extraction, visualization and integration in a GIS
Written by
Frédéric Flouvat
Jean-François N’guyen Van Soc
Elise Desmier
Nazha Selmaoui-Folcher
Publication date
01.01.2015
Publisher
Springer US
Published in
GeoInformatica / Issue 1/2015
Print ISSN: 1384-6175
Electronic ISSN: 1573-7624
DOI
https://doi.org/10.1007/s10707-014-0209-3
