Published in: KI - Künstliche Intelligenz 1/2013

01.02.2013 | Doctoral and Postdoctoral Dissertations

From Texts to Networks: Detecting and Managing the Impact of Methodological Choices for Extracting Network Data from Text Data

Written by: Jana Diesner


Abstract

This thesis (Diesner in Technical Report CMU-ISR-12-101, 2012) addresses a series of methodological problems related to extracting information on socio-technical networks from natural language text data. Theories and models from the social sciences are leveraged and combined with computational approaches to (a) construct, analyze and compare network data and (b) combine text data and network data for analysis. The thesis comprises several projects that serve three purposes. First, the impact of common coding choices, including reference resolution and co-occurrence-based link formation, on network data and analysis results is empirically identified across multiple types of text data and domains. Second, different relation extraction methods are compared across several over-time, open-source, large-scale datasets with respect to the resulting network data and analysis results; this comparison complements traditional strategies for accuracy assessment. The relation extraction methods considered include network data construction based on (a) manually versus automatically built thesauri, (b) meta-data, and (c) collaboration with subject matter experts. Third, the concepts of grouping and roles from network analysis are integrated with text mining methods to enable the theoretically grounded, joint consideration of text data and network data for real-world applications.
Overall, this thesis takes an interdisciplinary and computationally rigorous approach, thereby advancing the intersection of network analysis, natural language processing and computing. The contributions made with this work help people to utilize text data for network analysis, and to collect, manage and interpret rich network data at any scale. These steps are preconditions for asking substantive and graph-theoretic questions, testing hypotheses, and advancing theories about networks.
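
To make the coding choices named in the abstract more concrete, the following is a minimal Python sketch of thesaurus-based node identification combined with co-occurrence-based link formation. It is not the method or tooling used in the thesis. The example text, both thesauri, the window size, and the function names `extract_edges` and `jaccard` are assumptions made for illustration. The sketch builds one network without reference resolution and one in which "director" is resolved to "smith", then compares the two edge sets.

```python
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_edges(text, thesaurus, window=4):
    """Count undirected links between concepts that co-occur in the text.

    `thesaurus` maps a token to a node label (hypothetical toy format).
    Two concepts are linked when their tokens are fewer than `window`
    positions apart. Sentence boundaries are ignored in this sketch.
    """
    tokens = tokenize(text)
    # Keep the position and node label of every token found in the thesaurus.
    hits = [(i, thesaurus[t]) for i, t in enumerate(tokens) if t in thesaurus]
    edges = Counter()
    for (i, a), (j, b) in combinations(hits, 2):
        if a != b and j - i < window:
            edges[tuple(sorted((a, b)))] += 1
    return edges


def jaccard(edges_a, edges_b):
    """Jaccard overlap of two edge sets (edge weights are ignored)."""
    a, b = set(edges_a), set(edges_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Toy input, invented for this example.
TEXT = "Smith met Jones in Geneva. Later the director praised Jones."

# Coding choice 1: no reference resolution, "director" is its own node.
RAW_THESAURUS = {
    "smith": "smith",
    "jones": "jones",
    "geneva": "geneva",
    "director": "director",
}

# Coding choice 2: "director" is resolved to the entity "smith".
RESOLVED_THESAURUS = dict(RAW_THESAURUS, director="smith")

raw = extract_edges(TEXT, RAW_THESAURUS)
resolved = extract_edges(TEXT, RESOLVED_THESAURUS)

print("without reference resolution:", dict(raw))
print("with reference resolution:   ", dict(resolved))
print("edge-set overlap (Jaccard):  ", jaccard(raw, resolved))
```

In this toy case, resolving a single reference removes one node and strengthens the link between "jones" and "smith", so the two networks share only part of their edges. Changing `window` is another coding choice that can be varied in the same way.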

KI - Künstliche Intelligenz

The scientific journal "KI - Künstliche Intelligenz" is the official journal of the division for artificial intelligence within the "Gesellschaft für Informatik e.V." (GI), the German Informatics Society, with contributions from throughout the field of artificial intelligence.

Footnotes
1. In Natural Language Processing (NLP) and Information Extraction (IE), this task is also known as Named Entity Recognition.
2. In NLP and IE, this step, and sometimes all three steps together, is also referred to as Relation Extraction.

References
1. Abello J, Broadwell P, Tangherlini TR (2012) Computational folkloristics. Commun ACM 55(7):60–70
2. Alderson D (2008) Catching the ‘network science’ bug: insight and opportunity for the operations researcher. Oper Res 56(5):1047–1065
3. Brin S (1999) Extracting patterns and relations from the World Wide Web. Paper presented at The World Wide Web and databases, Valencia, Spain, March 27–28, 1998, pp 172–183
4. Burt R, Lin N (1977) Network time series from archival records. In: Heise DR (ed) Sociological methodology, vol 1977. Jossey-Bass, San Francisco, pp 224–254
5. Carley KM, Palmquist M (1991) Extracting, representing, and analyzing mental models. Soc Forces 70(3):601–636
6. Danowski JA (1993) Network analysis of message content. Prog Commun Sci 12:198–221
7. Diesner J (2012) Uncovering and managing the impact of methodological choices for the computational construction of socio-technical networks from texts. Technical report CMU-ISR-12-101
8. Diesner J, Carley KM (2010) Relation extraction from texts (in German, title: Extraktion relationaler Daten aus Texten). In: Stegbauer C, Häußling R (eds) Handbook network research (Handbuch Netzwerkforschung). Vs Verlag, Wiesbaden, pp 507–521
9. Diesner J, Carley KM, Tambayong L (2012) Extracting socio-cultural networks of the Sudan from open-source, large-scale text data. Comput Math Organ Theory 18(3):328–339
10. Gerner D, Schrodt P, Francisco R, Weddle J (1994) Machine coding of event data using regional and international sources. Int Stud Q 38(1):91–119
11. Hämmerli A, Gattiker R, Weyermann R (2006) Conflict and cooperation in an actors’ network of Chechnya based on event data. J Confl Resolut 50(2):159–175
12. Hartley R, Barnden J (1997) Semantic networks: visualizations of knowledge. Trends Cogn Sci 1(5):169–175
13. Janas J, Schwind C (1979) Extensional semantic networks. In: Findler NV (ed) Associative networks. Representation and use of knowledge by computers. Academic Press, New York, pp 267–302
14. Johnson JC, Krempel L (2004) Network visualization: The “Bush team” in Reuters news ticker, 9/11–11/15/01. J Soc Struct 5
15. Parastatidis S, Viegas E, Hey T (2009) Viewpoint: smart cyberinfrastructure for research. A view of semantic computing and its role in research. Commun ACM 52(12):33–37
16. Trigg R, Weiser M (1986) TEXTNET: a network-based approach to text handling. ACM Trans Inf Syst 4(1):1–23
Metadata
Title: From Texts to Networks: Detecting and Managing the Impact of Methodological Choices for Extracting Network Data from Text Data
Written by: Jana Diesner
Publication date: 01.02.2013
Publisher: Springer-Verlag
Published in: KI - Künstliche Intelligenz, Issue 1/2013
Print ISSN: 0933-1875
Electronic ISSN: 1610-1987
DOI: https://doi.org/10.1007/s13218-012-0225-0
