Published in: Quality & Quantity 2/2014

01-03-2014

A Monte Carlo permutation test for co-occurrence data

Author: Balázs Kovács

Abstract

Researchers commonly use co-occurrence counts to assess the similarity of objects. This paper illustrates how traditional association measures can lead to misguided significance tests of co-occurrence in settings where the usual multinomial sampling assumptions do not hold. I propose a Monte Carlo permutation test that preserves the original distributions of the co-occurrence data. I illustrate the test on a dataset of organizational categorization, in which I investigate the relations between organizational categories (such as “Argentine restaurants” and “Steakhouses”).
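
The abstract does not spell out the procedure, so the following is only a minimal sketch of one way such a test could look, not the author's implementation. It assumes the data are a list of organizations, each with a set of category labels, and that "preserving the original distributions" means every organization keeps its number of categories and every category keeps its overall frequency. The function names, the pairwise-swap shuffling scheme, the toy data, and the default values of `n_samples` and `n_swaps` are all illustrative assumptions.

```python
import random
from collections import Counter
from itertools import combinations


def cooccurrence_counts(memberships):
    """Count how many organizations carry each unordered pair of categories."""
    counts = Counter()
    for cats in memberships:
        for pair in combinations(sorted(cats), 2):
            counts[pair] += 1
    return counts


def swap_randomize(memberships, n_swaps, rng):
    """Shuffle category labels across organizations with pairwise swaps.

    Assumed null model: each accepted swap exchanges one category between
    two organizations, so every organization keeps its number of categories
    and every category keeps its overall frequency.
    """
    sets = [set(cats) for cats in memberships]  # copy; the input is not modified
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(sets)), 2)
        only_i = sorted(sets[i] - sets[j])
        only_j = sorted(sets[j] - sets[i])
        if not only_i or not only_j:
            continue  # no exchange possible without duplicating a label
        a, b = rng.choice(only_i), rng.choice(only_j)
        sets[i].remove(a)
        sets[i].add(b)
        sets[j].remove(b)
        sets[j].add(a)
    return sets


def permutation_p_value(memberships, pair, n_samples=200, n_swaps=1000, seed=0):
    """One-sided Monte Carlo p-value for the co-occurrence count of `pair`."""
    rng = random.Random(seed)
    pair = tuple(sorted(pair))
    observed = cooccurrence_counts(memberships)[pair]
    hits = 0
    for _ in range(n_samples):
        shuffled = swap_randomize(memberships, n_swaps, rng)
        if cooccurrence_counts(shuffled)[pair] >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_samples + 1)


# Hypothetical toy data, not taken from the paper.
toy = [
    {"Argentine", "Steakhouse"},
    {"Argentine", "Steakhouse"},
    {"Argentine", "Steakhouse", "Bar"},
    {"Sushi", "Japanese"},
    {"Sushi", "Japanese"},
    {"Bar", "Pizza"},
    {"Pizza", "Italian"},
    {"Italian", "Bar"},
]
observed, p = permutation_p_value(toy, ("Argentine", "Steakhouse"))
print(observed, p)
```

Because the null distribution is built from shuffled copies of the observed data rather than from a multinomial model, the comparison automatically reflects how unevenly categories and organizations are distributed. How many swaps are needed before a shuffled copy can be treated as random depends on the size of the dataset.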

Footnotes
1
Obviously, the number of permutations needed depends on the sample size. Given that there are about 5,000 organizations with two or more categories, and that most organizations are in two or three categories, 10,000 permutations are likely enough to arrive at random category associations.
 
Metadata
Title: A Monte Carlo permutation test for co-occurrence data
Author: Balázs Kovács
Publication date: 01-03-2014
Publisher: Springer Netherlands
Published in: Quality & Quantity / Issue 2/2014
Print ISSN: 0033-5177
Electronic ISSN: 1573-7845
DOI: https://doi.org/10.1007/s11135-012-9817-x
