2011 | OriginalPaper | Chapter

3. Probability, Statistics, and Related Methods

Author: Boris L. Milman

Published in: Chemical Identification and its Quality Assurance

Publisher: Springer Berlin Heidelberg

Abstract

This chapter briefly considers the probabilistic and statistical methods used for identification. The basic premise is that many phenomena and procedures in qualitative analysis are probabilistic in nature. The probability of yes/no responses in target detection is described by the binomial distribution. Values of the quantities required for identification, such as retention times in chromatography, wavelengths and frequencies in optical spectroscopy, masses in mass spectrometry, and intensities (heights, areas) of any analytical signals, are treated as normally distributed (or t-distributed). The parameters of these distributions are used in the calculations built into detection and identification procedures. Multivariate statistics, as applied in chemometrics, is essential for the classification/authentication of samples, i.e., qualitative analysis II. Bayesian statistics takes into account the prior probability that an analyte is present in a sample.
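As a rough illustration of two of these ideas, the sketch below computes a binomial probability for yes/no detection responses and applies Bayes' theorem to update a prior probability that an analyte is present. It is only a minimal sketch: the function names and all numerical values (response probability, prior, sensitivity, false positive rate) are hypothetical and are not taken from the chapter.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k 'yes' responses in n independent trials,
    each giving 'yes' with probability p (binomial distribution)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def posterior_present(prior: float, sensitivity: float,
                      false_positive_rate: float) -> float:
    """Bayes' theorem: P(analyte present | positive response).

    prior               -- assumed P(analyte present) before the test
    sensitivity         -- assumed P(positive | analyte present)
    false_positive_rate -- assumed P(positive | analyte absent)
    """
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive


# Hypothetical example: probability of at least 4 'yes' responses in
# 5 replicate tests when each test responds 'yes' with probability 0.9.
p_at_least_4_of_5 = sum(binomial_pmf(k, 5, 0.9) for k in (4, 5))

# Hypothetical example: a rare analyte (prior 1 %) and a test with
# 95 % sensitivity and a 5 % false positive rate.
posterior = posterior_present(prior=0.01, sensitivity=0.95,
                              false_positive_rate=0.05)

print(f"P(at least 4 of 5 positive) = {p_at_least_4_of_5:.3f}")
print(f"P(present | positive)       = {posterior:.3f}")
```

With these assumed numbers the posterior stays far below the test's sensitivity, which shows why a low prior probability matters when a single positive response is interpreted.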
The second part of the chapter considers setting up, testing, and screening hypotheses, the core processes of qualitative analysis. The simplest hypotheses concern a detection operation, e.g., ‘\( H_0 \): an analyte is absent from the sample’. In identification, the analogous hypotheses ‘\( H_0 \): the analyte is compound A’ and ‘\( \overline{H}_0 \): the analyte is not compound A’ are set up and tested. Identification hypotheses are translated into experimental and statistical hypotheses, which are accepted or rejected on the basis of the corresponding criteria, both range/tolerance and statistical. False acceptance or rejection of a hypothesis leads to false positive or false negative results of identification or detection, and the probability of such results can be estimated.
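The following sketch shows one way a range/tolerance criterion and the associated error probabilities might be evaluated, assuming the measured quantity (here a retention time) is normally distributed with a known standard deviation. The function names, the tolerance window, and every numerical value are hypothetical illustrations rather than the chapter's own procedure.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def within_tolerance(measured: float, reference: float,
                     tolerance: float) -> bool:
    """Range/tolerance criterion: accept 'H0: the analyte is compound A'
    if the measured value lies within reference +/- tolerance."""
    return abs(measured - reference) <= tolerance


def false_negative_probability(tolerance: float, sigma: float) -> float:
    """P(H0 rejected | analyte really is A), assuming the measurement is
    normally distributed around the reference value of A with std sigma."""
    return 2 * (1 - STANDARD_NORMAL.cdf(tolerance / sigma))


def false_positive_probability(reference_a: float, tolerance: float,
                               mean_b: float, sigma: float) -> float:
    """P(H0 accepted | analyte really is a different compound B), assuming
    B's measurement is normally distributed around mean_b with std sigma."""
    upper = (reference_a + tolerance - mean_b) / sigma
    lower = (reference_a - tolerance - mean_b) / sigma
    return STANDARD_NORMAL.cdf(upper) - STANDARD_NORMAL.cdf(lower)


# Hypothetical retention times in minutes.
REF_A, MEAN_B, SIGMA, TOL = 5.00, 5.10, 0.02, 0.06  # TOL is 3 * SIGMA

accepted = within_tolerance(5.03, REF_A, TOL)
p_fn = false_negative_probability(TOL, SIGMA)
p_fp = false_positive_probability(REF_A, TOL, MEAN_B, SIGMA)

print(f"H0 accepted for 5.03 min: {accepted}")
print(f"false negative probability: {p_fn:.4f}")
print(f"false positive probability: {p_fp:.4f}")
```

Widening the tolerance window lowers the false negative probability but raises the false positive probability for a closely eluting compound, so the window is a trade-off between the two error types.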

Footnotes
1. In this book, “sample” is used in two similar but distinct senses: “a part of something to be tested” (this case) and “a subset of random values selected from a population” (statistical issues).
2. With a large number of replicate measurements, the population standard deviation σ can be estimated by the sample standard deviation s.
Metadata
Title: Probability, Statistics, and Related Methods
Author: Boris L. Milman
Copyright Year: 2011
Publisher: Springer Berlin Heidelberg
DOI: https://doi.org/10.1007/978-3-642-15361-7_3