
2020 | OriginalPaper | Book chapter

3. Data Understanding, Representation, and Visualization

Written by: Ameet V Joshi

Published in: Machine Learning and Artificial Intelligence

Publisher: Springer International Publishing


Abstract

This chapter introduces the concepts of understanding, representing, and visualizing data. These are essential steps before one starts to build a machine learning model or an artificially intelligent application. Although these concepts might appear trivial, they quickly become difficult once the dimensionality of the data exceeds 3. This chapter introduces dimensionality reduction techniques such as principal component analysis and linear discriminant analysis for better visualization of high-dimensional data. Better visualization gives the user insight into the data distribution and into the relationships of the features with each other and with the output. These insights are valuable when making choices in the subsequent machine learning pipeline.
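As an illustration of the visualization idea described in the abstract, the following is a minimal sketch of principal component analysis that projects high-dimensional samples onto two dimensions for plotting. It is not taken from the chapter: the function name `pca_project`, the synthetic data, and the use of NumPy's singular value decomposition are all assumptions made for this example.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components (illustrative helper)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature
    # SVD of the centered data: the rows of Vt are the principal directions
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained_var = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_var[:n_components] / explained_var.sum()
    return Xc @ components.T, components, ratio

# Hypothetical data: 200 samples with 5 features, driven by 2 latent factors plus small noise
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2)) * [5.0, 2.0]
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 5))

Z, components, ratio = pca_project(X)
print(Z.shape)      # 2-D coordinates, one row per sample, ready for a scatter plot
print(ratio.sum())  # fraction of total variance kept by the two components
```

The two columns of `Z` can be passed to any scatter-plot routine; the explained-variance ratio indicates how much of the data's spread the 2-D view preserves.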


Footnotes
1
Sometimes the network consisting of such devices is referred to as the Internet of Things, or IoT.
 
2
Structured formats such as svmlight are most useful for sparse data; they add significant overhead when the data is fully populated. Sparse data is data with high dimensionality (typically hundreds of attributes or more) in which many samples lack values for multiple attributes. If such data is stored as a fully populated matrix, it occupies a large amount of memory. Formats like svmlight instead use name-value pairs, each giving the name of an attribute and its value, and list pairs only for the attributes whose values are present. Each sample can therefore have a different number of pairs, and the model must assume that any attribute without a name-value pair is missing for that sample. Even though attribute names are repeated in every sample, the sparsity of the data makes the file much smaller.
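The name-value idea in this footnote can be sketched as follows. This is a simplified illustration, not a full svmlight reader or writer: the helper names `to_sparse_line` and `parse_sparse_line` and the sample data are hypothetical, and the sketch assumes integer attribute indices starting at 1 as the "names" and `None` as the marker for a missing value.

```python
def to_sparse_line(label, features):
    """Format one sample as 'label index:value ...', skipping missing values (None)."""
    pairs = [f"{i}:{v}" for i, v in enumerate(features, start=1) if v is not None]
    return " ".join([str(label)] + pairs)

def parse_sparse_line(line, n_features):
    """Rebuild the dense sample; attributes without a pair stay None (missing)."""
    parts = line.split()
    label = int(parts[0])
    features = [None] * n_features
    for pair in parts[1:]:
        idx, val = pair.split(":")
        features[int(idx) - 1] = float(val)
    return label, features

# Hypothetical samples: 8 attributes each, most of them missing
dense = [
    (1, [None, 0.5, None, None, None, 2.0, None, None]),
    (-1, [None, None, None, 1.5, None, None, None, None]),
]

lines = [to_sparse_line(y, x) for y, x in dense]
for line in lines:
    print(line)
# prints:
# 1 2:0.5 6:2.0
# -1 4:1.5
```

Each line carries only the pairs that are present, so the two samples have different lengths. Some loaders treat an absent pair as a zero rather than as a missing value; the sketch follows the footnote's reading.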
 
Metadata
Title
Data Understanding, Representation, and Visualization
Written by
Ameet V Joshi
Copyright year
2020
DOI
https://doi.org/10.1007/978-3-030-26622-6_3
