2022 | Book

# Elementary Statistical Methods

Author: Sahana Prasad

Publisher: Springer Nature Singapore

This is the first of two volumes covering the basics of statistical methods and analysis. Key topics include concepts of research and data analysis, descriptive statistics, probability and distributions, correlation and regression, and statistical inference. The book includes examples and exercises, as well as relevant case studies that show how to apply the tools discussed. It will be a valuable text for undergraduate students of statistics, management, economics, and psychology who want to gain a basic understanding of statistics and the use of its various concepts.

Abstract

In real life, we come across a lot of information which, when analysed thoughtfully, helps us make informed decisions. We can use fundamentals of mathematics, logical tools and statistical methodologies to decode the nuances of the data. Data is the new buzzword, as the large volume of data now available helps in taking correct decisions and making quality policies. It has also brought new challenges, as expertise and domain knowledge are required to understand the data and apply relevant tools for analysis. This also means that data has to be measured using the correct scales. Visualization is an important aspect of presenting data and performing preliminary analysis. Many types of graphs and diagrams are available to represent data, but each has a specific use, which can be understood by getting a feel for the data and the corresponding visualization tool.

Abstract

When we analyse data, we often wish to have a single typical value that represents and summarizes all the data points in a meaningful way. In any data set, data points tend to cluster around a central value, and this tendency is called central tendency. A measure of central tendency describes the entire data set in a single value and allows different data sets to be compared. The data can be analysed further using measures of dispersion, which give the “spread” of data values around the mean/average value. One such measure, the standard deviation, and its relative measure, the coefficient of variation, are important indicators of the stability, consistency, uniformity and reliability of data values in a group. Understanding the shape of data is also crucial to data analysis, and this can be done through skewness and kurtosis. Skewness measures the asymmetry in data, while kurtosis measures the concentration of data values in the tails. Kurtosis is helpful in finance for understanding the risks in an investment.
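The measures named in this abstract can be illustrated with a small sketch. The function name `describe`, the sample `scores` and the choice of population (rather than sample) formulas are assumptions made for illustration; the book may use different conventions, for example dividing by n - 1 or reporting kurtosis without subtracting 3.

```python
import statistics


def describe(values):
    """Summarize a data set using population (divide-by-n) formulas."""
    n = len(values)
    mean = statistics.fmean(values)
    std_dev = statistics.pstdev(values)  # population standard deviation

    # Central moments about the mean (assumed moment-based definitions).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n

    return {
        "mean": mean,
        "median": statistics.median(values),
        "std_dev": std_dev,
        # Coefficient of variation: relative spread, in percent of the mean.
        "cv_percent": 100 * std_dev / mean,
        # Moment coefficient of skewness: 0 for a symmetric data set.
        "skewness": m3 / m2 ** 1.5,
        # Excess kurtosis: 0 for a normal distribution.
        "excess_kurtosis": m4 / m2 ** 2 - 3,
    }


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

For this sample the mean is 5 and the standard deviation is 2, so the coefficient of variation is 40%. The positive skewness reflects the longer right tail created by the values 7 and 9.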

Abstract

These kinds of sentences are commonly used in real life. Probability, possibility or chance of occurrence is used in cases where there is uncertainty. If we throw a ball from a height and there is nothing to stop the ball midway, it will surely fall to the ground. There is no uncertainty here. But think of a dart thrown at a board. It might strike anywhere. An experiment can be deterministic or random. In a deterministic experiment, the outcome is known, and there is no uncertainty. If we know all possible outcomes of an experiment but are not sure which outcome will occur, then we use the concept of probability to assess how likely each outcome is. Probabilistic concepts are helpful in real life, as events cannot be predicted exactly but their chances can be approximated. When the expected outcomes of an experiment are predicted using theoretical concepts, we call the result a probability distribution, which is characterized by statistical measures such as mean, variance, skewness and kurtosis. Each distribution is defined by a mathematical function which is used to compute various probabilities. The normal distribution is the most important and most widely used distribution, with applications in many areas.
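As a rough illustration of how a distribution's mathematical function yields probabilities, the sketch below uses `statistics.NormalDist` from the Python standard library. The variable being modelled (heights in centimetres) and the parameters, mean 170 and standard deviation 10, are hypothetical and not taken from the book.

```python
from statistics import NormalDist

# Hypothetical model: heights in cm, normally distributed.
heights = NormalDist(mu=170, sigma=10)

# Probability that a value falls below 180, i.e. one standard deviation
# above the mean, from the cumulative distribution function.
p_below_180 = heights.cdf(180)

# Probability that a value lies within one standard deviation of the mean.
p_within_one_sd = heights.cdf(180) - heights.cdf(160)

# Value below which 95% of observations are expected to fall.
percentile_95 = heights.inv_cdf(0.95)

print(f"P(X < 180)       = {p_below_180:.4f}")
print(f"P(160 < X < 180) = {p_within_one_sd:.4f}")
print(f"95th percentile  = {percentile_95:.2f}")
```

The second probability comes out close to 0.68, the familiar "about 68% within one standard deviation" property of the normal distribution.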

Abstract

As the world processes more data, decisions become more data-centred, and hypothesis testing is crucial in this process. It helps to extend conclusions based on a sample to the population and to test whether changes in the sample statistics are significant or due to sheer chance. It is a statistical analysis tool used to draw conclusions about a given population value, the association between variables and the significance of results. There are two types of hypotheses, the null and the alternate hypothesis. In addition, we have to understand the concepts of errors, level of significance, power of a test and others to make use of this tool.
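The ideas of a null hypothesis, a level of significance and a decision can be sketched with a two-sided one-sample z-test. This is only one of many tests and is chosen here for simplicity; it assumes the population standard deviation is known. The function name `one_sample_z_test`, the sample values and the parameters `mu0`, `sigma` and `alpha` are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0 (sigma assumed known)."""
    n = len(sample)
    # Standardized distance between the sample mean and the hypothesized mean.
    z = (fmean(sample) - mu0) / (sigma / sqrt(n))
    # Probability of a result at least this extreme if H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Reject H0 when the p-value falls below the level of significance.
    return z, p_value, p_value < alpha


# Hypothetical sample; H0 says the population mean is 50.
sample = [52, 55, 49, 58, 54, 53, 57, 51, 56]
z, p_value, reject_null = one_sample_z_test(sample, mu0=50, sigma=4)
print(f"z = {z:.4f}, p-value = {p_value:.4f}, reject H0: {reject_null}")
```

Here `alpha` is the level of significance, that is, the accepted probability of rejecting a true null hypothesis (a Type I error). With these made-up numbers the p-value is well below 0.05, so the null hypothesis is rejected.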

Abstract

Correlation and regression are techniques used to investigate whether there is a relationship between two quantitative variables. Correlation answers three questions: is there a relationship, how strong is it, and what is its direction? Regression expresses this relationship in a mathematical form so that the equation can be used for predicting other values. However, correlation does not deal with causation; that is, even a high degree of correlation cannot be used to confirm which is the cause and which is the effect variable. However, this condition is clearly defined in “regression or connection between two or more things”. A more formal definition says, “Correlation is a statistical method used to assess a possible association between two or more variables”. In statistics, correlation is an indispensable tool that forms the basis for in-depth statistical analysis like forecasting, decision-making and simulation.
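A minimal sketch of both ideas, assuming Pearson's correlation coefficient and a simple least-squares line of y on x (the abstract does not say which measures the chapter uses). The function name `correlation_and_line` and the data, hours studied against marks obtained, are hypothetical.

```python
from math import sqrt


def correlation_and_line(x, y):
    """Return Pearson's r and the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)

    r = sxy / sqrt(sxx * syy)        # strength and direction, between -1 and 1
    slope = sxy / sxx                # change in y per unit change in x
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical data: hours studied and marks obtained.
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

r, slope, intercept = correlation_and_line(hours, marks)
predicted_marks = intercept + slope * 6  # prediction for 6 hours of study

print(f"r = {r:.4f}")
print(f"marks = {intercept:.2f} + {slope:.2f} * hours")
print(f"predicted marks for 6 hours = {predicted_marks:.2f}")
```

The sign of `r` gives the direction of the relationship and its magnitude gives the strength, while the fitted line is what allows prediction. Neither quantity says anything about which variable causes the other.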