End-to-End Statistics for Data Science

Statistics for Data Science 

Definition 

Statistics is the science, or a branch of mathematics, that involves collecting, classifying, analyzing, interpreting, and presenting numerical facts and data. It is especially useful when a population is too large for every member to be measured in detail. Statistics lets you draw general conclusions about a whole population from a sample of its data.

Types of Statistics 

 There are two types of Statistics: 
1. Descriptive Statistics 
2. Inferential Statistics 

Types of Data


Let’s explore all types of data.

Quantitative Data:

1. Discrete Data • It can take only distinct, separate values. The possible values are countable and cannot be subdivided meaningfully. Here, things are counted in whole numbers.

• Example: Number of students in the class, Number of bank accounts.

 2. Continuous Data • It represents measurements: its values cannot be counted, but they can be measured, and they can take any value within a range.

• Example: Height of a person (which you can describe by using intervals on the real number line), Average Rainfall, Body Temperature 

Qualitative Data/Categorical Data:

1. Nominal Data • Nominal values represent discrete units and are used to label variables that have no quantitative value. Just think of them as “labels.” Note that nominal data has no order. Therefore, if you changed the order of its values, the meaning would not change.

• Example: Gender Type (Male, Female or Others), Language spoken by an individual (English, Spanish, French, Hindi, or Others)

 2. Ordinal Data • Ordinal values represent discrete and ordered units. Ordinal data is therefore nearly the same as nominal data, except that the ordering matters.

• Example: Student’s performance in the exam (Outstanding, Good, Average, Unsatisfactory, Failed). You can associate a rank or an order with each and every label, i.e., Outstanding (1), Good (2) and so on. 

Example of all types of data in a tabular form

Let’s take an example of ‘Student’ table below:

From the above example, we can see all four types of data. ‘Age’ is Discrete, ‘Height’ is Continuous, ‘Sex’ is Nominal and ‘Academic Performance’ is Ordinal data.
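
As a rough illustration of how the four data types behave differently, here is a minimal Python sketch. The rows, the column names (`name`, `age`, `height_cm`, `sex`, `performance`) and the `Performance` class are hypothetical and are not taken from the original table; the ranks follow the Outstanding (1), Good (2) convention used above.

```python
from collections import Counter
from enum import IntEnum


class Performance(IntEnum):
    """Ordinal labels; a lower rank means a better result."""
    OUTSTANDING = 1
    GOOD = 2
    AVERAGE = 3
    UNSATISFACTORY = 4
    FAILED = 5


# Hypothetical rows, invented for illustration only.
students = [
    {"name": "A", "age": 20, "height_cm": 172.4, "sex": "Female",
     "performance": Performance.GOOD},
    {"name": "B", "age": 21, "height_cm": 165.0, "sex": "Male",
     "performance": Performance.OUTSTANDING},
    {"name": "C", "age": 20, "height_cm": 180.3, "sex": "Male",
     "performance": Performance.AVERAGE},
]

# Discrete (age): whole-number values that can be tallied.
age_counts = Counter(s["age"] for s in students)

# Continuous (height): arithmetic such as averaging is meaningful.
mean_height = sum(s["height_cm"] for s in students) / len(students)

# Nominal (sex): only equality and counting make sense, not ordering.
sex_counts = Counter(s["sex"] for s in students)

# Ordinal (performance): sorting by rank is meaningful.
best_first = [s["name"] for s in sorted(students, key=lambda s: s["performance"])]

print(age_counts, round(mean_height, 2), sex_counts, best_first)
```

Using an `IntEnum` for the ordinal column is just one possible design choice: it keeps the readable label while still allowing comparisons by rank.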

Sample Data & Population Data

  • A population is the entire group that you want to draw conclusions about. 
  • A sample is the specific group, a subset of the population, that you will collect data from. The sample is smaller than the population.

Sampling Techniques:
  • Simple Random Sampling 
  • Systematic Sampling 
  • Stratified Sampling 
  • Cluster Sampling
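
The four techniques listed above can be sketched with Python's standard `random` module. This is a simplified illustration, not a production sampler: the function names, the `seed` parameter, the population of unit IDs 1 to 100, the even/odd strata, and the ten groups used as clusters are all assumptions made for the example. The systematic version also assumes that `n` is no larger than the population size.

```python
import random


def simple_random_sample(population, n, seed=0):
    """Every unit has the same chance of being selected."""
    return random.Random(seed).sample(population, n)


def systematic_sample(population, n, seed=0):
    """Take every k-th unit after a random start, with k = len(population) // n."""
    k = len(population) // n
    start = random.Random(seed).randrange(k)
    return [population[start + i * k] for i in range(n)]


def stratified_sample(population, stratum_of, fraction, seed=0):
    """Split units into strata, then sample the same fraction from each one."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly pick whole clusters and keep every unit inside them."""
    chosen = random.Random(seed).sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


def parity(unit):
    """Hypothetical stratum label for the demo population."""
    return "even" if unit % 2 == 0 else "odd"


population = list(range(1, 101))  # hypothetical unit IDs 1..100
clusters = {f"group{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

srs = simple_random_sample(population, 10)
systematic = systematic_sample(population, 10)
stratified = stratified_sample(population, parity, 0.1)
clustered = cluster_sample(clusters, 2)
print(len(srs), len(systematic), len(stratified), len(clustered))
```

Each function draws from its own seeded generator so that repeated runs give the same sample; in real work you would usually pass a single shared generator instead.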

Most Popular Sampling Techniques 


Descriptive Statistics:

  • Descriptive statistics describe and summarize the basic features of a dataset from a given study, giving a compact summary of the sample and its measurements. This helps analysts understand the data better.
  • Descriptive statistics represent the available data sample and do not include theories, inferences, probabilities, or conclusions. That’s a job for inferential statistics.

Topics under descriptive statistics: 

  • Measures of central tendency 
  • Measures of variability 
  • Distribution (Also Called Frequency Distribution) 

Let’s start with one topic at a time. 

Measures of Central Tendency:
There are three fundamental concepts under this topic: 
  • Mean 
  • Median 
  • Mode

Let’s look at one concept at a time.
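
As a quick sketch of the three measures, the standard-library `statistics` module can compute each one directly. The `scores` list is hypothetical data chosen so that the three measures differ: the mean is the sum divided by the count, the median is the middle value of the sorted data, and the mode is the most frequent value.

```python
import statistics

scores = [70, 85, 85, 90, 100]  # hypothetical exam scores

mean_score = statistics.mean(scores)      # (70 + 85 + 85 + 90 + 100) / 5 = 86
median_score = statistics.median(scores)  # middle of the sorted values = 85
mode_score = statistics.mode(scores)      # most frequent value = 85

# With an even number of values, the median is the average of the two middle ones.
even_median = statistics.median([70, 85, 90, 100])  # (85 + 90) / 2 = 87.5

print(mean_score, median_score, mode_score, even_median)
```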

       null hypothesis.
