Descriptive Statistics

Different types of data

graph TD 
  Data --> Qualitative --> Nominal
    Qualitative --> Ordinal
    Data --> Quantitative --> Discrete
    Quantitative --> Continuous                 

Let's take an example of a collection of shirts 👕 with different attributes, for example:



✅ Color 🟥 🟦 🟩

✅ Pattern 🔯 ❇️ ✳️

✅ Size 👕

✅ Rating ⭐ | ⭐⭐ | ⭐⭐⭐

✅ Price 385 | 319.44 | 674.11

✅ Discount 7.5% | 30.5% | 20%



Even this small dataset contains a lot of variety.

The attributes above can be divided into different types, as described below.



Qualitative Data


✅ Color 🟥 🟦 🟩

✅ Pattern 🔯 ❇️ ✳️

✅ Size 👕

✅ Rating ⭐ ⭐⭐ ⭐⭐⭐


Qualitative or categorical attributes are those which describe the object under consideration using a finite set of discrete classes.



Nominal :

If we take just Color and Pattern, there is no natural ordering among the values of these attributes.

Nominal attributes are those qualitative attributes in which there is no natural ordering in the values that an attribute can take.

Ordinal :

Size and Rating, on the other hand, do have a natural ordering among their values.

Ordinal attributes are those qualitative attributes in which there is a natural ordering in the values that an attribute can take.
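
The difference can be sketched in a few lines of Python. This is only an illustration: the color names, the size labels and the assumed order S < M < L < XL are hypothetical and do not come from the shirt dataset above. A nominal attribute supports only equality checks and counting, whereas an ordinal attribute can also be sorted once its order is spelled out.

```python
# Hypothetical values, for illustration only.
colors = ["red", "blue", "green", "blue"]   # nominal: no natural order
sizes = ["L", "S", "XL", "M", "S"]          # ordinal: S < M < L < XL (assumed)

# The order of an ordinal attribute has to be stated explicitly;
# plain alphabetical sorting would put "L" before "M" before "S".
SIZE_ORDER = ["S", "M", "L", "XL"]


def size_rank(size):
    """Return the position of a size label in the assumed order."""
    return SIZE_ORDER.index(size)


sorted_sizes = sorted(sizes, key=size_rank)
largest = max(sizes, key=size_rank)

# For a nominal attribute, only equality comparisons are meaningful.
blue_count = sum(1 for c in colors if c == "blue")

print(sorted_sizes)
print(largest)
print(blue_count)
```

Asking for the "largest" color would have no sensible answer, which is exactly what makes Color nominal.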


Let's see some examples of nominal and ordinal attributes from different domains:

| Domain | Nominal | Ordinal |
|---|---|---|
| Employee 💆🏽‍♂️ | Gender (Male, Female, Other) | Income Range (Low, Med., High) |
| Healthcare ➕ | Disease (Communicable, Non-communicable) | Health Risk (Small, Med., Large) |
| Agriculture 🚜 | Crop Type (Kharif, Rabi) | Farm Type (Small, Med., Large) |
| Government 🏦 | Nationality (Indian, Nepalese etc.) | Opinion (Agree, Neutral, Disagree) |

Quantitative Data :


✅ Price 385 319.44 674.11

✅ Discount 7.5% 30.5% 20%


All of these attributes have numerical values.


Quick Recap of types of numbers
\[ \text{Whole Numbers } \{0, 1, 2, 3, \dots\} \subset \text{Integers } \{\dots, -4, -3, \dots, 0, \dots, 4, 5, \dots\} \subset \text{Rational Numbers } \{\dots, \frac{1}{2}, \frac{3}{1}, \dots\} \]
\[ \text{Irrational Numbers } \{\pi, \sqrt{2}, \dots\} \subset \text{Real Numbers} \]

Quantitative attributes are those which have numerical values and which are used to count or measure certain properties of a population.


Discrete :

✅ No. of buttons 🔘 : 12 15 17

✅ Days of Delivery 🚚 : 1 4 6

Discrete attributes are those quantitative attributes which can take on only a countable set of distinct numerical values (typically integers).

Continuous :

✅ Price 385 319.44 674.11

✅ Discount 7.5% 30.5% 20%

Continuous attributes are quantitative attributes which can take on fractional values (real numbers).



The observed values need not all be fractional; it is enough that some of them are, or that the attribute can in principle take fractional values.

| Domain | Continuous | Discrete |
|---|---|---|
| Employee 💆🏽‍♂️ | Income tax, gross salary | # of projects, # of family members |
| Healthcare ➕ | Cholesterol level, sugar level | Days of treatment, weeks of pregnancy |
| Agriculture 🚜 | Total yield, acres | # of farmers, # of crops |
| Government 🏦 | GDP, tax rates | # of citizens, # of villages |
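
As a rough illustration, the distinction can be checked on observed values with a small Python helper. The function name `looks_discrete` is hypothetical, and the check is only a heuristic: a continuous attribute may happen to contain nothing but whole numbers in a given sample, so knowledge of what the attribute measures should always take priority.

```python
def looks_discrete(values):
    """Heuristic: True if every observed value is a whole number."""
    return all(float(v).is_integer() for v in values)


buttons = [12, 15, 17]            # number of buttons (a count)
prices = [385, 319.44, 674.11]    # prices (a measurement)

print(looks_discrete(buttons))
print(looks_discrete(prices))
```

Note that 385 on its own is a whole number; Price is still continuous because other prices are fractional.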

Ordinal vs. Discrete


\[ \text{Ratings:} \quad \frac{\text{very poor}}{1} \quad \frac{\text{poor}}{2} \quad \frac{\text{okay}}{3} \quad \frac{\text{good}}{4} \quad \frac{\text{very good}}{5} \]

Let's take an example of ratings where very poor is denoted by 1, poor by 2, and so on.


Why are ratings not discrete (quantitative)?

Although the ratings are expressed as numbers, the notion of distance between them is not well defined, i.e. the difference between very poor and poor need not be the same as the difference between poor and okay.

In simple terms, the distance between good and very good may not be the same as the distance between good and okay, although the difference in the numeric rating is the same.
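
A small Python sketch makes the point concrete. The list of ratings and the alternative coding in `RELABEL` are made up for illustration. Any coding that preserves the order is an equally valid way to number ordinal labels, and the median category survives such a recoding while the mean does not.

```python
import statistics

LABELS = {1: "very poor", 2: "poor", 3: "okay", 4: "good", 5: "very good"}
ratings = [1, 2, 2, 4, 5, 5, 5]   # hypothetical ratings

# The median only uses the ordering, so it is meaningful for ordinal data.
median_code = statistics.median_low(ratings)
print(LABELS[median_code])

# The mean assumes equal spacing between the codes.
print(statistics.mean(ratings))

# Recode the same labels with a different, order-preserving numbering.
RELABEL = {1: 1, 2: 2, 3: 3, 4: 10, 5: 20}
relabelled = [RELABEL[r] for r in ratings]

# The median still lands on the code that stands for "good" ...
print(statistics.median_low(relabelled) == RELABEL[median_code])
# ... but the mean now lies between the codes for "good" and "very good".
print(statistics.mean(relabelled))
```

With the original coding the mean falls between okay and good; with the recoding it falls between good and very good, even though the underlying ratings are identical.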


But why bother about data types?

The type of statistical analysis depends on the type of variable.

Let's look at an example to see how the analysis depends on the type of values.


For qualitative attributes, can we answer the questions below?

❌ What is the average color of all the shirts in my catalogue?

❌ What is the average nationality of students in this course?

✅ What is the frequency of the color red?

Similarly, the nature of the data, qualitative or quantitative, determines which statistical tests we can perform. For qualitative attributes the following applies; we will learn about some of these tests in later chapters.

❌ Regression Analysis

✅ Analysis of Variance (ANOVA)

✅ Chi-square test



Similarly, for quantitative attributes we can answer the questions below.

Quantitative Attributes (Discrete)

✅ What is the average value in the dataset?

✅ What is the spread of the data?

✅ What is the frequency of a given value?

✅ Regression Analysis

Quantitative Attributes (Continuous)

✅ Regression Analysis

✅ What is the average value in the dataset?

✅ What is the spread of the data?

❌ What is the frequency of a given value?
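
The checklists above can be tried out with Python's standard `statistics` module. This is a minimal sketch: the list of colors is hypothetical, while the prices are the three values from the shirt example. The most frequent value is defined for qualitative data but an average is not, whereas both average and spread are defined for quantitative data.

```python
import statistics

colors = ["red", "blue", "red", "green"]   # hypothetical qualitative values
prices = [385, 319.44, 674.11]             # prices from the shirt example

# The most frequent value is well defined for categorical data.
most_common_color = statistics.mode(colors)

# An average is not: statistics.mean rejects non-numeric values.
try:
    statistics.mean(colors)
    mean_defined = True
except TypeError:
    mean_defined = False

# For quantitative data both the average and the spread are meaningful.
avg_price = statistics.mean(prices)
price_spread = statistics.stdev(prices)    # sample standard deviation

print(most_common_color, mean_defined)
print(round(avg_price, 2), round(price_spread, 2))
```

The standard deviation is used here as one possible measure of spread; the text above does not prescribe a particular one.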


How to describe Qualitative Data?

One way to describe qualitative data is to describe the frequency of the data.


Frequency of a value

✅ How many times does the color red appear?

The number of times a value appears in the data is called its frequency.

Thus, in a frequency plot:

✅ Horizontal Axis : Value of the categorical attribute

✅ Vertical Axis : Counts of these values

✅ Height of bar : Proportional to count
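
Counting frequencies and drawing a rough text version of such a plot takes only a few lines of Python. The sketch below assumes a hypothetical list of shirt colors and uses `collections.Counter` from the standard library; the bars are drawn sideways with `#` characters, so bar length plays the role of bar height.

```python
from collections import Counter

# Hypothetical colors, for illustration only.
colors = ["red", "blue", "red", "green", "red", "blue"]

# Frequency: how many times each value appears in the data.
frequency = Counter(colors)

# One bar per value; its length is proportional to the count.
for value, count in frequency.most_common():
    print(f"{value:<6} {'#' * count} ({count})")
```

A plotting library would normally be used for a real chart; the text bars are only meant to show what the axes represent.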

One shape commonly seen in frequency plots is the long-tailed distribution.

Frequency Plots (Long-Tailed Distributions)

  • A small number of tall bars at the beginning of the plot

  • A large number of short bars at the end

  • Very common in many real world scenarios

Frequency Plots (Uniform Distribution)

  • All values are equally likely

Relative Frequency Plots

What percentage of farms grow groundnut?

Relative frequencies, i.e. counts expressed as a proportion or percentage of the total, are often easier to interpret than absolute frequencies.
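
As a sketch of how the groundnut question could be answered, the Python snippet below divides each count by the total. The list of crops is invented for illustration and does not reflect real farm data.

```python
from collections import Counter

# Hypothetical main crop of eight farms.
crops = ["groundnut", "rice", "groundnut", "wheat",
         "rice", "groundnut", "maize", "rice"]

counts = Counter(crops)
total = sum(counts.values())

# Relative frequency as a percentage of all farms.
relative = {crop: 100 * n / total for crop, n in counts.items()}

for crop, pct in relative.items():
    print(f"{crop:<10} {pct:.1f}%")
```

The percentages always add up to 100, which is what makes them comparable across datasets of different sizes.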

Grouped Frequency Bar Charts

Has the farming pattern changed across years?

✅ Compare different sets of data

✅ Each bar corresponds to one set
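
A grouped frequency table can be sketched in Python by counting (year, crop) pairs. The records below are hypothetical; each group of text bars compares the same crop across the years, one bar per year.

```python
from collections import Counter

# Hypothetical (year, crop) records, for illustration only.
records = [
    (2020, "rice"), (2020, "rice"), (2020, "groundnut"),
    (2021, "rice"), (2021, "groundnut"), (2021, "groundnut"),
]

# Frequency of each (year, crop) combination.
grouped = Counter(records)

years = sorted({year for year, _ in records})
crops = sorted({crop for _, crop in records})

# One row per crop, one bar per year (one bar per set of data).
for crop in crops:
    cells = []
    for year in years:
        bar = "#" * grouped[(year, crop)]
        cells.append(f"{year}: {bar:<3}")
    print(f"{crop:<10}", "  ".join(cells))
```

In this made-up data, rice drops from two farms to one while groundnut rises from one to two, which is the kind of change a grouped bar chart makes visible.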


We will learn a lot more about plotting, types of distributions, and related topics in later chapters.