Descriptive Statistics Explained | Types, Formulas, and Real-Life Examples

What Are Descriptive Statistics

Descriptive statistics are a set of statistical tools used to describe, summarise, and present data in a meaningful way. Rather than drawing conclusions beyond the data itself, they focus on showing what the data reveals about a particular group or situation.

In simple terms, descriptive statistics help transform raw data into clear insights through numbers, tables, and graphs. They:

Simplify complex information
Make it easier to understand patterns and averages within a dataset
Serve as the first step in data analysis
Allow researchers to summarise findings before moving on to deeper inferential techniques

Example Of Descriptive Statistics In Research

Imagine you surveyed 100 students about their study hours per week. Using descriptive statistics, you could calculate the average (mean) number of study hours, find the most common (mode) value, and identify the spread (standard deviation) of the data. This summary gives a clear overview of students’ study habits without making predictions, which is where inferential statistics would come in.

Types Of Descriptive Statistics

Descriptive statistics are generally divided into four main types:

Measures of central tendency
Measures of dispersion
Measures of frequency and distribution
Measures of position

A. Measures of Central Tendency

These measures identify the centre or average point of a dataset. They summarise where most data points cluster. The three main types are:

Mean: The arithmetic average of all values.

Mean Example: If students scored 70, 75, and 80, the mean score is (70 + 75 + 80) ÷ 3 = 75.

Median: The middle value when data is arranged in order.

Median Example: For scores 60, 70, 80, the median is 70.

Mode: The value that occurs most frequently.

Mode Example: If scores are 65, 70, 70, 80, the mode is 70.
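
The three examples above can be checked with Python's built-in statistics module. This is a minimal sketch; the variable names are illustrative and not part of any standard.

```python
import statistics

# Scores taken from the three examples above.
mean_scores = [70, 75, 80]
median_scores = [60, 70, 80]
mode_scores = [65, 70, 70, 80]

mean_value = statistics.mean(mean_scores)        # arithmetic average
median_value = statistics.median(median_scores)  # middle value after sorting
mode_value = statistics.mode(mode_scores)        # most frequent value

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```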

B. Measures of Dispersion (Variability)

While central tendency tells us the “middle,” measures of dispersion explain how spread out the data is.

Range: The difference between the highest and lowest values.

Example: If the highest mark is 90 and the lowest is 60, the range is 30.

Variance: The average of the squared differences between each value and the mean.
Standard Deviation: The most common measure of variability, calculated as the square root of the variance and expressed in the same units as the data. It shows roughly how far a typical data point lies from the mean. A higher standard deviation indicates that values are more spread out, while a lower one means they are closer to the mean.
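
To see why the centre alone is not enough, the sketch below compares two groups of marks that share the same mean but differ in spread. Both lists are hypothetical values chosen for illustration, and value_range is a helper name introduced here.

```python
import statistics

# Hypothetical marks: both groups have a mean of 75 but differ in spread.
tight_marks = [74, 75, 76]
wide_marks = [60, 75, 90]

def value_range(values):
    """Range: highest value minus lowest value."""
    return max(values) - min(values)

for label, marks in (("tight", tight_marks), ("wide", wide_marks)):
    print(
        label,
        "| mean:", statistics.mean(marks),
        "| range:", value_range(marks),
        "| population SD:", round(statistics.pstdev(marks), 2),
    )
```

The group with the wider range also has the larger standard deviation, even though the two means are identical.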

C. Measures of Frequency and Distribution

These describe how often each value or category appears in a dataset. Researchers use frequency tables, bar charts, histograms, and pie charts to visualise this distribution.

Example: A frequency table showing how many students fall into different grade ranges (A, B, C, D) helps identify performance trends quickly.
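
A frequency table like the one described can be built with collections.Counter. The grades below are hypothetical and only serve to illustrate the idea.

```python
from collections import Counter

# Hypothetical letter grades for 12 students (illustration only).
grades = ["A", "B", "B", "C", "A", "B", "D", "C", "B", "A", "C", "B"]

frequency = Counter(grades)  # maps each grade to how often it appears
total = len(grades)

# Print a simple frequency table with counts and relative frequencies.
print("Grade  Count  Share")
for grade in sorted(frequency):
    count = frequency[grade]
    print(f"{grade:<6} {count:<6} {count / total:.0%}")
```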

D. Measures of Position

These indicate where a particular value lies within a dataset.

Percentiles: Show the relative standing of a value. For example, scoring in the 90th percentile means performing better than 90% of participants.
Quartiles: Divide data into four equal parts, helping detect data spread and outliers.

Ranks: Assign numerical positions to values, often used in competitive analysis or performance ranking.
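
The following sketch shows one way to compute the three measures of position in Python. The scores are hypothetical, percentile_rank is a helper defined here using the "percentage of values below" reading from the text, and quartile cut points follow the statistics module's default method (other conventions give slightly different values).

```python
import statistics

# Hypothetical exam scores for ten participants (illustration only).
scores = [52, 58, 61, 65, 70, 74, 78, 83, 88, 95]

def percentile_rank(values, score):
    """Percentage of values strictly below `score`."""
    below = sum(1 for v in values if v < score)
    return 100 * below / len(values)

# Three cut points that split the data into four parts.
q1, q2, q3 = statistics.quantiles(scores, n=4)

# Rank 1 goes to the highest score.
ordered = sorted(scores, reverse=True)
ranks = {score: position for position, score in enumerate(ordered, start=1)}

print("Percentile rank of 95:", percentile_rank(scores, 95))
print("Quartile cut points:", q1, q2, q3)
print("Rank of 88:", ranks[88])
```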

Descriptive Statistics Formulas And Examples

Below are the basic formulas for mean, median, mode, variance, and standard deviation, with simple numeric examples and step-by-step calculations.

1. Mean (Arithmetic Average)

Formula (population or sample mean):

$\bar{x} = \frac{\sum x_i}{n}$

Example dataset: 4, 8, 6, 5, 3

Step-by-step calculation

Sum the values: 4 + 8 + 6 + 5 + 3 = 26
Count the values: n = 5
Divide: x̄ = 26 ÷ 5 = 5.2

Result: Mean = 5.2
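
The same three steps (sum, count, divide) can be written directly in Python. This is a small sketch using the example dataset; the variable names are illustrative.

```python
values = [4, 8, 6, 5, 3]

total = sum(values)  # step 1: sum the values
n = len(values)      # step 2: count the values
mean = total / n     # step 3: divide the sum by the count

print(f"sum = {total}, n = {n}, mean = {mean}")
```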

2. Median (Middle Value)

Procedure: Sort values and pick the middle. If n is even, median = average of the two middle values.

Example A (odd n): 4, 8, 6, 5, 3

Sort: 3, 4, 5, 6, 8
Middle value (3rd of 5) = 5

Example B (even n): 3, 4, 5, 6

Sort: 3, 4, 5, 6 (already sorted)
Middle two values = 4 and 5 → median = (4 + 5) ÷ 2 = 4.5
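
The procedure above translates into a short function. This is a sketch of the sort-and-pick-the-middle rule rather than a replacement for a library routine; the function name median is chosen here for readability.

```python
def median(values):
    """Return the middle value, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one value")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average of two

print(median([4, 8, 6, 5, 3]))  # Example A
print(median([3, 4, 5, 6]))     # Example B
```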

3. Mode (Most Frequent Value)

The value(s) that occur most often are called the mode. A dataset may have one mode, multiple modes, or no mode.

Example: 2, 3, 3, 5, 7 → mode = 3 (appears twice)
Example (no mode): 4, 8, 6, 5, 3 → no value repeats → no mode
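
A sketch of mode-finding that follows the convention used here, where a dataset with no repeated value has no mode. The function name modes is an assumption of this example.

```python
from collections import Counter

def modes(values):
    """Return all most-frequent values, or an empty list if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # convention used in this article: no repeats means no mode
    return [value for value, count in counts.items() if count == highest]

print(modes([2, 3, 3, 5, 7]))  # one mode
print(modes([4, 8, 6, 5, 3]))  # no mode
print(modes([1, 1, 2, 2, 3]))  # two modes
```

Note that Python's statistics.multimode follows a different convention: when no value repeats, it returns every value rather than an empty list.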

4. Variance (Average Squared Deviation)

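There are two common versions:

Population variance (σ²): $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
Sample variance (s²): $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$

Here μ is the population mean, x̄ is the sample mean, N is the population size, and n is the sample size.

Use the population formula when you have the entire population. Use the sample formula when your data is a sample from a larger population.

Example dataset (same as earlier): 4, 8, 6, 5, 3; mean x̄ = 5.2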

Step-by-step calculation of squared deviations

Compute deviations from the mean:
- 4 − 5.2 = −1.2 → squared = 1.44
- 8 − 5.2 = 2.8 → squared = 7.84
- 6 − 5.2 = 0.8 → squared = 0.64
- 5 − 5.2 = −0.2 → squared = 0.04
- 3 − 5.2 = −2.2 → squared = 4.84
Sum squared deviations: 1.44 + 7.84 + 0.64 + 0.04 + 4.84 = 14.80
Population variance (divide by N = 5): σ^2 = 14.80 ÷ 5 = 2.96
Sample variance (divide by n − 1 = 4): s^2 = 14.80 ÷ 4 = 3.70

Results: Population variance = 2.96; Sample variance = 3.70
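
The step-by-step calculation can be mirrored in code. The sketch below computes both versions from the same sum of squared deviations; variable names are illustrative.

```python
values = [4, 8, 6, 5, 3]
n = len(values)
mean = sum(values) / n

# Squared deviation of each value from the mean.
squared_deviations = [(x - mean) ** 2 for x in values]
sum_squares = sum(squared_deviations)

population_variance = sum_squares / n    # divide by N
sample_variance = sum_squares / (n - 1)  # divide by n - 1

print("Sum of squared deviations:", round(sum_squares, 2))
print("Population variance:", round(population_variance, 2))
print("Sample variance:", round(sample_variance, 2))
```

The only difference between the two results is the divisor, which is why the sample variance is always the larger of the two for the same data.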

5. Standard Deviation (Square Root Of Variance)

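SD Formulas:

Population standard deviation: $\sigma = \sqrt{\sigma^2}$
Sample standard deviation: $s = \sqrt{s^2}$

Using the variance results above:

Population standard deviation: σ = √2.96 ≈ 1.72
Sample standard deviation: s = √3.70 ≈ 1.92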

Interpretation:

Standard deviation indicates the typical distance of observations from the mean, in the same units as the data. Smaller values indicate data points are closer to the mean, while larger values indicate they are more spread out.
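
As a quick check of the figures above, the sketch below takes the square root of each variance and compares it with the direct functions in Python's statistics module.

```python
import math
import statistics

values = [4, 8, 6, 5, 3]

# Square root of the variance, as in the formulas above.
population_sd = math.sqrt(statistics.pvariance(values))
sample_sd = math.sqrt(statistics.variance(values))

# The module also provides direct functions for both versions.
print("Population SD:", round(population_sd, 2), round(statistics.pstdev(values), 2))
print("Sample SD:", round(sample_sd, 2), round(statistics.stdev(values), 2))
```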

Quick Reference

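Mean: $\bar{x} = \frac{\sum x_i}{n}$
Median: Middle value after sorting (or average of middle two if even n)
Mode: Most frequent value(s)
Population variance: $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
Sample variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
Standard deviation: $\sigma = \sqrt{\sigma^2}$ (population) or $s = \sqrt{s^2}$ (sample)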

Short Worked Example Summary (Dataset: 4, 8, 6, 5, 3)

Mean = 5.2
Median = 5
Mode = none (no repeats)
Population variance = 2.96 → Population SD ≈ 1.72
Sample variance = 3.70 → Sample SD ≈ 1.92

Descriptive Statistics Tools

Here are some of the most widely used descriptive statistics tools that help summarise and interpret data efficiently.

1. Microsoft Excel

Descriptive statistics in Excel are simple to perform using built-in functions like AVERAGE, MEDIAN, MODE, STDEV, and VAR. Note that STDEV and VAR use the sample (n − 1) formulas; STDEV.P and VAR.P give the population versions.

Researchers can also use the “Analysis ToolPak” add-in to automatically generate detailed statistical summaries, including mean, standard deviation, and variance.

Excel’s charts and graphs, like bar charts and histograms, make it easy to visualise trends and compare data points.

2. SPSS (Statistical Package for the Social Sciences)

SPSS is a powerful statistical software widely used in academic and professional research. It allows users to compute descriptive statistics with just a few clicks, generating clear tables for mean, median, mode, and standard deviation.

It is particularly useful for working with large datasets and creating detailed statistical reports that include both descriptive and inferential outputs.

3. R and Python

Both R and Python are advanced programming languages popular in data science and academic research.

They allow researchers to automate descriptive statistics, visualise data using packages like ggplot2 (R) or matplotlib (Python), and perform custom analyses.

For example, you can calculate means and standard deviations across thousands of data points in seconds while producing professional-quality visualisations.
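
As a rough illustration of that point, the sketch below summarises 10,000 values using only Python's standard library. The data are synthetic, generated from a normal distribution with an assumed mean of 50 and standard deviation of 10, so the numbers stand in for a real dataset.

```python
import random
import statistics

# Synthetic data for illustration: 10,000 draws from a normal distribution
# with an assumed mean of 50 and standard deviation of 10.
rng = random.Random(42)  # fixed seed so the run is repeatable
data = [rng.gauss(50, 10) for _ in range(10_000)]

summary = {
    "n": len(data),
    "mean": statistics.fmean(data),
    "median": statistics.median(data),
    "sample_sd": statistics.stdev(data),
    "min": min(data),
    "max": max(data),
}

for name, value in summary.items():
    print(f"{name:>10}: {value:.2f}")
```

For plots, the same list could be passed to a charting package such as matplotlib, which is not shown here.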

4. Google Sheets or Online Calculators

For quick analysis, Google Sheets and free online descriptive statistics calculators offer accessible options.

Google Sheets supports basic statistical functions and simple charts, making it ideal for students and small-scale projects.

Online tools like GraphPad, CalculatorSoup, or Social Science Statistics are convenient for quick calculations when software access is limited.

Descriptive Vs Inferential Statistics

While descriptive statistics summarise existing data, inferential statistics go a step further by drawing conclusions about a larger population based on a sample.

Comparison Table

Comparison Point	Descriptive Statistics	Inferential Statistics
Purpose	Summarises and organises data collected from a sample or population.	Makes predictions or generalisations about a larger population based on a sample.
Focus	Describes what is known and visible in the dataset.	Infers what is unknown and extends findings beyond the data collected.
Techniques	Mean, median, mode, range, variance, standard deviation.	t-tests, ANOVA, regression, correlation, and $\chi^2$ (Chi-Square) tests.
Data Used	The entire dataset or the sample itself.	A sample is used to represent and make conclusions about a larger population.
Visuals	Charts, tables, and graphs (histograms, box plots) to display data distribution.	Confidence intervals, p-values, and hypothesis testing results.
Example Output	“The average height of 100 students is 170 cm.”	“We are 95% confident the average height of all students is between 168 and 172 cm.”
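
The "Example Output" row can be made concrete with a short sketch. The mean of 170 cm and the sample size of 100 come from the table; the standard deviation of 10 cm is an assumption made for this example, and the interval uses a normal approximation (a t-distribution would usually be preferred for small samples).

```python
import math
from statistics import NormalDist

# Mean and n come from the table's example; the standard deviation
# of 10 cm is an assumption made for this sketch.
sample_mean = 170.0
sample_sd = 10.0
n = 100

# Descriptive statement: simply report the sample mean.
print(f"The average height of {n} students is {sample_mean:.0f} cm.")

# Inferential statement: 95% confidence interval for the population mean.
z = NormalDist().inv_cdf(0.975)        # critical value, about 1.96
margin = z * sample_sd / math.sqrt(n)  # critical value times standard error
lower, upper = sample_mean - margin, sample_mean + margin

print(f"95% confidence interval: {lower:.1f} to {upper:.1f} cm")
```

The descriptive line restates what was observed, while the interval makes a claim about all students, which is the step that turns a summary into an inference.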

When to use descriptive vs inferential statistics?

Use descriptive statistics when you want to present and summarise data you already have (e.g., survey results, exam scores).
Use inferential statistics when you aim to predict or test hypotheses about a larger population based on sample data.

Examples

Descriptive example: “The average age of respondents was 28 years.”
Inferential example: “There is a significant difference between the average ages of male and female respondents.”
