

Population Pyramid — Interesting Visualization of Population Statistics using R


Last Updated on July 26, 2023 by Editorial Team

Author(s): Supriya Ghosh

Originally published on Towards AI.

Data Visualization

Pic on Unsplash by Eugene Tkachenko

The proverb “A picture is worth a thousand words” is put to the test whenever we try to communicate technical complexity through visuals. Good visuals improve our understanding of a problem and often reveal insights we had not thought of before. The driving force behind such visualizations is to extract as much value and information as possible from the data for further analysis, a goal shared by experts across fields.

One such visual is the population pyramid.


What are these Population Pyramids?


What benefits do they bring?

Let us understand in detail.

Population pyramids are used extensively by demographers to understand the composition or structure of a population within a specific area. The area can be a city, a country, a region, or the whole world. Ecologists, sociologists, and economists also use them widely to study and compare populations.

Definition: Population Pyramid

A population pyramid is a graphical representation of the age and sex (gender) composition of a chosen population at a given point in time.

Population pyramids are also called age-sex pyramids or age-sex diagrams, because they represent the age and sex structure of the population of a chosen area. The name comes from their usual shape, which is roughly triangular. However, the shape varies with the age and sex composition of the population.

Characteristics of Population Pyramid

1. A population pyramid displays the total population segmented into age groups, each of which is further subdivided into males and females.

2. Each bar of the pyramid represents a separate cohort within the population.

3. Generally, three types of population pyramids can be created from age-sex distributions — expansive, constrictive, and stationary.

4. Expansive represents larger numbers or percentages of the population in the younger age groups, constrictive displays lower numbers or percentages of the population in the younger age groups, and stationary or near-stationary represents somewhat equal numbers or percentages for almost all age groups.

5. The graphic runs from the youngest age group at the bottom (the base of the pyramid) to the oldest at the top (the apex of the pyramid).

6. Fertility is the most important factor shaping the pyramid: a greater number of children per parent gives a broader base and a younger median age.

7. Mortality also influences the shape. Its effect is much smaller than that of fertility, but more complex.

8. A broad base means that younger age groups make up a relatively large proportion of the population, while a narrow or pointed top means that older age groups make up a relatively small proportion.
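
To make the three shapes from points 3 and 4 concrete, here is a rough sketch in R. The helper `classify_pyramid()`, its 20% tolerance, and the counts are all hypothetical. The cut-off is an arbitrary illustration, not a demographic standard. It simply compares the average size of the youngest third of the age groups with the middle third.

```r
# Hypothetical helper: label a pyramid shape from total counts per age group,
# ordered youngest to oldest. The tolerance is an arbitrary illustration.
classify_pyramid <- function(counts, tolerance = 0.20) {
  stopifnot(length(counts) >= 3)
  n <- length(counts)
  third <- max(1, floor(n / 3))
  young <- mean(counts[seq_len(third)])             # base of the pyramid
  middle <- mean(counts[(third + 1):(n - third)])   # middle age groups
  ratio <- young / middle
  if (ratio > 1 + tolerance) {
    "expansive"      # base clearly wider than the middle
  } else if (ratio < 1 - tolerance) {
    "constrictive"   # base clearly narrower than the middle
  } else {
    "stationary"     # base and middle roughly equal
  }
}

# Made-up counts for six age groups, youngest first
classify_pyramid(c(100, 90, 70, 50, 30, 10))  # "expansive"
classify_pyramid(c(40, 45, 60, 62, 50, 30))   # "constrictive"
classify_pyramid(c(50, 50, 50, 50, 45, 30))   # "stationary"
```

A real classification would look at the whole age distribution, so treat this only as a way to see how the base of the pyramid drives its label.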

Benefits of the Population Pyramid

1. A population pyramid conveys a considerable amount of information about fertility, mortality, migration, the male-female ratio, the number of dependents (children, elderly people, etc.), and other population dynamics in the chosen area.

2. They reveal growth or decline in the population over time and form a foundation for tracking major demographic shifts caused by events such as disease, disasters, and other crises. This in turn allows researchers to predict economic needs from these patterns, gauge the level of development of the area, and compare historical and projected trends easily.

3. Population pyramids are useful for examining historical and current population trends as well as forecasting future trends to plan for future development. In turn, this helps governments and private organizations plan the distribution of services for specific areas based on population needs.

4. Multiple Population Pyramids can be used to compare patterns across nations or selected population groups.

5. Population pyramids are quite handy tools and provide very effective graphical presentations. Their greatest strength lies in being easily understandable by almost everyone, regardless of statistical and mathematical skills.
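
Two of the quantities mentioned in point 1, the male-female ratio and the number of dependents, can be computed directly from the same age-sex table that feeds a pyramid. The sketch below uses hypothetical counts. The age bands (0-14, 15-64, 65+) follow the common definition of the dependency ratio and are an assumption here, not something prescribed by this article.

```r
# Hypothetical counts (in thousands) for three broad age bands; not real data.
pop <- data.frame(
  age_band = c("0-14", "15-64", "65+"),
  males    = c(120, 310, 40),
  females  = c(115, 315, 55)
)
pop$total <- pop$males + pop$females

# Sex ratio: males per 100 females in the whole population.
sex_ratio <- 100 * sum(pop$males) / sum(pop$females)

# Dependency ratio: children plus elderly per 100 working-age people.
dependents <- sum(pop$total[pop$age_band %in% c("0-14", "65+")])
working_age <- pop$total[pop$age_band == "15-64"]
dependency_ratio <- 100 * dependents / working_age

round(sex_ratio, 1)         # 96.9
round(dependency_ratio, 1)  # 52.8
```

A sex ratio below 100 shows up in the pyramid as slightly longer bars on the female side. A high dependency ratio shows up as a wide base, a wide top, or both, relative to the middle.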

Interpreting the Population Pyramid

A population pyramid consists of paired back-to-back horizontal bar graphs, stacked on top of one another, that display the numbers or percentages of males and females in each age group. By convention, males are displayed on the left and females on the right. The numbers or percentages of the population are shown on the horizontal axis and the age groups on the vertical axis. A vertical line in the middle of the graph separates the male bars from the female bars. Age is often grouped into categories of five years or so. The youngest age group is represented by the bottom-most bar and the oldest by the uppermost bar. The length of each horizontal bar depicts the number or percentage of males or females in that age group for the chosen population. If females outnumber males in an age group, the bar on the right side of the middle axis is longer than the bar on the left side, and vice versa.

Population pyramids that are to be compared should be drawn to the same scale and should use the same age categories.
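
The layout described above can be sketched with base R graphics alone. This is a minimal illustration under stated assumptions: the counts are made up, and `draw_pyramid()` and `common_limit()` are hypothetical helpers, not functions from any package. Males are plotted as negative values so that their bars extend to the left, and a shared axis limit keeps several pyramids on the same scale.

```r
# Made-up counts per age group, youngest first
age_groups <- c("0-19", "20-39", "40-59", "60-79", "80+")
males   <- c(130, 120, 100, 60, 15)
females <- c(125, 118, 104, 70, 25)

# Hypothetical helper: one axis limit shared by every series to be compared
common_limit <- function(...) max(unlist(list(...)))

# Hypothetical helper: back-to-back horizontal bars, males left, females right
draw_pyramid <- function(males, females, labels, limit, main = "") {
  # Negative values push the male bars to the left of the centre line
  barplot(-males, horiz = TRUE, names.arg = labels, las = 1,
          xlim = c(-limit, limit), col = "#009999", axes = FALSE, main = main)
  barplot(females, horiz = TRUE, add = TRUE, col = "#FF9999", axes = FALSE)
  ticks <- pretty(c(-limit, limit))
  axis(1, at = ticks, labels = abs(ticks))  # show magnitudes on both sides
  abline(v = 0)                             # central dividing line
  invisible(c(-limit, limit))
}

limit <- common_limit(males, females)
draw_pyramid(males, females, age_groups, limit, main = "Hypothetical population")
```

To compare two populations, pass all four vectors to `common_limit()` and reuse the resulting `limit` in both `draw_pyramid()` calls, with the same `age_groups` in each.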

Going further, let us now discuss how we can create such pyramids in R.

Method 1 (Using ‘pyramid’ package in ‘R’)

# Building the population pyramid
library(readxl)   # provides read_xlsx()
library(pyramid)  # provides pyramid()
# Read the population pyramid data frame
Pyramid_DF <- read_xlsx("Pyramid_data_frame.xlsx")
View(Pyramid_DF)
# Build the pyramid from the number of males and females
# pyramid() expects the columns in this order: left bars (males), right bars (females), age labels
Pyramid_DF1 <- data.frame(Pyramid_DF$Number_of_Males, Pyramid_DF$Number_of_Females, Pyramid_DF$Age_Category)
pyramid(Pyramid_DF1, Rcol = "#FF9999", Lcol = "#009999",
main = "Population Pyramid using total number of population",
Llab = "No. of Males", Rlab = "No. of Females", Clab = "Age_Group")
# Build the pyramid from the percentage of males and females
Pyramid_DF2 <- data.frame(Pyramid_DF$Percentage_of_Males, Pyramid_DF$Percentage_of_Females, Pyramid_DF$Age_Category)
pyramid(Pyramid_DF2, Rcol = "green", Lcol = "yellow",
main = "Population Pyramid using total percentage of population",
Llab = "Percentage of Males", Rlab = "Percentage of Females", Clab = "Age_Group")
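
The Method 1 script reads its input from Pyramid_data_frame.xlsx, which is not included here. To try the script without that file, a stand-in data frame with the same column names can be built in memory. The counts below are made up. The percentage columns are computed as each group's share of the total population, which is an assumption about how the original percentages were defined.

```r
# Hypothetical stand-in for Pyramid_data_frame.xlsx; the counts are made up.
Pyramid_DF <- data.frame(
  Age_Category      = c("0-14", "15-29", "30-44", "45-59", "60-74", "75+"),
  Number_of_Males   = c(1200, 1100, 950, 800, 500, 200),
  Number_of_Females = c(1150, 1080, 960, 830, 560, 270)
)

# Assumption: percentages are shares of the whole population (males + females).
total_population <- sum(Pyramid_DF$Number_of_Males) + sum(Pyramid_DF$Number_of_Females)
Pyramid_DF$Percentage_of_Males   <- 100 * Pyramid_DF$Number_of_Males / total_population
Pyramid_DF$Percentage_of_Females <- 100 * Pyramid_DF$Number_of_Females / total_population

head(Pyramid_DF)
```

With this `Pyramid_DF` in place of the `read_xlsx()` call, the rest of the script should run unchanged. Rows are ordered from the youngest age group to the oldest.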

Data-frame representation in tabular form

Population Pyramid Screenshot

Image depicts the population pyramid based on the number of males and females
Image depicts the population pyramid based on the percentage of males and females

Method 2 (Using ‘plotrix’ package in ‘R’)

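library(plotrix)  # NOTE: all data values below are illustrative placeholders; the article's inputs are not shown
agelabels <- c("0-9", "10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70-79", "80-89", "90+")
male_pct <- c(6.8, 6.9, 7.2, 7.4, 6.9, 6.1, 4.6, 2.8, 1.1, 0.2)
female_pct <- c(6.5, 6.6, 7.0, 7.3, 7.0, 6.3, 5.0, 3.4, 1.7, 0.4)
mcol <- "#009999"; fcol <- "#FF9999"  # assumed bar colours for males and females
par(mar = pyramid.plot(male_pct, female_pct, labels = agelabels,
main = "Population pyramid", lxcol = mcol, rxcol = fcol))
# three-column matrices (expanding the pyramid beyond male-female numbers)
malecook <- matrix(rep(c(9, 8, 7, 6, 5, 5, 6, 7, 8, 9), 3), ncol = 3)
femalecook <- matrix(rep(c(8, 8, 7, 6, 6, 5, 6, 7, 7, 8), 3), ncol = 3)
# group by age
oldmar <- pyramid.plot(malecook, femalecook, labels = agelabels,
unit = "Bowls per month", lxcol = c("#ff99ff", "#66ff66", "#00ffff"),
rxcol = c("#ff99ff", "#66ff66", "#00ffff"), gap = 4)
# put a box around it
box()
# give it a title
mtext("Recipe temperature bearable by age and sex of consumer", 3, 2, cex = 1.5)
# add a legend
legend(par("usr")[1], 11, c("Very hot", "Optimal", "Too Cold"), fill = c("#ff99ff", "#66ff66", "#00ffff"), bty = 'n', cex = 0.7)
# restore the margins and background
par(mar = oldmar, bg = "transparent")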

Population Pyramid Screenshot

Image depicts a population pyramid extended with stacked bars that show the recipe temperature tolerated by each age and sex group

I hope the concept is clear with the above code and visuals.

Final Thoughts

Population pyramids are handy tools for working with age-sex distributions and are used frequently by demographers, researchers, ecologists, sociologists, and others. They provide very effective graphical presentations. Probably their greatest asset is that they are easily understood by almost everyone, regardless of statistical and mathematical skills.

Thanks for reading !!!

You can follow me on Medium as well as

LinkedIn: Supriya Ghosh

And Twitter: @isupriyaghosh

