Introduction
Suppose you have a list of countries and their populations. The quantile rank (and percentile rank) of your country corresponds to the fraction of countries whose population is lower than or equal to your country's.
The difference is that the quantile goes from 0 to 1, and the percentile goes from 0% to 100%.
- 0.25 quantile = 25th percentile = lower quartile
- 0.5 quantile = 50th percentile = median
- 0.75 quantile = 75th percentile = upper quartile
- etc.
So if your country has more inhabitants than 75% of the other countries in the world, it is
- in the 0.75 quantile
- in the 75th percentile
- in the upper quartile.
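As a quick check of the equivalences listed above, here is a minimal sketch using NumPy on made-up numbers (the array `values` is an assumption for illustration, not part of the population dataset). Note that `np.quantile` and `np.percentile` go from a fraction to a value, which is the reverse direction of the quantile rank computed in this post.

```python
import numpy as np

# Toy data, made up for illustration: the integers 1 through 9
values = np.arange(1, 10)

# np.quantile takes a fraction in [0, 1];
# np.percentile takes the same fraction expressed on a 0-100 scale
lower_quartile = np.quantile(values, 0.25)
median = np.quantile(values, 0.5)
upper_quartile = np.quantile(values, 0.75)

print(lower_quartile, np.percentile(values, 25))  # 0.25 quantile vs 25th percentile
print(median, np.percentile(values, 50))          # 0.5 quantile vs 50th percentile
print(upper_quartile, np.percentile(values, 75))  # 0.75 quantile vs 75th percentile
```

Each line should print the same number twice, since both functions describe the same position in the sorted data.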
Let’s compute the quantile rank of your country.
Practice
import pandas as pd
import numpy as np
We will use a simplified version of the World Bank population per country dataset – the original CSV file is available here.
df = pd.read_csv("../data/countries-population-2018.csv")
df = df.dropna()
df['population'] = df['population'].apply(lambda x: int(x))
df.to_csv("../data/countries-population-2018.csv", index=False)
df.head(3)
| | country | population |
|---|---|---|
| 0 | aruba | 105845 |
| 1 | afghanistan | 37172386 |
| 2 | angola | 30809762 |
def QuantileRank(df, country):
    # your country's population (.iloc[0] extracts the single matching value)
    population = int(df[df['country'] == country]['population'].iloc[0])
    # countries with a population lower than or equal to your country's
    lower = df[df['population'] <= population]
    # number of such countries
    n_lower = len(lower.index)
    # total number of countries
    n_countries = len(df.index)
    # quantile rank
    quantile_rank = n_lower / n_countries
    return quantile_rank


def PercentileRank(df, country):
    # this is just the quantile rank, times 100
    quantile_rank = QuantileRank(df, country)
    percentile_rank = 100.0 * quantile_rank
    return percentile_rank
Canada is in the 81st percentile
PercentileRank(df, 'canada')
81.73076923076923
India is in the 99th percentile
PercentileRank(df, 'india')
99.51923076923077
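To rank every country at once rather than one at a time, pandas' `Series.rank` can produce the same quantity. The sketch below is an alternative to the functions above, not the method used in this post, and it runs on a hypothetical toy table (the names `toy`, `quantile_rank` and `percentile_rank` and the population values are made up for illustration). It relies on `method="max"`, which gives each row the number of rows with a value lower than or equal to its own, and `pct=True`, which divides that count by the number of rows.

```python
import pandas as pd

# Hypothetical toy data, not taken from the World Bank file
toy = pd.DataFrame({
    "country": ["a", "b", "c", "d"],
    "population": [10, 20, 30, 40],
})

# method="max": rank = number of rows with population <= this row's population
# pct=True: divide the rank by the number of rows, giving a value in (0, 1]
toy["quantile_rank"] = toy["population"].rank(method="max", pct=True)

# the percentile rank is the quantile rank times 100
toy["percentile_rank"] = 100.0 * toy["quantile_rank"]

print(toy)
```

With four distinct populations, the quantile ranks come out as 0.25, 0.5, 0.75 and 1.0, matching the "lower than or equal" definition used in `QuantileRank`.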
Full code on my GitHub here.