Quartiles, Quantiles and Percentiles

Introduction

Suppose you have a list of countries and their populations. The quantile rank (and the percentile rank) of your country corresponds to the fraction of countries whose population is lower than or equal to your country's.

The difference is that the quantile goes from 0 to 1, and the percentile goes from 0% to 100%.

  • 0.25 quantile = 25th percentile = lower quartile
  • 0.5 quantile = 50th percentile = median
  • 0.75 quantile = 75th percentile = upper quartile
  • etc.

So if 75% of the countries in the world have a population lower than or equal to your country's, it is

  • in the 0.75 quantile
  • in the 75th percentile
  • in the upper quartile.
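
The correspondence between quantiles, percentiles and quartiles can be checked numerically. The snippet below is a minimal sketch on made-up numbers (the `values` array is hypothetical, not the population data); it relies on NumPy's `np.quantile`, which takes a fraction between 0 and 1, and `np.percentile`, which takes a value between 0 and 100.

```python
import numpy as np

# Hypothetical toy data, not the World Bank dataset
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# np.quantile expects a fraction in [0, 1]
lower_quartile = np.quantile(values, 0.25)
median = np.quantile(values, 0.5)
upper_quartile = np.quantile(values, 0.75)

# np.percentile expects a value in [0, 100] and returns the same cut points
assert lower_quartile == np.percentile(values, 25)
assert median == np.percentile(values, 50)
assert upper_quartile == np.percentile(values, 75)

print(lower_quartile, median, upper_quartile)  # 3.0 5.0 7.0
```

Both functions interpolate linearly between data points by default, so on other data the cut points need not be values that actually occur in the array.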

Let’s compute the quantile rank of your country.

Practice

import pandas as pd
import numpy as np

We will use a simplified version of the World Bank population per country dataset; the original CSV file is available here.

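# load the simplified dataset and drop rows with missing values
df = pd.read_csv("../data/countries-population-2018.csv")
df = df.dropna()
# populations are whole numbers, so store them as integers
df['population'] = df['population'].astype(int)
# save the cleaned data back to the same file
df.to_csv("../data/countries-population-2018.csv", index=False)
df.head(3)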
       country  population
0        aruba      105845
1  afghanistan    37172386
2       angola    30809762

def QuantileRank(df, country):

    # your country's population (taken from the first matching row)
    population = int(df.loc[df['country'] == country, 'population'].iloc[0])
    # countries with a population lower than or equal to your country's
    lower = df[df['population'] <= population]
    # number of such countries
    n_lower = len(lower.index)
    # total number of countries
    n_countries = len(df.index)
    # quantile rank
    quantile_rank = n_lower/n_countries
    return quantile_rank

def PercentileRank(df, country):
    
    # This is just the quantile rank, times 100
    quantile_rank = QuantileRank(df, country)
    percentile_rank = 100.0*quantile_rank
    return percentile_rank

Canada is in the 81st percentile

PercentileRank(df, 'canada')
81.73076923076923

India is in the 99th percentile

PercentileRank(df, 'india')
99.51923076923077
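
As a cross-check, pandas can compute the same kind of rank for every row at once. The sketch below assumes that `Series.rank` with `method="max"` and `pct=True` matches the "lower than or equal to" definition used here; it runs on a small made-up table (`toy` and its values are hypothetical), not on the population file.

```python
import pandas as pd

# Hypothetical toy data, not the World Bank dataset
toy = pd.DataFrame({
    "country": ["a", "b", "c", "d"],
    "population": [10, 20, 20, 40],
})

# method="max" gives each row the number of rows with a value <= its own
# (tied rows all receive the highest rank of the tie);
# pct=True divides that count by the total number of rows
toy["quantile_rank"] = toy["population"].rank(method="max", pct=True)

# the percentile rank is the quantile rank times 100
toy["percentile_rank"] = 100.0 * toy["quantile_rank"]

print(toy)
```

Here the two countries with a population of 20 both get a quantile rank of 0.75, since three of the four rows have a population lower than or equal to 20.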

Full code on my GitHub here.