
Discovery of the importance of handwashing - Python


About this project

The data records the number of births and deaths at two clinics, labeled "clinic 1" and "clinic 2".

1. Data Preparation:

Importing modules

import pandas as pd

Read datasets/yearly_deaths_by_clinic.csv into yearly

yearly = pd.read_csv("datasets/yearly_deaths_by_clinic.csv")

Print out yearly

print(yearly.head())

2. Calculating the proportion of deaths in Clinics 1 and 2:

Calculate the proportion of deaths per no. births

yearly['proportion_deaths'] = yearly['deaths']/yearly['births']

Extract Clinic 1 data into clinic_1 and Clinic 2 data into clinic_2

clinic_1 = yearly[yearly['clinic'] == 'clinic 1']

clinic_2 = yearly[yearly['clinic'] == 'clinic 2']

Print out clinic_1

print(clinic_1.head())
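
As a quick sanity check of the two steps above, the following sketch applies the same proportion calculation and boolean filtering to a tiny hand-made table. The numbers are invented purely for illustration and are not taken from yearly_deaths_by_clinic.csv; only the column names (year, births, deaths, clinic) are assumed to match the real file.

```python
import pandas as pd

# Toy data: made-up values, NOT the real dataset.
toy = pd.DataFrame({
    "year": [1841, 1842, 1841, 1842],
    "births": [1000, 2000, 1000, 2000],
    "deaths": [100, 300, 50, 80],
    "clinic": ["clinic 1", "clinic 1", "clinic 2", "clinic 2"],
})

# Element-wise division gives one proportion per row.
toy["proportion_deaths"] = toy["deaths"] / toy["births"]

# A boolean mask keeps only the rows that belong to one clinic.
toy_clinic_1 = toy[toy["clinic"] == "clinic 1"]
toy_clinic_2 = toy[toy["clinic"] == "clinic 2"]

print(toy_clinic_1[["year", "proportion_deaths"]])
```

Because the mask compares strings exactly, a label such as "Clinic 1" or "clinic1" would silently produce an empty frame, so it is worth checking the length of each subset after filtering.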

3. Plotting the proportion of deaths in Clinics 1 and 2:

Import matplotlib

import matplotlib.pyplot as plt

This makes plots appear in the notebook

%matplotlib inline

Plot the yearly proportion of deaths at the two clinics

ax = clinic_1.plot(x="year", y="proportion_deaths", label="clinic_1")

clinic_2.plot(x="year", y="proportion_deaths", label="clinic_2", ax=ax, ylabel="proportion_deaths")

Output:

Image attached

Interpretation: The proportion of deaths is consistently higher in Clinic 1 than in Clinic 2. The only difference between the clinics was that mostly medical students served at Clinic 1, while mostly midwife students served at Clinic 2. The midwives only tended to the women giving birth, whereas the medical students also spent time in the autopsy rooms examining corpses.

As a result, handwashing was made mandatory at Clinic 1. The monthly data from Clinic 1 is analyzed below to see whether handwashing had any effect.

4. Loading clinic 1 data:

Read datasets/monthly_deaths.csv into monthly

monthly = pd.read_csv('datasets/monthly_deaths.csv', parse_dates=['date'])

Calculate the proportion of deaths per no. births

monthly["proportion_deaths"] = monthly["deaths"] / monthly["births"]

Print out the first rows in monthly

print(monthly.head())
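
The parse_dates argument matters for the later steps, because the date comparison against handwashing_start only works as intended on real datetime values. The sketch below shows the effect on a small in-memory CSV; the two rows are invented stand-ins, and only the column names are assumed to match monthly_deaths.csv.

```python
import io
import pandas as pd

# Hypothetical two-row CSV standing in for monthly_deaths.csv (invented values).
csv_text = "date,births,deaths\n1841-01-01,200,20\n1841-02-01,250,10\n"

# With parse_dates, the 'date' column is converted to datetime64 on load.
demo = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

print(demo.dtypes)
```

Without parse_dates the column would be loaded as plain strings, and it would need pd.to_datetime before any date-based filtering.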

5. Highlighting decline in the proportion of deaths:

Date when handwashing was made mandatory

handwashing_start = pd.to_datetime('1847-06-01')

Split monthly into before and after handwashing_start

before_washing = monthly[monthly['date'] < handwashing_start]

after_washing = monthly[monthly['date'] >= handwashing_start]

Plot monthly proportion of deaths before and after handwashing

ax = before_washing.plot(x='date', y='proportion_deaths', label='before_washing')

after_washing.plot(x='date', y='proportion_deaths', label='after_washing', ax=ax, ylabel='Proportion deaths')

Output:

Image attached.

Interpretation:

The proportion of deaths drops sharply from mid-1847 onward, which is when handwashing was made mandatory.
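
The before/after split relies on a strict "<" for one group and ">=" for the other, so every month lands in exactly one group and the cut-off month itself counts as "after". A minimal sketch on invented values (toy_monthly and cutoff are illustrative names, not part of the project) makes this visible:

```python
import pandas as pd

# Toy monthly data around the cut-off date (invented values).
toy_monthly = pd.DataFrame({
    "date": pd.to_datetime(["1847-04-01", "1847-05-01", "1847-06-01", "1847-07-01"]),
    "proportion_deaths": [0.18, 0.12, 0.02, 0.01],
})
cutoff = pd.to_datetime("1847-06-01")

# Strictly earlier rows go to "before"; the cut-off month and later go to "after".
toy_before = toy_monthly[toy_monthly["date"] < cutoff]
toy_after = toy_monthly[toy_monthly["date"] >= cutoff]

print(len(toy_before), len(toy_after))
```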

6. Difference in the mean monthly proportion of deaths:

import numpy as np

The difference in mean monthly proportion of deaths due to handwashing

before_proportion = before_washing['proportion_deaths']

after_proportion = after_washing['proportion_deaths']

mean_diff = np.mean(after_proportion) - np.mean(before_proportion)

mean_diff

Output:

-0.08395660751183336

Interpretation:

After handwashing was made mandatory, the mean monthly proportion of deaths was about 8.4 percentage points lower than before.
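
The conversion from the raw output to percentage points is a simple multiplication by 100, sketched here with the value reported above:

```python
# Value reported in the Output above.
mean_diff = -0.08395660751183336

# A difference between two proportions is expressed in percentage points.
percentage_points = mean_diff * 100

print(round(percentage_points, 1))  # -8.4
```

This is an absolute difference. A relative reduction (in percent) would additionally require dividing by the mean proportion before handwashing, which is not printed in this write-up.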

7. Bootstrap analysis:

import numpy as np

A bootstrap analysis of the reduction of deaths due to handwashing

boot_mean_diff = []

for i in range(3000):
    boot_before = before_proportion.sample(frac=1, replace=True)
    boot_after = after_proportion.sample(frac=1, replace=True)
    boot_mean_diff.append(boot_after.mean() - boot_before.mean())

Calculating a 95% confidence interval from boot_mean_diff

confidence_interval = pd.Series(boot_mean_diff).quantile([0.025, 0.975])

confidence_interval

Output:

0.025 -0.101535

0.975 -0.067587

dtype: float64

Final Conclusion:

Based on the 95% bootstrap confidence interval, handwashing reduced the proportion of deaths by between roughly 6.8 and 10.2 percentage points. The exact bounds vary slightly from run to run because the resampling is random.
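
To make the resampling step explicit, here is a dependency-free sketch of the same percentile bootstrap using only the standard library. The function name bootstrap_mean_diff_ci, its seed parameter, and the toy proportions are all assumptions introduced for illustration. Note that it uses simple index-based percentiles, whereas pandas' quantile interpolates, so the bounds can differ slightly from the pandas version.

```python
import random
from statistics import mean

def bootstrap_mean_diff_ci(before, after, n_boot=3000, seed=0):
    """Percentile 95% interval for mean(after) - mean(before)."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        boot_before = rng.choices(before, k=len(before))
        boot_after = rng.choices(after, k=len(after))
        diffs.append(mean(boot_after) - mean(boot_before))
    diffs.sort()
    # Simple percentile cut-offs without interpolation.
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[int(0.975 * n_boot) - 1]
    return lower, upper

# Invented toy proportions, NOT the real monthly data.
toy_before = [0.10, 0.12, 0.09, 0.11, 0.13, 0.08]
toy_after = [0.02, 0.03, 0.01, 0.02, 0.04, 0.03]

lower, upper = bootstrap_mean_diff_ci(toy_before, toy_after)
print(lower, upper)
```

If both bounds are negative, as they are for the real data above, the interval excludes zero, which is what supports the conclusion that the death rate fell after handwashing was introduced.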
