Heavy Metal | Python

About this project

Iron Ore is mined in 50 countries and is the primary source of iron for the world's steel industries.

To purify Iron Ore, the ore pulp is mixed with Starch and Amina, reagents that separate the Iron from the other materials. Air is then injected into the liquid mixture, creating bubbles that carry the Iron to the top, where the purified Iron is essentially scraped off.

For this project, I am assuming the role of a Data Analyst for a manufacturing company. There was an incident reported to the flotation plant manager on June 1 regarding one of the Flotation Columns. He wants to be armed with information from that day for the investigation. Using Python,

  • I found no anomalies in % Iron Concentrate
  • % Silica Concentrate and Ore Pulp pH values were as expected
  • Flotation Column 5 flow, where the incident was reported, appeared normal

Data Set

This is real data taken from March 2017 to September 2017 and can be downloaded here. Each row is a time point at either 20-second or 1-hour intervals. The 24 attribute columns include data on Iron concentration, the other chemical concentrations, flow, density, and pH.


I used Python in the Deepnote IDE for this analysis, with the seaborn and matplotlib libraries for visualizations. The full data set has 737,453 rows, and the date column was stored as strings. I converted that column to datetime and filtered the data for June 1. During the morning scrum meeting, the engineering team suggested that we look not just at Iron concentration but also at Silica and pH in Column 5, so I filtered the data set down to those attributes as well.
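The date conversion and filtering can be sketched roughly as follows. This is a minimal illustration rather than the exact notebook code: the column names (`date`, `% Iron Concentrate`, `% Silica Concentrate`, `Ore Pulp pH`, `Flotation Column 05 Air Flow`), the helper `filter_day`, and the four-row stand-in table are all assumptions made so the example runs without the real file.

```python
import pandas as pd

# Assumed column names; check them against the actual file before reuse.
COLUMNS = [
    "% Iron Concentrate",
    "% Silica Concentrate",
    "Ore Pulp pH",
    "Flotation Column 05 Air Flow",
]

# Tiny made-up stand-in for the real data set, with the date stored as text.
raw = pd.DataFrame({
    "date": ["2017-05-31 23:00:00", "2017-06-01 00:00:00",
             "2017-06-01 18:00:00", "2017-06-02 00:00:00"],
    "% Iron Concentrate": [65.1, 64.8, 66.2, 65.0],
    "% Silica Concentrate": [2.3, 2.6, 1.4, 2.2],
    "Ore Pulp pH": [9.8, 9.9, 10.1, 9.7],
    "Flotation Column 05 Air Flow": [300.1, 299.8, 301.2, 300.4],
})


def filter_day(df, day, columns):
    """Parse the date strings and keep only rows from `day` and the chosen columns."""
    out = df.copy()
    out["date"] = pd.to_datetime(out["date"])
    # normalize() drops the time of day so each row can be compared to a single date
    mask = out["date"].dt.normalize() == pd.Timestamp(day)
    return out.loc[mask, ["date", *columns]].reset_index(drop=True)


june_1 = filter_day(raw, "2017-06-01", COLUMNS)
print(june_1)
```

With the real file, `raw` would come from `pd.read_csv` instead of the inline table.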

For an overview of the relationships between these factors, I created a Pair Grid. No obvious correlations or anomalies were observed, which was confirmed with a correlation matrix.

[Figures: Pair Plot, Correlation Matrix]

To dig deeper, I extracted the hour from the date and added it as a new column, then used a for loop to create a line plot of each variable by hour of the day.

According to the engineers, there is nothing unexpected in these graphs: Iron and Silica concentrations are inversely related, and the changes in those levels appear to coincide with the pulp being replaced in the column around hour 18.


Flotation Column 5, where the incident was reported, operated as expected on June 1.
