
Data-driven content creation with Python and SQL.


About this project

Project background

The objective of this project was to build a framework that would let me curate data from a variety of sources and create data-driven content programmatically.

Problem objective

Resumoo is a media site with content dedicated to helping individuals write the best resume possible.

The challenge was to replace our editorially driven content with data-driven, programmatically generated articles.

On average, we were spending approximately $95 to $100 per article. After creating about 125 articles, that was no longer sustainable.

We needed a way to create valuable content while drastically reducing cost.

Solution

Step 1: Build the content template. First, we built a content template with placeholders for the data points to inform the data curation process. {https://share.getcloudapp.com/9ZuKKDOl}
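
A template like this can be sketched with Python's standard `string.Template`. The placeholder names (`job_title`, `top_soft_skill`, `hiring_outlook`) and the sentence text below are hypothetical, not the project's actual template. The point is that listing the placeholders tells you which data points need to be curated.

```python
from string import Template

# Hypothetical template text; the real template is the one in the linked screenshot.
ARTICLE_TEMPLATE = Template(
    "<h1>$job_title Resume Guide</h1>\n"
    "<p>Employers most often look for $top_soft_skill in a $job_title.</p>\n"
    "<p>Hiring outlook: $hiring_outlook.</p>"
)


def required_data_points(template: Template) -> list[str]:
    """Return the unique placeholder names in the order they first appear."""
    names = []
    # Template.pattern is the compiled regex that string.Template itself uses.
    for match in template.pattern.finditer(template.template):
        name = match.group("named") or match.group("braced")
        if name and name not in names:
            names.append(name)
    return names


print(required_data_points(ARTICLE_TEMPLATE))
```

Each name in the resulting list corresponds to one thing the scrapers and the database have to supply.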

Step 2: Source the data. I built web scrapers using Python and the Beautiful Soup library to scrape data from the BLS and postjobfree.com. The BLS had job-specific data such as soft skills and the hiring outlook.

The resumes on postjobfree.com, by contrast, had a lot of rich information on resume objectives, summaries and specific technical skills. I ended up curating approximately 100,000 resumes.
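
To give a rough idea of the extraction step, here is a minimal sketch that pulls labeled sections out of a resume page. It uses the standard library's `html.parser` instead of Beautiful Soup so that it runs without any installs, and the markup in `SAMPLE_HTML` is made up; the real pages on postjobfree.com are structured differently.

```python
from html.parser import HTMLParser

# Made-up markup for illustration only.
SAMPLE_HTML = """
<div class="resume">
  <h2 class="section">Objective</h2>
  <p>Seeking a data analyst role.</p>
  <h2 class="section">Skills</h2>
  <p>Python, SQL, Excel</p>
</div>
"""


class ResumeSectionParser(HTMLParser):
    """Collect the text of each <p> under the most recent <h2> heading."""

    def __init__(self):
        super().__init__()
        self.sections = {}        # lowercased heading text -> paragraph text
        self._current_tag = None  # tag whose text is currently being read
        self._heading = None      # last heading seen

    def handle_starttag(self, tag, attrs):
        self._current_tag = tag

    def handle_endtag(self, tag):
        self._current_tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._current_tag == "h2":
            self._heading = text.lower()
        elif self._current_tag == "p" and self._heading:
            self.sections[self._heading] = text


parser = ResumeSectionParser()
parser.feed(SAMPLE_HTML)
print(parser.sections)
```

A real scraper would also need to fetch the pages, handle missing sections and respect each site's terms of use; none of that is shown here.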

Step 3: Parse the data and store it in a database. At this stage, we needed to store the data to be able to retrieve it later and build an article from it.

I set up a database within PostgreSQL and created tables for our certifications, hard and soft skills, objectives and our BLS data.
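
The table layout might look something like the sketch below. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so that it runs anywhere, and every table and column name is an assumption rather than the project's real schema.

```python
import sqlite3

# sqlite3 stands in for PostgreSQL; table and column names are assumptions.
SCHEMA = """
CREATE TABLE certifications (
    id INTEGER PRIMARY KEY, job_title TEXT NOT NULL, name TEXT NOT NULL);
CREATE TABLE skills (
    id INTEGER PRIMARY KEY, job_title TEXT NOT NULL, name TEXT NOT NULL,
    kind TEXT NOT NULL CHECK (kind IN ('hard', 'soft')));
CREATE TABLE objectives (
    id INTEGER PRIMARY KEY, job_title TEXT NOT NULL, body TEXT NOT NULL);
CREATE TABLE bls_data (
    job_title TEXT PRIMARY KEY, hiring_outlook TEXT);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Illustrative rows, not real project data.
conn.executemany(
    "INSERT INTO skills (job_title, name, kind) VALUES (?, ?, ?)",
    [("data analyst", "SQL", "hard"), ("data analyst", "communication", "soft")],
)
conn.commit()


def skills_for(conn, job_title, kind):
    """Return skill names of one kind ('hard' or 'soft') for a job title."""
    rows = conn.execute(
        "SELECT name FROM skills WHERE job_title = ? AND kind = ? ORDER BY name",
        (job_title, kind),
    )
    return [name for (name,) in rows]


print(skills_for(conn, "data analyst", "hard"))
```

Keeping hard and soft skills in one table with a `kind` column is just one option; two separate tables would work equally well.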

Step 4: Build the article compiler. Using Python, SQL and HTML, I built a compiler that connected the content template to the curated data points in our database to generate templated content.
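
In spirit, the compiler is a function that queries the database for one job title and substitutes the results into the HTML template. The sketch below assumes a hypothetical one-table schema and a two-line template, and again uses `sqlite3` in place of PostgreSQL; it is not the project's actual compiler.

```python
import sqlite3
from html import escape
from string import Template

# Hypothetical two-line template with two placeholders.
TEMPLATE = Template("<h1>$job_title Resume Guide</h1>\n<p>Top skills: $skills.</p>")


def compile_article(conn, job_title):
    """Fill the template with the skills stored for one job title."""
    rows = conn.execute(
        "SELECT name FROM skills WHERE job_title = ? ORDER BY name", (job_title,)
    ).fetchall()
    # Escape values so stray characters in scraped text cannot break the HTML.
    skills = ", ".join(escape(name) for (name,) in rows)
    return TEMPLATE.substitute(job_title=escape(job_title.title()), skills=skills)


# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE skills (job_title TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO skills VALUES (?, ?)",
    [("data analyst", "SQL"), ("data analyst", "Excel")],
)

print(compile_article(conn, "data analyst"))
```

Looping this function over every job title in the database is what turns one template into many articles.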

Python wrapped in SQL. This query pulls the number of resumes in our database with neither an objective nor a summary present and formats it as a percentage.
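
A query of that kind could be written roughly as follows. The `resumes` table, its `objective` and `summary` columns and the sample rows are assumptions, and `sqlite3` is used here in place of PostgreSQL.

```python
import sqlite3

# Share of resumes where both the objective and the summary are missing.
QUERY = """
SELECT 100.0 * SUM(CASE WHEN objective IS NULL AND summary IS NULL THEN 1 ELSE 0 END)
       / COUNT(*)
FROM resumes
"""


def pct_without_objective_or_summary(conn):
    """Return the share as a formatted string, or 'n/a' for an empty table."""
    (pct,) = conn.execute(QUERY).fetchone()
    if pct is None:  # SUM over zero rows is NULL
        return "n/a"
    return f"{pct:.1f}%"


# Illustrative rows: two of the four resumes have neither field.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resumes (id INTEGER PRIMARY KEY, objective TEXT, summary TEXT)")
conn.executemany(
    "INSERT INTO resumes (objective, summary) VALUES (?, ?)",
    [("Seeking a role", None), (None, "Experienced analyst"), (None, None), (None, None)],
)

print(pct_without_objective_or_summary(conn))
```

The formatted string can then be dropped straight into a sentence in the article template.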

With this process, I was able to create about 70 articles in total at an approximate cost of $3.70 per article, a savings of about $95 per article.
