__STYLES__
Tools used in this project
45 STEPS FROM WORD TO CSV

About this project

Background

"What is this?! How can I use this address table from MS WORD ? Do I have to manually paste each of these 50 addresses into every letter and envelope?" - I heard from my colleague at our forest district office.

"Maybe I can convert them to a CSV file in Power Query. Then you will have right data for your form letters and envelopes." - I offered.

The purpose of the project was to recreate this real-life situation. I took the list of the most popular names and surnames in Poland, then added some street names and postal codes from Warsaw. The fake addresses were then grouped by category ( type of building) in a Google Docs table. To make things a bit more complicated, I changed the order of some rows and added information about place ( but only in two records).

undefinedHere is a Google Docs file with all the records.

Originally I have done part of the work in excel, and then I finished it in Power Query. It didn`t take more than 40 minutes to be ready with the CSV file needed for generating serial letters, envelopes, and automatically registering correspondence. Without CSV file all of this dull and repetitive work would have to be done manually.

The amount of saved time can be estimated from 3 to 4 hours.

Let`s get down to Power Query

For this project, I decided to do all the stuff in Power Query.

Here is a Power BI file that contains the PQ script.

The first step is to paste the data with the "Enter Data" option. If you want to reuse my script and paste your data you can call out this option from the first step. Here is how to do it:

undefined

The nice surprise is that all the columns from doc are inserted into one column in proper order. Remember that the script was created for the specific data and addresses from htedoc file. Using it with different dataset will require adapting it to the new data.

First, the job was to categorize the data with conditional columns.

After that, the hardest part was to properly assign individual rows to addressees. I used the multiple SELF JOIN for that. The one thing that confused me during the process were the disappearing rows from the first table that didn`t have a match in the second one. I was astonished by this because I always used LEFT JOIN for table merging. After some time I discovered that these rows are secretly thrown to the end of the table. Sorting by the ID field resolved the problem.

After categorizing and ordering the data, I used the pivot option. That was the moment when each of the addressees had only one row in the table.

undefined

The last stage was to prepare the table for serial letters and mail registering system. Address data such as house number, street, or postal code needed to be separated into individual columns. For that, I widely used text after/before/between delimiter and again conditional columns. The final table looked like this:

undefined

Discussion and feedback(0 comments)
2000 characters remaining