Project goal:
The goal of this SQL data project was to apply data cleaning techniques to the raw Nashville Housing data set.
Data source:
The data set was obtained from the following GitHub page:
PortfolioProjects/Nashville Housing Data for Data Cleaning.xlsx at main · AlexTheAnalyst/PortfolioProjects · GitHub
Data import and tool used:
Imported the Nashville Housing Microsoft Excel file into Microsoft SQL Server using SQL Server Management Studio.
Data set details:
Selected the first 1,000 rows, with all columns, as a relatively small sample to preview the data and identify what needed to be cleaned.
The following column names were found in the sample data set:
- UniqueID
- ParcelID
- LandUse
- PropertyAddress
- SaleDate
- SalePrice
- LegalReference
- SoldAsVacant
- OwnerName
- OwnerAddress
- Acreage
- TaxDistrict
- LandValue
- BuildingValue
- TotalValue
- YearBuilt
- Bedrooms
- FullBath
- HalfBath
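The preview and column listing described above could be reproduced with queries along these lines. This is a minimal sketch, not the project's actual code: the table name dbo.NashvilleHousing is an assumption, since the write-up does not state the name used at import.

```sql
-- Hypothetical table name: dbo.NashvilleHousing (not stated in the write-up).

-- Preview a sample of 1,000 rows with all columns.
-- Without an ORDER BY, SQL Server does not guarantee which 1,000 rows are returned.
SELECT TOP (1000) *
FROM dbo.NashvilleHousing;

-- List the column names and data types of the imported table.
SELECT COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'NashvilleHousing'
ORDER BY ORDINAL_POSITION;
```

Checking the data types at this stage helps show which columns were imported as text or datetime and may need conversion.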
Data Cleaning methods:
The following data cleaning methods were used:
- Standardizing the date format
- Filling in missing values in data columns
- Separating combined column information into individual columns
- An alternative method for separating column information into individual columns
- Using CASE statements
- Removing duplicate columns (Note: using temp tables instead of removing columns is preferred)
- An alternative method for deleting columns (Note: only perform this with permission and with file backups in place)
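The methods listed above might look roughly like the following in T-SQL. This is a hedged sketch under stated assumptions rather than the project's actual code (see the GitHub link for that). The table name dbo.NashvilleHousing and the new column names (SaleDateConverted, PropertyStreet, PropertyCity) are hypothetical. The sketch also assumes that addresses are comma-separated, that rows sharing a ParcelID share a PropertyAddress, and that SoldAsVacant is a text column containing 'Y' and 'N' alongside 'Yes' and 'No'.

```sql
-- All names not in the column list above are hypothetical.
-- GO is the SSMS batch separator; a new column must exist before a later batch uses it.

-- 1. Standardize the date format: keep SaleDate as a DATE in a new column.
ALTER TABLE dbo.NashvilleHousing ADD SaleDateConverted DATE;
GO
UPDATE dbo.NashvilleHousing
SET SaleDateConverted = CONVERT(DATE, SaleDate);
GO

-- 2. Fill missing values: copy PropertyAddress from another row with the same ParcelID.
--    Assumption: rows that share a ParcelID share the same address.
UPDATE a
SET PropertyAddress = b.PropertyAddress
FROM dbo.NashvilleHousing AS a
JOIN dbo.NashvilleHousing AS b
    ON a.ParcelID = b.ParcelID
   AND a.UniqueID <> b.UniqueID
WHERE a.PropertyAddress IS NULL
  AND b.PropertyAddress IS NOT NULL;
GO

-- 3. Separate a combined column: split 'street, city' at the first comma.
ALTER TABLE dbo.NashvilleHousing
ADD PropertyStreet NVARCHAR(255), PropertyCity NVARCHAR(255);
GO
UPDATE dbo.NashvilleHousing
SET PropertyStreet = LTRIM(RTRIM(SUBSTRING(PropertyAddress, 1,
                         CHARINDEX(',', PropertyAddress) - 1))),
    PropertyCity   = LTRIM(RTRIM(SUBSTRING(PropertyAddress,
                         CHARINDEX(',', PropertyAddress) + 1, LEN(PropertyAddress))))
WHERE CHARINDEX(',', PropertyAddress) > 0;  -- skip rows without a comma
GO

-- 4. Alternative split: PARSENAME splits on periods and counts parts from the right.
--    Assumption: OwnerAddress looks like 'street, city, state' and contains no periods.
SELECT
    LTRIM(PARSENAME(REPLACE(OwnerAddress, ',', '.'), 3)) AS OwnerStreet,
    LTRIM(PARSENAME(REPLACE(OwnerAddress, ',', '.'), 2)) AS OwnerCity,
    LTRIM(PARSENAME(REPLACE(OwnerAddress, ',', '.'), 1)) AS OwnerState
FROM dbo.NashvilleHousing;

-- 5. CASE statement: map abbreviated values to full words (shown as a SELECT).
SELECT SoldAsVacant,
       CASE SoldAsVacant
           WHEN 'Y' THEN 'Yes'
           WHEN 'N' THEN 'No'
           ELSE SoldAsVacant
       END AS SoldAsVacantClean
FROM dbo.NashvilleHousing;

-- 6. Preferred: copy only the wanted columns into a temp table; the source stays intact.
--    (Only a few columns are listed here for brevity.)
SELECT UniqueID, ParcelID, LandUse, PropertyStreet, PropertyCity,
       SaleDateConverted, SalePrice
INTO #NashvilleHousingClean
FROM dbo.NashvilleHousing;

-- 7. Alternative: drop the now-redundant columns. Destructive, so it is left commented out.
-- ALTER TABLE dbo.NashvilleHousing
-- DROP COLUMN PropertyAddress, OwnerAddress, SaleDate;
```

Writing converted values to new columns, and selecting into a temp table rather than dropping columns, keeps the raw data available for comparison if a cleaning step turns out to be wrong.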
GitHub.com - Source code:
The GitHub link below contains the SQL source code used for data cleaning in this project.
GitHub.com - SQL source code for Nashville data project
Conclusion and thanks:
Please feel free to reach out to me on LinkedIn if you have any comments or questions.
Lastly, thank you very much for viewing this SQL data project.