__STYLES__
We created dummy hospital data using the Python FAKER package.
The dataset includes the following administration and operational data. See Annex-1 for details on data.
We performed exploratory data analyses on the given data in Python. These analyses include understanding the data and its type and basic statistics including minimum, maximum, outliers, and quartile range. Further to understand the data, we created a correlation matrix, box plots, histogram, and bins to see the data distribution and identify skewness and kurtosis.
What questions were asked and what do the end users expect from us?
a. HR and Administration Analysis:
1. Employee turnover analysis based on department and designation.
2. Salary distribution and comparison across departments and designations.
3. Gender diversity analysis across departments.
4. Experience vs. salary analysis.
b. Operations Analysis:
1. Patient demographics (age, gender) across different diseases and departments.
2. Disease frequency analysis across departments.
3. Revenue analysis based on patient billing across departments.
4. Patient load analysis (entry and discharge dates) for operational efficiency.
After a basic understanding of the data and the demand from the end-user, we imported, analyzed, and transformed the data in Power Query as per our requirements.
After the data transformation, we created some new columns and measures for detailed analysis. These DAX are given in the Annex-2
Once the data is cleaned, transformed, and measures were created, we designed a roadmap listing the type of analysis and representation on different types of graphs. See Annex-3 for details.
Annex-1
1. Administration Dataset Variables Explanation
2. Operations Dataset Variables Explanation
Annex-2
Key DAX Commands
Total Employees = COUNT(administration_data[Employee ID])
Average Salary = AVERAGE(administration_data[Salary])
Total Revenue = SUM(operations_data[Total Bill])
Total Patients = COUNT(operations_data[Patient ID])
Period of Stay = AVERAGEX(operations_data, DATEDIFF(operations_data[Entry Date], operations_data[Discharge Date],DAY))
Added a new column in Administration Data, with the following conditions:
Years of Experience < 5, "Junior",
Years of Experience >= 5 & < 15, "Mid-Level",
Years of Experience >= 15, "Senior"
Annex-3
Graphs for Each Indicator in Power BI:
Employee Count by Department (Bar Chart)
Average Salary by Department (Column Chart)
Gender Distribution (Pie Chart or Donut Chart)
Experience vs. Salary Analysis (Scatter Plot)
Patient Count by Disease (Stacked Bar Chart)
Total Revenue by Department (Treemap)
Average Length of Stay by Department (Line Chart)
Revenue Over Time (Line Chart)