End-to-end healthcare cost analysis built on a Star Schema in MySQL. Advanced SQL was integrated with Python for statistical analysis to identify cost drivers, clean the data rigorously, and deliver strategic insights for operational and quality optimization.

MukeshTheAnalyst/healthcare-sql-python-stat-model

Star Schema to Strategic Insight: SQL & Python for Healthcare Cost Analysis

📄 Project Summary

This project demonstrates a complete end-to-end data analytics lifecycle for a complex healthcare dataset. The primary goal was to transform raw operational data into actionable business intelligence for improving cost management and patient satisfaction.

The solution showcases the crucial integration of Data Modeling (Star Schema), Advanced SQL, and Statistical Analysis in Python to validate hypotheses, identify key cost drivers, and deliver strategic recommendations.

The process involved:

  • Data Understanding and profiling in Python (Pandas).
  • Designing and implementing a Star Schema in MySQL.
  • Performing rigorous Data Cleaning and Transformation using advanced SQL techniques.
  • Integrating SQL and Python for Statistical Analysis and validation.
  • Drawing Strategic Insights and preparing final reports.
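
The hand-off between SQL and Python described in the steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: an in-memory SQLite database stands in for MySQL so the snippet runs without a server, and the table names (`dim_provider`, `fact_visit`), column names, and toy rows are all hypothetical, since the raw data is not shared. SQL handles the join and the cleaning filter; Python then computes a Pearson correlation on the cleaned rows.

```python
import sqlite3
from math import sqrt

# In-memory SQLite stands in for MySQL; every name and value below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_provider (
    provider_id INTEGER PRIMARY KEY,
    specialty   TEXT NOT NULL
);
CREATE TABLE fact_visit (
    visit_id       INTEGER PRIMARY KEY,
    provider_id    INTEGER NOT NULL REFERENCES dim_provider(provider_id),
    length_of_stay INTEGER,
    total_cost     REAL
);
INSERT INTO dim_provider VALUES (1, 'Cardiology'), (2, 'Orthopedics');
INSERT INTO fact_visit VALUES
    (1, 1, 2, 1000.0),
    (2, 1, 4, 2000.0),
    (3, 2, 6, 3000.0),
    (4, 2, NULL, 500.0),   -- incomplete row, filtered out by the query below
    (5, 2, 8, 4000.0);
""")

# SQL does the join and the cleaning; Python receives analysis-ready rows only.
rows = conn.execute("""
    SELECT p.specialty, f.length_of_stay, f.total_cost
    FROM fact_visit AS f
    JOIN dim_provider AS p ON p.provider_id = f.provider_id
    WHERE f.length_of_stay IS NOT NULL
      AND f.total_cost IS NOT NULL
    ORDER BY f.visit_id
""").fetchall()


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


stays = [row[1] for row in rows]
costs = [row[2] for row in rows]
r_value = pearson(stays, costs)
print(f"{len(rows)} clean visits, r = {r_value:.3f}")
```

The toy rows are deliberately linear, so the correlation here says nothing about real healthcare data. In a pandas-based notebook the same query result would typically be loaded into a DataFrame instead of plain lists.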

📂 Data Source

The dataset consists of eight related operational tables covering patient visits, costs, and providers. The data is used strictly for educational and portfolio purposes. Note: Due to the sensitivity of healthcare data and licensing, the raw data files are not shared in this repository.


📁 Project Structure & Deliverables

This repository contains all code and final outputs for the project.

| Folder/File | Description |
| --- | --- |
| 1_Healthcare_Data_CSV_Files/ (NOT INCLUDED in repo due to license/sensitivity) | Placeholder for the 8 raw operational CSV tables. |
| 2_Healthcare_Data_Final_Deliverables/ | Contains the finalized outputs, reports, and code scripts. |
| README.md | Project documentation and overview (this file). |

Included Deliverables (within 2_Healthcare_Data_Final_Deliverables/):

  • 01_Data_Understanding_and_Profiling.ipynb (Jupyter Notebook)
  • 02_Star_Schema_DDL_and_Transformation.sql (SQL Script)
  • 03_Advanced_SQL_EDA.sql (SQL Script)
  • 04_Python_Statistical_Analysis.ipynb (Jupyter Notebook)
  • 05_Final_Analysis_Report.pdf (PDF Report)
  • 06_Presentation_Slides.pptx (PowerPoint Slides)

🚀 Key Skills Demonstrated

  • Data Modeling: Star Schema design, Relational Database implementation, Normalization, Primary/Foreign Key Constraint Management.
  • Advanced SQL: Window Functions, Subqueries, CASE logic for conditional aggregation, Data Cleaning and Transformation.
  • Python: Pandas for Data Understanding and ETL staging, Statistical Analysis (Correlation/Distribution), Seaborn for Visualization.
  • Business Intelligence: Translating complex findings into actionable insights and strategic recommendations for business stakeholders.
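
As a rough illustration of two of the SQL techniques listed above, the sketch below shows CASE-based conditional aggregation and a window function. It is not taken from the project scripts: the `fact_visit` table, its columns, the sample rows, and the 2000 cost threshold are hypothetical, and SQLite (version 3.25 or later for window functions) is used in place of MySQL so the snippet is self-contained.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_visit (
    visit_id   INTEGER PRIMARY KEY,
    department TEXT NOT NULL,
    total_cost REAL NOT NULL
);
INSERT INTO fact_visit VALUES
    (1, 'ER', 500.0), (2, 'ER', 1500.0), (3, 'ER', 2500.0),
    (4, 'Surgery', 8000.0), (5, 'Surgery', 12000.0);
""")

# Conditional aggregation: CASE inside SUM counts only rows that meet a condition.
# The 2000 threshold is an arbitrary illustrative cutoff.
summary = conn.execute("""
    SELECT department,
           COUNT(*) AS visits,
           SUM(CASE WHEN total_cost >= 2000 THEN 1 ELSE 0 END) AS high_cost_visits,
           AVG(total_cost) AS avg_cost
    FROM fact_visit
    GROUP BY department
    ORDER BY department
""").fetchall()

# Window function: rank visits by cost within each department, keeping every row.
ranked = conn.execute("""
    SELECT visit_id, department, total_cost,
           RANK() OVER (PARTITION BY department ORDER BY total_cost DESC) AS cost_rank
    FROM fact_visit
    ORDER BY department, cost_rank
""").fetchall()

for row in summary:
    print(row)
for row in ranked:
    print(row)
```

The GROUP BY query collapses each department to one row, while the window query keeps every visit and adds a per-department rank, which is the usual reason to reach for a window function instead of an aggregate.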

🧠 Lessons Learned

  • Database Design Precedes Analysis: Designing a scalable Star Schema before any analysis protects data integrity and makes later querying easier.
  • SQL-Python Synergy: Using SQL for transformation and Python for granular statistical modeling and hypothesis validation proved to be an efficient division of labor.
  • Actionable Strategy: Analytical metrics must be linked directly to strategic outcomes (Cost Management, Quality Improvement) to have maximum business impact.

🌐 Portfolio Site


👨‍💻 Author

Mukesh Shirke

  • [GitHub Profile Link]
  • [Portfolio Site Link]
  • [LinkedIn Profile Link]

⚠️ Disclaimer

This project is for educational and portfolio purposes only.