Professional Resume

Peter H. Dao

Senior Data Engineer  ·  Healthcare & Enterprise  ·  Sacramento, CA

Sacramento, CA
916-849-1978
Summary

Professional Summary

Senior Data Engineer and IT Strategist with over 15 years of full-stack software and analytics experience, including 7 years designing and operating scalable, HIPAA-compliant data platforms. Proven track record architecting ETL pipelines, big-data solutions (Spark, Databricks, Microsoft Fabric), and data warehouses for healthcare operations — driving up to 50% reductions in pipeline latency and ensuring 100% PII/PHI regulatory adherence. Recognized mentor and cross-functional leader skilled at translating business requirements into high-value BI dashboards and predictive models.

Scalable Data Pipelines · HIPAA / PHI Safeguards · ETL Frameworks (SSIS, Airflow) · Data Governance & Traceability · Big-Data (Spark, Databricks) · Python & SQL Scripting · Azure & On-Prem Data Lakes · Data Warehousing & Modeling · BI & Reporting (Power BI, SSRS) · Microsoft Fabric · Medallion Architecture · Architecture & Technical Docs
Career

Work Experience

Sutter Health
Senior Data Engineer — Cloud Data & Platform Engineering
2019 – Present
Roseville, CA
Current · 6+ yrs
  • Engineered batch and near-real-time ETL pipelines in Databricks & SSIS processing 10M+ records/day; reduced end-to-end latency by 50%.
  • Built a Python-based data-quality framework, integrated into orchestration pipelines (Airflow, SQL Agent), that detects schema drift; reduced production errors by 75%.
  • Architected a HIPAA-compliant Azure Blob Storage data lake with automated PII/PHI masking and role-based access controls ensuring 100% regulatory adherence.
  • Partnered with analytics teams to deliver Power BI dashboards within 4 hours of data ingestion, enabling data-driven clinical operations decisions.
  • Authored comprehensive pipeline specifications and data-model diagrams, cutting new-hire onboarding time by 30%.
Sutter Health
Senior Technology Analyst — Web & DBA
2009 – 2019
Roseville, CA
10 yrs
  • Led migration of on-prem SQL Server data warehouse to cloud-based platform, improving query performance by 60%.
  • Developed .NET services and SSIS workflows to ingest and standardize multi-source healthcare data into the enterprise data warehouse.
  • Streamlined Excel-based ETL macros and SSRS reports, saving 10+ hours of manual effort weekly.
  • Mentored 10+ junior engineers on SQL performance tuning and data-governance best practices.
Sutter Health
Data Analyst / Programmer
2007 – 2009
2 yrs
  • Created complex Excel macros and MS Access applications to pre-process HR data feeds, improving data-prep speed by 80%.
  • Collaborated with end users to translate business requirements into dynamic reporting tools and dashboards.
Vision Service Plan (VSP) — IT Division
Senior Application Developer
1998 – 2006
Rancho Cordova, CA
8 yrs
  • Managed the full SDLC for web-based claims applications, including requirements analysis, prototyping, and technical documentation.
  • Implemented an intranet portal that reduced manual communication tasks by 70%; trained new developers on .NET best practices.
Ross Stores
Senior Programmer Analyst
1989 – 1997
San Francisco / San Jose, CA
8 yrs
  • Developed and maintained mission-critical business applications in C and Windows for a national retail chain.
  • Conducted systems analysis and design, coordinating requirements across Bay Area offices.
Projects

Selected Achievements

Healthcare · Real-Time

Streaming Analytics Pilot

Real-time telemetry pipeline using Spark Structured Streaming enabling sub-second latency for 50+ clinical devices.

Spark Streaming · Kafka · Databricks · Azure
HR Data · HIPAA

Workday Enterprise Data Lake

HIPAA-compliant Azure Blob data lake with automated PII masking supporting 5M+ records/month, zero breaches.

Azure Blob · Python · PII Masking · RBAC
Microsoft Fabric · Retail

Fabric Retail Analytics Platform

End-to-end Bronze/Silver/Gold architecture with star schema and automated Power BI dashboards via PySpark.

Microsoft Fabric · PySpark · Power BI · DAX
Data Quality · Automation

Automated Quality Framework

Python framework in Airflow pipelines for automated schema-drift detection, reducing production errors 75%.

Python · Airflow · SQL Agent
Cloud Migration · DW

On-Prem to Cloud DW Migration

Migrated on-premises SQL Server DW to cloud platform, improving query performance by 60% across healthcare ops.

SQL Server · Azure · .NET · SSIS
ETL Optimization

Pipeline Performance Tuning

Optimized SSIS workflows and Python scripts for parallel execution, significantly reducing compute overhead.

SSIS · Python · Parallelism
Tech Stack

Technical Proficiencies

Languages & Frameworks
Python · SQL · PySpark · C# · Spark · .NET / ADO.NET · DAX
Platforms & Databases
Databricks · Microsoft Fabric · Azure Data Factory · SQL Server · Oracle · MySQL · DB2 · Kafka
ETL & Orchestration
SSIS · Airflow · Azure Blob / OneLake · Medallion Architecture · Star Schema · ETL / ELT
BI & Reporting
Power BI · SSRS · Crystal Reports · KPI Dashboards
DevOps & Tools
Git / GitHub · Azure DevOps · JIRA · Linux / Unix · VS Code · Jupyter
Governance & Compliance
HIPAA / PHI · PII Masking · RBAC · Data Governance · Schema Drift Detection
Education

Academic Background

M.S.
Information Systems
Golden Gate University
San Francisco, CA
Thesis: PCs in Small Business
B.S.
Business Administration
San Francisco State University
San Francisco, CA
Information Systems concentration
A.A.
Computer Science
City College of San Francisco
San Francisco, CA
Fortran · C · COBOL · Systems Analysis
Certifications

Certifications & Continuous Learning

DP-700: Microsoft Fabric Data Engineer Associate (Preparing)
Microsoft · Microsoft Learn Fabric Engineering Paths
SSIS & T-SQL
LinkedIn Learning
Intro to Data Engineering & Big Data Analytics
Coursera
Big Data Class & Technical Certification
Dezyre · Simplilearn
Web Development Certificate
UC Davis Extension