Portfolio Project
Store-Level Loss & Sales ETL
SQL ETL + Anomaly Detection
Context
We didn't have a single view of security incidents, theft hotspots, and boycott-driven sales swings.
Role
- Led the analysis: SQL modeling/ETL, anomaly detection, and reporting.
Approach
- Joined incident, sales, and HR tables in SQL and automated KPIs with views and stored procedures.
- Built Python dashboards to compare theft vs. sales by format, state, and time.
- Used anomaly detection to flag outlier stores and associates.
Impact
- Found a cluster of outlier stores averaging 14 incidents per store (about 4–5× peers).
- Flagged high-theft regions (up to about $991 per store per day).
- Quantified year-over-year boycott drops: –28.7% (May ’23), –11.6% (Jun ’23), –60.2% (Jul ’23).
- Flagged a small set of high-risk associates (anonymized); one outlier averaged $249 per item on two items.
Data Modeling and ETL
I built a SQL layer that joins incidents, theft, sales, and HR attributes so the team can slice risk by store, region, and time.
- Joined incident, sales, and employee tables and created reusable KPI views.
- Standardized date keys and dimensions (store format, region/state) so filters stay consistent.
- Anonymized identifiers so insights can be shared safely.
Analysis Workflow
- Ran targeted queries to find high-incident formats, top-theft states, and high-risk associates.
- Measured boycott impact with year-over-year comparisons around the timeline.
- Used Python charts to highlight hotspots, trends, and outliers.
Anomaly Detection
- Flagged outlier stores and associates by comparing against peer baselines, not raw totals.
- Separated 'high frequency' (incidents per store) from 'high severity' (value per item).
- Kept the output focused on a short list people can investigate.
What I'd Improve
- Add automated weekly anomaly alerts and an audit trail for investigations.
- Normalize by traffic or shipments to separate volume from risk changes.
- Explore causal analysis for boycott drivers and mitigation tests.
Links
Notes
Store, state, and employee identifiers are anonymized in the case study.