Portfolio Project

Pizza Tips Regression Modeling

Excel Analytics & Regression Modeling

Analytics Excel Statistics AWS

Context

Tips varied widely by neighborhood and housing type, so I set out to find what actually drives them.

Approach

  • Merged 1,251 delivery tickets with NOAA weather, then cleaned the data in Power Query.
  • Ran a multiple regression in Excel: Tip = f(cost, delivery time, rain, max/min temperature).

Impact

  • Order cost explains ~38% of tip variance (about +$1.10 tip per +$10 bill).
  • Apartment customers tipped ~28% less than house residents (p < 0.001).
  • Weather and delivery time didn’t show a meaningful effect on tip size.

Data Integration

I merged delivery tickets with NOAA weather to test common ideas about what drives tipping.

  • Combined 1,251 deliveries with daily weather features (rain, max/min temperature, wind).
  • Cleaned the dataset in Power Query and derived tip percentage and delivery time (minutes).
  • Separated housing types (apartment vs. house) to test neighborhood effects.
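
The merge itself was done in Power Query. As a rough illustration only, the same date-keyed join and the two derived columns could be sketched in Python as below. Every field name (`order_date`, `out`, `back`, `rain_in`, and so on) and every value is a made-up placeholder, not the project's actual schema or data.

```python
from datetime import date, datetime

# Hypothetical ticket records; field names and values are illustrative only.
tickets = [
    {"order_date": date(2023, 7, 1), "cost": 24.50, "tip": 4.00, "housing": "house",
     "out": datetime(2023, 7, 1, 18, 5), "back": datetime(2023, 7, 1, 18, 29)},
    {"order_date": date(2023, 7, 2), "cost": 18.00, "tip": 2.00, "housing": "apartment",
     "out": datetime(2023, 7, 2, 19, 40), "back": datetime(2023, 7, 2, 19, 58)},
    {"order_date": date(2023, 7, 3), "cost": 30.00, "tip": 5.00, "housing": "house",
     "out": datetime(2023, 7, 3, 12, 0), "back": datetime(2023, 7, 3, 12, 31)},
]

# Hypothetical daily weather rows keyed by date (7/3 is deliberately missing).
weather_by_date = {
    date(2023, 7, 1): {"rain_in": 0.0, "tmax_f": 88, "tmin_f": 67, "wind_mph": 6.0},
    date(2023, 7, 2): {"rain_in": 0.3, "tmax_f": 81, "tmin_f": 64, "wind_mph": 11.0},
}


def enrich(ticket, weather):
    """Left-join one ticket to its day's weather and add derived columns."""
    row = dict(ticket)
    # Tickets with no matching weather day keep their own fields only.
    row.update(weather.get(ticket["order_date"], {}))
    # Tip as a percentage of the order cost.
    row["tip_pct"] = 100.0 * ticket["tip"] / ticket["cost"]
    # Delivery time in minutes, from the two timestamps.
    row["delivery_min"] = (ticket["back"] - ticket["out"]).total_seconds() / 60.0
    return row


merged = [enrich(t, weather_by_date) for t in tickets]
for row in merged:
    print(row["order_date"], row["housing"], round(row["tip_pct"], 1), row["delivery_min"])
```

A left join keeps every delivery even when a weather day is missing, which makes gaps visible instead of silently dropping tickets. With a dataframe library, pandas `merge` with `how="left"` would be the usual equivalent.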

Exploratory Analysis

  • Found a strong positive correlation (0.62) between order cost and tip amount.
  • Rainfall had only a weak relationship with delivery duration (correlation 0.14).
  • Order counts more than doubled in summer/early fall (clear seasonality).
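
For reference, the correlations above are Pearson coefficients, which Excel computes with `CORREL`. The sketch below shows the same calculation in Python on synthetic numbers (not the project's data). It also shows why a correlation of 0.62 squares to roughly 0.38, the share of variance a single predictor would explain in a one-variable linear fit.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic order costs and tips, for illustration only.
costs = [12.0, 18.5, 22.0, 30.0, 41.5, 55.0]
tips = [2.0, 1.5, 4.0, 3.0, 6.0, 5.5]

r = pearson_r(costs, tips)
print(f"synthetic r = {r:.2f}, r^2 = {r * r:.2f}")

# In a simple (one-predictor) regression, R^2 equals r squared.
reported_r = 0.62
print(round(reported_r ** 2, 2))  # 0.38
```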

Regression and Hypothesis Tests

  • Ran a multiple regression: Tip = f(cost, delivery time, rain, max/min temperature).
  • Result: order cost was the main driver, explaining ~38% of tip variance (≈ +$1.10 tip per +$10 bill).
  • Validated housing differences with a two-sample t-test: apartment customers tipped ~28% less than house residents (p < 0.001).
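
The analysis was run in Excel. As a hedged illustration of the two techniques named above, the sketch below fits an ordinary least squares model and computes a two-sample t statistic with NumPy. All data is randomly generated: the slope of 0.11 per dollar (that is, +$1.10 per +$10) and the 28% group gap are planted to mirror the reported results, not to reproduce them. The Welch (unequal-variance) form of the t-test is an assumption, since the write-up does not say which variant was used.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic predictors (illustrative ranges, not the project's data).
cost = rng.uniform(10, 60, n)            # order cost in dollars
delivery_min = rng.uniform(10, 45, n)    # delivery time in minutes
rain = rng.exponential(0.1, n)           # daily rainfall
tmax = rng.normal(75, 10, n)             # daily max temperature
tmin = tmax - rng.uniform(10, 25, n)     # daily min temperature

# Assumption: tip depends on cost only, with slope 0.11 per dollar, plus noise.
tip = 0.5 + 0.11 * cost + rng.normal(0, 1.5, n)

# Design matrix with an intercept column, then an OLS fit via least squares.
X = np.column_stack([np.ones(n), cost, delivery_min, rain, tmax, tmin])
beta, *_ = np.linalg.lstsq(X, tip, rcond=None)

# R^2 = 1 - SS_res / SS_tot
resid = tip - X @ beta
ss_res = resid @ resid
ss_tot = ((tip - tip.mean()) ** 2).sum()
r2 = 1.0 - ss_res / ss_tot

print(f"cost slope: {beta[1]:.3f} per $1, so about {10 * beta[1]:.2f} per $10")
print(f"R^2: {r2:.2f}")


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    va = a.var(ddof=1) / a.size   # squared standard error of mean(a)
    vb = b.var(ddof=1) / b.size   # squared standard error of mean(b)
    t = (a.mean() - b.mean()) / np.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    return t, df


# Synthetic tips by housing type; the apartment mean is set 28% below the house mean.
house = rng.normal(5.0, 2.0, 300)
apartment = rng.normal(3.6, 2.0, 200)
t_stat, dof = welch_t(house, apartment)
print(f"t = {t_stat:.2f}, df = {dof:.0f}")
```

The p-value would then come from the t distribution with `dof` degrees of freedom, for example through a statistics library or Excel's `T.TEST`.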

What I'd Improve

  • Add distance, time-of-day, and driver controls to reduce omitted-variable bias.
  • Model tip percentage and tip amount separately to avoid conflating larger orders with generosity.
  • Use mixed-effects models to capture repeated customers or neighborhood-level variance.

Links