Groupon Deals Data Analysis

Setup

A. Background:

Some Groupon deals have a minimal requirement: for example, a deal takes effect only once at least 100 buyers have committed.
Groups:
- Control group: deals without the minimal requirement
- Treatment group: deals with the minimal requirement

B. Question at Hand

Does having the minimal requirement affect the deal outcomes, such as revenue, quantity sold, and Facebook likes received?

C. Need for propensity matching

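The treatment is unevenly distributed across outcome levels:
- High revenue vs low revenue
- High quantity sold vs low quantity sold
- High Facebook likes received vs low Facebook likes received

Because of this imbalance, a direct comparison of the two groups may be confounded, which is why the groups are matched on propensity scores first.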

D. Features to be used

Which features to select: as we will illustrate later, the following features/variables should be excluded:

- Features/variables that predict treatment status perfectly, such as the min_req feature, from which the treatment feature is directly derived (see the code notebook for the result of adding min_req).
- Features/variables that may be affected by the treatment.
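
The two exclusion rules above can be sketched as a simple filter over column names. This is only an illustration: apart from min_req and quantity_sold, every column name below (the treatment flag, the other outcomes, the candidate covariates) is a hypothetical placeholder, not the dataset's actual schema.

```python
# Hypothetical column names; only 'min_req' and 'quantity_sold' come from the text.
all_columns = ['treatment', 'min_req', 'revenue', 'quantity_sold', 'fb_likes',
               'prom_length', 'price', 'discount_pct']

# Rule 1: drop variables that predict treatment status perfectly.
perfect_predictors = {'min_req'}
# Rule 2: drop variables the treatment may affect (here, the outcomes themselves).
post_treatment = {'revenue', 'quantity_sold', 'fb_likes'}


def select_matching_features(columns, treatment_col='treatment'):
    """Return the columns that remain eligible as matching covariates."""
    excluded = perfect_predictors | post_treatment | {treatment_col}
    return [c for c in columns if c not in excluded]


matching_features = select_matching_features(all_columns)
print(matching_features)
```

Keeping the exclusions in named sets makes it easy to see which rule removed which column when the feature list is revisited.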

Data Analysis

1. Read the groupon data

import pandas as pd

df = pd.read_csv('./data/groupon.csv')
df.info()  # column names, dtypes, and non-null counts

2. Extract features for propensity score matching
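
The notebook's own matching code is not reproduced in this section. As a rough illustration of what the matching step does once a propensity score has been estimated for each deal, here is a minimal sketch of greedy 1:1 nearest-neighbour matching without replacement. The function name, the caliper value, and the scores are all assumptions made for the example, and the notebook may use a different matching scheme.

```python
def match_nearest(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on propensity score.

    treated, control: dicts mapping a unit id to its propensity score.
    A control unit is used at most once; a treated unit stays unmatched
    if its closest control is further away than the caliper.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls not yet matched
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Made-up scores, for illustration only
treated = {'t1': 0.62, 't2': 0.35, 't3': 0.90}
control = {'c1': 0.60, 'c2': 0.33, 'c3': 0.10}
pairs = match_nearest(treated, control)
print(pairs)
```

In this toy input, t3 has no control within the caliper and is left out of the matched sample, which is the usual price of enforcing close matches.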

3. Visualize Effect size using Cohen's D

import matplotlib.pyplot as plt
import seaborn as sns
fig, ax = plt.subplots(figsize=(15, 5))
sns.barplot(data=all_stats_df, x='effect_size', y='feature', hue='matching', orient='h', ax=ax)
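
For reference, the effect size plotted above can be computed along the following lines. This sketch uses the pooled-standard-deviation form of Cohen's d; the section does not say which variant the notebook uses, so treat the formula choice, the function name, and the sample values as assumptions.

```python
import math
import statistics


def cohens_d(treatment, control):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    v1 = statistics.variance(treatment)  # sample variance (n - 1 denominator)
    v2 = statistics.variance(control)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled_sd


# Made-up values, for illustration only
d = cohens_d([2, 4, 6, 8], [1, 3, 5, 7])
print(round(d, 4))
```

A feature that is well balanced after matching should have an effect size close to zero between the two groups.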

4. Visualize P-value significance of t-test

import numpy as np

fig, ax = plt.subplots(figsize=(15, 5))
sns.barplot(data=all_stats_df, x='log_P', y='feature', hue='matching', orient='h', ax=ax)
ax.set_xlabel('-log10(P-value) of t-test between control and treatment groups')
# Bars that extend past this line correspond to P < 0.05
ax.axvline(x=-np.log10(0.05), color='r', linestyle='--', label='alpha = 0.05')
ax.legend()
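
The x-axis transform can be made concrete with a small sketch. It assumes that log_P holds -log10 of the t-test P-value, which is what the threshold line suggests; the helper names and the P-values are illustrative only.

```python
import math

ALPHA = 0.05
threshold = -math.log10(ALPHA)  # position of the dashed line, about 1.301


def neg_log10_p(p_value):
    """Map a P-value to the plotted scale: smaller P gives a longer bar."""
    return -math.log10(p_value)


def is_significant(p_value, alpha=ALPHA):
    """True when the bar extends past the threshold line (P < alpha)."""
    return neg_log10_p(p_value) > -math.log10(alpha)


# Made-up P-values, for illustration only
for p in (0.5, 0.01, 0.001):
    print(p, round(neg_log10_p(p), 3), is_significant(p))
```

On this scale, a feature that still differs significantly between the groups after matching shows up as a bar crossing the dashed line.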

5. Distribution of Quantity Sold

col = 'quantity_sold'
# histplot replaces distplot, which is deprecated since seaborn 0.11
ax = sns.histplot(matched_df[col], kde=True, stat='density')
# Tukey fences: Q1 - k * IQR and Q3 + k * IQR, here with k = 3.0
q1, q3 = np.percentile(matched_df[col], [25, 75])
iqr = q3 - q1
upper_bound = q3 + 3.0 * iqr
lower_bound = q1 - 3.0 * iqr
ax.axvline(x=np.mean(matched_df[col]), color='r', linestyle='--', label='mean')
ax.axvline(x=upper_bound, color='g', linestyle='--', label='tukey upper bound')
ax.axvline(x=lower_bound, color='g', linestyle='--', label='tukey lower bound')
ax.legend()
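
Tukey's fences place the lower bound at Q1 - k * IQR and the upper bound at Q3 + k * IQR. The sketch below shows the rule on a small made-up list using only the standard library; statistics.quantiles with method='inclusive' uses the same linear interpolation as np.percentile's default. The function name and the data are assumptions for illustration.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (lower, upper) Tukey fences: Q1 - k * IQR and Q3 + k * IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method='inclusive')
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Made-up values with one extreme point, for illustration only
values = [1, 2, 3, 4, 5, 6, 7, 8, 100]
lower, upper = tukey_fences(values, k=3.0)
outliers = [v for v in values if v < lower or v > upper]
print(lower, upper, outliers)
```

With k = 3.0 only far-out values are flagged; the more common k = 1.5 gives tighter fences and flags more points.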

Conclusion