Summary Customer Analytics - 2021

Pages: 33
Uploaded on: 11-12-2021
Written in: 2021/2022

A clear summary of the lectures by George Knox of Customer Analytics, as part of the MScs Marketing Analytics/Management. Including an overview of the symbols and formulas used


Document information

Uploaded on: December 11, 2021
File latest updated on: December 13, 2021
Number of pages: 33
Written in: 2021/2022
Type: Summary

Contents
Module 1 – Uncertainty
  Test and roll
  Option value
  Classical uncertainty
  Bayesian approach
  Comparing posteriors
  Size of test group
Module 2 – RFM
  RFM
  Empirical Bayes and clumpiness
Lecture 3 – Logistic Regression
  Churn
  Logistic regression
  Overfitting
  Lifts and optimal targeting
Lecture 4 – Subset selection, LASSO, decision trees and random forests
  Subset (of predictors) selection
  LASSO
  Decision trees
  Random forests
Lecture 5 – Collaborative filtering, Cross-selling, Upselling
  NPTB models
  Introduction to recommender systems
  Recommender system models
Lecture 6 – CLV in a contractual setting
  CLV – Definitions
  CLV – Geometric Model
  RLV
  Heterogeneity and retention rates
  sBG model
Lecture 7 – CLV in a non-contractual setting
  Intro
  BGBB
  Interpreting results BGBB
  CLV RLV
  Extensions






Module 1 – Uncertainty
N    The population (all the customers)
n    The (test) sample
m    The margin (profit) per response
𝑝̂    The estimate of the response rate
c    Cost (of marketing)
p    The true population response rate
B    Number of bootstrap samples
α    Parameter of the Bayesian prior (written a in the formulas), indicating the heterogeneity of the customers. The closer this value is to 0, the more extreme the difference between segments that respond and those that do not. It can be read as an indication of the number of successes
β    The other parameter of the prior (written b in the formulas). It can be read as the number of failures
∝    Proportional to. In the context of distributions, it means the expression is not normalised, so it does not necessarily integrate to 1
𝜎    The standard deviation of the population
s    The standard deviation of the sample


$\hat{p} = \frac{1}{n}\sum_i x_i$    Sample mean estimate. If you don't have a sample, then $\hat{p}$ can be based on past data
$se = \sqrt{\frac{\sigma^2}{n}}$    Standard error
$se(p) = \sqrt{\frac{p(1-p)}{n}}$    The standard error of p
$n - s$    The number of failures in the test sample, with s here being the number of successes (as used in the posterior)
$s \approx \sqrt{\mu(1-\mu)}$    An approximation of the standard deviation s, which is allowed when $\mu$ is between 0 and 1
$y_A \sim N(m_A, s^2)$    The likelihood
$m_A \sim N(\mu, \sigma^2)$    The posterior distribution of group A
$p = \frac{c}{m}$    The threshold (break-even response rate). If the response rate is higher, it means it's profitable
$E[p] = \frac{a}{a+b}$    The expected response rate. This is based on the parameters of the prior
$\sigma = \sqrt{\left(\frac{1}{\sigma_0^2} + \frac{n}{s^2}\right)^{-1}}$    The standard deviation of the posterior in a normal-normal model
$\mu = \sigma^2\left(\frac{\mu_0}{\sigma_0^2} + \frac{n\bar{y}}{s^2}\right)$    The mean of the posterior in a normal-normal model
$n_A^* = \sqrt{\frac{N}{4}\left(\frac{s}{\sigma}\right)^2 + \left(\frac{3}{4}\left(\frac{s}{\sigma}\right)^2\right)^2} - \frac{3}{4}\left(\frac{s}{\sigma}\right)^2$    Optimal sample size of group A if it has a normal-normal model
$f(p) \propto p^{a-1}(1-p)^{b-1}$    The distribution of the prior in the Bayesian approach
$f(p) \propto p^{a+s-1}(1-p)^{b+n-s-1}$    The distribution of the posterior in the Bayesian approach
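
To make the optimal sample size formula in the list above more concrete, here is a minimal Python sketch that simply evaluates it. The function name and the input values are hypothetical, and reading s as the standard deviation of individual responses and σ as the standard deviation of the prior on the mean is an assumption about how the symbols are meant.

```python
import math


def optimal_test_size(N, s, sigma):
    """Evaluate the n_A* formula for a normal-normal model.

    N: population size
    s: standard deviation of individual responses (assumed reading of s)
    sigma: standard deviation of the prior on the mean (assumed reading of sigma)
    """
    ratio = (s / sigma) ** 2
    return math.sqrt(N / 4 * ratio + (3 / 4 * ratio) ** 2) - 3 / 4 * ratio


# Hypothetical inputs, chosen only for illustration
n_star = optimal_test_size(N=100_000, s=0.1, sigma=0.05)
print(round(n_star))
```

With these inputs the test group is only a few hundred customers out of 100,000, which illustrates that a profit-maximising test can be small relative to the population.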


Customer analytics: Since 1990 there has been a shift from a focus on the product to a focus on the
customer. You use customer data and statistical models to make business decisions, such as who to
target, who to test, the number of subscriptions and CLV



Customer lifecycle: Marketing is all about acquiring, developing and retaining customers. The
customer lifecycle has 3 stages:

1) Customer acquisition: How customers are born, i.e. their first contact with the firm
2) Customer development: Change in behaviour over time: buying more (up-selling) or different
things (cross-selling)
3) Customer retention: Preventing customer death or churn

Test and roll
Test & roll experiments:

• Test sample (size = n): A subset of customers. After you have sent the campaign and collected
and analysed the responses, you use the results to decide whether to send it to the rest of the
population. Once the test results are in, you have the option, but not the obligation, to roll
out. Hence, this is an option
• Rollout sample (size = N – n): The rest of the population. You only roll out if E[rollout
profit] > 0

Expected rollout profits: E[rollout profit] = (N-n)(m*𝑝̂ -c)
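
As a small illustration of the rollout rule, the sketch below computes E[rollout profit] = (N-n)(m*𝑝̂-c) and the resulting go/no-go decision. The function names are hypothetical and the numbers are chosen only for illustration.

```python
def expected_rollout_profit(N, n, m, p_hat, c):
    """Expected profit of rolling out to the N - n customers outside the test."""
    return (N - n) * (m * p_hat - c)


def should_roll_out(N, n, m, p_hat, c):
    """Roll out only when the expected rollout profit is positive."""
    return expected_rollout_profit(N, n, m, p_hat, c) > 0


# Illustrative inputs: 50,000 customers, 5,000 of them in the test
profit = expected_rollout_profit(N=50_000, n=5_000, m=50, p_hat=0.05, c=1.5)
print(profit, should_roll_out(50_000, 5_000, 50, 0.05, 1.5))
```

The decision flips exactly at the threshold 𝑝̂ = c/m, which is 0.03 for these inputs.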

Estimate: An estimation of characteristics of the population based on a sample

Option value
Option value: It is assumed that the test provides perfect information. The value is E[profit | test] –
E[profit | no test]. If the test predicts a failure (i.e. E[rollout profit] < 0), you will not roll out and you
will only bear the costs of the test. Hence:

• E[profit | test]: The expected profit once you know the test results. For example, suppose there
is a 30% chance of success, the profit margin m is 50, the response rate p is 0.05 in case of
success and 0.01 in case of failure, and the cost c is 1.50. There are 50,000 customers in
total and you test 10% of them (n = 5,000). Then, if it's a success, the profit per customer is
(m*p-c) = (50*0.05-1.5) = 1 and, if it's a failure, it is (50*0.01-1.5) = -1. You will not roll
out if the test predicts a failure (which happens in 70% of the cases), so you only lose 5,000 on
the test. E[profit | test] is 0.3*50000 + 0.7*(-5000) = 11500. Because E[profit | no test] is 0
here, this is also the maximum amount of money you are willing to pay for the test (i.e. you are
willing to pay up to 11500 to know what the outcome of the project will be)
• E[profit | no test]: First you calculate the expected value of the project per customer:
(50*0.05-1.5)*0.3 + (50*0.01-1.5)*0.7 = -0.4. This is negative, so you will not do the
project, given that you don't do any testing. Hence, the expected profit is 0. If it is positive,
however, you simply calculate the expected profit per customer * N

Classical uncertainty
Central limit theorem: For large enough samples, the distribution of the sample mean is approximately
normal: 𝑝̂ ~ N(p, se(p)²). The SE is used since it describes the uncertainty of the estimate 𝑝̂. The SE
is the SD divided by √n. The larger n is, the smaller the SE and the more accurate the estimate
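
To illustrate how the standard error shrinks with the sample size, here is a minimal sketch using se(p) = √(p(1−p)/n). The function name and the example values are hypothetical.

```python
import math


def standard_error(p, n):
    """Standard error of an estimated response rate p based on n observations."""
    return math.sqrt(p * (1 - p) / n)


# Quadrupling the sample size halves the standard error
se_small = standard_error(p=0.05, n=100)
se_large = standard_error(p=0.05, n=400)
print(se_small, se_large)
```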

Bootstrap: To create new samples with replacement from the original sample, using the same
sample size (e.g. if your sample is {0, 2, 4, 6}, two bootstrap samples could be {0, 0, 0, 4}
and {4, 2, 0, 6}). The goal is to estimate the mean of every bootstrap sample. With these estimates
you can build a distribution and calculate the share of estimates that fall below a value x:
$\frac{1}{B}\sum_b 1\{\hat{p}_b < x\}$, meaning that you sum an indicator over every bootstrap
estimate of the response rate, which is 1 if the estimate is smaller than x and 0 otherwise, and
divide by the number of bootstrap samples B. For example, if x is 0.3 and you have 1000 bootstrap
samples, of which 300 have a 𝑝̂ lower than 0.3, then the share is (300*1 + 700*0)/1000 = 0.3. With
bootstrap aggregation you can reduce the variance of a statistical learning method (e.g. decision
trees)
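
The bootstrap procedure can be sketched in a few lines of Python using only the standard library. The function names, the seed and the example sample (30 responders out of 100) are hypothetical and serve only as an illustration.

```python
import random


def bootstrap_estimates(sample, B, seed=0):
    """Draw B resamples with replacement (same size as the sample) and return their means."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(B):
        resample = rng.choices(sample, k=n)  # sampling with replacement
        estimates.append(sum(resample) / n)
    return estimates


def share_below(estimates, x):
    """Fraction of bootstrap estimates smaller than x: (1/B) * sum of 1{p_hat_b < x}."""
    return sum(1 for e in estimates if e < x) / len(estimates)


# Hypothetical test sample: 1 = responded, 0 = did not respond
sample = [1] * 30 + [0] * 70
estimates = bootstrap_estimates(sample, B=1000)
print(share_below(estimates, 0.3))
```

The spread of `estimates` approximates the sampling distribution of 𝑝̂ without relying on the normal approximation.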

Bayesian approach
Bayesian approach: This has two steps (categorical data), whereby you set a prior before deriving a
posterior distribution:

1) Prior distribution: This is the distribution you set before running the actual test. It is a
distribution of the response rate based on previous experience or, when you have no idea,
one in which each value is equally likely (also called a flat or diffuse prior, which is a
uniform distribution). A diffuse prior does not carry much weight since it is spread across many
values. The distribution used is the beta: 𝑝 ~ beta(𝑎, 𝑏), with density f(𝑝) ∝
𝑝^(𝑎−1) (1 − 𝑝)^(𝑏−1), where a and b relate to the two groups (e.g. people who responded and
people who did not respond). A flat distribution has an a and a b of 1. If a is smaller than 1, the
density goes up at the left-hand side; if b is smaller than 1, it goes up at the right-hand side.
If a = b, then it's symmetric. The larger a and b become, the more the density is concentrated
around the middle. If you have some clue about the probabilities, you change the parameters until
you have a beta distribution that fits your beliefs
2) Posterior distribution: This is the distribution you obtain after running the actual test. It's an
updated version of the prior distribution, also called the beta-binomial model. It's the beta
distribution, including the actual observations from the test (s successes out of n):
p ~ beta(a+s, b+n-s). The more test observations there are, the less weight the prior
distribution gets. To calculate the posterior distribution, you multiply the likelihood (actual
test) by the prior: posterior ∝ likelihood ∙ prior, which results in
f(𝑝) ∝ 𝑝^(𝑎+𝑠−1) (1 − 𝑝)^(𝑏+𝑛−𝑠−1) (ignore this, only the beta distribution is relevant). To
compare two groups, draw repeatedly from the beta(a+s, b+n-s) of each group and count the number
of times the draw of one group is bigger than that of the other group. Remember, for a flat
prior, replace a and b with 1
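
The two steps above can be sketched with Python's standard library. The function names, the seed, the number of draws and the test results for the two groups are hypothetical. The comparison is a simple Monte Carlo count of how often a draw from one posterior exceeds a draw from the other.

```python
import random


def posterior_params(a, b, n, s):
    """Update a beta(a, b) prior with s successes out of n trials."""
    return a + s, b + n - s


def prob_b_beats_a(post_a, post_b, draws=20_000, seed=1):
    """Share of posterior draws in which group B's response rate exceeds group A's."""
    rng = random.Random(seed)
    wins = sum(
        rng.betavariate(*post_b) > rng.betavariate(*post_a)
        for _ in range(draws)
    )
    return wins / draws


# Flat prior (a = b = 1) and hypothetical test results for two groups
post_a = posterior_params(a=1, b=1, n=1000, s=40)
post_b = posterior_params(a=1, b=1, n=1000, s=60)
posterior_mean_a = post_a[0] / (post_a[0] + post_a[1])  # E[p] = a / (a + b)
print(post_a, post_b, prob_b_beats_a(post_a, post_b))
```

With 1,000 observations per group the flat prior barely matters: the posterior mean of group A is close to the raw response rate of 4%.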

Comparing posteriors
Hold-out test: The hold-out group receives no treatment. You compare this group with the active
group (respondents who did receive the treatment)

Normal-normal model: Instead of looking at whether someone responded or not, you could also look
at continuous data such as minutes on site or profits. This can be captured with this model, whereby
the likelihood of each respondent (𝑦𝑖 ~ N(𝑚, 𝑠²)) is normal, as well as the prior (𝑚 ~ N(𝜇0, 𝜎0²)).
Hence, the posterior is normally distributed as well: 𝑚 ~ N(𝜇, 𝜎²), because a normal prior combined
with a normal likelihood yields a normal posterior. You can use the pnorm function in R to calculate
the probability of having an observation that is equal to or lower than a particular value
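
As an illustration of the normal-normal update, the sketch below applies the posterior formulas for 𝜎 and 𝜇 from the formula overview. It uses `statistics.NormalDist(...).cdf` from the Python standard library as a stand-in for R's pnorm. The function name and all input values are hypothetical.

```python
import math
from statistics import NormalDist


def normal_normal_posterior(mu0, sigma0, s, n, y_bar):
    """Posterior mean and sd of m, given prior N(mu0, sigma0^2) and n observations with sd s."""
    precision = 1 / sigma0 ** 2 + n / s ** 2
    sigma = math.sqrt(1 / precision)
    mu = sigma ** 2 * (mu0 / sigma0 ** 2 + n * y_bar / s ** 2)
    return mu, sigma


# Hypothetical prior and test data (e.g. minutes on site)
mu, sigma = normal_normal_posterior(mu0=10, sigma0=2, s=4, n=16, y_bar=12)

# Probability that m is at most 11 under the posterior (analogous to pnorm in R)
prob_at_most_11 = NormalDist(mu, sigma).cdf(11)
print(mu, sigma, prob_at_most_11)
```

The posterior mean (about 11.6 here) lies between the prior mean of 10 and the sample mean of 12, and moves closer to the sample mean as n grows.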

A/B-testing: If you subtract the normal posterior of group A from that of group B, the difference
is again normally distributed, and B is better than A if this difference in m is positive
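
A minimal sketch of this comparison, assuming the two posteriors are independent so that the difference m_B − m_A is normal with mean 𝜇_B − 𝜇_A and variance 𝜎_A² + 𝜎_B². The function name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist


def prob_b_better(mu_a, sigma_a, mu_b, sigma_b):
    """P(m_B - m_A > 0) for two independent normal posteriors."""
    difference = NormalDist(mu_b - mu_a, math.sqrt(sigma_a ** 2 + sigma_b ** 2))
    return 1 - difference.cdf(0)


# Hypothetical posteriors for groups A and B
p_better = prob_b_better(mu_a=10, sigma_a=1, mu_b=12, sigma_b=1)
print(p_better)
```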

Size of test group
Optimal test size: This is the test size for which E[Profit_test + Profit_rollout] is maximal. Large
tests have a low rollout error (low risk), but a lot of people will see the inferior option
(opportunity cost). Hence, it's a trade-off between learning in the test phase and earning during
the roll-out phase, especially when N is limited. The test size from hypothesis testing differs
from the profit-maximising test size, since with hypothesis testing you really test which one is
better (e.g. significance level α = 0.05 and power 1 − β = 0.8). For example, if the
