13.2 Experiments
The case that might be familiar to you is an A/B test. You can make a change to a product and test it against the original version of the product. You do this by randomly splitting your users into two groups. The group membership is denoted by D, where D = 1 is the group that experiences the new change (the test group), and D = 0 is the group that experiences the original version of the product (the control group). For concreteness, let’s say you’re looking at the effect of a recommender system change that recommends articles on a website. The control group experiences the original algorithm, and the test group experiences the new version. You want to see the effect of this change on total pageviews, Y.
You’ll measure this effect by looking at a quantity called the average treatment effect (ATE). The ATE is the average difference in the outcome between the test and control groups, E_test[Y] − E_control[Y], or δ_naive = E[Y | D = 1] − E[Y | D = 0]. This is the “naive” estimator for the ATE since here we’re ignoring everything else in the world. For experiments, it’s an unbiased estimate for the true effect.
A nice way to estimate this is to do a regression. That lets you also measure error bars at the same time and include other covariates that you think might reduce the noise in Y so you can get more precise results. Let’s continue with this example.
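    import numpy as np
    import pandas as pd

    N = 1000

    x = np.random.normal(size=N)  # covariate that also affects the outcome
    d = np.random.binomial(1, 0.5, size=N)  # random assignment: 1 = test, 0 = control
    # True effect of D is 3; the last term draws independent noise for each user.
    y = 3. * d + x + np.random.normal(size=N)

    X = pd.DataFrame({'X': x, 'D': d, 'Y': y})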
Here, we’ve randomized D to get about half in the test group and half in the control. X is some other covariate that causes Y, and Y is the outcome variable. We’ve added a little extra noise to Y to just make the problem a little noisier.
You can use a regression model Y = β_0 + β_1 D to estimate the expected value of Y, given the covariate D, as E[Y | D] = β_0 + β_1 D. The β_0 piece will be added to E[Y | D] for all values of D (i.e., 0 or 1). The β_1 part is added only when D = 1 because when D = 0, it’s multiplied by zero. That means E[Y | D = 0] = β_0 when D = 0 and E[Y | D = 1] = β_0 + β_1 when D = 1. Thus, the β_1 coefficient is going to be the difference in average Y values between the D = 1 group and the D = 0 group, E[Y | D = 1] − E[Y | D = 0] = β_1! You can use that coefficient to estimate the effect of this experiment.
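The claim that β_1 equals the difference in group means can be checked numerically. The following is a minimal sketch, not the listing used for the figures in this section: it simulates data in the same spirit as the example, uses NumPy's least-squares solver instead of statsmodels, and the seed and variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
n = 1000

d = rng.binomial(1, 0.5, size=n)        # random assignment: 1 = test, 0 = control
x = rng.normal(size=n)                  # covariate that also drives y
y = 3.0 * d + x + rng.normal(size=n)    # true effect of d is 3

# Naive estimator: difference in group means.
control_mean = y[d == 0].mean()
diff_in_means = y[d == 1].mean() - control_mean

# Least-squares fit of y = beta0 + beta1 * d.
design = np.column_stack([np.ones(n), d])
(beta0, beta1), *_ = np.linalg.lstsq(design, y, rcond=None)

print(f"difference in means: {diff_in_means:.4f}")
print(f"beta1:               {beta1:.4f}")
print(f"control mean:        {control_mean:.4f}")
print(f"beta0:               {beta0:.4f}")
```

With a binary regressor and an intercept, the fitted β_0 is the control group mean and the fitted β_1 is the difference in means, up to floating-point error, so the two pairs of printed values should agree.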
When you do the regression of Y against D, you get the result in Figure 13.1.
    from statsmodels.api import OLS

    X['intercept'] = 1.  # constant column so the model fits an intercept term
    model = OLS(X['Y'], X[['D', 'intercept']])
    result = model.fit()
    result.summary()
Figure 13.1 The regression for Y = β_0 + β_1 D
Why did this work? Why is it okay to say the effect of the experiment is just the difference between the test and control group outcomes? It seems obvious, but that intuition will break down in the next section. Let’s make sure you understand it deeply before moving on.
Each person can be assigned to the test group or the control group, but not both. For a person assigned to the test group, you can talk hypothetically about the value their outcome would have had, had they been assigned to the control group. You can call this value Y_0 because it’s the value Y would take if D had been set to 0. Likewise, for control group members, you can talk about a hypothetical Y_1. What you really want to measure is the difference in outcomes δ = Y_1 − Y_0 for each person. This is impossible since each person can be in only one group! For this reason, these Y_1 and Y_0 variables are called potential outcomes.
If a person is assigned to the test group, you measure the outcome Y = Y_1. If a person is assigned to the control group, you measure Y = Y_0. Since you can’t measure individual effects, maybe you can measure population-level effects. We can try to talk instead about E[Y_1] and E[Y_0]. We’d like E[Y_1] = E[Y | D = 1] and E[Y_0] = E[Y | D = 0], but we’re not guaranteed that that’s true. In the recommender system test example, what would happen if you assigned people with higher Y_0 pageview counts to the test group? You might measure an effect that’s larger than the true effect!
Fortunately, you randomize D to make sure it’s independent of Y_0 and Y_1. That way, you’re sure that E[Y_1] = E[Y | D = 1] and E[Y_0] = E[Y | D = 0], so you can say that E[δ] = E[Y_1 − Y_0] = E[Y_1] − E[Y_0] = E[Y | D = 1] − E[Y | D = 0]. When other factors can influence the assignment, D, you can no longer be sure you have correct estimates! This is true in general when you don’t have control over a system, so you can’t ensure D is independent of all other factors.
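A small simulation can make the role of randomization concrete. This is a sketch under stated assumptions: both potential outcomes are generated for every person (which is only possible in a simulation), the true effect is fixed at 3 pageviews, and the distribution parameters and the helper `naive_estimate` are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, chosen only for reproducibility
n = 100_000

# In a simulation we can see both potential outcomes for everyone.
y0 = rng.normal(loc=10.0, scale=2.0, size=n)  # pageviews under control
y1 = y0 + 3.0                                 # pageviews under treatment
true_ate = (y1 - y0).mean()

def naive_estimate(d):
    """E[Y|D=1] - E[Y|D=0], using only the outcome each person reveals."""
    y = np.where(d == 1, y1, y0)
    return y[d == 1].mean() - y[d == 0].mean()

# Randomized assignment: D is independent of (Y_0, Y_1).
d_random = rng.binomial(1, 0.5, size=n)

# Biased assignment: people with above-median Y_0 go to the test group.
d_biased = (y0 > np.median(y0)).astype(int)

estimate_random = naive_estimate(d_random)
estimate_biased = naive_estimate(d_biased)

print(f"true ATE:            {true_ate:.3f}")
print(f"randomized estimate: {estimate_random:.3f}")
print(f"biased estimate:     {estimate_biased:.3f}")
```

Under random assignment the naive estimate should land close to the true effect, while assigning heavy readers to the test group inflates it, because the test group would have had more pageviews even without the change.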
In the general case, D is not just a binary variable. It can be ordered, discrete, or continuous. You might wonder about the effect of the length of an article on the share rate, of smoking on the probability of getting lung cancer, of the city you’re born in on future earnings, and so on.
Just for fun, before moving on, let’s look at something nice you can do in an experiment to get more precise results. Since we have a covariate, X, that also causes Y, we can account for more of the variation in Y. That makes our predictions less noisy, so our estimates for the effect of D will be more precise! Let’s see how this looks. We regress on both D and X now to get Figure 13.2.
Figure 13.2 The regression for Y = β_0 + β_1 D + β_2 X
Notice that the R^2 is much better. Also, notice that the confidence interval for D is much narrower! We went from a range of 3.95 − 2.51 = 1.44 down to 3.65 − 2.76 = 0.89. In short, finding covariates that account for the outcome can increase the precision of your experiments!
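The narrowing of the interval can be reproduced with a short sketch. This is not the statsmodels call behind the figures: it fits both regressions with NumPy, uses a normal approximation (z = 1.96) for the 95% interval rather than the t distribution, and the helper `ols_with_ci`, the seed, and the sample size are assumptions for illustration. The numbers will therefore not match the figures exactly.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed, chosen only for reproducibility
n = 1000

d = rng.binomial(1, 0.5, size=n)        # random assignment
x = rng.normal(size=n)                  # covariate that also causes y
y = 3.0 * d + x + rng.normal(size=n)    # true effect of d is 3

def ols_with_ci(design, y, z=1.96):
    """Return OLS coefficients with approximate 95% interval bounds."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = design.shape[0] - design.shape[1]
    sigma2 = resid @ resid / dof                      # residual variance
    cov = sigma2 * np.linalg.inv(design.T @ design)   # coefficient covariance
    se = np.sqrt(np.diag(cov))
    return beta, beta - z * se, beta + z * se

ones = np.ones(n)
beta_a, lo_a, hi_a = ols_with_ci(np.column_stack([ones, d]), y)     # Y ~ D
beta_b, lo_b, hi_b = ols_with_ci(np.column_stack([ones, d, x]), y)  # Y ~ D + X

width_without_x = hi_a[1] - lo_a[1]
width_with_x = hi_b[1] - lo_b[1]

print(f"D coefficient without X: {beta_a[1]:.3f}, interval width {width_without_x:.3f}")
print(f"D coefficient with X:    {beta_b[1]:.3f}, interval width {width_with_x:.3f}")
```

Including X removes its contribution from the residual variance, which shrinks the standard error of the D coefficient, so the second interval should be the narrower of the two.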