Research in Public #05: Does btc price affect SN behavior? Experiment failure!
Note 1. Quick update from last time. I checked whether the territory owner matters, and it turns out that it does! This implies that some of the relationship between territory cost and post zaps is due to the generosity of the owner. Including territory owner in the regression changes the estimates, but the basic result is still the same: higher posting costs lead to fewer, but higher-quality, posts. I won't bother showing these results here (they can be seen on GitHub), but they'll probably make it into the actual paper.
Note 2. The stuff in today's post is highly technical and there aren't any charts. People not interested in econometrics can skip to the last section.
- Introduction
- Recap thus far
- Some technical mumbo jumbo
- Experimental failure!
- How SHOULD bitcoin price affect behavior on SN?
Introduction
In #1243188 I suggested to @k00b that we engage in a research project using SN data. The idea would be to use this data to study: A) how micropayments with real money affect internet discourse; and B) barriers to the adoption of self-custody. I also promised @Undisciplined that I'd carry out the research in public, since many people might not know what economics research looks like, and may be curious as to how the process plays out. You can follow all of the updates here.
Recap thus far
So far, I've demonstrated that users are indeed responsive to posting fees:
- The quantity of posts goes down when territory posting costs go up (#1253062)
- The quality of posts, as measured by zaps and comments in the first 48 hours, goes up when territory posting costs go up (#1255322)
- These results are identified from changes to territory posting costs and comparison of responses across territories. They are not driven by spurious correlations coming from global time trends or peculiarities about individual territories.
The next thing I wanted to ask is: Does the fact that it's real money matter?
This is non-trivial. It's possible, however unlikely, that SN users primarily treat zaps like a scoreboard, and that their behavior wouldn't change whether zapping was based on fake tokens or on real money. We need some test to rule out that possibility.
Some technical mumbo jumbo
The first thing I thought of is to add bitcoin price to the regression models. Unfortunately, this is not straightforward to do, due to a technical issue with how the model is set up.
To see why, let's consider the regression of quantity of posts on territory posting cost. The original model was:
\log Q_{it} = \beta \log C_{it} + \mu_i + \delta_t + \epsilon_{it} ~ ~ ~ ~ (1)
where $i$ indexes a territory and $t$ indexes a week, and:
- $Q_{it}$ is the number of posts in territory $i$ in week $t$
- $C_{it}$ is the posting cost (in sats) in territory $i$ in week $t$
- $\mu_i$ is a territory fixed effect that captures baseline differences across territories
- $\delta_t$ is a week fixed effect that captures an arbitrary time effect
- $\epsilon_{it}$ is the error term.
That was the estimated model. Suppose that in the true model, people don't respond just to the posting cost in sats; they respond to the dollar value of the posting cost. The true model would be:
\log Q_{it} = \beta \log (P_{t} C_{it}) + \mu_i + \delta_t + \epsilon_{it} ~ ~ ~ ~ (2)
where $P_{t}$ is the dollar price of a sat in week $t$.
Here's the problem. We can rewrite equation (2) as follows:
\begin{align}
\log Q_{it} &= \beta \log C_{it} + \beta \log P_{t} + \mu_i + \delta_t + \epsilon_{it} \\
&= \beta \log C_{it} + \mu_i + \tilde{\delta}_{t} + \epsilon_{it} ~ ~ ~ ~ ~ ~ ~ ~ (3)
\end{align}
where we've simply grouped terms so that:
\tilde{\delta}_{t} = \beta \log P_{t} + \delta_t ~ ~ ~ ~ (4)
Thus, equation (2) is equivalent to equation (1) with a redefinition of the week fixed effects. Since the regression method estimates an arbitrary fixed effect for each time period $t$, running the regression using equation (1) or equation (2) would yield exactly the same estimate of $\beta$.
So we won't be able to learn anything new simply by re-running the regressions with the dollar value of the posting fees. We'll have to think of some other approach.
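The equivalence can be checked numerically. The following is a minimal sketch on simulated data, not the project's actual code: the panel dimensions, the parameter values, and the helper `fe_beta` are all made up for illustration. It fits the two-way fixed effects regression once with the sat-denominated cost and once with the dollar-denominated cost.

```python
import numpy as np

rng = np.random.default_rng(0)
n_terr, n_weeks = 8, 30          # hypothetical panel size
beta_true = -0.25                # hypothetical elasticity

# Simulated inputs (assumptions for illustration only)
log_C = rng.normal(3.0, 1.0, size=(n_terr, n_weeks))    # log posting cost in sats
log_P = np.cumsum(rng.normal(0.0, 0.1, size=n_weeks))   # log dollar price of a sat
mu = rng.normal(0.0, 1.0, size=n_terr)                  # territory fixed effects
delta = rng.normal(0.0, 0.5, size=n_weeks)              # week fixed effects
eps = rng.normal(0.0, 0.1, size=(n_terr, n_weeks))

# Data generated from the "true" model, equation (2)
log_Q = beta_true * (log_C + log_P) + mu[:, None] + delta + eps


def fe_beta(x, y):
    """OLS of y on x plus territory and week dummies; returns the slope on x."""
    terr = np.repeat(np.arange(n_terr), n_weeks)
    week = np.tile(np.arange(n_weeks), n_terr)
    d_terr = np.eye(n_terr)[terr]
    d_week = np.eye(n_weeks)[week][:, 1:]   # drop week 0 to avoid collinearity
    X = np.column_stack([x.ravel(), d_terr, d_week])
    coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
    return coef[0]


beta_sats = fe_beta(log_C, log_Q)            # equation (1): cost in sats
beta_usd = fe_beta(log_C + log_P, log_Q)     # equation (2): cost in dollars
print(beta_sats, beta_usd)
```

The two estimates agree up to floating-point error, because $\log P_t$ varies only by week and is therefore a linear combination of the week dummies.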
Experimental failure!
One thought I had was to run the regression as in (1) / (3) [they're the same], assuming (2) is the true model. Then, after the regression recovers the values for $\tilde{\delta}_{t}$, I can regress them on $\log P_{t}$ as in equation (4). Would the coefficient of that regression yield $\beta$?
Here are the results of those ill-advised regressions:
============================================================
                                    Dependent variable:
                               -----------------------------
                                     week fixed effect
                                  (1)       (2)       (3)
------------------------------------------------------------
log_price                      -1.120*** -0.171***  0.029*
                                (0.115)   (0.066)   (0.015)
Constant                       18.117***  4.492***  0.379**
                                (1.239)   (0.705)   (0.159)
------------------------------------------------------------
Observations                      226       226       226
R2                               0.296     0.030     0.016
Adjusted R2                      0.292     0.025     0.012
Residual Std. Error (df = 224)   0.985     0.560     0.126
F Statistic (df = 1; 224)      93.989***  6.813***  3.741*
============================================================
Note: *p<0.1; **p<0.05; ***p<0.01
(1): Week fixed effects from post quantity regression; beta=-0.244
(2): Week fixed effects from post quality (zaps) regression; beta=0.187
(3): Week fixed effects from post quality (comments) regression; beta=0.033
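For readers who want to see the mechanics, here is a minimal sketch of the two-step procedure on simulated data. It is not the code that produced the table: the panel size, parameter values, and the helper `two_step_slope` are assumptions for illustration. Step 1 estimates the week fixed effects from equation (1) / (3); step 2 regresses them on $\log P_t$. The sketch runs it once with no residual time effect and once with a residual time effect that co-moves with price.

```python
import numpy as np

rng = np.random.default_rng(1)
n_terr, n_weeks = 10, 200        # hypothetical panel size
beta_true = -0.25                # hypothetical elasticity

# Simulated inputs (assumptions for illustration only)
log_P = np.cumsum(rng.normal(0.0, 0.1, size=n_weeks))   # log dollar price of a sat
log_C = rng.normal(3.0, 1.0, size=(n_terr, n_weeks))    # log posting cost in sats
mu = rng.normal(0.0, 1.0, size=n_terr)                  # territory fixed effects

# Design matrix for step 1: log cost, territory dummies, week dummies (week 0 dropped)
terr = np.repeat(np.arange(n_terr), n_weeks)
week = np.tile(np.arange(n_weeks), n_terr)
X = np.column_stack([log_C.ravel(), np.eye(n_terr)[terr], np.eye(n_weeks)[week][:, 1:]])


def two_step_slope(delta):
    """Simulate equation (2) with week effects `delta`, then run the two steps."""
    eps = rng.normal(0.0, 0.1, size=(n_terr, n_weeks))
    log_Q = beta_true * (log_C + log_P) + mu[:, None] + delta + eps
    # Step 1: two-way fixed effects regression, equation (1) / (3)
    coef, *_ = np.linalg.lstsq(X, log_Q.ravel(), rcond=None)
    week_fe = coef[1 + n_terr:]              # weeks 1..T-1, measured relative to week 0
    # Step 2: regress the estimated week effects on log price, equation (4)
    slope, _intercept = np.polyfit(log_P[1:], week_fe, 1)
    return slope


slope_clean = two_step_slope(np.zeros(n_weeks))     # no residual time effects
slope_confounded = two_step_slope(0.5 * log_P)      # time effect that co-moves with price
print(slope_clean, slope_confounded)
```

In the first case the second-stage slope lands close to `beta_true`. In the second it lands close to `beta_true + 0.5`, so the slope picks up whatever part of the time effect moves with price.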
So the answer is, no, we don't recover the same values for $\beta$. But that wasn't necessarily unexpected. We'd only expect to recover the same $\beta$'s if both the following were true:
- Bitcoin price only affects user behavior through the dollar-value of posting costs
  - Unfortunately, this is unlikely since higher bitcoin price may directly affect interest levels on SN, irrespective of posting costs.
- There are no residual time effects correlated with bitcoin price
  - Unfortunately, this also seems quite unlikely. Many things were changing with regard to SN's operation and incentive structure during this time, and these changes may just happen to correlate with bitcoin price movements, even if they're not causally related.
Since the above two points are unlikely to be true, the coefficients in these regressions don't tell us much. So, this approach was a bit unenlightening, and can be considered a failed attempt at deriving insights from the data. Yes, that happens frequently in empirical research!
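The two conditions can be read off the usual formula for a bivariate regression slope. As a sketch, ignoring estimation error in the first-stage fixed effects and writing $\hat{b}$ (my notation) for the second-stage coefficient on $\log P_t$, equation (4) gives:

```latex
\begin{align}
\hat{b}
  &= \frac{\widehat{\mathrm{Cov}}(\tilde{\delta}_{t}, \log P_{t})}{\widehat{\mathrm{Var}}(\log P_{t})} \\
  &= \frac{\widehat{\mathrm{Cov}}(\beta \log P_{t} + \delta_t, \log P_{t})}{\widehat{\mathrm{Var}}(\log P_{t})} \\
  &= \beta + \frac{\widehat{\mathrm{Cov}}(\delta_t, \log P_{t})}{\widehat{\mathrm{Var}}(\log P_{t})}
\end{align}
```

So $\hat{b}$ equals $\beta$ only when the remaining week effect $\delta_t$ is uncorrelated with $\log P_t$ in the sample. A direct effect of bitcoin price on interest in SN, or any operational change that happens to track price, lives inside $\delta_t$ and shows up in the second term.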
How SHOULD bitcoin price affect behavior on SN?
To find a way forward, it might help to think of all the different ways bitcoin price might affect behavior on SN, and see if we can test that in such a way that the effect of bitcoin price can be isolated from other things going on at the same time.
Here are some possibilities:
- Higher bitcoin price attracts more users and posts. Seems like we won't be able to separate that from time trends in SN's growth though.
- Higher bitcoin price discourages unprofitable posts and encourages profitable posts. This seems promising, since a higher bitcoin price will amplify both profits and losses on SN. Thus, we should see a divergence in the number of profitable posts and unprofitable posts when bitcoin price is higher, even while controlling for arbitrary time trends in the total number of posts.
- Another possibility is to keep the current framework, but impose stricter functional form assumptions on the arbitrary time trend, such that the time effects don't absorb the entire effect of bitcoin price.
- Anything else? I am hereby asking SN users to help me think of ways to disentangle the effect of bitcoin price from general time trends in SN popularity and usage patterns.
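To make the second possibility above concrete, one way it could be written down is as follows. This is a hypothetical specification, not something I've estimated, and the group index $g$, the indicator, and the coefficient $\gamma$ are notation introduced here for illustration:

```latex
\log Q_{gt} = \gamma \, \mathbb{1}[g = \text{profitable}] \cdot \log P_{t} + \mu_g + \delta_t + \epsilon_{gt}
```

Here $g$ indexes the group (profitable or unprofitable posts), $Q_{gt}$ is the number of posts in group $g$ in week $t$, $\mu_g$ is a group fixed effect, and $\delta_t$ is a week fixed effect. The week fixed effect still absorbs the common effect of $\log P_t$ along with any arbitrary time trend, but $\gamma$ is identified from the divergence between the two groups. The prediction from the bullet above would be $\gamma > 0$.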
Anyway that's all I have for today. Anyone who wants to vet the code can go to https://github.com/ed-kung/sn-research. I'll keep posting any time I spend a day doing substantial work on this project.
edit: @SimpleStacker