
RIP #14: A Formal Model of Posting Behavior on Stacker News

8136 sats \ 24 comments \ @SimpleStacker 2 Mar econ

RIP #14: A Formal Model of Posting Behavior on Stacker News^[1]

One piece of feedback I got from my department is that I should formalize the model of posting behavior on Stacker News. When I started the paper, I thought that would be mathematical overkill because the intuition is fairly obvious, and the paper is primarily about the data analysis, not formal modeling. But I decided to try it, and having done it, I think it does add a bit of "gravitas" to the paper, and I really like the new results.

The model works like this: people randomly receive "post ideas" that have a certain quality and intrinsic motivation to post. The expected amount that the post gets zapped depends on the quality, and quality and intrinsic motivation could be correlated with each other. When a user receives a post idea, they post it if the expected zaps plus intrinsic motivation exceeds the cost of posting.

The point of the model is to formalize the conditions under which we get the main result, that higher posting costs:

reduce the number of posts, and
increase the expected quality of posts

As it turns out, the logic holds under fairly unrestrictive conditions. We only require three things for the result to hold:

1. Expected zap amounts need to be increasing in post quality (very reasonable, maybe even tautological)
2. Quality and intrinsic motivation to post cannot be negatively related to one another, though they can be positively related or independent of each other (also very reasonable)
3. Extreme levels of intrinsic motivation to post are sufficiently rare (this is the one that perhaps needs the most justification)

Assumption 1 is almost tautological, for how would we judge post quality if not by the amount that people zap it?

Assumption 2 says that if someone has a higher quality post idea, that usually comes with a higher intrinsic motivation to post it as well. I think that's almost certainly true: if you put a lot of effort into something, you have a greater desire to show it to others, or you put in the effort because you care a lot about it in the first place.

Assumption 3 is there to rule out a weird situation in which posting costs get so high that only hyper-motivated but low quality posters remain on the platform. Even if assumption 3 fails, we can still get our main result if we simply don't focus on the extreme tails of the distributions.

One reason I like having the model in the paper is that by formalizing the conditions under which the results hold, I can make the following claim: "All three assumptions are likely to be satisfied in other social media environments, and thus it may be possible to extrapolate our results to other settings (though perhaps with different baseline cost and rewards levels.)" That addresses one of the biggest critiques of the paper, which is that results on Stacker News don't generalize to broader social media contexts.

The model and proof are stated below as a flex.

(... but also because I already did the work and want to show my proof of work)

Model

We envision a model in which a user receives "post ideas" as a Poisson arrival process with rate $γ$. Post ideas arrive with a quality level $q$, which affects how other users respond to the post, and an intrinsic motivation to post it, $ϵ$. The level of intrinsic motivation may be correlated with post quality, and the two are drawn from a joint probability density function $f (q, ϵ)$. We allow different users to draw from different distributions of $(q, ϵ)$, but for now we focus on the decisions of a single user.

By Bayes' Rule, $f (q, ϵ) = f_{ϵ ∣ q} (ϵ ∣ q) f_{q} (q)$ , where $f_{ϵ ∣ q} (\cdot)$ is the conditional density of $ϵ$ given $q$ , and $f_{q} (\cdot)$ is the marginal density of $q$ . We assume that $f_{ϵ ∣ q} (\cdot)$ is strictly log-concave and weakly satisfies the monotone likelihood ratio property in $ϵ$ with respect to $q$ :

Assumption 1. (Log Concavity) For each $q$, $f_{ϵ ∣ q} (\cdot ∣ q)$ is strictly log-concave; that is, $\ln f_{ϵ ∣ q} (ϵ ∣ q)$ is strictly concave in $ϵ$.

Assumption 2. (Monotone Likelihood Ratio Property) For any $q_{2} > q_{1}$ , the likelihood ratio of the conditional densities,

$$ℓ (ϵ; q_{1}, q_{2}) = \frac{f_{ϵ ∣ q} (ϵ ∣ q_{2})}{f_{ϵ ∣ q} (ϵ ∣ q_{1})}$$

is weakly increasing in $ϵ$ .

Remark.

Assumption 1 limits the heaviness of the tail of the motivation distribution. Intuitively, it ensures that the extreme tail of the motivation distribution is not thick enough to swamp quality selection effects at very high posting costs. It is not a strong assumption, and it is satisfied by most commonly used distributional families, including the normal and logistic distributions.^[2]

Assumption 2 is a non-negative dependence condition between post quality $q$ and intrinsic motivation $ϵ$ . Informally, it says that higher quality ideas tend to come with (weakly) higher intrinsic desire to post: when $q$ is higher, the distribution of $ϵ$ shifts to the right. Independence between $ϵ$ and $q$ is allowed as a special case, but the assumption rules out systematic negative dependence. In our context, this means that users who receive a high quality post idea are also (weakly) more likely to feel a strong intrinsic desire to post it. As we shall document later, this is consistent with users' self reported perceptions of their own motivations.
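As a worked illustration (an assumed example, not a functional form taken from the paper), suppose $ϵ ∣ q$ is normal with mean $μ (q)$ and variance $σ^{2}$, where $μ (\cdot)$ is a hypothetical weakly increasing function. Then $\ln f_{ϵ ∣ q} (ϵ ∣ q) = - (ϵ - μ (q))^{2} / (2 σ^{2}) + \text{const}$ is strictly concave in $ϵ$, so Assumption 1 holds, and the likelihood ratio is:

$$ℓ (ϵ; q_{1}, q_{2}) = \exp \left[ \frac{(ϵ - μ (q_{1}))^{2} - (ϵ - μ (q_{2}))^{2}}{2 σ^{2}} \right] = \exp \left[ \frac{μ (q_{2}) - μ (q_{1})}{σ^{2}} ϵ - \frac{μ (q_{2})^{2} - μ (q_{1})^{2}}{2 σ^{2}} \right]$$

This is weakly increasing in $ϵ$ exactly when $μ (q_{2}) \geq μ (q_{1})$, so Assumption 2 holds in this example. When $μ (q_{2}) = μ (q_{1})$ the ratio is constant, which is the independence case.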

Once a post is made, zaps arrive according to a Poisson process. Both the arrival rate of zaps and the amount of each zap are dependent on the quality of the post, $q$ . Let $λ (q)$ be the arrival rate as a function of $q$ , and let $g (z ∣ q)$ be the probability density function of the zap amount conditional on $q$ .

Supposing the user discounts future zaps at a rate of $ρ$, the expected present value of the Poisson stream of zaps is equal to:^[3]

$$V (q) = \frac{λ (q)}{ρ} E [z ∣ q]$$

We assume that $V (q)$ is continuous and strictly increasing in $q$ , which merely says that the expected present value of zaps is higher for higher quality posts:

Assumption 3. (Monotone Returns to Quality) $V (q) = \frac{λ ( q )}{ρ} E [z ∣ q]$ is continuous and strictly increasing in $q$ .
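To make the present value formula concrete, here is the integration step written out, treating $ρ$ as a continuous-time discount rate, followed by a purely hypothetical numerical example (the numbers are made up for illustration and are not estimates from the data):

$$V (q) = \int_{0}^{\infty} e^{- ρ t} λ (q) E [z ∣ q] \, d t = λ (q) E [z ∣ q] \left[ - \frac{e^{- ρ t}}{ρ} \right]_{0}^{\infty} = \frac{λ (q)}{ρ} E [z ∣ q]$$

For instance, if a post of quality $q$ attracted $λ (q) = 2$ zaps per day with an average size of $E [z ∣ q] = 50$ sats, and $ρ = 0.5$ per day, then $V (q) = 2 \times 50 / 0.5 = 200$ sats.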

The user will make the post if their expected present value of zaps plus their intrinsic motivation is greater than the posting cost $c$ :

$$\text{Post} = \begin{cases} 1 & \text{if } V (q) + ϵ > c \\ 0 & \text{otherwise} \end{cases}$$

From the perspective of the user, $c$ is fixed and known at the time of the post.
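The posting rule can be sanity-checked with a small simulation. This is a minimal sketch under assumed functional forms that are not from the paper: quality $q$ is standard normal, motivation is `corr * q` plus standard normal noise (which satisfies Assumptions 1 and 2 when `corr` is non-negative), and the hypothetical return function is $V (q) = 2 q$. The function name `simulate` and all parameter values are illustrative only.

```python
import random


def simulate(cost, n_ideas=50_000, corr=0.3, seed=0):
    """Return (share of ideas posted, mean quality of posted ideas)."""
    rng = random.Random(seed)  # same seed for every cost: common random numbers
    posted = 0
    quality_sum = 0.0
    for _ in range(n_ideas):
        q = rng.gauss(0.0, 1.0)               # idea quality
        eps = corr * q + rng.gauss(0.0, 1.0)  # intrinsic motivation, shifts right with q
        v = 2.0 * q                           # hypothetical V(q), strictly increasing in q
        if v + eps > cost:                    # decision rule: post iff V(q) + eps > c
            posted += 1
            quality_sum += q
    return posted / n_ideas, quality_sum / posted


for c in (0.0, 1.0, 2.0):
    share, mean_q = simulate(c)
    print(f"cost={c:.1f}  share posted={share:.3f}  mean quality={mean_q:.3f}")
```

Under these assumptions, the share of ideas posted should fall and the mean quality of posted ideas should rise as the cost goes from 0 to 2.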

We now demonstrate conditions under which an increase in posting cost $c$ will lead to fewer posts, but higher quality posts.

Proposition 1.
Suppose Assumptions 1-3 hold. Then an increase in $c$:

(i) strictly decreases the probability that the post idea is posted, and
(ii) strictly increases the expected quality of posts, conditional on posting.

Proof.

A post idea $(q, ϵ)$ is posted if and only if $ϵ > c - V (q)$. Define $π (q, c)$ as the probability (over $ϵ$) that a post idea of quality $q$ is posted when the cost is $c$:

$$π (q, c) = P (ϵ > c - V (q))$$

$π (q, c)$ is equivalent to the survival function of $ϵ ∣ q$ , evaluated at $c - V (q)$ : $π (q, c) = S_{ϵ ∣ q} (c - V (q) ∣ q)$ .

(i) The unconditional probability of posting is:

$$P (\text{post}) = \int S_{ϵ ∣ q} (c - V (q) ∣ q) f_{q} (q) \, d q$$

Since $S_{ϵ ∣ q} (\cdot ∣ q)$ is a survival function, its derivative is $- f_{ϵ ∣ q} (\cdot ∣ q)$, which is never positive and is strictly negative wherever the conditional density is positive. An increase in $c$ will therefore reduce the value of $S_{ϵ ∣ q} (c - V (q) ∣ q)$ at every $q$ for which the density at the threshold is positive. Since that is what we are integrating over to get $P (\text{post})$, this strictly reduces $P (\text{post})$. The expected number of posts per unit time is $γ P (\text{post})$, so an increase in $c$ strictly reduces the expected number of posts.

(ii) Take any $c^{'} > c$ . The set of ideas that were posted at cost $c$ can be partitioned into two groups:

Surviving posts. Ideas that are still posted at cost $c^{'}$ , i.e. $V (q) + ϵ > c^{'}$
Marginal posts. Ideas that are posted at $c$ but not at $c^{'}$, i.e. $c < V (q) + ϵ \leq c^{'}$.

By the Law of Iterated Expectations,

$$E [q ∣ \text{post at } c] = α E [q ∣ \text{surviving}] + (1 - α) E [q ∣ \text{marginal}]$$

where $α = P (\text{surviving} ∣ \text{post at } c)$. Noting that $E [q ∣ \text{post at } c^{'}] = E [q ∣ \text{surviving}]$, we rearrange the above equation to obtain:

$$E [q ∣ \text{post at } c^{'}] - E [q ∣ \text{post at } c] = (1 - α) (E [q ∣ \text{surviving}] - E [q ∣ \text{marginal}])$$

Thus, to show that expected post quality is increasing in $c$ , it suffices to show that the expected quality of the surviving posts is always higher than the expected quality of the marginal posts.

To demonstrate this, it is sufficient to establish that for any $q_{2} > q_{1}$ , the ratio $π (q_{2}, c) / π (q_{1}, c)$ is strictly increasing in $c$ . Intuitively, this says that as costs rise, higher quality posts retain a larger share of the posting probability relative to lower quality ideas. This would imply that the marginal posts are tilted towards lower quality than surviving posts.

We write:

$$\frac{π (q_{2}, c)}{π (q_{1}, c)} = \frac{S_{ϵ ∣ q} (c - V (q_{2}) ∣ q_{2})}{S_{ϵ ∣ q} (c - V (q_{1}) ∣ q_{1})}$$

Define the hazard rate: $h (ϵ ∣ q) = f_{ϵ ∣ q} (ϵ ∣ q) / S_{ϵ ∣ q} (ϵ ∣ q)$ . Differentiating $π (q_{2}, c) / π (q_{1}, c)$ with respect to $c$ shows us that the ratio is increasing in $c$ if and only if:

$$h (c - V (q_{1}) ∣ q_{1}) > h (c - V (q_{2}) ∣ q_{2})$$
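The differentiation step can be spelled out by working with logs. Since the derivative of a survival function is the negative of the density, for each $q$:

$$\frac{\partial}{\partial c} \ln S_{ϵ ∣ q} (c - V (q) ∣ q) = - \frac{f_{ϵ ∣ q} (c - V (q) ∣ q)}{S_{ϵ ∣ q} (c - V (q) ∣ q)} = - h (c - V (q) ∣ q)$$

and therefore:

$$\frac{\partial}{\partial c} \ln \frac{π (q_{2}, c)}{π (q_{1}, c)} = h (c - V (q_{1}) ∣ q_{1}) - h (c - V (q_{2}) ∣ q_{2})$$

The log of the ratio, and hence the ratio itself, is strictly increasing in $c$ exactly when the right-hand side is positive.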

The proof now proceeds in two parts.

Claim 1. First, we show that $h (ϵ ∣ q_{1}) \geq h (ϵ ∣ q_{2})$ for any $q_{2} > q_{1}$ . By the MLRP, for any $t > ϵ$ , we have that $ℓ (t; q_{1}, q_{2}) \geq ℓ (ϵ; q_{1}, q_{2})$ , which rearranges to:

$$f_{ϵ ∣ q} (t ∣ q_{2}) f_{ϵ ∣ q} (ϵ ∣ q_{1}) \geq f_{ϵ ∣ q} (ϵ ∣ q_{2}) f_{ϵ ∣ q} (t ∣ q_{1})$$

Integrating both sides over $t$ from $ϵ$ to $\infty$ gives us:

$$S_{ϵ ∣ q} (ϵ ∣ q_{2}) f_{ϵ ∣ q} (ϵ ∣ q_{1}) \geq S_{ϵ ∣ q} (ϵ ∣ q_{1}) f_{ϵ ∣ q} (ϵ ∣ q_{2})$$

which rearranges to $h (ϵ ∣ q_{1}) \geq h (ϵ ∣ q_{2})$ as was to be demonstrated.

Claim 2. Now, we show that $h (ϵ ∣ q)$ is increasing in $ϵ$ for each $q$ . Take the derivative of $h (ϵ ∣ q)$ with respect to $ϵ$ . The sign is equal to the sign of:

$$f_{ϵ ∣ q} (ϵ ∣ q)^{2} + f_{ϵ ∣ q}^{'} (ϵ ∣ q) S_{ϵ ∣ q} (ϵ ∣ q)$$

When $f_{ϵ ∣ q}^{'} (ϵ ∣ q) \geq 0$, this is immediately positive. When $f_{ϵ ∣ q}^{'} (ϵ ∣ q) < 0$, we rely on the log-concavity property. Since $f_{ϵ ∣ q} (\cdot ∣ q)$ is strictly log-concave, its logarithm lies strictly below its tangent line at $ϵ$, so the following holds for every $t > ϵ$:

$$\ln f_{ϵ ∣ q} (t ∣ q) < \ln f_{ϵ ∣ q} (ϵ ∣ q) + (t - ϵ) \frac{f_{ϵ ∣ q}^{'} (ϵ ∣ q)}{f_{ϵ ∣ q} (ϵ ∣ q)}$$

Exponentiating both sides:

$$f_{ϵ ∣ q} (t ∣ q) < f_{ϵ ∣ q} (ϵ ∣ q) \exp \left[ (t - ϵ) \frac{f_{ϵ ∣ q}^{'} (ϵ ∣ q)}{f_{ϵ ∣ q} (ϵ ∣ q)} \right]$$

Now integrating over $t$ from $ϵ$ to $\infty$ :

$$S_{ϵ ∣ q} (ϵ ∣ q) < f_{ϵ ∣ q} (ϵ ∣ q) \int_{ϵ}^{\infty} \exp \left[ (t - ϵ) \frac{f_{ϵ ∣ q}^{'} (ϵ ∣ q)}{f_{ϵ ∣ q} (ϵ ∣ q)} \right] d t = - \frac{f_{ϵ ∣ q} (ϵ ∣ q)^{2}}{f_{ϵ ∣ q}^{'} (ϵ ∣ q)}$$

Since $f_{ϵ ∣ q}^{'} (ϵ ∣ q) < 0$ , this can be rearranged to:

$$f_{ϵ ∣ q} (ϵ ∣ q)^{2} + f_{ϵ ∣ q}^{'} (ϵ ∣ q) S_{ϵ ∣ q} (ϵ ∣ q) > 0$$

which was to be demonstrated. Thus, $h (ϵ ∣ q)$ is strictly increasing in $ϵ$ for all $q$ .
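Claim 2 can be checked numerically for a specific log-concave density. The sketch below assumes a standard normal motivation distribution (chosen for illustration, not prescribed by the paper) and evaluates the hazard rate on a grid, along with the sign expression $f^{2} + f^{'} S$, using the fact that $f^{'} (x) = - x f (x)$ for the standard normal. The helper names are hypothetical.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_sf(x):
    """Survival function 1 - CDF of the standard normal, via erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def hazard(x):
    """Hazard rate h(x) = f(x) / S(x)."""
    return normal_pdf(x) / normal_sf(x)


def sign_expression(x):
    """f(x)^2 + f'(x) S(x), with f'(x) = -x f(x) for the standard normal."""
    f = normal_pdf(x)
    return f * f - x * f * normal_sf(x)


grid = [i * 0.5 for i in range(-8, 9)]  # points from -4.0 to 4.0
hazards = [hazard(x) for x in grid]
hazard_is_increasing = all(a < b for a, b in zip(hazards, hazards[1:]))
sign_is_positive = all(sign_expression(x) > 0 for x in grid)
print(hazard_is_increasing, sign_is_positive)
```

Both checks should come out true on this grid, consistent with the claim that a strictly log-concave density has a strictly increasing hazard rate.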

By the monotonicity assumption, $c - V (q_{1}) > c - V (q_{2})$ . By Claim 2, this implies that $h (c - V (q_{1}) ∣ q_{1}) > h (c - V (q_{2}) ∣ q_{1})$ . By Claim 1, $h (c - V (q_{2}) ∣ q_{1}) \geq h (c - V (q_{2}) ∣ q_{2})$ . Together, we have $h (c - V (q_{1}) ∣ q_{1}) > h (c - V (q_{2}) ∣ q_{2})$ , which was to be demonstrated.
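To see why Assumption 1 matters, here is a sketch of how part (ii) can fail without it. The setup is assumed for illustration and is not from the paper: two equally likely quality levels $q \in \{0, 1\}$, a hypothetical return function $V (q) = q$, and motivation drawn independently of quality from a heavy-tailed Pareto distribution, whose density is not log-concave and whose hazard rate is decreasing. Assumptions 2 and 3 still hold in this setup.

```python
def pareto_sf(x, alpha=1.5, x_min=1.0):
    """Survival function of a Pareto distribution (heavy tail, decreasing hazard)."""
    return 1.0 if x <= x_min else (x_min / x) ** alpha


def mean_quality(cost):
    """Expected quality of posted ideas when q is 0 or 1 with equal probability."""
    p_low = pareto_sf(cost - 0.0)   # P(post | q = 0): eps must exceed c - V(0) = c
    p_high = pareto_sf(cost - 1.0)  # P(post | q = 1): eps must exceed c - V(1) = c - 1
    return (0.0 * p_low + 1.0 * p_high) / (p_low + p_high)


for c in (3.0, 5.0, 10.0):
    print(f"cost={c:.0f}  mean quality of posts={mean_quality(c):.3f}")
```

In this example, the expected quality of posts falls from roughly 0.65 at a cost of 3 to roughly 0.54 at a cost of 10, even though the probability of posting still falls. At high costs, only ideas with extreme motivation draws are posted, and quality plays almost no role in selection.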

Remark.

Proposition 1 implies that when posting cost increases, users are less likely to post, but that when they do post, the average quality of the post is higher. Although the proposition is stated from the perspective of a single user, we note that as long as the global joint distribution of $(q, ϵ)$ satisfies Assumptions 1 and 2, and if Assumption 3 also holds for all users, then Proposition 1 applies in the aggregate, even if users have heterogeneous marginal densities for $q$ and $ϵ$.

The result of Proposition 1 extends to a setting with multiple territories. Users post in the territory $j$ that maximizes $V_{j} (q_{j}) + ϵ_{j} - c_{j}$ , provided it is positive. When territory $j$ raises its posting cost $c_{j}$ , users' outside options are unaffected, so the same logic applies: as long as the population-level joint distribution of quality and motivation within each territory satisfies Assumptions 1-3, a territory that raises its posting cost will see fewer but higher quality posts.

[1] Note: This is a series in which I am publicly documenting the research process for an academic study into financial micro-incentives on discussion platforms, using data from Stacker.News. See here for a list of updates.
[2] Weaker, local conditions about the hazard rate in the neighborhood of the current posting cost would also suffice, but we state the stronger condition of global log-concavity for simplicity. Moreover, strict log-concavity is assumed for simplicity, but it suffices to assume either strict log-concavity or strict monotone likelihood ratio property; we do not need strictness for both.
[3] This follows from $V (q) = \int_{0}^{\infty} e^{- ρt} λ (q) E [z ∣ q] d t$.


117 sats \ 2 replies \ @BlokchainB 2 Mar

I think we should rename these to SNIPs

Stacker news improvement proposals haha

75 sats \ 0 replies \ @Undisciplined 3 Mar

I’ll adopt that going forward

160 sats \ 0 replies \ @SimpleStacker OP 2 Mar

I didn't actually propose any improvements though haha

72 sats \ 12 replies \ @denlillaapan 2 Mar

Really appreciate this and glad you write it up for us.

I confess I'm too tired to engage with it — and about a decade out of econ modeling/grad school stuff to grasp much. Sending some zaps and appreciation instead

63 sats \ 11 replies \ @SimpleStacker OP 2 Mar

Main takeaway is just that you don't need very strong assumptions to get the basic result that raising posting costs reduces number of posts but raises average post quality. The channel works by filtering out lower quality posts.

The thing that could break the result is if low quality posters also have strong motivation to post. If every poster was like SS, then raising posting costs might not improve post quality coz you're left with highly motivated, low quality posters.

Very intuitive, but proving the mathematical model gives the paper a bit more meat on the bones.

124 sats \ 10 replies \ @denlillaapan 2 Mar

Would it not be of importance that the real world value of the cost is so tiny that it won't matter? (And that we can safely intuit that zaps post-posting will cover it)

Do we care if it's 63 or 103 sats to post?

For instance, I don't believe my behavior was changed at all during Undisciplined's econ cost experiment. (Doesn't have to mean anything, ofc, just that I'm way up the D curve).

77 sats \ 3 replies \ @Undisciplined 3 Mar

It doesn’t affect you because you’re so far above the break even point.

It does seem to affect people who go from positive to negative expectations. It also seemed like people would territory shop for where to post.

115 sats \ 2 replies \ @denlillaapan 3 Mar

could you clearly tell quantity or posting quality shifting up and down from back when you varied the ~econ posting cost a lot?

196 sats \ 0 replies \ @SimpleStacker OP 3 Mar

The estimated elasticity was about 0.2, so it'd be pretty hard to notice by the naked eye, without using statistical tests. The statistical relation seems pretty robust though.

5 sats \ 0 replies \ @Undisciplined 3 Mar

I thought so but with some of the later/smaller changes, it was probably more of a vibe check on my part.

136 sats \ 5 replies \ @SimpleStacker OP 2 Mar

I share your instinct. That's why I wrote this in the paper:

Taken together, the results are both surprising and unsurprising. They are unsurprising in the sense that they conform with economic theory: demand curves slope downwards (when posting costs go up, number of posts goes down), and signaling theory works (when posting costs go up, higher quality posts are made). They are surprising in the sense that even such small micro-incentives (the average posting cost is just 51 sats, or about 5 cents) are enough to influence user behavior in such a way that post quality is improved. The results suggest that pay-to-post may be an effective mechanism for mediating content quality, even at very small monetary amounts. Moreover, the assumptions required for this result to hold more broadly are fairly weak: we require only that expected rewards are increasing in post quality (Assumption 3), that post quality and intrinsic motivation to post are positively affiliated (Assumption 2), and that extreme levels of intrinsic motivation are sufficiently rare (Assumption 1). All three assumptions are likely to be satisfied in other social media environments, and thus it may be possible to extrapolate our results to other settings (though perhaps with different baseline cost and rewards levels.)

5 sats \ 4 replies \ @Solomonsatoshi 3 Mar

How about people who prefer not to zap posts from people who wank on about Bitcoin but do not attach Bitcoin sending wallets?

They are asking content consumers to zap them real money but they don't bother to set up sending wallets to be able to send real money themselves.

Maybe the quality of posts would improve if people posting attached both sending and receiving LN wallets and were not so obviously arsemilking hypocrites?

49 sats \ 1 reply \ @SimpleStacker OP 3 Mar

When I zap, I send sats, but when I receive, I'm happy to receive either sats or CCs from people.

5 sats \ 0 replies \ @Solomonsatoshi 3 Mar -102 sats

Yes I know that from looking at your profile but @denlillaapan cannot send sats to anyone as he only attached a receiving wallet.

It is unfortunate that wallets status is now concealed by default although people who have not attached or only attached receiving wallets but who claim to be Bitcoiners, like @denlillaapan might be rather happy to now have their wallet status concealed by default.

5 sats \ 1 reply \ @denlillaapan 3 Mar

who wank on about Bitcoin but do not attach Bitcoin sending wallets?

ah bro, I love checking your comments now and again. "wank on about Bitcoin?!"

Yeah, truly, that I do!

5 sats \ 0 replies \ @Solomonsatoshi 3 Mar -102 sats

You are providing content to a Bitcoin centric audience and not infrequently your content refers to Bitcoin.

But are you @denlillaapan here asserting you are not a Bitcoiner?

If you do not consider yourself a Bitcoiner please confirm it and I will offer my sincere apologies.

258 sats \ 1 reply \ @Taj 2 Mar

Ye I would think of a post and just post it , if it was imo worthy of an upload

But if the barrier to entry, cost to post was much higher

I would probably think of an idea, perhaps shelve it, in favour of another better idea in the future, or wait for yet another idea to complement the original idea

my pet hate posts

Link only posts?

Ffs put some meat on the bones

Also we are seeing that bot called something or other commenting on every post with some inane vanilla bs

The cost of entry for that bot is the seesaw, obviously currently it's just spamming everyone

The question is , is that an acceptable annoyance or do we need SNIP-110 🤣🤣🤣

175 sats \ 0 replies \ @SimpleStacker OP 2 Mar

SNIP-110, good one, haha!

123 sats \ 1 reply \ @Undisciplined 2 Mar

You could probably hand wave this away, but it seems like there should be an effort of creating the post that would be additional to the fee and would likely be increasing in q.

59 sats \ 0 replies \ @SimpleStacker OP 2 Mar

I plan to tackle endogenous quality selection next. I think I need it to motivate the section on v4v and learning