The Size Conundrum: Why Online Knowledge Markets Can Fail at Scale

Knowledge Market

Why did Yahoo Answers fail? Why is Stack Overflow declining? Will Quora survive in 2020?

When are community question answering (CQA) platforms like StackExchange sustainable? In our recent paper at the Web Conference, we investigate CQA sustainability from an economic perspective, providing insights into their successes and failures through an interpretable model. Our key idea is to interpret CQA platforms as markets where participants exchange knowledge. We used this “knowledge market” perspective to analyze the question answering sites on StackExchange, and showed that content generation in StackExchanges can be captured by the Cobb-Douglas model, a production model from economics. Our model provides an intuitive explanation for the successes and failures of knowledge markets through the concept of economies and diseconomies of scale.

Missing figure for diseconomies of scale.
Figure 1: The Cobb-Douglas model can predict economies (diseconomies) of scale, that is, whether the ratio of answers to questions will increase (decrease) as the number of users increases. We present the economies and diseconomies of scale in three StackExchanges: SUPERUSER (strong diseconomies), PUZZLING (weak economies), and CSTHEORY (strong economies). Most StackExchanges exhibit diseconomies of scale.

Modeling Content Generation in Knowledge Market

In a knowledge market, users generate different types of content such as questions, answers, and comments. Content in these markets exhibits dependencies, e.g., answers depend on questions, and comments depend on questions and answers. We combine user participation and content dependency to model content generation in knowledge markets. To this end, we use production functions, which are a natural way to model output in a market. These functions involve two components: a basis function (Figure 2) and an interaction type (Figure 3). We consider three possible bases (exponential, power, and sigmoid) and four possible interaction types (essential, interactive essential, antagonistic, and substitutable). Among these choices, we found that the combination of a power basis and an interactive essential interaction provides the best fit to our dataset for all StackExchanges. In economics, this combination is known as the Cobb-Douglas model.

Missing figure for basis functions.
Figure 2: A basis function captures the relationship between an input and an output, e.g., how the number of questions affects the number of answers.


Missing figure for interaction types.
Figure 3: An interaction type captures how different inputs interact to produce an output, e.g., how the number of questions and the number of answerers interact to generate the number of answers.
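
To make the functional form concrete, here is a minimal Python sketch of a Cobb-Douglas production function: each input enters through a power basis, and the inputs are multiplied together, so output drops to zero if any input is zero. The choice of two inputs (number of questions and number of answerers), the function and parameter names, and all numeric values are illustrative assumptions, not the fitted model from the paper.

```python
import math


def cobb_douglas(scale, inputs, exponents):
    """Cobb-Douglas output: scale * product of inputs[i] ** exponents[i].

    Each input contributes through a power basis (x ** e), and the
    contributions are multiplied, so every input is essential.
    """
    if len(inputs) != len(exponents):
        raise ValueError("inputs and exponents must have the same length")
    return scale * math.prod(x ** e for x, e in zip(inputs, exponents))


# Hypothetical values, chosen only for illustration (not from the paper).
questions = 1000.0
answerers = 200.0
exponents = [0.6, 0.3]  # assumed exponents for questions and answerers

answers = cobb_douglas(1.5, [questions, answerers], exponents)

# Power basis: doubling the questions multiplies output by 2 ** 0.6.
doubled = cobb_douglas(1.5, [2 * questions, answerers], exponents)
growth_factor = doubled / answers

# Essential inputs: with no answerers, no answers are produced.
no_answerers = cobb_douglas(1.5, [questions, 0.0], exponents)
```

In this sketch, an exponent below 1 means the corresponding input has diminishing returns: adding more questions still increases the number of answers, but less than proportionally.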

Key Findings

Our model asserts that, for most StackExchanges, the aggregate user behavior changes with the size of the user community. Figure 4 shows how answering behavior in the ANDROID StackExchange changes with community size. Specifically, as the community grows, users who joined more recently are likely to contribute fewer answers than those who joined at an early stage. This phenomenon holds for most StackExchanges, with varying strength, and the larger communities tend to behave similarly to ANDROID. Why do we observe such a size-dependent distribution of user behavior? It turns out that StackExchanges have a core group of users who contribute a substantial share of the answers over a long period. The size of this core does not grow in proportion to the community size, which causes the size-dependent distribution, which in turn causes diseconomies of scale: the ratio of answers to questions declines as the community grows.

Missing figure for size dependent distribution.
Figure 4: Size-dependent distribution of answering behavior in the ANDROID StackExchange. Users who joined more recently are likely to contribute fewer answers than those who joined at an early stage.
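
The link between community size and the answer-to-question ratio can be sketched with a short calculation. The sketch below is an illustration under stated assumptions, not the paper's estimation procedure: it assumes that the number of questions and the number of answerers each grow as a power of the number of users (exponents `alpha` and `beta`), and that answers follow a Cobb-Douglas form in questions and answerers (exponents `lambda_q` and `lambda_a`). All names and numeric values are hypothetical. Under these assumptions, the ratio of answers to questions is proportional to users raised to `alpha * (lambda_q - 1) + beta * lambda_a`, and the sign of that exponent separates economies from diseconomies of scale.

```python
import math


def ratio_exponent(alpha, beta, lambda_q, lambda_a):
    """Exponent g such that answers / questions is proportional to users ** g.

    Assumes questions ~ users ** alpha, answerers ~ users ** beta, and
    answers ~ questions ** lambda_q * answerers ** lambda_a.
    """
    return alpha * (lambda_q - 1.0) + beta * lambda_a


def classify_scale(g, tol=1e-9):
    """Label the regime implied by the sign of the ratio exponent."""
    if g > tol:
        return "economies of scale"
    if g < -tol:
        return "diseconomies of scale"
    return "constant ratio"


def answers_per_question(users, alpha, beta, lambda_q, lambda_a):
    """Answer-to-question ratio under the assumptions above (constants set to 1)."""
    questions = users ** alpha
    answerers = users ** beta
    answers = questions ** lambda_q * answerers ** lambda_a
    return answers / questions


# Hypothetical parameters: answerers grow more slowly than the community
# (beta < 1), mirroring a core group that does not scale with size.
params = dict(alpha=1.0, beta=0.5, lambda_q=0.8, lambda_a=0.3)
g = ratio_exponent(**params)
regime = classify_scale(g)

small = answers_per_question(1_000.0, **params)
large = answers_per_question(10_000.0, **params)
```

With these hypothetical parameters the exponent is negative, so the ratio for the larger community (`large`) is lower than for the smaller one (`small`). Raising `beta` toward 1, so that answerers keep pace with the community, is enough to flip the sign in this toy setting.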

Resources

Paper: The Size Conundrum: Why Online Knowledge Markets Can Fail at Scale
Himel Dev, Chase Geigle, Qingtao Hu, Jiahui Zheng, and Hari Sundaram
The 27th International Conference on World Wide Web (WWW), Lyon, France, April, 2018.

Slides: Google Drive
Data and Code: GitHub Repository
BibTeX:
@inproceedings{DevWWW18,
author = {Dev, Himel and Geigle, Chase and Hu, Qingtao and Zheng, Jiahui and Sundaram, Hari},
title = {The Size Conundrum: Why Online Knowledge Markets Can Fail at Scale},
booktitle = {Proceedings of the 2018 World Wide Web Conference on World Wide Web (WWW 2018)},
pages = {65--75},
year = {2018}}