Summary sheet [1]

Theory

Model to be built by generalizing to $m$ coins, each coin having $n$ possible sides
Theorem 1 is a negative result that reveals the limitations of methods based on votes and confidences
Subsequently, show how the answer can be determined in three more complex settings:
1. $m = 2, n \geq 2$
2. $m = n > 2$
3. $m, n \geq 2$
Thus, Theorem 2 proves SPA
Extensions to multiple choice ($m > 2$ coins) rely on a key lemma applied in Theorem 3

Model

Generalize to an arbitrary number $m$ of possible worlds
- one of the worlds is actual, the rest are counterfactual
The correct answer is the one that holds in the actual world
- it is a random variable taking values in the set $\{a_1,...,a_m\}$ of $m$ possible answers to the multiple-choice question
Distinguish between respondents’ vote and evidence
- the evidence of respondent $r$ is a private signal $S^r$ that belongs to the set $\{s_1,...,s_n\}$
The vote $V^r$ of respondent $r$ is a function of the signal:

$$V^r = V(S^r)$$

where $V$ maps signals to votes and $V^r \in \{v_1,...,v_m\}$ for the $m$ possible answers
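
The vote map $V$ can be illustrated with a small sketch. The numbers are made up, and the voting rule used here (vote for the answer with the highest posterior $p(a_i \mid s_k)$) is an assumption for illustration, not something these notes state.

```python
# Illustrative values only; none of these numbers come from the source.
prior = [0.5, 0.5]                  # p(a_i) for m = 2 possible worlds
likelihood = [[0.7, 0.2, 0.1],      # p(s_k | a_1) for n = 3 signals
              [0.1, 0.3, 0.6]]      # p(s_k | a_2)

def posterior(k):
    """Return [p(a_i | s_k) for each world i] via Bayes' rule."""
    joint = [prior[i] * likelihood[i][k] for i in range(len(prior))]
    total = sum(joint)
    return [x / total for x in joint]

def vote(k):
    """Assumed rule: a respondent with signal s_k votes for the most probable answer."""
    post = posterior(k)
    return max(range(len(post)), key=lambda i: post[i])

# V as a lookup table: signal index -> index of the answer voted for
votes = [vote(k) for k in range(3)]
```

With these numbers the first signal leads to a vote for the first answer and the other two signals lead to a vote for the second, so several signals can share one vote.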

Conditional on world $a_i$, the signals of different respondents are distributed with probability $p(s_k \mid a_i)$
Beliefs about signals received by other respondents:

$$p(s_j \mid s_k) = \sum_i p(s_j \mid a_i) \, p(a_i \mid s_k)$$

explicitly,

$$p(s^q_j \mid s^r_k) = \Pr(S^q = s_j \mid S^r = s_k)$$
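
A respondent's belief about another respondent's signal can be computed by averaging the signal distribution of each world over the respondent's posterior on worlds. The sketch assumes that signals are conditionally independent given the world and uses made-up numbers; the function names are hypothetical.

```python
# Illustrative values only; none of these numbers come from the source.
prior = [0.5, 0.5]                  # p(a_i)
likelihood = [[0.7, 0.2, 0.1],      # p(s_k | a_1)
              [0.1, 0.3, 0.6]]      # p(s_k | a_2)
m, n = 2, 3

def posterior(k):
    """Return [p(a_i | s_k) for each world i] via Bayes' rule."""
    joint = [prior[i] * likelihood[i][k] for i in range(m)]
    total = sum(joint)
    return [x / total for x in joint]

def predicted_signal(j, k):
    """p(s_j | s_k): sum over worlds of p(s_j | a_i) * p(a_i | s_k)."""
    post = posterior(k)
    return sum(likelihood[i][j] * post[i] for i in range(m))

# belief[k][j] is the probability that a respondent holding s_k assigns
# to another respondent holding s_j
belief = [[predicted_signal(j, k) for j in range(n)] for k in range(n)]
```

Each row of `belief` is a probability distribution over the other respondent's signal, so it sums to one.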

Theorem 1

Stated as: The correct answer cannot be deduced by any algorithm relying exclusively on knowledge of actual signal probabilities, $p(s_k \mid a_{i^*}), k=1,...,n$, and posterior probabilities over answers implied by these signals, $p(a_i \mid s_k), k=1,...,n, i=1,...,m$.
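
One way to see why such a result can hold is to build, for each candidate answer, a world model that reproduces the same observed signal frequencies and the same posteriors. The sketch does this by choosing the signal marginal $p(s_k) \propto p(s_k \mid a_{i^*}) / p(a_i \mid s_k)$. The numbers are made up, and this construction is offered as an illustration, not necessarily the argument used in the source.

```python
# Illustrative values only; none of these numbers come from the source.
post = [[0.8, 0.4],      # p(a_1 | s_k) for k = 1, 2
        [0.2, 0.6]]      # p(a_2 | s_k)
observed = [0.5, 0.5]    # observed signal frequencies, p(s_k | a_{i*})

def marginal_for(i):
    """Signal marginal p(s_k) under which world a_i reproduces `observed`."""
    weights = [observed[k] / post[i][k] for k in range(len(observed))]
    total = sum(weights)
    return [w / total for w in weights]

def signal_given_world(i, marginal):
    """p(s_k | a_i) from p(a_i | s_k) and p(s_k) via Bayes' rule."""
    joint = [post[i][k] * marginal[k] for k in range(len(marginal))]
    total = sum(joint)
    return [x / total for x in joint]

# For either candidate i, the implied p(s_k | a_i) matches the observed frequencies,
# so the two inputs alone cannot tell which world is actual.
for i in range(2):
    marginal = marginal_for(i)
    print(i, marginal, signal_given_world(i, marginal))
```

Both candidate worlds are consistent with the same inputs, which is the kind of ambiguity the theorem describes.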

Theorem 2

Stated as: Assume that not everyone votes for the correct answer. Then the average estimate of the votes for the correct answer will be underestimated.

Applies to the two-worlds, many-signals case $m=2, n \geq 2$
Proof: first show that actual votes for the correct answer exceed counterfactual ones, i.e., $p(v_{i^*} \mid a_{i^*}) > p(v_{i^*} \mid a_k), k \neq i^*$, using:

$$\frac{p(v_{i^*} \mid a_{i^*})}{p(v_{i^*} \mid a_k)} = \frac{p(a_{i^*} \mid v_{i^*}) \, p(a_k)}{p(a_k \mid v_{i^*}) \, p(a_{i^*})} = \frac{p(a_{i^*} \mid v_{i^*})}{1 - p(a_{i^*} \mid v_{i^*})} \cdot \frac{1 - p(a_{i^*})}{p(a_{i^*})}$$
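
The identity follows from Bayes' rule together with $p(a_k) = 1 - p(a_{i^*})$ when $m = 2$, and its right-hand side exceeds 1 exactly when $p(a_{i^*} \mid v_{i^*}) > p(a_{i^*})$. The sketch checks it numerically for one made-up model; the prior, likelihoods, and vote map are assumptions for illustration.

```python
# Illustrative values only; none of these numbers come from the source.
prior = [0.3, 0.7]                  # p(a_1), p(a_2); a_1 plays the role of a_{i*}
likelihood = [[0.6, 0.3, 0.1],      # p(s_k | a_1)
              [0.2, 0.3, 0.5]]      # p(s_k | a_2)
vote_of_signal = [0, 1, 1]          # hypothetical V: signal index -> answer index

def p_vote_given_world(v, i):
    """p(v | a_i): total probability of the signals that map to vote v."""
    return sum(likelihood[i][k] for k in range(3) if vote_of_signal[k] == v)

def p_world_given_vote(i, v):
    """p(a_i | v) via Bayes' rule."""
    joint = [prior[a] * p_vote_given_world(v, a) for a in range(2)]
    return joint[i] / sum(joint)

star, other = 0, 1
# Left-hand side: ratio of actual to counterfactual vote frequencies
lhs = p_vote_given_world(star, star) / p_vote_given_world(star, other)
# Right-hand side: posterior odds times inverse prior odds
x = p_world_given_vote(star, star)
rhs = x / (1 - x) * (1 - prior[star]) / prior[star]
print(lhs, rhs)
```

In this example the posterior on the correct answer given a vote for it is larger than its prior, so the ratio is above 1.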

References

[1] Supplementary information: A solution to the single question crowd wisdom problem - readcube.com
