SPA Edition One

TL;DR: First in a series of implementation proposals for the Surprisingly Popular Algorithm - methodology and notational code for binary responses.

This edition aims to provide foundational knowledge of the surprisingly popular algorithm (SPA). The hypothesis and the corresponding notational code were arrived at during an initial reading of the paper [1].

Methodology

This segment outlines the approach used to support mathematical derivation of SPA.

  • Create a worksheet to track sample questions and their corresponding responses, each randomly assigned to one of the two binary notations (yes or no).
  • Generate a tabular representation of response distribution as:
    • Capture actual votes polled for each response
    • Against each individual actual response, capture votes in Agreement depicting Outward Opinion (AOO)
      • Indicate percentage agreement ratio for better consumption
  • Generate a tabular representation of average votes
    • Calculate average AOO votes
    • Calculate average actual votes
  • Apply the following hypothesis to identify whether the outcome is popular or surprisingly so:
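    • if the average AOO for a response is higher than its average actual, alter the outcome to the other response
    • if the average actual for a response is higher than its average AOO, that response is the outcome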
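
The worksheet described in the first bullet could be sketched as follows. This is only an illustration under stated assumptions: the function name `build_worksheet`, the row layout `(question, actual, aoo)`, and the use of the strings "yes" and "no" as the two binary notations are hypothetical and do not come from the paper [1].

```python
import random

def build_worksheet(questions, respondents, seed=0):
    """Return rows of (question, actual, aoo) with randomly assigned values.

    Assumption: both the actual response and the AOO response are
    recorded as "yes" or "no" for every respondent.
    """
    rng = random.Random(seed)  # seeded so the sample worksheet is reproducible
    rows = []
    for question in questions:
        for _ in range(respondents):
            actual = rng.choice(["yes", "no"])
            aoo = rng.choice(["yes", "no"])
            rows.append((question, actual, aoo))
    return rows

if __name__ == "__main__":
    for row in build_worksheet(["Q1"], respondents=5):
        print(row)
```

A fixed seed keeps the sample worksheet stable between runs, which makes it easier to check the later calculations by hand.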

Notational Code

This segment breaks down the derivation steps and the corresponding formulas adopted.

1) Collect all responses for a given context, both actual and AOO

2) Calculate sum total of actuals for each response

Compute all actual responses tracked as yes:

$$A_y = \sum_{y=0}^r a_y$$

Compute all actual responses tracked as no:

$$A_n = \sum_{n=0}^r a_n$$

3) Calculate sum total of AOO for each response

Compute all AOO responses tracked as yes:

$$O_y = \sum_{y=0}^r o_y$$

Compute all AOO responses tracked as no:

$$O_n = \sum_{n=0}^r o_n$$
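
Steps 2 and 3 amount to counting votes. A minimal sketch, assuming each response is stored as a hypothetical `(actual, aoo)` pair of "yes"/"no" strings (the data layout and the name `tally` are illustrative assumptions, not something the source prescribes):

```python
from collections import Counter

def tally(responses):
    """Return the totals (A_y, A_n, O_y, O_n) for a list of (actual, aoo) pairs."""
    actual = Counter(a for a, _ in responses)  # actual votes per notation
    aoo = Counter(o for _, o in responses)     # AOO votes per notation
    # Counter returns 0 for a notation that never occurs
    return actual["yes"], actual["no"], aoo["yes"], aoo["no"]

if __name__ == "__main__":
    sample = [("yes", "no"), ("yes", "no"), ("no", "no"),
              ("yes", "yes"), ("no", "no")]
    print(tally(sample))  # (3, 2, 1, 4)
```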

4) Calculate average actual and AOO votes

Compute average yes actual votes:

$$\overline{A}_y = \frac{\sum_{y=0}^r a_y}{\sum_{y=0}^r a_y + \sum_{n=0}^r a_n} = \frac{A_y}{A_y + A_n}$$

Compute average no actual votes:

$$\overline{A}_n = \frac{\sum_{n=0}^r a_n}{\sum_{y=0}^r a_y + \sum_{n=0}^r a_n} = \frac{A_n}{A_y + A_n}$$

Compute average yes AOO votes:

$$\overline{O}_y = \frac{\sum_{y=0}^r o_y}{\sum_{y=0}^r o_y + \sum_{n=0}^r o_n} = \frac{O_y}{O_y + O_n}$$

Compute average no AOO votes:

$$\overline{O}_n = \frac{\sum_{n=0}^r o_n}{\sum_{y=0}^r o_y + \sum_{n=0}^r o_n} = \frac{O_n}{O_y + O_n}$$
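
The four averages in step 4 share one shape: a total divided by the sum of the yes and no totals. A small sketch of that calculation, where the helper name `shares` and the choice to raise an error on an empty tally are assumptions made for illustration:

```python
def shares(yes_total, no_total):
    """Return (yes_share, no_share), each total divided by the combined total."""
    total = yes_total + no_total
    if total == 0:
        # Assumption: an empty tally is treated as an error rather than 0/0
        raise ValueError("no votes recorded")
    return yes_total / total, no_total / total

if __name__ == "__main__":
    print(shares(3, 2))  # actual votes: (0.6, 0.4)
    print(shares(1, 4))  # AOO votes:    (0.2, 0.8)
```

Because the two shares of each kind always add up to 1, knowing the yes share is enough to recover the no share.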

5) 2×2 Matrix Representation

If we represent the derived values in a 2×2 matrix:

$$S_p = \begin{bmatrix} \overline{O}_y & \overline{O}_n \\ \overline{A}_y & \overline{A}_n \end{bmatrix}$$

Decision rules:

  • if $\overline{A}_y > \overline{O}_y$, then $S_p = \overline{A}_y$
  • if $\overline{A}_n > \overline{O}_n$, then $S_p = \overline{A}_n$
  • if $\overline{O}_y > \overline{A}_y$, then the diagonal element $S_p = \overline{A}_n$
  • if $\overline{O}_n > \overline{A}_n$, then the diagonal element $S_p = \overline{A}_y$
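
Since the yes and no shares each sum to 1, the first and fourth rules select yes under the same condition, and the second and third select no. The rules can therefore be sketched as a single comparison. The function name `surprisingly_popular` and the handling of an exact tie (returning `None`) are assumptions; the rules above do not say what happens when the shares are equal.

```python
def surprisingly_popular(a_y, a_n, o_y, o_n):
    """Apply the decision rules to average actual (a_*) and AOO (o_*) shares.

    Returns "yes", "no", or None when neither actual share exceeds its AOO share.
    """
    if a_y > o_y:   # rule 1 (equivalently rule 4, because o_n > a_n)
        return "yes"
    if a_n > o_n:   # rule 2 (equivalently rule 3, because o_y > a_y)
        return "no"
    return None     # assumption: exact ties are left undecided

if __name__ == "__main__":
    # 60% actually voted yes, but only 20% of AOO votes were yes
    print(surprisingly_popular(0.6, 0.4, 0.2, 0.8))      # yes
    # 65% actually voted yes, but 75% of AOO votes were yes
    print(surprisingly_popular(0.65, 0.35, 0.75, 0.25))  # no
```

In the second call the majority answer is yes, yet the outcome is no, which is the "surprisingly popular" case the methodology describes as altering the outcome.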

Margin of Error

Any statistical inference presented for a given data set needs to be accompanied by a clearly indicated margin of error ($E$) [2]. Edition two will delve deeper into this subject.

Application Impact

This segment outlines potential changes to the application to adopt the outcomes of edition one. Lack of signal outlay needs to be tracked as surveys gain momentum.

References

[1] A solution to the single-question crowd wisdom problem - readcube.com

[2] Estimating Population Parameters - brownmath.com