<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Kronosapiens Labs</title>
    <description>Building socio-technical infrastructure</description>
    <link>http://kronosapiens.github.io/</link>
    <atom:link href="http://kronosapiens.github.io/feed.xml" rel="self" type="application/rss+xml"/>
    <pubDate>Tue, 31 Mar 2026 14:41:22 +0000</pubDate>
    <lastBuildDate>Tue, 31 Mar 2026 14:41:22 +0000</lastBuildDate>
    <generator>Jekyll v3.10.0</generator>
    
      <item>
        <title>Implementing Active Ranking</title>
        <description>&lt;!-- For the raw commit log and research notes, run the following command --&gt;
&lt;!-- gh api repos/dojoengine/daimyo/commits --paginate -q &apos;.[] | &quot;\(.sha[0:8]) \(.commit.author.date[0:10]) \(.commit.message)&quot;&apos; --&gt;

&lt;p&gt;A core task in social choice is the allocation of fixed resources among a set of items: funding across projects, budget across priorities, prize money across submissions.
The output is a set of &lt;em&gt;weights&lt;/em&gt; – percentages out of 100 which say how valuable each item is relative to the others.&lt;/p&gt;

&lt;p&gt;A common approach – asking people to score items on a scale, and then taking some sort of average – is fast but unreliable.
Scores are subjective and uncalibrated: one person’s 7 is another person’s 4.
Legitimacy suffers.&lt;/p&gt;

&lt;p&gt;Pairwise comparison sidesteps this by asking a simpler question: “Which of these two do you prefer, and (optionally) by how much?”
The cognitive load per judgment is lower, the signal is cleaner, and the results can be converted into weights that reflect genuine collective preference – from which rankings, allocations, and funding decisions all follow.&lt;/p&gt;

&lt;p&gt;The problem is cost.
Pairwise comparison scales quadratically: \(k\) items means \(k(k-1)/2\) possible pairs which need deciding.
&lt;strong&gt;Active ranking&lt;/strong&gt; – choosing which pairs to show based on what the system already knows, significantly reducing the number of pairs you must observe – is key to making pairwise methods practical.&lt;/p&gt;
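&lt;p&gt;To make the quadratic growth concrete, here is a minimal Python sketch that counts the distinct pairs for a few set sizes. The function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pair_count&lt;/code&gt; is hypothetical and used only for illustration.&lt;/p&gt;

```python
def pair_count(k: int) -> int:
    """Number of distinct unordered pairs among k items: k(k-1)/2."""
    return k * (k - 1) // 2


for k in (10, 30, 100):
    print(k, pair_count(k))  # 10 -> 45, 30 -> 435, 100 -> 4950
```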

&lt;p&gt;This essay will explore how active ranking worked in practice – over three weeks in February and March 2026, &lt;a href=&quot;https://github.com/dojoengine/game-jams/tree/main/gj8&quot;&gt;during a live game jam&lt;/a&gt; – as part of &lt;a href=&quot;https://github.com/dojoengine/daimyo/tree/main/backend/src/lib/power&quot;&gt;PowerRanker&lt;/a&gt;, a spectral ranking engine that turns pairwise preferences into distributions of weights over items.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;This essay was drafted by Claude using &lt;a href=&quot;https://github.com/dojoengine/daimyo/commits/main/&quot;&gt;git commit logs&lt;/a&gt; and edited for style and flow.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;i-background&quot;&gt;I. Background&lt;/h2&gt;

&lt;p&gt;Since joining the Cartridge Gaming Company a little less than a year ago, one of my responsibilities has been running their quarterly hackathons.
In preparation for their eighth jam, I saw an opportunity to put the ideas from &lt;a href=&quot;/blog/2025/12/14/pairwise-paradigm.html&quot;&gt;The Pairwise Paradigm&lt;/a&gt; into practice – not just active ranking, but the full end-to-end pipeline: interface design, data encoding, and algorithmic pair selection.
By bringing pairwise judging into the game jam process, we could distribute the judging burden across the ecosystem, producing more legitimate results with less per-juror effort.&lt;/p&gt;

&lt;p&gt;The result was a homespun judging tool called &lt;a href=&quot;https://daimyo.cartridge.gg/&quot;&gt;Daimyo&lt;/a&gt;.
Judges would see two submissions side by side – each with an AI-generated summary, playable links, and structured metrics – and were asked a single question: “Which is the stronger game jam entry?”&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/img/daimyo-judging.jpg&quot; alt=&quot;Daimyo Judging Interface&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Responses used a five-point Likert scale (Much / Slightly / Even / Slightly / Much), which we encoded as values from 0 to 1 in increments of 0.25.
Each judgment was designed to take about 30 seconds, and judges were asked for 10 comparisons per session, keeping the total time commitment at about five minutes.
PowerRanker then aggregated these comparisons into a collective weighting, used to decide winners and give out prizes.&lt;/p&gt;
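&lt;p&gt;The encoding can be sketched as a simple lookup table. This is an illustration, not Daimyo&apos;s actual code: the response labels are hypothetical, and we assume the value expresses preference strength toward the first item (A), so that 0.5 means &quot;even&quot;.&lt;/p&gt;

```python
# Hypothetical labels; only the 0-to-1 scale in steps of 0.25 comes from the text.
LIKERT_TO_VALUE = {
    "much_a": 1.0,
    "slightly_a": 0.75,
    "even": 0.5,
    "slightly_b": 0.25,
    "much_b": 0.0,
}


def encode(response: str) -> float:
    """Translate a five-point Likert response into a preference value for A."""
    return LIKERT_TO_VALUE[response]


print(encode("slightly_a"))  # 0.75
```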

&lt;p&gt;The initiative was successful – judges enjoyed encountering the projects in this way, the process made the game jam more visible to the larger ecosystem, and we were able to produce good results with a relatively small voter lift.&lt;/p&gt;

&lt;p&gt;Of the ideas from The Pairwise Paradigm, active ranking was the one that advanced the most during this process.
The original proposal was simple: for every pair \((a, b)\), construct a &lt;a href=&quot;https://en.wikipedia.org/wiki/Beta_distribution&quot;&gt;Beta distribution&lt;/a&gt; from the vote counts and sample pairs proportional to \(\text{Var}(p_{ab})\).
An optional extension would weight by \(\text{Var}(p_{ab}) \cdot w_a \cdot w_b\), upsampling uncertain pairs between high-value items.
In theory, this would reduce the data requirement from \(O(k^2)\) to some multiple of \(k\), a fundamental change in complexity.&lt;/p&gt;
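&lt;p&gt;A minimal sketch of that proposal, under stated assumptions: each pair starts from a uniform Beta(1, 1) prior, votes are treated as whole counts, and the names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;beta_variance&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sample_pair&lt;/code&gt; are hypothetical. It draws a single pair rather than a full batch.&lt;/p&gt;

```python
import random


def beta_variance(alpha: float, beta: float) -> float:
    """Variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return (alpha * beta) / (total ** 2 * (total + 1))


def sample_pair(votes, weights=None, rng=random):
    """Pick one pair with probability proportional to Var(p_ab).

    votes maps (a, b) to (votes for a, votes for b); a Beta(1, 1) prior is assumed.
    If weights is given, the variance is scaled by w_a * w_b (the optional extension).
    """
    pairs = list(votes)
    scores = []
    for a, b in pairs:
        for_a, for_b = votes[(a, b)]
        score = beta_variance(1 + for_a, 1 + for_b)
        if weights is not None:
            score *= weights[a] * weights[b]
        scores.append(score)
    return rng.choices(pairs, weights=scores, k=1)[0]


# An unseen pair has variance 1/12; five votes each way shrinks it considerably.
print(beta_variance(1, 1), beta_variance(6, 6))
```

&lt;p&gt;Under this prior an unobserved pair always carries the highest variance, which is why the scheme naturally surfaces unseen comparisons.&lt;/p&gt;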

&lt;p&gt;This was the plan.
It did not work as intended.&lt;/p&gt;

&lt;h2 id=&quot;ii-variance-based-selection&quot;&gt;II. Variance-Based Selection&lt;/h2&gt;

&lt;p&gt;The initial implementation followed the original proposal: Beta-variance-weighted sampling without replacement, with an optional impact flag that multiplied variance by the product of both items’ current weights.&lt;/p&gt;

&lt;p&gt;Watching it run against real data, however, two problems became clear.&lt;/p&gt;

&lt;p&gt;Variance sampling &lt;em&gt;did&lt;/em&gt; explore the pair space – unobserved pairs have a higher variance under a Beta prior, so the system naturally surfaced unseen comparisons.
But it explored indiscriminately, without regard to relative ordering.
The system would spend votes resolving the bear-vs-rabbit comparison with the same urgency as bear-vs-lion, even though these comparisons are not equally important for discovering the final ordering.
Variance told us what we hadn’t &lt;em&gt;seen&lt;/em&gt;, but not what we &lt;em&gt;needed to see&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The weight-scaling extension – multiplying variance by \(w_a \cdot w_b\) – introduced a different problem: path dependence.
With sparse early data, the ranking engine produces extreme weights – an item that wins its first two comparisons can briefly appear to dominate the entire set.
Sampling based on these volatile early weights meant that early winners were shown more often, producing more data about them and reinforcing their high weights – a classic “rich get richer” problem.&lt;/p&gt;

&lt;p&gt;Clearly, variance wasn’t it.&lt;/p&gt;

&lt;h2 id=&quot;iii-composable-transforms&quot;&gt;III. Composable Transforms&lt;/h2&gt;

&lt;p&gt;Before abandoning variance, we tried to save it.
The first iteration added a &lt;strong&gt;coverage&lt;/strong&gt; term alongside variance.
Coverage tracked how many times each item had been observed in any comparison, and penalized well-observed items: \(\frac{1}{1 + n/N}\) initially, then \(\frac{1}{\sqrt{1 + n}}\) for faster drop-off.&lt;/p&gt;

&lt;p&gt;The idea was to balance exploitation (compare uncertain pairs) with exploration (compare unseen items).
We also modified the &lt;strong&gt;weight&lt;/strong&gt; term, compressing it via sqrt to keep it in range with the other factors (\(\sqrt{w_a \cdot w_b}\)).
The goal was to observe all items a few times, and then concentrate observations on similarly-ranked items, as well as the top-weighted items, directing attention to where it had the highest impact.&lt;/p&gt;

&lt;p&gt;The selection function was multiplicative, with each term contributing proportionally and toggled individually: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sampleWeight = variance * weight * coverage&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This was better.
Items that had been ignored were surfaced, and attention was distributed more evenly and efficiently across the set.
But the weight component still created path dependence, and the interaction between three multiplicative terms (variance, coverage, weight) was hard to reason about.&lt;/p&gt;

&lt;p&gt;We were making progress, but the deeper problem remained: variance is fundamentally order-agnostic.
The early intuition – that variance-based sampling could reduce the data requirement by an entire complexity class – was wrong.&lt;/p&gt;

&lt;h2 id=&quot;iv-the-pivot&quot;&gt;IV. The Pivot&lt;/h2&gt;

&lt;p&gt;We eventually abandoned variance and weight-based selection entirely.
In order to reduce ranking to \(O(n)\) comparisons, the selection signal needed to be grounded in concrete &lt;em&gt;ordinal&lt;/em&gt; positions instead of global statistical notions of confidence.
The earlier intuition – selecting pairs based on some notion of uncertainty-reduction – was right, but the approach had been wrong.&lt;/p&gt;

&lt;p&gt;The replacement was &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;activeSelect()&lt;/code&gt;, built on three multiplicative terms, which as before could be toggled on and off individually.
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pos&lt;/code&gt; is determined from current data – weights are produced internally and used to create an ordering, from which further statistics are derived:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Coverage&lt;/strong&gt;: \(\frac{1}{\sqrt{1 + n_a}} \cdot \frac{1}{\sqrt{1 + n_b}}\) – favors items with fewer observations&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Proximity&lt;/strong&gt;: \(\frac{1}{1 + \lvert pos_a - pos_b \rvert}\) – favors items close together in the current ranking&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Position&lt;/strong&gt;: \(\frac{1}{\sqrt{pos_a \cdot pos_b}}\) – favors items near the top&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Coverage was kept from the earlier iteration.
Proximity and position were new – ordinal variants of the earlier &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;variance&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;weight&lt;/code&gt; terms.&lt;/p&gt;
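&lt;p&gt;The three terms above can be sketched in a few lines of Python. This is an illustration of the formulas, not the actual &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;activeSelect()&lt;/code&gt; implementation: the function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pair_scores&lt;/code&gt;, its arguments, and the choice of 1-indexed positions (1 is the top) are all assumptions.&lt;/p&gt;

```python
import math
from itertools import combinations


def pair_scores(weights, counts, use_position=True):
    """Score every pair as coverage * proximity * position (hypothetical sketch).

    weights maps item -> current weight; counts maps item -> observations so far.
    """
    # Rank items by current weight; position 1 is the top of the ranking.
    order = sorted(weights, key=weights.get, reverse=True)
    pos = {item: i + 1 for i, item in enumerate(order)}
    scores = {}
    for a, b in combinations(order, 2):
        coverage = 1 / math.sqrt(1 + counts[a]) / math.sqrt(1 + counts[b])
        proximity = 1 / (1 + abs(pos[a] - pos[b]))
        position = 1 / math.sqrt(pos[a] * pos[b]) if use_position else 1.0
        scores[(a, b)] = coverage * proximity * position
    return scores


weights = {"lion": 0.5, "bear": 0.3, "rabbit": 0.2}
counts = {"lion": 0, "bear": 0, "rabbit": 0}
scores = pair_scores(weights, counts)
print(max(scores, key=scores.get))  # ('lion', 'bear')
```

&lt;p&gt;With no observations yet, coverage is equal for every pair, so the adjacent pair at the top of the ranking scores highest. The scores would then serve as sampling weights rather than as a strict priority order.&lt;/p&gt;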

&lt;p&gt;The proximity term became the main engine of active selection.
Consider: a weighting of \(n\) items has \(n-1\) degrees of freedom – the ratios between consecutive items.
The goal of the pairwise process, then, should be to achieve high confidence on the relationship between each item &lt;em&gt;and its neighbors&lt;/em&gt;.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;“Neighbor” should be understood probabilistically – two or three comparisons with items one or two positions away, one or two comparisons with items three or four positions away, etc.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Early on, the graph is sparse and the ranking is fuzzy, so coverage dominates: every item needs a few comparisons to anchor its position.
As data accumulates, proximity dominates, concentrating attention on the boundaries where precision is useful.
Position adds an optional bias toward observing pairs towards the top of the ranking, where larger weights call for higher precision.&lt;/p&gt;

&lt;p&gt;This new approach was coherent and intuitive in a way that the previous one was not.&lt;/p&gt;

&lt;h2 id=&quot;v-tuning&quot;&gt;V. Tuning&lt;/h2&gt;

&lt;p&gt;The new design required its own round of tuning, but the iterations were faster and more principled because the terms were independently interpretable.
Claude could perform fast simulations, letting us explore the behavior of different parameterizations interactively.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Range matching.&lt;/strong&gt; A subtlety of multiplicative composition is that all terms need to operate on similar scales.
If one term ranges from 0.00001 to 0.99999 while the others range from 0.2 to 0.6, the wide-range term dominates the outcome.
Getting the terms into comparable ranges was a recurring concern throughout tuning, leading to several rounds of back-and-forth around squaring and rooting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Regularization.&lt;/strong&gt; The first version had no way to soften the weighting.
In production, under-observed entries were getting starved while already-ranked items dominated.
We added a regularization parameter &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;r&lt;/code&gt; (0 = uniform, 1 = full weighting), which let us mitigate path-dependent effects.&lt;/p&gt;

&lt;p&gt;The initial implementation was a linear blend: \(r \cdot w + (1 - r)\).
This didn’t work: the blend collapsed signal, producing useless values like 0.500001.
We replaced the linear blend with a &lt;strong&gt;power transform&lt;/strong&gt;: \(w^r\), spanning the same range while preserving signal.&lt;/p&gt;
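&lt;p&gt;A small numeric sketch shows the difference. The raw scores below are made up for illustration, and the explanation is our reading of the failure: when raw scores are tiny, the constant \(1 - r\) swamps them in the linear blend, while the power transform keeps their ratio visible.&lt;/p&gt;

```python
def linear_blend(w: float, r: float) -> float:
    """Blend toward uniform: r = 0 gives 1 for every pair, r = 1 gives w."""
    return r * w + (1 - r)


def power_transform(w: float, r: float) -> float:
    """Same endpoints as the blend, but ratios between scores are preserved."""
    return w ** r


low, high = 2e-6, 8e-6  # hypothetical raw scores, a 4x difference
r = 0.5

# Linear blend: both values land near 0.5, so the 4x difference all but vanishes.
print(linear_blend(high, r) / linear_blend(low, r))

# Power transform: the ratio becomes 4 ** 0.5, i.e. about 2.
print(power_transform(high, r) / power_transform(low, r))
```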

&lt;p&gt;&lt;strong&gt;Coverage formula.&lt;/strong&gt; The coverage term oscillated between \(\frac{1}{1+n}\) and \(\frac{1}{\sqrt{1+n}}\) as we ran simulations and explored alternative parameterizations.
The sqrt form was chosen for its gentler drop-off, allowing items with 3-4 observations to continue to appear while still favoring items with zero observations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Term composition.&lt;/strong&gt; Initially, we chose pairs using all three terms, but decided mid-judging to drop the position term and use coverage and proximity only with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;r=0.9&lt;/code&gt;.
This helped distribute voter attention across the entire set of items, increasing legitimacy.&lt;/p&gt;

&lt;p&gt;This decision reveals an important subtlety in social choice: perceived legitimacy of a process is just as important as how close the final weights may come to an abstract ideal.&lt;/p&gt;

&lt;h2 id=&quot;vi-validation&quot;&gt;VI. Validation&lt;/h2&gt;

&lt;p&gt;After the design stabilized, we ran simulations to validate the approach.
We generated items with known power-law weights, simulated judges making noisy Bradley-Terry comparisons, and measured how well the ranking engine recovered the true ordering.&lt;/p&gt;
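&lt;p&gt;The shape of such a simulation can be sketched as follows. This is a simplified baseline harness, not the one we ran: it picks pairs uniformly at random instead of using active selection, estimates item strength by plain win rate instead of PowerRanker, and every function name and default is an assumption made for illustration.&lt;/p&gt;

```python
import random


def power_law_weights(k, s=1.0):
    """Ground-truth weights proportional to 1 / rank**s, normalized to sum to 1."""
    raw = [1 / (i + 1) ** s for i in range(k)]
    total = sum(raw)
    return [x / total for x in raw]


def bt_vote(w_a, w_b, rng):
    """True if A wins under Bradley-Terry: P(A wins) = w_a / (w_a + w_b)."""
    return (w_a / (w_a + w_b)) > rng.random()


def spearman(xs, ys):
    """Spearman rank correlation; ties are broken by index, so it is approximate."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        out = [0] * len(vals)
        for rank, i in enumerate(order):
            out[i] = rank
        return out

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


def simulate(k=30, vpi=10, seed=0):
    """Run k * vpi noisy votes on random pairs and score the recovered ordering."""
    rng = random.Random(seed)
    true_w = power_law_weights(k)
    wins = [0] * k
    seen = [0] * k
    for _ in range(k * vpi):
        a, b = rng.sample(range(k), 2)
        winner = a if bt_vote(true_w[a], true_w[b], rng) else b
        wins[winner] += 1
        seen[a] += 1
        seen[b] += 1
    est = [wins[i] / seen[i] if seen[i] else 0.5 for i in range(k)]
    return spearman(true_w, est)


print(simulate(k=30, vpi=10))
```

&lt;p&gt;Swapping the random pair choice for an active selection rule, and the win-rate estimate for the real ranking engine, turns this baseline into the kind of comparison described above.&lt;/p&gt;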

&lt;p&gt;The simulation also clarified &lt;a href=&quot;/blog/2026/02/07/reinventing-the-wheel.html&quot;&gt;a foundational question&lt;/a&gt; about the ranking engine itself: whether preferences should be encoded as unidirectional (0.75 to A) or bidirectional (0.75 to A, 0.25 to B).
With Likert-binned vote strengths, unidirectional preferences produce unbounded weight ratios – the top item’s weight grows without limit relative to the bottom, as preference “mass” accumulates unevenly.&lt;/p&gt;

&lt;p&gt;Bidirectional preferences, by contrast, cause weights to converge, as every comparison contributes balanced information.
This insight was helpful for understanding “convergence” in this context – and is a jumping-off point for future research.&lt;/p&gt;

&lt;p&gt;Our independent variable was &lt;strong&gt;votes per item (vpi)&lt;/strong&gt;: the total number of votes divided by the total number of items.
&lt;strong&gt;Simulation results showed that with active selection, Spearman rank correlation reaches 0.95+ at around 10-12 vpi for most distributions.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The simulation helped balance the relative weighting of the three terms, confirming that the range-matching intuition from manual tuning was correct.
It also gave us confidence in the votes-per-item target, by showing that an accurate ordering can be reliably recovered within our vote budget, and let us publicize the results with confidence.&lt;/p&gt;

&lt;p&gt;For a set of 30 items, 10 vpi means 300 total votes, or 15 judges each making 20 comparisons.
At 30 seconds per comparison, that’s 10 minutes per judge.
This is identical to The Pairwise Paradigm’s \(10k\) estimate, although achieved through a different mechanism than originally proposed.&lt;/p&gt;
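&lt;p&gt;The budget arithmetic above can be checked with a short sketch. The helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;vote_budget&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
def vote_budget(items, vpi, judges, seconds_per_vote=30):
    """Return (total votes, votes per judge, minutes per judge)."""
    total_votes = items * vpi
    votes_per_judge = total_votes / judges
    minutes_per_judge = votes_per_judge * seconds_per_vote / 60
    return total_votes, votes_per_judge, minutes_per_judge


print(vote_budget(items=30, vpi=10, judges=15))  # (300, 20.0, 10.0)
```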

&lt;p&gt;In the case of the game jam, we received 360 votes from 16 judges.
This was enough to &lt;a href=&quot;https://daimyo.cartridge.gg/jam/gj8/results&quot;&gt;produce rankings&lt;/a&gt; that were seen as legitimate and reflective of actual submission quality.&lt;/p&gt;

&lt;h2 id=&quot;vii-reflections&quot;&gt;VII. Reflections&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Theory vs practice.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The variance-based approach made sense on paper.
It was information-theoretically motivated, computationally efficient, and had a clear connection to the existing literature.
But it failed in practice because it treated pairs as isolated statistical objects rather than as part of a unified ranking structure.
The position-aware model succeeded because it grounded the selection process in the ranking it was trying to refine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Composability matters.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The ability to drop or add terms based on context proved valuable in both development and production.
A single monolithic scoring function would have been harder to adapt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The weights weren’t ready for prime time.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We had initially hoped to distribute prize money directly using the output weights – the highest-leverage use of the PowerRanker technique.
In the end, however, we didn’t feel confident enough in the weight precision to tie dollars to them, especially as the selection process itself evolved over the course of vote-gathering.
The &lt;em&gt;ranking&lt;/em&gt; was reliable, but the &lt;em&gt;weights&lt;/em&gt; need more work before they could be financially load-bearing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Open questions remain.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The first iteration of Daimyo validated many of the ideas put forward in &lt;a href=&quot;/blog/2025/12/14/pairwise-paradigm.html&quot;&gt;The Pairwise Paradigm&lt;/a&gt;, specifically around the design of the UI and the ability of active ranking to reduce the voting requirement.
Future research should push the simulations further, showing specifically under what circumstances different parameterizations can achieve desired outcomes.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;This research, like the &lt;a href=&quot;/blog/2026/02/07/reinventing-the-wheel.html&quot;&gt;earlier work on pseudocounts&lt;/a&gt;, was conducted in collaboration with Claude – both the R&amp;amp;D itself and the writing of this essay.
The pattern was similar: rapid iteration through a search space, punctuated by moments of stepping back and questioning assumptions.&lt;/p&gt;
&lt;/blockquote&gt;
</description>
        <pubDate>Sun, 29 Mar 2026 00:00:00 +0000</pubDate>
        <link>http://kronosapiens.github.io/blog/2026/03/29/implementing-active-ranking.html</link>
        <guid isPermaLink="true">http://kronosapiens.github.io/blog/2026/03/29/implementing-active-ranking.html</guid>
        
        <category>algorithms</category>
        
        <category>mechanism-design</category>
        
        <category>social-choice</category>
        
        
        <category>blog</category>
        
      </item>
    
      <item>
        <title>Reinventing the Wheel</title>
        <description>&lt;blockquote&gt;
  &lt;p&gt;This essay describes a research process culminating in a significant overhaul of Chore Wheel’s internals.
The research itself – &lt;a href=&quot;https://github.com/zaratanDotWorld/choreWheel/issues/296&quot;&gt;analysis, dead ends&lt;/a&gt;, and &lt;a href=&quot;https://github.com/zaratanDotWorld/choreWheel/pull/302&quot;&gt;eventual solution&lt;/a&gt; – was a collaboration between myself and Anthropic’s Claude taking place over the course of about four nighttime sessions.
Afterwards, we wrote up this report based on our notes – Claude drafted, I edited.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;img src=&quot;/img/chatgpt-year-review.jpg&quot; alt=&quot;ChatGPT Year in Review&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;From my ChatGPT year-in-review, seemed fitting.&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;contents&quot;&gt;Contents&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;#i-introduction&quot;&gt;&lt;strong&gt;I. Introduction&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#ii-background&quot;&gt;&lt;strong&gt;II. Background&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#iii-explorations&quot;&gt;&lt;strong&gt;III. Explorations&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#iv-breakthrough&quot;&gt;&lt;strong&gt;IV. Breakthrough&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#v-reflections&quot;&gt;&lt;strong&gt;V. Reflections&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;i-introduction&quot;&gt;I. Introduction&lt;/h1&gt;

&lt;p&gt;&lt;a href=&quot;https://www.zaratan.world/chorewheel&quot;&gt;Chore Wheel&lt;/a&gt; is a cooperative household management system, deployed as a suite of Slack apps since September 2022.
Its origins trace back to my research on pairwise preference voting, &lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3359677&quot;&gt;begun as a master’s student at Columbia in 2016&lt;/a&gt;, &lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3317445&quot;&gt;further developed at Colony in 2018&lt;/a&gt;, and &lt;a href=&quot;https://blog.zaratan.world/p/the-story-of-chore-wheel&quot;&gt;culminating in Chore Wheel in 2020&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The core concept is straightforward: residents express priorities as pairwise comparisons (“dishes are more important than sweeping”), and a &lt;a href=&quot;/blog/2025/12/14/pairwise-paradigm.html&quot;&gt;PageRank-style algorithm&lt;/a&gt; aggregates these into a collective prioritization that determines how many points each chore accumulates.
Points are fixed at 100 per resident per month and distributed continuously over time.
A chore prioritized twice as highly accumulates twice as many points, so the collective preferences directly shape how chores are incentivized and performed.
The emphasis on &lt;em&gt;emergent, intuitive, and asynchronous decision-making&lt;/em&gt; – in lieu of long meetings or top-down manager control – was meant to ensure communal resilience in the face of turnover and fluctuating resident capacity.&lt;/p&gt;

&lt;p&gt;After 3+ years of continuous production use at the 9-person Sage House – with over 2,500 claimed chores across 24 lifetime residents – the system had proven the concept.
The prioritization mechanism, the system’s most innovative component, worked.
But consistent patterns of user frustration pointed to problems beneath the surface.
This essay traces the research process that led to a key redesign of the ranking engine – not as a straight line, but as the winding path it actually was.&lt;/p&gt;

&lt;h1 id=&quot;ii-background&quot;&gt;II. Background&lt;/h1&gt;

&lt;h3 id=&quot;the-problem&quot;&gt;The Problem&lt;/h3&gt;

&lt;p&gt;Chore Wheel’s goal is to make cooperative governance &lt;em&gt;simple&lt;/em&gt;, &lt;em&gt;intuitive&lt;/em&gt;, and &lt;em&gt;effective&lt;/em&gt;.
The core promise is that proportional changes in preferences will produce proportional changes in rankings.
Two persistent complaints suggested the goal was not yet fully achieved.&lt;/p&gt;

&lt;p&gt;The first was &lt;strong&gt;compressed distributions&lt;/strong&gt;.
Residents struggled to get the most important chores to high enough values.
No one ever complained that the top chores were worth “too much” – but people regularly felt unable to push them high enough, no matter how many preferences they submitted.
Deep-cleaning a bathroom and watering the plants would sit closer together in the rankings than anyone felt they should, and no amount of voting seemed to fix it.
The result was that the most important chores still felt undervalued, and that people who did them still felt as though they were “sacrificing” on behalf of the group – exactly what we wanted to avoid.&lt;/p&gt;

&lt;p&gt;The second was &lt;strong&gt;opaque causality&lt;/strong&gt;.
Priority changes sometimes felt “random.”
A resident would submit a clear preference – “kitchen deep clean matters more than yard cleanup and trash takeout” – and see unrelated chores shift in unexpected directions.
The effect wasn’t enormous, but it was noticeable.
When inputs don’t produce legible outputs, governance stops feeling democratic and starts feeling arbitrary – and residents disengage.
For a system built on the premise that people should feel agency over their shared environment, this was not a minor complaint.&lt;/p&gt;

&lt;h3 id=&quot;the-ranking-engine&quot;&gt;The Ranking Engine&lt;/h3&gt;

&lt;p&gt;To understand the source of these problems – and why they resisted easy fixes – requires some background on how the ranking engine works and the design decisions it inherited.&lt;/p&gt;

&lt;p&gt;PowerRanker (our internal name for the ranking engine) takes in pairwise preferences from residents (“I prefer dishes to sweeping with a strength of 0.7”), aggregates them into a “matrix” of relationships, and turns this matrix into a set of weights (“priorities”) which we use to assign points to chores.
The underlying math is well-established and is often used to calculate rankings based on tournaments and competitions; our innovation was applying it to problems of social choice.&lt;/p&gt;

&lt;p&gt;One of the challenges of the social choice setting is achieving the twin – and contradictory – goals of enabling groups to reach good outcomes in the face of low engagement, while also preventing minority factions from unfairly dominating the results.
Achieving these goals requires balancing trade-offs – encouraging coalition-building for extreme outcomes, while allowing individuals to make meaningful contributions.&lt;/p&gt;

&lt;p&gt;This balancing, which can be seen as a type of “regularization,” was previously done in two ways.
First, through an “implicit preference” of 0.5 in every off-diagonal cell of the preference matrix, which was removed proportionally (0.5/numResidents) as real preferences came in.
Second, through a “damping factor” – taken directly from PageRank – in which the observed preferences are scaled down and combined with a uniform prior (PageRank’s “random surfer”), which compresses results and ensures that even low-priority chores receive at least &lt;em&gt;some&lt;/em&gt; value.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Note: The original PageRank paper counterintuitively dampens &lt;em&gt;less&lt;/em&gt; when \(d\) is large.
In this discussion, we will refer to “high damping” as creating more uniform outcomes, and “low damping” enabling more extreme outcomes.&lt;/p&gt;
&lt;/blockquote&gt;
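&lt;p&gt;For readers unfamiliar with PageRank-style damping, here is a minimal sketch of the idea, assuming a row-stochastic transition matrix and using the original PageRank convention for \(d\). The function name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;damp&lt;/code&gt; is hypothetical and this is not PowerRanker&apos;s actual code.&lt;/p&gt;

```python
def damp(matrix, d):
    """Blend a row-stochastic matrix with a uniform prior.

    Each entry becomes d * m_ij + (1 - d) / n, so rows still sum to 1.
    With d = 1 the observed preferences pass through untouched;
    with d = 0 every entry collapses to the uniform value 1 / n.
    """
    n = len(matrix)
    return [[d * matrix[i][j] + (1 - d) / n for j in range(n)] for i in range(n)]


observed = [[0.0, 1.0], [1.0, 0.0]]
print(damp(observed, 0.5))  # [[0.25, 0.75], [0.75, 0.25]]
```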

&lt;h3 id=&quot;prior-investigations&quot;&gt;Prior Investigations&lt;/h3&gt;

&lt;p&gt;This project occurred in the context of a larger sweep of improvements brought about by the addition of a new house (see &lt;a href=&quot;https://github.com/zaratanDotWorld/choreWheel/pull/289&quot;&gt;here&lt;/a&gt;, &lt;a href=&quot;https://github.com/zaratanDotWorld/choreWheel/pull/292&quot;&gt;here&lt;/a&gt;, and &lt;a href=&quot;https://github.com/zaratanDotWorld/choreWheel/pull/294&quot;&gt;here&lt;/a&gt;).
This 5-person house, Solegria, had a different set of needs than Sage, and attempting to accommodate them &lt;em&gt;within&lt;/em&gt; the existing framework spurred a series of investigations which led to significant improvements in the system.&lt;/p&gt;

&lt;p&gt;Before this project, we undertook an investigation into some confusing behavior which surfaced in the course of implementing a “bulk import” feature in which CSVs of chores with “scores” could be uploaded en masse instead of created and prioritized individually.
During the development of this functionality, &lt;a href=&quot;https://github.com/zaratanDotWorld/choreWheel/issues/295&quot;&gt;we noticed that equal scores led to skewed outputs&lt;/a&gt;, which turned out to be an unintended consequence of &lt;a href=&quot;https://github.com/zaratanDotWorld/choreWheel/issues/224&quot;&gt;an earlier investigation&lt;/a&gt; into why de-prioritizing a chore could in some cases &lt;em&gt;increase&lt;/em&gt; its priority.&lt;/p&gt;

&lt;p&gt;Initially, preferences had been encoded bidirectionally – in which a preference of 0.7 in one direction was encoded as two preferences, with 0.3 flowing in the other direction so that each preference added a full 1.0 to the matrix.
This approach was appealing, as it kept things continuous, symmetric, and proportional, but it had a problem – due to the underlying mathematics, adding a 0.3 (or similar) flow to the less-preferred chore would sometimes cause it to &lt;em&gt;increase&lt;/em&gt; in value – a clear semantics violation.&lt;/p&gt;

&lt;p&gt;Our initial solution – to remove the counter-weight – created an abrupt discontinuity at 0.5 and led to more volatile rankings.
Once we identified the problem, the solution was clear – center and scale preferences smoothly around 0.5, so a preference of 0.7 would be encoded as 0.4, etc.&lt;/p&gt;
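&lt;p&gt;A minimal sketch of this centering and scaling, consistent with the 0.7 to 0.4 example. The function name is hypothetical, and the handling of values below 0.5 (flipping the direction of the preference) is our assumption about how symmetry would be kept.&lt;/p&gt;

```python
def center_and_scale(p: float):
    """Map a raw preference p in [0, 1] to (favors_first, strength).

    Strength is the distance from 0.5, rescaled to [0, 1], so it shrinks
    smoothly to zero as p approaches 0.5 instead of jumping.
    """
    strength = abs(2 * (p - 0.5))
    return (p >= 0.5, strength)


print(center_and_scale(0.7))  # favors the first item, strength of about 0.4
print(center_and_scale(0.5))  # strength 0.0: no preference either way
```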

&lt;p&gt;We were making progress, but things had begun to feel like a game of whack-a-mole – every solution seemed to introduce another problem.
The preferences themselves were not changing, nor was the final step of producing weights, but the &lt;em&gt;intermediate pipeline&lt;/em&gt; of scaling and adding and removing preferences was creating complex interactions that were difficult to reason about.&lt;/p&gt;

&lt;p&gt;Fortunately, we had an instinct for where to look next.
Software engineers have a concept of “code smell” – patterns that, based purely on appearance, suggest poor design.
It was certainly smelly that a general-purpose ranking engine depended so directly on a domain-specific value like the number of residents, and that “implicit preferences” were explicitly &lt;em&gt;subtracted&lt;/em&gt; as real data came in.
What if we removed or reworked that functionality?&lt;/p&gt;

&lt;p&gt;Exploring alternative approaches to damping and regularization set us down a road which led to a significant breakthrough.&lt;/p&gt;

&lt;h1 id=&quot;iii-explorations&quot;&gt;III. Explorations&lt;/h1&gt;

&lt;h3 id=&quot;dynamic-damping&quot;&gt;Dynamic Damping&lt;/h3&gt;

&lt;p&gt;The first attempt simplified the pipeline by removing implicit preferences, reinstituting bidirectional preferences, and replacing a fixed damping factor with a dynamic one that increases as a function of the number of explicit preferences – data the ranking engine already had.&lt;/p&gt;

&lt;p&gt;By making the damping factor dynamic instead of fixed, it could ideally replace the regularizing function of implicit preferences, ensuring that a small number of preferences could not dominate the results while creating a cleaner architecture.
Further, our hope was that a dynamic damping factor could mitigate the unexpected effects of bidirectional preferences, giving us the upside of per-pair stability without the downside of unpredictable interactions with the rest of the graph.&lt;/p&gt;

&lt;p&gt;The initial damping formula was a hyperbolic curve, where \(P\) is the number of preferences submitted and \(\text{maxPairs}\) is the total number of possible chore pairs:&lt;/p&gt;

\[d = \frac{P}{P + \alpha \cdot \text{maxPairs}}\]

&lt;p&gt;This curve begins at 0 and approaches 1 as the number of preferences grows, with \(\alpha\) controlling the steepness of the curve.
With few preferences, \(d\) is small, so damping stays high and rankings hew close to uniform.
As preferences accumulate, \(d\) grows, damping decreases, and the rankings increasingly reflect people’s stated preferences.&lt;/p&gt;
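&lt;p&gt;The damping formula translates directly into code. In this sketch we assume \(\text{maxPairs}\) counts unordered pairs, \(n(n-1)/2\) for \(n\) chores; the function name and example values are hypothetical.&lt;/p&gt;

```python
def dynamic_damping(num_prefs: int, num_items: int, alpha: float) -> float:
    """d = P / (P + alpha * maxPairs), rising from 0 toward 1 as P grows."""
    max_pairs = num_items * (num_items - 1) // 2  # assumed: unordered pairs
    return num_prefs / (num_prefs + alpha * max_pairs)


# With 10 chores there are 45 pairs; d reaches 0.5 when P equals alpha * maxPairs.
for prefs in (0, 45, 90, 450):
    print(prefs, round(dynamic_damping(prefs, num_items=10, alpha=1.0), 3))
```

&lt;p&gt;A larger \(\alpha\) pushes the halfway point further out, so more preferences are needed before the rankings move away from uniform.&lt;/p&gt;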

&lt;p&gt;We tested this against anonymized Sage House preferences and found that with the right \(\alpha\) we could reproduce the existing priorities almost exactly.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Key learning: a data-driven damping curve can replace both fixed damping and implicit preferences – but matching one dataset doesn’t guarantee generality.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This was promising.
It removed implicit preferences, replaced the fixed damping factor with something principled, and produced nearly identical rankings on real data.
The next step was to validate on a second dataset.&lt;/p&gt;

&lt;h3 id=&quot;cross-dataset-validation&quot;&gt;Cross-Dataset Validation&lt;/h3&gt;

&lt;p&gt;The second dataset, from the 5-resident Solegria house, told a different story.
Where Sage residents had submitted preferences organically – mostly maximal values of 1 or 0, spread across several active participants – Solegria had a different participation structure.
One person had submitted the vast majority of all preferences through the bulk-upload feature, and those preferences were much more moderate, having been derived from ratios of intended point values.&lt;/p&gt;

&lt;p&gt;The \(\alpha\) that had worked well for Sage produced compressed values for Solegria.&lt;/p&gt;

&lt;p&gt;We briefly explored an “adaptive” alpha based on a measure of preference intensity – higher if preferences are more extreme, lower if they are more uniform:&lt;/p&gt;

\[\alpha = \frac{\text{mean}(|2(p - 0.5)|)}{2}\]
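
&lt;p&gt;A minimal sketch of this intensity measure, assuming preference values \(p\) lie between 0 and 1 with 0.5 meaning indifference. The function name and the handling of an empty list are assumptions made for the example.&lt;/p&gt;

```javascript
// Adaptive alpha: mean(|2(p - 0.5)|) / 2.
// Assumes each preference value p is in [0, 1], where 0.5 means "no preference".
function adaptiveAlpha(preferenceValues) {
  if (preferenceValues.length === 0) return 0; // assumption: no data means alpha = 0
  const totalIntensity = preferenceValues
    .map((p) => Math.abs(2 * (p - 0.5)))
    .reduce((sum, x) => sum + x, 0);
  return totalIntensity / preferenceValues.length / 2;
}

// Maximal preferences (all 0 or 1) give the largest alpha.
console.log(adaptiveAlpha([1, 0, 1, 0]));
// Moderate preferences give a smaller alpha.
console.log(adaptiveAlpha([0.6, 0.4, 0.55]).toFixed(3));
```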

&lt;p&gt;However, regularizing based on explicit preferences would have introduced a new reflexivity: if strong preferences regularize themselves, preference-setting becomes less about expressing true beliefs and more an exercise in anticipating how the algorithm will react.
This was a complexity we were unwilling to introduce.&lt;/p&gt;

&lt;p&gt;More wrinkles emerged when we tested round-trip fidelity: since Solegria’s preferences were generated from known target scores, we could check whether the ranking algorithm recovered the intended distribution.
It didn’t.
The best-fit Sage value significantly compressed Solegria’s rankings relative to intended values.&lt;/p&gt;

&lt;p&gt;Our takeaway was that for hyperbolic damping, the model’s core parameter was sensitive not just to the &lt;em&gt;amount&lt;/em&gt; of data, but to its &lt;em&gt;distribution&lt;/em&gt;.
There was not a universal constant we could apply without significantly distorting either group’s current, in-production priorities – something we were not willing to do as part of an internal research project.
Our goal was to make fundamental improvements either &lt;em&gt;transparently&lt;/em&gt; or not at all.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Key learning: a model whose core parameter is sensitive to how preferences are generated – not just how many exist – cannot be trusted across heterogeneous groups.&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;It is worth noting that these data are slightly reflexive – Sage residents produced extreme values in part &lt;em&gt;because&lt;/em&gt; the overly-strong compression was not responsive enough to moderate values.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3 id=&quot;handling-coalitions&quot;&gt;Handling Coalitions&lt;/h3&gt;

&lt;p&gt;Before abandoning the hyperbolic approach entirely, a different concern surfaced.
The damping formula counted total preferences without regard to &lt;em&gt;who&lt;/em&gt; submitted them.
One active resident submitting hundreds of preferences could achieve the same damping as several people submitting handfuls of preferences each.
This defeated the coalition-building property that \(\text{numResidents}\) had – however crudely – provided originally.&lt;/p&gt;

&lt;p&gt;A possible solution was inspired by &lt;a href=&quot;/blog/2020/04/19/sharing-the-wealth.html&quot;&gt;Quadratic Voting&lt;/a&gt;: compute an &lt;em&gt;effective&lt;/em&gt; participation score as the sum of square roots of each resident’s preference count:&lt;/p&gt;

\[\text{effectiveP} = \sum_i \sqrt{\text{prefs}_i}\]

&lt;p&gt;This creates diminishing returns per person – one person submitting 100 preferences contributes \(\sqrt{100} = 10\), while four people submitting 25 each contribute \(4\sqrt{25} = 20\).
More participants means higher effective participation, mathematically enforcing the normative value of coalition-building.&lt;/p&gt;
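
&lt;p&gt;The arithmetic above can be checked with a few lines. This is only a sketch: the function name and the input shape (an array of per-resident preference counts) are assumptions.&lt;/p&gt;

```javascript
// Effective participation: sum of square roots of each resident's preference count.
// prefCounts is assumed to be an array with one entry per resident.
function effectiveParticipation(prefCounts) {
  return prefCounts.reduce((sum, count) => sum + Math.sqrt(count), 0);
}

// One resident submitting 100 preferences.
console.log(effectiveParticipation([100]));
// Four residents submitting 25 preferences each (same total of 100).
console.log(effectiveParticipation([25, 25, 25, 25]));
```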

&lt;p&gt;The implementation could be kept architecturally clean – the application would compute \(\text{effectiveP}\) from its knowledge of who submitted what, derive the damping factor, and pass it in.
PowerRanker would become a generic library that accepts damping as a parameter, knowing nothing about residents or participation – separation of concerns between the ranking engine and the application.&lt;/p&gt;

&lt;p&gt;But this approach would penalize Solegria’s &lt;em&gt;legitimate&lt;/em&gt; use case – a single person doing a bulk import on behalf of the group – and more generally, any house where one or two engaged members did most of the administrative work.
The instinct about coalition-building was right; the implementation was simply too aggressive – and too normative – for the messy reality of shared living, where participation is never evenly distributed.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Key learning: penalizing concentrated participation sounds principled, but punishes legitimate cases where engagement is naturally uneven.&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;returning-to-unidirectional-preferences&quot;&gt;Returning to Unidirectional Preferences&lt;/h3&gt;

&lt;p&gt;Alongside these damping experiments, the semantic violation that had originally motivated the move &lt;em&gt;away&lt;/em&gt; from bidirectional preferences (see &lt;a href=&quot;#prior-investigations&quot;&gt;Prior Investigations&lt;/a&gt;) resurfaced.
We had reinstituted bidirectional encoding in the hope that dynamic damping would contain the problem, but it didn’t – deprioritizing a chore still added weight to it, inflating its ranking relative to chores with no preferences at all.&lt;/p&gt;

&lt;p&gt;Bidirectional encoding turned out to be a non-starter, and our only option was a return to scaled unidirectional preferences, where only the preferred chore gains value.
However, since unidirectional preferences add less total weight to the matrix, the results become more volatile – and the choice of damping curve matters even more.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Key learning: bidirectional encoding’s semantic violation – deprioritizing a chore inflates it – persisted regardless of damping strategy, confirming it as a dead end.&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;the-search-for-a-universal-curve&quot;&gt;The Search for a Universal Curve&lt;/h3&gt;

&lt;p&gt;We had so far failed to find a general solution, and the return to unidirectional preferences meant that prior parameter fits needed to be revisited.
Stepping back, we asked a different question: is there another curve that works consistently across different houses and coverage levels?&lt;/p&gt;

&lt;p&gt;We liked the idea of using \(P/\text{maxPairs}\) as the input for the damping curve.
This value, the ratio of the number of observed preferences across &lt;em&gt;all residents&lt;/em&gt; to the number of possible preferences &lt;em&gt;per resident&lt;/em&gt;, could range from 0 up to the number of residents.
What curve would cleanly map this number to the range (0, 1)?
The answer was the sigmoid:&lt;/p&gt;

\[d = \frac{1}{1 + e^{-\alpha \cdot P / \text{maxPairs}}}\]

&lt;p&gt;Both formulas normalize by problem size, but they differ in &lt;em&gt;shape&lt;/em&gt;.
The hyperbolic curve produces \(d=0\) with no data, meaning new houses get near-total damping, while the sigmoid starts at \(d=0.5\) – an intuitive “50/50 blend with uniform” – giving early residents more agency.
Further, the sigmoid’s steeper curve meant a single parameter could work across both datasets.&lt;/p&gt;
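
&lt;p&gt;To make the difference in shape concrete, the sketch below evaluates both curves at the same inputs. The function names and the parameter values are illustrative assumptions; the fitted production values are not given here.&lt;/p&gt;

```javascript
// Hyperbolic curve: d = P / (P + alpha * maxPairs). Starts at 0.
function hyperbolicCurve(numPreferences, maxPairs, alpha) {
  if (numPreferences === 0) return 0; // avoids 0 / 0 when there is no data
  return numPreferences / (numPreferences + alpha * maxPairs);
}

// Sigmoid curve: d = 1 / (1 + exp(-steepness * P / maxPairs)). Starts at 0.5.
function sigmoidCurve(numPreferences, maxPairs, steepness) {
  return 1 / (1 + Math.exp((-steepness * numPreferences) / maxPairs));
}

const totalPairs = 45; // e.g. 10 chores
for (const p of [0, 45, 135]) {
  console.log(
    p,
    hyperbolicCurve(p, totalPairs, 1).toFixed(3),
    sigmoidCurve(p, totalPairs, 1).toFixed(3)
  );
}
```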

&lt;p&gt;We swept the steepness parameter \(\alpha\) across both datasets, looking for a value that best matched current production.
The result was encouraging: we found a single \(\alpha\) that worked well for both houses.
Rankings shifted modestly, with a slightly larger spread at Sage than at Solegria.
This seemed like the answer.&lt;/p&gt;

&lt;p&gt;But looking back at the full trajectory of the research, something nagged.
Four different damping formulas had now been explored – hyperbolic, QV-weighted hyperbolic, sigmoid, and adaptive variants of each – always asking “what’s the right damping curve?” without questioning whether functional damping was the right tool at all.
The concept had been inherited from PageRank and never seriously challenged.&lt;/p&gt;

&lt;p&gt;Every iteration refined the curve; none questioned the frame.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Key learning: finding the “right” damping curve was the wrong question – four iterations of refinement revealed that the entire frame of functional damping needed to be reconsidered.&lt;/em&gt;&lt;/p&gt;

&lt;h1 id=&quot;iv-breakthrough&quot;&gt;IV. Breakthrough&lt;/h1&gt;

&lt;h3 id=&quot;damp-before-normalize&quot;&gt;Damp-Before-Normalize&lt;/h3&gt;

&lt;p&gt;Standard PageRank normalizes each row of the transition matrix (so rows sum to 1), &lt;em&gt;then&lt;/em&gt; applies damping (blending with the uniform distribution).
But normalization erases the &lt;em&gt;magnitude&lt;/em&gt; of the raw preference data: an item with 10 strong preferences and an item with 1 weak preference look identical after normalization.
In PageRank this doesn’t matter – the link matrix is binary, so pre-normalization row sums just count outgoing links, which carry no information about page authority.
In a continuous preference matrix, the pre-normalization row sums encode something meaningful: &lt;em&gt;how much total evidence&lt;/em&gt; the system has about each item.&lt;/p&gt;

&lt;p&gt;By adopting PageRank’s approach uncritically, we were inadvertently throwing away information – information which doesn’t exist for links but does exist for preferences.&lt;/p&gt;

&lt;p&gt;With implicit preferences, this effect was obfuscated, as normalization was constrained by these extra values.
Without implicit preferences – and with the sparser matrices that unidirectional encoding produced – the problem with normalize-before-damp became plain as day:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/img/damp-before-normalize.jpg&quot; alt=&quot;Damp Before Normalize&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fancy models and AI agents are no match for a good notebook at your side.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This was the conceptual breakthrough of the research – the first moment where the PageRank architecture itself, not just its parameters, was questioned.
By reversing the order – damping first, then normalizing – we can regularize outputs &lt;em&gt;without&lt;/em&gt; washing out valuable preference information.
The principle is simple: &lt;strong&gt;regularize before you normalize&lt;/strong&gt;, so that cells with more real data are proportionally less affected by the prior.&lt;/p&gt;
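
&lt;p&gt;The effect of the ordering can be seen on two rows with identical proportions but different amounts of evidence. This is a sketch under a stated assumption: “damping first” is modelled here as adding a uniform constant to every raw cell before row normalization, which may differ from the exact form used in the research. All function names and numbers are illustrative.&lt;/p&gt;

```javascript
// Divide each entry by the row total so the row sums to 1.
// Assumes the row has a non-zero sum.
function normalizeRow(row) {
  const rowSum = row.reduce((sum, x) => sum + x, 0);
  return row.map((x) => x / rowSum);
}

// Classic order: normalize, then blend with the uniform distribution.
function normalizeThenDamp(row, d) {
  const n = row.length;
  return normalizeRow(row).map((x) => d * x + (1 - d) / n);
}

// Reversed order (assumed form): add a uniform prior to the raw weights, then normalize.
function dampThenNormalize(row, prior) {
  return normalizeRow(row.map((x) => x + prior));
}

const strongRow = [0, 8, 2]; // plenty of evidence
const weakRow = [0, 0.8, 0.2]; // same proportions, one tenth of the evidence

console.log(normalizeThenDamp(strongRow, 0.85), normalizeThenDamp(weakRow, 0.85));
console.log(dampThenNormalize(strongRow, 1), dampThenNormalize(weakRow, 1));
```

&lt;p&gt;With the classic order both rows come out the same; with the assumed reversed order, the row backed by less evidence ends up closer to uniform.&lt;/p&gt;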

&lt;p&gt;Key mathematical properties were preserved: the final matrix remains row-stochastic, irreducible, and aperiodic, guaranteeing a unique stationary distribution via Perron-Frobenius.
Only the clean Markov-plus-uniform decomposition of the classic Google matrix was lost – but that decomposition, and its “random surfer” interpretation, was arguably never the right model for social choice anyway.&lt;/p&gt;

&lt;p&gt;The damp-before-normalize approach addressed the magnitude-erasure problem that unidirectional preferences had made acute, but it still relied on a functional damping curve, which meant it inherited the parameterization problems from the earlier sections.
&lt;strong&gt;What we needed was a way to regularize &lt;em&gt;before&lt;/em&gt; normalization without relying on a curve at all.&lt;/strong&gt;&lt;/p&gt;

&lt;h3 id=&quot;bayesian-pseudocounts&quot;&gt;Bayesian Pseudocounts&lt;/h3&gt;

&lt;p&gt;The damp-before-normalize insight meant that all regularization now logically occurred in the same place: in the matrix, before normalization.
The original pipeline had split regularization across two mechanisms – implicit preferences (added to and subtracted from the matrix before normalization) and damping (applied after normalization).
&lt;strong&gt;With damping moved before normalization, regularization could be performed in a single, theoretically clean step.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The implementation of this insight turned out to be almost embarrassingly simple: add a small constant to every off-diagonal cell in the matrix before loading the data:&lt;/p&gt;

\[k = c \times \text{numResidents}\]

&lt;p&gt;Each chore pair starts with \(k\) units of “virtual preference” in both directions, representing a uniform prior belief before any data arrives, scaled by the number of residents.
As real preferences accumulate, they dominate the pseudocounts naturally – no deleting implicit preferences, no coverage-dependent formula, no coalition checks, no steepness parameter.&lt;/p&gt;

&lt;p&gt;There is an irony here.
Pseudocounts do what implicit preferences were trying to do all along: provide a uniform baseline that washes out as real data arrives.
But where implicit preferences were a hack – entangled with damping, lacking a clean mathematical interpretation, creating indirect effects through bidirectional subtraction – pseudocounts have a principled statistical identity as a &lt;a href=&quot;https://en.wikipedia.org/wiki/Dirichlet_distribution&quot;&gt;Dirichlet prior&lt;/a&gt;.
The original system had the right instinct; it just expressed it through the wrong formalism.&lt;/p&gt;

&lt;p&gt;The pseudocount approach has key properties that none of the damping-based models achieved.
First, it &lt;strong&gt;scales naturally with house size&lt;/strong&gt;: more residents means a stronger prior, which means more preference data is needed to move rankings – preserving the coalition-building boost for larger houses without handicapping smaller houses.
Second, it requires a &lt;strong&gt;single parameter&lt;/strong&gt; \(c\) with clear semantics, representing the strength of the uniform prior per resident, which can be set either analytically or through governance.&lt;/p&gt;
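
&lt;p&gt;A small sketch of how the prior scales with house size and fades as real data arrives. The value of \(c\) below is hypothetical (the production value is not given here), and the function names are assumptions; the “prior share” is simply the fraction of a cell’s total weight that still comes from the pseudocount.&lt;/p&gt;

```javascript
// Pseudocount per off-diagonal cell: k = c * numResidents.
function pseudocount(c, numResidents) {
  return c * numResidents;
}

// Fraction of a cell's weight that comes from the prior, given k > 0
// and realWeight units of explicit preference added to that cell.
function priorShare(k, realWeight) {
  return k / (k + realWeight);
}

const hypotheticalC = 0.05; // hypothetical prior strength per resident
for (const [house, residents] of [["Sage", 9], ["Solegria", 5]]) {
  const k = pseudocount(hypotheticalC, residents);
  // Prior share after 0, 1 and 5 units of real preference weight.
  const shares = [0, 1, 5].map((w) => priorShare(k, w).toFixed(2));
  console.log(house, k.toFixed(2), shares.join(" "));
}
```

&lt;p&gt;Because \(k\) grows with the number of residents, the same amount of real preference weight displaces less of the prior in the larger house.&lt;/p&gt;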

&lt;p&gt;We tested pseudocounts against both datasets, and the results were encouraging.
At Sage (9 residents), individual chore values shifted by an average of 0.08% – nearly invisible.
At Solegria (5 residents), values shifted an average of 0.55%, with the spread between highest- and lowest-priority chores nearly doubling.
The difference is explained by the new prior – with only 5 residents, Solegria’s prior was a little over half of Sage’s (5/9), meaning that Solegria’s preferences could be expressed more fully.
This change – increased spread across the board, with Sage seeing virtually no change and Solegria seeing meaningful change – was exactly what we wanted.&lt;/p&gt;

&lt;p&gt;Most strikingly, &lt;strong&gt;the implementation was a net deletion of code.&lt;/strong&gt;
The old matrix initialization populated every off-diagonal cell with implicit preferences scaled by the number of participants.
The new version initializes with pseudocounts – a uniform prior that knows nothing about participants:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-javascript&quot; data-lang=&quot;javascript&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// Before: implicit preferences&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;_prepareMatrix&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;linAlg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;zero&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;numParticipants&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;plusEach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;minus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;linAlg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;identity&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mulEach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;numParticipants&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mulEach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;implicitPref&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-javascript&quot; data-lang=&quot;javascript&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// After: pseudocounts&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;_prepareMatrix&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;kd&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;linAlg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;zero&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;plusEach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;minus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;linAlg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;identity&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mulEach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;options&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Adding preferences went from a multi-step dance – subtract implicit preferences, then add the scaled explicit preference in the dominant direction – to simple addition:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-javascript&quot; data-lang=&quot;javascript&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// Before: subtract implicit, then add explicit&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;preferences&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;forEach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;scaled&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;scaled&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!==&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;sourceIx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;targetIx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;implicitPref&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;targetIx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;sourceIx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;implicitPref&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;scaled&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
      &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;sourceIx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;targetIx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;scaled&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
      &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;targetIx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;sourceIx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;scaled&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-javascript&quot; data-lang=&quot;javascript&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// After: just add&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;preferences&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;forEach&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;scaled&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;scaled&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;sourceIx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;targetIx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;scaled&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;targetIx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;][&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;sourceIx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;scaled&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;And the power method dropped its damping step entirely – regularization now lives in the data, before normalization, not after:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-javascript&quot; data-lang=&quot;javascript&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// Before: normalize, then damp&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;_powerMethod&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;epsilon&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;nIter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;row&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;rowSum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;row&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;row&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;rowSum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;mulEach_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;plusEach_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;// ... power iteration&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-javascript&quot; data-lang=&quot;javascript&quot;&gt;&lt;span class=&quot;c1&quot;&gt;// After: just normalize&lt;/span&gt;
&lt;span class=&quot;nx&quot;&gt;_powerMethod&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;epsilon&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;nIter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;data&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;matrix&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;row&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;rowSum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;this&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;#&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;row&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;row&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;rowSum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;// ... power iteration&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;h1 id=&quot;v-reflections&quot;&gt;V. Reflections&lt;/h1&gt;

&lt;p&gt;The research began with user complaints – compressed distributions and opaque causality – and every candidate model was evaluated against those experiences.
“Spread” was never an abstract metric; it was a proxy for whether residents felt their preferences actually mattered.
The final model increased spread where it was over-compressed while preserving rank-order stability where preferences had matured – delivering the experience the system always promised: submit a preference, see a legible change, trust the output.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The winding path was necessary.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Each intermediate model revealed a dimension of the problem that had been previously overlooked.
The hyperbolic approach revealed that the damping parameter wasn’t universal; cross-dataset validation revealed sensitivity to preference generation method; the QV model revealed the tension between coalition-building norms and real-world participation patterns.
The attempt to revive bidirectional encoding confirmed its fundamental limits, and the sigmoid revealed that even the best curve couldn’t work across all regimes.
The damp-before-normalize investigation revealed that regularization must happen before normalization – the key conceptual insight – and the reconfigured pipeline allowed us to see how implicit preferences and damping could be combined into a single, clean concept.
Each failure narrowed the space of possible solutions, until what remained was obvious.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Every idea was tested against real data.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The research used two production datasets with different participation structures and preference distributions.
Theoretical elegance was never sufficient; the datasets always had the final word.
This also reveals the importance of persistence – it took several years to get enough data for this research.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Separation of concerns clarified everything.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The original PowerRanker knew about \(\text{numResidents}\) and \(\text{implicitPref}\) – application-layer concepts that don’t belong in a generic ranking library.
In the final design, PowerRanker accepts a pseudocount \(k\) as a constructor option and knows nothing about residents, participation, or houses.
The application layer computes \(k = c \times \text{numResidents}\) from its domain knowledge and passes it in.
The library became more general while the application remained domain-aware – and beginning with this separation as a design constraint shaped the research process throughout.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI as a research partner accelerated exploration – but speed has its own risks.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This research was conducted in collaboration with Claude, which made it possible to move through the problem at a rapid pace.
Formulating hypotheses, writing scripts, running them against data, and interpreting results could happen in a single sitting rather than over days or weeks.
But the speed of the process created its own challenge: it became easy to explore faster than we could think.
The damp-before-normalize breakthrough – the most important insight of the entire project – came not from another round of analysis, but from stepping away from the screen and working through matrix algebra with pen and paper.
The AI was indispensable for traversing the search space; the human judgment about when to stop searching and start questioning was equally essential.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Question inherited assumptions.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;PageRank’s normalize-then-damp pipeline was cargo-culted in without questioning whether its order of operations made sense in a different domain.
We spent six iterations optimizing the damping curve before questioning where regularization belonged in the pipeline at all.
Adapting an algorithm means inheriting its structure – and inherited structure can be the hardest kind to see, precisely because it came with the territory.
The same lesson applied to implicit preferences: a pragmatic hack that worked well enough to obscure its own conceptual problems for years.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Legibility as professional obligation, not UX preference.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The commitment to make improvements “transparently or not at all” was a constraint the research imposed on itself: because ranking systems shape collective outcomes in ways not always obvious from individual inputs, the people who live with those outcomes are owed a legible account of how they are produced.
Building new institutions means respecting the people who already inhabit them – honoring existing preferences while expanding future expressivity – and each model should be judged accordingly.&lt;/p&gt;
</description>
        <pubDate>Sat, 07 Feb 2026 00:00:00 +0000</pubDate>
        <link>http://kronosapiens.github.io/blog/2026/02/07/reinventing-the-wheel.html</link>
        <guid isPermaLink="true">http://kronosapiens.github.io/blog/2026/02/07/reinventing-the-wheel.html</guid>
        
        <category>algorithms</category>
        
        <category>mechanism-design</category>
        
        <category>cooperative-housing</category>
        
        
        <category>blog</category>
        
      </item>
    
      <item>
        <title>The Pairwise Paradigm</title>
        <description>&lt;div style=&quot;display: block; width: 100%; text-align: left; margin: 1rem 0;&quot;&gt;
  &lt;span style=&quot;display: inline-block; font-size: 3rem; font-weight: 800; line-height: 1.15; padding-bottom: 0.08em; background: linear-gradient(90deg, #e8a4a8, #f0c2a6, #f2df93, #bcdba5, #a8d3d8, #abc6ee, #c3ace2); -webkit-background-clip: text; background-clip: text; color: transparent; -webkit-text-fill-color: transparent; -webkit-text-stroke: 1px rgba(0, 0, 0, 0.5);&quot;&gt;
    The Pairwise Paradigm
  &lt;/span&gt;
&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Abstract: Pairwise methods are best thought of as composing a &lt;strong&gt;paradigm&lt;/strong&gt; for turning &lt;strong&gt;scarce attention&lt;/strong&gt; into &lt;strong&gt;robust allocation signals&lt;/strong&gt;. With the right &lt;strong&gt;algorithms&lt;/strong&gt;, &lt;strong&gt;interface&lt;/strong&gt;, and &lt;strong&gt;audience&lt;/strong&gt;, they can support the continuous funding of ecosystems at surprisingly low attention cost.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Thanks to Carl Cervone, David Gasquez, Ori Shimony, and Dandelion Mané for feedback on earlier versions of this essay.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;#i-motivations&quot;&gt;I. Motivations&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#ii-why-pairwise&quot;&gt;II. Why Pairwise&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#iii-the-pairwise-paradigm&quot;&gt;III. The Pairwise Paradigm&lt;/a&gt;
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;#1-algorithm-selection&quot;&gt;1. Algorithm Selection&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#2-pair-selection&quot;&gt;2. Pair Selection&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#3-interface-design&quot;&gt;3. Interface Design&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;&lt;a href=&quot;#4-audience-development&quot;&gt;4. Audience Development&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#iv-continuous-funding&quot;&gt;IV. Continuous Funding&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#v-evaluation-and-legitimacy&quot;&gt;V. Evaluation and Legitimacy&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#vi-putting-it-all-together&quot;&gt;VI. Putting It All Together&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#appendix-other-considerations&quot;&gt;Appendix: Other Considerations&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;i-motivations&quot;&gt;I. Motivations&lt;/h1&gt;

&lt;p&gt;&lt;a href=&quot;/blog/2020/04/04/gaming-the-vote.html&quot;&gt;There is no perfect voting system.&lt;/a&gt;
The truth and tragedy of this statement have been understood for decades, beginning with Kenneth Arrow’s “impossibility theorem” showing the fundamental limits of ranked-choice voting systems.&lt;/p&gt;

&lt;p&gt;At its core, the problem stems from attempting to measure complex social reality under conditions of high stakes.
In attempting to distill &lt;em&gt;subjective&lt;/em&gt; reality into &lt;em&gt;objective&lt;/em&gt; votes, information is lost, and the measurement process itself becomes an arena for power contestation.
In the end, the best we can do is design &lt;em&gt;task-specific&lt;/em&gt; systems in which the gap between subjective experience and objective input is as small as possible, decreasing the scope of conflict and increasing both the utility and legitimacy of these systems.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/blog/2019/05/08/against-voting.html&quot;&gt;In mid-2019 I speculated that&lt;/a&gt;, within the web3 governance community, the limitations of pass-fail voting would shift interest away from proposal-based decision-making towards distributed capital allocation.
Over the last six years, that prediction has largely been borne out: instead of &lt;em&gt;voting on policy&lt;/em&gt;, governance innovation has increasingly come to revolve around &lt;em&gt;giving out money&lt;/em&gt;.
The shift from &lt;em&gt;discrete policy outcomes&lt;/em&gt; (you win, I lose) to &lt;em&gt;continuous financial outcomes&lt;/em&gt; ($10 to you, $5 to me) opens a rich design space for social choice.&lt;/p&gt;

&lt;p&gt;Within this domain, several classes of techniques have been explored:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Quadratic Funding&lt;/strong&gt;, in which direct donations double as “votes” dividing a matching pool, subject to square-root constraints.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Pairwise Methods&lt;/strong&gt;, in which inputs are framed as “A vs B” and converted into numeric allocations via an algorithm.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Metrics-Based&lt;/strong&gt;, in which votes are made on high-level metrics, and allocations are made indirectly based on these metrics.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;AI-Augmented&lt;/strong&gt;, in which AI agents analyze projects and recommend allocations based on various internal processes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of these approaches, on some level, seeks to convert &lt;em&gt;scarce attention&lt;/em&gt; into &lt;em&gt;useful signal&lt;/em&gt;.
Each has its own strengths and weaknesses:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Quadratic Funding&lt;/strong&gt; advantages smaller groups, but struggles to allocate attention, leading to “beauty contests.”&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Pairwise Methods&lt;/strong&gt; offer a simple and intuitive framing, but struggle to get sufficient coverage to produce good results.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Metrics-Based&lt;/strong&gt; reduce scope for politics, but struggle with “Goodhart’s Law” failures and incentivize misrepresentation.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;AI-Augmented&lt;/strong&gt; sidestep issues of voter apathy, but struggle with issues of alignment, legitimacy, and interpretability.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3359677&quot;&gt;I have been researching and working with decision-making systems since 2016&lt;/a&gt;, with a focus on pairwise methods.
I believe that we should continue exploring &lt;em&gt;all&lt;/em&gt; of these techniques, and develop a culture of practice able to select from among them based on the characteristics of the problem, and audience, at hand.&lt;/p&gt;

&lt;p&gt;The rest of this essay will discuss &lt;strong&gt;pairwise preferences&lt;/strong&gt; specifically, and describe the &lt;em&gt;design space&lt;/em&gt; in which they operate.
We will argue that pairwise methods should not be understood as isolated mechanisms, but as part of a larger &lt;em&gt;paradigm of decision-making&lt;/em&gt; incorporating multiple complementary techniques.
By approaching pairwise methods as a &lt;em&gt;paradigm&lt;/em&gt;, it becomes easier to see how the various elements combine into a high-performance system for allocating shared resources.&lt;/p&gt;

&lt;p&gt;We will focus on the use-case of “&lt;a href=&quot;https://en.wikipedia.org/wiki/Public_good&quot;&gt;public goods funding&lt;/a&gt;,” in which communities come together to fund critical infrastructure, as this is the domain with the most activity and richest basis for analysis.
The scope of these methods, however, is broader: instead of funding public goods, we could apply these methods to problems as grand as the setting of federal budgets, or as mundane as judging hackathons or &lt;a href=&quot;https://www.zaratan.world/chorewheel&quot;&gt;prioritizing chores in a coliving house&lt;/a&gt;.
Given the range of possible applications, getting pairwise right would be a major unlock in our ability to coordinate at scale.&lt;/p&gt;

&lt;h1 id=&quot;ii-why-pairwise&quot;&gt;II. Why Pairwise&lt;/h1&gt;

&lt;p&gt;A &lt;strong&gt;pairwise preference&lt;/strong&gt; is a relative choice between two options, A or B.
Pairwise preferences are the atoms of human subjectivity: the simplest distinction a person can make, running on the “phenomenological bare metal” of perception.
This simplicity makes them robust (they mean what they say they mean), accessible (anybody can make a relative distinction), general (many decisions can be framed in relative terms), and flexible (pairwise preferences can be aggregated in many different ways).&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Note: contrary to some depictions, Tinder-style swipes are &lt;em&gt;not&lt;/em&gt; pairwise judgments, but pass/fail decisions.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Pairwise preferences have been studied for nearly a century, going back to the 1927 work of &lt;a href=&quot;https://en.wikipedia.org/wiki/Louis_Leon_Thurstone&quot;&gt;American psychometrician L. L. Thurstone&lt;/a&gt; on subjective responses to stimuli.
They have since found many applications in the ranking and ordering of items: from ranking chess players with Elo, to weighting web pages with Google’s &lt;a href=&quot;https://en.wikipedia.org/wiki/PageRank&quot;&gt;PageRank&lt;/a&gt;, to assessing the reliability of nodes in a peer-to-peer file-sharing network with &lt;a href=&quot;https://en.wikipedia.org/wiki/EigenTrust&quot;&gt;EigenTrust&lt;/a&gt;, to allocating credit for open-source contributions with &lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4570035&quot;&gt;SourceCred&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This dual heritage, as a technique both for subjective measurement &lt;em&gt;and&lt;/em&gt; for allocating weights among items, suggests these methods have much to offer to the practice of distributed capital allocation, which faces exactly these problems.&lt;/p&gt;

&lt;p&gt;USV’s Albert Wenger &lt;a href=&quot;https://open.spotify.com/episode/6XJOXe3whWTCm3TkuvAWQq?si=e51a8b5f558e43be&quot;&gt;recently argued&lt;/a&gt; that “capital allocation is often downstream from attention allocation.”
As society increasingly comes to view attention as its scarcest resource, evaluating social choice schemes through the lens of attention becomes increasingly critical.
When compared to other voting systems, pairwise methods are arguably more “attention-native” – stimulating, game-like, and inherently rewarding to engage with.&lt;/p&gt;

&lt;p&gt;Unlike non-attention-native methods like quadratic funding, which often lead to “beauty contests” favoring projects with marketing muscle, pairwise methods more naturally &lt;em&gt;distribute attention&lt;/em&gt; by presenting items in random pairs.
This approach elevates &lt;em&gt;item discovery&lt;/em&gt; into a first-class construct, as voters are frequently exposed to items with which they are unfamiliar.
This reduces the need (and benefit) for projects to market themselves, freeing resources for driving engagement with &lt;em&gt;the system as a whole&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Further, distributing voter attention makes strategic voting more costly.
Unlike, for instance, a Borda Count, in which one can trivially “bury” a rival by listing them last, a pairwise process asks voters to evaluate many pairs of items in which they have no strategic interest.&lt;/p&gt;

&lt;p&gt;Seen through this lens, pairwise methods appear as a natural basis for social choice in an environment of scarce attention.
Rather than take voting methods designed for pen and paper and try to adapt them to fast-paced, digital decision environments, we can take decision primitives &lt;em&gt;designed for attention&lt;/em&gt; and make them the core of an entirely new regime.&lt;/p&gt;

&lt;p&gt;Despite these attractive qualities, pairwise methods remain niche.
They have seen some use in public goods funding, such as in &lt;a href=&quot;https://www.optimism.io/blog/announcing-retropgf-round-3-recipients&quot;&gt;Optimism’s RetroPGF&lt;/a&gt; (helping to allocate $100mm+ in funding) as well as in this year’s &lt;a href=&quot;http://deepfunding.org/&quot;&gt;Deep Funding&lt;/a&gt; initiative, but have yet to capture the enthusiasm of other techniques.&lt;/p&gt;

&lt;p&gt;This is in part due to gaps in pairwise practice, as well as the lack of an overarching vision – making the technique difficult to use and to communicate.
By clearly articulating pairwise as a &lt;em&gt;paradigm&lt;/em&gt; of interrelated techniques, we can better communicate the scope of these methods, build momentum around their use, and advance the art and practice of public goods funding overall.&lt;/p&gt;

&lt;h1 id=&quot;iii-the-pairwise-paradigm&quot;&gt;III. The Pairwise Paradigm&lt;/h1&gt;

&lt;p&gt;The word “paradigm” comes from the Greek word for “pattern,” and in scientific contexts refers to “&lt;a href=&quot;https://en.wikipedia.org/wiki/Paradigm&quot;&gt;a distinct set of concepts or thought patterns, including theories, research methods, postulates, and standards for what constitutes legitimate contributions to a field&lt;/a&gt;.”
By framing pairwise as a “paradigm” instead of a single tool or mechanism, we emphasize that it is not any one technique, but rather the &lt;em&gt;synergies between techniques&lt;/em&gt; that produce desirable outcomes.&lt;/p&gt;

&lt;p&gt;These techniques fall into three buckets: audience and problem development, interface design and data collection, and algorithmic data analysis.
As an end-to-end pipeline, audiences feed into a voting interface, which feeds data to algorithms that guide both ongoing data collection and final analysis:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/img/pairwise-flow.png&quot; alt=&quot;The Pairwise Paradigm&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Individual techniques can be added, altered, or removed &lt;em&gt;within&lt;/em&gt; this pairwise paradigm, allowing for parallel explorations within a coherent design space.&lt;/p&gt;

&lt;p&gt;We will now explore these topics, starting with algorithms (what we’re trying to compute), then working backward through pair selection, interfaces, and audience – the layers that shape the data those algorithms receive.&lt;/p&gt;

&lt;h2 id=&quot;1-algorithm-selection&quot;&gt;1. Algorithm Selection&lt;/h2&gt;

&lt;blockquote&gt;
  &lt;p&gt;Note: this section is more technically dense than the rest of the essay.
Feel free to skim if the details are not relevant to you.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Pairwise preferences themselves do not determine rankings or weights.
Rather, they must be &lt;em&gt;converted&lt;/em&gt; into weights using an algorithmic process, a “machine for converting subjectivity into objectivity.”
The choice of algorithm has far-reaching implications for what kind of output gets created, and how it should be interpreted.&lt;/p&gt;

&lt;p&gt;This section will discuss several options and their properties, with a focus on the &lt;em&gt;ontology&lt;/em&gt; and &lt;em&gt;complexity&lt;/em&gt; of each algorithm – how each algorithm models reality, and how it processes information relative to that model.&lt;/p&gt;

&lt;p&gt;In all cases, we begin with a sequence of pairwise observations \([(a, b, x), ...]\) with \(x\) representing the pairwise judgment, and want to produce a set of weights \(w = [w_a, w_b, ...]\) summing to 1 telling us how to divide a fixed pool of capital among the items.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Note: Throughout this section, we will use the standard &lt;a href=&quot;https://en.wikipedia.org/wiki/Big_O_notation&quot;&gt;“Big-O” notation&lt;/a&gt; for evaluating the complexity, in terms of both computation and data, of these techniques.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3 id=&quot;elo&quot;&gt;Elo&lt;/h3&gt;

&lt;p&gt;The Elo rating system, developed by &lt;a href=&quot;https://en.wikipedia.org/wiki/Arpad_Elo&quot;&gt;Hungarian chess master Arpad Elo&lt;/a&gt;, is best known as the basis for professional chess rankings, but has found use in determining rankings across a variety of domains.&lt;/p&gt;

&lt;p&gt;In the Elo system, every participant has a rating \(R\) determined by their prior matchups, used to predict the score \(S\) of an upcoming match between \(a\) and \(b\):&lt;/p&gt;

\[E[S_{ab}] = \frac{f(R_a)}{f(R_a) + f(R_b)}\]

&lt;p&gt;After a match, the players’ Elo ratings are updated as a function of the &lt;em&gt;actual outcome&lt;/em&gt; vs the &lt;em&gt;predicted outcome&lt;/em&gt; against an opponent:&lt;/p&gt;

\[R&apos;_a \leftarrow R_a + g(S_{ab} - E[S_{ab}])\]
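
&lt;p&gt;As a concrete illustration, assume the conventional chess instantiation \(f(R) = 10^{R/400}\) and \(g(x) = Kx\) with \(K = 32\) (one common choice; the general form above does not require it).
Consider a match between \(R_a = 1600\) and \(R_b = 1400\) which \(a\) wins, so that \(S_{ab} = 1\):&lt;/p&gt;

\[E[S_{ab}] = \frac{10^{1600/400}}{10^{1600/400} + 10^{1400/400}} = \frac{1}{1 + 10^{-0.5}} \approx 0.76\]

\[R&apos;_a \leftarrow 1600 + 32(1 - 0.76) \approx 1607.7\]

&lt;p&gt;Had the underdog \(b\) won instead (\(S_{ab} = 0\)), \(a\) would have lost about 24.3 points: under these assumptions, an upset moves the ratings more than an expected result.&lt;/p&gt;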

&lt;p&gt;&lt;strong&gt;Ontologically&lt;/strong&gt;, Elo models interactions as occurring &lt;em&gt;in a sequence between the entities themselves&lt;/em&gt;, with the entities changing as a result of these encounters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Computationally&lt;/strong&gt;, Elo is \(O(n)\) in the &lt;em&gt;number of matchups&lt;/em&gt; \(n\): every matchup results in one constant-time update, and there is no minimum number of observations.&lt;/p&gt;

&lt;p&gt;Unlike the other algorithms discussed, which generate weights in a single batch using &lt;em&gt;all&lt;/em&gt; of the observed preference data, Elo is an &lt;em&gt;online&lt;/em&gt; algorithm, meaning that the rankings are updated after every matchup – and are thus dependent on the &lt;em&gt;specific sequence of the matches&lt;/em&gt;.
Data are understood as the result of a matchup &lt;em&gt;between&lt;/em&gt; two items, &lt;em&gt;not&lt;/em&gt; as a vote on a pair of items by an independent observer.&lt;/p&gt;

&lt;p&gt;While often proposed as a candidate algorithm for capital allocation, Elo’s ontology of &lt;em&gt;a sequence of matchups between items&lt;/em&gt; makes it a &lt;strong&gt;bad fit&lt;/strong&gt; for the use-case, in which data takes the form of &lt;em&gt;a set of votes by third-party observers&lt;/em&gt;.&lt;/p&gt;

&lt;h3 id=&quot;bradley-terry&quot;&gt;Bradley-Terry&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Bradley%E2%80%93Terry_model&quot;&gt;The Bradley-Terry model&lt;/a&gt; is a popular model for generating weights based on pairwise preferences.&lt;/p&gt;

&lt;p&gt;Bradley-Terry models every item as having a &lt;em&gt;latent strength&lt;/em&gt; \(p\), and models the probability of item \(a\) being preferred to item \(b\) as follows:&lt;/p&gt;

\[P(a &amp;gt; b) = \frac{p_a}{p_a + p_b}\]

&lt;p&gt;&lt;strong&gt;Ontologically&lt;/strong&gt;, Bradley-Terry is essentially &lt;em&gt;Platonic&lt;/em&gt;: pairwise observations are understood as random fluctuations revealing a hidden capital-T Truth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Computationally&lt;/strong&gt;, fitting a Bradley-Terry model is \(O(nm)\) in the &lt;em&gt;number of matchups&lt;/em&gt; \(n\) and &lt;em&gt;number of iterations&lt;/em&gt; \(m\) needed to converge, requiring \(O(k^2)\) observations in the number of items \(k\).&lt;/p&gt;

&lt;p&gt;Note the similarity to Elo, which also models results as a ratio of latent scores.
However, unlike Elo, Bradley-Terry produces weights in a batch based on a &lt;em&gt;set of unordered preferences,&lt;/em&gt; making it a more natural choice in the setting where the inputs have no inherent ordering.
In terms of computational complexity, Bradley-Terry models are typically fit using maximum-likelihood estimation, a statistical technique in which the latent strengths are iteratively updated to improve their fit to the observed data, requiring several passes, or &lt;em&gt;iterations&lt;/em&gt;, over the training data.&lt;/p&gt;
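
&lt;p&gt;To make the fitting procedure concrete, here is one standard formulation, offered as a sketch rather than a description of any particular implementation (the symbols \(w_{ab}\), \(W_a\), and \(n_{ab}\) are introduced here for illustration).
Writing \(w_{ab}\) for the number of times \(a\) was preferred to \(b\), the log-likelihood of the observations is:&lt;/p&gt;

\[\ell(p) = \sum_{a \neq b} w_{ab} \left[ \log p_a - \log(p_a + p_b) \right]\]

&lt;p&gt;One common way to maximize it is the iterative update below, where \(W_a = \sum_b w_{ab}\) is the total number of wins for \(a\) and \(n_{ab} = w_{ab} + w_{ba}\) is the number of comparisons between \(a\) and \(b\), followed by rescaling \(p\) to sum to 1:&lt;/p&gt;

\[p_a \leftarrow \frac{W_a}{\sum_{b \neq a} \frac{n_{ab}}{p_a + p_b}}\]

&lt;p&gt;As a sanity check, with two items where \(a\) is preferred in 3 of 4 comparisons, a single update from any positive starting point gives \(p_a : p_b = 3 : 1\), so that \(P(a &amp;gt; b) = 0.75\), matching the observed frequency.
Each pass touches every observed pair once, which is where the \(O(nm)\) cost comes from.&lt;/p&gt;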

&lt;p&gt;Bradley-Terry methods are popular in the academic literature, as they lend themselves well to evaluation and simulation – one can begin with a fictional “ground truth,” run a simulated voting process, and then evaluate how well the recovered weights align with the initial ground truth.
These types of simulations, while rigorous in the statistical sense, tell us less than we might like about how these models perform under real-world conditions.&lt;/p&gt;

&lt;h3 id=&quot;spectral-methods&quot;&gt;Spectral Methods&lt;/h3&gt;

&lt;p&gt;Spectral methods, the most famous of which is Google’s PageRank, aggregate pairwise inputs into a “graph” of interactions, and then take the weights from the graph’s &lt;em&gt;principal eigenvector&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;In linear algebra, an &lt;em&gt;eigenvector&lt;/em&gt; (“self-vector”) of a graph or matrix \(X\) is a nonzero vector \(v\) which \(X\) merely rescales, by a scalar \(\lambda\) known as its &lt;em&gt;eigenvalue&lt;/em&gt;:&lt;/p&gt;

\[Xv = \lambda v\]

&lt;p&gt;We can interpret the principal eigenvector (the one with the largest eigenvalue) as the “direction” in which the graph “points,” a kind of “center” or “steady state” of the data (&lt;a href=&quot;https://en.wikipedia.org/wiki/PageRank#/media/File:Page_rank_animation.gif&quot;&gt;see visualization&lt;/a&gt;).
Techniques for decomposing a graph into these components are known as “spectral methods” after the “spectrum” of latent values they reveal (just as white light is divided into a “spectrum” of constituent colors).&lt;/p&gt;
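
&lt;p&gt;A small worked example may help.
Assume, purely for illustration, a two-item matrix \(X\) whose columns each sum to 1, so that entry \(X_{ij}\) can be read as the share of weight flowing from item \(j\) to item \(i\) (this convention is an assumption of the example, not a property of every spectral method).
The vector \(v = (0.75, 0.25)\) is then an eigenvector with eigenvalue \(\lambda = 1\):&lt;/p&gt;

\[X = \begin{pmatrix} 0.9 &amp;amp; 0.3 \\ 0.1 &amp;amp; 0.7 \end{pmatrix}, \qquad Xv = \begin{pmatrix} 0.9 \cdot 0.75 + 0.3 \cdot 0.25 \\ 0.1 \cdot 0.75 + 0.7 \cdot 0.25 \end{pmatrix} = \begin{pmatrix} 0.75 \\ 0.25 \end{pmatrix} = 1 \cdot v\]

&lt;p&gt;Repeatedly applying \(X\) to any starting distribution converges to this vector, which is the idea behind the power method: starting from \((0.5, 0.5)\), successive products give \((0.6, 0.4)\), \((0.66, 0.34)\), \((0.696, 0.304)\), and so on, approaching \((0.75, 0.25)\).
Read as weights, the first item would receive 75% of the pool and the second 25%.&lt;/p&gt;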

&lt;p&gt;&lt;strong&gt;Ontologically&lt;/strong&gt;, spectral methods invert the Bradley-Terry model by taking &lt;em&gt;interactions&lt;/em&gt; as the only knowable reality; weights are understood as &lt;em&gt;summary statistics&lt;/em&gt;, not hidden truths.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Computationally&lt;/strong&gt;, spectral methods are \(O(k^3 + n)\) in the &lt;em&gt;number of items&lt;/em&gt; \(k\) and the &lt;em&gt;number of matchups&lt;/em&gt; \(n\), requiring \(O(k^2)\) observations.
That the dominant cost grows with the number of items, not the number of votes, makes this technique shine in settings where many votes are cast on a small number of items.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Note: spectral methods naively require \(O(k^2)\) observations, but smart pair selection can reduce this to \(O(k)\) (see “&lt;a href=&quot;#active-ranking&quot;&gt;active ranking&lt;/a&gt;” below).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Spectral methods have a history dating back to German mathematician &lt;a href=&quot;https://en.wikipedia.org/wiki/Perron%E2%80%93Frobenius_theorem&quot;&gt;Edmund Landau&lt;/a&gt;, who in 1915 described how they could be used to judge the outcomes of competitions.
In this setting, “votes” emerge &lt;em&gt;endogenously&lt;/em&gt; from interactions among the items themselves (i.e. two teams competing &lt;em&gt;with each other&lt;/em&gt;, two websites linking &lt;em&gt;to each other&lt;/em&gt;).
In 2018, inspired by projects like &lt;a href=&quot;https://pol.is&quot;&gt;Pol.is&lt;/a&gt; and &lt;a href=&quot;https://allourideas.org/&quot;&gt;All Our Ideas&lt;/a&gt;, my colleagues at Colony and I &lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3317445&quot;&gt;extended these techniques to the domain of social choice&lt;/a&gt;, modeling votes emerging &lt;em&gt;exogenously&lt;/em&gt; as voter judgments.
This work led to the development of &lt;a href=&quot;https://pairwise.vote/&quot;&gt;pairwise.vote&lt;/a&gt;, the preferred pairwise interface for web3 public goods funding, used in both Optimism’s RetroPGF and the first iteration of this year’s Deep Funding initiative.&lt;/p&gt;

&lt;p&gt;Among researchers, spectral methods were historically neglected relative to Bradley-Terry until the &lt;a href=&quot;https://arxiv.org/pdf/1209.1688&quot;&gt;2015 Rank Centrality paper&lt;/a&gt; demonstrated both their statistical equivalence and computational advantages.
The key theoretical difference between the models is that spectral methods evaluate relationships &lt;em&gt;globally&lt;/em&gt; and leverage transitive relationships not explicitly observed: if A beats B, and B beats C, then a spectral method can infer that A likely beats C.
This propagation of signal creates more complex interactions, but also allows these methods to produce better results with the same information.
Spectral methods are also robust against cycles and other intransitive relationships, for which they produce ties, not contradictions (see “&lt;a href=&quot;#intransitivity-of-preferences&quot;&gt;intransitivity of preferences&lt;/a&gt;” below).&lt;/p&gt;

&lt;p&gt;For these reasons, we believe spectral methods are a &lt;em&gt;strong choice&lt;/em&gt; for capital allocation, which cares about modeling relationships across an entire ecosystem.&lt;/p&gt;

&lt;h3 id=&quot;deep-funding&quot;&gt;Deep Funding&lt;/h3&gt;

&lt;p&gt;Both Bradley-Terry and spectral methods have a major drawback: they are data-hungry, requiring votes on the order of \(O(k^2)\), the square of the number of items \(k\).
These data requirements make pairwise methods difficult to use in practice, since the number of votes needed grows much faster than the number of items.&lt;/p&gt;

&lt;p&gt;Deep Funding is an alternative technique proposed by &lt;a href=&quot;https://x.com/VitalikButerin/status/1867886974058520820&quot;&gt;Vitalik Buterin in late 2024&lt;/a&gt;, in which pairwise judgments are not used to generate weights directly, but rather to score competing weight &lt;em&gt;proposals&lt;/em&gt;, which are produced externally.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Note: this initiative is unrelated to &lt;a href=&quot;https://deepfunding.ai/&quot;&gt;SingularityNET’s Deep Funding&lt;/a&gt; program.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Under the slogan of “&lt;a href=&quot;https://vitalik.eth.limo/general/2025/02/28/aihumans.html&quot;&gt;AI as the engine, humans as the steering wheel&lt;/a&gt;,” Buterin introduces a model of social choice in which the weight proposals are produced &lt;em&gt;at scale&lt;/em&gt; by machines, with a relatively small amount of human input being used to score proposals.
This reframing of the problem – and of the role of voters – aims to produce better results with less human input, and can be seen as an alternative response to the attention problem we introduced earlier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ontologically&lt;/strong&gt;, Deep Funding treats weights as &lt;em&gt;untrusted proposals&lt;/em&gt; and treats juror inputs as &lt;em&gt;samples of reality&lt;/em&gt; used to score and select the best proposals.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Computationally&lt;/strong&gt;, Deep Funding is \(O(nw)\) in the &lt;em&gt;number of votes&lt;/em&gt; \(n\) and the &lt;em&gt;number of proposals&lt;/em&gt; \(w\), requiring data on the order of \(O(k)\) in the &lt;em&gt;number of items&lt;/em&gt;.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Given that weights are not derived from votes directly, Deep Funding may be better understood as a meta-algorithm than as a pairwise algorithm proper.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Unlike the \(O(k^2)\) vote requirement of Bradley-Terry and the spectral methods, Deep Funding aims to produce results with only \(O(k)\) votes and a small constant factor, on the order of \(k/10\), or &lt;em&gt;one vote for every ten items&lt;/em&gt;.
This reduction in complexity from &lt;em&gt;quadratic&lt;/em&gt; to &lt;em&gt;linear&lt;/em&gt; means that Deep Funding methods can be applied to problems with many items, for which it would not be feasible to gather a large number of pairwise judgments.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Note that this smaller data requirement &lt;em&gt;does not&lt;/em&gt; take into account the computation needed to create the weight proposals themselves.&lt;/p&gt;
&lt;/blockquote&gt;
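
&lt;p&gt;To make the gap concrete, here is a rough comparison for a hypothetical pool of \(k = 1000\) items, taking the \(k/10\) figure above at face value and counting every distinct pair once for the quadratic case:&lt;/p&gt;

\[\frac{k(k-1)}{2} = \frac{1000 \cdot 999}{2} = 499{,}500 \qquad \text{versus} \qquad \frac{k}{10} = 100\]

&lt;p&gt;These are illustrative counts, not measured requirements: the quadratic figure is the number of possible pairs rather than a vote target, and the linear figure inherits whatever uncertainty surrounds the \(k/10\) estimate.&lt;/p&gt;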

&lt;p&gt;Deep Funding’s central drawback is that the \(O(k)\) votes are only enough to &lt;em&gt;score&lt;/em&gt; proposals, not to &lt;em&gt;produce&lt;/em&gt; them.
Ultimately, the method can only produce results as good as the proposals it receives, introducing new vectors of complexity, risk, and disruption.&lt;/p&gt;

&lt;h2 id=&quot;2-pair-selection&quot;&gt;2. Pair Selection&lt;/h2&gt;

&lt;p&gt;As mentioned earlier, a challenge of using Bradley-Terry and spectral methods is that they require a quadratic number of votes for a given number of items.
As a result of this explosion of interactions, producing weights for even 50 items might call for 2,000 votes – a big ask, especially for communities &lt;em&gt;already struggling&lt;/em&gt; with voter apathy.&lt;/p&gt;

&lt;p&gt;Developing techniques for managing this scale will be key to bringing pairwise methods into the mainstream.&lt;/p&gt;

&lt;p&gt;Fortunately, two practical approaches already exist: star grouping and active ranking.&lt;/p&gt;

&lt;h3 id=&quot;star-grouping&quot;&gt;Star Grouping&lt;/h3&gt;

&lt;p&gt;This approach, developed by the team at &lt;a href=&quot;https://www.generalmagic.io/&quot;&gt;General Magic&lt;/a&gt;, introduces a pre-filtering step in which voters score each project on a scale of one to five stars.
These scores then bucket projects into one of five tiers, with pairwise comparisons being made only between projects of the same tier.
This reduces the number of votes required by roughly 80% (when projects are spread evenly across the tiers), since the quadratic growth in pairs now occurs within smaller sets.&lt;/p&gt;
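
&lt;p&gt;As a back-of-the-envelope check on the 80% figure, assume (hypothetically) that the \(k\) projects split evenly across the five tiers, so that each tier holds \(k/5\) projects. Counting distinct pairs within tiers and comparing against all pairs:&lt;/p&gt;

\[5 \cdot \frac{(k/5)(k/5 - 1)}{2} \approx \frac{k^2}{10} \qquad \text{versus} \qquad \frac{k(k-1)}{2} \approx \frac{k^2}{2}\]

&lt;p&gt;The ratio is about \(1/5\), which is the 80% reduction. For \(k = 50\), that is \(5 \cdot 45 = 225\) within-tier pairs against \(1{,}225\) overall. Uneven tiers reduce the savings, since a single crowded tier dominates the count.&lt;/p&gt;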

&lt;p&gt;Star grouping can also be adapted to settings where the categorization is not of &lt;em&gt;quality&lt;/em&gt;, but of &lt;em&gt;type&lt;/em&gt;.
In cases where the items being considered are not all of the same type, star grouping can cluster like items together, clarifying downstream voting.&lt;/p&gt;

&lt;h3 id=&quot;active-ranking&quot;&gt;Active Ranking&lt;/h3&gt;

&lt;p&gt;Another way to reduce the number of votes needed is a technique called “active ranking,” an adaptation of the “&lt;a href=&quot;https://www.cs.cornell.edu/people/tj/publications/yue_etal_09a.pdf&quot;&gt;dueling bandits&lt;/a&gt;” algorithm from machine learning.&lt;/p&gt;

&lt;p&gt;Active ranking works by surfacing pairs not at random, but based on the &lt;em&gt;uncertainty&lt;/em&gt; of that pair.
The intuition is that among the entire set of items, some pairwise judgments are more “obvious” than others, and so don’t need to be voted on (e.g. bear vs rabbit).
Active ranking directs scarce voter attention to the pairs which are the most ambiguous, and thus provide the most &lt;em&gt;information&lt;/em&gt; (e.g. bear vs lion).&lt;/p&gt;

&lt;p&gt;In theory, we can imagine needing as few as \(O(k)\) votes, albeit with a large constant factor – an entirely different complexity class from \(O(k^2)\).&lt;/p&gt;

&lt;p&gt;To give intuition for why this should be possible, observe that any weighting of \(k\) items can be expressed as a set of \(k-1\) scalars (up to an overall scaling factor), each representing the &lt;em&gt;ratio&lt;/em&gt; between two adjacent items:&lt;/p&gt;

\[[.1, .2, .3, .4] \rightarrow [2, 3/2, 4/3]\]

&lt;p&gt;This generalizes to an arbitrary number of weights and suggests to us that we can &lt;em&gt;in principle&lt;/em&gt; construct \(k\) weights with only \(k-1\) human inputs.
While this limit is not achievable in practice, we can attempt to approach it, reducing the data requirement to some multiple of the number of items, e.g. \(10k\), by directing attention towards the subset of pairs which are most competitive with each other.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Note: the estimate of 10 votes per item is speculative and depends on the specific structure of the graph; real-world performance will need to be evaluated in future research.&lt;/p&gt;
&lt;/blockquote&gt;
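
&lt;p&gt;To see how the adjacent ratios recover the weights, write \(r_i = w_{i+1}/w_i\) (notation introduced here for illustration). Fixing \(w_1 = 1\) as the arbitrary scale, each weight is a running product of ratios, and normalizing gives back the original percentages:&lt;/p&gt;

\[w_i \propto \prod_{j=1}^{i-1} r_j \quad\Rightarrow\quad [1,\ 2,\ 2 \cdot \tfrac{3}{2},\ 2 \cdot \tfrac{3}{2} \cdot \tfrac{4}{3}] = [1, 2, 3, 4] \rightarrow [.1, .2, .3, .4]\]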

&lt;p&gt;Implementing active ranking is surprisingly straightforward.
For every pair of items \(a, b\) we construct a Beta distribution \(p_{ab} \sim Beta(votes[a,b], votes[b,a])\) representing the ambiguity of the pair, and then sample based on \(Var(p_{ab})\).
This variance will be high for pairs with few or mixed observations, and low for pairs in which one item is repeatedly preferred – directing voter attention to where it is most valuable.
Calculating these distributions can be done iteratively, with the relevant distribution being updated in constant time after each vote.&lt;/p&gt;
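
&lt;p&gt;For reference, the variance of a Beta distribution has a closed form. The sketch below assumes one pseudo-count is added to each side, i.e. \(\alpha = votes[a,b] + 1\) and \(\beta = votes[b,a] + 1\); this is an assumption made here for illustration, since a Beta distribution is undefined when either parameter is zero, as it would be for an unseen pair:&lt;/p&gt;

\[Var(p_{ab}) = \frac{\alpha \beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}\]

&lt;p&gt;Under that assumption, an unseen pair (\(\alpha = \beta = 1\)) has variance \(1/12 \approx 0.083\), a pair split 5 to 5 (\(\alpha = \beta = 6\)) has variance \(\approx 0.019\), and a pair decided 10 to 0 (\(\alpha = 11, \beta = 1\)) has variance \(\approx 0.006\), matching the ordering described above.&lt;/p&gt;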

&lt;p&gt;An optional extension to active ranking would be to sample not only by \(Var(p_{ab})\), but by \(Var(p_{ab}) \cdot w_a w_b\), effectively upsampling the higher-weighted items.
The idea here is that not only do we want to direct voter attention to the most &lt;em&gt;ambiguous&lt;/em&gt; pairs, but to the most ambiguous pairs that are &lt;em&gt;also&lt;/em&gt; the highest-value, as this is where the attention will have the greatest marginal impact on allocations.
Extending active ranking in this way requires more computation, as weights will need updating after each vote, but in practice weights can almost certainly be cached and updated periodically, amortizing computation without significantly degrading performance.&lt;/p&gt;
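
&lt;p&gt;One way to read “sample by” (an interpretation, not something the method fixes) is to draw each pair with probability proportional to its score. Writing \(s_{ab}\) for the score, a symbol introduced here, and summing over all unordered pairs \(\{c, d\}\):&lt;/p&gt;

\[s_{ab} = Var(p_{ab}) \cdot w_a w_b, \qquad P(\text{show } a, b) = \frac{s_{ab}}{\sum_{\{c,d\}} s_{cd}}\]

&lt;p&gt;Treating every weight as equal recovers the unweighted version. Other readings, such as always showing the single highest-scoring pair, fit the description equally well.&lt;/p&gt;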

&lt;p&gt;Compared to star grouping, active ranking has several advantages.
First, active ranking permits \(O(k)\) comparisons &lt;em&gt;total&lt;/em&gt;, while star grouping implies \(O(k^2)\) votes &lt;em&gt;per tier&lt;/em&gt;.
Second, active ranking can be run transparently in the background, while star grouping requires an explicit voting step.
Third, active ranking allows for all projects to be compared, whereas star grouping precludes comparison between groups.&lt;/p&gt;

&lt;p&gt;Note that active ranking is an &lt;em&gt;online&lt;/em&gt; process – the order of votes &lt;em&gt;does matter&lt;/em&gt; for determining which pairs are more likely to be shown, although the final weight production remains a batch process using all available data.
This online nature of active ranking creates a new attack vector, as prior votes now influence the likelihood of future items being shown.
Mitigating this risk will be an important part of bringing active ranking into production.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; For an account of implementing active ranking in practice, see “&lt;a href=&quot;/blog/2026/03/29/implementing-active-ranking.html&quot;&gt;Implementing Active Ranking&lt;/a&gt;”.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2 id=&quot;3-interface-design&quot;&gt;3. Interface Design&lt;/h2&gt;

&lt;p&gt;Another key consideration is &lt;em&gt;the design of the voting interface&lt;/em&gt;.
A data analysis pipeline is only as good as the quality of the data it analyzes.
More so than with other methods, the interface of a pairwise process has significant impact on the quality of the data collected.&lt;/p&gt;

&lt;p&gt;To illustrate this, imagine a hypothetical “bad” interface, showing only the names of the items being compared.
With this interface, participants must make decisions based only on their pre-existing associations, resulting in noisy and unreliable data.
In an even more extreme example, imagine a user deciding between two random strings – a process which produces &lt;em&gt;pure noise&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Taking it in the other direction, can you imagine an interface profoundly better than any you’ve seen before?
Why or why not?&lt;/p&gt;

&lt;p&gt;Ultimately, the performance of a pairwise voting system is downstream of the interface.&lt;/p&gt;

&lt;h3 id=&quot;voting-and-session-times&quot;&gt;Voting and Session Times&lt;/h3&gt;

&lt;p&gt;Before getting into the specifics of UI elements, we should ask ourselves a basic “product” question: &lt;em&gt;how long&lt;/em&gt; should voters spend evaluating a given pair?
With a target time in mind, we can work backwards to make decisions about visual design.&lt;/p&gt;

&lt;p&gt;In discussion, practitioners have proposed target decision times of as short as 5 seconds to as long as 5 minutes.
&lt;a href=&quot;/blog/2025/08/03/deepfunding-jury-analysis.html&quot;&gt;Analysis of Deep Funding voting data&lt;/a&gt; suggests that the “sweet spot” is about 30 seconds.
More time does not result in meaningfully better judgments, and reduces the total number of pairs submitted.
Less time increases the likelihood of a random choice, and thus of measurement error.&lt;/p&gt;

&lt;p&gt;Working backwards from this 30-second target, we can design an interface to provide as much information as can be processed in that time frame.&lt;/p&gt;

&lt;p&gt;Beyond individual votes, we can also think about the voting “session” as another type of product.
If we see voting on a single pair as the atom of a pairwise process, 30-second decisions bundle naturally into a 5-minute voting session, yielding 10 new votes.
The 5-minute session becomes the unit of engagement – light enough to be completed on the train or over coffee, yet substantial enough to move the process forward.&lt;/p&gt;

&lt;h3 id=&quot;input-format&quot;&gt;Input Format&lt;/h3&gt;

&lt;p&gt;Another key question is whether voters will be asked to make &lt;em&gt;ordinal&lt;/em&gt; judgments (A better than B) or &lt;em&gt;cardinal&lt;/em&gt; judgments (A 3x better than B).
While cardinal judgments are appealing at first glance, offering the promise of &lt;em&gt;more signal&lt;/em&gt;, that promise is often unfulfilled.
It is typically the case that in &lt;em&gt;untrained audiences&lt;/em&gt;, measurements of &lt;em&gt;psychic intensity&lt;/em&gt; are more likely than not to be measurements of &lt;em&gt;individual mood or personality&lt;/em&gt;, yielding mostly noise and little signal.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/blog/2025/08/03/deepfunding-jury-analysis.html&quot;&gt;An analysis of Deep Funding’s initial juror data&lt;/a&gt; showed that while the perceived numeric ratio of two projects’ impact varied widely, the perceived &lt;em&gt;direction&lt;/em&gt; of the impact (whether A or B overall was more important) was remarkably consistent – between 63% and 89% agreement, depending on the set of items being evaluated.
This implies that while voters struggled to provide consistent numeric evaluations of the relative impact between pairs of projects, they could more consistently determine which of two projects was more impactful &lt;em&gt;overall&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;It is, of course, not so simple, and questions of cardinality and ordinality continue to be explored by the academic community.
&lt;a href=&quot;https://arxiv.org/pdf/2504.14716&quot;&gt;One recent paper&lt;/a&gt; on LLM evaluation found that models which presented results with “distractions” like excess enthusiasm were more likely to be chosen in ordinal matchups, while both neutral and “distracting” responses received similar cardinal ratings, suggesting that, at least in the case of LLM judgment, cardinal methods may be robust in ways that ordinal methods are not.
&lt;a href=&quot;https://pmc.ncbi.nlm.nih.gov/articles/PMC9586273/pdf/pnas.202210412.pdf&quot;&gt;Another recent study&lt;/a&gt; found that over long periods of time, cardinal measures of personal wellbeing were correlated with objective outcomes (such as moving to a new neighborhood if unhappy where you live), suggesting that cardinal judgments are not always noisy or idiosyncratic, but consistently capture sentiment at scale.&lt;/p&gt;

&lt;p&gt;The question of cardinal vs ordinal judgments is far from settled, and much depends on the specifics of the problem and the audience.
However, it does seem that in our setting of decentralized capital allocation by large heterogeneous groups of voters, ordinal judgments should be, at the very least, the default.&lt;/p&gt;

&lt;h3 id=&quot;visual-elements&quot;&gt;Visual Elements&lt;/h3&gt;

&lt;p&gt;When it comes to visual design, the Pairwise team has arguably gone the furthest.
Consider the interface they developed for the Deep Funding pilot:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/img/pairwise-ui.png&quot; alt=&quot;Pairwise Sample Interface&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Here we see a number of design elements:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Each item clearly indicated by name and logo&lt;/li&gt;
  &lt;li&gt;A set of curated and domain-specific metrics&lt;/li&gt;
  &lt;li&gt;An AI-based textual summary of the project&lt;/li&gt;
  &lt;li&gt;The ability to choose either item, or to skip&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These elements form the foundation of an effective decision process.
The metrics enable immediate, quantitative contrast between the two items, while the text summary gives the voter deeper context on individual items.
Letting the voter skip pairs reduces the incidence of bad data, in cases where a voter genuinely cannot differentiate between two items.&lt;/p&gt;

&lt;p&gt;Further, this approach leverages both &lt;em&gt;metrics&lt;/em&gt; and &lt;em&gt;AI judgment&lt;/em&gt;, while leaving the final decision in the hands of human voters.
The key insight is that while the metrics and summaries may be the same for every project, individual voters &lt;em&gt;qualitatively&lt;/em&gt; integrate the information differently, yielding richer results than would be possible by allocating funds by metric or AI judgment &lt;em&gt;directly&lt;/em&gt;.
Pairwise methods are more robust to &lt;a href=&quot;https://en.wikipedia.org/wiki/Goodhart%27s_law&quot;&gt;Goodhart’s Law&lt;/a&gt;-style failures, common among metric and AI-based approaches, in which projects learn to “game the system” by tailoring their self-representation to more narrow and mechanical decision criteria.&lt;/p&gt;

&lt;p&gt;Here is another interface example, &lt;a href=&quot;https://deepfundingjury.com/evaluation&quot;&gt;Verdict&lt;/a&gt;, recently developed by the Deep Funding team:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/img/deepfundingjury-ui.png&quot; alt=&quot;Deep Funding Jury&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This interface is organized around the idea of “high quality bits” and reflects the team’s priors around data quality and their desire for fewer, highly selected votes.
This interface asks voters to make cardinal assessments of relative value, and asks for written explanations for their decisions.
In addition, the interface states that all votes will be reviewed by “meta-jurors” and potentially rejected if deemed unjustified.&lt;/p&gt;

&lt;p&gt;This interface was developed following the initial pilot, which used the Pairwise.vote interface presented previously.
When analyzing the data from that pilot, the Deep Funding team found that their cardinal rankings varied significantly in scale and range per-voter.
Rather than adopt ordinal inputs, which are more robust to voter idiosyncrasies, they chose to increase gatekeeping as a way of maintaining data quality.
The new interface retains the use of AI summaries, but omits the metrics featured in the earlier iteration, pushing voters to rely more heavily on textual summaries.&lt;/p&gt;

&lt;p&gt;Finally, here is a mockup of an interface I would consider close to ideal, focusing on simplicity and comprehensibility:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/img/pairwise-mock.png&quot; alt=&quot;Pairwise Mock UI&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This interface leans into ordinal inputs and prioritizes speed and access over gatekeeping, while leveraging AI summarization and metrics to standardize decision framing.&lt;/p&gt;

&lt;h2 id=&quot;4-audience-development&quot;&gt;4. Audience Development&lt;/h2&gt;

&lt;p&gt;The next key consideration when running a pairwise voting process is audience development.&lt;/p&gt;

&lt;h4 id=&quot;audience-selection&quot;&gt;Audience Selection&lt;/h4&gt;

&lt;p&gt;Pairwise methods are well-suited to the setting of &lt;em&gt;distributed capital allocation by a heterogeneous audience&lt;/em&gt;, with many different people coming together to distribute shared resources.
They succeed by directing participant attention efficiently over a range of items, without requiring deep prior knowledge of the items in question.&lt;/p&gt;

&lt;p&gt;Pairwise methods are less effective when used by a &lt;em&gt;small group of experts&lt;/em&gt;, who are more likely to have &lt;em&gt;strong prior knowledge&lt;/em&gt; of the items in question.
For this audience, which has &lt;em&gt;already performed&lt;/em&gt; the cognitive labor of forming opinions, assigning weights directly might be more efficient and more natural.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;To give a concrete example, during the Deep Funding pilot a senior Ethereum community member criticized the process on the grounds that they had already formed clear opinions about the right weights for individual projects, and that the pairwise process made it harder to express those opinions.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It is also possible to have an audience which lacks a &lt;em&gt;minimum baseline&lt;/em&gt; of context, such that even with a quality interface, they are unable to meaningfully distinguish between items.&lt;/p&gt;

&lt;p&gt;The ideal audience, then, is a &lt;em&gt;large and diverse group of people with at least some domain knowledge&lt;/em&gt;, able to make judgments on never-before-seen pairs after about 30 seconds of absorbing information.
This group will be able to provide the most &lt;em&gt;net&lt;/em&gt; information, in terms of both &lt;em&gt;quantity&lt;/em&gt; and &lt;em&gt;quality&lt;/em&gt; of inputs.&lt;/p&gt;

&lt;h4 id=&quot;audience-expectations&quot;&gt;Audience Expectations&lt;/h4&gt;

&lt;p&gt;Relatedly, it is important that participant &lt;em&gt;expectations&lt;/em&gt; be set appropriately.&lt;/p&gt;

&lt;p&gt;Pairwise methods don’t ask participants for weights directly, but infer them &lt;em&gt;indirectly&lt;/em&gt; using an algorithmic process.
For audiences used to assigning values directly, “giving up control” of the weights can be disorienting.
After Optimism’s RetroPGF 3, General Magic &lt;a href=&quot;https://gov.optimism.io/t/pairwise-retrospective-and-proposed-spec-for-retropgf-4/7479&quot;&gt;interviewed participants&lt;/a&gt; and found that while participants found the system engaging and an aid in discovery, they also struggled with the shift in expectations of not setting weights directly.&lt;/p&gt;

&lt;p&gt;Communicating the behavior of the system, and making the end-to-end process legible for participants, is key to building trust in these methods.&lt;/p&gt;

&lt;h4 id=&quot;audience-segmentation&quot;&gt;Audience Segmentation&lt;/h4&gt;

&lt;p&gt;Every voting process needs some way of segmenting and evaluating its participants to prevent vote manipulation.
At minimum, this means some form of &lt;a href=&quot;https://en.wikipedia.org/wiki/Sybil_attack&quot;&gt;sybil resistance&lt;/a&gt;, whether tokens, identity “passports,” or webs-of-trust built off of pairwise attestation.
Choosing the right solution will again be a function of problem and audience.&lt;/p&gt;

&lt;p&gt;The dual of sybil resistance is making better use of expert attention through reputation systems.
Just as much as we discard fraudulent votes, we can give leverage to experts by assigning their inputs a higher weight.
One might imagine a process in which the majority of the community engages early, making a “first pass” over the items, to be followed by expert assessment of only the most ambiguous pairs.&lt;/p&gt;

&lt;p&gt;Reputation and identity is a deep field and beyond the scope of this essay, but it is worth gesturing towards how these ideas might be combined.&lt;/p&gt;

&lt;h4 id=&quot;audience-compensation&quot;&gt;Audience Compensation&lt;/h4&gt;

&lt;p&gt;While pairwise voting processes are often seen as more inherently engaging than other forms of voting, it would be naive to expect ongoing voter participation without some external incentive or motivation – financial, relational (status), or both.&lt;/p&gt;

&lt;p&gt;Financial rewards are straightforward, but can incentivize mercenary behavior and, being zero-sum, create a drag on resources.
Non-financial rewards, such as digital collectibles which double as attestations for a public goods funding reputation system, represent an underexplored and positive-sum alternative.&lt;/p&gt;

&lt;h1 id=&quot;iv-continuous-funding&quot;&gt;IV. Continuous Funding&lt;/h1&gt;

&lt;p&gt;Possibly the most significant application of pairwise methods is the idea of “continuous funding.”&lt;/p&gt;

&lt;p&gt;Conventionally, most public goods funding takes place via “rounds.”
Each round is a distinct event, featuring a slightly different set of projects and requiring an entirely new set of votes.
Putting on a grants round is a large lift, consuming many hours of administrative time on the part of both the sponsoring organization and the projects being considered, and of voting time on the part of participants.&lt;/p&gt;

&lt;p&gt;Much of this labor is redundant, as both projects and voters tend to be similar round-over-round.
&lt;a href=&quot;/data/gg-rankings.csv&quot;&gt;Looking at data&lt;/a&gt; from Gitcoin Grant rounds 20 and 22, which took place six months apart in April and November 2024 (disclosure: I participated in both rounds), we see that 52% of the top 25 projects in their “dApps and Apps” category carried over between rounds.
If an ecosystem is evolving only 50% between rounds, then the round-based approach is wasting half of both its administrative input and voter attention.&lt;/p&gt;

&lt;p&gt;Instead of discrete rounds, which make inefficient use of both administrative and voter inputs, public goods funding should move towards long-lived continuous processes, in which valuable administrative and voter inputs are leveraged over longer periods of time.&lt;/p&gt;

&lt;p&gt;Continuous funding processes also allow for a more relaxed attitude towards “correctness,” as the opportunity to course-correct is built in from the get-go.
Rather than feel pressured to “get it right” the first time, it becomes possible to “throw something out there” with the confidence that allocations will be updated in response to new information, and that over time the resources will flow towards the point of greatest impact.
This “cybernetic” or evolutionary approach to capital allocation – focusing less on static point solutions and more on self-correcting processes – promises to be more robust and resilient than the high-cost, round-based approaches dominant today.&lt;/p&gt;

&lt;p&gt;This idea of continuous funding has been explored before, with pilots like GeoWeb’s &lt;a href=&quot;https://github.com/Geo-Web-Project/streaming-quadratic-funding&quot;&gt;Streaming Quadratic Funding&lt;/a&gt;, and more recently Octant’s &lt;a href=&quot;https://streamvote.octant.build/&quot;&gt;StreamVote&lt;/a&gt;, combining quadratic voting with Superfluid’s &lt;a href=&quot;https://superfluid.org/&quot;&gt;continuous payments infrastructure&lt;/a&gt; to distribute funds in real-time.&lt;/p&gt;

&lt;p&gt;However, &lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4856267&quot;&gt;apart from one notable multi-year real-world example&lt;/a&gt;, to date no continuous public goods funding process has been built on top of pairwise inputs.
For reasons explored below, we believe that pairwise inputs offer a stronger basis for continuous funding, converting scarce attention into actionable allocations at high efficiency.&lt;/p&gt;

&lt;h3 id=&quot;permissionless-entry&quot;&gt;Permissionless Entry&lt;/h3&gt;

&lt;p&gt;Most public goods funding rounds have high overheads, with staff needed to screen projects, run comms, and audit performance.
This high overhead ultimately consumes funds which could otherwise be directed to projects directly, creating inefficiencies which threaten to undermine the legitimacy of the process entirely.
A fully permissionless continuous funding process would enable capital allocation at lower cost by reducing or eliminating this overhead.&lt;/p&gt;

&lt;p&gt;In a permissionless process, projects add themselves to the pool by putting up a stake.
Once in the pool, the active ranking process would surface high-variance new projects, allowing them to quickly “find their level” of funding.
Projects that perform poorly – say, more than three standard deviations below the mean – would be evicted from the pool and lose their stake.&lt;/p&gt;

&lt;p&gt;See below for a visual intuition for this process:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/img/continuous-funding.png&quot; alt=&quot;Continuous Funding&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Pairwise methods are well-suited for this type of permissionless, self-correcting process, due to the logically “self-contained” nature of the pairwise judgment (i.e. a choice of A vs B does not depend on C).
In contrast, quadratic “votes” are logically conditioned on the &lt;em&gt;entire set&lt;/em&gt; of items (i.e. I might give $10 to A on its own, but only $5 in the presence of B, which gets the other $5), making quadratic methods less naturally adapted to a continuous setting in which items are frequently being added and removed.
This limitation creates frictions which must be overcome.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;To give a concrete example, StreamVote &lt;a href=&quot;https://x.com/OctantApp/status/1999558961624965309&quot;&gt;performed AI screening of projects before allowing them to claim funding&lt;/a&gt;, winnowing the final set of projects from 3,980 down to 17 – a 99.6% rejection rate, and a significant exercise of administrative power.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In contrast, an always-on pairwise process with active ranking can &lt;em&gt;itself&lt;/em&gt; serve to screen and filter a dynamic pool of candidates, eliminating a costly administrative task.&lt;/p&gt;

&lt;h3 id=&quot;stale-voting-data&quot;&gt;Stale Voting Data&lt;/h3&gt;

&lt;p&gt;As discussed previously, one of the advantages of pairwise judgments is that they can be meaningfully re-used over time.
However, while &lt;em&gt;in theory&lt;/em&gt; a vote on a given pair is valid in perpetuity, &lt;em&gt;in practice&lt;/em&gt; we should be cautious of relying on stale data.
In reality, projects are always evolving, and a preference for A over B may become increasingly inaccurate if A stagnates while B thrives.&lt;/p&gt;

&lt;p&gt;One approach to balancing this trade-off is a decay process in which votes lose their impact over time, perhaps decaying to zero over a two-year period.
There are many possible decay curves, all reflecting different assessments of how fast this particular reality changes.&lt;/p&gt;
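
&lt;p&gt;Two candidate curves, written as a multiplier \(d(t)\) on a vote of age \(t\), are sketched below. Both are illustrative choices, and the symbols \(T\) and \(h\) are introduced here: a linear decay that reaches zero at horizon \(T\), and an exponential decay with half-life \(h\) that approaches zero without reaching it.&lt;/p&gt;

\[d_{\text{linear}}(t) = \max\left(0,\ 1 - \frac{t}{T}\right) \qquad d_{\text{exp}}(t) = 2^{-t/h}\]

&lt;p&gt;With \(T = 2\) years, a one-year-old vote counts for half under the linear curve. The exponential curve gives the same value at one year when \(h = 1\), but still assigns a quarter of the original weight at two years.&lt;/p&gt;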

&lt;h3 id=&quot;funding-rates&quot;&gt;Funding Rates&lt;/h3&gt;

&lt;p&gt;The complementary question to vote decay is that of “how fast should we distribute the money?”
Unlike rounds-based approaches, in which the entire pot is given out at the end of the round, continuous processes must make choices about the speed with which they distribute their funds.&lt;/p&gt;

&lt;p&gt;One approach would be to distribute funds at a &lt;em&gt;constant&lt;/em&gt; rate over some period of time.
A four-year target would see 50% given out after two years, and 100% after four years, at which point the process would end (assuming no new funds).&lt;/p&gt;

&lt;p&gt;Another approach, proposed by the Colony team &lt;a href=&quot;https://uploads-ssl.webflow.com/61840fafb9a4c433c1470856/639b50406de5d97564644805_whitepaper.pdf&quot;&gt;in their 2018 whitepaper&lt;/a&gt;, would be to distribute funds at an &lt;em&gt;exponentially decaying&lt;/em&gt; rate, such that funds are distributed according to a “half-life.”&lt;/p&gt;

&lt;p&gt;A one-year half-life would give out half the funds in the first year, a quarter in the second year, an eighth in the third, and so on (assuming no new funds).
Compared to the constant rate, this approach extends the life of the process indefinitely, at the cost of a decreasing funding rate.&lt;/p&gt;
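
&lt;p&gt;The two schedules can be compared through the cumulative fraction of the pot paid out by time \(t\), written \(F(t)\) here (notation introduced for illustration), with \(T\) the target duration and \(h\) the half-life, and assuming no new funds in either case:&lt;/p&gt;

\[F_{\text{constant}}(t) = \min\left(1,\ \frac{t}{T}\right) \qquad F_{\text{half-life}}(t) = 1 - 2^{-t/h}\]

&lt;p&gt;With \(T = 4\) years, the constant schedule has paid out 50% at two years and 100% at four. With \(h = 1\) year, the half-life schedule has paid out 50%, 75%, and 87.5% after one, two, and three years, never quite reaching 100%.&lt;/p&gt;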

&lt;h3 id=&quot;process-tempo&quot;&gt;Process Tempo&lt;/h3&gt;

&lt;p&gt;Throughout this section, we have emphasized the value of reducing the administrative overhead involved in running grant rounds.
However, we acknowledge that even a continuous process might benefit from a periodic “tempo” of participation.
An always-on process is hard to keep top-of-mind, and might see participation lapse over time.&lt;/p&gt;

&lt;p&gt;To counteract this, one could imagine running “campaigns” to periodically refresh voting data at scale, leveraging leaderboards and other incentives to stimulate data-collection.
In the best case, resources which would otherwise have gone to administrative tasks can be redirected towards other purposes, such as commissioning compelling digital collectibles to give as rewards to participants.&lt;/p&gt;

&lt;p&gt;On the payouts side, one could imagine making payouts on a monthly or quarterly basis, or conditioning payouts on milestones determined by a separate governance process.&lt;/p&gt;

&lt;p&gt;The key idea is that user and project experience could be made &lt;em&gt;qualitatively&lt;/em&gt; similar to that of a round-based process &lt;em&gt;where it counts&lt;/em&gt;, while retaining the benefits of a continuous substrate in which allocations update and distribute in real-time.&lt;/p&gt;

&lt;h1 id=&quot;v-evaluation-and-legitimacy&quot;&gt;V. Evaluation and Legitimacy&lt;/h1&gt;

&lt;p&gt;At the start of this essay, we argued that the limitations of pass/fail voting led to a shift in interest towards capital allocation.
Capital allocation, however, has its own challenges: in particular, &lt;em&gt;evaluation&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;As has been well-documented by Metagov’s &lt;a href=&quot;https://grant.metagov.org/&quot;&gt;Grant Innovation Lab&lt;/a&gt;, difficulties in evaluating the effectiveness of grant programs are leading to an erosion of trust in public goods funding overall, and a slow reduction in funding for such programs.
As a result, over the past year the public goods funding community has &lt;a href=&quot;https://ethresear.ch/t/three-fundamental-problems-in-ethereum-public-goods-funding-a-research-agenda/23474&quot;&gt;increasingly begun prioritizing &lt;em&gt;evaluation&lt;/em&gt; over &lt;em&gt;experimentation&lt;/em&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;On some level, evaluation is an impossible and fundamentally speculative task.
If someone wanted to support public goods and could accurately evaluate the future impact of a present investment (or even the present impact of a past investment), they would be better off making billions on Wall Street and starting a foundation.&lt;/p&gt;

&lt;p&gt;Given that nobody is doing that, and that evaluation is on some level &lt;em&gt;made-up&lt;/em&gt;, it’s fair to ask what “evaluation” actually achieves.
We suggest that &lt;em&gt;evaluation produces legitimacy&lt;/em&gt;, which translates concretely into higher future funding inflows.
As Vitalik Buterin famously said, “&lt;a href=&quot;https://vitalik.eth.limo/general/2021/03/23/legitimacy.html&quot;&gt;legitimacy is the scarcest resource&lt;/a&gt;.”&lt;/p&gt;

&lt;p&gt;To adopt ideas from institutional economics, we can imagine evaluation as filling out a currently-barren “policy” layer in the three-layer model of constitution, policy, and operations.
In this frame, the “constitutional” layer represents the code, norms, etc., of the grants ecosystem itself, the “policy” layer represents the process by which specific &lt;em&gt;parameters&lt;/em&gt; (mechanisms, etc.) are chosen, while the “operational” layer represents the grants rounds themselves: applications, voting, etc.
Seen in this way, evaluation is the missing &lt;em&gt;political&lt;/em&gt; process needed to bring web3’s public goods funding ecosystem from experiment into maturity.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://davidgasquez.com/weight-allocation-mechanism-evals/&quot;&gt;How might these evaluations be done?&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One possibility, adapting &lt;a href=&quot;https://arxiv.org/pdf/2403.04132&quot;&gt;techniques used to evaluate large-language models&lt;/a&gt;, is to subject allocations &lt;em&gt;themselves&lt;/em&gt; to pairwise judgments.
In addition to pairwise judgments &lt;em&gt;between projects&lt;/em&gt;, the community can make pairwise judgments &lt;em&gt;between allocations produced by competing mechanisms&lt;/em&gt;.
By comparing the outputs of different allocation processes, a community can set policies for &lt;em&gt;how&lt;/em&gt; resources are distributed: this algorithm vs that, this parameterization vs that.&lt;/p&gt;

&lt;p&gt;Another possibility, popular in the web3 community, is to lean into “info-finance” as a tool for producing high-quality predictions about the future.
Unlike votes, which reflect subjective opinions, financially incentivized predictions can &lt;em&gt;in theory&lt;/em&gt; be treated as objectively reliable sources of information.
Info-finance is not my area, so I cannot comment deeply on the strengths and weaknesses of this design, but these approaches have always struck me as a touch reflexive, with predictions and allocations influencing each other in complex and hard-to-model ways.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;This is not idle speculation. &lt;a href=&quot;https://www.espn.com/espn/betting/story/_/id/47337056/scandals-prediction-markets-2025-turning-point-sports-betting&quot;&gt;The proliferation of sports betting apps has begun affecting the behavior of players&lt;/a&gt;, some of whom exploit their privileged positions to resolve markets in their favor.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Yet another possibility would be to develop new metrics for evaluating the “health” of a funding process.
By tracking metrics such as the degree to which a funding distribution follows a power-law, or the churn of projects over time, one can assess how a funding process is performing relative to an idealized baseline.&lt;/p&gt;
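
&lt;p&gt;As a rough illustration, both of these metrics can be computed in a few lines. The definitions below are our own simplifications: &lt;code&gt;rank_size_slope&lt;/code&gt; fits a straight line to log(amount) against log(rank), so a Zipf-like distribution gives a slope near -1, and &lt;code&gt;churn&lt;/code&gt; is taken to be the share of previously funded projects which are no longer funded. Both names are hypothetical, and a serious power-law fit would need a more careful estimator.&lt;/p&gt;

```python
import math


def rank_size_slope(amounts):
    """Least-squares slope of log(amount) vs. log(rank); assumes at least two positive amounts."""
    ranked = sorted(amounts, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(amount) for amount in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


def churn(previous, current):
    """Share of previously funded projects absent from the current set (an assumed definition)."""
    previous, current = set(previous), set(current)
    return len(previous - current) / len(previous)


print(rank_size_slope([1.0, 1 / 2, 1 / 3, 1 / 4]))
print(churn(["a", "b", "c", "d"], ["a", "b", "e"]))
```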

&lt;p&gt;Whatever approach is used, it is virtually certain that evaluation will become table stakes for those looking to implement capital-allocation schemes.&lt;/p&gt;

&lt;h1 id=&quot;vi-putting-it-all-together&quot;&gt;VI. Putting It All Together&lt;/h1&gt;

&lt;p&gt;As argued in the introduction, pairwise methods should be understood not as a single technique or mechanism, but as a &lt;em&gt;paradigm of social choice&lt;/em&gt;.
Having introduced the various pieces of this paradigm, we can begin putting them together, and offer concrete estimates of the actual attention requirements for public goods funding.&lt;/p&gt;

&lt;p&gt;Assuming &lt;strong&gt;30 seconds per vote&lt;/strong&gt; and &lt;strong&gt;10 votes per item&lt;/strong&gt; with active ranking, pairwise methods can produce allocations at an attentional cost of &lt;strong&gt;5 minutes&lt;/strong&gt; per item.
For a set of 600 items, this calls for &lt;strong&gt;50 hours&lt;/strong&gt; of total voter attention.
If we assume each voter contributes four five-minute sessions (20 minutes), then we need only 150 voters to produce legitimate results – a relatively easy lift.&lt;/p&gt;

&lt;p&gt;As a continuous process, things get even better.
If we assume that only ~25% of the items undergo meaningful change in a given quarter, and that only those items need fresh votes, then each quarter costs a quarter of a full round, and we can sustain a continuous funding process &lt;em&gt;indefinitely&lt;/em&gt; with an attention cost of &lt;strong&gt;5 minutes per item per year&lt;/strong&gt;.&lt;/p&gt;
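
&lt;p&gt;The arithmetic behind these estimates is simple enough to check directly. The constants are the assumptions stated above; the continuous figure further assumes that a changed item is re-voted at the full 10 votes.&lt;/p&gt;

```python
SECONDS_PER_VOTE = 30
VOTES_PER_ITEM = 10
ITEMS = 600
MINUTES_PER_VOTER = 20  # four five-minute sessions
CHANGE_RATE_PER_QUARTER = 0.25

# One-off round: attention per item, in total, and voters required.
minutes_per_item = SECONDS_PER_VOTE * VOTES_PER_ITEM / 60
total_hours = ITEMS * minutes_per_item / 60
voters_needed = ITEMS * minutes_per_item / MINUTES_PER_VOTER

# Continuous process: only changed items are re-voted each quarter.
minutes_per_item_per_year = minutes_per_item * CHANGE_RATE_PER_QUARTER * 4

print(minutes_per_item, total_hours, voters_needed, minutes_per_item_per_year)
```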

&lt;p&gt;Further, given a sufficiently robust &lt;strong&gt;reputation system&lt;/strong&gt;, we could imagine running seasonal “campaigns” in which the community contributes &lt;em&gt;en masse&lt;/em&gt; to project filtering, followed by experts who resolve the highest-ambiguity pairs or adjudicate between the outputs of funding mechanisms themselves.
Participants could be rewarded with sought-after &lt;strong&gt;digital collectibles&lt;/strong&gt;, serving as reputation attestations and cultivating a public-spirited group identity.&lt;/p&gt;

&lt;p&gt;The net effect is that of an always-on social choice “sensor” – a solar panel for governance – collecting ambient preference information and converting it into actionable outputs in real-time.&lt;/p&gt;

&lt;p&gt;The idea of sustaining a complex public goods funding ecosystem with so little effort might seem implausible, and the continued development of these techniques will certainly surface new challenges and limitations.
And yet, the arguments have been laid out, and these are the conclusions we’ve drawn.&lt;/p&gt;

&lt;h3 id=&quot;final-thoughts&quot;&gt;Final Thoughts&lt;/h3&gt;

&lt;p&gt;The task of helping groups of people effectively manage shared resources is one of the most pressing problems of our day.
We have argued that pairwise methods, seen not as isolated techniques but as part of a &lt;em&gt;paradigm&lt;/em&gt; of social choice, offer a compelling toolkit for solving exactly this problem.&lt;/p&gt;

&lt;p&gt;Relative to other techniques and to their underlying potential, pairwise methods remain underexplored.
By articulating a “pairwise paradigm,” we hope to help chart the way.&lt;/p&gt;

&lt;h1 id=&quot;appendix-other-considerations&quot;&gt;Appendix: Other Considerations&lt;/h1&gt;

&lt;h3 id=&quot;the-independence-of-irrelevant-alternatives&quot;&gt;The Independence of Irrelevant Alternatives&lt;/h3&gt;

&lt;p&gt;The “independence of irrelevant alternatives” (IIA) is a key concept in social choice.
The principle of IIA says that the relative ranking of two options must never depend on a third (the “irrelevant alternative”).
In practice, almost every voting system violates this ideal in some way; the question becomes &lt;em&gt;why&lt;/em&gt; and &lt;em&gt;to what degree&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Some pairwise algorithms, such as spectral ranking, &lt;em&gt;intentionally&lt;/em&gt; draw in third options to produce richer comparisons with less data.
This produces results which reflect a global perspective, incorporating more data but increasing the likelihood of unexpected interactions.
This unapologetic rejection of IIA is arguably an advantage in the setting of capital allocation, where the goal is to distribute resources across an entire ecosystem.&lt;/p&gt;

&lt;p&gt;However, in high-stakes, single-winner settings like elections, these complex and non-linear interactions can undermine the perceived legitimacy of the process.
In those settings, a &lt;a href=&quot;https://en.wikipedia.org/wiki/Condorcet_method&quot;&gt;Condorcet method&lt;/a&gt; with stronger guarantees would almost certainly be preferable.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Note: as discussed earlier, individual pairwise judgments &lt;em&gt;are&lt;/em&gt; in fact independent of alternatives; it is only the final result which is not.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3 id=&quot;intransitivity-of-preferences&quot;&gt;Intransitivity of Preferences&lt;/h3&gt;

&lt;p&gt;Closely related to IIA is the idea of cycles and the &lt;em&gt;transitivity of preference&lt;/em&gt; – the idea that if you prefer A to B, and B to C, then it is logically consistent that you would prefer A to C.
To prefer otherwise creates what is known as a “cycle” of &lt;em&gt;intransitive preferences&lt;/em&gt;.
For many voting methods, cycles represent &lt;em&gt;contradictions&lt;/em&gt; and are seen as inherent flaws.&lt;/p&gt;

&lt;p&gt;Spectral methods, on the other hand, handle intransitive preferences naturally – they are not contradictions, but &lt;em&gt;information&lt;/em&gt;.
Cycles are interpreted naturally as ties, which pose problems for single-winner elections, but not for capital allocation in which funds are simply given out equally.&lt;/p&gt;
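
&lt;p&gt;To see how a cycle turns into a tie, consider a toy spectral method: a lazy random walk on the preference graph, in which a walker standing on the losing item tends to move towards the winner. This is a simplified stand-in in the spirit of PageRank-style ranking, not the exact algorithm discussed in this essay, and it omits the regularization a production method would likely need. The function name and inputs are hypothetical.&lt;/p&gt;

```python
def spectral_weights(items, wins, iterations=200):
    """Stationary weights of a lazy random walk; wins[(a, b)] counts votes for a over b."""
    index = {item: i for i, item in enumerate(items)}
    n = len(items)
    matrix = [[0.0] * n for _ in range(n)]
    for (winner, loser), count in wins.items():
        matrix[index[loser]][index[winner]] += count
    # Scale so every row sums to at most 1/2; the remainder becomes a self-loop.
    scale = 2.0 * max(sum(row) for row in matrix) or 1.0
    for i, row in enumerate(matrix):
        for j in range(n):
            row[j] /= scale
        row[i] = 1.0 - sum(row)
    # Power iteration from the uniform distribution.
    weights = [1.0 / n] * n
    for _ in range(iterations):
        weights = [sum(weights[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dict(zip(items, weights))


# A beats B, B beats C, C beats A: a perfect cycle.
cycle = {("A", "B"): 1, ("B", "C"): 1, ("C", "A"): 1}
print(spectral_weights(["A", "B", "C"], cycle))
```

&lt;p&gt;For the perfect cycle, each item ends up with a third of the weight: the cycle is read as a three-way tie rather than as a contradiction.&lt;/p&gt;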

&lt;p&gt;Further, pairwise spectral methods allow for a different interpretation of intransitivity.
Unlike Bradley-Terry, which models every item as having a latent value, and interprets intransitivity as measurement error, spectral methods model intransitivity as a normal part of reality.&lt;/p&gt;

&lt;p&gt;To give an example, imagine items A and B, each having qualities X and Y.
If we conceptualize voters as making pairwise decisions based on &lt;em&gt;subjective integrations&lt;/em&gt; of the data, one voter might integrate X and Y and choose A, while another, filtering through the lens of their own personal beliefs and lived experience, chooses B.&lt;/p&gt;

&lt;p&gt;Over hundreds or thousands of voters, the pairwise graph becomes a rich field of relationships with frequent cyclical and intransitive behaviors, reflecting a deep and socially-embedded understanding of the ecosystem in question.
Taking this graph as the only knowable ground truth, we then produce weights as a &lt;em&gt;useful synthesis&lt;/em&gt; of that data.&lt;/p&gt;

&lt;p&gt;One might imagine going even further, devising new metrics for community dynamism based on the degree of intransitivity in a given preference graph – or treating the preference graph as a type of “constitution” for AI governance.
This view of pairwise data – not as messy and in need of discipline, but as rich and nuanced truth – invites bold new visions.&lt;/p&gt;

&lt;h3 id=&quot;dealing-with-heterogeneity&quot;&gt;Dealing with Heterogeneity&lt;/h3&gt;

&lt;p&gt;One of the few prerequisites to using pairwise techniques is making sure that the items being compared are meaningfully comparable.
This may sound tautological – but it is by no means guaranteed.
To give an example, one can meaningfully prefer an apple to an orange; one cannot meaningfully prefer an apple to the &lt;em&gt;color&lt;/em&gt; orange.&lt;/p&gt;

&lt;p&gt;Part of the practitioner’s skill is ensuring that the items being considered are part of a semantically coherent set.
This meaning can come from either grouping like items together, or framing the comparison in a way which makes sense for the given set.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;To give a concrete example, this year’s Deep Funding initiative framed the decision in terms of the relative impact of two &lt;em&gt;software dependencies&lt;/em&gt; on a given &lt;em&gt;software project&lt;/em&gt;; the same dependency might be assessed very differently depending on which project’s context was being considered.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This idea can be extended to the case when projects are of the same type, but have different funding needs – by framing the question in terms of “where should we direct the next marginal dollar,” projects of different scales can be meaningfully considered as part of the same set.&lt;/p&gt;

&lt;p&gt;In permissionless settings where it may be difficult to ensure homogeneity in advance, techniques like star grouping can be used to segment items into coherent sets on-the-fly.&lt;/p&gt;

&lt;h3 id=&quot;ai-and-verifiability&quot;&gt;AI and Verifiability&lt;/h3&gt;

&lt;p&gt;Astute readers may have detected a certain &lt;em&gt;coolness&lt;/em&gt; towards artificial intelligence throughout this essay.
You’re not imagining it – let’s address it plainly.&lt;/p&gt;

&lt;p&gt;Large language models are among the most transformative information-processing inventions since the development of writing, fundamentally changing how people participate in shared society.
Many in the public goods funding and digital governance ecosystems are excited at the prospect of using AI to automate human judgment at scale.
The potential benefits are real: Deep Funding, with its ~\(k/10\) data requirement, outperforms both Bradley-Terry and spectral methods by &lt;em&gt;two orders of magnitude&lt;/em&gt;, even with active ranking.
With benefits, however, come risks: of opacity, loss of agency, and the accumulation of unmodeled risk.
Balancing reliance on AI while preserving our capacity for verifiability lets us navigate the design space more safely.&lt;/p&gt;

&lt;p&gt;We can draw an analogy to the 2008 global financial crisis, in which years of cavalier underwriting led to a sudden collapse of asset prices, as trust fled from systems which could not be independently verified.
It is uncomfortably easy to imagine an analogous “legitimacy crisis” occurring in 2028, in which years of deference to AI judgment lead to a sudden collapse of legitimacy, and the wholesale abandonment of digital institutions.&lt;/p&gt;

&lt;p&gt;We can draw a second analogy to the design of “optimistic rollups,” a popular Ethereum L2 architecture in which transactions are “optimistically” assumed valid, subject to a dispute window.
A key aspect of their design is the idea that &lt;em&gt;any&lt;/em&gt; transaction can be disputed and independently verified.
An optimistic rollup in which transactions were spot-checked randomly, on the order of one in every thousand transactions, and could not be individually disputed, would offer terrible security and be easily abused.&lt;/p&gt;

&lt;p&gt;The answer is not to avoid AI, but to leverage it cautiously.
As Archimedes famously said, “Give me a lever and a place to stand, and I will move the earth.”
AI is our lever, but without firm footing, we may lose more than we gain.&lt;/p&gt;

&lt;p&gt;This essay has tried to show how a public goods funding system could be constructed end-to-end using only human judgment, with AI playing a non-critical support role.
With this foundation in place, it becomes possible to &lt;em&gt;safely&lt;/em&gt; transition to higher levels of AI judgment, without taking on large and unmodeled risk.
What we would advocate, in the closing months of 2025, is a two-pronged approach in which AI tools are used to leverage human judgment, but &lt;em&gt;only to the extent that the entire flow&lt;/em&gt; can in principle be performed – and contested – by humans.&lt;/p&gt;
</description>
        <pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate>
        <link>http://kronosapiens.github.io/blog/2025/12/14/pairwise-paradigm.html</link>
        <guid isPermaLink="true">http://kronosapiens.github.io/blog/2025/12/14/pairwise-paradigm.html</guid>
        
        <category>public goods funding</category>
        
        <category>impact evaluation</category>
        
        <category>mechanism design</category>
        
        <category>pairwise preferences</category>
        
        
        <category>blog</category>
        
      </item>
    
      <item>
        <title>Exploring Deep Funding&apos;s Jury Data</title>
        <description>&lt;p&gt;This analysis explores &lt;strong&gt;611 pairwise judgments&lt;/strong&gt; from &lt;strong&gt;51 jurors&lt;/strong&gt; evaluating Ethereum dependencies through the Deep Funding experiment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key findings:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
  &lt;li&gt;Pairwise evaluations show 89% agreement when comparing diverse projects, and 63% when comparing similar ones.&lt;/li&gt;
  &lt;li&gt;Contrary to intuition, longer thinking times did not improve decision quality.&lt;/li&gt;
  &lt;li&gt;Results suggest pairwise methods are ready for greater deployment in public goods funding.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;i-introduction&quot;&gt;I. Introduction&lt;/h2&gt;

&lt;p&gt;Last winter, an exciting experiment emerged from the Ethereum public-goods-funding ecosystem.
With a $200k prize pool put up by Vitalik himself, a group of enthusiasts began working on a novel method for supporting public goods: using AI agents to generate funding proposals, and using human inputs to adjudicate between them.
“AI as the engine, humans as the steering wheel” became the initiative’s rallying cry.&lt;/p&gt;

&lt;p&gt;The project, known as &lt;a href=&quot;http://deepfunding.org/&quot;&gt;Deep Funding&lt;/a&gt;, aimed to develop “impact weights” for ~30 first-order Ethereum dependencies (clients, languages, developer tools), and ~5,000 of their child dependencies.
With these weights in hand, funding could be “flowed” through the Ethereum ecosystem, supporting dependencies which might otherwise have gone under-funded.&lt;/p&gt;

&lt;p&gt;A key part of the design involved collecting jury data – the critical inputs used to decide &lt;em&gt;which&lt;/em&gt; AI proposals were most accurate and &lt;em&gt;how&lt;/em&gt; to combine them.
Rather than asking jurors to rank entire lists of dependencies, the design called for &lt;em&gt;pairwise&lt;/em&gt; comparisons: assessing the relative impact of just two dependencies at a time.
This pairwise format would simplify the human judgment task – instead of having to determine a full set of (meaningful) weights across dozens of dependencies, jurors could focus their valuable attention on just one pair at a time.&lt;/p&gt;

&lt;p&gt;A friend brought the project to my attention in December, and – having done a fair amount of work with pairwise methods – I was intrigued.
I began volunteering, helping to refine the project design and the pairwise juror flow.&lt;/p&gt;

&lt;p&gt;Over the course of this past winter and spring, Deep Funding gathered jury data using &lt;a href=&quot;https://pairwise.vote/&quot;&gt;Pairwise&lt;/a&gt;, the voting tool developed by &lt;a href=&quot;https://www.generalmagic.io/&quot;&gt;General Magic&lt;/a&gt; partially based on &lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3317445&quot;&gt;my own past work&lt;/a&gt; with &lt;a href=&quot;https://colony.io/&quot;&gt;Colony&lt;/a&gt;.
Over this time period, &lt;strong&gt;51 jurors&lt;/strong&gt; provided &lt;strong&gt;611 votes&lt;/strong&gt; on project pairs across both the primary and secondary dependencies.&lt;/p&gt;

&lt;p&gt;This analysis will explore these data, with the aim of better understanding both the dependency relationships, and the performance of the underlying pairwise process itself.
On a personal note, I have been enthusiastic about these methods &lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3359677&quot;&gt;for years&lt;/a&gt;, and am happy to see them increasingly embraced by the Ethereum public-goods community.&lt;/p&gt;

&lt;h2 id=&quot;ii-data-analysis&quot;&gt;II. Data Analysis&lt;/h2&gt;

&lt;p&gt;Our analysis will proceed in three steps:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;First&lt;/strong&gt;, we will discuss the basics of the data set and our metrics of interest.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Second&lt;/strong&gt;, we will explore the judgments overall, looking at the degree to which jurors could distinguish between projects as well as the agreement among jurors.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;Third&lt;/strong&gt;, we will look at the specific relationship between thinking times and juror judgments, looking to understand whether more time spent per-pair produced better results.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;&lt;a href=&quot;https://github.com/kronosapiens/notebooks/tree/main/notebooks&quot;&gt;Analysis code can be found here.&lt;/a&gt;&lt;/em&gt;
&lt;em&gt;The data itself are not yet public.&lt;/em&gt;&lt;/p&gt;

&lt;h3 id=&quot;preliminaries&quot;&gt;Preliminaries&lt;/h3&gt;

&lt;p&gt;The data include &lt;strong&gt;611 pairwise juror judgments&lt;/strong&gt; gathered between &lt;strong&gt;December 24, 2024&lt;/strong&gt; and &lt;strong&gt;March 31, 2025&lt;/strong&gt;.
Participating jurors were presented with pairs of projects through the Pairwise interface:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://s3.us-east-1.amazonaws.com/kronosapiens.github.io/images/pairwise-df.png&quot; alt=&quot;Pairwise Screenshot&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The projects were organized into two data sets.
The first, “Level 1,” consists of Ethereum’s &lt;strong&gt;direct dependencies&lt;/strong&gt; – representing critical Ethereum-specific infrastructure.
For L1, &lt;strong&gt;30 jurors&lt;/strong&gt; provided &lt;strong&gt;402 votes&lt;/strong&gt; over &lt;strong&gt;35 unique projects&lt;/strong&gt;, with an average of &lt;strong&gt;13 votes per juror&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The second, “Level 3,” consists of the &lt;strong&gt;child dependencies&lt;/strong&gt; of the L1 projects – representing general software development dependencies.
For L3, &lt;strong&gt;21 jurors&lt;/strong&gt; provided &lt;strong&gt;209 votes&lt;/strong&gt; over &lt;strong&gt;144 unique projects&lt;/strong&gt;, with an average of &lt;strong&gt;9 votes per juror&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;These twin datasets permit a unique exploration: by performing the same analysis on both sets, we can learn not only what the data tell us about the Ethereum ecosystem, but better understand the pairwise method &lt;em&gt;itself&lt;/em&gt; by seeing how the results differ between sets.
&lt;strong&gt;A pairwise meta-analysis, if you will.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;As part of the data processing, we introduce a few additional fields.
The first is the &lt;strong&gt;log multiplier&lt;/strong&gt; – the log2 of the multiplier (given between 1 and 1024), used to simplify data visualization.
The second is the &lt;strong&gt;choice multiplier&lt;/strong&gt; – the same multiplier but made positive or negative based on which project was chosen, enabling direct comparison between jurors.
The third is the &lt;strong&gt;thinking time&lt;/strong&gt; – the difference in timestamps between sequential votes by the same juror.&lt;/p&gt;
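
&lt;p&gt;As a sketch of how these fields might be derived, assume each vote is a record with a juror id, a timestamp in seconds, a choice (1 or 2), and a multiplier. These field names and the sign convention are assumptions for illustration; the actual schema lives in the analysis notebooks.&lt;/p&gt;

```python
import math


def derive_fields(votes):
    """Add log multiplier, signed choice multipliers, and thinking time to each vote."""
    last_seen = {}  # juror id mapped to the timestamp of their previous vote
    rows = []
    for vote in sorted(votes, key=lambda v: (v["juror"], v["timestamp"])):
        log_multiplier = math.log2(vote["multiplier"])
        # Assumed convention: positive when the first project was chosen.
        sign = 1 if vote["choice"] == 1 else -1
        previous = last_seen.get(vote["juror"])
        # A juror's first vote has no predecessor, so no thinking time.
        thinking_time = None if previous is None else vote["timestamp"] - previous
        last_seen[vote["juror"]] = vote["timestamp"]
        rows.append({
            **vote,
            "log_multiplier": log_multiplier,
            "choice_multiplier": sign * vote["multiplier"],
            "log_choice_multiplier": sign * log_multiplier,
            "thinking_time": thinking_time,
        })
    return rows


sample = [
    {"juror": "j1", "timestamp": 100, "choice": 1, "multiplier": 8},
    {"juror": "j1", "timestamp": 160, "choice": 2, "multiplier": 1024},
]
for row in derive_fields(sample):
    print(row)
```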

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Methodological note&lt;/strong&gt;: thinking time is inferred from the time elapsed between votes and should be considered a noisy signal.
Further, some of the duration is explained by typing time: thinking time correlates with the length of the rationale string at 0.41 in L1 and 0.19 in L3.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3 id=&quot;overall-juror-assessments&quot;&gt;Overall Juror Assessments&lt;/h3&gt;

&lt;p&gt;The first thing we look at is the distribution of multipliers given by the jurors.
These histograms count the individual juror votes, grouped by the strength of the given multipliers.
We see that for L1, the log multipliers cluster more closely around 0 (a multiplier of 1), likely reflecting the consistent quality of all projects in the set.
Given that L1 consists of Ethereum’s primary dependencies, it is unlikely that one would be significantly more important than another, and the data reflect this.&lt;/p&gt;

&lt;p&gt;For L3, we see a skew towards higher multipliers, reflecting the greater variance in dependency impact and relevance.
Given that L3 projects can run the gamut from critical dependencies to shell scripting utilities, it is unsurprising that the relative impact would be larger.&lt;/p&gt;

&lt;p&gt;Overall, these data align with our intuition that the delta between L1 dependencies should generally be smaller than that between L3 dependencies.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://raw.githubusercontent.com/kronosapiens/notebooks/refs/heads/main/plots/Level_1_Histogram_of_log_choice_multipliers.png&quot; alt=&quot;L1 Log choice multipliers&quot; /&gt;
&lt;img src=&quot;https://raw.githubusercontent.com/kronosapiens/notebooks/refs/heads/main/plots/Level_3_Histogram_of_log_choice_multipliers.png&quot; alt=&quot;L3 Log choice multipliers&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Let’s look at the mean and standard deviation of the multipliers given by jurors for project pairs.
These scatter plots show individual project pairs, presenting the mean and standard deviation of all juror votes on that particular pair.&lt;/p&gt;

&lt;p&gt;We see that for L1, the standard deviation is relatively constant, indicating that the level of agreement among jurors was similar across the entire data set.
Looking at L3, we see a very different effect: the standard deviation decreases at the extremes and increases towards the center, suggesting that larger differences in impact were easier to distinguish – an intuitive result.&lt;/p&gt;

&lt;p&gt;It is worth noting that the average standard deviation for L1 is higher than that of the obvious cases in L3, suggesting that while L1 comparisons were more uniform in difficulty, they were on the whole more difficult than the simplest cases in L3.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://raw.githubusercontent.com/kronosapiens/notebooks/refs/heads/main/plots/Level_1_Per-pair_log_choice_multiplier_mean_vs_std_with_best_fit_curve.png&quot; alt=&quot;L1 multiplier mean vs std&quot; /&gt;
&lt;img src=&quot;https://raw.githubusercontent.com/kronosapiens/notebooks/refs/heads/main/plots/Level_3_Per-pair_log_choice_multiplier_mean_vs_std_with_best_fit_curve.png&quot; alt=&quot;L3 multiplier mean vs std&quot; /&gt;&lt;/p&gt;

&lt;p&gt;If we ignore the multipliers and limit our sense of “agreement” to project choice, then jurors in L1 agreed with each other &lt;strong&gt;63%&lt;/strong&gt; of the time (based on 60 pairs receiving exactly two votes), while in L3 they agreed with each other &lt;strong&gt;89%&lt;/strong&gt; of the time (based on 28 pairs receiving exactly two votes).
This suggests that the pairwise process captured meaningful signal, and that the signal was clearer in L3 than in L1.&lt;/p&gt;
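
&lt;p&gt;This agreement rate can be computed along the following lines, assuming each vote is reduced to a tuple of the two projects shown and the project chosen. The tuple layout and function name are hypothetical.&lt;/p&gt;

```python
from collections import defaultdict


def two_vote_agreement(votes):
    """Share of pairs with exactly two votes on which both jurors chose the same project."""
    choices_by_pair = defaultdict(list)
    for project_a, project_b, chosen in votes:
        # frozenset makes (a, b) and (b, a) count as the same pair.
        choices_by_pair[frozenset((project_a, project_b))].append(chosen)
    two_vote_pairs = [c for c in choices_by_pair.values() if len(c) == 2]
    if not two_vote_pairs:
        return None
    agreeing = sum(1 for first, second in two_vote_pairs if first == second)
    return agreeing / len(two_vote_pairs)


sample = [
    ("geth", "solidity", "geth"),
    ("solidity", "geth", "geth"),      # same pair, same choice: agreement
    ("vyper", "hardhat", "vyper"),
    ("vyper", "hardhat", "hardhat"),   # same pair, different choice
    ("ethers", "viem", "viem"),        # only one vote: ignored
]
print(two_vote_agreement(sample))
```

&lt;p&gt;Pairs with one vote, or with three or more, are excluded, which is why the agreement figures rest on only 60 and 28 pairs respectively.&lt;/p&gt;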

&lt;h3 id=&quot;juror-thinking-times&quot;&gt;Juror Thinking Times&lt;/h3&gt;

&lt;p&gt;A major debate during the Deep Funding design process concerned the length of time jurors &lt;em&gt;should spend&lt;/em&gt; thinking about each pair.
Opinions ranged from as little as five or ten seconds per pair, all the way to five or even ten &lt;em&gt;minutes&lt;/em&gt;.
This 60-fold gap in expectations is striking, and suggests a practice guided more by intuition than empirical analysis.
What do the data show?&lt;/p&gt;

&lt;p&gt;Among L1 jurors, the median thinking time was &lt;strong&gt;2.8 minutes&lt;/strong&gt;, and among L3 jurors, &lt;strong&gt;1.3 minutes&lt;/strong&gt;.
This makes sense, as L3 projects have greater variety, and so should be &lt;em&gt;generally&lt;/em&gt; easier to distinguish.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://raw.githubusercontent.com/kronosapiens/notebooks/refs/heads/main/plots/Level_1_Histogram_of_thinking_times.png&quot; alt=&quot;L1 Thinking times&quot; /&gt;
&lt;img src=&quot;https://raw.githubusercontent.com/kronosapiens/notebooks/refs/heads/main/plots/Level_3_Histogram_of_thinking_times.png&quot; alt=&quot;L3 Thinking times&quot; /&gt;&lt;/p&gt;

&lt;p&gt;One might wonder whether jurors who spent more time thinking would be more discerning; perhaps longer thinking times correlated with higher (or lower) multipliers.
Looking at the L1 plot of thinking times vs. final judgments, we see no correlation.
Looking at the same plot for L3, we see a weak positive correlation.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://raw.githubusercontent.com/kronosapiens/notebooks/refs/heads/main/plots/Level_1_Thinking_time_vs_log_multiplier_with_best-fit_line.png&quot; alt=&quot;L1 thinking vs multiplier&quot; /&gt;
&lt;img src=&quot;https://raw.githubusercontent.com/kronosapiens/notebooks/refs/heads/main/plots/Level_3_Thinking_time_vs_log_multiplier_with_best-fit_line.png&quot; alt=&quot;L3 thinking vs multiplier&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This next plot (my personal favorite) shows &lt;em&gt;both votes&lt;/em&gt; for every project pair which received exactly two votes and where both votes had valid thinking times.
We draw every project pair’s votes in the same color, and connect them with a dotted line.
The slope of the line tells us how the longer-thinking juror’s vote differed from their shorter-thinking counterpart.&lt;/p&gt;

&lt;p&gt;While dense with information, this chart offers a fine-grained view of voting behavior and suggests an interesting dynamic.
Overall, in L1 it seems that more thinking time rarely resulted in a very different assessment of impact, and when it did, the effect seemed random.
In contrast, in L3 it seems as though more time thinking typically resulted in a &lt;em&gt;smaller&lt;/em&gt; multiplier – and only rarely in a choice change.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://raw.githubusercontent.com/kronosapiens/notebooks/refs/heads/main/plots/Level_1_Thinking_time_vs_log_choice_multiplier_per-pair.png&quot; alt=&quot;L1 thinking vs multiplier full&quot; /&gt;
&lt;img src=&quot;https://raw.githubusercontent.com/kronosapiens/notebooks/refs/heads/main/plots/Level_3_Thinking_time_vs_log_choice_multiplier_per-pair.png&quot; alt=&quot;L3 thinking vs multiplier full&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;iii-conclusion&quot;&gt;III. Conclusion&lt;/h2&gt;

&lt;p&gt;Overall, the data suggest that the pairwise process is useful for gathering subjective assessments of relative impact at scale.&lt;/p&gt;

&lt;p&gt;The data on overall assessments confirm our intuition that it is easier to distinguish between things that are different than between things that are similar.
Not only was the perceived relative impact higher in L3, where projects vary more in quality, but the level of agreement was higher – suggesting that the decision problem was easier overall.
The 89% agreement among jurors in L3 is striking and seems to pave the way for deployments at greater scale.&lt;/p&gt;

&lt;p&gt;The data on thinking times challenge our intuition that more time spent thinking would produce better results.
With the caveat that our thinking times are a noisy signal, &lt;strong&gt;we see little evidence that longer thinking times result in meaningfully better decisions.&lt;/strong&gt;
If anything, the data suggest the opposite – that we should be encouraging jurors to make more, faster decisions rather than fewer, slow ones.&lt;/p&gt;

&lt;p&gt;Despite their promise, pairwise methods are relatively underdeveloped.
There is ample room for research into improved UI, more efficient pair selection, and continuous funding paradigms in which juror votes are required only marginally instead of en masse.
Through these continued investigations, pairwise methods have the potential to become critical infrastructure for Ethereum public goods funding – proving that sometimes the best way to understand a system is by building tools to evaluate it.&lt;/p&gt;
</description>
        <pubDate>Sun, 03 Aug 2025 00:00:00 +0000</pubDate>
        <link>http://kronosapiens.github.io/blog/2025/08/03/deepfunding-jury-analysis.html</link>
        <guid isPermaLink="true">http://kronosapiens.github.io/blog/2025/08/03/deepfunding-jury-analysis.html</guid>
        
        <category>public goods funding</category>
        
        <category>impact evaluation</category>
        
        <category>mechanism design</category>
        
        
        <category>blog</category>
        
      </item>
    
      <item>
        <title>Tech in the Age of AI</title>
        <description>&lt;h1 id=&quot;i-technology&quot;&gt;I. Technology&lt;/h1&gt;

&lt;p&gt;At some point in the last few months, I saw a tweet from a software engineer who had just started using &lt;a href=&quot;https://www.cursor.com/&quot;&gt;Cursor&lt;/a&gt;, one of the new AI-powered code editors. His hot take went something along the lines of:&lt;/p&gt;

&lt;pre&gt;
I realized that 80% of my engineering skills have become obsolete.
My leverage on the remaining 20% went up 10x.
&lt;/pre&gt;

&lt;p&gt;I was intrigued, so I downloaded the app. I was impressed. Cursor was well done: 95% of the product was a clone of VS Code, a popular mainstream code editor, and the other 5% was a few thoughtfully designed commands for integrating AI into a development workflow. I had been using GitHub’s Copilot for a few months at that point, and found Cursor’s product to be more useful in practice.&lt;/p&gt;

&lt;p&gt;Curious about who had made this tool, I found their “About” page. It looked as though Cursor was made by a half-dozen MIT alumni, just a few years out of undergrad. &lt;strong&gt;The next generation has come to turn me to glue, I thought to myself.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Within a matter of days, I found myself using Cursor’s features regularly, to handle tasks like writing tests and implementing new functionality. Cursor’s AI functionality is exposed more or less through three tools:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Generative auto-complete, in which the code editor will directly suggest text while you type.&lt;/li&gt;
  &lt;li&gt;A shortcut for inline edits. Useful for writing tests, e.g. “Write a test showing that function X obtains behavior Y.”&lt;/li&gt;
  &lt;li&gt;A shortcut for interactive chat. Useful for pasting in error messages or having more discursive conversations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The auto-complete in particular is quite helpful, not necessarily because the suggestions are always correct, but because it is easy to accept them when they are, and ignore them when they are not; you don’t feel like you are constantly fighting with the tool.&lt;/strong&gt; Having used Cursor for a few months now, I have developed an intuitive feel for the probabilistic nature of the suggestions: the earlier you are in implementing a piece of functionality, the more random the suggestions will be (less data means wider variance), while the later you are in implementing, the more accurate the remaining suggestions will be (as Cursor has more context to draw on). The best way to describe Cursor would be as an eager intern – often wrong and needing guidance, but enthusiastic and indefatigable.&lt;/p&gt;

&lt;p&gt;Another example of Cursor’s utility is in facilitating front-end development. I am not a front-end developer, but I have enough of a foundation that I’m able to &lt;a href=&quot;zaratan.world&quot;&gt;meet my basic needs&lt;/a&gt;. My experience of front-end is that of having to comprehend the myriad interactions involved with HTML and CSS, and to stay on top of the ever-evolving abstraction layers related to them. Cursor allows me to develop bespoke styles interactively, leveraging the latest techniques:&lt;/p&gt;

&lt;pre&gt;
Me: Can you help me embed this YouTube video?

Cursor: Of course, here is a sample iframe configuration

Me: It isn&apos;t rendering properly, what&apos;s wrong?

Cursor: Sorry, the settings need to be updated to include the new `aspect-ratio` property

Me: Awesome, thanks. Can some of this code be removed?

Cursor: Yes, some of my suggestions are now redundant, let me remove them
&lt;/pre&gt;

&lt;p&gt;Leaving me with a clean YouTube embedding and a minimum of black-magic styling, in all of ten minutes. In another example, I was able to develop an entire &lt;a href=&quot;https://master.d2aq635rxv7ilk.amplifyapp.com/&quot;&gt;React SPA&lt;/a&gt; for a &lt;a href=&quot;https://sybil-defense.devcon.org/&quot;&gt;scavenger hunt&lt;/a&gt; I helped organize in less than an hour. Obviously, there is material risk in using code you don’t understand, but I’m not sure if this risk is different in kind or merely in degree when compared to “copy and pasting from Stack Overflow,” which developers have been doing (albeit furtively) for years.&lt;/p&gt;

&lt;p&gt;As my workflow evolved in conversation with this tool, I found myself returning to that tweet. There is truth to it. Cursor makes development much less error-prone. The auto-complete is excellent at catching syntax and formatting errors as you work. When writing a piece of functionality, once you have demonstrated sufficient intent, the editor can often finish the rest. &lt;strong&gt;From an information-theory perspective, software engineers no longer need to write “informationless” code – if a diff of code provides “no new information” beyond what has already been written, the editor can usually infer it.&lt;/strong&gt; In the ideal case, the most tedious half of the engineer’s job has been removed, allowing the more high-impact and creative parts of the role to take up more space.&lt;/p&gt;

&lt;h1 id=&quot;ii-labor&quot;&gt;II. Labor&lt;/h1&gt;

&lt;p&gt;What will this mean for the workforce? Are we a few years away from massive unemployment among software developers? It’s a valid concern.&lt;/p&gt;

&lt;p&gt;I think it’s fair to say that software engineers, especially in the US, have enjoyed a cushy existence this past decade or two. A low bar to entry (relative to other professions), combined with strong demand driven by rapid industry growth, meant that software engineering was, for better or worse, about as close as you could get to free money.&lt;/p&gt;

&lt;p&gt;How will AI change this?&lt;/p&gt;

&lt;p&gt;In the late Obama years, I attended the &lt;a href=&quot;https://personaldemocracy.com/&quot;&gt;Personal Democracy Forum&lt;/a&gt;, a technology-and-society conference held at NYU’s campus. One of the talks featured a young policy wonk, sharply dressed and not much older than me, talking about technology’s impact on the labor market. I was impressed by this man, who would confidently use terms like “labor categories” and reassured the audience that new types of work always emerge in the wake of technological change.&lt;/p&gt;

&lt;p&gt;I suspect the same will be true today. I don’t think software engineers will be obsolete, but I do think that the skills in demand in five years will look very different from those of five years ago. Nor will software engineering departments in larger organizations be structured the way they are today.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Specifically, I suspect that teams will be smaller as a rule, across the board. At every level of scale and complexity, it will take fewer people to develop and maintain a given amount of software.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Further, the software engineering role will broadly have “moved up” the abstraction stack. Much as newer programming languages meant that average programmers needed to know less about kernel memory management, so will more advanced editors mean that average programmers need to know less about language syntax, library APIs, and the like. &lt;strong&gt;The relative premium on an encyclopedic knowledge of languages and libraries will decline. As a corollary, there will be a relative premium placed on product sensibility, user sensitivity, and design fluency.&lt;/strong&gt; New workflows will be more iterative and high-level.&lt;/p&gt;

&lt;p&gt;If we wanted to be alarmist, we could read this as “job loss.” But I disagree. I don’t think this logic clearly leads to fewer people working in software; it might even lead to more. &lt;strong&gt;But what we likely will see is more, smaller, and specialized teams, iteratively developing custom solutions for specific niches.&lt;/strong&gt; The “junior dev” role will increasingly disappear, and seniority will increasingly include aspects of product management. The wider diversity of teams and products will make it easier to find work that is more personally fulfilling.&lt;/p&gt;

&lt;p&gt;Compensation ranges will likely widen, much as they did for lawyers in the 2010s, as weaker engineers will find themselves struggling to demonstrate value, while stronger engineers will be able to leverage their abilities even further. On the flip side, it will become easier for less naturally technical people to find their way into the profession, broadening its accessibility at the entry levels.&lt;/p&gt;

&lt;p&gt;The result is better and more finely-tailored software serving a wider variety of niches, made by people more deeply connected to their work, and with a more accessible profession overall. That does not seem particularly bad to me.&lt;/p&gt;

&lt;h1 id=&quot;iii-distribution&quot;&gt;III. Distribution&lt;/h1&gt;

&lt;p&gt;Around the same time as the first tweet, I saw a second tweet on the broader effect of AI on the tech industry. The take was something to the effect of:&lt;/p&gt;

&lt;pre&gt;
As the marginal cost of software declines, distribution matters more.
&lt;/pre&gt;

&lt;p&gt;This is something I have been thinking about a lot, and I’m not the only one.&lt;/p&gt;

&lt;p&gt;Broadly, the argument goes something like this: ten years ago, it was difficult to develop software. If you could build a team that could successfully deliver software, you had a business. Demand would be strong enough that anyone could be taught to sell it. &lt;strong&gt;The competitive advantage was technical superiority.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now, however, the market logic has reversed. Anyone can develop software, and there is too much of it. People don’t know what software to use, or how to choose. What really matters is having the marketing and media savvy to cut through the noise and actually reach people. &lt;strong&gt;The competitive advantage will be communication and distribution.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is something I’m taking seriously. After spending 4+ years (largely single-handedly) &lt;a href=&quot;https://blog.zaratan.world/p/the-story-of-chore-wheel&quot;&gt;developing Chore Wheel&lt;/a&gt; as a software product, I am now investing in capacity for content and distribution. I have taken &lt;a href=&quot;https://fractalnyc.com/fractalnyc/FractalU-Fall-Semester-2024-bcde30a9ea374ed080b4d1d22809b3d3&quot;&gt;a Twitter class&lt;/a&gt;, and am trying to develop &lt;a href=&quot;https://x.com/zaratanDotWorld&quot;&gt;greater fluency with X&lt;/a&gt;, which is its own world and language. I have &lt;a href=&quot;https://blog.zaratan.world/&quot;&gt;started a Substack&lt;/a&gt;, and have committed to publishing monthly long-form writing. I have been spending significant amounts of time &lt;a href=&quot;https://docs.google.com/presentation/d/e/2PACX-1vTEpnw9C_uLeqHlRsiqUvbV0cDfjjTWC9FwIJ_adNRIQGcyv0WFLNNMotU1qDDWUkW6KO0ckQ-PqA1Q/pub?start=false&amp;amp;loop=false&amp;amp;delayms=3000&quot;&gt;developing marketing materials&lt;/a&gt; and iterating on messages.&lt;/p&gt;

&lt;p&gt;Distribution is unfamiliar territory for me. For years, I focused on developing my technical capacities. I am now learning that distribution, sales, and marketing are deep disciplines unto themselves, and that fluency will come only with time and effort.&lt;/p&gt;

&lt;p&gt;In my first steps on this journey, I read Seth Godin’s &lt;a href=&quot;https://www.amazon.com/This-Marketing-Cant-Until-Learn/dp/0525540830/&quot;&gt;&lt;em&gt;This is Marketing&lt;/em&gt;&lt;/a&gt;. He wrote one thing which really struck me: &lt;strong&gt;“The amateur does what they like. The professional does what other people like.”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;In some ways, engineering and distribution are highly complementary, the yin and yang of technology business. Both are deeply creative, involving the iterative definition and solution of problems.&lt;/strong&gt; The engineer might ask: how can I develop software to achieve my objective? The marketer might ask: how can I develop a message to reach my audience? In both fields, your intuition might be right. But it just as often might not be. Where intuition ends, disciplined discovery begins.&lt;/p&gt;

&lt;h1 id=&quot;iv-vision&quot;&gt;IV. Vision&lt;/h1&gt;

&lt;p&gt;Unflagging optimism is my most obnoxious feature. I can find a silver lining in any circumstance.&lt;/p&gt;

&lt;p&gt;It will be important to tread carefully over the next few years. As Karl Polanyi wrote in his seminal &lt;a href=&quot;https://www.amazon.com/Great-Transformation-Political-Economic-Origins/dp/080705643X/&quot;&gt;&lt;em&gt;The Great Transformation&lt;/em&gt;&lt;/a&gt;, it is not the change itself that causes harm, but the &lt;em&gt;rate of change&lt;/em&gt;. The speed with which AI is poised to reorganize labor markets has potential for real harm.&lt;/p&gt;

&lt;p&gt;That said, if we can squint our eyes and gaze far into the future, there is much to look forward to. The promise of smaller, more tight-knit engineering teams embedded more closely with the communities they serve. The potential of more “whole people” – a more humanist kind of tech worker, less instrumental and more curious. The hope of technology being something that people can increasingly create for themselves, realizing the science-fiction settings of some of the best graphic novels.&lt;/p&gt;

&lt;p&gt;There is much to look forward to, and perhaps not much to fear.&lt;/p&gt;
</description>
        <pubDate>Sun, 17 Nov 2024 00:00:00 +0000</pubDate>
        <link>http://kronosapiens.github.io/blog/2024/11/17/tech-ai.html</link>
        <guid isPermaLink="true">http://kronosapiens.github.io/blog/2024/11/17/tech-ai.html</guid>
        
        <category>technology</category>
        
        <category>entrepreneuership</category>
        
        <category>ai</category>
        
        
        <category>blog</category>
        
      </item>
    
      <item>
        <title>Understanding Generative Art</title>
        <description>&lt;h1 id=&quot;i-overview&quot;&gt;I. Overview&lt;/h1&gt;

&lt;p&gt;This is an essay defending blockchain-based generative art as a contemporary form. It will address three major critiques of generative art:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The artistic legitimacy of generative art,&lt;/li&gt;
  &lt;li&gt;The value and meaning of digital ownership, and&lt;/li&gt;
  &lt;li&gt;The advantages of a blockchain-based art market&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We will demonstrate how these critiques are not only unfounded, but represent a narrow understanding of the nature and potential of the medium.&lt;/p&gt;

&lt;p&gt;For the purposes of this essay, “blockchain-based generative art” will refer to &lt;a href=&quot;https://opensea.io/collection/full-spectrum-by-lars-wander&quot;&gt;artworks&lt;/a&gt; which are produced by successive runs of a computer program, with outputs not known in advance, and which exist in a series whose number is fixed in advance. All artworks are publicly available for viewing, and the specification and ownership of the artworks are stored in a public digital ledger.&lt;/p&gt;

&lt;p&gt;We are specifically &lt;em&gt;not&lt;/em&gt; discussing images generated by AI models in response to natural-language prompts, such as those of DALL-E or Midjourney. Such works are also described as “generative art”, but represent a very different form and require a different conversation.&lt;/p&gt;

&lt;h1 id=&quot;ii-artistic-legitimacy&quot;&gt;II. Artistic Legitimacy&lt;/h1&gt;

&lt;p&gt;No one can be blamed for some skepticism concerning the artistic merit of generative art. Compared to our popular images of the fine artist painting for hours in her studio, or the metalworker welding metal in high heat, the idea of an “artist” as someone sitting in front of the computer’s blue light, producing pictures at the touch of a keyboard, seems deeply unsatisfying.&lt;/p&gt;

&lt;p&gt;A useful resource for understanding the potential of the medium is generative artist Tyler Hobbs’s essay &lt;a href=&quot;https://tylerxhobbs.com/essays/2021/the-rise-of-long-form-generative-art&quot;&gt;The Rise of Long-Form Generative Art&lt;/a&gt;. His argument can be summarized as follows:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;In the past, generative artists could produce unlimited numbers of outputs, and cherry-pick the best ones for display&lt;/li&gt;
  &lt;li&gt;Now, since outputs are public and their number fixed, the programs must meet a higher standard of consistency and quality&lt;/li&gt;
  &lt;li&gt;This has placed significant pressure on artists to produce programs which are both consistently varied and high quality&lt;/li&gt;
  &lt;li&gt;Viewers are evaluating both individual outputs, and also how the whole collection expresses an underlying concept&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What this suggests is that the constraints the blockchain places on the medium significantly raise the technical bar for generative art, to the point of approaching the level of challenge we associate with any other legitimate art form. When evaluating a generative art piece, we are evaluating three things: the artist’s technical skill in writing a program which produces varied outputs of consistent quality; the aesthetics of the individual images and their relationship within the collection as a whole; and the extent to which the total work expresses an underlying concept of interest.&lt;/p&gt;

&lt;p&gt;The fact that only a fixed number of random outputs can exist is the key to the genre. To paraphrase Martin Scorsese’s famous &lt;a href=&quot;https://www.nytimes.com/2019/11/04/opinion/martin-scorsese-marvel.html&quot;&gt;2019 takedown&lt;/a&gt; of the last decade’s Marvel phenomenon, “art must put something at risk.” In the case of generative art, the risk is that despite the artist’s best efforts, the result is largely given by chance. An artist spends days or months developing a program, only to be judged after the fact on its random outputs. The giving up of control on the part of the artist, the trusting in the process and in themselves and their abilities, is the beating heart of the genre.&lt;/p&gt;

&lt;p&gt;Another easy critique is that the production of generative art doesn’t “look like” making art. This is a red herring, and reveals some important assumptions about art production which should be interrogated.&lt;/p&gt;

&lt;p&gt;Recall that early Impressionists were excluded from Parisian galleries on the grounds that standing outside and painting &lt;em&gt;en plein air&lt;/em&gt; wasn’t “real” artistic production, when contrasted with their classically-trained contemporaries laboring over individual brushstrokes. What they did was too quick, too easy. And yet, faster painting meant works could be sold at lower prices, bringing fine art to the middle class. And while they were initially mocked, we have come to see the Impressionists as advancing a new and influential creative form, which innovated not only visually, but economically. There is no reason to believe that we will not come to feel the same about generative art, in ten or fifty years’ time.&lt;/p&gt;

&lt;p&gt;Another influential advocate of generative art is &lt;a href=&quot;https://twitter.com/artonblockchain&quot;&gt;Erick Calderon&lt;/a&gt;, a &lt;a href=&quot;https://opensea.io/collection/chromie-squiggle-by-snowfro&quot;&gt;pioneering generative artist&lt;/a&gt; and creator of both the &lt;a href=&quot;https://www.artblocks.io/&quot;&gt;ArtBlocks&lt;/a&gt; platform and the &lt;a href=&quot;https://www.brightmoments.io/&quot;&gt;Bright Moments&lt;/a&gt; network of generative art galleries. Calderon has been described as “your favorite artist’s favorite artist”, and his years as the proprietor of a tile company have given him a unique insight into the dynamics and economics of artistic production.&lt;/p&gt;

&lt;p&gt;In his keynote talk at 2023’s &lt;a href=&quot;https://www.outeredge.live/speakers/Erick-Calderon&quot;&gt;“Outer Edge LA”&lt;/a&gt; NFT conference, Erick summarized this progression as follows:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Initially, fine art was “1 of 1”, in which collectors owned unique artworks&lt;/li&gt;
  &lt;li&gt;Later, art expanded to include “1 of X”, in which collectors owned identical members of a series&lt;/li&gt;
  &lt;li&gt;With generative art, we have arrived at “1 of 1 of X”, in which collectors own unique members of a series&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In Erick’s telling, the drop in cost of the production of individual artworks has not cheapened the value of the work, but rather opened up a frontier for new kinds of work. He gives an example of personalized conference badges, but this only scratches the surface of possibilities, as Calderon rightly identifies our deep emotional needs for individuation and distinction alongside communion and belonging as key drivers of demand for these new forms.&lt;/p&gt;

&lt;p&gt;All that said, this does not mean that all generative artwork being created now will be relevant in ten years’ time, much as the majority of boat paintings made in France during the 19th century are not relevant today. But some certainly will be.&lt;/p&gt;

&lt;p&gt;Further, it is not yet clear what emotional modes and moods this art form will best express. The Impressionists found that themes of nature and everyday life were well suited to their new artistic production techniques. To what modes and moods will generative art ultimately be best-suited? The answer remains to be seen.&lt;/p&gt;

&lt;h1 id=&quot;iii-digital-ownership&quot;&gt;III. Digital Ownership&lt;/h1&gt;

&lt;p&gt;A common criticism of digital art can be summarized as “it is meaningless to own artwork which anyone can view”. This critique seems sensible, even self-evident, at first blush, but a closer analysis reveals that, if anything, it gets things backwards.&lt;/p&gt;

&lt;p&gt;To understand why, let’s consider some of the motivations for collecting art.&lt;/p&gt;

&lt;p&gt;The first and most obvious motivation would be the pure pleasure of having and observing an artwork. Someone collects something that they love, and displays it in a place of their choosing, enhancing their quality of life. This is an important motivation, but it is naive to think that it is the only one.&lt;/p&gt;

&lt;p&gt;The second and arguably more important motivation is the psychological need for individuation, for feeling unique and relevant (and ultimately, safe) in the world. As French philosopher &lt;a href=&quot;https://en.wikipedia.org/wiki/Ren%C3%A9_Girard&quot;&gt;Rene Girard&lt;/a&gt; has repeatedly written, our drive for uniqueness is core to social functioning, and as &lt;a href=&quot;https://en.wikipedia.org/wiki/Thorstein_Veblen&quot;&gt;Thorstein Veblen&lt;/a&gt; famously observed, people will pay large sums to defend their uniqueness.&lt;/p&gt;

&lt;p&gt;Art is culturally relevant, and people collect art because through owning something relevant, they become relevant. But since not all art is relevant, collecting the specific artworks that ultimately become relevant indicates that the collector is perceptive and sophisticated. This is arguably self-evident, judging by how much of the art world is driven by social proof and tastemakers.&lt;/p&gt;

&lt;p&gt;With that in mind, we can see why in the domain of cultural relevancy, digital and physical art have very much in common.&lt;/p&gt;

&lt;p&gt;Owning physical art gives me the right to restrict access. I can hide it away in my house and prevent anyone from seeing it. Once upon a time, this might have been important, as great works of art were not easy to come by: supply was limited and attention was abundant. But in the 21st century, the logic is reversed. The supply is abundant, yet attention is limited. It means very little to own artwork that no-one cares about.&lt;/p&gt;

&lt;p&gt;Ultimately, art collectors are stewards of their art. To buy a piece of art is like adopting a child: the collector takes on the responsibility of protecting the art, of defending it, of sharing it and of making sure that it is appreciated by the world. Recall: by owning something relevant, you become relevant.&lt;/p&gt;

&lt;p&gt;This logic applies not only to speculative collectors looking to preserve wealth (a top predictor of an artist’s value at auction is the number of working artists citing them as an influence), but also to social collectors looking to establish reputations in their communities.&lt;/p&gt;

&lt;p&gt;Returning to the question of digital ownership, then, we see how little difference it makes whether the art is digital or physical, and how little difference it makes that “anyone can download a .jpg” of an artwork. To own art, whether physical or digital, is to tell the world that you have good taste and good judgement. To own art, whether physical or digital, is to take responsibility for that artwork’s status in the world.&lt;/p&gt;

&lt;p&gt;If anything, and this may be a stretch, digital art has the potential of being &lt;em&gt;more&lt;/em&gt; valuable than physical art, as its digitally-native nature may make it more amenable to dissemination, and may keep the universe of digital art more continuously in conversation. Digital art has certainly shown itself to be &lt;a href=&quot;https://oncyber.io/karatekid&quot;&gt;highly&lt;/a&gt; &lt;a href=&quot;https://oncyber.io/artblocksgallery&quot;&gt;suitable&lt;/a&gt; to &lt;a href=&quot;https://gallery.so/blockbird&quot;&gt;interactive display&lt;/a&gt;. But the full extent remains to be seen.&lt;/p&gt;

&lt;h1 id=&quot;iv-art-market&quot;&gt;IV. Art Market&lt;/h1&gt;

&lt;p&gt;For a world as lavish and flamboyant as the art world, its markets are famously opaque. Both buyers and sellers express frustration when transacting art, as they are frequently denied information about historical sales and demand. Prices are presented as “take-it-or-leave-it” and pressure tactics run rampant. As colorfully detailed in the New Yorker’s &lt;a href=&quot;https://www.newyorker.com/magazine/2023/07/31/larry-gagosian-profile&quot;&gt;recent profile&lt;/a&gt; of Larry Gagosian, art dealers frequently engage in practices which, were they to occur in regulated financial markets, would amount to criminal insider trading.&lt;/p&gt;

&lt;p&gt;Because art is a luxury “Veblen” good (one where demand increases, not decreases, with price), one could argue that this opacity is essential, even valuable, in the art market. The more you pay for a piece of art, the more valuable it becomes, and the prospect of huge commissions encourages art merchants to throw extravagant parties, creating an art world teeming with life.&lt;/p&gt;

&lt;p&gt;The digital art market, on the other hand, is &lt;a href=&quot;https://opensea.io/&quot;&gt;perfectly transparent&lt;/a&gt;. With ownership “on the blockchain”, so to speak, everyone knows exactly when an artwork is sold, and for how much, and to whom (to some extent). This allows potential buyers and sellers to know more exactly what they are getting, and limits the ability of middle-folk to squeeze out value while contributing little.&lt;/p&gt;

&lt;p&gt;Conventional art dealers might argue that this level of transparency will lead to depressed art markets, or that price manipulation is needed to &lt;a href=&quot;https://qz.com/103091/high-end-art-is-one-of-the-most-manipulated-markets-in-the-world&quot;&gt;protect artists from hype cycles&lt;/a&gt; over the long term. But with &lt;a href=&quot;https://opensea.io/collection/chromie-squiggle-by-snowfro&quot;&gt;digital squiggles&lt;/a&gt; going for tens of thousands of dollars, that seems not to be the reality.&lt;/p&gt;

&lt;p&gt;Will the transparency of the digital art market mean an end to the parties? Or will it mean new kinds of people throwing new kinds of parties?&lt;/p&gt;

&lt;p&gt;An optimistic view would be that this more transparent art market would lead to new, more productive, value flows. Art dealers would retain their role as taste-makers, with a blunted edge vis-a-vis their most egregious practices. New models would emerge for facilitating the discovery of art, allowing more varieties of people to sustain themselves in the art world. And of course, more could accrue to the artists themselves, allowing them to produce bolder work. How and what, remains to be seen.&lt;/p&gt;

&lt;h1 id=&quot;v-conclusions&quot;&gt;V. Conclusions&lt;/h1&gt;

&lt;p&gt;As with the Impressionists laboring in the French countryside, on-chain generative art represents a new form for visual art. Like Impressionism, this form comes with a new bundle of affordances: a lower cost-of-production, new creative possibilities and constraints, and a certain sensibility and preference of theme, all underpinned by new art-market dynamics.&lt;/p&gt;

&lt;p&gt;It is an exciting time, and it will be interesting to see how the story unfolds in the years ahead.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Thanks to &lt;a href=&quot;https://dianneweinthal.photo/&quot;&gt;dianne weinthal&lt;/a&gt; and Eric Rosenfeld for feedback on earlier versions of this essay&lt;/em&gt;.&lt;/p&gt;
</description>
        <pubDate>Sat, 18 Nov 2023 00:00:00 +0000</pubDate>
        <link>http://kronosapiens.github.io/blog/2023/11/18/understanding-generative-art.html</link>
        <guid isPermaLink="true">http://kronosapiens.github.io/blog/2023/11/18/understanding-generative-art.html</guid>
        
        <category>art</category>
        
        <category>technology</category>
        
        
        <category>blog</category>
        
      </item>
    
      <item>
        <title>Los Angeles</title>
        <description>&lt;h1 id=&quot;i-overture&quot;&gt;I. Overture&lt;/h1&gt;

&lt;p&gt;Los Angeles is a city dreamt into existence. Unlike the great cities of the old world (or even the American east and midwest), which emerged organically around rivers and harbors, supporting trade which supported life, Los Angeles was simply willed into being: in a desert, the product of salesmanship, speculation, and solipsism.&lt;/p&gt;

&lt;p&gt;Truly the “postmodern city”, nestled at the edge of the world: it means nothing, and can thus mean anything.&lt;/p&gt;

&lt;h1 id=&quot;ii-context&quot;&gt;II. Context&lt;/h1&gt;

&lt;p&gt;In his ambitious and wide-ranging &lt;em&gt;City of Quartz&lt;/em&gt;, historian Mike Davis tells the story of early Los Angeles as one of boosterism, with land speculators imagining and selling an image of Southern California to midwestern retirees – leading to a massive internal migration west and making Los Angeles an unexpected bastion of anglo-protestant power. This influx of wealth, earned elsewhere, shaped Los Angeles into a city of consumption, not of creation. It was a place people came to relax and enjoy, not to struggle and create. A consequence of this is, of course, the sloth of being too well-fed: a resistance to change, a streak of conservatism, and an apathy towards the (myriad) other which shades and colors Southern Californian politics to this day.&lt;/p&gt;

&lt;p&gt;During the first half of the 20th century, when the prime land of the coast was virgin and undeveloped, and “high modernism” the civic ethos of the day, there seemed to be nothing more worthwhile than to envision and build entire communities whole-cloth (like the pioneering, and infamous, &lt;a href=&quot;https://www.newyorker.com/magazine/1993/07/26/trouble-in-lakewood&quot;&gt;Lakewood&lt;/a&gt;). Water was stolen, and the then-abundance of land obscured the contradiction inherent in its commodification. The American dream was shrink-wrapped and sold, creating vast communities lacking an orienting mythology, propped up by the (then) large and growing defense industry. Unlike other great cities, Los Angeles never had to produce itself.&lt;/p&gt;

&lt;p&gt;Culturally, the gravity of Hollywood distorts the arts, its inexorable logic turning creative output towards profit and mass appeal – the experience of which gave rise to &lt;em&gt;noir&lt;/em&gt;, the genre of disillusionment. As Davis observes, this unique distortive force is compounded by the city’s relative youth; unlike Paris or New York, Los Angeles lacks the “accumulated patrimony” of successive generations of homegrown cultural movements. Lacking these roots, the cultural topsoil easily erodes, leading to a city of passing fads and weird cults.&lt;/p&gt;

&lt;p&gt;There is no one in charge. The monocentric anglo Downtown power structure of the early 20th century gave way to polycentrism, with the newly risen Jewish Westside vying for influence; the California and Jonathan Clubs contending with Hillcrest. Both live in the shadow of international capital, which can enter freely and distort the economic orbits nigh at will.&lt;/p&gt;

&lt;h1 id=&quot;iii-trendlines&quot;&gt;III. Trendlines&lt;/h1&gt;

&lt;p&gt;Where to go? What is the future for the postmodern city which means nothing and therefore can mean anything?&lt;/p&gt;

&lt;p&gt;The tragedy of Los Angeles seems to be that, despite its abundance of natural beauty, it is a city which consists mostly of non-space, a graph of vital nodes and hostile edges. For somebody plugged in, the city has treasures to offer. But for a physical body in the physical space, there is nothing but heat and asphalt.&lt;/p&gt;

&lt;p&gt;Fortunately – even miraculously – we live in an exceptional time, where the ground (to use a perhaps too-apt metaphor for a California city) is shifting beneath our feet. Decades-long trends are reversing, and an imaginative space is opening up. From the perspective of a native Angeleno recently returned after twelve years away, it seems as though we are witnessing the following:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The failure of anti-development &lt;a href=&quot;https://en.wikipedia.org/wiki/2017_Los_Angeles_Measure_S&quot;&gt;Measure S&lt;/a&gt; seems to have marked a watershed moment, in which the multi-decade slow-growth wave has begun to break.&lt;/li&gt;
  &lt;li&gt;The completion of the Exposition (Expo) train line, and the continuation of Wilshire’s Purple line, point to the increasing effectiveness of the transit lobby.&lt;/li&gt;
  &lt;li&gt;The ongoing coronavirus pandemic has shattered conventional practices and expectations around work and home, creating space for entirely new (and low-congestion) patterns of travel.&lt;/li&gt;
  &lt;li&gt;The recent protests contra police violence, leading to proposals to reduce LAPD funding (defundthepolice) and Mayor Garcetti’s comments that DA Jackie Lacey “might” have to go, suggest a break in the city’s longstanding, unquestioning support for law enforcement.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In these events we see an opening – towards a higher-density, transit-friendly, peaceful city – and what could be the beginning of a transformation from a militarized urban sprawl into a lively urban &lt;em&gt;field&lt;/em&gt;. Los Angeles (and America) of the 2020s is not the violent place of the 70’s or 90’s. We live in a more peaceful time, and it is time to begin the work of slowly and carefully taking down the walls of our guarded enclaves and militarized public spaces.&lt;/p&gt;

&lt;p&gt;America’s lack of empathy has always been our fatal flaw, but it is never too late to change – and in that change, to revitalize and renew. In the words of the Lebanese poet Kahlil Gibran, &lt;a href=&quot;http://www.katsandogz.com/gibran/ongiving.php&quot;&gt;“to withhold is to perish”&lt;/a&gt;. Ultimately, to borrow a phrase from Lévi-Strauss, the mythic energy of America as a strong and muscular power has been spent. It will not return. We must begin to reimagine our national myth as one of empathy in a time of abundance; only in such a renewal is there hope for a new vitality. There is not so much to fear.&lt;/p&gt;

&lt;h1 id=&quot;iv-housing&quot;&gt;IV. Housing&lt;/h1&gt;

&lt;p&gt;Housing is where we begin. The remarkable dysfunction of California’s housing market is well-known, but less obvious is exactly &lt;em&gt;why&lt;/em&gt; it should be so. Good treatments of the subject matter can be found in &lt;em&gt;City of Quartz&lt;/em&gt;, as well as the more recent &lt;a href=&quot;https://www.amazon.com/Golden-Gates-Fighting-Housing-America/dp/0525560211&quot;&gt;Golden Gates&lt;/a&gt;; here is a summary of the main trends.&lt;/p&gt;

&lt;p&gt;California’s housing crisis can be understood as the intersection of a number of forces.&lt;/p&gt;

&lt;p&gt;The first and simplest is that we ran out of land. Well, not land per se, but rather prime, easily-developed coastal farm and ranch land. With a remarkable lack of foresight, early southern California was developed with an aggressive suburban attitude, as large tracts of farmland were divided up and freestanding white-picket-fenced homes were mass-produced and priced-to-own, often purchased by the factory workers drawn by the region’s defense and aerospace manufacturing. When the land ran out – and it did, abruptly, mid-century – it was like a splash of cold water after a night of heavy drinking. Supply froze, and so prices went up… and up.&lt;/p&gt;

&lt;p&gt;This seemingly abrupt but basically inevitable housing shortage had the awkward consequence of creating an essentially random distribution of political power. Whoever happened to have bought a home in the “before times” was now a part of an abruptly wealthy and powerful interest group; wealth and power which was predicated on the maintenance of the status quo – as evidenced by the passing of the highly regressive &lt;a href=&quot;https://en.wikipedia.org/wiki/1978_California_Proposition_13&quot;&gt;Prop 13&lt;/a&gt; in 1978, which reduced property taxes significantly. These peculiar political dynamics are explored by Conor Dougherty in &lt;em&gt;Golden Gates&lt;/em&gt;, and he makes the point that much of the state’s housing dysfunction is due to the asymmetry in political power between owners and renters: for any given proposed development, the current residents are well-defined, while the potential future residents are not. Building housing for 100 people in a neighborhood of 50 means fighting against that 50 without the 100 to back you up, since they don’t exist yet.&lt;/p&gt;

&lt;p&gt;For any given development, this asymmetry holds, which is why local control biases towards stagnation and “NIMBY”-ism. At the municipal and state levels, however, the political calculus changes, as the renter bloc becomes increasingly politically well-defined – as we saw with the decisive failure of Measure S in 2017, the pro-growth bloc, at least at the city level, has overtaken its slow-growth counterpart. As such, it seems that forward-thinking decisions about California housing will inevitably need to be made at the city, county, and state level – neighborhoods thus far have been unwilling to do the job.&lt;/p&gt;

&lt;p&gt;Central control, of course, is not without risk. As strikingly described by James Scott in his excellent &lt;a href=&quot;https://en.wikipedia.org/wiki/Seeing_Like_a_State&quot;&gt;Seeing like a State&lt;/a&gt;, too much central control and we risk ending up with high-modernist wastelands like Brasilia and Chandigarh, where shallow notions of visual order and harmony preclude the development of the cozy corners and impromptu interactions which (famously described by Jane Jacobs) form the public good of a vibrant urban life. Balancing central and local control is a problem as old as philosophy, and the only way out is through. Ultimately, we must transcend the consumptive retiree mentality and the intense class and racial divides and individually acknowledge the city as a commons, the obligation to which there is no individual exit. To fail in this recognition is to sign up for a long and losing battle, propping up failing levees in a storm.&lt;/p&gt;

&lt;p&gt;How, concretely, can we move forward? Culturally and politically, we were unprepared for the abruptness of the crisis, leading to a series of questionable policy choices (along with Prop 13, we have in the 1950’s the emergence of similarly regressive &lt;a href=&quot;https://en.wikipedia.org/wiki/Contract_city&quot;&gt;“contract cities”&lt;/a&gt;), which functioned ultimately to shift tax burdens from the rich to the poor, the legacy of which we live with today in the form of underfunded city services and high consumer taxes. Where from here?&lt;/p&gt;

&lt;p&gt;One idea, too radical for this moment but something to keep in the back of our minds, is the &lt;a href=&quot;https://www.nytimes.com/interactive/2019/06/18/upshot/cities-across-america-question-single-family-zoning.html&quot;&gt;elimination of single-family zoning in its entirety&lt;/a&gt;. New York, considered by many one of the most exciting and dynamic cities in the world, famously included no single-family zoning at all in its 1916 zoning plan. California, with its imaginative legacy of the “open west” will need to make some mental shifts and appreciate that it is time to do the same. The market will put the suburbs where they belong.&lt;/p&gt;

&lt;p&gt;A second idea, also ambitious but slightly more practical, is the joint repeal of Prop 13 and the elimination of rent control. It is crucial that they occur together, as these two programs represent the yin and yang of housing misallocation: Prop 13 suppresses the cost of ownership by suppressing property taxes, while rent control suppresses the cost of renting by suppressing rent. In both cases, skewed incentives discourage movement and invariably result in individuals living alone in three-bedroom apartments or five-bedroom houses, while a few streets down, families of five share a single room. Removing one without the other would be rightly seen as classist – pro-owner, pro-renter – but removing them together makes sense.&lt;/p&gt;

&lt;p&gt;Realistically, we should proceed with incremental zoning changes, along the lines of &lt;a href=&quot;https://en.wikipedia.org/wiki/California_Senate_Bill_50&quot;&gt;SB 50&lt;/a&gt;. Here &lt;a href=&quot;https://en.wikipedia.org/wiki/Scott_Wiener&quot;&gt;Scott Wiener&lt;/a&gt; seems the one to watch.&lt;/p&gt;

&lt;p&gt;Ultimately, it is possible to believe in the value of home ownership without needing homes to be speculative assets, and nowhere are we guaranteed a fixed, unchanging urban landscape.&lt;/p&gt;

&lt;h1 id=&quot;v-transportation&quot;&gt;V. Transportation&lt;/h1&gt;

&lt;p&gt;Housing changes, of course, are impossible without corresponding changes in transportation; they support and enable each other. How we get around determines where we live, and where we live determines how we get around. Density and transit are like two wings of a bird – without them both, it cannot fly.&lt;/p&gt;

&lt;p&gt;Ultimately, the Los Angeles of low-slung sprawl is over. It is not wanted or needed. The long parched boulevards of single-story (often auto-focused) retail &lt;em&gt;must&lt;/em&gt; be reimagined and rebuilt as two to three story mixed-use districts. There is no need to erect skyscrapers and block out the sun, but new usage patterns which allow for street life on &lt;em&gt;all&lt;/em&gt; (or at least, most) of the streets are essential. Rather than separating residence from commerce via tremendous distance, instead mix them more often together, allowing a higher density of residence to support a wider variety of retail and commerce, and breaking the suffocating stranglehold of the car on the city. There is nothing wrong with taking a car to see a friend or to go to a show (discretionary, irregular activities well-served by ride-sharing), but having to take a car to buy milk or a sandwich seems more and more like a tragedy – &lt;a href=&quot;https://noparkinghere.com/&quot;&gt;in which we build more housing for cars than we do people&lt;/a&gt;, and high costs of housing subsidize our so-called “free parking”.&lt;/p&gt;

&lt;p&gt;Ultimately, rather than inhuman distances necessitating personal cars for quotidian activities, we would like a gradation of distances for different activities, with the proximity of goods and services being directly linked to the frequency of their need. Daily essentials should be accessible by foot; infrequent needs met by rideshare or bicycle. Owning a car in Los Angeles must become an option, rather than a necessity, for a significant part of the population.&lt;/p&gt;

&lt;p&gt;Curiously, the coronavirus presents us with a rare opportunity in this regard. With the surge in people working from home, the country’s entire private sector is learning just how little a daily commute is needed to run a successful organization. Certainly, video conferencing can never fully replace in-person, face-to-face interaction, but for many people, a lot of the time, it is unneeded. Making a guess, we should expect a permanent post-virus reduction in the amount of time spent in an office of anywhere from 10-20% – leading to a corresponding drop in commute days. If 1/5 of workers spent any given workday in their residential neighborhood, rather than commuting to a commercial center, we would simultaneously see a drop in road congestion and demand for parking in these commercial centers, while seeing an increase in demand for business (food, retail, etc) in the residential neighborhoods. More demand for business in the residential neighborhoods means more business density, which will make these neighborhoods more pedestrian-friendly. Overall, a greater uniformity of population distribution over the course of the day (vs. large daily migrations from residential to commercial center and back) allows for a wider variety of businesses to thrive over more of the city’s area, reducing the need to cover large distances on a daily basis.&lt;/p&gt;

&lt;p&gt;With fewer cars on the road at any given time, alternative modes of transit can be set up for success. Taking cues from &lt;a href=&quot;https://www.politico.eu/article/the-city-of-2050-less-smog-more-bikes-and-hyper-local-living/&quot;&gt;other cities&lt;/a&gt;, we can permanently designate certain boulevards bicycle-and-pedestrian only, doing for cyclists and pedestrians what Robert Moses did for Long Island’s beachgoers – giving them pleasant, relaxing ways to get around (and with &lt;a href=&quot;https://twitter.com/marctorrence/status/1279575109460729857?s=20&quot;&gt;great opportunities for sidewalk dining&lt;/a&gt;). In fact, &lt;a href=&quot;https://streetsforall.org/covid19&quot;&gt;we have already begun&lt;/a&gt;. Fewer cars on the road means more opportunity for dedicated bus lanes, which, if run properly, represent an &lt;a href=&quot;https://alexdanco.com/2019/09/12/why-i-dont-love-light-rail-transit/&quot;&gt;invaluable supplement&lt;/a&gt; to the city’s earnest-but-outmatched rail system, allowing us to provide effective transit beyond the city center.&lt;/p&gt;

&lt;p&gt;That said, access to a car is a great convenience, and ride-sharing and one-off rentals can be expensive and are impractical for, say, camping trips. An easy proptech-adjacent entrepreneurial solution would be to offer “shared-cars-as-a-service”, where groups of people can subscribe to a personal car. The subscription would include normal maintenance and a group driver’s insurance policy, and gas usage would be tracked and the costs distributed automatically. That or something like it would fill an important niche very nicely, allowing two or three cars to meet the needs of perhaps six or ten people living in proximity.&lt;/p&gt;

&lt;p&gt;As an aside, Los Angeles could be a cyclist’s paradise. It is relatively flat and the heat is dry, not humid, making cycling suitable for daily use by professionals (no working up a sweat). If the city’s amenities and points of interest were more evenly distributed (as they will likely become), the addition of one or two cyclist-friendly east/west boulevards could greatly expand the potential of cycling in the city. Imagine if Olympic Blvd were reduced to two lanes of car traffic, freeing the rest for a large, protected bike path – crossing the city would be a breeze. And if that same boulevard were lined with residences, eateries, and light commercial spaces?&lt;/p&gt;

&lt;p&gt;One could argue that a long-term coronavirus-related reduction in driving would make transit less relevant, rather than more, as it would be consequently easier to get around by car. While there is likely some truth to this, congestion remains half the equation, with parking being the other. As long as every Angeleno needs a car, parking subsidies will continue to be built into the costs of housing, exacerbating the city’s housing affordability crisis. We must seize the opportunity to build transit back into the city.&lt;/p&gt;

&lt;p&gt;As memorably put by technology investor Ben Horowitz, “there are no silver bullets, only lead ones”. No single intervention will solve Los Angeles’ substantial transportation problems, but a variety of interventions can interact to provide an effective multi-modal transportation grid. A multi-modal transportation grid allows for more heterogeneous movement patterns, meaning the load on any system is reduced – but if everyone has to drive, we could build freeways for one hundred years and never get the capacity we’d need.&lt;/p&gt;

&lt;h1 id=&quot;vi-outro&quot;&gt;VI. Outro&lt;/h1&gt;

&lt;p&gt;As mentioned in the introduction, Los Angeles is a postmodern city – a product of idea more than of material. While at first blush this might seem like a weakness or fatal flaw (“a meaningless city”), it may yet be the city’s great strength, as a postmodern city can more easily be imagined and re-imagined again.&lt;/p&gt;

&lt;p&gt;There is an opportunity to re-imagine Los Angeles, not as a harsh urban sprawl, but as a vital &lt;em&gt;urban field&lt;/em&gt;. A city where a density of variously-priced residences, spread throughout the city, supports a vibrant commercial life throughout. A city where lush, pedestrian-and-bicycle boulevards complement light rail and well-run protected bus routes to make getting around the city easy and pleasant (complemented by ride-share and – gasp – even some car ownership).&lt;/p&gt;

&lt;p&gt;This is not wild speculation – much of this, at least directionally, is happening now.&lt;/p&gt;

&lt;p&gt;Further, there is a chance to make the postmodern city – one of the most diverse on earth – into a racially integrated city, and to continue to take steps to heal the racial divides which sit at &lt;a href=&quot;https://foreignpolicy.com/2020/07/03/america-founding-fathers-jefferson-washington-adams-race-civil-war/&quot;&gt;the very heart&lt;/a&gt; of American identity. Here again the postmodern city shines – for where else can old and tired myths be reimagined and renewed? Can we find the courage to take funding away from security and from fear, and to put it towards culture and creation? And while we’re at it, make peace with the towns upstream whose water we stole?&lt;/p&gt;

&lt;p&gt;None of these aspirational outcomes, of course, are guaranteed. But they are possible. The trendlines are all there. With vision, leadership, courage, and fortitude, they may even be achievable. And that is something.&lt;/p&gt;

&lt;h3 id=&quot;selected-sources&quot;&gt;Selected Sources&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/City_of_Quartz&quot;&gt;City of Quartz&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.amazon.com/Golden-Gates-Fighting-Housing-America/dp/0525560211&quot;&gt;Golden Gates&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.newyorker.com/magazine/1993/07/26/trouble-in-lakewood&quot;&gt;Trouble in Lakewood&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Seeing_Like_a_State&quot;&gt;Seeing like a State&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://noparkinghere.com/&quot;&gt;No Parking Here&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://thelapod.com&quot;&gt;LA Podcast&lt;/a&gt; (&lt;a href=&quot;https://thelapod.com/episode/check-police/&quot;&gt;1&lt;/a&gt;, &lt;a href=&quot;https://thelapod.com/episode/scrums-of-beverly-hills-2/&quot;&gt;2&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
</description>
        <pubDate>Sat, 04 Jul 2020 00:00:00 +0000</pubDate>
        <link>http://kronosapiens.github.io/blog/2020/07/04/los-angeles.html</link>
        <guid isPermaLink="true">http://kronosapiens.github.io/blog/2020/07/04/los-angeles.html</guid>
        
        <category>housing</category>
        
        <category>policy</category>
        
        <category>urbanism</category>
        
        
        <category>blog</category>
        
      </item>
    
      <item>
        <title>Sharing the Wealth</title>
        <description>&lt;p&gt;&lt;em&gt;This essay was originally prepared for a London-based zine on gentrification.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;sharing-the-wealth&quot;&gt;Sharing the Wealth&lt;/h2&gt;

&lt;h4 id=&quot;smoother-residential-gentrification-through-shared-ownership&quot;&gt;Smoother residential gentrification through shared ownership&lt;/h4&gt;

&lt;p&gt;From the moment early man first pointed to the ground and said “mine”, we have fought for control over land. For most of our history, this conflict involved bloody violence; in recent decades, however, economic conflict has replaced physical, and “pirate raid” is replaced by “pricing out”. It is right to interpret this as progress, for it is – and to recognize that competition over land has never been absent from the human experience.&lt;/p&gt;

&lt;p&gt;The question is not one of “stopping gentrification” – it is intellectual laziness to conclude that our perennial competition to occupy land can be permanently stopped through some perfect policy – but rather one of best managing its energies. Here we can have hope: while a river cannot be stopped, a well-built dam can be a source of tremendous power.&lt;/p&gt;

&lt;p&gt;Why do we view residential gentrification – the claiming of land from the less affluent by the more affluent – as “bad”, a social ill to be prevented? Surely it is not the basic experience of change, as change is as constant as breathing. No, we view residential gentrification as bad because, the logic of capitalism aside, home is an emotional sphere which exists outside and apart from the market, even while the physical space remains embedded within it. As Karl Polanyi says in his seminal &lt;a href=&quot;https://en.wikipedia.org/wiki/The_Great_Transformation_(book)&quot;&gt;&lt;em&gt;The Great Transformation&lt;/em&gt;&lt;/a&gt;, there exist three “fictitious commodities”: land, labor, and money – which we embed in markets, even though truly markets are embedded &lt;em&gt;in them&lt;/em&gt;. We should place housing in this category also.&lt;/p&gt;

&lt;p&gt;Unfortunately, recognizing home as a fictitious commodity does not solve the underlying problem, as we lack a better alternative for resolving conflict over the underlying land: nationalized land ownership and rent control – two possible strategies – come with their own pernicious costs, a loss of freedom in the former and chronic misallocation in the latter. This recognition is valuable, however, in that it suggests that there may be more to the social contract. If we accept that people may always have to leave, the question becomes one of “what does leaving look like”? And here we can find answers and inspiration.&lt;/p&gt;

&lt;p&gt;What happens when a neighborhood gentrifies? The prices go up. Rents increase, goods and services become more expensive, and the value of land and property rises. To whom does this value accrue? Overwhelmingly, the landowners. This is the problem: the residents, who participated in the creation of the neighborhood (by living their lives, raising their families, and supporting local businesses), receive none of the upside. To transform the experience of residential gentrification, we must find a way to transfer some of this new value from the landowners to the residents. Being priced out of your home is an unfortunate but sometimes unavoidable reality. What is not unavoidable is walking away with nothing. The emotional experience of gentrification would be very different were a family to walk away with tens of thousands of dollars – or more – to start a new life.&lt;/p&gt;

&lt;p&gt;Simply: tenants should receive partial ownership of their buildings, as a function of the duration of their tenancy. This partial ownership recognizes the role the tenants play – as the fabric of the community – in creating the value that is currently captured entirely by landowners. One scheme would be to set aside 20% of the building’s &lt;em&gt;increase in value&lt;/em&gt; for tenants (much like companies often set aside a percentage of their stock for employees), and to distribute these to tenants as a function of tenure – say 0.2% per year. After ten years, a tenant can claim 2% of the increased value of the building. If the value of that building increases by $1,000,000, that tenant receives $20,000.&lt;/p&gt;
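
&lt;p&gt;The arithmetic of this example can be sketched in a few lines of Python. This is only an illustration: the function name and its parameters are hypothetical, and capping a single tenant’s claim at the size of the pool is an assumption added here, not a detail of the proposal.&lt;/p&gt;

```python
def tenant_payout(value_increase, years_of_tenancy,
                  rate_per_year=0.002, pool_share=0.20):
    """Illustrative payout for one tenant under the scheme described above.

    rate_per_year (0.2% per year) and pool_share (20% of the increase) are the
    figures from the example; the cap at pool_share is an added assumption.
    """
    claim = min(rate_per_year * years_of_tenancy, pool_share)
    return claim * value_increase


# Ten years of tenancy in a building that gained $1,000,000 in value.
print(round(tenant_payout(1_000_000, 10)))  # 20000
```

&lt;p&gt;A real scheme would also need rules for appraisal and for dividing the pool among many tenants, which this sketch leaves out.&lt;/p&gt;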

&lt;p&gt;There are many ways such schemes could be implemented. One approach involves landlords adopting a policy where, every ten or so years, they “buy out” their tenants at the then-appraised value. Another approach involves tenants redeeming their shares against &lt;em&gt;future rental income&lt;/em&gt;. If I spent a decade in a neighborhood paying $1000/mo, and then get priced out at $1500/mo, then I could claim a percentage of these higher rents over a period of several years – an empowering passive income stream for a class which has historically struggled to accumulate capital.&lt;/p&gt;

&lt;p&gt;It is reasonable to wonder why landowners would ever agree to such a scheme, absent a significant social movement to force new legislation. The answer is that it aligns the incentives between landowners and tenants, by giving tenants a reason to “think like owners”, resulting in better-maintained and more beautiful living spaces. The contemporary landlord-tenant relationship is toxic, with each party encouraged to extract as much value as possible from the other. Neither party is incentivized to invest in the improvement of the property: the tenant does not because they capture none of the value, and the landlord does not because it cuts into their short-term bottom-line. Under a scheme in which the upside is shared, the tenant knows that they will be able to capture some of the upside associated with their contributions, and the landlord knows that the tenant will be motivated to take care of any improvements – saving the landlord in maintenance costs and real, but hard-to-price, emotional conflict. While it may seem outlandish to contemplate, a scheme like this would increase the quality of landlord-tenant relationships, the way in which landlords are viewed by society, and the quality of our living spaces – a win-win-win.&lt;/p&gt;

&lt;p&gt;Absent some radical new development in law or philosophy, we should view gentrification as a fact of life, much like death and taxes. The productive train of thought is not how to “stop” gentrification, but rather how to shape the process. Here we propose a shared-ownership scheme which aligns the interests of landlords and tenants in a way which takes much of the bite out of gentrification, and allows the tenants, who may inevitably find themselves needing to look for a new home, to walk away with something substantial – wealth which can be invested in new neighborhoods and new communities. This new type of windfall will transform the emotional experience of residential gentrification (a key aspect of gentrification writ large) from one of powerlessness to one of agency in the face of change, and represents an important step forward in urban policy.&lt;/p&gt;
</description>
        <pubDate>Sun, 19 Apr 2020 00:00:00 +0000</pubDate>
        <link>http://kronosapiens.github.io/blog/2020/04/19/sharing-the-wealth.html</link>
        <guid isPermaLink="true">http://kronosapiens.github.io/blog/2020/04/19/sharing-the-wealth.html</guid>
        
        <category>housing</category>
        
        <category>policy</category>
        
        <category>economics</category>
        
        
        <category>blog</category>
        
      </item>
    
      <item>
        <title>A Review of &apos;Gaming the Vote&apos;</title>
        <description>&lt;h2 id=&quot;i-introduction&quot;&gt;I. Introduction&lt;/h2&gt;

&lt;p&gt;Some months ago I read William Poundstone’s &lt;em&gt;&lt;a href=&quot;https://www.amazon.com/Gaming-Vote-Elections-Arent-About/dp/0809048922&quot;&gt;Gaming the Vote&lt;/a&gt;&lt;/em&gt;, a comprehensive survey of the history and attributes of various alternative voting systems – a topic of longstanding &lt;a href=&quot;/blog/2017/02/06/thesis.html&quot;&gt;personal interest&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The villain of Poundstone’s story is the &lt;em&gt;plurality vote&lt;/em&gt;, a fair-seeming but perniciously flawed way of choosing leaders, and Poundstone devotes fully the first third of the book to a survey of the method’s myriad historical failures. The remaining two thirds are a survey of the alternatives, and a discussion of their own (inevitable) Achilles’ heels. Poundstone considers, in turn, the classic methods of Borda and Condorcet, the more recent innovations of Instant Runoff Voting and Approval Voting, and concludes with a bullish assessment of Score Voting as the “least worst” option. Along the way, we meet (among others) the idealistic Marquis de Condorcet, the obsessive-creative Charles Dodgson (Lewis Carroll!), and of course, Kenneth Arrow, our &lt;em&gt;axis mundi&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Let’s now review the various electoral systems (as we will see, each has a fatal flaw), and then home in on a number of points where I think Poundstone’s argument can be extended or corrected.&lt;/p&gt;

&lt;h2 id=&quot;ii-the-voting-systems&quot;&gt;II. The Voting Systems&lt;/h2&gt;

&lt;h3 id=&quot;plurality-voting&quot;&gt;Plurality Voting&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;What:&lt;/strong&gt; Voters submit single votes; the candidate with the most votes is the winner.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;But:&lt;/strong&gt; Susceptible to vote-splitting, leading to the election of minority candidates (the “spoiler effect”).&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Plurality voting (also known as “first-past-the-post”) is an electoral system in which voters cast a single vote for their preferred candidate, out of an arbitrary pool of candidates. The votes are summed up, and the candidate with the most votes is elected.&lt;/p&gt;

&lt;p&gt;While simple and fair on its face, the critical flaw of the plurality vote is that when there are more than two candidates, often the larger group (the “majority bloc”) will find themselves choosing between two candidates, while the smaller group (the “minority bloc”) will have only one. In this case, the majority bloc, which should be the one to choose the candidate, will end up losing to the minority bloc. In concrete terms, if the majority bloc is 60% of the vote with two equally-popular candidates (Alice and Bob), then the minority candidate (Charlie) will win with 40% of the vote, while the two majority candidates will each take 30% of the vote and lose the election. Note that no candidate won a majority of the votes, a reasonable standard for legitimacy.&lt;/p&gt;

&lt;p&gt;In real-world elections, a more common occurrence is that the majority and minority blocs are more closely matched (say 52% vs 48%), and a fringe candidate comes in to pull ~5% of the vote from the majority candidate, handing the victory to the minority bloc – the “spoiler effect” phenomenon which famously sent George W. Bush to the White House in the 2000 United States presidential election. In addition, Poundstone discusses the recent practice (as of the last few decades) in which minority blocs explicitly encourage this phenomenon by discreetly funding fringe candidates to challenge their majority opponent.&lt;/p&gt;

&lt;p&gt;From a theoretical perspective, the key limitation of the plurality vote is that a voter is unable to provide sufficient &lt;em&gt;information&lt;/em&gt; about their preferences. In many cases, voters who vote for a fringe candidate would still prefer the majority candidate to beat the minority candidate, but this type of “second choice” information is unavailable to the election algorithm. As we will see, the common theme of every other system Poundstone describes is their attempt to incorporate this additional information (with variously mixed results).&lt;/p&gt;

&lt;h3 id=&quot;the-borda-count&quot;&gt;The Borda Count&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;What:&lt;/strong&gt; Voters submit ranked lists, with candidates receiving more or fewer votes depending on their position. The candidate with the most votes is the winner.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;But:&lt;/strong&gt; Susceptible to tactical voting (“burying”), which can lead to the election of unintended candidates.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Borda count is a method which attempts to avoid the flaw of plurality voting by allowing voters to convey their preferences not only for their first choice, but for all the candidates, by submitting a ranked list. Known appropriately as a “ranked choice” method, the Borda count assigns a numeric score to every candidate based on their position in the list (i.e. for the list “Alice &amp;gt; Bob &amp;gt; Charlie”, Alice gets two points, Bob one, and Charlie zero).&lt;/p&gt;

&lt;p&gt;The Borda count thus avoids the spoiler effect by allowing multiple candidates to “share” the support of their bloc, in such a way that one of the majority candidates should prevail over the minority. To return to our earlier example, say that Alice and Bob are equally-popular candidates for the 60% majority bloc, while Charlie is the sole candidate for the 40% minority bloc. With the Borda count, each of Alice and Bob will receive a score of 30 * 2 + 30 * 1 = 90, while Charlie will receive a score of 40 * 2 = 80. If we assume some small amount of randomness such that an exact tie is avoided, one of either Alice or Bob will win.&lt;/p&gt;

&lt;p&gt;However, the Borda count’s flaw is that by assigning scores to individual candidates &lt;em&gt;as a function of the number of total candidates&lt;/em&gt;, it makes it possible to create “artificial distance” between candidates (i.e. to create a distance of “two” between Alice and Bob, simply by the presence of Charlie). This ability to create distance between candidates leads to the phenomenon of “burying”, a type of strategic vote where strong candidates are ranked “artificially” low.&lt;/p&gt;

&lt;p&gt;In extreme cases this can lead to unexpected candidates being elected. Consider an example of Alice, Bob, and Charlie, where Alice is the majority bloc candidate (60%), Bob is the minority bloc candidate (40%), and Charlie is a moderate. With plurality voting, Alice would easily win. But under the Borda count, the minority bloc can vote strategically to put Charlie in office: if the majority bloc ranks Alice, Charlie, and Bob, then the minority bloc can vote tactically, falsely putting Charlie as their top candidate (and Alice last). This would lead to a Charlie victory with a score of 40 * 2 + 60 * 1 = 140 vs. Alice’s score of 60 * 2 = 120.&lt;/p&gt;
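
&lt;p&gt;The full tally makes the effect of burying easier to see. As a sketch, assume (this is not spelled out above) that the minority bloc’s sincere ranking is Bob, Charlie, Alice, and that its tactical ranking is Charlie, Bob, Alice. With sincere ballots, Alice wins:&lt;/p&gt;

\[\begin{gathered} \text{Alice} = 60 \cdot 2 + 40 \cdot 0 = 120 \\ \text{Charlie} = 60 \cdot 1 + 40 \cdot 1 = 100 \\ \text{Bob} = 60 \cdot 0 + 40 \cdot 2 = 80 \end{gathered}\]

&lt;p&gt;With the tactical ballots, the minority’s two points move from Bob to Charlie, and Charlie overtakes Alice:&lt;/p&gt;

\[\begin{gathered} \text{Alice} = 60 \cdot 2 + 40 \cdot 0 = 120 \\ \text{Charlie} = 60 \cdot 1 + 40 \cdot 2 = 140 \\ \text{Bob} = 60 \cdot 0 + 40 \cdot 1 = 40 \end{gathered}\]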

&lt;p&gt;Theoretically, the problem is rooted in the ability to create absolute (numeric) distance between a pair of candidates, compared to simply relative distance, as a function of something &lt;em&gt;other than the pair of candidates themselves&lt;/em&gt; (in this case, the number of candidates total). The number of candidates in the election should not be able to affect the relative preferences of pairs of candidates (where the majority prefers Alice to Charlie), but in the Borda count it does. Another way to frame this limitation is that the Borda count provides no way to express non-uniform “distances” between the candidates – the psychic “space” between Alice and Bob is assumed to be just as large as that between Bob and Charlie. This can be understood as a type of information-theoretic noise caused by the measurement not quite fitting the thing being measured (this will be a recurring theme).&lt;/p&gt;

&lt;h3 id=&quot;the-condorect-winner&quot;&gt;The Condorcet Winner&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;What:&lt;/strong&gt; Voters submit ranked lists, which are translated into pairwise contests. The candidate which defeats every other candidate, pairwise, is the winner.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;But:&lt;/strong&gt; No guarantee of transitivity, i.e. a winner might not exist.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Condorcet winner is the candidate who would beat every other in a two-way contest. This method attempts to both avoid the spoiler effect and the problems of the Borda count by allowing voters to submit multiple preferences &lt;em&gt;without&lt;/em&gt; creating artificial distances.&lt;/p&gt;

&lt;p&gt;To return to our opening example, in an election where Alice and Bob are majority candidates and Charlie is the minority candidate, both Alice and Bob will beat Charlie (60/40), and then Alice and Bob will themselves face off in what will amount to a 50/50 split (with some small randomness breaking the tie). This method solves the problems of plurality voting by allowing for the incorporation of the information that 60% of the people would prefer Alice and Bob to beat Charlie, while &lt;em&gt;avoiding&lt;/em&gt; the problems of the Borda count by never &lt;em&gt;casting relative preference to absolute&lt;/em&gt; (i.e. if I prefer Alice to Bob, ranking Bob &lt;em&gt;last&lt;/em&gt; doesn’t make Alice beat him by “more” as long as he is ranked after Alice).&lt;/p&gt;

&lt;p&gt;The problem with this method, however, is that the Condorcet winner &lt;em&gt;may not exist&lt;/em&gt;. In a scenario with three equal-sized blocs (and three candidates), we may find ourselves in a situation where Alice beats Bob by 2:1 (i.e. bloc A and C both prefer Alice to Bob), Bob beats Charlie by 2:1 (i.e. bloc A and B both prefer Bob to Charlie), and yet Charlie beats Alice by 2:1 (because bloc B and C both prefer Charlie to Alice). This can be written as:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Bloc A: Alice &amp;gt; Bob &amp;gt; Charlie&lt;/li&gt;
  &lt;li&gt;Bloc B: Bob &amp;gt; Charlie &amp;gt; Alice&lt;/li&gt;
  &lt;li&gt;Bloc C: Charlie &amp;gt; Alice &amp;gt; Bob&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This creates a rock-paper-scissors situation known as a “cycle” (draw it as a graph – or scroll down – to see why), and there is no winner. Obviously it is problematic if there is “no winner” in an election, and so this is seen as a flaw in the method. Defenders of Condorcet methods say that cycles are rare in practice (it requires a particular “arrangement” of candidates and voters), and so the concern is over-blown (compared to the type of failures which can occur in the Borda count and plurality votes, which are much more common).&lt;/p&gt;

&lt;p&gt;From a theoretical perspective, the problem here is not the lack of information, but rather that the algorithm cannot “see” all the information that is there. As we will learn later, there are other techniques which can see this information and break cycles in a fair way.&lt;/p&gt;

&lt;h3 id=&quot;instant-runoff-voting&quot;&gt;Instant Runoff Voting&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;What:&lt;/strong&gt; Voters submit ranked lists. An iterative algorithm re-allocates votes until a clear (majority) winner emerges.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;But:&lt;/strong&gt; The algorithm is “non-monotonic”: giving a candidate more votes can cause them to lose. Also, popular second-choice candidates may be eliminated too early.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instant Runoff Voting (IRV) is an interesting beast. Unlike the other methods, which tally the ballots in a single pass, IRV is iterative: it runs in a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;while&lt;/code&gt; loop, rearranging votes until one candidate has a majority of first-place votes. Like the Borda count and Condorcet winner, IRV attempts to make second-choice votes meaningful by successively eliminating last-place candidates, reallocating votes to second (or third or fourth!) choices in the event that an eliminated candidate was a voter’s first choice. Returning to our opening example, consider the case where Alice and Bob split the 60% majority bloc’s vote down the middle (alternating as first and second choice on the ballots), while Charlie takes all of the 40% minority bloc’s votes. In the first round of the IRV algorithm, either Alice or Bob will be eliminated (having the fewest votes, ~30%, versus Charlie’s 40%). Let’s say that Bob is eliminated. Now, since every voter who ranked Bob first ranked Alice second, all of Bob’s votes will now be transferred to Alice, giving her 60% of the vote and a majority victory.&lt;/p&gt;

&lt;p&gt;IRV has proven to be popular in practice, gaining traction in governments and municipalities around the world, in part due to the intuitive nature of the algorithm making the process easy to understand. Among contemporary advocates for voting reform, IRV has become one of the most popular options (rivaling Score Voting). However, IRV is not without flaws. The essential problem with IRV is that it is, in mathematical speak, “non-monotonic”. A “monotonically increasing” function is one which is always either staying the same or increasing, but never decreasing: for IRV, the “non-monotonicity” means that giving &lt;em&gt;more&lt;/em&gt; votes to a candidate can actually cause them to lose, a troubling phenomenon which does not appear in any of the other systems considered. Also, the nature of the IRV algorithm means that the most broadly popular candidate may still lose. To see why, consider the case where the population is split 50/50 for Alice and Bob, with Charlie being a universally-popular second choice. Charlie seems like the natural best choice, but because he receives no first-place votes, the IRV algorithm eliminates him in the first round, resulting in a deadlock between Alice and Bob – exactly the situation we would hope IRV would avoid.&lt;/p&gt;

&lt;h3 id=&quot;approval-voting&quot;&gt;Approval Voting&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;What:&lt;/strong&gt; Voters submit a binary approve/reject &lt;em&gt;per candidate&lt;/em&gt;. Votes are tallied according to plurality rules.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;But:&lt;/strong&gt; The ambiguous semantics of “approval” means that the winner of the election can be hard to predict.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Borda count, Condorcet winner, and Instant Runoff Voting all fall under the category of “ranked-choice voting” systems: they are all different algorithms which operate on the same &lt;em&gt;measurement&lt;/em&gt;, that of a relative ranking of candidates. Approval voting (and its cousin, score voting) are fundamentally different animals, in that their primary representations are not relative, but absolute. As we will see, this approach will spare these methods from many of the flaws of their relative brethren, but introduces pernicious problems of its own.&lt;/p&gt;

&lt;p&gt;Approval voting can be thought of as a generalization of plurality voting, but where instead of voting for one candidate, you can vote for &lt;em&gt;as many as you like&lt;/em&gt;. This prevents the spoiler effect by allowing second-choice candidates to receive votes alongside the first-choice candidates. This also removes the benefits of voting strategically, since the sincere vote is the optimal vote. Recall Alice, Bob, and Charlie. With approval voting, both Alice &lt;em&gt;and&lt;/em&gt; Bob will receive ~60% of the vote, compared to Charlie’s meager 40%. Some small randomness will ensure that one of Alice and Bob is elected, to the satisfaction of the majority bloc.&lt;/p&gt;

&lt;p&gt;Unfortunately, the ambiguous semantics of “approval” (what is the standard by which someone is “approved”?) means that, contrary to expectations, mediocre candidates can prevail over strong candidates. Consider a situation where Alice is beloved by 60% of the population, Bob beloved by the other 40%, and Charlie seen as a bumbling but endearing candidate whom no one takes seriously, but no one despises. With approval voting, it is possible that more than 60% (potentially up to 100%) of the population will “approve” of Charlie, on the grounds that, from the perspective of each bloc, Charlie isn’t a &lt;em&gt;bad&lt;/em&gt; candidate. As a result, Charlie wins the election – an unintended outcome. More fundamentally, depending on how voters interpret “approval”, the same ranked-ordering of candidates can lead to different election outcomes – a phenomenon known as “indeterminacy”.&lt;/p&gt;

&lt;p&gt;Observe that, under a Borda count, Alice would win the election, since a first-place vote from 60% of the population is worth more than a second-place vote from 100% of the population (2 * 60 &amp;gt; 1 * 100). With approval voting, the inability to represent the &lt;em&gt;underlying&lt;/em&gt; relative distinctions leads to the measurement error manifesting as indeterminacy. Said another way, by treating the candidates as unrelated in the model, the base concept of “approval” decoheres and loses definition as relativity inevitably re-inserts itself.&lt;/p&gt;

&lt;h3 id=&quot;score-voting&quot;&gt;Score Voting&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;What:&lt;/strong&gt; Voters submit a numeric score &lt;em&gt;per candidate&lt;/em&gt;. Votes are tallied according to plurality rules.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;strong&gt;But:&lt;/strong&gt; The ambiguous semantics of “scoring” means that the winner of the election can be hard to predict.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On the surface, score voting arrives on the scene as the funnier and more handsome cousin of approval voting. Rather than precluding any expression of relative preference, score voting permits the assignment of real-valued scores to each candidate, allowing for &lt;em&gt;implicit&lt;/em&gt; relative preference. Further, the ability to use a fuller range allows voters to avoid the false “uniformity of differences” which hounds the Borda count.&lt;/p&gt;

&lt;p&gt;Unfortunately, this merely kicks the can down the road – it turns out that numerical scores vary as a function of the candidates just as much as binary “approvals”. Consider a beloved Alice, a well-meaning Bob, and a chicanerous Charlie. A voter might give Alice a 1, Bob a .6 (indicating their positive sentiment), and Charlie a 0. Say Charlie is indicted for fraud and drops out of the race – what happens to Bob’s score? The voter wants Alice to win, and giving Bob a 0 maximizes those chances. So we see how Bob’s score, real-valued as it is, decoheres just as much as “approval” does with binary votes. Fundamentally, the voter has no “score” for Bob, only a relative sentiment vis-a-vis Alice and Charlie. In line with our theme, we conclude that the use of real-valued scores is a mirage, providing an &lt;em&gt;illusion&lt;/em&gt; of information.&lt;/p&gt;

&lt;p&gt;Apart from this, score voting is subject to the same quirks as approval voting, and so there is no need to recount them here. And of course, as with all systems described here, the existence of these types of failure conditions &lt;em&gt;in principle&lt;/em&gt; says little about the frequency with which they will be encountered &lt;em&gt;in practice&lt;/em&gt;. All of these systems work well most of the time, which is good enough, &lt;em&gt;most of the time.&lt;/em&gt;&lt;/p&gt;

&lt;h2 id=&quot;iii-generalized-relativity&quot;&gt;III. Generalized Relativity&lt;/h2&gt;

&lt;h3 id=&quot;the-confounding-of-condorcet&quot;&gt;The Confounding of Condorcet&lt;/h3&gt;

&lt;p&gt;Let us now take the ceremonial potshot at our bugbear, the Independence of Irrelevant Alternatives (IIA) – the most frustrating of Arrow’s criteria. Consider the example on page 225 of &lt;em&gt;Gaming the Vote&lt;/em&gt;:&lt;/p&gt;

&lt;p&gt;Three candidates (here, Clinton, Bush, and Perot) run in a ranked-choice election with a Condorcet winner. The first ballots arrive:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Clinton &amp;gt; Bush &amp;gt; Perot (30 million)&lt;/li&gt;
  &lt;li&gt;Bush &amp;gt; Perot &amp;gt; Clinton (30 million)&lt;/li&gt;
  &lt;li&gt;Perot &amp;gt; Clinton &amp;gt; Bush (30 million)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This leads to the following graph:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://s3.amazonaws.com/kronosapiens.github.io/images/condorcet-1.jpg&quot; alt=&quot;Condorcet 1&quot; /&gt;&lt;/p&gt;

&lt;p&gt;As we can see, we have a nightmarish cycle in which each candidate wins and loses by a landslide, and there is intuitively no winner. With this voting data, no system could declare a winner, as the information simply does not exist. Now, additional ballots arrive:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Bush &amp;gt; Clinton &amp;gt; Perot (20 million)&lt;/li&gt;
  &lt;li&gt;Clinton &amp;gt; Bush &amp;gt; Perot (15 million)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This leads to the following graph:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://s3.amazonaws.com/kronosapiens.github.io/images/condorcet-2.jpg&quot; alt=&quot;Condorcet 2&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Now we have sufficient information to declare a winner – but who?&lt;/p&gt;

&lt;p&gt;As Poundstone points out, the new votes favor Bush, and yet – despite a folk moral arithmetic implying that a tie plus a victory equals a victory – Clinton is the Condorcet winner. Yet Clinton’s Condorcet victory hides the landslide victory that Bush holds over Perot – much more decisive than either of Clinton’s margins. It &lt;em&gt;feels&lt;/em&gt; to us that this one landslide means more than Clinton’s two narrower victories. We spectators, with our “artist’s eye” (to borrow a term from Paglia) can “see” the relevance of this background victory to the larger picture, but the simple “machine mind” of the Condorcet algorithm cannot.&lt;/p&gt;
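
&lt;p&gt;For reference, the pairwise tallies behind the second graph can be worked out directly from the five groups of ballots listed above (125 million in total). Each line counts, in millions, the ballots ranking the first candidate above the second, against those ranking the second above the first:&lt;/p&gt;

\[\begin{gathered} \text{Clinton vs. Bush: } 30 + 30 + 15 = 75 \text{ to } 30 + 20 = 50 \\ \text{Clinton vs. Perot: } 30 + 20 + 15 = 65 \text{ to } 30 + 30 = 60 \\ \text{Bush vs. Perot: } 30 + 30 + 20 + 15 = 95 \text{ to } 30 \end{gathered}\]

&lt;p&gt;Clinton wins both of his contests and is therefore the Condorcet winner, while the largest margin on the board belongs to Bush over Perot.&lt;/p&gt;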

&lt;p&gt;Of course, not all is lost – the information is there, clearly – &lt;em&gt;we&lt;/em&gt; can see it. Condorcet cannot, but it turns out that Borda can. In this setting the Borda count would “see” Bush’s victory – with a count of 145 million compared to Clinton’s 140. Of course, that the Borda count is “right” &lt;em&gt;in this case&lt;/em&gt; does not mean that it is better – as we have discussed earlier, each algorithm can “see” only certain facets of the world – and in this case, the Borda count is the machine that sees the right things.&lt;/p&gt;
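
&lt;p&gt;The Borda totals quoted above can be checked by hand. Using the same scoring as in the earlier Borda examples (two points for a first-place ranking, one for second, zero for third), and counting in millions of ballots in the order the five groups were listed:&lt;/p&gt;

\[\begin{gathered} \text{Clinton} = 2 \cdot 30 + 0 \cdot 30 + 1 \cdot 30 + 1 \cdot 20 + 2 \cdot 15 = 140 \\ \text{Bush} = 1 \cdot 30 + 2 \cdot 30 + 0 \cdot 30 + 2 \cdot 20 + 1 \cdot 15 = 145 \\ \text{Perot} = 0 \cdot 30 + 1 \cdot 30 + 2 \cdot 30 + 0 \cdot 20 + 0 \cdot 15 = 90 \end{gathered}\]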

&lt;p&gt;There are more recent techniques, such as Power Ranking (the method famously underlying Google’s PageRank), which mix the numerical aspect of the Borda count with the graphical approach of the Condorcet winner to produce compelling results, and remain on the cutting edge of applied social choice with &lt;a href=&quot;https://blog.colony.io/introducing-budgetbox/&quot;&gt;numerous&lt;/a&gt; applications &lt;a href=&quot;https://sourcecred.io/&quot;&gt;under&lt;/a&gt; active &lt;a href=&quot;https://relevant.community/&quot;&gt;development&lt;/a&gt;. As promised earlier, Power Ranking is a technique capable of breaking Condorcet cycles, “spinning the wheel” to leverage more of the available information.&lt;/p&gt;

&lt;p&gt;There is no algorithm which can fully attain the “artist’s eye”, for the same reason that there is no way to fully express a feeling. However we can get closer, and insisting on representations which as closely as possible reflect their underlying world is the start. If all mental concepts are fundamentally relative, we should stop pretending that they are not, and include relativity explicitly (and thoughtfully) in our models – before it sneaks in unannounced. &lt;em&gt;All alternatives are relevant.&lt;/em&gt; The two words are nearly anagrams; clearly there is a cosmic joke being played here.&lt;/p&gt;

&lt;h3 id=&quot;signal-and-noise&quot;&gt;Signal and Noise&lt;/h3&gt;

&lt;p&gt;Continuing our theme, let’s turn to another example in &lt;em&gt;Gaming the Vote&lt;/em&gt;, the discussion of the infamous “Hot or Not” (chapter 14). Hot or Not, as Poundstone remembers, was (is?) a website in which people can upload photos of themselves, and have the good people of the internet submit judgments as to the photo’s attractiveness, which ultimately Poundstone holds up as an example of the efficacy of score voting. Quoting directly (Poundstone, 247):&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;[The creators, Hong and Young] considered having visitors pick their favorite of two on-screen photos. A photo would win points for each time it was preferred over another, random photo. This would loosely simulate a Borda count. (In a true Borda count, a candidate wins a point every time a voter ranks her above a rival. No Hot or Not voter could rank all the millions of pictures on the site, of course. The aggregate effect of random visitors ranking random pairs would be similar.) However, when shown two photos that happen to be of roughly equal attractiveness, “people will look at the pictures and not know,” Hong said. “They have a harder time deciding.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;blockquote&gt;
  &lt;p&gt;Hong and Young also considered a simple “hot” or “not” vote on a single picture. This would be an approval vote. There it was “average” Joes and Janes which slowed things down. People would have to ponder whether to click “hot” or “not”. Range voting was faster. It seemed to require &lt;em&gt;less&lt;/em&gt; thought. “Sometimes people can’t even express the number,” Hong explained, “they just have a feeling and like having that bar: ‘ah, it’s kinda like here.’” They position the cursor where it feels right and click.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is a valuable history which deserves closer scrutiny, through the analytical eye of information theory.&lt;/p&gt;

&lt;p&gt;First, some definitions: note that a single pairwise preference (“A vs B”) can be represented in one bit (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0&lt;/code&gt; for A, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1&lt;/code&gt; for B). So too can a binary “hot or not” (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0&lt;/code&gt; for “not”, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1&lt;/code&gt; for “hot”). A real-valued score, on the other hand, requires more bits – for argument’s sake, let’s say 3 bits for an 8-point scale (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;000&lt;/code&gt; gives a 0, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;111&lt;/code&gt; a 7, and the rest in-between). The axioms of information theory tell us that a digital “bit” can contain &lt;em&gt;up to one&lt;/em&gt; “bit” of information (the relationship between the digital bit and the information bit being governed by the mathematics of entropy – in the classic example, the outcome of a fair coin is exactly one bit of information, while the outcome of a loaded coin is always a little bit less).&lt;/p&gt;
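
&lt;p&gt;To put a number on the coin example: for a coin that lands heads with probability \(p\), the information content of one flip is given by the binary entropy function \(H(p)\). The value \(p = 0.9\) below is an arbitrary illustration of a loaded coin, not a figure from the book:&lt;/p&gt;

\[H(p) = -p \log_2 p - (1 - p) \log_2 (1 - p), \qquad H(0.5) = 1, \qquad H(0.9) \approx 0.47\]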

&lt;p&gt;Here, we see that score voting yields 3 bits of data, while a binary “hot or not” vote yields 1. Yet, bizarrely, it is easier for users to provide more data rather than less – suspicious. While we cannot prove which measure yields more information (as this requires access to the ineffable truth, which we lack), this juxtaposition should make us wonder how many bits of information we’re really getting in those 3 bits of real-valued scores – likely, it is less than the (up to) 1 bit we get from the binary “hot or not”.&lt;/p&gt;

&lt;p&gt;Historical experience supports my argument. Poundstone mentions at the beginning of the chapter that, among others, YouTube uses 5-point scores for their videos. However, as discussed (in some of my work) &lt;a href=&quot;https://colony.io/budgetbox.pdf&quot;&gt;here&lt;/a&gt;, YouTube (along with Netflix) has since dropped the 5-point system in favor of a binary like/dislike, on the grounds that the 5-point scale ultimately provided very little &lt;em&gt;signal&lt;/em&gt; above and beyond what was gleaned from a like/dislike, and thus introduced mostly noise.&lt;/p&gt;

&lt;p&gt;That it is “easier” for users to provide 3 bits tells us little about the quality of the measurement – it is the easiest thing of all to submit a random number, containing no information at all. It is not the amount of data, but the ratio of data to information, that we should care about. It is difficult decision processes that produce information-rich results.&lt;/p&gt;

&lt;!-- And what of the case of the pairwise vote? Like the binary &quot;hot or not&quot;, a pairwise vote is 1 bit, but note that now that bit is put up against not 3 bits, but 6, since we can generate the pairwise bit indirectly by comparing two real-valued 3-bit scores (if A&apos;s score is greater than B&apos;s, A is hotter). Yet the situation is not quite comparable, because the same score can be used multiple times to generate pairwise outcomes, and the number of pairs grows quadratically with the number of candidates -- for example, 10 people gives us 45 pairs. In the pairwise setting, you may need _more_ data, not less, to get results (45 bits of 1-bit pairwise preferences, vs 30 bits of 3-bit real-valued scores). Yet, from those 30 bits of real-valued scores, we can &quot;infer&quot; 45 bits of pairwise preference. What does this tell us about the information content of these various measures and techniques? Are the &quot;inferred&quot; 45 bits noisier than the &quot;authentic&quot; 45, if users are asked directly? Intuitively, it seems that if someone cannot tell you which of two individuals is hotter, then asking them to submit separate (&quot;easy&quot;) scores and then using those scores to infer a relative preference could hardly give an information-rich result. I will suggest (but again, cannot prove) that real-valued scores are noisier measures which provide a false sense of confidence when compared to their more data-frugal counterparts. --&gt;

&lt;h2 id=&quot;iv-conclusion&quot;&gt;IV. Conclusion&lt;/h2&gt;

&lt;p&gt;The foundation of science is the assumed validity of independent repetition – the idea that things which occur in the future are like things which occurred in the past, and that the things that we observe occur somewhat independently of one another. This allows us to, for instance, re-run experiments, test theories, and to develop re-usable mathematical models of the world. Unfortunately, this assumption is, at base, incorrect. Historians know that, while we conceptualize history as a series of interconnected-but-separate episodes, in which the past contains clues about the future, the deeper truth is that history is one, single event, in which every moment is intimately and inextricably wound up with every other, and no differentiation can occur. This reality is more felt in some cases than others. The natural sciences, for one, can often get away with strong assumptions of independence (protons behave largely the same in 21st century California as they did in 10th century China). In art history, this is less true – it is virtually impossible to understand the behavior of English painters in the 19th century without understanding the Italian sculptors of the 15th. The social sciences (and by extension, voting systems) sit, somewhat frustratingly, in the middle.&lt;/p&gt;

&lt;p&gt;In many cases, assumptions of independence are necessary to make problems tractable. Fortunately, this is not the case here. There is room in the field of electoral systems and social choice to incorporate notions of relativity alongside notions of psychic intensity, and to develop algorithms which leverage the information encoded in both. Doing so will allow our mechanisms to sit closer to the “reality” of our experience and thus yield more consistently legitimate outputs – a worthy aim in the quixotic quest for better tools of freedom.&lt;/p&gt;
</description>
        <pubDate>Sat, 04 Apr 2020 00:00:00 +0000</pubDate>
        <link>http://kronosapiens.github.io/blog/2020/04/04/gaming-the-vote.html</link>
        <guid isPermaLink="true">http://kronosapiens.github.io/blog/2020/04/04/gaming-the-vote.html</guid>
        
        <category>voting</category>
        
        <category>mathematics</category>
        
        <category>economics</category>
        
        <category>social-choice</category>
        
        
        <category>blog</category>
        
      </item>
    
      <item>
        <title>A Mild Critique of Quadratic Funding</title>
        <description>&lt;p&gt;This essay is meant as a mild and constructive engagement with one part of the constellation of ideas being advanced under the aegis of &lt;a href=&quot;https://radicalxchange.org/&quot;&gt;RadicalxChange&lt;/a&gt; (pronounced “radical exchange”), specifically the concept of quadratic funding, and it’s claim to “optimality”. Let’s review the argument and then assess the strength of that claim. This will involve a few equations but I’ll narrate the whole thing so it shouldn’t be too hard to follow (or just skip to the critique).&lt;/p&gt;

&lt;h2 id=&quot;a-review&quot;&gt;A Review&lt;/h2&gt;

&lt;p&gt;From the “Liberal Radicalism” &lt;a href=&quot;https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3243656&quot;&gt;paper&lt;/a&gt;, we have the following notion of &lt;strong&gt;social welfare:&lt;/strong&gt;&lt;/p&gt;

\[\sum_p \Big( \sum_i V_i^p(F^p) - F^p \Big)\]

&lt;p&gt;Here, \(i\) is a citizen in a society while \(p\) is a public good in that society. \(c_i^p\) (which shows up later) is the amount of money that citizen \(i\) gives to good \(p\), while \(F^p\) is the &lt;em&gt;total amount&lt;/em&gt; of funding that good \(p\) receives. \(V_i^p(F^p)\) is the “currency-equivalent utility” that citizen \(i\) receives if good \(p\) is funded at level \(F^p\). Pay extra special attention to the term &lt;strong&gt;currency-equivalent utility&lt;/strong&gt; because it is the hinge of the critique. With these definitions, the equation is straightforward: &lt;em&gt;social welfare is the sum of all individual utilities across all public goods, less the total cost of those goods.&lt;/em&gt; Pretty reasonable.&lt;/p&gt;

&lt;p&gt;Now, the authors (Buterin, Hitzig, and Weyl) use this equation to show why two existing systems, namely capitalism and one-person-one-vote democracy, lead to suboptimal allocations, while their quadratic methods lead to optimal allocations. An important concept in their argument is the &lt;em&gt;first derivative of the individual utility function&lt;/em&gt;, \(V_i^{p\prime}\). This tells us how much value citizen \(i\) gets from the &lt;em&gt;next dollar&lt;/em&gt; which funds the good \(p\), i.e. the slope of the curve.&lt;/p&gt;

&lt;p&gt;For an optimal allocation, we would expect that the first derivative of the &lt;em&gt;total utility&lt;/em&gt; for a given good (summed across all citizens) would be equal to 1, meaning that &lt;em&gt;society as a whole&lt;/em&gt; has reached the point where giving more funding to the good would create less value than the funding itself, i.e. \(V^{p\prime}(F^p) = 1\). At that point, funding should be placed elsewhere.&lt;/p&gt;
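
&lt;p&gt;This condition can be read as the first-order condition of the welfare expression above, taken one good at a time. A sketch, assuming each \(V_i^p\) is differentiable and concave, and writing \(V^p = \sum_i V_i^p\) for the total utility from good \(p\):&lt;/p&gt;

\[\frac{d}{dF^p}\Big(\sum_i V_i^p(F^p) - F^p\Big) = \sum_i V_i^{p\prime}(F^p) - 1 = 0 \quad \Longrightarrow \quad V^{p\prime}(F^p) = 1\]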

&lt;p&gt;Now, under &lt;strong&gt;capitalism&lt;/strong&gt; (the system where all contributions to public goods are made by citizens &lt;em&gt;in isolation&lt;/em&gt;), otherwise known as \(F^p = \sum_i c_i^p\), citizen \(i\) will contribute to a good up until the point where their &lt;em&gt;individual increase in utility is worth what they contribute&lt;/em&gt;, i.e. where \(V_i^{p\prime}(F^p) = 1\). The problem here is that there is a lot of utility that ends up being “left on the table” – even if an extra $1 of funding can create $.5 of utility for three people (i.e. $1.5 of utility for society), no one will provide that funding since from the perspective of the individual, they are giving $1 and getting only $.5 back in value. Formally, this looks like \(V^{p\prime}(F^p) &amp;gt; 1\), i.e. putting in more money will create more utility &lt;em&gt;for society&lt;/em&gt;, but no one does it. Sad.&lt;/p&gt;
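
&lt;p&gt;To make the arithmetic of that example explicit (a sketch using the same illustrative numbers, with three citizens who each value the next dollar of funding at $.5):&lt;/p&gt;

\[V^{p\prime}(F^p) = \sum_{i=1}^{3} V_i^{p\prime}(F^p) = 3 \times 0.5 = 1.5 &amp;gt; 1, \qquad \text{while each } V_i^{p\prime}(F^p) = 0.5 &amp;lt; 1\]

&lt;p&gt;Society as a whole would gain $1.5 from the marginal dollar, but each individual would gain only $.5 on a $1 contribution, so the dollar is never given.&lt;/p&gt;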

&lt;p&gt;Under &lt;strong&gt;one-person-one-vote (1p1v)&lt;/strong&gt; (the system where citizens vote on allocations), otherwise known as \(F^p = N \cdot \text{Median}_i V_i^{p\prime}(F^p)\), the problem is different. Here, the issue is that since the utility is determined by a majority vote (i.e. by the “median voter”), the allocation will be suboptimal to the degree to which the median voter differs from the &lt;em&gt;average&lt;/em&gt; or &lt;em&gt;mean&lt;/em&gt; voter. Note the appearance of the term &lt;em&gt;mean&lt;/em&gt; here, because it sets the stage for (drumroll please) the quadratic methods.&lt;/p&gt;

&lt;p&gt;Recall that the median is a measure of centrality which &lt;em&gt;ignores&lt;/em&gt; degree of intensity, while the mean is exactly the measure of centrality which incorporates it, i.e. the mean minimizes the &lt;em&gt;squared error&lt;/em&gt; of itself to all the data points (while the median minimizes the &lt;em&gt;absolute error&lt;/em&gt;).&lt;/p&gt;
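
&lt;p&gt;Stated as optimization problems (a standard result, written here with hypothetical data points \(x_1, \dots, x_n\) which do not appear in the paper):&lt;/p&gt;

\[\text{mean} = \arg\min_m \sum_{i=1}^n (x_i - m)^2, \qquad \text{median} \in \arg\min_m \sum_{i=1}^n |x_i - m|\]

&lt;p&gt;For the values 1, 2, and 9, the median is 2 and the mean is 4. If the 9 becomes a 90, the median stays at 2 while the mean jumps to 31: only the mean registers how far out the extreme value sits.&lt;/p&gt;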

&lt;p&gt;Enter &lt;strong&gt;quadratic funding&lt;/strong&gt; (the system in which the total funding is the &lt;em&gt;square of the sum of the square roots&lt;/em&gt; of the individual contributions), otherwise known as \(F^p = (\sum_i \sqrt{c_i^p})^2\). Unlike capitalism, in which individuals contribute up until &lt;em&gt;their utility&lt;/em&gt; matches their contribution, quadratic funding allows people to contribute until the &lt;em&gt;total utility&lt;/em&gt; matches their contribution. We’ll look at the derivation because it’ll be instructive. Starting with the individual’s utility function, \(V_i^p(F^p) - c_i^p\), we maximize by taking the derivative with respect to \(c_i^p\) and setting it to zero (involving several applications of the chain rule), which gives us:&lt;/p&gt;

\[V_i^{p\prime}(F^p) = \frac{\sqrt{c_i^p}}{\sum_j \sqrt{c_j^p}} \leq 1\]

&lt;p&gt;This is an odd looking fraction, but note that it is less than (or equal to) 1, &lt;em&gt;and equals one when you sum across all citizens&lt;/em&gt;. That is the voilà moment for quadratic funding:&lt;/p&gt;

\[V^{p\prime}(F^p) = \sum_i(\frac{\sqrt{c_i^p}}{\sum_j \sqrt{c_j^p}}) = 1\]
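
&lt;p&gt;The chain-rule step behind this first-order condition can be sketched as follows, differentiating the individual’s objective \(V_i^p(F^p) - c_i^p\) with respect to their own contribution \(c_i^p\) and using \(F^p = (\sum_j \sqrt{c_j^p})^2\):&lt;/p&gt;

\[\frac{\partial F^p}{\partial c_i^p} = 2 \Big(\sum_j \sqrt{c_j^p}\Big) \cdot \frac{1}{2\sqrt{c_i^p}} = \frac{\sum_j \sqrt{c_j^p}}{\sqrt{c_i^p}}\]

\[V_i^{p\prime}(F^p) \cdot \frac{\partial F^p}{\partial c_i^p} - 1 = 0 \quad \Longrightarrow \quad V_i^{p\prime}(F^p) = \frac{\sqrt{c_i^p}}{\sum_j \sqrt{c_j^p}}\]

&lt;p&gt;As a numerical check with hypothetical contributions of $1, $4, and $9: the roots are 1, 2, and 3, so \(F^p = (1 + 2 + 3)^2 = 36\) against $14 contributed (the mechanism covers the $22 difference with matching funds), and the three marginal utilities are \(1/6\), \(2/6\), and \(3/6\), which sum to 1.&lt;/p&gt;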

&lt;p&gt;While capitalism provides funding up until &lt;em&gt;individual utility&lt;/em&gt; matches the increased funding, quadratic funding provides funding up until the &lt;em&gt;collective utility&lt;/em&gt; matches the increased funding, which is &lt;em&gt;optimal&lt;/em&gt;. This is a great result and a source of legitimate excitement.&lt;/p&gt;

&lt;h2 id=&quot;the-critique&quot;&gt;The Critique&lt;/h2&gt;

&lt;p&gt;But (and finally, we reach the critique), let us recall the key assumption of the model: individual (subjective) utility, \(V_i^p\), is assumed to be both &lt;em&gt;known&lt;/em&gt; and &lt;em&gt;dollar-valued&lt;/em&gt;, being inferred per-citizen from the amounts contributed. This is a problematic assumption, as it equates something which is fundamentally &lt;em&gt;subjective&lt;/em&gt; (a private feeling) with something which is fundamentally &lt;em&gt;objective&lt;/em&gt; (a real number). The collective (subjective) utility is inferred from &lt;em&gt;summing up&lt;/em&gt; these numbers, equating them with feelings. This seems… peculiar.&lt;/p&gt;

&lt;p&gt;Economists have &lt;a href=&quot;https://en.wikipedia.org/wiki/Social_choice_theory#Interpersonal_utility_comparison&quot;&gt;long wrestled&lt;/a&gt; with this problem. In his &lt;em&gt;Social Choice and Individual Values&lt;/em&gt;, economics icon Ken Arrow famously argued that, since choices are defined by relative preferences (“apple vs orange”), “there is no quantitative meaning of utility for an individual”, and thus “interpersonal comparison [and thus summation] of utilities [have] no meaning”, since things which are not numbers can be difficult to compare (i.e. it is easy to see that 6 is less than 7, but not easy to see that red is less than blue).&lt;/p&gt;

&lt;p&gt;One might turn around and say that quadratic funding sidesteps the issue by asking citizens to make &lt;em&gt;absolute&lt;/em&gt; decisions (“give $25 to the parks department”), rather than &lt;em&gt;relative&lt;/em&gt; decisions (“plant apple trees, not orange trees, in the park”). In this case, citizens are &lt;em&gt;telling&lt;/em&gt; us their currency-valued utility (an idea known as “revealed preference”) – $25, problem solved. But all that &lt;em&gt;really&lt;/em&gt; tells us is that the citizen prefers giving the parks department $25 to keeping it for themselves – and tells us nothing about the fuzzy questions of psychic intensities. Further, if we assume that everyone has the same capacity for inner experience (a question with deep ties to identity, our other bugbear), but not everyone has the same amount of money to give, then we paint ourselves into another corner: do those with more wealth, who give more, experience greater utility than those who give less? If I make $100 a day while you make $10, is my experience of satisfaction ten times yours? &lt;a href=&quot;https://www.nature.com/articles/s41562-017-0277-0&quot;&gt;Probably not&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You might retort that this is excessive pedantry. Our lived experience is full of assessments of the subjective experiences of others, and – although they are based on evolved heuristics, not mathematical proofs – they seem to work well. In his &lt;em&gt;&lt;a href=&quot;https://www.amazon.com/Gaming-Vote-Elections-Arent-About/dp/0809048922&quot;&gt;Gaming the Vote&lt;/a&gt;&lt;/em&gt; (chapter 15), William Poundstone considers this debate and makes the point that “these intellectual positions… entailed a pose of fashionable agnosticism over matters previously held to be common sense.” Many economists agree, with Amartya Sen giving the famous example of Nero’s burning of Rome: it is almost universally seen as self-evident that the negative utility of all the Romans who suffered in that blaze outweighs the positive utility that Nero experienced in the burning, and so the burning was “bad”. Clearly, utilitarian arguments have a place. To conclude that “we cannot model or compare subjective experience” seems like the easy way out, and evokes the behaviorist posture which constrained psychologists up until the “cognitive revolution” of the mid-20th century. Even if it’s not perfect, putting numbers to feelings seems “good enough” and gives us something to work with – so what’s the problem?&lt;/p&gt;

&lt;p&gt;The problem is ultimately one of signal and noise, of &lt;a href=&quot;/blog/2016/04/16/the-problem-of-information.html&quot;&gt;signifier and signified&lt;/a&gt;, and of the &lt;a href=&quot;https://www.lesswrong.com/posts/uL74oQv5PsnotGzt7&quot;&gt;risks of optimizing for proxies&lt;/a&gt;. Briefly, since we are unable to accurately represent (and thus measure) the thing we really care about (subjective utility), we instead measure a &lt;em&gt;proxy&lt;/em&gt; (funding amounts). Unfortunately, there is a &lt;strong&gt;fundamentally unknowable&lt;/strong&gt; gap between these two measurements, and so &lt;em&gt;we cannot know&lt;/em&gt; how good our mechanisms really are with regard to our true goal of maximizing welfare – not only is there some error, but we cannot know what that error is.&lt;/p&gt;

&lt;p&gt;In casual settings this is a non-issue, since this “proxy gap” will be too small to be consequential. However, the more &lt;em&gt;pressure&lt;/em&gt; that is placed on the system (i.e. the more resources are at stake, the more people whose interests are affected), the greater the incentive to exploit the system (a phenomenon known as &lt;a href=&quot;https://en.wikipedia.org/wiki/Goodhart%27s_law&quot;&gt;Goodhart’s Law&lt;/a&gt;), and a key vector of exploitation is the gap between the desired measurement and the true measurement (for a well-known example, consider the test-prep industry which coaches students taking high-stakes standardized tests). The more resources which are deployed using quadratic funding, the more pressure is placed on the system, and so the more the gap between “true utility” and “funding amounts” (the proxy) will be exploited – leading to unexpected failures because the gap &lt;em&gt;cannot be modeled by the system&lt;/em&gt;. Unlike other kinds of error, which can be modeled and thus handled by the system, this kind of error necessarily lies &lt;em&gt;outside the system&lt;/em&gt; and is thus quite pernicious, as the consequences invariably come suddenly and by surprise.&lt;/p&gt;

&lt;p&gt;All of this is not to say that quadratic funding is a bad idea – quite the opposite, in fact, as &lt;em&gt;in general&lt;/em&gt; it will probably work well (see &lt;a href=&quot;https://vitalik.ca/general/2019/10/24/gitcoin.html&quot;&gt;this experiment&lt;/a&gt;) and represents an important step forward. Further, these basic measurement problems do not affect quadratic funding alone – any mechanism which must represent and measure a subjective quality falls into this trap – which includes basically all voting, rating, and reputation systems.&lt;/p&gt;

&lt;p&gt;The point is more that one of the banner claims – optimality – is overstated. Ultimately, quadratic funding is “optimal” in the same way that blue is the “optimal” color for the Blue Man Group – it follows from the definition, rather than from some essential truth. Quadratic funding does not &lt;em&gt;really&lt;/em&gt; maximize utility – it maximizes some other amorphous “utility-like” thing. Which, again, is fine… until it’s not.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Thanks to Auryn Macmillan for feedback and for making sure I’m not an idiot.&lt;/em&gt;&lt;/p&gt;
</description>
        <pubDate>Fri, 13 Dec 2019 00:00:00 +0000</pubDate>
        <link>http://kronosapiens.github.io/blog/2019/12/13/mild-critique-qf.html</link>
        <guid isPermaLink="true">http://kronosapiens.github.io/blog/2019/12/13/mild-critique-qf.html</guid>
        
        <category>voting</category>
        
        <category>mathematics</category>
        
        <category>economics</category>
        
        <category>social-choice</category>
        
        
        <category>blog</category>
        
      </item>
    
  </channel>
</rss>
