Participatory budgeting experiments are experiments conducted in the laboratory and in computerized simulations to examine various ethical and practical aspects of participatory budgeting (PB). These experiments aim to decide on two main issues:
Goel, Krishnaswamy, Sakshuwong and Aitamurto report the results of several experiments done on real PB systems in Boston (2015–2016), Cambridge (2014–2015), Vallejo (2015) and New York City (2015). They compare knapsack voting to k-approval voting. Their main findings are: [1]
Later experiments led to different conclusions:
Benade, Itzhak, Shah, Procaccia and Gal compared input formats on two dimensions: efficiency (the social welfare of the resulting outcomes) and usability (the cognitive burden on the voters). They conducted an empirical study with over 1200 voters, framed as a story about resource allocation on a desert island. They concluded that k-approval voting imposes a low cognitive burden and is efficient, although voters do not perceive it as such. [2]
Benade, Nath, Procaccia and Shah experimented with four input formats: knapsack voting, ranking by value, ranking by value-for-money, and threshold-approval. Their goal was to maximize social welfare by using observed votes as proxies for voters’ unknown underlying utilities. They found that threshold-approval voting performs best on real PB data. [3]
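The relation between a voter's underlying utilities and the ballot observed in each of these four formats can be illustrated with a minimal sketch. It assumes a hypothetical voter who reports honestly and whose utility for a set of projects is the sum of the utilities of its members; the project names, utilities, costs and function names are invented for illustration and are not taken from the cited study.

```python
from itertools import combinations

def knapsack_vote(utility, cost, budget):
    """The affordable set of projects with the highest total utility
    (brute force, so only suitable for a handful of projects)."""
    best, best_value = (), 0
    names = sorted(utility)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            if sum(cost[p] for p in subset) <= budget:
                value = sum(utility[p] for p in subset)
                if value > best_value:
                    best, best_value = subset, value
    return set(best)

def rank_by_value(utility):
    """Projects ordered from the most to the least valued."""
    return sorted(utility, key=lambda p: (-utility[p], p))

def rank_by_value_for_money(utility, cost):
    """Projects ordered by utility per unit of cost."""
    return sorted(utility, key=lambda p: (-utility[p] / cost[p], p))

def threshold_approval(utility, threshold):
    """All projects whose utility reaches the given threshold."""
    return {p for p in utility if utility[p] >= threshold}

# Hypothetical voter and projects (assumptions for illustration only).
utility = {"park": 8, "library": 9, "bike lane": 4}
cost = {"park": 50, "library": 70, "bike lane": 30}

print(knapsack_vote(utility, cost, budget=100))  # library and bike lane
print(rank_by_value(utility))                    # library, park, bike lane
print(rank_by_value_for_money(utility, cost))    # park, bike lane, library
print(threshold_approval(utility, threshold=8))  # park and library
```

In this toy example the same utilities produce four different ballots, which shows why the choice of input format affects how much of the voter's utility an aggregation rule can recover.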
Fairstein, Benade and Gal report the results of an experiment with Amazon Mechanical Turk workers on a PB process in an imaginary town. In their experiment, 1800 participants voted in four PB elections in a controlled setting, in order to evaluate the practical effects of the choice of voting format and aggregation rule. They compared k-approval with k=5 ([4], Figure 8(a)), threshold-approval, knapsack voting, rank by value, rank by value/cost, and cardinal ballots. Their main findings [5] are that the k-approval voting format leads to the best user experience: users spent the least time learning the format and casting their votes, and found the format the easiest to use. They felt that this format allowed them to express their preferences best, probably due to its simplicity. [4]
Yang, Hausladen, Peters, Pournaras, Fricker and Helbing constructed an experiment modeled on the PB process in Zurich. Their 180 subjects were students from Zurich universities. Each subject had to evaluate projects in six input formats: unrestricted approval, 5-approval, 5-approval with ranking, cumulative with 5 points, cumulative with 10 points, and cumulative with 10 points over 5 projects. The subjects were then asked which input format was the easiest, the most expressive, and the most suitable. Unrestricted approval was perceived as the easiest, but also as the least expressive and least suitable; in contrast, 5-approval with ranking and cumulative with 10 points over 5 projects were found significantly more expressive and more suitable. Suitability was affected mainly by expressiveness; the effect of easiness was negligible. They also found that the project ranking in unrestricted approval was significantly different from that in the other five input formats. Approval voting encouraged voters to disperse their votes beyond their immediate self-interest. This may be considered altruism, but it may also mean that this format does not represent their preferences well enough. [6]
Fairstein, Benade and Gal compared the robustness of various methods to the participation rate, that is: if a random subset of the voters stays at home, how does this affect the final outcome? In particular, they compared the simple greedy algorithm (which assumes cost-based satisfaction) with the method of equal shares (MES, assuming cardinality-based satisfaction). They found that greedy outcomes are highly sensitive to the input format used and to the fraction of the population that participates. In contrast, MES outcomes are not sensitive to the voting format used, and they are stable even when only 25–50% of the population participates in the election. [4]
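The idea of such a participation-rate test can be sketched as follows: compute the outcome from all ballots, recompute it from random subsets of the ballots, and measure how much the two outcomes overlap. The sketch below makes several assumptions that are not taken from the cited study: ballots are approval sets, the greedy rule orders projects by approval count, and stability is measured by the Jaccard similarity of the funded sets. All names are illustrative.

```python
import random

def greedy_by_votes(projects, ballots, budget):
    """Fund projects in decreasing order of approval count while they fit.

    projects: dict name -> cost; ballots: list of sets of approved names.
    """
    votes = {p: 0 for p in projects}
    for ballot in ballots:
        for p in ballot:
            votes[p] += 1
    chosen, remaining = set(), budget
    # Ties are broken by project name so that the result is deterministic.
    for p in sorted(projects, key=lambda p: (-votes[p], p)):
        if projects[p] <= remaining:
            chosen.add(p)
            remaining -= projects[p]
    return chosen

def jaccard(a, b):
    """Overlap of two project sets; 1.0 means identical outcomes."""
    return 1.0 if not a and not b else len(a & b) / len(a | b)

def stability(projects, ballots, budget, rate, trials=200, seed=0):
    """Mean similarity between the full-turnout outcome and the outcomes
    obtained when only a random fraction `rate` of the voters participates."""
    rng = random.Random(seed)
    full = greedy_by_votes(projects, ballots, budget)
    k = max(1, round(rate * len(ballots)))
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(ballots, k)
        total += jaccard(full, greedy_by_votes(projects, sample, budget))
    return total / trials

# Hypothetical election (an assumption for illustration only).
projects = {"park": 50, "library": 70, "bike lane": 30}
ballots = [{"park", "bike lane"}, {"library"}, {"park"},
           {"library", "bike lane"}, {"park", "library"}]
print(stability(projects, ballots, budget=100, rate=0.4))
```

Replacing `greedy_by_votes` with another aggregation rule and plotting `stability` against `rate` would give a comparison in the spirit of the one described above.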
Yang, Hausladen, Peters, Pournaras, Fricker and Helbing conducted a similar experiment comparing four rules: simple greedy (which assumes cost-satisfaction), value/cost greedy (which assumes cardinality-satisfaction), MES with cardinality-satisfaction, and MES with cost-based satisfaction. They found that the differences in stability are not significant when comparing rules that use the same satisfaction function. [6]
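The link between the two greedy rules and the two satisfaction functions can be made concrete with a small sketch. It assumes approval-style vote counts and a single greedy procedure that orders projects by total satisfaction per unit of cost: under cost-satisfaction this ratio reduces to the vote count (simple greedy), and under cardinality-satisfaction it reduces to votes divided by cost (value/cost greedy). The function names and data are hypothetical and not taken from the cited study.

```python
def greedy(projects, votes, budget, satisfaction):
    """Fund projects in decreasing order of total satisfaction per unit cost.

    projects: dict name -> cost; votes: dict name -> number of supporters;
    satisfaction: function mapping a project's cost to the satisfaction
    that one supporter derives from it.
    """
    def ratio(p):
        return votes[p] * satisfaction(projects[p]) / projects[p]

    chosen, remaining = [], budget
    for p in sorted(projects, key=lambda p: (-ratio(p), p)):
        if projects[p] <= remaining:
            chosen.append(p)
            remaining -= projects[p]
    return chosen

def cost_satisfaction(cost):
    """A supporter's satisfaction equals the money spent on the project."""
    return cost

def cardinality_satisfaction(cost):
    """A supporter gains one unit per funded project, whatever its cost."""
    return 1

# Hypothetical data (assumptions for illustration only).
projects = {"bridge": 80, "benches": 10, "garden": 20}
votes = {"bridge": 6, "benches": 4, "garden": 5}

print(greedy(projects, votes, 100, cost_satisfaction))         # bridge, garden
print(greedy(projects, votes, 100, cardinality_satisfaction))  # benches, garden
```

In this toy example cost-satisfaction favours the expensive, popular project, while cardinality-satisfaction favours cheap projects with many votes per unit of cost.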
To compute the outcomes, they supplemented the subjects' votes with random votes generated using a realistic probability distribution. They then compared three types of explanations: mechanism explanation (a general explanation of how the aggregation rule works given the voting input), individual explanation (explaining how many voters had at least one approved project, or at least 10000 CHF in approved projects), and group explanation (explaining how the budget is distributed among the districts and topics). They compared the perceived trustworthiness and fairness of greedy and equal shares, before and after the explanations. They found that: [6]
Rosenfeld and Talmon conducted two experiments: [7]
Similar results were found when more advanced students (M.Sc. students) were asked to construct the budget allocation themselves, rather than choose from 5 options. [7]
Peters and Skowron conducted a simulation experiment: they took the votes from the PB in Warsaw, which were aggregated using the greedy algorithm, and compared the outcome to aggregation using equal shares. Their conclusions are: [10]
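For readers unfamiliar with the rule being compared, the following is a minimal sketch of the method of equal shares for approval ballots with cardinality-based satisfaction. It assumes integer costs and budget, positive costs, and alphabetical tie-breaking, and it omits any completion step for spending leftover money; it is an illustration on invented data, not the code used in the cited simulation.

```python
from fractions import Fraction

def equal_shares(projects, ballots, budget):
    """Method of equal shares, approval ballots, cardinality satisfaction.

    projects: dict name -> cost; ballots: list of sets of approved names.
    Returns the funded projects in the order in which they were chosen.
    """
    n = len(ballots)
    money = [Fraction(budget, n)] * n      # every voter gets an equal share
    funded = []
    candidates = set(projects)
    while True:
        best = None                        # (rho, project, supporters)
        for p in sorted(candidates):
            supporters = [i for i in range(n) if p in ballots[i]]
            if not supporters or sum(money[i] for i in supporters) < projects[p]:
                continue                   # the supporters cannot afford p
            # Find the smallest cap rho with sum(min(money_i, rho)) == cost.
            owed = Fraction(projects[p])
            by_wealth = sorted(supporters, key=lambda i: money[i])
            for rank, i in enumerate(by_wealth):
                rho = owed / (len(by_wealth) - rank)
                if money[i] >= rho:
                    break
                owed -= money[i]           # poorer voters pay all they have
            if best is None or rho < best[0]:
                best = (rho, p, supporters)
        if best is None:
            return funded
        rho, p, supporters = best
        for i in supporters:
            money[i] -= min(money[i], rho)
        funded.append(p)
        candidates.remove(p)

# Hypothetical election: a majority of 6 voters and a minority of 4.
projects = {"a1": 10, "a2": 10, "a3": 10, "b1": 10}
ballots = [{"a1", "a2", "a3"}] * 6 + [{"b1"}] * 4
print(equal_shares(projects, ballots, budget=30))  # ['a1', 'b1']
```

On this toy input, a greedy rule that orders projects by vote count would spend the whole budget on a1, a2 and a3, leaving the minority with nothing, whereas equal shares funds one project for each group. It also leaves part of the budget unspent here, which is why the rule is usually combined with a completion step in practice.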