Research Grants

Academic Paper Grant

Collective Action Problems and AI Risk


Research summary:

If AI causes a global catastrophe, it will be through human choices. Under a global government (a 'singleton'), an AI design seen to pose a large risk of catastrophe could be prohibited in favor of research into safer designs. In this scenario, AI-driven catastrophe would be likely only if the singleton greatly underestimated the risks. However, if powerful AI is developed in a competitive context, risk could be greatly increased. If estimates of the danger of particular AI designs vary among potential developers, the winner's curse may lead to the first powerful AIs being built by those who underestimate the dangers and heavily trade off safety measures for speed. Furthermore, if 'winner-take-all' effects give large private rewards to the organization with initial access to powerful AI, while the existential risks posed by unsafe AI fall on all parties, this externality could create a global tragedy of the commons in which all parties sacrifice safety measures for speed. This paper will develop a simple mathematical model of these effects and consider their implications for estimating and reducing AI risks.
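
The winner's curse effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the model the paper will develop: the true risk of 0.3, the Gaussian noise in each developer's estimate, and the rule that the most optimistic developer builds first are all hypothetical choices made for illustration.

```python
import random

def mean_winning_estimate(n_developers, true_risk=0.3, noise=0.1,
                          trials=5000, seed=0):
    """Average risk estimate held by the most optimistic of n developers.

    All parameters are illustrative assumptions, not values from the proposal.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Each developer sees the true risk plus independent, unbiased noise,
        # clipped to the valid probability range [0, 1].
        estimates = [
            min(1.0, max(0.0, true_risk + rng.gauss(0.0, noise)))
            for _ in range(n_developers)
        ]
        # Assumption: the developer with the lowest estimate moves first.
        total += min(estimates)
    return total / trials

# Compare the first mover's believed risk as the field of developers grows.
for n in (1, 2, 5, 20):
    print(n, round(mean_winning_estimate(n), 3))
```

Although each individual estimate is unbiased in this sketch, the estimate held by whoever moves first falls further below the true risk as the number of developers grows.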

Planned contents include:

  • A brief review of the literature on existential risks of AI to establish the relevance of the analysis.
  • An exploration of potential commercial and military winner-take-all effects of advanced AI.
  • A simple mathematical model of AI development in which mutually disinterested AI developers each select a level of investment in safety precautions. Higher investment reduces both the speed of AI development and the risk of catastrophe conditional on creating the first powerful AI, and developers hold varying estimates of how safety investment maps to the danger of any AI produced. This will be followed by discussion of:
    • Implications of the model for the comparative risk of different scenarios, e.g., with varying numbers of developers and coordination mechanisms: diverse unregulated small developers, a handful of major players such as national governments, or a singleton
    • The robustness of these relative risk comparisons across plausible alternative models and specifications
    • The relative uncertainty of various parameters, and methods to improve estimates thereof
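
As a rough illustration of the kind of model outlined above, the following toy sketch has n identical developers each choose a safety level s between 0 and 1. It is not the paper's model: the linear speed and risk functions, the constants SLOWDOWN and HARM, the rule that the chance of finishing first is proportional to speed, and the grid search for a symmetric equilibrium are all hypothetical assumptions. It also omits the varying risk estimates mentioned in the list.

```python
# Toy race model. Every functional form and constant is an illustrative
# assumption, not something specified in the proposal.
GRID = [i / 200 for i in range(201)]  # candidate safety levels in [0, 1]
SLOWDOWN = 0.9  # assumed: full safety investment cuts development speed by 90%
HARM = 1.0      # assumed: catastrophe cost to each party, in units of the prize

def speed(s):
    """Development speed, falling linearly with safety investment."""
    return 1.0 - SLOWDOWN * s

def risk(s):
    """Probability of catastrophe if a developer with safety s finishes first."""
    return 1.0 - s

def expected_payoff(s, s_others, n):
    """Payoff to one developer choosing s while n - 1 rivals choose s_others."""
    v_own = speed(s)
    v_rivals = (n - 1) * speed(s_others)
    total = v_own + v_rivals
    # Assumed: the chance of finishing first is proportional to speed.
    p_win = v_own / total
    # Catastrophe can come from whoever finishes first and harms everyone.
    p_catastrophe = (v_own * risk(s) + v_rivals * risk(s_others)) / total
    # Prize of 1 for winning safely, minus the shared expected harm.
    return p_win * (1.0 - risk(s)) - HARM * p_catastrophe

def best_response(s_others, n):
    """Grid point that maximizes a developer's payoff against s_others."""
    return max(GRID, key=lambda s: expected_payoff(s, s_others, n))

def symmetric_equilibrium(n):
    """Grid point closest to being its own best response."""
    return min(GRID, key=lambda s: abs(best_response(s, n) - s))

for n in (1, 2, 5, 20):
    print(n, symmetric_equilibrium(n))
```

Under these assumptions a single developer chooses full safety, while the equilibrium safety level falls as the number of competing developers rises. Whether that comparison survives other functional forms is exactly the robustness question raised in the list above.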

Prior related work:

"Artificial Intelligence as a Positive and Negative Factor in Global Risk," by SIAI Research Fellow Eliezer Yudkowsky.
A presentation and extended abstract for the European Computing and Philosophy conference in 2009, "Arms Control and Intelligence Explosions."
A voluminous game theory literature on nuclear arms races and brinkmanship is directly relevant, as is work on unstable arms races in the context of molecular nanotechnology.

Target dates for:

Extended abstract (posting an extended abstract on the SIAI website, and circulating it to a few related academics for comment): 3 weeks after start date.[1]

Working paper (posting a working paper on the SIAI website; circulating it to related academics): 8 weeks after start date.

Journal submission: 14 weeks after start date.

Follow-up steps (brainstorming, and drafting proposals for, any follow-up publications. Should it be developed into a journal paper?): 15 weeks after start date.

[1] The "start date" is the date (guaranteed to be within six months of the receipt of grant money) when we have skilled people to allocate to the project. Extra donations increase our base of skilled people and thereby increase the number of projects we can take on; the lagged start date allows us to find new people, bring them here, and train them.


Total budget: $7,200 (3 person-months × $2,400 per researcher-month)

How research costs are estimated:
  • Person-months for research and writing: 3 (This is our standard estimate of the time required for journal articles[1]).
  • Dollars required to support one skilled full time researcher-month[2]: $2,400

[1] Our base estimate is 1.25 person-months per conference paper, and 3 per journal article, for an experienced full-time researcher. This estimate takes the planning fallacy, and the importance of an outside view in avoiding that fallacy, into account. While typical rates of article production by professors are extremely low, the distribution is strongly skewed towards research-oriented universities and departments, and informal surveys of researchers working on existential risks give data consistent with this estimate for full-time work required per paper. Visiting Fellows vary in their experience levels, so that mean productivity is expected to be lower, but a team mix can be selected to account for this.

[2] This billing rate reflects an estimate of financial outlays for SIAI to create the equivalent of one full-time skilled researcher-month, including stipend or hosting expenses, workspace, and administrative or management time, and other supporting expenses. Actual person-months may be greater or lower depending on the labor mix for a particular project, with shortfalls made up from general funds. This rate is not reflective of the money researchers could earn in the competitive labor market. Think of this as a matched donation. You donate the living expenses; our researchers donate the surplus value of their labor.
 

How this paper will help reduce existential risk:

Research benefits (What ideas will the paper explore?  How will that knowledge help reduce existential risk?)
  • The two most plausible causes of an AI-driven catastrophe are biased underestimation of the riskiness of AI designs and collective action problems. The latter has been less explored, suggesting that more 'low-hanging fruit' remains to be found there.
  • A clearer model of the collective action problems surrounding AI risks will help to estimate the value of institutional mechanisms for coordination, and to allocate resources between research into such mechanisms and other risk-reducing strategies.

Influence benefits (What target audience will the paper impact, how?  How will that impact help with existential risk?)
  • The game theory of nuclear arms races and brinkmanship, wherein leaders adopted policies that they knew imposed a probability of global thermonuclear war through miscalculation ("trembling hands"), is rigorous and well-developed; this theory can make the assumptions of an analysis of AI risks explicit and easy to debate, and help to elicit more precise and focused discussion of those assumptions from academics.
  • Analysis of the competitive commercial and military pressures that could result in compromises of safety procedures suggests that developing safe AI will be more difficult than it otherwise would be; the analysis could thus help persuade computer scientists and other non-social scientists that AI risks are more significant and worthy of consideration than they had suspected.

Human capital benefits, or network benefits (Will writing this paper help new Visiting Fellows become familiar with key research domains?  Will it help create relationships with outside co-authors?  Will it give folks interested in existential risk entry into new communities where valuable contacts may be found?)

  • The connection of this paper with existing work on nuclear strategy and dangers may assist in reaching out to game theorists, political scientists, and policy analysts concerned with existential risks.
  • Visiting Fellows working on this project can gain practice in creating models of this type, skills that could be applied to extensions and variations to more deeply explore this area.


