Research summary:
Computer scientist William Hibbard has proposed that powerful AI systems should be developed through an open-source process. Ideally, in an open-source development process many diverse parties could help identify technical flaws in an AI design, while transparency could raise trust in the motivations of the resulting AI systems. Both of these benefits could be of enormous importance. Accidental flaws in advanced AIs could have catastrophic effects, particularly after a design becomes widely used (with perhaps billions or trillions of copies) or is able to further improve itself. Powerful AIs whose motivations could be verified only by their designers could be seen as an existential threat by other parties, provoking either an arms race of AI development, in which all parties cut corners on safety in favor of speed, or outright pre-emptive violence. On the other hand, if diverse parties could verify the trustworthiness of open-source AIs, those systems could be relied on to enforce agreements regulating further AI development.
However, transparent AI development also poses an obvious risk: other projects, including those conducted by national militaries, could simply take incremental work by an open-source project and combine it with their own secret research, attaining powerful AI systems before other contenders. Such a project could then apply the technology for its own ends, perhaps gaining a sufficient advantage to prevent any other organization from developing powerful AI. Worse, designs so clearly dangerous and unpredictable that even national governments would refrain from implementing them could still be run and released by the most risk-tolerant party observing the open-source development process.
This paper will explore the interaction of these potential advantages and disadvantages with several technological variables. In particular, it will consider the effects of varying the following parameters:
- The relative difficulties of creating powerful but dangerous AIs, and of creating powerful AIs with reliably safe motivations.
- The relative difficulties of designing an AI with safe motivations and of verifying the motivations of an AI design developed by others, a design that may contain the equivalent of 'backdoors'.
- The ease with which an advanced AI can be rapidly leveraged by a state or other organization to attain overwhelming military advantages sufficient to resist external aggression and to halt other AI development, i.e. the ease with which advanced AI can be used to produce a 'hard takeoff'.
Depending on the relevant parameter values, varying degrees of transparency and differing means of cooperative development, including open-source development, become feasible or infeasible. By making these dependencies explicit, we can better evaluate strategies and identify which parameters would be most valuable to learn more about when deciding among them.
Prior related work:
- Bill Hibbard's paper exists, and SIAI staff have done provisional analysis of related issues.
- Work on this paper has not yet begun, but closely related work has been done for other papers.
Target dates for:
[1] The "starting date" is the date (guaranteed to be within six months of the receipt of grant money) when we have skilled people to allocate to the project. Extra donations increase our base of skilled people and thereby increase the number of projects we can get to; the lagged start date allows us to find new people, bring them here, and train them.
- Conference fees, air travel, motel: $1,400
- Costs for researcher time: $4,500
How research costs are estimated:
- Person-months for research and writing: 1.875 (This is obtained by taking our standard estimate[1] of 1.25 person-months per conference paper and multiplying by 1.5, because this paper requires gathering historical data.)
- Hours per person-month: 160
- Dollars required to support one skilled full-time researcher-month[2]: $2,400
[2] This billing rate reflects an estimate of financial outlays for SIAI to create the equivalent of one full-time skilled researcher-month, including stipend or hosting expenses, workspace, and administrative or management time, and other supporting expenses. Actual person-months may be greater or lower depending on the labor mix for a particular project, with shortfalls made up from general funds. This rate is not reflective of the money researchers could earn in the competitive labor market. Think of this as a matched donation. You donate the living expenses; our researchers donate the surplus value of their labor.
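The arithmetic behind the figures above can be checked with a short sketch. All input values come from the estimate itself; the variable names are illustrative, and the combined total is an assumption (the estimate lists the two cost lines separately and does not state a sum).

```python
# Inputs taken from the cost estimate above.
BASE_PERSON_MONTHS = 1.25         # standard estimate per conference paper
DIFFICULTY_MULTIPLIER = 1.5       # applied because historical data must be gathered
HOURS_PER_PERSON_MONTH = 160
RATE_PER_RESEARCHER_MONTH = 2400  # dollars per skilled full-time researcher-month
TRAVEL_AND_CONFERENCE = 1400      # conference fees, air travel, motel (dollars)

# Person-months for research and writing.
person_months = BASE_PERSON_MONTHS * DIFFICULTY_MULTIPLIER

# Equivalent working hours.
research_hours = person_months * HOURS_PER_PERSON_MONTH

# Cost of researcher time in dollars.
researcher_cost = person_months * RATE_PER_RESEARCHER_MONTH

# Assumption: the two listed cost lines are simply added together.
combined_total = researcher_cost + TRAVEL_AND_CONFERENCE

print(f"Person-months:   {person_months}")
print(f"Research hours:  {research_hours:.0f}")
print(f"Researcher cost: ${researcher_cost:,.0f}")
print(f"Combined total:  ${combined_total:,.0f}")
```

The computed researcher cost matches the $4,500 listed for researcher time, which confirms that the person-month estimate and the billing rate are consistent with each other.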
How this paper will help reduce existential risk:
- Separately from the ability to create a safe or benevolent AI according to some set of values, if no methods exist for outside organizations to verify the motives of AIs, concern about distributional issues may provoke conflict or an arms race of reckless AI development in which all parties expect to do worse than in a cooperative solution. Better analysis of mechanisms for cooperation in AI design is thus valuable.
- The technical knowledge required to verify the motivations of an AI produced by others in an adversarial context should greatly assist in avoiding the accidental creation of powerful AI with dangerous motivations. Since the dangers of international conflict are better understood than the dangers of mistakes in designing AI motivations, drawing attention to the former application of this knowledge could bolster the latter application.
- Explicitly identifying relevant parameters affecting strategies for cooperative AI development can clarify thinking on the topic by facilitating separate analysis of the parameters rather than overly specific scenarios.
- The method of using jointly constructed or verified AI systems to enforce international agreements could be used to reduce many existential risks other than those from AI, and so may draw wider attention to the paper and to artificial intelligence issues.
Human capital benefits, or network benefits (Will writing this paper help new visiting fellows become familiar with key research domains? Will it help create relationships with outside co-authors? Will it give folks interested in existential risk entry into new communities where valuable contacts may be found?):
- Work on this paper may offer opportunities for discussion and collaboration with the Future of Humanity Institute and with machine ethics researchers who have investigated related topics.