Research Grants

Academic Paper Grant

Containing Superintelligence: Feasibility and Strategies

Research summary:

SIAI's research on AI safety has focused on the concept of "Friendly AI" -- AI proven to have goals that, under certain assumptions, stably align with humanity's, even as the AI rewrites its own code and increases its intelligence. For AI designs expected to become extremely powerful, this degree of goal alignment could be essential. However, other classes of precautions involve constructing systems specifically to reduce the likelihood of unwanted rapid self-improvement, "intelligence explosion," or increase in power. In particular, AI could be limited in its access to resources and manipulators, i.e., "kept in a box." Or it might be designed with goals such that it has only weak incentives to obtain such access, e.g., an "Oracle AI" concerned only with answering questions without utilizing external resources or one with extreme temporal discounting. This paper will analyze some of these strategies, their ease of implementation, and their safety.


Planned contents include: 

  • An introduction to the problem of Friendly AI and the Singularity.
  • A description of the circumstances under which Friendly AI, or full alignment of goals between AI and its creators, is most essential.
  • A discussion of the advantages and disadvantages of strategies resembling "AI Boxing", and an exploration of the space of such strategies.
  • A discussion of the risks of boxed AIs escaping, and reasons why these might be easy to underestimate.
  • An analysis of the feasibility, costs, and benefits of creating AI with narrow goals that do not create strong drives to obtain power and resources.
  • An investigation of solutions to some problems with these strategies.


Prior related work:

Eliezer Yudkowsky's AI Box Experiment

AI as a Positive and Negative Factor in Global Risk, by Eliezer Yudkowsky

Stephen Omohundro's Basic AI Drives, which discusses the pressures on non-Friendly AIs to seek out resources and manipulators


Target dates for:

  • Extended abstract (Posting an extended abstract on the SIAI website, and circulating it to a few related academics for comment): 2 weeks after start date.[1]
  • Working paper (Posting a working paper on the SIAI website; circulating to related academics): 6 weeks after start date.
  • Conference submission: 10 weeks after start date.
  • Follow-up steps (Brainstorming and drafting proposals for any follow-up publications. Should it be developed into a journal paper?): 12 weeks after start date.

 

[1] The "start date" is the date (guaranteed to be within six months of the receipt of grant money) when we have skilled people to allocate to the project.  Extra donations increase our base of skilled people and thereby increase the number of projects we can get to; the lagged start date allows us to find new people, bring them here, and train them.


Total budget:  $4,400

Conference fees, air travel, motel: $1,400
Costs for researcher time: $3,000

How research costs are estimated:
  • Researcher-months for research and writing: 1.25 (This is our standard estimate of the time required for conference articles.[1])
  • Dollars required to support one skilled full-time researcher-month[2]: $2,400
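
As a quick check, the budget figures above can be reproduced with a few lines of arithmetic. This is only an illustrative sketch: the variable names are ours, and the numbers are copied directly from the budget lines in this proposal.

```python
# Figures copied from the budget in this proposal; names are illustrative.
RESEARCHER_MONTHS = 1.25            # standard estimate per conference paper
COST_PER_RESEARCHER_MONTH = 2400    # dollars per full-time researcher-month
TRAVEL_AND_FEES = 1400              # conference fees, air travel, motel

# Researcher time is months of work multiplied by the monthly rate.
researcher_cost = RESEARCHER_MONTHS * COST_PER_RESEARCHER_MONTH

# Total budget is researcher time plus conference-related costs.
total_budget = researcher_cost + TRAVEL_AND_FEES

print(f"Researcher time: ${researcher_cost:,.0f}")
print(f"Total budget:    ${total_budget:,.0f}")
```

The two results match the "Costs for researcher time" and "Total budget" lines stated above.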

[1] Our base estimate is 1.25 person-months per conference paper, and 3 per journal article, for an experienced full-time researcher. This estimate takes the planning fallacy, and the importance of an outside view in avoiding that fallacy, into account. While typical rates of article production by professors are extremely low, the distribution is strongly skewed towards research-oriented universities and departments, and informal surveys of researchers working on existential risks give data consistent with this estimate for full-time work required per paper. Visiting Fellows vary in their experience levels, so that mean productivity is expected to be lower, but a team mix can be selected to account for this.

[2] This billing rate reflects an estimate of financial outlays for SIAI to create the equivalent of one full-time skilled researcher-month, including stipend or hosting expenses, workspace, and administrative or management time, and other supporting expenses. Actual person-months may be greater or lower depending on the labor mix for a particular project, with shortfalls made up from general funds. This rate is not reflective of the money researchers could earn in the competitive labor market. Think of this as a matched donation. You donate the living expenses; our researchers donate the surplus value of their labor.
 

How this paper will help reduce existential risk:


Research benefits (What ideas will the paper explore?  How will that knowledge help with existential risk?):

  • The paper will explore the safety and feasibility of alternative strategies for reducing AI risk.
  • If Friendly AI is very difficult relative to AGI, and global regulation of AI is infeasible, then strategies such as these may be the most productive way to reduce catastrophic AI risks.


Influence benefits (What target audience will the paper impact, how?  How will that impact help with existential risk?):

  • If ideas in the paper, or work inspired or caused by it, are of sufficient quality, they may lead future AI researchers to implement improved safety measures.
  • Analysis of limited, suboptimal solutions can help to avoid the impression of "making the perfect the enemy of the good," and place ideal solutions in context.
  • A comprehensive discussion of both advantages and disadvantages of these proposals can also draw attention to their limits, and the risk of their failure.
  • A call for further investigation into the feasibility of risk-reducing measures such as these may help to inspire research efforts in this area, which may be easier for academics to contribute to today than the development of a theory of Friendly AI.


Human capital benefits, or network benefits (Will writing this paper help new Visiting Fellows become familiar with key research domains?  Will it help create relationships with outside co-authors?  Will it give folks interested in existential risk entry into new communities where valuable contacts may be found?):

  • Visiting Fellows working on this paper will become more familiar with issues around a key domain of AI risk. The paper may also create relationships with researchers interested in the problem of AI safety in 'softer takeoff' scenarios.




