Research Program

Research Areas

Our Phase I research divides into two categories:

  • Theoretical Research
  • System Design and Implementation (primarily Phase II)

Theoretical Research

Research Area 1: Mathematical Theory of General Intelligence

Our research in this area will focus on using algorithmic information theory and probability theory to formalize the notion of general intelligence, specifically ethical general intelligence. Important work in this area has been done by Marcus Hutter, Juergen Schmidhuber, Shane Legg, and others, as well as by our team, but this work has not yet been connected with pragmatic AGI designs. Making that connection is one of our major goals going forward. Specific focus areas within this domain include:

  • Mathematical Formalization of the "Friendly AI" Concept. Proving theorems about the ethics of AI systems is an important research goal, but it presupposes an appropriate formalization of what ethical behavior means for an AI. Arriving at that formalization is a difficult research question in its own right.
  • Implications of Algorithmic Information Theory for the Predictability of Arbitrarily Intelligent AIs. In 2006, Shane Legg made an interesting but ultimately unsuccessful attempt to prove algorithmic-information-theoretic limits on the possibility of guaranteeing ethical behavior on the part of future AIs. This line of research nevertheless has significant potential for future exploration.
  • Formalizing the Concept of General Intelligence. Shane Legg and Marcus Hutter published a paper in 2006 presenting a formal definition of general intelligence. Their work is excellent but can be extended in various ways; in particular, work is needed on connecting these ideas with practical intelligence tests for AGIs.
  • Reflective Decision Theory: Extending Statistical Decision Theory to Strongly Self-Modifying Systems. Statistical decision theory, as it stands, tells us little about software systems that regularly make decisions to modify their own source code in radical ways. This deficit must be remedied if we wish to formally understand self-modifying AGI systems, their potential dangers, and potential routes to ensuring that they remain safe and beneficial over the long term.
  • Dynamics of Goal Structures Under Self-Modification. Under what conditions will an AGI system's internal goal structure remain invariant as the system self-modifies? Even if one of the system's top-level goals is precisely this sort of goal-system invariance, that alone is clearly not enough to guarantee it. Additional conditions are needed, but the nature of these conditions has not been seriously investigated. This is a deep mathematical issue in the dynamics of computational intelligence, with obvious critical implications for the creation of stably beneficial AGI.
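
For orientation, the formal definition of general intelligence mentioned above is usually stated in roughly the following form. This is a sketch from the published Legg and Hutter work, not a formula from this document, and the symbols (Υ, π, μ, E, K, V) follow their notation rather than anything defined here.

```latex
% Universal intelligence of an agent \pi (Legg-Hutter form; notation assumed):
%   E        : the class of computable, reward-bounded environments
%   K(\mu)   : Kolmogorov complexity of environment \mu
%   V^\pi_\mu: expected total reward of agent \pi interacting with \mu
\Upsilon(\pi) \;:=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu},
\qquad
V^{\pi}_{\mu} \;:=\; \mathbb{E}\!\left[\sum_{i=1}^{\infty} r_i\right] \le 1 .
```

Because K is not computable, this measure cannot be evaluated directly, which is one reason a link to practical intelligence tests requires further work.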

Research Area 2: Friendliness Theory

A central view of our research team is that ethical issues must be placed at the center of AGI research, rather than tacked on peripherally to AGI designs created without attention to ethical considerations. Several of our focus areas have direct implications for AGI ethics (particularly the investigation of goal system stability), but we also intend to heavily investigate several other issues related to AGI and ethics, including:

  • Formalizing the Theory of Coherent Extrapolated Volition. SIAI Research Fellow Eliezer Yudkowsky has proposed "coherent extrapolated volition" (CEV) as a way of arriving at a top-level supergoal for an AI system that represents the collective desires of a population of individuals. While fascinating, the idea has only been presented informally, and a mathematical formalization seems necessary so that its practical viability can be assessed. For example, it is of interest to try to articulate formally the conditions under which the CEV of a population of individual agents, appropriately defined, will exist. This may depend on the coherence versus divergence of the beliefs or mind-states of the individuals.
  • Framework for Formalizing Desired Beneficial Outcomes. To create safe and beneficial AI systems, we must have a clear vision of what constitutes a beneficial outcome. The recently developed science of positive psychology is making great strides in understanding the elements that promote human happiness. Political philosophy has studied a wide variety of approaches to structuring "the good society" in a way that maximizes the benefits to its citizens. We will work toward creating a framework that formalizes these kinds of insights so that they can be considered for AI goal systems.
  • Decision-Theoretic and Game-Theoretic Foundations for the Ethical Behavior of Advanced AIs. Microeconomics and decision theory study the nature of individual preferences and their influence on behavioral outcomes. Game theory is the core mathematical theory of decision making by interacting agents. We will use these tools to analyze the likely behavior of alternative models for the safe deployment of advanced self-modifying AIs. The preferences of an agent together with the behavior of other agents in its environment determine the actions it will take. We must design the preferences of agents so that their collective behavior produces the results we desire and is stable against internal corruption or external incursion.
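
As a toy illustration of the last point, that agents' preferences together with the behavior of other agents determine which outcomes are stable, the sketch below enumerates the pure-strategy Nash equilibria of a two-player game and shows how changing the payoffs moves the equilibrium. The payoff numbers, the function names, and the idea of modeling preference design as a flat penalty on defection are all assumptions made for illustration; none of them come from the research program itself.

```python
from itertools import product


def pure_nash_equilibria(payoffs_a, payoffs_b):
    """Return all (row, col) pure-strategy Nash equilibria of a two-player game.

    payoffs_a[r][c] is the row player's payoff and payoffs_b[r][c] is the
    column player's payoff when the row player picks r and the column
    player picks c.
    """
    n_rows, n_cols = len(payoffs_a), len(payoffs_a[0])
    equilibria = []
    for r, c in product(range(n_rows), range(n_cols)):
        # Row player cannot gain by switching rows while the column is fixed.
        row_ok = all(payoffs_a[r][c] >= payoffs_a[alt][c] for alt in range(n_rows))
        # Column player cannot gain by switching columns while the row is fixed.
        col_ok = all(payoffs_b[r][c] >= payoffs_b[r][alt] for alt in range(n_cols))
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


def penalize_defection(payoffs_a, payoffs_b, penalty):
    """Hypothetical 'preference design': subtract a penalty whenever a player defects.

    Action 0 is cooperate and action 1 is defect for both players.
    """
    new_a = [[v - (penalty if r == 1 else 0) for v in row]
             for r, row in enumerate(payoffs_a)]
    new_b = [[v - (penalty if c == 1 else 0) for c, v in enumerate(row)]
             for row in payoffs_b]
    return new_a, new_b


# Illustrative Prisoner's Dilemma payoffs (assumed values).
PD_A = [[3, 0], [5, 1]]
PD_B = [[3, 5], [0, 1]]

if __name__ == "__main__":
    print("original game:", pure_nash_equilibria(PD_A, PD_B))
    designed_a, designed_b = penalize_defection(PD_A, PD_B, penalty=3)
    print("with penalty: ", pure_nash_equilibria(designed_a, designed_b))
```

With these assumed payoffs, the only stable outcome of the original game is mutual defection, while the penalized game has mutual cooperation as its only pure equilibrium. Real analyses of self-modifying agents would need far richer models; the sketch shows only the basic mechanism of shaping collective behavior through preferences.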

System Design and Implementation

Research Area 3: AGI Design

This is arguably the most critical component of the path to AGI. As noted earlier, AGI design and engineering will be our central focus in Phase II. In Phase I, however, our work in this area will focus on the comparison and formalization of existing AGI designs. This is crucial, as it will lead to a better understanding of the strong and weak points in our present understanding of AGI, and form the foundation for creating new AGI designs, as well as analyzing and modifying existing AGI designs.