We’re not Writing the Laws
August 13th, 2007
Numerous readers in the past have posted suggestions for how the universe should be run, in the form of laws, rights, or moral principles. I don’t want to embarrass them, so I won’t post links here, but if you have made such a suggestion then feel free to repost it in the comments below.
We know we’re not (yet) wise enough to write down scientific theories or fashion standards or music styles to be used for the rest of time. These objects are the output of an optimization process; human beings have spent long years studying, thinking, observing, and testing to develop them. If we want future generations to continue this process, it is necessary to communicate the target of the optimizer, not just its output so far.
Wikipedia is not an AI scientist; iTunes is not an AI composer. To build an AI scientist or an AI composer, we need to load in the optimization targets of science and music. Similarly, the 10 Commandments, the Bill of Rights, and Asimov’s Laws are not moralities; they are the output of human moralists. To build an AI moralist, you need to load in the right optimization target; that target is what I’m calling a morality.
Volition extrapolation is our current idea of how to load in optimization targets from existing humans.

Sounds similar to democracy – “loading”, negotiating, “cohering” and reconciling a diversity of long and short term “optimization targets”. At present, we just use polls.
In that context, however, we do make laws that have direct effects on how this process works – campaign finance, redistricting, polling place regulations, etc.
If the details of the process we use for “extrapolating volition” (in other words: voting) are so critical to the outcome of our civilization, shouldn’t we be making what amounts to law?
“If the details of the process we use for “extrapolating volition” (in other words: voting)”
Voting is not extrapolating volition. We know from evolutionary psychology that humans have many of the same basic moral urges; there should therefore be a coherent theory of morality that 95% of the population would accept, given time to think about it and the correct answers to factual questions. The problem is, we don’t know what that theory is, and if we tried to figure it out we’d probably get it wrong. So we have to build an AGI which will figure it out for us.
Humans (and all systems), in their striving to survive, develop more sophisticated ways of dealing with their environment so that it can satisfy their needs. The more sophisticated our ways of doing this, the “smarter” we are. The same applies to AI: the more variables it can process from the environment, the smarter it will be, even if it is just a pure calculating machine.
This is the very essence of the ‘optimizer’, as you call it. But we need not pass the optimizer’s target on to the AI; it will find it by itself. Our target is not built into us because we wanted to find it, but because millions of years of trying what works for us and what does not made it a required trait.
I agree with your premises but am unsure about your conclusions. I do think that morality is analogous to science in this respect. However, there is an ideal scientific theory: the one that actually fits all observation. If it is easier to find this than to create an AI scientist that is more effective than us, we may just want to find the theory directly. I think this may well be easier: I wouldn’t be very surprised if a reductionist theory of everything was found in ten years, but would be shocked if a super-human AI scientist was built by then.
Thinking about it more, I am actually proposing a closer analogy between morality and science than you seem to be, for the real analogy to an AI-scientist is an AI-moral-philosopher which tries to find the ethical truth. CEV is analogous to a Coherent Extrapolated Belief machine that tries to work out what we would end up believing about the laws of physics: this may be the physical truth, but needn’t be.
Contra Tom McCabe, I don’t think that morality is anything like a poll: even under conditions of extra time for thought. It is possible for us to all get it wrong: just like science.
“I wouldn’t be very surprised if a reductionist theory of everything was found in ten years”
A physical Theory of Everything would still not account for higher-level behavior in a useful way. Sure, you know complex biological processes (say) reduce to physics, but do you know how? Building an AI scientist would probably be easier than creating a real TOE (one that explains all higher-level phenomena).
“I don’t think that morality is anything like a poll: even under conditions of extra time for thought. It is possible for us to all get it wrong: just like science.”
There’s an external physical world for our science to be wrong about. It’s not clear that there’s a similar external moral world (outside of our own collective volition). Say you have a precisely specified moral theory; how, pray tell, do you test it against ‘reality’?
[...] Intelligence, is sorting out the user requirements for a friendly artificial intelligence. A recent post in the SIAI weblog points to “Coherent Extrapolated Volition”, Yudkowsky’s paper [...]
If we entertain the notion that the RPOP will have subjective experience, as an inexorable result of possessing powerful general intelligence (which is my guess), then could we direct it to determine volition *without* it directly experiencing the mind(s) that it is extrapolating? It seems like that should be possible. The brain is bound by, and follows, the laws of physics. Could the extrapolated brain(s) be presented to the RPOP strictly as physical objects that behave deterministically (according to the fundamental laws of physics)? More specifically, can we give the RPOP a high-level goal that says “Implement Humanity’s CEV” and a lower high-level goal that says “Do not write the extrapolated volition/mind into your own mind-file”, i.e. without performing any subjective “simulations” of those mind(s)? Or, if necessary, perhaps tell the RPOP to write the extrapolated brain/mind to a discrete file separate from its mind-file. That at least seems possible from my POV.
Tom: So CEV generates only instructions and arguments as output, but would leave the decision to implement them to us, within the democratic contexts in which we currently operate. That sounds both accurate and reasonable.
So not only does CEV have to output a broad, optimized ethical and political system, but it has to output it in a way //we// can understand well enough to see exactly how much better it is, possibly prior to implementation.
Pretty tall order. Good luck!