A paraphrased transcript of a conversation with Eliezer Yudkowsky.
Interviewer: Suppose you’re talking to a smart mathematician who looks like the kind of person who might have the skills needed to work on a Friendly AI team. But, he says, “I understand the general problem of AI risk, but I just don’t believe that you can know so far in advance what in particular is useful to do. Any of the problems that you’re naming now are not particularly likely to be the ones that are relevant 30 or 80 years from now when AI is developed. Any technical research we do now depends on a highly conjunctive set of beliefs about the world, and we shouldn’t have so much confidence that we can see that far into the future.” What is your reply to the mathematician?
Eliezer: I’d start by having them read a description of a particular technical problem we’re working on, for example the “Löb Problem.” I’m writing up a description of that now. So I’d show the mathematician that description and say “No, this issue of trying to have an AI write a similar AI seems like a fairly fundamental one, and the Löb Problem blocks it. The fact that we can’t figure out how to do these things — even given infinite computing power — is alarming.”
A more abstract argument would be something along the lines of, “Are you sure the same way of thinking wouldn’t prevent you from working on any important problem? Are you sure you wouldn’t be going back in time and telling Alan Turing to not invent Turing machines because who knows whether computers will really work like that? They didn’t work like that. Real computers don’t work very much like the formalism, but Turing’s work was useful anyway.”
Interviewer: You and I both know people who are very well informed about AI risk but who retain more uncertainty than you do about what is best to do about it today. Maybe there are lots of other promising interventions out there, like pursuing cognitive enhancement, or doing FHI-style research looking for crucial considerations that we haven’t located yet (like Drexler discovering molecular nanotechnology, or Shulman discovering iterated embryo selection for radical intelligence amplification). Or perhaps we should focus on putting the safety memes out into the AGI community because, again, it’s too early to tell exactly which problems are going to matter, especially if you have a longer AI time horizon. What’s your response to that line of reasoning?
Eliezer: Work on whatever your current priority is, after an hour of meta reasoning but not a year of meta reasoning. If you’re still like, “No, no, we must think more meta” after a year, then I don’t believe you’re the sort of person who will ever act.
For example, Paul Christiano isn’t making this mistake, since Paul is working on actual FAI problems while looking for other promising interventions. I don’t have much objection to that. If he then came up with some particular intervention which he thought was higher priority, I’d ask about the specific case.
Nick Bostrom isn’t making this mistake, either. He’s doing lots of meta-strategy work, but he also works on anthropic probabilities, the parliamentary model for normative uncertainty, and other object-level topics, and he hosts people like Anders Sandberg who write papers about uploading timelines that are actually relevant to our policy decisions.
When people constantly say “maybe we should do some other thing,” I would say, “Come to an interim decision, start acting on the interim decision, and revisit this decision as necessary.” But if you’re the person who always tries to go meta and only thinks meta because there might be some better thing, you’re never going to actually do anything about the problem.