Today MIRI releases a new technical report: “Botworld 1.0” (pdf) by recent hires Nate Soares and Benja Fallenstein. The report is a “literate” Haskell file, available from MIRI’s GitHub page.
Soares explains the report in his accompanying Less Wrong post, which is also the preferred place for discussing it:
This report introduces Botworld, a cellular automaton that provides a toy environment for studying self-modifying agents.
The traditional agent framework, used for example in Markov Decision Processes and in Marcus Hutter’s universal agent AIXI, splits the universe into an agent and an environment, which interact only via discrete input and output channels.
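To make the separation concrete, here is a minimal Haskell sketch of such a Cartesian loop. It is only an illustration: the names `Agent`, `Env`, `runLoop`, `echoAgent` and `doubler` are hypothetical and do not come from the Botworld report, from AIXI, or from any MDP library. The point is that the agent and the environment share no state and touch each other only through the observation and action values passed between them.

```haskell
module Main where

-- Hypothetical types for illustration; not taken from the Botworld report.

-- The agent sees only observations and emits only actions.
newtype Agent obs act = Agent { step :: obs -> (act, Agent obs act) }

-- The environment sees only actions and emits only observations.
newtype Env obs act = Env { react :: act -> (obs, Env obs act) }

-- Run n rounds of interaction and return the (observation, action) history.
runLoop :: Int -> obs -> Agent obs act -> Env obs act -> [(obs, act)]
runLoop n o agent env
  | n <= 0    = []
  | otherwise =
      let (a, agent') = step agent o   -- agent maps observation to action
          (o', env')  = react env a    -- environment maps action to next observation
      in (o, a) : runLoop (n - 1) o' agent' env'

-- Toy agent: replies with the observation plus one.
echoAgent :: Agent Int Int
echoAgent = Agent (\o -> (o + 1, echoAgent))

-- Toy environment: replies with twice the action.
doubler :: Env Int Int
doubler = Env (\a -> (2 * a, doubler))

main :: IO ()
main = print (runLoop 3 1 echoAgent doubler)
-- prints [(1,2),(4,5),(10,11)]
```

Nothing in `Env` can reach inside `Agent`: the only way the environment influences the agent is by choosing the next observation.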
Such formalisms are perhaps ill-suited for real self-modifying agents, which are embedded within their environments. Indeed, the agent/environment separation is somewhat reminiscent of Cartesian dualism: any agent using this framework to reason about the world does not model itself as part of its environment. For example, such an agent would be unable to understand the concept of the environment interfering with its internal computations, e.g. by inducing errors in the agent’s RAM through heat.
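The RAM example can be sketched in a few lines of Haskell. This is a toy model under stated assumptions, not Botworld's actual data types: `World`, `robotMemory`, `tick`, `act` and the temperature threshold of 100 are all invented for illustration. The agent's memory is simply one field of the world state, so ordinary environment dynamics can change what the agent computes.

```haskell
module Main where

-- Hypothetical world state: the robot's memory is just another part of the world.
data World = World
  { temperature :: Int
  , robotMemory :: [Int]   -- the agent's "RAM", stored inside the environment
  } deriving (Eq, Show)

-- The agent's behaviour depends on its own memory: here it outputs the sum.
act :: World -> Int
act = sum . robotMemory

-- Environment dynamics: above an assumed threshold of 100,
-- heat zeroes the first memory cell.
tick :: World -> World
tick w
  | temperature w > 100 = w { robotMemory = corrupt (robotMemory w) }
  | otherwise           = w
  where
    corrupt []       = []
    corrupt (_ : xs) = 0 : xs

main :: IO ()
main = do
  let cool = World 20  [3, 4]
      hot  = World 150 [3, 4]
  print (act (tick cool))  -- prints 7: memory intact
  print (act (tick hot))   -- prints 4: heat corrupted the first cell
```

An agent that models itself as a fixed function outside the world has no way to represent the second case, because in its model `robotMemory` is not something the environment can touch.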
Intuitively, this separation does not seem to be a fatal flaw, but merely a tool for simplifying the discussion. We should be able to remove this “Cartesian” assumption from formal models of intelligence. However, the concrete non-Cartesian models that have been proposed (such as Orseau and Ring’s formalism for space-time embedded intelligence, Vladimir Slepnev’s models of updateless decision theory, and Yudkowsky and Herreshoff’s tiling agents) depart significantly from their Cartesian counterparts.
Botworld is a toy example of the type of universe that these formalisms are designed to reason about: it provides a concrete world containing agents (“robots”) whose internal computations are a part of the environment, and allows us to study what happens when the Cartesian barrier between an agent and its environment breaks down. Botworld allows us to write decision problems where the Cartesian barrier is relevant, program actual agents, and run the system.
As it turns out, many interesting problems arise when agents are embedded in their environment. For example, agents whose source code is readable may be subjected to Newcomb-like problems by entities that simulate the agent and choose their actions accordingly.
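A Newcomb-like setup of this kind can be sketched as follows. Everything here is a hypothetical illustration rather than a Botworld game: the tiny `Program` language, the `run` interpreter, and the payoffs (1,000,000 in the opaque box and 1,000 in the transparent one, as in the usual statement of Newcomb's problem) are assumptions made for the example. The predictor reads the agent's source, simulates it, and fills the opaque box only if the simulation one-boxes.

```haskell
module Main where

data Choice = OneBox | TwoBox deriving (Eq, Show)

-- Hypothetical "source code" that other entities can read and run.
data Program
  = Return Choice   -- always make this choice
  | Flip Program    -- do the opposite of the inner program
  deriving Show

-- Interpreter: running the source yields the agent's choice.
run :: Program -> Choice
run (Return c) = c
run (Flip p)   = case run p of
  OneBox -> TwoBox
  TwoBox -> OneBox

-- The predictor simulates the agent's source before the agent acts,
-- then the agent runs for real and collects its payoff.
payoff :: Program -> Int
payoff prog =
  let prediction = run prog                                 -- simulation by the predictor
      opaqueBox  = if prediction == OneBox then 1000000 else 0
      choice     = run prog                                 -- the agent's actual run
  in case choice of
       OneBox -> opaqueBox
       TwoBox -> opaqueBox + 1000

main :: IO ()
main = do
  print (payoff (Return OneBox))         -- prints 1000000
  print (payoff (Return TwoBox))         -- prints 1000
  print (payoff (Flip (Return TwoBox)))  -- prints 1000000
```

Because the prediction and the choice come from running the same source, the agent cannot choose its action independently of what the predictor saw.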
Furthermore, certain obstacles to self-reference arise when non-Cartesian agents attempt to achieve confidence in their future actions. Some of these issues are raised by Yudkowsky and Herreshoff; Botworld gives us a concrete environment in which we can examine them.
One of the primary benefits of Botworld is concreteness: when working with abstract problems of self-reference, it is often very useful to see a concrete decision problem (“game”) in a fully specified world that directly exhibits the obstacle under consideration. Botworld makes it easier to visualize these obstacles.
Conversely, Botworld also makes it easier to visualize suggested agent architectures, which in turn makes it easier to visualize potential problems and probe the architecture for edge cases.
Finally, Botworld is a tool for communicating. It is our hope that Botworld will help others understand the varying formalisms for self-modifying agents by giving them a concrete way to visualize such architectures being implemented. Furthermore, Botworld gives us a concrete way to illustrate various obstacles, by implementing Botworld games in which the obstacles arise.
Botworld has helped us gain a deeper understanding of varying formalisms for self-modifying agents and the obstacles they face. It is our hope that Botworld will help others more concretely understand these issues as well.