Paul Rosenbloom on Cognitive Architectures


Paul S. Rosenbloom is Professor of Computer Science at the University of Southern California and a project leader at USC’s Institute for Creative Technologies. He was a key member of USC’s Information Sciences Institute for two decades, leading new directions activities over the second decade, and finishing his time there as Deputy Director. Earlier he was on the faculty at Carnegie Mellon University (where he had also received his MS and PhD in computer science) and Stanford University (where he had also received his BS in mathematical sciences with distinction).

His research concentrates on cognitive architectures – models of the fixed structure underlying minds, whether natural or artificial – and on understanding the nature, structure and stature of computing as a scientific domain.  He is a AAAI Fellow, the co-developer of Soar (one of the longest standing and most well developed cognitive architectures), the primary developer of Sigma (which blends insights from earlier architectures such as Soar with ideas from graphical models), and the author of On Computing: The Fourth Great Scientific Domain (MIT Press, 2012).

Luke Muehlhauser: From 1983 to 1998 you were co-PI of the Soar project, perhaps the longest-running cognitive architecture effort aimed at artificial general intelligence, which is still under development. One of your current projects is a new cognitive architecture called Sigma, which is “an attempt to build a functionally elegant, grand unified cognitive architecture/system – based on graphical models and piecewise continuous functions – in support of virtual humans and intelligent agents/robots.” What lessons did you learn from Soar, and how have they informed your work on Sigma?

Paul S. Rosenbloom: That’s an interesting and complex question, and one I’ve thought about off and on over the past few years, as Sigma was motivated by both the strengths and weaknesses of Soar. My current sense is that there were five lessons from Soar that have significantly impacted Sigma.

The first is the importance of seeking unified architectures for cognition. Not focusing on individual capabilities in isolation helps avoid local optima on the path towards general intelligence and opens the way for deep scientific results that are only accessible when considering interactions among capabilities.  In Sigma, this idea has been broadened from unification across cognitive capabilities, which is the primary concern in Soar, to unification across the full range of capabilities required for human(-level) intelligent behavior (including perception, motor, emotion, etc.). This is what is meant by grand unified.

The second is the importance of a uniform approach to architectures – at least as was exhibited through version 8 of Soar – in combination with Allen Newell’s exhortation to “listen to the architecture.” Many of the most interesting results from Soar arose from exploring how the few mechanisms already in the architecture could combine to yield new functional capabilities without adding new modules specifically for them. This has been recast in Sigma as functional elegance, or yielding something akin to “cognitive Newton’s laws”. This doesn’t necessarily mean allowing only one form of each capability in the architecture, as is implied by strict uniformity, but it does suggest searching for a small set of very general mechanisms through whose interactions the diversity of intelligent behavior arises. It also emphasizes deconstructing new capabilities in terms of existing architectural mechanisms in preference to adding new architectural mechanisms, and preferring to add microvariations to the architecture, rather than whole new modules, when extensions are necessary.

The third is the importance of having a working system that runs fast enough to provide feedback on the architecture from complex experiments. This is reflected in Sigma’s goal of sufficient efficiency.

The fourth is the functional elegance of a trio of nested control loops (reactive, deliberative and reflective); where flexibility increases (and speed decreases) with successive loops; where each earlier control loop acts as the inner loop of the next one; and where there is a mechanism such as chunking that can compile results generated within later/outer/slower loops into knowledge that is more directly accessible in earlier/inner/faster loops. This control structure is the largest conceptual fragment of Soar that has been carried directly over to Sigma, with work currently underway on how to generalize chunking appropriately for Sigma.
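As an editorial illustration of the nested control loops described above, here is a minimal Python sketch. It is not Soar or Sigma code. Every name in it (`ToyAgent`, `OPS`, `react`, `deliberate`, `reflect`, `run`) is hypothetical, the toy task (reach a goal integer using "add one" and "double") is invented for the example, breadth-first search merely stands in for slow reflective problem solving, and caching the found path in a dictionary is only a rough analogue of chunking.

```python
from collections import deque

# Hypothetical toy operators; assumes non-negative integer states.
OPS = {"inc": lambda n: n + 1, "dbl": lambda n: n * 2}


class ToyAgent:
    """Toy analogue of nested reactive/deliberative/reflective loops."""

    def __init__(self, goal):
        self.goal = goal
        self.rules = {}    # learned knowledge: state -> operator name
        self.searches = 0  # how often the slow reflective path was needed

    def react(self, state):
        """Innermost, fastest step: a single rule lookup."""
        return self.rules.get(state)

    def deliberate(self, state):
        """Middle loop: chain reactive steps until the goal or an impasse."""
        while state != self.goal:
            op = self.react(state)
            if op is None:
                return state, False  # impasse: no rule applies here
            state = OPS[op](state)
        return state, True

    def reflect(self, state):
        """Slow path: search for a route to the goal, then store it as rules."""
        self.searches += 1
        parent = {state: None}
        queue = deque([state])
        while queue:
            cur = queue.popleft()
            if cur == self.goal:
                break
            for name, fn in OPS.items():
                nxt = fn(cur)
                if nxt <= self.goal and nxt not in parent:
                    parent[nxt] = (cur, name)
                    queue.append(nxt)
        else:
            raise ValueError("goal not reachable from this state")
        # Chunking-like step: compile the search result into fast rules.
        node = self.goal
        while parent[node] is not None:
            prev, name = parent[node]
            self.rules[prev] = name
            node = prev

    def run(self, start):
        """Outermost loop: deliberate, and reflect whenever an impasse occurs."""
        state, done = self.deliberate(start)
        while not done:
            self.reflect(state)
            state, done = self.deliberate(state)
        return state
```

Calling `ToyAgent(goal=10).run(1)` triggers one search on the first run. A second call to `run(1)` on the same agent is handled entirely by the fast lookups, which is the speed-up that compiling outer-loop results into inner-loop knowledge is meant to provide.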

The fifth is that the earlier uniform versions of Soar did not provide sufficient architectural capability for full cognitive unification, and fell even further short when considering grand unification. Instead of leading the development of Sigma away from Soar’s original uniformity assumption, as has been the case with version 9 of Soar, this lesson led to a search for a more general set of uniform mechanisms; and, in particular, to the centrality of graphical models and piecewise-linear functions in Sigma. Graphical models yield state of the art algorithms across symbol, probability and signal processing, while piecewise-linear functions provide a single representation that can both approximate arbitrary continuous functions as closely as desired and be appropriately restricted for probabilistic and symbolic functions. The goal given this combination is to create a new breed of architecture that can provide the diversity of intelligent behavior found in existing state-of-the-art architectures, such as Soar and ACT-R, but in a simpler and more theoretically elegant fashion, while also extending in a functionally elegant manner to grand unification.
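The claim that piecewise-linear functions can approximate continuous functions as closely as desired can be illustrated with a short Python sketch. This is an editorial example and not Sigma's actual representation: the helper names `make_piecewise_linear` and `max_error` are hypothetical, the sketch handles only one-dimensional functions on equal-width segments, and `math.sin` is an arbitrary choice of target function.

```python
import math
from bisect import bisect_right


def make_piecewise_linear(f, lo, hi, segments):
    """Return a piecewise-linear interpolant of f on [lo, hi]."""
    xs = [lo + (hi - lo) * i / segments for i in range(segments + 1)]
    ys = [f(x) for x in xs]

    def approx(x):
        # Locate the segment containing x, clamped to the valid range.
        i = min(max(bisect_right(xs, x) - 1, 0), segments - 1)
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return approx


def max_error(f, approx, lo, hi, samples=2001):
    """Largest sampled gap between f and its approximation on [lo, hi]."""
    points = (lo + (hi - lo) * k / (samples - 1) for k in range(samples))
    return max(abs(f(x) - approx(x)) for x in points)


if __name__ == "__main__":
    lo, hi = 0.0, 2 * math.pi
    for n in (4, 16, 64):
        pw = make_piecewise_linear(math.sin, lo, hi, n)
        print(n, "segments: max sampled error", max_error(math.sin, pw, lo, hi))
```

For a smooth function such as sine, the worst-case error of linear interpolation shrinks roughly with the square of the segment width, so adding segments drives the error toward zero.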

Luke: Where does Sigma fit into the space of AGI-aimed cognitive architectures? How is it similar to, and different from, AIXI, LIDA, DUAL, etc.? And why do you think Sigma might offer a more promising path toward AGI than these alternatives?

Paul: Sigma can be viewed as an attempt to merge together what has been learned from three decades of work in both cognitive architectures and graphical models (although clearly reflecting a personal perspective on what has been learned).  It is this combination that I believe provides a particularly promising path toward AGI.  Although a variety of cognitive architectures have been developed over the years, no existing architecture comes close to fully leveraging the potential of graphical models to combine (broad) generality with (state of the art) efficiency.

I’m not at this point going to get into detailed comparisons with specific architectures, but one particularly illustrative dimension for comparison with the three you mentioned concerns functional elegance.  AIXI is at one extreme, attempting to get by with a very small number of basic mechanisms.  LIDA is at the other extreme, with a large number of distinct mechanisms.  I’m not terribly familiar with DUAL, but from what I have seen it is closer to AIXI than LIDA, although not as extreme as AIXI.  Along this dimension, Sigma is closest to DUAL, approaching the AIXI end of the spectrum but stopping short in service of sufficient efficiency.

With respect to your first question, concerning the broader space of AGI-aimed architectures, there are additional dimensions along which Sigma can also be situated, although I’ll only mention a few of them explicitly here.  First, in the near term Sigma is more directly aimed at functional criteria than criteria concerned with modeling of human cognition.  There is however an indirect effect of such modeling criteria through lessons transferred from other architectures, and there is an ultimate intent of taking them more directly into consideration in the future.  Second, Sigma sits somewhere in the midrange on the dimension of how formal/mathematical the architecture is, with distinctly informal aspects – such as the nested control loops – combined with formal aspects inherited from graphical models.  The third and fourth dimensions are the related ones of high-level versus low-level cognition and central versus peripheral cognition.  Sigma is aimed squarely at the full spans of both of these dimensions, even with a starting point that has been more high level and central.

Luke: When discussing AGI with AI scientists, I often hear the reply that “AGI isn’t well-defined, so it’s useless to talk about it.” Has that been much of a problem for you? Which operational definition of AGI do you tend to use for yourself?

Paul: I tend to take a rather laid back position on this overall issue. Many AI scientists these days are uncomfortable with the vagueness associated with notions of general intelligence, and thus limit themselves to problems that are clearly definable, and for which progress can be measured in some straightforward, preferably quantifiable, manner. Some of the initial push for this came from funders, but there is a strong inherent appeal to the approach, as it makes it easy to say what you are doing and to determine whether or not you are making progress, even if the metric along which you are making progress doesn’t quite measure what you originally set out to achieve. It also makes it easier to defend what you are doing as science.

However, in addition to working on Sigma over the past few years, I’ve also spent considerable time reflecting on the nature of science, and on the question of where computing more broadly fits in the sciences. The result is a very simple notion of what it means to do science, which is to (at least on average) increase our understanding over time. (More on this, and on the notion of computing as science – and in fact, as a great scientific domain that is the equal of the physical, life and social sciences – can be found in On Computing: The Fourth Great Scientific Domain, MIT Press, 2012.) There is obviously room within this definition for precisely measured work on well-defined problems, but also for more informal work on less well-defined problems.  Frankly, I often learn more from novel conjectures that get me thinking in entirely new directions, even when not yet well supported by evidence, than small ideas carefully evaluated.  This is one of the reasons I’ve enjoyed my recent participation in the AGI community.  Even though the methodology can often seem quite sketchy, I am constantly challenged by new ideas that get me thinking beyond the confines of where I currently am.

My primary goal ever since I got into AI in the 1970s has been the achievement of general human-level intelligence.  There have been various attempts over the years to define AGI or human-level intelligence, but none have been terribly satisfactory.  When I teach Intro to AI classes, I define intelligence loosely as “The common underlying capabilities that enable a system to be general, literate, rational, autonomous and collaborative,” and a cognitive architecture as the fixed structures that support this in natural and/or artificial systems.  It is this fixed structure that I have tried to understand and build over most of the past three-to-four decades, with Soar and Sigma representing the two biggest efforts – spanning roughly two decades of my effort between them – but with even earlier work that included participating in a project on instructable production systems and developing the XAPS family of activation-based production system architectures.

The desiderata of grand unification, functional elegance, and sufficient efficiency were developed as a means of evaluating and explaining the kinds of things that I consider to be progress on architectures like Sigma. Increasing the scope of intelligent behavior that can be yielded, increasing the simplicity and theoretical elegance of the production of such behaviors, and enabling these behaviors to run in (human-level) real time – i.e., at roughly 50 ms per cognitive cycle – are the issues that drive my research, and how I personally evaluate whether I am making progress. Unfortunately these are not generally accepted criteria at traditional AI conferences, so I have not had great success in publishing articles about Sigma in such venues. But there are other, less traditional venues, such as AGI and an increasing number of others – e.g., the AAAI 2013 Fall Symposium on Integrated Cognition that Christian Lebiere and I are co-chairing – that are more open minded on criteria, and thus more receptive to such work. There are similar issues with funders, where these kinds of criteria do not, for example, fit the traditional NSF model well, but there are other funders who do get excited by the potential of such work.

Luke: Stuart Russell organized a panel at IJCAI-13 on the question “What if we succeed [at building AGI]?”

Many AI scientists seem to think of AGI merely in terms of scientific achievement, which would be rather incredible: “In the long run, AI is the only science,” as Woody Bledsoe put it.

Others give serious consideration to the potential benefits of AGI, which are also enormous: imagine 1000 smarter-than-Einstein AGIs working on curing cancer.

Still others (including MIRI, and I think Russell) are pretty worried about the social consequences of AGI. Right now, humans steer the future not because we’re the fastest or the strongest but because we’re the smartest. So if we create machines smarter than we are, it’ll be them steering the future rather than us, and they might not steer it where we’d like to go.

What’s your own take on the potential social consequences of AGI?

Paul: I’m less comfortable with this type of question in general, as I don’t have any particular expertise to bring to bear in answering it, but since I have thought some about it, here are my speculations.

Computer applications, including traditional AI systems, are already having massive social consequences. They provide tools that make people more productive, inform and entertain them, and connect them with other people. They have become pervasive because they can be faster, more precise, more reliable, more comprehensive and cheaper than the alternatives. In the process they change us as individuals and as societies, and often eliminate jobs in situations where humans would otherwise have been employed. They also have yielded new kinds of jobs, but the balance between jobs lost versus those gained, whether in number or type, does not seem either terribly stable or predictable. Over the near future, I would be surprised if work on AGI were to have social consequences that are qualitatively different from these.

But what will happen if/when AGI were to yield human-level, or superhuman, general intelligence? One interesting sub-question is whether superhuman general intelligence is even possible. We already have compelling evidence that superhuman specialized AI is possible, but do people fall short of the optimum with respect to general intelligence in such a way that computers could conceivably exceed us to a significant degree? The notion of the singularity depends centrally on an (implicit?) assumption that such headroom does exist, but if you for example accept the tenets of rational analysis from cognitive psychology, people may already be optimally evolved (within certain limits) for the general environment in which they live. Still, even if superhuman general intelligence isn’t possible, AGI could conceivably still enable an endless supply of systems as intelligent as the most intelligent human; or it may just enable human-level artificial general intelligence to be tightly coupled with more specialized, yet still superhuman, computing/AI tools, which could in itself yield a significant advantage.

In such a world would an economic role remain for humans, or would we either be marginalized or need to extend ourselves through genetic manipulation and/or hybridization with computational (AGI or AI or other) systems? My guess is that we would either need to extend ourselves or become marginalized. In tandem though – and beyond just the economic sphere – we will also need to develop a broader notion of both intelligence and of agenthood to appropriately accommodate the greater diversity we will likely see, and of what rights and responsibilities should automatically accrue with different levels or aspects of both. In other words, we will need a broader and more fundamental ethics of agenthood that is not simply based on a few fairly gross distinctions, such as between adults and children or between people and animals or between natural and artificial systems. The notion of Laws of Robotics, and its ilk, assumes that artificial general intelligence is to be kept in subjection, but at some point this has to break down ethically, and probably pragmatically, even if the doubtful premise is granted that there is a way to guarantee that they could perpetually be kept in such a state. If they do meet or exceed us in intelligence and other attributes, it would degrade both them and us to maintain what would effectively be a new form of slavery, and possibly even justify an ultimate reversal of the relationship. I see no real long-term choice but to define, and take, the ethical high ground, even if it opens up the possibility that we are eventually superseded – or blended out of pure existence – in some essential manner.

Luke: As it happens, I took a look at your book On Computing about 4 months ago. I found myself intuitively agreeing with your main thesis:

This book is about the computing sciences, a broad take on computing that can be thought of for now as in analogy to the physical sciences, the life sciences, and the social sciences, although the argument will ultimately be made that this is much more than just an analogy, with computing deserving to be recognized as the equal of these three existing, great scientific domains — and thus the fourth great scientific domain.

Rather than asking you to share a summary of the book’s key points, let me instead ask this: How might the world look different if your book has a major impact among educated people in the next 15 years, as compared to the scenario where it has little impact?

Paul: I wrote the book more to impact how people think about computing – helping them to understand it as a rich, well structured and highly interdisciplinary domain of science (and engineering) – and where its future lies, than necessarily to create a world that looks qualitatively different. I would hope though that academic computing organizations will some day be better organized to reflect, facilitate and leverage this. I would also hope that more students will come into computing excited by its full potential, rather than focusing narrowly on a career in programming. I would further hope that funders will better understand what it means to do basic research in computing, which does not completely align with the more traditional model from the “natural” sciences.

Luke: Thanks, Paul!