Abstract: Ingredients of Super-Intelligent Machines – Marcus Hutter

The dream of creating artificial devices that reach or outperform human intelligence is an old one. Most AI research is bottom-up, extending existing ideas and algorithms beyond their limited domain of applicability. In contrast, the information-theoretic top-down approach (UAI) investigates head-on the core of [general/rational] intelligence: the ability to succeed in a wide range of environments. All other traits are emergent. This approach integrates Ockham’s razor, Epicurus principle, Bayesian learning, algorithmic information theory, universal Turing machines, the agent framework, sequential decision theory, universal search, and Monte Carlo sampling, which are all important subjects in their own right. The approach makes it possible to develop generally intelligent agents that are able to learn and self-adapt to a diverse range of interactive environments without being given any domain knowledge. These achievements give new hope that the grand goal of Artificial General Intelligence is not elusive.

Marcus Hutter

Recent Article in The Conversation related directly to this talk.

To create a super-intelligent machine, start with an equation

Intelligence is a very difficult concept and, until recently, no one has succeeded in giving it a satisfactory formal definition.

Most researchers have given up grappling with the notion of intelligence in full generality, and instead focus on related but more limited concepts – but I argue that mathematically defining intelligence is not only possible, but crucial to understanding and developing super-intelligent machines.

From this, my research group has even successfully developed software that can learn to play Pac-Man from scratch.

Let me explain – but first, we need to define “intelligence”.

So what is intelligence?

I have worked on the question of general rational intelligence for many years. My group has sifted through the psychology, philosophy and artificial intelligence literature and searched for definitions individual researchers and groups came up with.

The characterisations are very diverse, but there seems to be a recurrent theme which we have aggregated and distilled into the following definition:

Shane Legg

Intelligence is an agent’s ability to achieve goals or succeed in a wide range of environments.

You may be surprised or sceptical and ask how this, or any other single sentence, can capture the complexity of intelligence. There are two answers to this question:

  1. Other aspects of intelligence are implicit in this definition: if I want to succeed in a complex world or achieve difficult goals, I need to acquire new knowledge, learn, reason logically and inductively, generalise, recognise patterns, plan, have conversations, survive, and most other traits usually associated with intelligence.
  2. The challenge is to transform this verbal definition consisting of just a couple of words into meaningful equations and analyse them.

This is what I have been working on in the past 15 years. In the words of American mathematician Clifford A. Truesdell:

There is nothing that can be said by mathematical symbols and relations which cannot also be said by words. The converse, however, is false. Much that can be and is said by words cannot be put into equations – because it is nonsense.

Indeed, I actually first developed the equations and later we converted them into English.

Universal artificial intelligence

This scientific field is called universal artificial intelligence, with AIXI being the resulting super-intelligent agent.

The following equation formalises the informal definition of intelligence, namely an agent’s ability to succeed or achieve goals in a wide range of environments:
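
For reference, AIXI is commonly written in one line as follows. The notation is the standard one from the universal AI literature and is an assumption here, since the symbols are not defined in the surrounding text: a, o and r are actions, observations and rewards, k is the current cycle, m is the lifetime (horizon), U is a universal Turing machine, q ranges over programs, and ℓ(q) is the length of q in bits.

```latex
a_k \;:=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
\bigl[\, r_k + \cdots + r_m \,\bigr]
\sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

Read from right to left: the inner sum weights every program consistent with the interaction history by its simplicity (the learning component), and the alternating max and sum operators pick the action sequence with the highest expected total reward (the planning component).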


Explaining every single part of the equation would constitute a whole other article (or book!), but the intuition behind it is as follows: AIXI has a planning component and a learning component.

Imagine a robot walking around in the environment. Initially it has little or no knowledge about the world, but it acquires information about the world through its sensors and constructs an approximate model of how the world works.

It does that using a very powerful general theory of how to learn a model from data, even in arbitrarily complex situations. This theory is rooted in algorithmic information theory, where the basic idea is to search for the simplest model which describes your data.

The model is not perfect but is continuously updated. New observations allow AIXI to improve its world model, which over time gets better and better. This is the learning component.
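
The learning idea can be illustrated with a deliberately tiny stand-in for the real thing. The sketch below is not AIXI's learner: it assumes a hypothetical hypothesis class (binary patterns repeated forever, up to a maximum period) and a hand-picked prior of 2^(-2·length) in place of a weight based on true program length. It only shows the principle that simpler models consistent with the data get more say in the prediction.

```python
from itertools import product


def hypotheses(max_period=4):
    """Yield (pattern, prior) pairs: each binary pattern is assumed to repeat forever.

    The prior 2**(-2 * len(pattern)) is a toy substitute for 2**(-program length):
    shorter patterns (simpler models) receive more weight.
    """
    for n in range(1, max_period + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits), 2.0 ** (-2 * n)


def predict_next(history, max_period=4):
    """Return P(next bit == '1') under the prior-weighted mixture of consistent models."""
    weight = {"0": 0.0, "1": 0.0}
    for pattern, prior in hypotheses(max_period):
        # Unroll the pattern far enough to cover the history plus one more bit.
        generated = pattern * (len(history) // len(pattern) + 2)
        # Deterministic model: it survives only if it reproduces the history exactly.
        if generated.startswith(history):
            weight[generated[len(history)]] += prior
    total = weight["0"] + weight["1"]
    return weight["1"] / total if total else 0.5
```

For the history "0110" two models survive: "011" repeated (predicting 1) and "0110" repeated (predicting 0). The shorter pattern has four times the prior weight, so predict_next("0110") comes out at 0.8. Each new observation can eliminate models, which is the sense in which the world model is continuously updated.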

AIXI now uses this model for approximately predicting the future and bases its decisions on these tentative forecasts. AIXI contemplates possible future behaviour: “If I do this action, followed by that action, etc, this or that will (un)likely happen, which could be good or bad. And if I do this other action sequence, it may be better or worse.”

The “only” thing AIXI has to do is to take from among the contemplated future action sequences the best according to the learnt model, where “good/bad/best” refers to the goal-seeking or succeeding part of the definition: AIXI gets occasional rewards, which could come from a (human) teacher, be built in (such as high/low battery level is good/bad, finding water on Mars is good, tumbling over is bad) or from universal goals such as seeking new knowledge.

The goal of AIXI is to maximise its reward over its lifetime – that’s the planning part.

In summary, every interaction cycle consists of observation, learning, prediction, planning, decision, action and reward, followed by the next cycle.

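If you’re interested in exploring further, AIXI integrates numerous philosophical, computational and statistical principles, among them Ockham’s razor, Epicurus principle, Bayesian learning, algorithmic information theory, universal Turing machines, the agent framework, sequential decision theory, universal search, and Monte Carlo sampling.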

Theory and practice of universal artificial intelligence

The above equation rigorously and uniquely defines a super-intelligent agent that learns to act optimally in arbitrary unknown environments. One can prove amazing properties of this agent – in fact, one can prove that in a certain sense AIXI is the most intelligent system possible.

Note that this is a rather coarse translation and aggregation of the mathematical theorems into words, but that is the essence.

Since AIXI is incomputable, it has to be approximated in practice. In recent years, we have developed various approximations, ranging from provably optimal to practically feasible algorithms.

At the moment we are at a toy stage: the approximation can learn to play Pac-Man, TicTacToe, Kuhn Poker and some other games.

Watch AIXI play Pac-Man.

The point is not that AIXI is able to play these games (they are not hard) – the remarkable fact is that a single agent can autonomously learn to act in this wide variety of environments.

AIXI is given no prior knowledge about these games; it is not even told the rules of the games!

It starts as a blank canvas, and just by interacting with these environments, it figures out what is going on and learns how to behave well. This is the really impressive feature of AIXI and its main difference from most other projects.

Even though IBM Deep Blue plays better chess than human Grand Masters, it was specifically designed to do so and cannot play Jeopardy. Conversely, IBM Watson beats humans in Jeopardy but cannot play chess – not even TicTacToe or Pac-Man.


AIXI is not tailored to any particular application. If you interface it with any problem, it will learn to act well and indeed optimally.

The current approximations are, of course, very limited. For the learning component we use standard file compression algorithms (learning and compression are closely related problems). For the planning component we use standard Monte Carlo (random search) algorithms.

Neither component has any particular built-in domain knowledge (such as the Pac-Man board or TicTacToe rules).
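
To make "Monte Carlo (random search)" planning concrete, here is a minimal sketch using plain uniform random rollouts. It is a simplified stand-in, not the algorithm used in the actual approximations. The `corridor` model, the function names and all parameter values are hypothetical and chosen only for illustration; in a real agent the model would be the learnt one rather than a hand-written function.

```python
import random


def rollout_return(model, state, first_action, actions, horizon, rng):
    """Simulate one possible future: take first_action, then act uniformly at random."""
    total, action = 0.0, first_action
    for _ in range(horizon):
        state, reward = model(state, action)
        total += reward
        action = rng.choice(actions)
    return total


def monte_carlo_plan(model, state, actions, horizon=3, rollouts=200, seed=0):
    """Pick the first action whose random rollouts have the highest mean return."""
    rng = random.Random(seed)

    def mean_return(first_action):
        samples = (rollout_return(model, state, first_action, actions, horizon, rng)
                   for _ in range(rollouts))
        return sum(samples) / rollouts

    return max(actions, key=mean_return)


def corridor(position, step):
    """Hypothetical model: a 5-cell corridor where the reward is the new cell index."""
    new_position = min(4, max(0, position + step))
    return new_position, float(new_position)
```

Calling monte_carlo_plan(corridor, 2, [-1, 1], horizon=2) returns 1 (move right), because every rollout that starts with a step to the right collects more reward than any rollout that starts with a step to the left. Note that the planner itself contains nothing specific to the corridor; all domain knowledge sits in the model it is handed.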

Of course you have to interface AIXI with the game so that it can observe the board or screen and act on it, and you have to reward it for winning TicTacToe or eating a food pellet in Pac-Man … but everything else AIXI figures out by itself.
This article is adapted from a presentation which will be delivered at the Science, Technology and the Future conference, November 30 and December 1 2013.


Abstract: Introduction to the Technological Singularity – Marcus Hutter

The technological singularity refers to a hypothetical scenario in which technological advances virtually explode. The most popular scenario is the creation of super-intelligent algorithms that recursively create ever higher intelligences.
It took many decades for these ideas to spread from science fiction to popular science magazines and finally to attract the attention of professional philosophers and scientists. I will give an introduction to this intriguing potential future.
After explaining what the technological singularity is, the history of this idea, related developments and movements, and different versions and paths toward the singularity, I will address the question of its plausibility and time-frame.
In particular, I will introduce Moore’s exponential law, Solomonoff’s hyperbolic law, Hanson’s acceleration of economic doubling patterns, and Kurzweil’s epochs of evolution.

Obstacles towards a singularity, its negotiability and wide-ranging implications will also be covered.

By Marcus Hutter



Marcus Hutter – Universal Artificial Intelligence

Universal Artificial Intelligence

Last year I did a series of interviews with Marcus Hutter while he was down in Melbourne for the Singularity Summit Australia 2012.

Marcus will also be speaking at the Science, Technology & the Future conference in Melbourne, Australia, on Nov 30 – Dec 1 2013.

Hutter uses Solomonoff’s inductive inference as a mathematical formalization of Occam’s razor. Hutter adds to this formalization the expected value of an action: shorter (Kolmogorov complexity) computable theories have more weight when calculating the expected value of an action across all computable theories which perfectly describe previous observations.

At any time, given the limited observation sequence so far, what is the Bayes-optimal way of selecting the next action? Hutter proved that the answer is to use Solomonoff’s universal prior to predict the probability of each possible future, and execute the first action of the best policy (a policy is any program that will output all the next actions and input all the next perceptions up to the horizon). A policy is the best if, on a weighted average of all the possible futures, it will maximize the predicted reward up to the horizon. He called this universal algorithm AIXI.
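
The decision rule described above can be sketched on a toy scale. The code below is an illustration under strong assumptions, not Hutter's construction: the set of all computable environments is replaced by a small finite list of deterministic ones, the weights are hand-picked stand-ins for the universal prior, and the names `rewards`, `q_value`, `best_action` and `update` are hypothetical. It does keep the two essential steps: discard environments that contradict the history, then take the first action of the plan with the best weighted-average reward up to the horizon.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Percept = Tuple[str, float]                 # (observation, reward)
Env = Callable[[Tuple[str, ...]], Percept]  # action history -> latest percept
Weighted = List[Tuple[Env, float]]          # (environment, weight) pairs


def q_value(envs: Weighted, past: Tuple[str, ...], action: str,
            actions: Sequence[str], horizon: int) -> float:
    """Expected reward up to the horizon for taking `action`, then acting optimally."""
    total = sum(w for _, w in envs)
    if horizon == 0 or total == 0:
        return 0.0
    history = past + (action,)
    # Group environments by the percept they predict for this action.
    groups: Dict[Percept, Weighted] = {}
    for env, w in envs:
        groups.setdefault(env(history), []).append((env, w))
    result = 0.0
    for (_, reward), members in groups.items():
        prob = sum(w for _, w in members) / total
        future = max(q_value(members, history, a, actions, horizon - 1)
                     for a in actions)
        result += prob * (reward + future)
    return result


def best_action(envs: Weighted, past: Tuple[str, ...],
                actions: Sequence[str], horizon: int) -> str:
    """First action of the best plan under the weighted mixture."""
    return max(actions, key=lambda a: q_value(envs, past, a, actions, horizon))


def update(envs: Weighted, history: Tuple[str, ...], percept: Percept) -> Weighted:
    """Keep only the environments that predicted the percept actually received."""
    return [(env, w) for env, w in envs if env(history) == percept]


def rewards(target: str) -> Env:
    """Hypothetical environment paying 1 whenever the latest action equals `target`."""
    def env(history: Tuple[str, ...]) -> Percept:
        return ("hit", 1.0) if history[-1] == target else ("miss", 0.0)
    return env
```

With envs = [(rewards("L"), 0.5), (rewards("R"), 0.25)] and a horizon of 2, best_action(envs, (), ["L", "R"], 2) is "L", since the more heavily weighted environment favours it. If the agent then plays "L" and receives ("miss", 0.0), update leaves only the second environment and the best next action becomes "R".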

Below is the transcription of the part of the interview series where Marcus talks about intelligence, Bounded Rationality, and AIXI.

What is Intelligence?

Intelligence is a very difficult concept (maybe that’s the reason why many people try to avoid diluting it or consider more narrow alternatives). I’ve worked on this question for many many years now. We went through the literature (psychology literature, philosophy literature, AI literature) to see what definitions individual researchers, and also groups, came up with; they are very diverse. But there seems to be one recurrent theme, and if you want to put it in one sentence, then you could define intelligence as:
“an agent’s ability to achieve goals in a wide range of environments”, or to succeed in a wide range of environments.
Now look at this sentence and ask, “wow, how can this single sentence capture the complexity of intelligence?” There are two answers to this question. First: many aspects of intelligence are emergent properties of intelligence, like being able to learn – if I want to succeed or solve a problem I need to acquire new knowledge, so learning is an emergent phenomenon of this definition.
And the second answer is: this is just a sentence that consists of a few words. What you really have to do, and that’s the hard part, is to transform it into meaningful equations and then study these equations. And that’s what I have done in the last 12 years.

Bounded Rationality

It is an interesting question whether resource bounds should be included in any definition of intelligence or not, and the natural answer is: of course they should. Well, there are several problems. The first one is that nobody ever came up with a reasonable theory of bounded rationality (people have tried), so it seems to be very hard. And this is not specific to AI or intelligence; it seems to be symptomatic in science. If you look at several fields (e.g. physics, the crown discipline), theories have been developed: Newton’s mechanics, General Relativity, Quantum Field Theory, the Standard Model of Particle Physics. They are more and more precise, but they get less and less computable, and having a computable theory is not a principle in developing these theories. Of course at some point you have to test these theories and you want to do something with them, and then you need a computable theory. This is a very difficult issue (you have to approximate them or do something about it). But having computational resources built into the fundamental theories is not how things work, at least in physics, and also if you look at other disciplines.
You design theories so that they describe your phenomenon as well as possible, and the computational aspect is secondary. Of course if it is incomputable and you can’t do anything with it, you have to come up with another theory, but this always comes second. Only in computer science (and this comes naturally) do computer scientists try to think about how they can design an efficient algorithm to solve their problem, and since AI is traditionally sitting in the computer science department, the mainstream thought is “how can I build a resource-bounded artificially intelligent system”. And I agree that ultimately this is what we want. But the problem is so hard that we (or a large fraction of the scientists) should take this approach: model the problem first, define the problem first, and once we are confident that we have solved this problem, then go to the second phase and try to approximate the theory, try to make a computational theory out of it. And then there are many many possibilities: you could still try to develop a resource-bounded theory of intelligence, which will be very very hard if you want to have it principled, or you do some heuristics… or .. or .. or… many options. Or the short answer: maybe I am not smart enough to come up with a resource-bounded theory of intelligence, therefore I have only developed one without resource constraints (that would be the short answer).


AIXI

OK, so now we have this informal definition that intelligence is an agent’s ability to succeed or achieve goals in a wide range of environments. The point is you can formalize this theory, and we have done that and it is called AIXI. Universal AI is the general field, and AIXI is the particular agent which acts optimally in this sense.
So that works as follows: it has a planning component, and it has a learning component. What the learning component does is: think about a robot walking around in the environment, and at the beginning it has little or no knowledge about the world, so what it has to do is to acquire data/knowledge of the world and then build its own model of the world, how the world works. And it does that using very powerful general theories on how to learn a model from data, from very complex scenarios. This theory is rooted in Kolmogorov complexity, algorithmic information theory – the basic idea is you look for the simplest model which describes your data sufficiently well. And this agent or robot has to do this continuously: it gets new data and updates its model. So now the agent has this model, that is the learning part. Now it can use this model for predicting the future… And then it uses these predictions in order to make decisions, so the agent now thinks: if I do this action, and this action… this will now happen and this is good or bad. I’ll come to the good or bad part soon. And if I do this other action it is maybe better or worse. And then the “only” thing that the agent has to do is think about all the potential future action sequences and take the one which is best according to the model which the agent has learned, which is not perfect but which over time gets better and better. Finally you have to qualify what “best” means, and that’s the utility part or succeeding: the agent gets occasional reward from the teacher, who could be just a human, or the reward could be built in (for instance if the battery level is low it is bad, if it’s high it is good, if it finds a rock on Mars it is good, if it falls down a cliff it’s bad), so we have these rewards, and the goal of the agent is to maximize its reward over its lifetime. That’s the planning part.
So first comes the learning part, then the prediction part, then the planning part, and then it gets to actions and the cycle continues.
So this theory, the AIXI agent, it’s mathematically rigorously well defined. It is essentially unique, and you can prove amazing properties of this agent – in a certain sense you can prove that it’s the most intelligent system possible. I am translating the mathematical theorems into words, which is a little tricky, but that’s the essence. The downside is that it’s incomputable. You asked before about resource-bounded intelligence: AIXI needs infinite computational resources, and in order to do something with it you need to approximate it, and we have done this in recent years also. At the moment it is at the toy stage, so it can play Pac-Man, Tic-Tac-Toe, some simple form of Poker, and some other games… The point is not that it is able to play Pac-Man or Tic-Tac-Toe (they are not hard), the point is that the agent has no knowledge about these games, it starts really blank, and just by interacting with the environment – it does not even know the rules of the game – by interacting with this poker environment or Pac-Man environment it figures out what is going on, and learns how to behave well.
The cool thing really is, and this is the difference to many other projects (there is Deep Blue, which plays chess better than the Grand Masters, but it was a system specifically designed to play chess, and it can’t play Go), that this system is not tailored to any particular application. If you interface it with any problem (in theory it can be any problem: chess, solving a scientific problem) it will learn to do that very well and indeed optimally. The approximations we have at the moment are, of course, very limited, but if you look at these approximations, they use standard compressors for the model learning part. There is nothing about Pac-Man in these data compressors: they are standard data compressors. For the planning part we use standard Monte Carlo (random search), which has nothing to do with a particular problem or game, and this approximation is already able to learn these various games by itself. There is no Pac-Man knowledge built in. The only thing (of course) you have to do is to interface the game with this agent. For Pac-Man you have these pixels in a 15×15 grid, and each square is a wall, is free, is food or there is a ghost, and this piece of information you give this agent. Then it gets negative reward if it gets eaten by a ghost, positive reward if it eats a pellet, and that’s it. The goal of the agent is to maximize reward, and everything else it figures out by itself.

Video Interviews

For more video interviews please Subscribe to Adam Ford’s YouTube Channel

YouTube Playlist of Interview Series with Marcus Hutter:

At Singularity Summit Australia 2012 – “Can Intelligence Explode?”

Speaker: Marcus Hutter

Marcus Hutter (born 1967) is a German computer scientist and professor at the Australian National University. Hutter was born and educated in Munich, where he studied physics and computer science at the Technical University of Munich. In 2000 he joined Jürgen Schmidhuber’s group at the Swiss Artificial Intelligence lab IDSIA, where he developed the first mathematical theory of optimal Universal Artificial Intelligence, based on Kolmogorov complexity and Ray Solomonoff’s theory of universal inductive inference. In 2006 he also accepted a professorship at the Australian National University in Canberra.

Hutter’s notion of universal AI describes the optimal strategy of an agent that wants to maximize its future expected reward in some unknown dynamic environment, up to some fixed future horizon. This is the general reinforcement learning problem. Solomonoff/Hutter’s only assumption is that the reactions of the environment in response to the agent’s actions follow some unknown but computable probability distribution.
