Friendly AI: a vector for human preference

January 24, 2010

This post is part of The Friendly AI problem sequence.

Followup to: The project of Friendly AI, Preference is general and precise, Preference is resilient and thorough.

What we need is not superintelligence, but supermorality, which includes superintelligence as a special case.

Why We Need Friendly AI

Sufficiently advanced AIs reliably implement a specific unchanging preference, eventually affecting any reachable aspect of the world if not stopped (which would be hard to impossible). This observation prohibits quite a few otherwise plausible assumptions about how the intelligent agents could behave, with important consequences for the design of Friendly AI (FAI), an autonomous decision-making machine that is built so that it can be trusted with moral evaluation of its decisions.

Since preference is thorough and general, FAIs can’t be limited to a narrow domain, either in the aspects of the world that they are supposed to act upon (make decisions about), or in the aspects of human preference that they are supposed to apply in making moral estimation of the decisions. Any autonomous AI intended for a narrow domain will just fill in the blanks and become an agent in the general domain, but with parts of preference not determined by our own. This adds difficulty to the project of FAI: the scope of the problem can’t be restricted, and partial solutions don’t work as intended at all.

Since preference is precise and resilient, FAI’s preference has to be not only comprehensive in its scope, but also specified correctly with precision, and on the first try. Small differences in preference escalate to overwhelming differences in the outcome (the resulting state of the world shaped by an agent acting for that preference), spread out through all of its aspects, with no simple regularity to account for the change. Once implemented, the genie can’t be easily put back into the bottle or reformed: it’ll try to protect its preference, resisting any change or threat of extinction.

On the other hand, once the problem is solved for an implementation of human preference in a single FAI machine, the rest takes care of itself. Preservation of preference is a basic drive, so if we can trust this particular FAI agent with moral decisions (even if it’s computationally relatively limited and has a long way to go in improving its ability to prepare complex plans), we can also trust the next-generation agents it constructs to make decisions for more and more aspects of the world. The dangers of autonomous AIs turn into virtues once their preference is the right one. The FAI need not immediately take over everything, need not be a superintelligence from the start and for however long it takes to get there (assuming that all that can be done is being done, so that competing factors won’t likely take over); the mere presence of competitive autonomous agents reliably holding our preference ensures a good chance for our preference having a significant say in shaping the future.

Intelligent agents have two thresholds in ability that are important in the long run: autonomy and reflective consistency. Autonomy is the point at which an intelligent agent has a prospect of open-ended development, with a chance to significantly influence the whole world (by building/becoming a reflectively consistent agent). Humanity is autonomous in this sense, as probably are small groups of smart humans if given a much longer lifespan (although cultish attractors may stall progress indefinitely). Reflective consistency is the ability to preserve one’s preference, bringing the specific preference to the future without creating different-preference free-running agents. The principal defects of merely autonomous agents are uncontrollable preference drift and inability to effectively prevent reflectively consistent agents of different preference from taking over the future; only when reflective consistency is achieved does the drift stop and the risk of preference extinction get partially alleviated.

As with advanced AI, so with humanity: there is danger in a lack of reflective consistency. An autonomous agent, while not as dangerous as a reflectively consistent agent (though possibly still lethal), is a reflectively consistent agent with alien preference waiting to happen. Most autonomous agents would seek to construct a reflectively consistent agent with the same preference, their own kind of FAI. A given autonomous agent can (1) drift from its original preference before becoming reflectively consistent, so that the end result is different, (2) construct another different-preference autonomous non-reflective agent, which could eventually lead to a different-preference reflective agent, (3) fail at the construction of its FAI, creating a de novo reflectively consistent agent of the wrong preference; or, if all goes well, (4) succeed at building/becoming a reflectively consistent agent of the same preference. Humanity faces these risks, and any non-reflective autonomous AI that we may develop in the future would add to them, even if this non-reflective AI shares our preference exactly at the time of construction. A proper Friendly AI has to be reflectively consistent from the start.

The very motivation behind the Friendly AI problem turns around in light of the problem of preservation of human preference and the implications of its successful resolution. The original motivation for FAI, as I stated it, was to build a tool for augmenting the moral evaluation side of human decision-making, a kind of calculator for right and wrong where we already have calculators for could and couldn’t, allowing us to find better solutions for harder problems. The updated motivation is to construct a vehicle for human preference, a means of its propagation and application in the future, with humanity itself in its present form inadequate for this role. (This isn’t a decision to replace people with FAIs; seeing it this way would be a category error. I’ll return to this point in later posts.)


Preference is resilient and thorough

January 19, 2010

This post is part of The Friendly AI problem sequence.

Followup to: Preference is general and precise.

Preference defines the way in which an intelligent agent seeks to influence the world. In any given situation, an agent is only capable of forming crude plans (compared to the precision of the goal), focusing on a few select aspects of the outcome. This makes it important for this agent to instantiate other agents (equivalently, aspects of itself) working for the same preference in many more situations, so that they can convert knowledge about the world available to them into actions optimized for this preference.

As an apparent subgoal, an agent would seek to ensure that it’s possible to set up preference-optimization as widely as possible. It won’t be able to do that if it’s completely destroyed, so self-preservation (or, rather, preference-preservation) is a natural drive. (See also Steve Omohundro’s “The Basic AI Drives”.) Another threat is the appearance of an optimizer working for a different preference: the two would have a conflict of interest, and would have to settle for a lesser extent of preference-optimization than otherwise. The appearance of an agent with another preference has a much larger impact than any non-fatal unintelligent circumstance: most events have only local impact, but an intelligent agent will work on influencing the whole world. Only total extermination can contain the impact of a competing agent. This is the danger of generality.

However a different-preference optimizer originates, it becomes serious trouble. An obvious story is for the original agent to propagate itself imperfectly, so that its future subroutines start working for a different preference. Even a small error can lead to a serious disagreement about the details of the preferred outcome. Thus, an intelligent agent would not only seek survival (capacity to influence the world) and propagation (capacity to have detailed influence on every aspect of the world), but perhaps with the same urgency it’d seek to precisely preserve its preference, including avoiding cumulative change over time and over changes of computational substrate.

A defining characteristic of an intelligent agent is autonomy of its decision-making process. The above remarks present this autonomy in a different light: not only won’t an agent require direct assistance in order to implement complex goals, but it’ll also seek to perfectly resist any attempts to influence its preference. If the AI is not extremely intelligent or is constrained by the situation (e.g. it’s running on an isolated computer that can be shut down externally), its behavior can be influenced by its environment: some actions can be rewarded, others discouraged. When it is preferable to change its behavior according to expectations of those in control of the environment, the AI would do so, but this is merely appearance: the underlying preference remains the same, and this preference is the only thing that matters for the eventual global effect of the AI on the world.

A faulty analogy with reinforcement learning suggests that it should be possible to train AI into following the necessary preference. The method works for the reinforcement learning algorithms, but not for intelligent agents advanced enough to understand the concept of unchanging preference and able to preserve it. It’d take a very precise, specifically-designed preference for an intelligent agent to “take the lessons seriously”, so that the whole of its future effect on the world gets determined by these “lessons”. But even then, reinforcement learning can’t work for human preference, because it’s too big and not even known to ourselves, and so we can’t construct a good enough training set.

The drive for preservation of preference, together with the drive for survival (that can use the Internet as an obvious lower-bound plan for where to hide), makes a preference instantiated in an intelligent agent very resilient, almost impossible to influence or exterminate. Given generality and precision of preference, and thus global and thorough nature of AI’s effect on the world, letting such an AI out is very undesirable (on the cosmic scale!), even if the new AI is not significantly more powerful than ourselves, and doesn’t pose a direct extinction risk.

I leave the idea of intelligence explosion outside the scope of this post. Here, I emphasize the aspects of AI’s behavior that are not dependent on the hypothesis of rapid growth of AI’s power, but make the construction of autonomously operating AI almost as disastrous as if it could erase our civilization directly. There are mitigating circumstances in this scenario, as cooperation (trade) would allow the world’s optimization to be doled out among the agents of different preference, but the world makes a pretty big pie, and losing even a small part of it would be a tremendous loss.


Preference is general and precise

January 10, 2010

This post is part of The Friendly AI problem sequence.

Followup to: The project of Friendly AI.

Any Future not shaped by a goal system with detailed reliable inheritance from human morals and metamorals, will contain almost nothing of worth.

Value is Fragile

The problem of Friendly AI is inextricably linked to the concept of human values, or preference. But what is “preference”, and what kind of statement do human values make?

Typically, preference (order) allows one to make choices between alternatives: when faced with two possible actions, each leading to its consequence, preference tells which of the two consequences, and hence which of the two actions, is preferable. For any such question, a given preference gives an answer. Conversely, a given set of answers (satisfying necessary conditions) could be encoded as a preference.
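To make this reading concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than part of the argument: the toy actions, the `CONSEQUENCES` and `RANK` tables, and the helper names `choose` and `preference_from_answers` are all hypothetical. It shows a preference being used to pick between two actions through their consequences, and a table of answers being encoded back into a preference.

```python
def choose(action_a, action_b, consequence, prefers):
    """Pick the action whose consequence the preference ranks at least as high."""
    outcome_a = consequence(action_a)
    outcome_b = consequence(action_b)
    return action_a if prefers(outcome_a, outcome_b) else action_b


def preference_from_answers(rank):
    """Encode a table of answers (a rank per outcome) as a preference relation."""
    def prefers(x, y):
        # x is at least as preferable as y when its rank is not lower.
        return rank[x] >= rank[y]
    return prefers


# Hypothetical toy world: two actions, each leading to one consequence.
CONSEQUENCES = {
    "water the plant": "plant alive",
    "ignore the plant": "plant dead",
}
RANK = {"plant alive": 1, "plant dead": 0}

prefers = preference_from_answers(RANK)
best = choose("water the plant", "ignore the plant", CONSEQUENCES.get, prefers)
```

The rank table stands in for the “necessary conditions” mentioned above: any assignment of numbers to outcomes yields a consistent (transitive, total) set of answers, which is why it can be turned into a preference.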

In human experience, the ability to figure out which choice is preferable is very limited, as is the ability to understand more complex consequences. This makes it hard to feel just to what extent a specific (relatively simple) preference elaborates in detail what kind of world is preferable. But if we have preference as an explicit mathematical object, it becomes possible to ask very detailed questions.

For simplicity, let’s ignore the practical limitations and let preference be a total order on atom-by-atom states of the whole world. What can be said about the most preferable world? It is a huge data structure determined by a given mathematical object (and by that object only); for example, it could be the output of a given program. From this point of view, it’s easy to see that the output need not be simple (with regular structure) even for simple preferences, just as the Mandelbrot set and the decimal expansion of pi hold a lot of detail even though they are defined by simple formulas. Preference defines its preferable outcomes with precision, and speaks of every single detail.

A tiny modification of preference may result in a change to every detail of the most preferable outcome. There’s no telling how low the most preferable outcome for one preference appears in the preference order of another preference: most of the details chosen for one preference will be arbitrary according to another. Similar programs can produce completely dissimilar outputs.
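The last claim, that similar programs can produce completely dissimilar outputs, is easy to reproduce. The sketch below is only an analogy and not a model of preference: it assumes the logistic map (a standard textbook example of a chaotic system, not something used in this post) as the “program”, and the parameter values and names (`trajectory`, `A`, `B`, `GAPS`) are arbitrary illustrative choices.

```python
def trajectory(r, x0=0.5, steps=200):
    """Iterate the logistic map x -> r * x * (1 - x) and return every visited value."""
    values = []
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


# Two "programs" that differ only in the ninth decimal place of one constant.
A = trajectory(3.9)
B = trajectory(3.9 + 1e-9)

# Step-by-step distance between the two outputs.
GAPS = [abs(a - b) for a, b in zip(A, B)]
```

Early entries of `GAPS` are tiny, while later ones become comparable to the values themselves: after enough steps the two outputs share no recognizable detail, even though the two definitions are almost identical.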

Human values hold quite a lot of detail, accumulated as psychological evolutionary adaptations. These adaptations don’t add up to any single simple principle: even though one could say “fitness”, this goal already refers to a specific detail-rich class of environments, and adaptations are only very crude heuristics towards such a goal. What people value comes from the way they actually happen to be built, not from the explanation of how they came to be built this way.

If misreading or losing even a small detail of the definition of preference results in a completely different preference order (a completely different notion of a highly preferable state of the world), it becomes particularly important to capture our preference precisely in order to be able to trust an autonomous moral machine holding that preference. Furthermore, it seems that even people can’t be trusted to preserve preference in sufficient detail for the future created by our descendants to be of any worth from our point of view. Any modification of preference performed with merely human-level analysis of consequences, or “naturally”, that is without any analysis at all (for example, what happens with the human brain over time), leads to a guaranteed failure. This generalizes the notion of existential risk to include the risk of losing the value of the future for reasons other than extinction.


The project of Friendly AI

September 19, 2009

This post is part of The Friendly AI problem sequence.

What is the problem and project of Friendly AI? This issue is rather confused, so I’ll outline the motivation and break the problem into its two main components.

Much of the power of technology manifests as predictable tools we create. Predictability comes in many forms: a bowl is expected not to leak liquids, an excavator is expected to be useful for digging holes, a written note is expected to bring back forgotten memories. Tools can be trusted to deliver their predictable effects, and so can be safely designed to wield great power. An A-bomb, based on its design, is trusted not to blow up spontaneously, and software in the banks is trusted to correctly keep track of everyone’s accounts.

Humans can inventively reason about a lot of things, but our ability to correctly anticipate the effects of detailed plans is pretty limited. When designing a bridge, it is not enough to pick its shape and materials and estimate intuitively whether it’ll stand a load: this mode of operation will yield an unpredictable result, one that can’t be trusted. To get better at designing predictable tools, we invent more tools targeted at helping in this task.

Computers can be used to implement huge calculations, if the problem statement can be entered explicitly. For example, you can program the material and mechanical laws in an engineering application, enter a building plan, and have the computer predict what’s going to happen to it, or what parameters should be used in the construction so that the outcome is as required. That’s the power outside the human mind, directed by the correct laws, and targeted at the formally specified problem.

The process of decision-making has two aspects: prediction (factual estimation) and valuation (moral estimation). To be selected, a plan has to be both feasible and lead to good consequences. It is possible to implement a nuclear winter, but people don’t want that to happen. So far, people have been fairly successful at designing powerful mental tools for prediction (think physics, not futurism), but outside narrow domains, the resulting plans always have to be “manually” morally evaluated by people before the decisions can proceed. We can design powerful tools to augment only half of the decision process; the other half remains hopelessly in the domain of the human brain.

Let’s say we built an AI, a tool capable of planning in any domain, that is also capable of estimating desirability of plans, and so can make decisions autonomously. If this AI is considered independently of its goals, it’s like an engineering application with a random building plan: it can powerfully produce a solution, but it’s not a solution to the problem anyone needs solving. If you can specify a problem, but don’t have the AI, nothing happens. If you have the AI but give it a random goal, it solves a random problem, with all its power of precision and autonomy. The AI algorithm is essential when you do have an ability to specify the problem, but it’s a separate issue from specifying the problem statement that comes from human nature.
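The separation between the planning algorithm and the problem statement can be sketched in a few lines of Python. This is a toy under stated assumptions: the “AI” is reduced to an exhaustive search, the plans are hypothetical beam thicknesses, and both desirability functions (`needed` and `arbitrary`) are made up for illustration.

```python
def best_plan(plans, desirability):
    """A goal-agnostic optimizer: return the plan that scores highest."""
    return max(plans, key=desirability)


# Hypothetical design space: beam thickness in centimetres.
THICKNESSES = range(11)


def needed(thickness):
    """The problem someone actually needs solved: get as close to 7 cm as possible."""
    return -abs(thickness - 7)


def arbitrary(thickness):
    """An arbitrary goal, standing in for a 'random building plan'."""
    return (thickness * 7) % 11


solution = best_plan(THICKNESSES, needed)
random_solution = best_plan(THICKNESSES, arbitrary)
```

`best_plan` works equally hard in both calls. Which problem gets solved is determined entirely by the second argument, and that argument is the part that has to come from human nature.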

I tentatively identify Friendly AI as an autonomous decision-making tool that is powerful at what it does and can be trusted not only with factual estimation, but also with moral estimation. You don’t have to manually check what it deems desirable, just as you don’t have to manually check how a calculator arrived at each specific result, to be confident that the result is correct.

What is the difficulty then? Why can’t we program human values in a computer, just like a building plan, to be computed in higher resolution? The answer is that we can’t explicitly see our values. We can use them, with varying levels of success, but we can’t write them down, cast the whole of human preference in explicit form. Any direct attempt to do so will end up as a crude caricature that breaks in situations not at all difficult to find. A moral machine would need to work with human values, but human programmers can’t enter them, and neither can they do in their heads what a machine would be able to do given a formal problem statement, because this problem statement is too big for humans to handle. It could exist in a computer explicitly, but it can’t be entered there by programmers.

So, here is the barrier: the problem statement (human values) resides in the structure of the human mind, but the strong power of inference doesn’t, while the strong power of inference (potentially) exists in computers outside human minds, where the problem statement can’t be manually transmitted. Creating Friendly AI requires these components to meet in the same system, but it can’t be done the way other kinds of programming are done.

On the surface, the problem of Friendly AI seems to be about engineering an algorithm capable of powerful planning that is guaranteed by design to follow a clearly defined goal system. But the deeper problem seems to be extracting that goal system from humanity, seeing values in the messy detail of a given physical system.

Technically understanding a more or less arbitrary physical artifact as an instance of a goal-directed algorithm is a problem much more general than constructing a specific algorithm. To see the human values in detail, the basic paradigm of what values are, as a property of physical processes, is necessary. Here we seem to be at a pre-Newtonian stage: there is no “mass” or “force” in the description of preference (but there is a lot of existing science to throw at this problem).

The project of understanding arbitrary physical systems as formal goal-directed agents is (1) more general than designing a specific goal-directed AI, so that the solution to the latter may not even meaningfully contribute; (2) a necessary component of any successful FAI design; (3) safer than designing an AI, which, given arbitrary goals, is a very dangerous thing to have around; and even (4) may answer some fundamental conceptual questions in AI design, allowing the project to be completed.


Leveling up

August 10, 2009

That scream of horror and embarrassment is the sound that rationalists make when they level up.

Eliezer Yudkowsky

I haven’t updated this blog since February, and the reason is that I understood a couple of things that prompted a change in perspective on how to do and communicate research, as well as in the direction of research.

What I had been doing before was top-down formalization of intuition, something akin to philosophy: start with a vague idea about a phenomenon, and then iteratively clarify it, step by step rendering parts of the idea more explicit, in turn using the clearer understanding to train intuition, and so on. Throughout the process, there are almost no stand-alone technically understood components; everything is only held together in the mind. The intermediate product of this process is a set of mental tools that allow a better understanding of the phenomenon under study.

There are a number of related difficulties with this approach. As most of the concepts are fuzzy, there is a temptation to neglect epistemic hygiene. This shows in attempts to cover inferential distances with explanations that misuse technical terms, saying only something similar to the truth, but not really true, as it’s easier this way. This plagued the first sequence I ran on the blog, in June-July 2008, with the term “probability”. What it really takes to describe a complex idea that you don’t yet understand technically is probably a book-length description that won’t be an easy read either (with much of philosophy being the primary example). More importantly, it’s easier to engage in sloppy thinking, creating the illusion of progress while going in circles, and to start chasing lost purposes, solving problems that don’t need to be solved.

While research is informed by both facts and tools from the literature, in the “fuzzy” mode there is really very little that generalizes to something helpful on a not-directly-related problem. The most helpful thing is the methodology, a set of tricks for managing concepts as they develop, separating meaningful ones from the trivial, grounding in the existing body of science, and so on.

What I discovered when I started to look into the mathematics on topics related to intelligence (machine learning, graphical models, decision theory, game theory, formal semantics, logic, model checking, etc.) is that the intuitions forming in the mind once you understand these topics are vastly superior to those I was able to gather before, both from reading “fuzzy-grade” research (descriptions of “ad-hoc” AI approaches, neural nets, cognitive science, neuroscience), and from developing my own structures. At that point, I had been down this “fuzzy” path for about a year and a half, starting from no knowledge in the related fields; the material I described on the blog is what I constructed in the first half a year, a year before writing it up, since reduced to a kind of recurrent neural network, with experimental implementations and so on. It took only a couple of months to comprehend the hands-down superiority of math, even for the ideas that aren’t reduced to math yet.

And then I saw that the problem I was solving doesn’t really develop in the direction of Friendly AI (FAI), that all my previous activity was mostly a lost purpose, apart from educational value. I was acting from a vague idea that understanding AGI is a step in the direction of understanding FAI, since FAI is a kind of AGI. This idea turned out to be misguided for a number of reasons that should become clear from the following posts.

I leave the existing posts be, despite not really approving of them, and will resume blogging here.


Learning factored representation

February 11, 2009
Warning! This post has been marked as wrong or obsolete, and may not reflect my current views.

This post is part of Locally structured fluid representation sequence.

Followup to: Balancing context with conceptual slippages, Summarizing structure in new labels, Independence of patterns.

Repeated contexts and context transitions become compressed over use, losing variability in their compressed form. Any distinguishing characteristics of particular instances of such repeated contexts can be extracted as separate properties. Commonalities get compressed in a central pattern, and variations become properties of that central pattern. For example, typical objects, such as cups, have certain common characteristics, but properties of a particular cup can be expressed as additional patterns showing where it differs from typicality.

The central pattern of an object extracts mutual information from features describing the object, and as a result the remaining patterns of object properties become more independent from each other, given the object pattern. A change in one property of an object doesn’t usually call for changes in other properties, and if it does, the dependent properties should probably again be summarized by a new single property. Individual slippages of object properties don’t affect most of the scene.

The resulting representation shouldn’t be strictly hierarchical, as limiting the representation to a hierarchy significantly reduces its expressive power. The center of a natural category can consist of a collection of interfering patterns, encoding the object’s structure and instantiated depending on context, whereas rarer characteristics are much more independent, given any compatible state of the object’s center.

Learning factored representations of transformations of the scene may result in formation of procedural patterns, with the center of transformation becoming procedure itself, and peripheral variations in transformation’s properties becoming arguments of the procedure.


Independence of patterns

February 7, 2009
Warning! This post has been marked as wrong or obsolete, and may not reflect my current views.

This post is part of Locally structured fluid representation sequence.

Followup to: Structural representation of uncertainty, Interference of patterns

In a given scene, two patterns are called independent if changes in one of them don’t lead to changes in the other, that is, if they don’t interfere directly or through a short enough sequence of changes in the scene. Independence is conditional on context, so two patterns can be independent in one scene, but not in a different scene, and a change to a third pattern can make them interfere.

Interference makes the scene a whole, connects its parts, and translates the presence of additional patterns into influence on the behavior of existing patterns. Independence allows modular composition of elements of the scene, “keeps everything from happening all at once”. It could also be fundamental to scalable implementation, since it makes interactions between patterns local on each given step.

Groups of patterns, where patterns in each group are mostly independent of patterns in other groups, can function in parallel, so that the inference processes within each group are independent of the processes in other groups. Such configurations could be used to compute answers to subproblems, to divide a bigger problem into a collection of smaller ones (using procedural patterns), or to model a bigger system by a collection of models of its parts.
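One simplified way to picture such groups is as connected components of an “interference graph”. The Python sketch below makes strong assumptions that the post does not: interference is treated as a fixed yes/no relation between pairs of patterns (ignoring context dependence and the “mostly” in “mostly independent”), and the pattern names and the function `independent_groups` are hypothetical.

```python
from collections import defaultdict


def independent_groups(patterns, interferes):
    """Split patterns into groups such that no pattern interferes across groups.

    `interferes` is an iterable of (pattern, pattern) pairs that interfere directly.
    """
    neighbours = defaultdict(set)
    for a, b in interferes:
        neighbours[a].add(b)
        neighbours[b].add(a)

    groups = []
    seen = set()
    for start in patterns:
        if start in seen:
            continue
        # Depth-first walk over everything reachable through interference.
        group = set()
        stack = [start]
        while stack:
            pattern = stack.pop()
            if pattern in seen:
                continue
            seen.add(pattern)
            group.add(pattern)
            stack.extend(neighbours[pattern] - seen)
        groups.append(group)
    return groups


# Hypothetical scene.
PATTERNS = ["cup", "handle", "table", "window", "curtain"]
INTERFERES = [("cup", "handle"), ("cup", "table"), ("window", "curtain")]
GROUPS = independent_groups(PATTERNS, INTERFERES)
```

Each resulting group could be processed on its own, since by construction nothing inside it touches anything outside it. In the model described here the grouping would have to be recomputed as the context changes.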


Patterns as contextually invoked procedures

January 3, 2009
Warning! This post has been marked as wrong or obsolete, and may not reflect my current views.

This post is part of Locally structured fluid representation sequence.

Followup to: Continuous balancing of changing structure.

Patterns present in the memory direct the change of current context. They are instantiated depending on state of the scene, and resulting state of the scene depends on their structure. Thus, apart from declarative interpretation, patterns can be considered as contextually invoked procedures.

In simpler cases, declarative patterns build new structures in the current scene, adding content, and displacing other content. The resulting structure can be mostly predetermined, perhaps as reconstruction of past episodes, verbatim or distilled into semantic memory.

In other cases, procedural patterns implement more elaborate procedures that recombine existing patterns in a scene, even if those existing patterns never occurred in the same combination before. The operation of procedural patterns can be thought of as based on controlled conceptual slippages.

The structure of the scene is supported on a network of contextual interfaces between patterns. When interfaces change, so does the structure. When previously unconnected patterns somehow acquire compatible interfaces, these patterns become connected, which in turn leads to interference between them, and a wave of integration of their structures. The scene gets rebalanced around a new connection.

To implement a nontrivial procedure, a pattern is invoked for a combination of cues on existing patterns. Its application attaches new cues to these patterns, which act as interfaces connecting them in a new way and starting a recombination process.

The operation of procedural patterns is slightly analogous to the way complex biochemical processes work in a cell, with molecules being produced step by step, new cues appearing at each step, allowing new reactions to proceed at various active sites, protein folding acting as structure-changing rebalancing, enzymes implementing global context, and structures like ribosomes reliably transforming elements of representation. The analogy is rather weak, but it shows some of the elements of the process. It applies more to the declarative patterns than to procedural ones, and doesn’t include learning.

When patterns are interpreted as contextually executed procedures, the balancing process can be interpreted as a process of parallel procedure execution, where multiple procedures are running in their local contexts, interacting with each other through that context. Each procedure has activation conditions, and each procedure has its own structure that determines its effect when applied at the call site. External input introduces the change in the context, but gets processed the same way. A balanced scene that doesn’t change corresponds to a stable point, with procedures running in a loop. Declarative patterns are simple procedures, and procedural patterns are more general. Declarative patterns are “nouns”, and procedural patterns are “verbs”. Some patterns contain just a few steps, and some initiate complex processes, transforming the scene along one of the many possible paths, chosen based on context along the way, branching out into multiple parallel procedures.
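A very reduced sketch of this interpretation, in Python, treats each pattern as a pair of an activation condition and an effect on a shared scene, and repeats the round of applications until the scene stops changing. This is closer to a classical production system than to the representation described in this sequence, and everything in it is an assumption made for illustration: the scene is a flat set of labels, and the names `balance`, `PROCEDURES`, and the example labels are hypothetical.

```python
def balance(scene, procedures, max_steps=100):
    """Apply every procedure whose condition holds, until the scene is stable."""
    scene = set(scene)
    for _ in range(max_steps):
        new_scene = set(scene)
        for condition, effect in procedures:
            # Conditions are checked against the scene at the start of the round,
            # so all procedures in one round act "in parallel".
            if condition(scene):
                new_scene = effect(new_scene)
        if new_scene == scene:
            return scene  # stable point: procedures still fire, nothing changes
        scene = new_scene
    return scene


PROCEDURES = [
    # A "verb": pouring turns an empty cup into a full one.
    (lambda s: {"cup", "empty", "kettle"} <= s,
     lambda s: (s - {"empty"}) | {"full"}),
    # A "noun"-like declarative pattern: a full cup brings along a property.
    (lambda s: "full" in s,
     lambda s: s | {"drinkable"}),
]

FINAL = balance({"cup", "empty", "kettle"}, PROCEDURES)
```

In the final round the second procedure still fires but leaves the scene unchanged, which is the toy counterpart of a balanced scene with procedures running in a loop. External input would simply be a change to the set passed in.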


Episodic and semantic memory

December 21, 2008
Warning! This post has been marked as wrong or obsolete, and may not reflect my current views.

This post is part of Locally structured fluid representation sequence.

Followup to: Fragments of structure, Summarizing structure in new labels.

In this model, structure restoration can be interpreted as remembering, and context balancing as focusing attention on contextually relevant facts and memories. Depending on the character of the restored structure and the restoration process, some memories can be considered episodic or semantic. Episodic memory restores a significant part of a single past scene, making it possible to situate the restored structure relative to known locations and times. Semantic memory plays out a semantic rule of thumb, filling in a property that can be discerned from the context, and even though this property can have complex structure, it isn’t associated with a particular past scene.

Before an episode is first recalled, the pattern of that episode is unique in the memory. In theory, this makes it possible to recall every single bit of the old episode, for example when a unique label belonging to that episode gets triggered by a cue, because there is no ambiguity in the details. But once a part of an episode gets recalled, its content is associated with two episodes: the original one, and the episode of recall. The recalled part, or episodic memory, becomes a weaker cue for the parts of the episode that were not recalled the first time. After many recalls, the episodic memory that gets restored in the context of recall is formed more as a reconstruction of previous episodes of recall than of the original episode. The details that are not usually recalled get forgotten, and the details that for some reason get distorted during recalls stay distorted in the subsequent recalls. These effects, which follow from simple considerations about associative memories, are known to occur with human memory as retrieval-induced forgetting and memory distortion, and can be mimicked by very simple models.
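One such very simple model can be sketched as follows. It is an illustration only, not the representation described in this sequence: each detail of an episode carries a numeric strength, recalled details are refreshed, details left out of a recall decay, and a detail distorted during recall overwrites the stored value. The names `Episode`, `recall` and `accessible`, and the values of `decay` and `threshold`, are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Episode:
    values: Dict[str, str]                          # detail -> stored content
    strength: Dict[str, float] = field(default_factory=dict)

    def __post_init__(self) -> None:
        for detail in self.values:
            self.strength.setdefault(detail, 1.0)   # fresh episode: fully intact


def recall(episode: Episode, recalled: Set[str],
           distortions: Optional[Dict[str, str]] = None,
           decay: float = 0.5) -> Dict[str, str]:
    """Recall some details; the rest become harder to reach from now on."""
    distortions = distortions or {}
    for detail in episode.values:
        if detail in recalled:
            if detail in distortions:
                # The reconstruction replaces the original content.
                episode.values[detail] = distortions[detail]
            episode.strength[detail] = 1.0
        else:
            episode.strength[detail] *= decay
    return {d: episode.values[d] for d in recalled if d in episode.values}


def accessible(episode: Episode, threshold: float = 0.1) -> Set[str]:
    """Details that are still strong enough to be restored."""
    return {d for d, s in episode.strength.items() if s >= threshold}


meeting = Episode({"place": "cafe", "weather": "rain", "drink": "tea"})
recall(meeting, {"place", "drink"}, distortions={"drink": "coffee"})
for _ in range(3):
    recall(meeting, {"place", "drink"})
```

After these four recalls the weather is no longer accessible, since it was never part of a recall, and the drink is remembered as coffee in every later recall, because each recall reconstructs the previous recall rather than the original episode.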

The first episode in the agent’s experience to which a rule encoded in semantic memory applies plays the same role as the original episode of episodic memory. The only difference is in what kind of content gets the emphasis during the recall. Episodic memory retains the relations to many details, even as retrieval-induced forgetting tries to shut them out, with rarity of recall events helping the matter, while semantic memory focuses on few properties and gets applied over and over. This allows us to view semantic memory as a special case of episodic memory that, through many cases of recall, has abstracted out all of the episode-specific details, leaving only what’s usually important in the contexts of recall.

When a fact is learned declaratively, there is an episode in which it’s first stated, but the details of that episode are irrelevant to the fact itself. When the fact is recalled in the future, the recalling process can stop on the fact, without going into the details of the episode in which the fact was learned, even though those details are available. The limited part of episodic memory becomes semantic memory. Alternatively, a fact can be abstracted out as a regularity present in many scenes, without ever being a part of episodic memory, as the only unambiguous inference that follows from cumulative memory of many past episodes.

Another interesting effect is that not every episode can form an episodic memory. If an episode is so ordinary that no cue can uniquely point to it, there is no way to recall it as an episode. When you commute to work for the hundredth time, you don’t usually pay attention to details: each action results from a known rule, and there is nothing to learn from the process. New memories capture reusable novelty, contexts that are expected to repeat sometime in the future, but did not appear in the past.
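The condition that some cue must uniquely point to an episode can be made concrete with a small check. This is a sketch under assumptions of its own: episodes are flat sets of details, and a cue is any combination of up to `max_size` details. The function name `unique_cues` is hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, List

Scene = FrozenSet[str]


def unique_cues(target: Scene, others: List[Scene],
                max_size: int = 2) -> List[FrozenSet[str]]:
    """Cues (small sets of details) that match `target` and no other episode."""
    cues = []
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(target), size):
            cue = frozenset(combo)
            # A cue is ambiguous if some other episode contains all of it.
            if not any(cue <= other for other in others):
                cues.append(cue)
    return cues


past_commutes = [
    frozenset({"bus", "rain", "podcast"}),
    frozenset({"bus", "sun", "podcast"}),
    frozenset({"bus", "rain", "book"}),
]

ordinary = unique_cues(frozenset({"bus", "rain", "podcast"}), past_commutes)
novel = unique_cues(frozenset({"bus", "rain", "flat_tire"}), past_commutes)
```

The ordinary commute yields no unique cue, so in this toy it cannot be recalled as an episode, while the commute with the flat tire can be reached through any cue that includes the novel detail.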


Summarizing structure in new labels

December 9, 2008
Warning! This post has been marked as wrong or obsolete, and may not reflect my current views.

This post is part of Locally structured fluid representation sequence.

Followup to: Structural representation of uncertainty, Continuous balancing of changing structure.

Labels make it possible to represent states of knowledge in the most compact form. The expressive power of structural contexts and whole scenes makes it possible to construct representations of elaborate combinations of previously encountered states of knowledge. These more complex representations are harder to manage than simple labels, and so when a certain structural pattern (or map) becomes common enough, it can be assigned a new unique label of its own.

New labels make it possible to compactly represent frequently encountered states of knowledge, and to form a language adapted to the environment. Multiple generations of labels can represent the most salient aspects of bigger and bigger structures in the scene. A smaller representation allows more robust processing. Structure restoration can function with fewer errors because structures that need to be restored become smaller. Maps can capture more global concepts in the scene without needing to consider more labels at the same time, because common combinations of labels are summarized by new labels. New labels form a basis for new levels of representation for structures that are already represented in the scenes.

A label by itself is good for nothing: if it isn’t in any map, it’ll never get restored. If a new label appears once in an ordinary context, it’ll never be restored again, because this context is matched by old maps better than a new map that also includes a new label.

A new label can get remembered if it appears in a novel structural context consisting of old labels. New maps that capture this new context can be restored in the future by the right combinations of the old labels, and as a result restore the new label. From then on, the new label appears in all scenes containing this novel structural context, at the same time as the new maps representing this context do. As a result, it becomes possible to represent this context and the associated maps just by the new label, and this label gets learned by other maps to reflect the presence of the regularity it represents. It’s no longer owned by the context in which it was bootstrapped and with which it was originally associated, and in the future it can even get completely disassociated from it, gradually shifting its semantics elsewhere.

Thus, it’s unnecessary to create a separate algorithm to manage new labels and rigidly assign them to the maps they are supposed to represent: map learning takes care of it. It’s sufficient to create a new label for each new map, and if it turns out to be useful, it’ll get learned by other maps. Relatively useless labels get abstracted out of maps, the same way relatively useless maps get discarded or merged with other maps.
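The policy of creating a label for every new map and letting usefulness sort them out can be sketched as a toy. Several things here are assumptions for illustration rather than parts of the model: a map is reduced to the set of labels forming its context, usefulness is approximated by how often a label gets restored, and the names `LabelStore`, `learn_map`, `restore` and `prune` are hypothetical.

```python
from collections import Counter
from itertools import count
from typing import Dict, FrozenSet, Iterable, Set


class LabelStore:
    """Toy store in which every new map gets a fresh label of its own."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.maps: Dict[str, FrozenSet[str]] = {}   # new label -> context it summarizes
        self.restored: Counter = Counter()          # stand-in measure of usefulness

    def learn_map(self, context: Iterable[str]) -> str:
        ctx = frozenset(context)
        for label, known in self.maps.items():
            if known == ctx:
                return label                        # this context already has a map
        label = f"L{next(self._ids)}"
        self.maps[label] = ctx
        return label

    def restore(self, scene: Iterable[str]) -> Set[str]:
        """Add every label whose context is present, until nothing changes."""
        result = set(scene)
        changed = True
        while changed:
            changed = False
            for label, ctx in self.maps.items():
                if label not in result and ctx <= result:
                    result.add(label)
                    self.restored[label] += 1
                    changed = True
        return result

    def prune(self, min_restored: int = 1) -> None:
        """Discard labels (and their maps) that were rarely or never restored."""
        self.maps = {label: ctx for label, ctx in self.maps.items()
                     if self.restored[label] >= min_restored}


store = LabelStore()
bicycle = store.learn_map({"wheel", "frame", "pedal"})
cyclist = store.learn_map({bicycle, "rider"})       # a second generation of labels
oddity = store.learn_map({"unicorn", "teapot"})     # a context that never recurs

scene = store.restore({"wheel", "frame", "pedal", "rider"})
store.prune()
```

The second-generation label is restored only because the first one is, which is the sense in which a label gets learned by other maps, and the label whose context never recurs is dropped without any dedicated bookkeeping.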