Continuous balancing of changing structure

November 26, 2008

Followup to: Restoring the structure, Balancing context with conceptual slippages.

Now that the operation of our algorithms is no longer monotonic, so that the scene is not just being extended, but balanced, with possible replacement and deactivation of patterns, it’s time to consider its continuous operation.

In continuous balancing, the scene is never reset, and the process of balancing never stops. Elements of the scene are updated through external activation and deactivation of certain maps (changes in their salience), concurrently with the activity of structure waves. This is the equivalent of sensory input. For now, let’s assume that this input describes the scene at a high level as well as at a low level, activating maps corresponding to arbitrarily abstract properties and relations. To simplify the dynamics, let the scene change slowly relative to the propagation of structure waves.

Known maps (long-term memory) are simply maps that were synthesized by structure waves at some point. When a pattern loses support from a sufficient number of other salient patterns, it gradually fades from the scene. The resulting maps with no salience can later be reactivated if they fit a structure wave better than the alternatives. The strength of the parameters of a map depends on how it was constructed and changes with each reconstruction.
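The fading of unsupported patterns can be pictured with a small sketch. It is only an illustration under assumptions that are not part of the text above: salience is taken to be a number between 0 and 1, support is counted as the number of currently salient supporters, and the names `balance_step`, `min_support`, `decay` and `threshold` are hypothetical.

```python
def balance_step(salience, supporters, externally_active,
                 min_support=2, decay=0.5, threshold=0.5):
    """One synchronous balancing step over all maps (hypothetical sketch).

    salience:          dict map name -> salience in [0, 1]
    supporters:        dict map name -> names of maps that support it
    externally_active: maps kept fully salient by "sensory input"
    """
    updated = {}
    for name, value in salience.items():
        if name in externally_active:
            updated[name] = 1.0  # external activation overrides decay
            continue
        # Count supporters that are themselves salient enough right now.
        support = sum(
            1 for s in supporters.get(name, ())
            if salience.get(s, 0.0) >= threshold
        )
        # Enough support keeps the map as it is; otherwise it fades gradually.
        updated[name] = value if support >= min_support else value * decay
    return updated


salience = {"legs": 1.0, "top": 1.0, "table": 1.0, "chair": 1.0}
supporters = {"table": ["legs", "top"], "chair": ["seat", "back"]}
active = {"legs", "top"}

after_one = balance_step(salience, supporters, active)
after_two = balance_step(after_one, supporters, active)
```

In this toy run the "table" map keeps its salience because both of its supporters stay active, while the "chair" map, whose supporters are absent from the scene, halves its salience at every step without ever being deleted, so it remains available for later reactivation.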

This setting threatens to leave too much debris, with elements of old scenes remaining active when the current scene is updated to something else entirely. However, each active pattern interferes with other patterns, influencing the global context. Forgotten elements of an old scene are influenced by the current scene, and vice versa. An old scene can’t change externally updated elements of the current scene, which to some extent gives direction to the dynamics. On the other hand, preserving context left over from old scenes is also a very important feature, making it possible to model the dynamics of the environment and perform deliberative inference.

Representation can now be considered a dynamic inductive-predictive model of environment, responding to sensory input and improving itself by drawing inferences between its elements and learning new rules. This representation is a more technically elaborated substrate for holistic control framework, though far from being specified in enough detail to be implemented.


Structural representation of uncertainty

November 18, 2008

Followup to: Reduced representation of alternatives.

Scene graph represents information about environment, focusing on certain objects and relations, at a certain level of description. Each element of representation leaves fair amount of uncertainty, specifying only a coarse picture. At the lowest level, labels encode classes of properties without discriminating among the properties within represented class. Finer distinctions are expressed through the graph structure in which labeled nodes are organized. Common arrangements of local structure can later be summarized in maps.

Similar states of information about the environment are represented by similar scene graphs, which makes it possible to infer ways of transforming the representation without significantly distorting the information it carries about the environment. Many details and high-level properties of the environment that are implicitly described by the scene are absent from the representation but known from scenes encountered in the past. A scene can be transformed to replace a given description of the environment with one that is expressed in terms of more familiar structural contexts and that doesn’t leave out properties that are usually represented explicitly.

Details included in representation, especially restored ones, are of different levels of uncertainty. Furthermore, since patterns are represented implicitly, uncertainty of the same detail can vary between its different local reconstructions by structure waves. Essentially, stochastic process of modification of map salience through propagation of structure waves balances uncertainty of different elements of representation, by enumerating sufficiently certain structural contexts. Salience of a map is influenced through different routes, since the same map can be a part of many different patterns, and the same pattern can be suggested as sufficiently likely by many different structure waves.

A pattern that is used in many contexts in the scene is of high certainty by itself, while each of its instantiation points can have a different level of uncertainty, and all of the details in the scene are very unlikely to be correct simultaneously. Similarly, contradicting properties generate alternative scenes of higher uncertainty than a scene without contradicting properties, but the areas common to these alternative scenes are still fairly certain.

Reduced representation can be thought of as a state of knowledge that makes it possible to answer questions. Structural contexts or patterns are specific answers, and structure waves are questions. Answers to different questions can contradict each other when considered by themselves. Different parts of the scene can have contradictory properties, just as alternative properties of the same part of the scene contradict each other. What color is a zebra? Different colors at different points. What color is an animal? Depends on which animal it is. Encoding different structures of alternative states of the scene makes it possible to represent the structure of uncertainty about the scene with the same expressive power as is used for describing the fixed structure of a single scene.

Contradictions can be resolved either in the question, by specifying additional details that allow a simple answer, or in the answer, by making it conditional and enumerating alternatives. Different alternatives in the answer can be recast as elaborations of the previous question, and this is one way of exploring an overly complex answer to an overly general question, such as the structure of the whole scene. Structure waves are iteratively refined questions shaped by the state of knowledge about the scene, and the structural contexts they restore are both local answers and steps of further refinement. The scope of a question can include elements of alternative structures of the scene, or elements of different parts of the same structure of the scene expressed in a common pattern, erasing the difference between these cases.


Learning rules of thumb

October 11, 2008

Followup to: Rules of thumb, Focus of attention.

States of individual low-level event detectors don’t correspond to specific events in the environment. Only when considered in the context of a focus of attention do individual detectors indicate localized events. More generally, an event in the environment can be represented by a collection of low-level event detectors in the mind. This happens because each specific event in the environment rarely gets attention more than once, and when it does, the question of the presence or absence of that event gets resolved once and for all, so there is no need to keep around a detector specialized in answering this question.

Rules of thumb describe the events in environment, but act and get learned in the mind. They capture general relations that hold between specific events, so within the mind they establish relations between contexts, collections of events, parts of focus of attention. For each context representing a specific event, rule of thumb constructs another context, representing inferred specific event, and this inferred event is different for different original events. Rules of thumb generalize observed transitions between focuses of attention, so that when focus of attention represents a novel situation, they can be used to infer a next focus of attention. When transitions between focuses of attention are originally enforced by sensory input, or by few rules of thumb that look for specific cues, new rules of thumb make the transitions more robust, so that sensory input or initial cues are not required anymore to make the same inference.


Focus of attention

October 7, 2008

Followup to: Principles of holistic control, Event detector as experimental setup.

Event detectors can be regarded as experimental devices for discovering properties of past and future. State of detector is the result of the experiment, so if detector is designed to answer only one specific question, the answer should also stay the same, as constant state of the detector. But semantics of detectors changes over time, and the questions are more generally context-sensitive, so state doesn’t stay the same, and interpretation needs to keep up. When interpretation of detector changes, so do interpretations of dependent detectors, and detectors depending on them.

Locally and globally, each moment the mind changes which properties of environment it indicates, and correspondingly changes its state. Interpretation of possible states of the detectors (external description) and states themselves (actual implementation) are interrelated: current states form a context for reaching following states, and correspondingly interpretation of current states determines interpretation of possible following states.

Let’s call interpretation of (properties of environment indicated by) current state of mind, focus of attention. Events indicated by next state of mind are directly indicated by currently indicated events in environment (including current sensory input). Thus, focus of attention follows indicator chains in the environment. Since mutually indicating properties propagate together, inference running for a while without drastic disturbances will lead to a robust model of aspect of environment, with elements reinforcing and double-checking each other.

Focus of attention is driven by several related pressures. First, directions of inference from properties of environment in the focus of attention form the direction of change in overall attention, if aligned sufficiently together. Second, events in the focus of attention tend to form a coherent picture, with mutually reinforcing events staying longer together and separate events gradually dissipating. Third, low-level input and output have fixed semantics, thus binding focus of attention to the agent in several points. Even though the focus can stretch for light-years in high-level representation, it never completely leaves action and perception, which emanate waves of attention through the first pressure, by immediate inference, in all directions around the agent. This same pressure drags the attention forward in time, changing the representation to reflect current events in the environment. And fourth, focus of attention is driven to seek desirable properties in environment, forming a substrate for preparing and following goal-reaching plans.

Depending on context, individual detectors can have a great variety of interpretations. As a result, separately, detectors have a very distributed semantics, indicating a certain texture of configurations located anywhere and at any time. But together, they add up to an inferentially localized picture, in which each individual event makes much more clear-cut distinction, being focused by other events.


Event detector as experimental setup

October 4, 2008

Followup to: Where map meets the territory, Levels of representation, Improving event detectors.

Uncertainty about environment is uncertainty about specific events. Experiment is a procedure for which the outcome is initially unknown, but knowledge about actual outcome (after the experiment is performed) can be used to resolve the uncertainty about environment. Experimental setup is created in such a way that its future state, after the experiment is performed, indicates the property of environment in question. This constitutes a theoretical basis or interpretation for experiment. Interpretation provides both a functional cause of experiment (the reason it’s being performed, chain of events that leads to it being performed), and a way to use its results. Interpretation also provides a criterion of optimality for experimental setup, so that a particular experiment is a good solution with respect to this criterion, configuration optimized for the task.

An event detector can assume one of several possible states and serves as a tool for resolving uncertainty. The state it assumes indicates which of the alternatives holds in reality. From this perspective, an event detector can be regarded as part of an experimental setup that at each moment answers a question about the environment.

Two complementary interpretations can be used for event detectors. First, whole intelligent agent can be regarded as experimental setup, with this particular event detector being a readout of the experiment (global interpretation). The event detector is considered independently of other elements of mind, and interpretation relates state of the detector directly to state of environment, shows what events in environment are indicated by the readout. Structure of mind is a way of obtaining measurement with required properties. The relation between detector and environment is primary, and algorithm of the mind is a way to implement that relation. Direction of improvement for the detector is defined by the external criterion, by structure of environment.

The second way is to consider only the detector itself as experimental setup (local interpretation). Detector can be configured to answer a question specific to needs of cognitive algorithm, and not so much for a feature of environment. Interpretation of its states can be derived from interpretations of elements of mind it interacts with, which makes derivation of interpretation more tractable than in global case. Starting from input/output, where detectors indicate their own state, interpretation reaches deeper in the environment as detectors get deeper in the mind. Each detector solves a local optimization problem, efficiently indicating the state of environment following from states indicated by related detectors.

Interpretation guides learning, sets the optimization target for representation. When learning is regarded as development of new experiments in anticipation of inference, interpretation can be said to be the process of learning, chain of events that leads to particular changes in representation. Global interpretation regards intelligent agent as a whole, directing representation to indicate the goal, and to figure out the ways of indicating the goal, developing the experiments that find a way to indicate the goal more efficiently. Local interpretation optimizes the representation with regard to current cognitive algorithm, where operations are instrumental, performing small subtasks far from the goal.


Dynamics of representation

September 4, 2008

Followup to: Where map meets the territory.

Natural events in the environment come in different shapes, and may be located far away from each other, be spatially or temporally distributed, or overlap. A single event may include many different configurations bundled together. Relations between events, that allow intelligent agent to build simplified models, to infer presence of some events from knowing about presence of others, come from the way physical laws apply to the actual content of the world.

Events in the mind follow the same relations, but work by different rules. They are arranged differently and are much more localized in time and space, but they don’t need to be atomic. Just as events in the environment are only chosen to approximately describe its structure, events in the mind describe the dynamics of a particular kind of cognitive algorithm. This level of description is useful for designing the algorithm: it makes it possible to reduce a special case of the high-level phenomenon of intelligence to the interaction of events, but it doesn’t cut to the bottom.

Attended events of environment don’t need to be represented all at once, as some kind of declarative enumeration at the moment of implementing the decision that follows from the model. Events of the model happen at different times, just as events in environment represented by them, to function as elements of the inference process.

There are two main factors that influence the way representation events happen in the mind. First, events drive the inference process and support the current context. An inferred event appears after the events of the context that indicate it. Some events are intermediary and can be discarded after followup events are inferred; other events need to stay around to shape the context of further inference. Second, a fixed pattern in the mind indicates different things depending on when it appears.

The simplest example of how the event in the environment indicated by the same event in the mind changes over time is low-level input, where the two events are identical. If a binary event detector in the mind for low-level input is active at time T, it represents the event of low-level input in the environment at time T. At time T+1, it represents a different event, the low-level input at time T+1, not the old event of input at T. On the other hand, at time T+7 a different event might form in the mind that represents the input at time T, even though the original event template now represents the input at T+7.
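The shifting interpretation of a detector can be made concrete with a toy sketch. A fixed delay line is used here purely as an assumption for illustration; the text does not say how the later event representing the input at T is formed, and the names `DelayedTrace` and `step` are hypothetical.

```python
from collections import deque


class DelayedTrace:
    """Pairs the current low-level input with a delayed copy of it.

    The first value returned by step() always means "input now"; the second
    value means "input `delay` steps ago". The same slots are reused at every
    step, so what each of them represents in the environment keeps shifting.
    """

    def __init__(self, delay):
        # Before any input has arrived, the past is assumed to be inactive.
        self.buffer = deque([False] * delay, maxlen=delay)

    def step(self, current_input):
        delayed = self.buffer[0]           # input from `delay` steps ago
        self.buffer.append(current_input)  # the oldest entry drops out
        return current_input, delayed


trace = DelayedTrace(delay=7)
inputs = [True] + [False] * 7  # input is active only at time T = 0
history = [trace.step(x) for x in inputs]
```

Here `history[0]` has the input detector active and the delayed one inactive, while `history[7]` has them the other way around: at time T+7 the input detector reports the (inactive) input at T+7, and a different event in the mind carries the fact that the input was active at T.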

A model of the environment not only needs to incorporate new facts and infer missing elements, but also to change the representation of some of the events as time goes on. The latter feature is not a drawback, as it makes it possible to keep track of time and of temporal relations between different events. For example, when a temporally shifted representation of one event meets a new version of another event, they appear simultaneously and thus an inference rule between them can form, which could not happen if they never appeared at the same time.

Each event in the mind leaves behind itself a trail of representations: it starts in one form, and then changes into the next, the one after that, and so on. Some of the events are more time-insensitive than others and may have most of their representation unchanged (the representation of a single event is not necessarily atomic, so parts of it may change in time while other parts remain stable). As trails of multiple events interact in the mind, relations (rules of thumb) that are time-insensitive will be based on time-insensitive parts of the representation, and relations that are more time-sensitive will look at time-dependent parts as well. This makes it possible to learn and perform spatiotemporal inference.


Principles of holistic control

August 22, 2008

Followup to: Where map meets the territory, Levels of structure, Keeping the target in sight, Vague questions and precise answers, Causal rules and unpredictable actions.

When the mind supports a picture of the environment, it is not passive. Some of the future events in the environment are determined by this picture: they happen because they are in the picture. The chain of events flows from the representation in the mind to the outcome through the actions, which lie on both sides, in the mind and in the environment at the same time.

It is trivial that whatever state the low-level output event assumes in the mind, the same state will appear in reality, because it is the same event on both sides. For other events in the mind, it is not necessarily so; changing such events in the mind will make them represent environment incorrectly, rather than making them determine corresponding state of environment. On the other hand, every change that does leave representation correct, leads to the environment complying with it.

This perspective can be turned around: if representation is restricted to only assume correct states, it can be freely varied within that bound, and whatever picture of environment it chooses to draw will automatically come true.

One way of constructing an accurate picture of the environment is by following the rules of thumb, filling the gaps in the picture based on known elements of any kind. When it is known that there is a chair in a room, it is possible to infer that the chair is likely to stand on the floor, rather than float in midair, even though the original description doesn’t include information about how the chair relates to the floor of the room.

Since this picture is centered around the actions of the agent that supports it, rules of thumb need to be sufficiently strong (if not individually then cumulatively) not to break down under the surprising changes of context that may result from the actions.

Each element of the representation asks a question, adding a constraint on the set of possible causal patterns that satisfy it. Additional elements help fill the gaps in the model of the environment, initiating inference that weaves the structure. Not all details of the environment can be supported in the mind at once, even if they can be inferred from known facts. The tiniest hint may bring to attention many precise details, reconstructing intermediate elements of structure. When inferred from robust rules of thumb, the state of mind would indirectly correspond to the state of the environment, and giving different hints will lead to the state of mind corresponding to different aspects of the environment.

The uncertainty about the future state of environment that is determined by the state of mind may be resolved in many ways, by assuming one of the allowed states of mind. On the side of the mind, the process of resolution of this uncertainty starts from making a decision, from introducing a fact in the model that doesn’t otherwise follow, and checking if the model assembles to a coherent state, if this element leaves the picture in the mind corresponding to environment, thus causing the environment to assume the state corresponding to the decision. If the decision consists in a relatively vague hint about the future, there are good chances that there is in fact a state of environment that satisfies the hint.

Much like drawing the attention to an aspect of environment, asserting a certain vague property in the future state of environment leads to construction of the detailed representation of the state of environment that has that property, driven by a multitude of specific inferred facts about the state of environment (both in the past and in the future) and general direction specified by the property. Where attention draws the details of representation from available factual information, model of the future may make up some of the details when they can be determined by the model. Thus constructed model will include events in the past and the future, but also the present, in particular low-level action events. Choosing a certain property in the future leads to construction of the model of environment that has that property, and the model of environment includes specific state of low-level actions in the present, which causes these actions to be carried out, which in turn determines the future to have required property, to be in accordance with the model.

The plan formed by the model of the future chosen to lead to a certain outcome doesn’t need to be very detailed, for example it doesn’t need to contain the whole sequence of low-level actions from the current point on. Plan gets refined as it unwinds, as more accurate factual information becomes available about events in the environment that were only modeled based on the goal at the start. At each moment, low-level action is chosen according to the best current guess included in the current model.

This approach makes it possible to view the process of control in an intelligent agent as the result of two cognitive pressures acting on the representation of the environment supported by its mind. The first pressure compels the representation in the mind to be correct, to depict the state of the environment (past, future and present) as accurately as possible. The second pressure biases the representation to see a future state of the environment that is as close as possible to the goal.

I call this perspective on how control algorithm could operate “holistic control”, to reflect the way plans get constructed. Inference operates across the levels of representation and in both directions in time, it is neither bottom-up nor top-down, it is not forward chaining or backward chaining. Control algorithm doesn’t contain clear-cut feedback loops, processing doesn’t happen in feed-forward fashion. The model of environment is held together by heuristic rules that aren’t organized in any kind of hierarchy, the model itself is “flat”, not modular except for the structure inherited from environment it represents. The operation of control algorithm is focused on the support of model of environment, not on action and perception. Action and perception are only peripheral (although indispensable) aspects of control, with low-level input binding the model of environment to reality at one tiny point, supplying new facts and showing the mistakes, and low-level output giving the model ability to participate in the causal web of environment.


Where map meets the territory

August 10, 2008

Followup to: Levels of structure, The dynamics of mind.

Events in the mind of intelligent agent indicate events in the environment. A configuration of events in the mind represents the current context, knowledge about which events are known to be present. Given a context, some additional events can be inferred according to known rules of thumb, supplementing the context with events on different levels of representation and resolving uncertain events.
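As a rough illustration of supplementing a context with inferred events, the sketch below applies rules of thumb repeatedly until nothing new can be added. This naive fixed-point loop is a deliberate simplification chosen for brevity, not the inference process the posts describe; the rule format, the event names and the function `supplement` are all hypothetical.

```python
def supplement(context, rules):
    """Add every event that the rules of thumb indicate, until nothing changes.

    context: iterable of event names known to be present
    rules:   list of (premises, conclusion) pairs, where premises is a tuple
             of event names that together indicate the conclusion
    """
    context = set(context)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in context and set(premises) <= context:
                context.add(conclusion)
                changed = True
    return context


rules = [
    (("chair in room",), "chair stands on floor"),
    (("chair stands on floor", "person is tired"), "person may sit down"),
    (("window is open", "it is raining"), "windowsill is wet"),
]

result = supplement({"chair in room", "person is tired"}, rules)
```

The third rule never fires because its premises are not in the context, while the second fires only after the first has supplied one of its premises, which is the sense in which inferred events extend the context for further inference.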

The breakup of the current knowledge about the environment into separate general events is a property of the data structure, the knowledge representation, designed to be manipulated by the inference algorithm. General events that don’t change (don’t become irrelevant or invalidated), and therefore don’t need to be manipulated by inference, don’t need to be separately represented in the mind. Events in the mind show what is different in the current knowledge about the environment; they are active elements resulting from inference, starting from sensory input and trains of deliberative computation.

Most of the events in the mind are separate from events in the environment they represent. The property of representing (indicating) events in the environment results from using inference process that, through many intermediate steps, creates events in the mind in the right way. Some of the events, on the other hand, may be thought of as lying both in the mind and in the environment, and representing themselves. These are events of low-level input and output, boundary of the mind, where map meets the territory.

The state of the environment is reflected in the mind; events outside the agent are warped around the border to fit inside its mind. These events include elements of the future as well as the past, elements independent of the agent’s actions as well as elements determined by them. The two pictures of the environment work by different laws: events in the environment itself are constrained by the laws of physics and feature an excruciating level of detail, while events in the mind are simple caricatures constrained by rules of thumb, mimicking the former. Both pictures meet at the identity of boundary events, at input and output. Events at the boundary are identical in both forms, and events further and further away from the boundary in both directions are traced through chains of relations holding in each of the pictures, reaching high-level events in the mind and events far away, both in space and time, in the environment. Captured in correspondence, the pictures determine the elements in each other.


Causal rules and unpredictable actions

August 4, 2008

Followup to: Causation as robust indication, Unpredictability of actions.

Changes considered for causal rules to handle are usually brought up in the context of decision making. Decisions lead to actions, and actions can create radical changes in context. Being the most unpredictable, actions of intelligent agents serve as the hardest test of robustness for causal rules and filter out many rules of thumb that are not normally considered causal. The more significant the changes that are allowed to occur, the fewer rules of thumb are general enough to endure them.

Causal rules don’t need to be preserved for all physically possible contexts, since only contexts that follow from the causal history of the environment are actually realized, and only optimization processes that currently exist in the environment can change contexts in novel ways. Causal rules serve as a tool that works where the set of significantly different causal patterns remains approximately the same, but within that domain of applicability they allow accurate conclusions to be drawn. Thus, causal rules only need to persist through sufficiently insignificant changes in global context that preserve the semantics of events, and through local changes in context that can be enacted by existing optimization processes.

For example, grass is usually wet when it’s raining. Wet grass is a good indicator of rain, it is a reliable rule of thumb in natural context. Yet if an intelligent agent creates a new context, by specifically pouring water on the grass, in this new context the rule will no longer hold, wet grass doesn’t cause rain. Colliding stones produce a sound in natural context, and if an intelligent agent collides stones in artificial context, there is still a sound, which shows that colliding stones cause a sound.
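The two examples can be checked mechanically in a toy world. The sketch below is only an illustration under stated assumptions: the world is reduced to four boolean events, a rule "A indicates B" is said to hold if B is present in every considered context where A is present, and the names `world`, `rule_holds`, `natural` and `artificial` are hypothetical.

```python
from itertools import product


def world(rain, stones_collide, agent_waters_grass=False):
    """Toy environment: grass gets wet from rain or from the agent's watering,
    and a sound is produced exactly when stones collide."""
    return {
        "rain": rain,
        "wet_grass": rain or agent_waters_grass,
        "stones_collide": stones_collide,
        "sound": stones_collide,
    }


def rule_holds(contexts, indicator, indicated):
    """A rule of thumb holds if the indicated event is present in every
    context where the indicator event is present."""
    return all(c[indicated] for c in contexts if c[indicator])


# Natural contexts: the agent does not interfere.
natural = [world(r, s) for r, s in product([False, True], repeat=2)]

# Artificial context: on a dry day the agent waters the grass and
# bangs two stones together.
artificial = [world(False, True, agent_waters_grass=True)]

wet_indicates_rain_naturally = rule_holds(natural, "wet_grass", "rain")
wet_indicates_rain_always = rule_holds(natural + artificial, "wet_grass", "rain")
collision_indicates_sound = rule_holds(natural + artificial, "stones_collide", "sound")
```

In this toy setting "wet grass indicates rain" holds across the natural contexts but breaks once the agent's intervention is included, while "colliding stones indicate a sound" survives the intervention, which is the distinction the paragraph draws between a mere indicator and a causal rule.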

Causal rules are rarely infallible, the same principle that tests them and distinguishes them from mere correlations can break them as well. There usually is a way to create a contrived context in which a given causal rule will no longer hold. Colliding stones in vacuum or colliding specifically crafted elastic “stones” will produce no audible sound. Administering symptom relief medicine leads to disease not causing symptoms. Only causal rules that follow immediately from laws of physics or those guarded by limitations in available technology can withstand a directed attack. Thus, most of the causal rules have additional conditions, specifying the kinds of changes that they are able to withstand.

This places various correlations on the same scale with causal rules, distinguishing causal rules only by generality, applicability in wider context. Depending on the context, different strength is required from rules to carry out accurate inferences, and weaker rules can contribute considerably to determining the details not captured by stronger causal rules.


Causation as robust indication

August 1, 2008

Followup to: The flow of reality, Semantics of the rules of thumb.

There are multiple senses in which the words causation and causality are used, and many historically accumulated complications. Basically, causality is a relationship between two events, cause and effect, that establishes the effect as reliably following (from) the cause. Causal relations are naturally used for describing the functional structure of the environment. In addition to an enumeration of events (“button was pressed; light turned on”), causal relations establish the internal structure that binds these events (“button was pressed, which caused light to turn on”). Functional structure makes it possible to model the dynamics of the environment in different contexts, acting as an algorithm that computes the state of the environment given different initial conditions. This way, causal relations do not just describe the structure of a fixed scene, but generalize to other scenes.

The sense in which the effect follows from the cause reliably is the subject of many disputes. Restricting the notion of causality only to relations that hold deterministically makes it almost useless for modeling a real environment. Thus, the effect should be allowed to sometimes not follow the cause, which contributes to the problem of the connection between correlation and causation. Two events may be highly correlated, yet not causally dependent, and conversely, two events may be causally dependent, but without overwhelming correlation. Apples often lie on the ground near the apple tree, but they don’t cause the apple tree to appear nearby. Smoking causes cancer, but only sometimes.

The difference that is usually pointed out between statistical properties and causal properties appears when studied scene is changed. One of the central problems in statistics is regression, inference of the whole distribution from limited number of observations. It can consist in e.g. finding the values of parameters that give a distribution from parameterized set that fits the data best. The distribution that describes the data is considered to be fixed and the same for all data points. Even when data points are taken from a process that changes its state over time, distribution that describes development of the process is still specific (even though it’s uncertain). Statistical properties of a single distribution describing static conditions are contrasted to causal properties that describe the behavior of the scene when conditions change. Statistical properties change with conditions, but causal properties are fixed, and can be used to regenerate statistical properties of the scene from changed conditions.
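As a minimal sketch of fitting a distribution from a parameterized set, the example below assumes the set is the family of Gaussian distributions and that "fits the data best" means maximum likelihood; both choices are assumptions made for illustration, and the name `fit_gaussian` is hypothetical. For a Gaussian, the maximum-likelihood parameters are the sample mean and the population (not sample) variance.

```python
import statistics


def fit_gaussian(observations):
    """Return the maximum-likelihood mean and variance of a Gaussian.

    The same fixed distribution is assumed to describe every data point,
    which is exactly the static-conditions assumption discussed above.
    """
    mean = statistics.fmean(observations)
    # The maximum-likelihood variance divides by n, i.e. population variance.
    variance = statistics.pvariance(observations, mu=mean)
    return mean, variance


mean, variance = fit_gaussian([1.0, 2.0, 3.0, 4.0])
```

The fitted parameters summarize the scene only as long as conditions stay the same; if the conditions that produced the observations change, the fit has to be redone, whereas a causal property would be expected to carry over to the new conditions.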

Yet if we consider the development of the environment as a whole, there are no external changes to be applied to the dynamics of the environment. Everything that happens, happens within the environment; every change is an interaction between configurations within the dynamics of reality. From this perspective, causal properties may be regarded as statistical hypotheses that hold for the considered part of the environment both before and after a change is applied to it through interaction with other parts of the environment. Statistical properties of a fixed distribution thus apply only to a limited part of the dynamics of the scene, while it isn’t changed, whereas causal properties are more general statistical properties that persist through “external” changes.

Let’s focus on a specific class of causal properties that describe rules of thumb which indicate events given other events. This way, a causality relationship between cause and effect becomes a rule of thumb that provides evidence for effect given the cause. The feature that distinguishes causal rules from other rules of thumb is that they provide evidence for effect given the cause in many contexts, even in contexts that were changed through interaction with various causal patterns. Causal rules are rules of thumb that are general enough not to break down as the dynamics of environment unfolds. In this sense, causal rules are the only reliable kind of rules of thumb.