Unpredictability of actions

July 29, 2008

Followup to: The flow of reality.

Effects produced by optimization processes are much less predictable than other events in the environment. This is because optimization processes can produce novel causal patterns, which break the rules of the environment that worked well in the past. When an optimization process is set up with a known goal, some rules of the optimized environment may be known in advance, but the specific path towards the goal is usually unknown. Leaving the details of the path unknown in advance may be the whole point of launching an optimization process with a known goal: you describe the goal, perhaps only vaguely, and out comes a precise and efficient plan for achieving it. Along this unknown path the optimization process may produce all kinds of unfamiliar causal patterns that break the old rules.

Ordinary causal patterns appear in ordinary contexts. This is reflected both in their origin, where a pattern is produced through a usual kind of interaction between usual causal patterns, and in the form of the rules of thumb that capture its operation, where the conditions of applicability include just a few surface properties. Zooming in on an optimization process as an intelligent agent, the choice of actions depends on a representation of the environment that includes lots of contextual information. Unlike an ordinary causal pattern, a mind isn’t limited to any few of the configurations appearing in the environment; it absorbs as many of them as possible. As a result, any action that doesn’t take an obvious step towards the goal can depend on many contextual features, and so isn’t amenable to being captured by a simple rule.

Unpredictability of actions doesn’t necessarily refer to the creation of novel configurations that break the semantics of old events. It can consist merely in a failure to predict or interpret specific actions that choose between known causal patterns. You get into a taxi in an unfamiliar city, and you don’t know where the driver will turn, even if you know the destination.

Forming cached interpretations for the actions of an intelligent agent may be unreliable: such actions don’t work by the rules of natural causal patterns and may defy normal classification. An actual action is chosen by the agent for the specific context, which can place it in any of the context-insensitive bins an observer might have. It may be marked as “stupid”, “brilliant”, “careless” or “disastrous” without actually being so.


Keeping the target in sight

July 26, 2008

Followup to: The dynamics of mind.

When constructing an action, it is not enough to know that the action is a good indicator of the target outcome. It is also useful to know whether the action is required to get the outcome: maybe it’ll just come about anyway. Performing a sun-worshiping ritual every evening is a good indicator that sunrise will occur tomorrow morning: the probability of sunrise given that you performed the ritual is high, but the ritual was hardly useful for achieving this outcome. This seems to require a separate notion of causation, so that only actions that do contribute to the outcome, that cause it, are chosen. But there is also a way to do without causation, at least in this sense.

The trick is that preparing a supper in the evening is also a good indicator of sunrise. Decision making can be thought of not as a sequence of specific solutions to specific problems, but as a process that preserves the target state of the environment in the role of an event indicated by the current state of mind. If the model of the environment that assesses indication is accurate enough, having the target outcome indicated by the actual state of mind shows that the target outcome will in fact likely happen. If performing a sun-worshiping ritual is a good indicator of sunrise, and sunrise is desirable, the ritual may as well be performed, just as bones are allowed to be white instead of green, so long as they perform their function. On the other hand, if preparing a supper leads to an even better outcome than performing a ritual, it is the preferable action to establish. Accurate decisions are self-fulfilling prophecies, and right decisions foretell good outcomes.
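As a rough illustration of choosing between actions purely by what they indicate, here is a minimal Python sketch. The probabilities, the outcome values, and the names `INDICATES`, `VALUE` and `expected_value` are all made up for the example; nothing here is claimed to be the only way to formalize the idea.

```python
# Hypothetical numbers: how strongly each evening action indicates each outcome,
# i.e. P(outcome | action) under the agent's model of the environment.
INDICATES = {
    "sun ritual": {"sunrise": 0.999, "fed": 0.10},
    "cook supper": {"sunrise": 0.999, "fed": 0.95},
}

# Hypothetical values the agent assigns to each outcome.
VALUE = {"sunrise": 10.0, "fed": 1.0}


def expected_value(action):
    """Sum of outcome values weighted by how strongly the action indicates them."""
    return sum(p * VALUE[outcome] for outcome, p in INDICATES[action].items())


# Both actions indicate sunrise equally well, so sunrise does not
# distinguish them; the other indicated outcomes decide the choice.
best = max(INDICATES, key=expected_value)
print(best)
```

With these assumed numbers the ritual is not rejected for failing to cause the sunrise; it simply loses to an action that indicates a better overall outcome.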


Semantics of the rules of thumb

July 25, 2008

Followup to: Levels of structure.

Rules of thumb work only in a certain environment. Even under the constraints of the laws of physics, an unusual physical configuration may act contrary to most of them. Each rule applies only in a certain context, and together they can support a plausible model of the environment only in a certain global context. When a rule is applied, its applicability is checked only against the information about events derived from surface properties. On the other hand, rules of thumb encode information about the environment that makes it possible to perform inference just from surface properties, without knowing physical configurations accurately enough to perform the same inferences using the laws of physics.

Rules of thumb are fundamentally hypotheses, probability distributions over physical configurations, that can gain high probability given information about the distribution of configurations represented by other probable events (the current context), which in turn lends higher probability to other events under the hypothesis. Starting only from physical laws leaves hypotheses representing the traces of different physical configurations in time equally likely. Singling out rules of thumb prioritizes certain kinds of configurations, those expected to be possible (more likely) in the actual state of the environment and at the same time useful for inference from the surface properties. The whole collection of rules of thumb provides an expressive language for assembling complex hypotheses about the environment from reusable parts. The probabilities assigned to hypotheses can be represented implicitly, by maintaining a set of events that pass a certain threshold of probability. Rules of thumb act as update rules on this set, removing the events that are inferred to be improbable given the rest of the events, and adding events inferred to be probable. Doubt everything, but one thing at a time.


The dynamics of mind

July 22, 2008

Followup to: The flow of reality.

A configuration confined within a narrow event (causal pattern) plays a role of a program in the physical world: its structure (partially) determines what will follow in which contexts, what will be the output for each input, the future for each possible past. As a part of the output, causal pattern determines its successor, the pattern that will implement the next phases of the process initiated by previous pattern.

Let’s consider the dynamics of a causal pattern that implements a mind. First, some imagery to fuel intuition with. Mind is an optimization process, a converging stream in the flow of reality. In its wake, this stream warps the environment by applying little nudges calculated for higher impact. To calculate the appropriate nudges, the stream tunes itself to react to the relevant properties of the environment. The nudges are determined by common causes to become concerted with other causal patterns, so as to direct the flow towards the target. The structure of the stream gets refined over time to become more receptive to current properties of the environment and to turn them into actions to a greater effect. The flow of causality gets captured by the stream, and, transformed by its structure, it is released to shape the future.

Knowing the environment means identification of causal patterns present in it. For the knowledge to be applied to decision-making, it needs to be reflected in the state of mind. In other words, the different possible causal patterns to be found in the environment need to cause different functional states of mind (different causal patterns in the mind). The state of mind acts as a way to disambiguate the possible states of environment. Where the environment is sparse, it is sufficient for the mind to be set so as to assume different states depending on the wide events that contain the possible causal patterns in the environment. As well as representing the long-term rules by which the environment happens to play, mind needs to represent the current state of the environment, or, in other words, the state of the environment relative to the time and place where the causal pattern implementing the mind happens to be. The structure of the environment around the mind is important, because it’s where the actions are initiated. The long-term rules are used to support a representation that robustly integrates different pieces of information that the mind comes across and allows it to run deep queries about the state of the environment that are not immediately accessible to it, or even about the states of environment in the future.

Armed with a representation that reflects the structure and state of the environment, the dynamics of mind causes the actions that are predicted to lead to the goal. The goal is not necessarily very fine-tuned: it may simply consist of certain events that are targeted, although that alone could be sufficient to launch an optimization process that leads to a state of the environment much more intricate than anything that could’ve been created by chance. An action, or a decision to implement a high-level plan, consists in a state of mind, a belief that the objective will be achieved, which itself causes it to be achieved. A reliable decision acts as an indicator of the outcome in the environment, in the same way as the representation of other knowledge about the environment, and so it is constructed from the representation. Amusingly, making good decisions consists in the mind dreaming up self-fulfilling prophecies foretelling good fortune.


Vague questions and precise answers

July 20, 2008

Followup to: Levels of structure.

There is a tremendous number of configurations allowed by the laws of physics. A perfect specification of a configuration would require all the tiny details, because it’s physically possible for each of them to be different. Specifying a configuration that actually exists is another matter. You can often just point and say “That thing over there!”, and it will be sufficient. An event doesn’t need to reflect all the facets of a causal pattern to capture it precisely; it only needs to delineate a set of configurations that doesn’t contain any other causal pattern. An event is a way to disambiguate the choice between multiple alternatives, thus implicitly specifying the configuration without describing the details.
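The idea that an event only has to exclude the other things that actually exist can be sketched in a few lines of Python. The inventory `EXISTING`, the attribute names and the helper `referents` are hypothetical, chosen only for this illustration.

```python
# Hypothetical inventory of the things that actually exist in a room.
EXISTING = [
    {"kind": "cup", "color": "red", "place": "table"},
    {"kind": "cup", "color": "blue", "place": "shelf"},
    {"kind": "book", "color": "red", "place": "table"},
]


def referents(event):
    """Return the existing things that fall inside an event (a predicate)."""
    return [thing for thing in EXISTING if event(thing)]


# "The red thing" is an event that still contains two existing patterns.
red_thing = lambda t: t["color"] == "red"

# "That cup on the table" mentions only two properties, yet it is precise:
# no other existing thing falls inside it.
cup_on_table = lambda t: t["kind"] == "cup" and t["place"] == "table"

print(len(referents(red_thing)), len(referents(cup_on_table)))
```

Neither event describes the full configuration of the cup; the second one is sufficient only because of what else happens to be in the room.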

Even when you can’t precisely define a certain concept, but only have a vague intuition about what it is, there is often a precise and unique referent of that intuition. On the other hand, a change in the boundaries of the event that seems insignificant on the scale of the event, with all the volume of configurations that it encompasses, may shift it from the original causal pattern to a completely different one. This makes the business of defining something a dangerous one, and sometimes counterproductive where a less clear-cut method of identifying the target exists.

A collection of weak indicators that point to the same target works even better. It is possible to describe all the essential details of most situations in just a few words, so that the myriad implied consequences become evident. Likewise, a configuration can be represented by several heterogeneous reusable categories, all of which contribute to the description.


Levels of structure

July 19, 2008

Followup to: Rules of thumb, The flow of reality.

The rules of thumb make inferences from surface properties (events) through implicitly identifying causal patterns that can arise in the environment and exhibit these properties. The events capture wide sets of configurations, within which the causal patterns that can actually appear in the environment occupy a tiny fraction of possibilities. The properties associated with events by rules of thumb physically follow from causal patterns, and not from events.

The value of rules of thumb is in binding together the events, in imposing the structure on the model of the environment perceived in terms of vague categories, in representing the dynamics of underlying physical processes. The value of events is in turn in their usability for identifying and applying further rules of thumb, and in representing the state of the environment.

As the state where a rule of thumb is applicable can also be considered an event, rules of thumb can be reduced to rules of inferring the individual events, given the current information about context in terms of events that are known to be present. Some rules indicate that certain events belong to the current context, some rules indicate that certain events don’t. Acting together, they refine the context, filling in the missing events and removing those included by mistake.
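A minimal Python sketch of rules that add and remove events until the context stops changing is given below. The rule format, the example rules and the name `refine` are assumptions made for illustration, and the loop is capped at `max_rounds` because conflicting rules could otherwise keep undoing each other.

```python
# Each hypothetical rule: (events that must be present,
#                          event the rule is about,
#                          whether that event belongs to the context).
RULES = [
    ({"quacks", "flies"}, "duck", True),
    ({"duck"}, "has feathers", True),
    ({"duck"}, "robot", False),
]


def refine(context, max_rounds=100):
    """Apply the rules repeatedly until no rule changes the context."""
    context = set(context)
    for _ in range(max_rounds):
        changed = False
        for premises, event, belongs in RULES:
            if not premises <= context:
                continue  # the rule is not applicable in this context
            if belongs and event not in context:
                context.add(event)       # fill in a missing event
                changed = True
            elif not belongs and event in context:
                context.discard(event)   # remove an event included by mistake
                changed = True
        if not changed:
            break
    return context


refined = refine({"quacks", "flies", "robot"})
print(sorted(refined))
```

Here the context starts with a mistaken "robot" event and lacks "duck" and "has feathers"; one pass of the rules corrects both.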

Using multiple events in a model makes it possible to compose the descriptions of complex causal patterns from reusable parts. Overlapping events, describing different regions and aspects of the underlying physical process at different levels of granularity, preserve the integrity of the overall model, binding its parts together. Using events that describe the same complex causal pattern on different levels of description makes it possible to capture the general rules by which the pattern operates, and to fine-tune these rules according to specifics encoded in the events of finer granularity.

When different levels of description operate within the same model, some of the rules of thumb can describe the relations between the levels of description of the same structure, rather than interactions between different parts of the structure or the dynamics of the structure at different times. This makes it possible to model the dynamics of the environment from hints about its structure made on different levels, incorporating the general high-level properties and little details alike.


The flow of reality

July 17, 2008

Followup to: Rules of thumb.

If we are to study the mind, the structure that drives the environment towards the target by choosing the actions that lead there, we need to look at the structure of environment, and the ways in which its overall dynamics can be predictable.

The environment develops in time: the past determines the future. Physical laws establish a relation that holds between the physical configurations in the past and in the future. Given an event (a set of configurations), these laws show which configurations are allowed at different times. Literally applying the laws of physics to events is of no use: an object that at first looks like a door, but then grows legs and runs away, is not a physical impossibility. A configuration that does that is included in the event of there being a door-like object, but it’s not actually realized.

Each actual physical configuration descends from the deep causal history. A certain object can only be realized when it is preceded by the process that leads to the creation of that object. The form that a particular event takes in reality is determined by its causal past, and not just by physical restrictions that apply to it.

Events that fit together either arise by coincidence or result from sharing a causal past. Complex dependencies are very unlikely to result from a coincidence, but producing or preserving dependencies also requires a special kind of process that doesn’t degenerate into random noise. Causal patterns not only establish the relationship between the surface properties that are associated with them, but also act as elements in the overall flow of reality, determined by the causal patterns that precede them, and determining the causal patterns that follow.

There are many general kinds of causal patterns that are useful in describing the environment. One-time events result from certain combinations of patterns and dissolve afterwards, casting the ripples of their structure down the stream, into the future. Persistent physical objects are patterns that preserve their structure, generating patterns in the future that hold the same properties as patterns in the past, forming steady streams in the flow. By far the most interesting kind of pattern is an optimization process. Where stability can keep an event from dissolving away, propagating its structure over time, optimization goes a step further and drives the dynamics of the environment towards a state that was never encountered before. The structure of the state that an optimization process targets is implicit in the structure of the process, and as such, it can be much more intricate than the structures that can appear explicitly by coincidence. An optimization process is a knot on the stream of reality that tightens over time, coming ever closer to revealing the hidden structure.

Just as optimization process that arises by accident may be quite unlike the target state, other patterns may also pass through implicit stages, where the causal pattern only exists in a form of the process that eventually converges on it. Persistent patterns, such as living organisms, may pass through quite a metamorphosis before restoring the pattern in original form. Events may interact with each other using stable pathways, establishing robust dependencies even when confronted with the full richness of the real world. Patterns that repeat and persist build themselves on top of the stable processes. The actions that an intelligent agent chooses may seem to oppose its goals, or look random and irrelevant, but a reliable plan adds up to the intended outcome.


Rules of thumb

July 6, 2008

People manage to reason successfully using heuristic rules that describe the behavior of macroscale things and events, without needing to derive that behavior from the fundamental laws of nature, and the world seems to behave in a way that allows it. How does this happen? This is not just a consequence of the objects following the simple laws of physics. A door consists of a tremendous number of atoms, yet when I close it, the door just closes. At a high level of description, it is a very simple object. It doesn’t occasionally grow legs and run away; it simply closes.

When trying to extinguish fire, one can try to pour gasoline on it. The fire gets extinguished by water, which is also a liquid. So the gasoline will work as well, won’t it? This is magical thinking, when one acts on surface similarities, and expects the similar consequences. But for objects consisting of a tremendous amount of underlying elements, what is the chance of surface similarities indicating similar behavior in other respects?

The rules of thumb work because objects that have a similar causal structure get repeated over and over, resulting from a common cause. The similarity in many properties doesn’t just follow from the presence of a few superficial similarities. If I see a flying duck, it might be a robot assassin duck sent by aliens to get me; it is just improbable to the point of being ridiculous. The same cause, the physical process of development from the duck zygote, leads repeatedly to the same system appearing over and over again. By surface similarities I recognize not the other properties that follow from the properties I observed, but a certain causal pattern from the set of causal patterns that exist in my environment and which I learned to recognize in my experience. After the causal pattern is recognized, I can conclude that it has certain other properties.

Examples of repeated patterns in the environment come from many sources. The laws of physics tend to preserve the form of rigid objects, so that an object that was there in the past will remain in the future. The stars and other cosmic objects get formed over and over by the same laws of physics during the cosmological development of the universe. On our planet, biological organisms replicate, and so their forms get repeated many times. They mutate, and so slightly different versions of the same system may share many properties. Human culture propagates ideas and technologies, and so many objects sharing the same properties get constructed for similar purposes. We are surrounded by objects resulting from common causal processes, and so we are used to seeing similar consequences from superficially similar events, even if it’s possible to construct systems with the same surface properties that act in an entirely different way.

Applying gasoline to extinguish fire confuses the recognition of the type of causal structure of an artifact with direct recognition of implied properties. Even though both gasoline and water obey the same underlying laws of physics, the causal processes happening during the chemical interactions involved are very different, and so applying superficially similar actions to them often won’t produce sufficiently similar consequences. When I don’t expect the object that looks like a duck to be a robot assassin disguised as a duck, I rely on the identification of a known causal pattern (a real duck) by a few of its observable properties. If I had evidence (indirect experience) of there being in fact robot assassin ducks, I wouldn’t be so sure.

The rules of thumb can be formed about the relationship of many events occurring in sufficiently similar contexts. Groups of events may imply other events, both in the past and in the future, and due to the universality of laws of nature, rules of thumb tend to be reusable, at least where the expected repertoire of causal structures doesn’t change too much.


Indicator chains

July 2, 2008

Followup to: The semantics of beliefs.

Let’s consider the way information gets propagated in the model of the environment. As before, I use the notion of an event as a restriction on the set of possible states of the environment (both in the past and in the future, and including the states of mind of the agent). For example, among all the states of the world that we could consider, the event of there being a cup on the table at the present time refers to the set of states of the world in which there is in fact a cup on the table at present. Absence of knowledge about the environment corresponds in these terms to absence of knowledge about the events. Imperfect knowledge about an event ranks the states of the world probabilistically: it adds probability to the states that contain the event, and takes it away from the states that don’t.

If we know that two events depend on each other, that one of the events is a good indicator of another, then knowledge about the indicator translates into knowledge about the indicated: information about a property of the environment specific to one place and time translates into information about another property of the environment that holds at a different place and time. Event R is a good indicator of event E if the probability P(E|R) is high, since in this case, if we come to know that R is present, the probability of event E jumps from a possibly low prior probability P(E) to P(E|R).
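To make the jump from P(E) to P(E|R) concrete, here is a small Python sketch in which events are predicates over the states of a toy world. The world itself (rain and a wet street, with invented probabilities) and the helper names `prob` and `cond_prob` are assumptions for illustration only.

```python
from fractions import Fraction

# Hypothetical toy world: each state is (rain, wet_street), with an
# invented probability. The probabilities sum to 1.
WORLD = {
    (True, True): Fraction(18, 100),
    (True, False): Fraction(2, 100),
    (False, True): Fraction(8, 100),
    (False, False): Fraction(72, 100),
}


def prob(event):
    """Probability of an event, given as a predicate over states."""
    return sum(p for state, p in WORLD.items() if event(state))


def cond_prob(event, given):
    """P(event | given): restrict attention to states where `given` holds."""
    return prob(lambda s: event(s) and given(s)) / prob(given)


rain = lambda s: s[0]        # the indicated event E
wet_street = lambda s: s[1]  # the indicator event R

prior = prob(rain)                      # P(E)
posterior = cond_prob(rain, wet_street)  # P(E|R)
print(prior, posterior)
```

In this assumed world, learning that the street is wet raises the probability of rain from 1/5 to 9/13, which is what makes the wet street a usable indicator.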

Indicators can work in any direction in time and space. You can have indicator event that is located in the past relative to the indicated, or in the future; near it spatially, or far away; and some events are not localized in space and time, such as laws of physics or huge persistent objects.

The indicators can form intricate chains: if event A is an indicator of event B, and event B is an indicator of event C, then event A is usually also an indicator of event C. The strength of an indicator chain depends on all its links, and the more links are included in the chain, the weaker it gets, but if all the links are strong enough, the chain will probably hold. This is not always so. For example, the event that Bill looks like a human but lacks one finger is a good indicator of the event that Bill is a human, and the fact that someone is a human is a good indicator that he has 10 fingers, but in this case Bill having lost a finger is a bad indicator of him having 10 fingers. In such cases, the errors produced by using faulty indicator chains can be fixed by other indicator chains. In this example, the indicator chain from Bill having lost a finger to him having 9 fingers fixes the erroneous conclusion that he must have 10 fingers.
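The Bill example can be checked numerically. The Python sketch below uses an invented population (the counts and the "mannequin" category are hypothetical) and compares a naive chain estimate, the product of the two link strengths, with the direct conditional probability. Multiplying links is itself an assumption: it gives a lower bound on P(C|A) only when C depends on A solely through B, which is exactly what fails here.

```python
from fractions import Fraction

# Hypothetical population counts, keyed by (is_human, missing_finger).
COUNTS = {
    (True, False): 980,   # humans with ten fingers
    (True, True): 10,     # humans who lost a finger
    (False, False): 9,    # mannequins with ten fingers
    (False, True): 1,     # mannequins missing a finger
}


def cond_prob(event, given):
    """P(event | given), computed from the counts."""
    num = sum(n for s, n in COUNTS.items() if event(s) and given(s))
    den = sum(n for s, n in COUNTS.items() if given(s))
    return Fraction(num, den)


lacks_finger = lambda s: s[1]      # event A: looks human but lacks a finger
is_human = lambda s: s[0]          # event B
ten_fingers = lambda s: not s[1]   # event C

link_ab = cond_prob(is_human, lacks_finger)    # strong link A -> B
link_bc = cond_prob(ten_fingers, is_human)     # strong link B -> C
naive_chain = link_ab * link_bc                # what the chain suggests
direct = cond_prob(ten_fingers, lacks_finger)  # what actually holds

print(float(link_ab), float(link_bc), float(naive_chain), float(direct))
```

Both links are strong, so the naive estimate is high, yet the direct probability is zero: the information carried by A contradicts C outright, and only a separate chain from the missing finger to the finger count repairs the conclusion.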

Indicator chains can travel back and forth in time, and across great distances. Astronomers observe the light from a supernova, which indicates that in the past a great explosion happened in a particular place, which in turn is an indicator of the light from this explosion reaching another star a thousand years in the future, in the direction opposite from us, an event that is not even in our light cone. A person reaching out for a cup of tea on the table is an indicator of the cup of tea being on the table a moment ago, and an indicator of that person drinking the tea afterwards.