Followup to: Levels of representation.
Consider a binary event detector that, say, detects the presence of tigers in a picture. This detector defines a certain set of pictures for which it activates. If it is a simple preliminary detector constructed from a general description of tigers, it can be significantly improved. But what constitutes an improvement? A change to the detector could make it a better tiger-detector, a worse tiger-detector, or even turn it into a cloud-detector.
The original detector is a vague question, and the improved detector is an answer. The initial outline makes it possible to identify the clusters of known events that fit the description, to change the detector so that it includes these clusters but not the stray peripheral events that slip through the boundary, and to plan for extrapolation of these clusters. An event detector is constructed to be applied in the future, so it needs to recognize not only causal patterns that were never observed before, but also causal patterns that never existed but will appear in the future. In a way, learning is a self-improvement action that an agent applies to its future self in order to process certain events better.
Predicting subsequent events requires a probabilistic model of the environment, and relying on the few exemplars that a newly fledged detector has managed to observe is not sufficient. If a new detector is expressed in terms of a few existing detectors (right from the vague-question stage), rather than as a classifier applied to the whole input domain, the problem becomes much simpler. Existing detectors already have a good idea of the probability distribution of their states, so the probability distribution for the new detector roughly derives from theirs, modulo dependency. The form of the new detector can then be adjusted, guided by this probability distribution and by facts about dependencies observed from the few exemplars.
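To make "roughly derives from them, modulo dependency" concrete, here is a minimal sketch. It assumes, purely for illustration, that the new detector is the conjunction of two existing binary detectors whose activation probabilities are already known. The function name and the choice of conjunction are my own assumptions, not part of the argument above. Without any knowledge of dependency, the marginals alone pin the new detector's probability to an interval (the Fréchet bounds), and independence gives a point estimate inside it; the few observed exemplars would then indicate where in that interval the truth lies.

```python
def conjunction_probability(p_a, p_b):
    """Estimate P(A and B) from the marginals of two existing detectors.

    Hypothetical helper: returns (lower, independent, upper), where
    lower and upper are the Frechet bounds that hold under any
    dependency, and independent is the estimate assuming A and B
    are independent.
    """
    if not (0.0 <= p_a <= 1.0 and 0.0 <= p_b <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    lower = max(0.0, p_a + p_b - 1.0)   # maximal anti-correlation
    independent = p_a * p_b             # no dependency at all
    upper = min(p_a, p_b)               # one event contains the other
    return lower, independent, upper


# Example: a "striped" detector active half the time and a
# "large cat" detector active 40% of the time.
low, mid, high = conjunction_probability(0.5, 0.4)
print(low, mid, high)
```

The width of the interval is exactly the "modulo dependency" part: the marginals come for free from the existing detectors, and only the dependency has to be learned from exemplars.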
Intermediate events allow tractable inference: their purpose is to implement computational steps that follow the natural structure of the modeled environment. An event detector needs, at the least, not to be redundant (not to repeat another available detector) and to be probable enough to assume more than one state during some reasonable timeframe (if a detector isn’t ever expected to activate, there is no point in keeping it around). When a detector supports multiple states, many of these states need to hold sufficient probability mass. Once the original form of a new event detector has distributed the probability mass among its states, based on knowledge about the detectors from which it is constructed, one of the pressures for refining the new detector is to shift the boundaries of its states so as to even out the distribution of probability (or, at least, keep it within limits).
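The two requirements and the balancing pressure can be sketched as follows. This is only an illustration under assumptions of my own: the detector is taken to produce a numeric score that is cut into states by thresholds, redundancy is measured crudely as the rate of disagreement between two detectors on the same events, and all function names are hypothetical.

```python
from collections import Counter


def state_masses(states):
    """Empirical probability mass of each state in a sequence of observations."""
    counts = Counter(states)
    total = len(states)
    return {state: count / total for state, count in counts.items()}


def is_redundant(states_a, states_b, tolerance=0.02):
    """Crude redundancy check: two detectors that disagree on at most
    `tolerance` of the same events are treated as duplicates.
    (A detector that is the exact negation of another carries the same
    information too; this simple check does not catch that case.)"""
    disagreements = sum(a != b for a, b in zip(states_a, states_b))
    return disagreements / len(states_a) <= tolerance


def balanced_thresholds(scores, n_states):
    """Place state boundaries at empirical quantiles of the observed
    scores, so that each state holds roughly equal probability mass."""
    ordered = sorted(scores)
    return [ordered[(k * len(ordered)) // n_states] for k in range(1, n_states)]


def assign_state(score, thresholds):
    """Index of the state a score falls into (number of boundaries at or below it)."""
    return sum(score >= t for t in thresholds)


# Example: 100 observed scores cut into 4 states.
scores = list(range(100))
thresholds = balanced_thresholds(scores, 4)
masses = state_masses([assign_state(s, thresholds) for s in scores])
print(thresholds, masses)
```

Quantile boundaries are just one way to even out the mass; the point is only that the boundaries are moved in response to the observed distribution rather than fixed in advance.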
The form that events tend to assume depends on many aspects of the cognitive algorithm, and only at a crude level do they correspond to natural events of the environment. Within the joint detectors of natural events, the factoring into individual detectors may look rather unnatural, like the set of floating-point numbers that have a “zero” at the twentieth bit in memory. The high-level dynamic of events repeats the structure of the environment, but the low-level dynamic of individual detectors may look different. The low-level dynamic needs to be specifically designed to implement the required high-level dynamic.
Posted by Vladimir Nesov