So far, I haven’t described in any detail the arrangement of events or the structure of the environment as depicted by the representation; the discussion has centered on relations between events in general. In the next several posts, I’ll explore a way of representing more concrete structural descriptions in the form of events and rules of thumb. This will add another perspective on the process of inference, identifying correlates of representing multiple instances of an object, recurring properties, the process of creating new elements of representation in response to observing new variants of known objects, access to properties of objects in a scene, and so on. This perspective will also suggest a more concrete form of representation, setting the stage for a closer investigation of possible inference algorithms.
Frame representation is an expressive method for capturing the structure of a static scene: its components, the relations between them, their properties, and generalizations. A scene in frame representation is described by frames, each of which contains a certain set of slots. Frames represent entities or classes of entities, and each slot can be filled by one of multiple possible values, describing properties of these entities or relations between them.
Let the scene within the focus of attention be explicitly presented in the form of an undirected graph with labeled nodes and unlabeled edges. This form is sufficient to capture a frame representation of a scene, with nodes carrying reserved labels added to represent slots, frames, directed edges, and so on. I assume that the scene is presented in a fairly loose frame-like form, with restrictions on the representation introduced as necessary.
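To make the encoding concrete, here is a minimal Python sketch of one way a frame with slots could be laid out on an undirected graph with labeled nodes and unlabeled edges. The class `SceneGraph`, the helpers `add_frame`, `fill_slot` and `slot_value`, and the reserved labels `FRAME`, `SLOT`, `OWNER` and `VALUE` are all my own illustrative assumptions, not a fixed format; the two-sided `OWNER`/`VALUE` nodes are just one possible way to recover direction from undirected edges.

```python
from itertools import count


class SceneGraph:
    """Undirected graph with labeled nodes and unlabeled edges."""

    def __init__(self):
        self.labels = {}   # node id -> label
        self.adj = {}      # node id -> set of adjacent node ids
        self._ids = count()

    def add_node(self, label):
        node = next(self._ids)
        self.labels[node] = label
        self.adj[node] = set()
        return node

    def add_edge(self, a, b):
        # Edges carry neither a label nor a direction.
        self.adj[a].add(b)
        self.adj[b].add(a)

    def neighbours(self, node, label):
        return [n for n in sorted(self.adj[node]) if self.labels[n] == label]


def add_frame(g, name):
    """A frame is a reserved FRAME node linked to a node holding its name."""
    frame = g.add_node("FRAME")
    g.add_edge(frame, g.add_node(name))
    return frame


def fill_slot(g, frame, slot_name, value):
    """Lay out the chain frame - OWNER - SLOT - VALUE - value.

    The reserved OWNER and VALUE nodes distinguish the two ends of the
    slot, which is how direction is encoded without directed edges.
    """
    slot = g.add_node("SLOT")
    g.add_edge(slot, g.add_node(slot_name))
    owner, val = g.add_node("OWNER"), g.add_node("VALUE")
    for a, b in [(frame, owner), (owner, slot), (slot, val), (val, value)]:
        g.add_edge(a, b)
    return slot


def slot_value(g, frame, slot_name):
    """Return the node filling the named slot of `frame`, or None."""
    for owner in g.neighbours(frame, "OWNER"):
        for slot in g.neighbours(owner, "SLOT"):
            if g.neighbours(slot, slot_name):
                (val,) = g.neighbours(slot, "VALUE")
                return next(n for n in g.adj[val] if n != slot)
    return None


# Hypothetical scene: a red cup standing on a table.
g = SceneGraph()
cup, table = add_frame(g, "cup"), add_frame(g, "table")
fill_slot(g, cup, "on", table)
fill_slot(g, cup, "color", g.add_node("red"))
```

In this sketch the relation "on" is read from the cup's side only: `slot_value(g, cup, "on")` finds the table frame, while asking the table for its "on" slot finds nothing, even though every edge in the graph is undirected.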
This graph is not the knowledge representation itself, but a representation of a scene that the knowledge representation needs to capture. This is a design requirement for the cognitive algorithm: it needs to be able to represent a scene with about this level of expressive power. Additionally, it needs to be able to perform various operations: inference, attention control, learning, reconstruction of the scene from low-level input, and so on.
Structural elements of a scene in this representation are described by subgraphs. The graph for the overall scene is composed of many such subgraphs linked together, specifying each other’s properties and relations. Setting a property to one of several alternative values is reflected by attaching one of several alternative subgraphs. A simple binary property can be represented by a single node, while a complex property that has internal structure or includes relations to other objects is represented by a bigger subgraph. For example, the form of an object can be described by a graph of its 3D surface mesh.
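The idea of attaching one of several alternative subgraphs can be sketched as follows. This is only an illustration under my own assumptions: the dictionary layout and the helpers `new_graph`, `add_node`, `add_edge` and `attach` are hypothetical, self-loops are not supported, and a single triangle stands in for a real surface mesh.

```python
def new_graph():
    # Labels are keyed by node id; edges are unordered pairs of node ids.
    return {"labels": {}, "edges": set()}


def add_node(g, label):
    node = len(g["labels"])
    g["labels"][node] = label
    return node


def add_edge(g, a, b):
    g["edges"].add(frozenset((a, b)))


def attach(scene, anchor, sub, sub_root):
    """Copy subgraph `sub` into `scene` and link its root node to `anchor`."""
    mapping = {n: add_node(scene, label) for n, label in sub["labels"].items()}
    for edge in sub["edges"]:
        a, b = tuple(edge)
        add_edge(scene, mapping[a], mapping[b])
    add_edge(scene, anchor, mapping[sub_root])
    return mapping[sub_root]


# A simple binary property: a single node.
fragile = new_graph()
fragile_root = add_node(fragile, "fragile")

# A complex property with internal structure: a one-triangle "mesh".
mesh = new_graph()
mesh_root = add_node(mesh, "mesh")
vertices = [add_node(mesh, "vertex") for _ in range(3)]
for i in range(3):
    add_edge(mesh, mesh_root, vertices[i])
    add_edge(mesh, vertices[i], vertices[(i + 1) % 3])

# Alternative values for the form of an object: exactly one gets attached.
point = new_graph()
point_root = add_node(point, "point")
form_alternatives = {"point": (point, point_root), "triangle": (mesh, mesh_root)}

scene = new_graph()
cup = add_node(scene, "cup")
attach(scene, cup, fragile, fragile_root)
attach(scene, cup, *form_alternatives["triangle"])
```

Choosing `"point"` instead of `"triangle"` would attach a different subgraph at the same place, which is all that "setting the property" amounts to in this picture.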
A scene graph represents a focus of attention, highlighting a particular view of the environment at particular levels of description. Different scene graphs can capture different parts of the environment, including its state at different times. The same object can be represented at overlapping levels of granularity or considered from different perspectives. Relations between properties captured by different scenes can be made a central element of another scene. Thus, a scene graph is an open-ended representation that can be extended indefinitely by elaborating particular details, adding new entities, establishing relations with additional entities, and shifting to different levels of description. An object represented by a single node can be expanded to a subgraph having many properties, and conversely a subgraph can be collapsed to a one-node summary.
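Collapsing a subgraph to a one-node summary can be sketched in a few lines. The function `collapse` and the example scene are hypothetical, and the sketch assumes the detailed graph is simply kept around, so that expanding the summary again amounts to switching back to the detailed view.

```python
def collapse(labels, edges, members, summary_label):
    """Return a coarser view in which the nodes in `members` become one node.

    Edges crossing the boundary are re-attached to the summary node;
    edges internal to the collapsed subgraph disappear from this view.
    """
    summary = max(labels) + 1
    new_labels = {n: label for n, label in labels.items() if n not in members}
    new_labels[summary] = summary_label
    new_edges = set()
    for edge in edges:
        a, b = tuple(edge)
        a = summary if a in members else a
        b = summary if b in members else b
        if a != b:
            new_edges.add(frozenset((a, b)))
    return new_labels, new_edges, summary


# Detailed view: a table linked to a cup that has a handle and a color.
labels = {0: "table", 1: "cup", 2: "handle", 3: "red"}
edges = {frozenset(e) for e in [(0, 1), (1, 2), (1, 3)]}

# Coarse view: the cup and its properties summarized by a single node.
coarse_labels, coarse_edges, cup_node = collapse(labels, edges, {1, 2, 3}, "cup")
```

The original `labels` and `edges` are left untouched, so both levels of description of the same object remain available side by side.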