A Commentary on the BBS target article "The neural basis of predicate-argument structure" by James R. Hurford

Word Counts:

Abstract: 61

Main Text: 1236

References: 116

Total: 1413

Predicates: External Description or Neural Reality?

Michael A. Arbib

Computer Science Department, Neuroscience Program, and USC Brain Project

University of Southern California

Los Angeles, CA 90089-2520

arbib@pollux.usc.edu

http://www-hbp.usc.edu/

Abstract

Hurford argues that propositions of the form PREDICATE(x) represent conceptual structures which preexist language and which can be explicated in terms of neural structure. I disagree, arguing that such predicates are descriptions of limited aspects of brain function, not available as representations in the brain to be exploited in the frog or monkey brain and turned into language in the human.

Main Text

A numbered paragraph is based on the corresponding Section of the target article; unnumbered paragraphs convey my comments.

1.2. The basic ontological elements are whole events or situations and the participants of these events. The event described by A man bites a dog could be represented as

∃e, x, y, bite(e), man(x), dog(y), agent(x), patient(y) (1)

I don't think this works. We need to replace agent(x) by agent(x, e) to indicate in which event x plays the stipulated role, and similarly patient(y) by patient(y, e). For Hurford, the discussion of episodes is an aside to his concentration on 1-place predicates, but I suggest that the crux for a prelinguistic representation is the event and the "action-object frame" A(x,y) (agent x is doing A to object y) and its variations. Rizzolatti and Arbib (1998) examined

whether or not a ‘prelinguistic grammar’ can be assigned to the control and observation of actions. If this is so, the notion that evolution could yield a language system ‘atop’ of the action system becomes much more plausible. (p.191)

This talk of a ‘prelinguistic grammar’ was not meant to imply that gestures may be a primitive form of grammar, for our approach was semantic rather than syntactic:

… we might say that the firing of ‘mirror’ F5 neurons is part of the code for a declarative case structure, for example,

Declaration: grasp-A(Luigi, raisin)

which is a special case of grasp-A(agent, object), where grasp-A is a specific kind of grasp, applied to the raisin (the object) by Luigi (the agent). … this is an ‘action description’, not a linguistic representation. (p.192. Italics added.)

Being able to grasp a raisin is different from being able to say “I am grasping a raisin”, and the neural mechanisms that underlie the doing and the saying are different. However, the case structure lets us see a commonality in the underlying representations, thus helping us understand how a mirror system for grasping might provide an evolutionary core for the development of brain mechanisms that support language.
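The earlier point about (1), that a role needs an event argument, can be illustrated with a minimal sketch. It assumes a toy scene with two biting events, e1 and e2. The function names `roles_unary` and `agents_of` and the tuple encoding are hypothetical conveniences, not anything proposed by Hurford or by Rizzolatti and Arbib.

```python
# Each fact is (role, participant, event), i.e. agent(x, e) or patient(y, e).
scene = {
    ("agent", "man", "e1"), ("patient", "dog", "e1"),  # e1: man bites dog
    ("agent", "dog", "e2"), ("patient", "man", "e2"),  # e2: dog bites man
}

# The same participants with the roles exchanged between the two events.
swapped = {
    ("agent", "dog", "e1"), ("patient", "man", "e1"),
    ("agent", "man", "e2"), ("patient", "dog", "e2"),
}


def roles_unary(facts):
    """Drop the event argument, leaving unary roles as in agent(x), patient(y)."""
    return {(role, who) for (role, who, _event) in facts}


def agents_of(facts, event):
    """Return the participants that play the agent role in the given event."""
    return {who for (role, who, e) in facts if role == "agent" and e == event}


# With unary roles the two scenes collapse into the same description;
# with event-indexed roles they stay distinct.
unary_cannot_tell = roles_unary(scene) == roles_unary(swapped)
indexed_can_tell = scene != swapped
```

Under these assumptions the unary description cannot say who bit whom in which event, whereas `agents_of(scene, "e1")` picks out the man alone.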

2. Representations of the form PREDICATE(x) are taken to stand for the mental events involved when a human attends to an object in the world and classifies it perceptually as satisfying the predicate in question.

More specifically, the notion is that a person may attend to a limited number of objects, and x then stands as an index for one of those objects. Thus a scene might be represented by a conjunction

P1(X1) & P2(X2) & P3(X3) & P4(X4) (2)

where each Xj indexes some region of the scene and Pj(Xj) indicates that the object at that location possesses property Pj. This leads to another point which (I think) weakens Hurford’s critique of Rizzolatti and Arbib (1998):

4. An example of a scene-description might be

APE(x) & STICK(y) & MOUND(z) & HOLE(w) & IN(w,z) & PUT(x,y,w) (3)

translating to An ape puts a stick into a hole in a mound.

The inclusion of PUT(x,y,w) in (3) reinforces the point that Hurford’s focus on unary predicates does not do justice to describing animals which perceive to act, with acts dependent on relations between objects. The key question remains: “How do we go from predicates which we may use to describe internal behavior to neural representations which themselves abstract from the activity levels and parameterizations of schemas and their underlying neural networks, and instead provide abstractions which may in turn be refined to yield the cognitive and semantic forms which drive the production and perception of the phonological forms of language?”

In discussing the possible neural basis of (2), Hurford (§4) cites papers from 1984 onward. However, I would claim some priority in this area with the slide-box metaphor (Didday & Arbib, 1971; Arbib, 1972). In the days before computer graphics, movie cartoons were drawn using cels, which I there called slides. Since the cartoon might run for seconds without the background changing, the background could be drawn just once. In the middle ground, there might be a tree about which nothing changes for a while except its position relative to the background; it could thus be drawn on a separate slide and repositioned as needed. In the foreground, key details might change for each frame. The slides could then be photographed, appropriately positioned in a slide-box, for each frame, with only a few parameter changes (including minimal redrawing) required between successive frames. The slide-box metaphor suggested that a similar strategy might be used in the brain, with long-term memory (LTM) corresponding to a "slide file" and working or short-term memory (STM) corresponding to the "slide-box", which in the brain corresponds to a mass of neural tissue linking sensory and motor systems. The act of perception was compared to using sensory information to update slides already in the slide-box and to retrieve other slides as appropriate, experimenting to determine whether a newly retrieved slide fits sensory input "better" than one currently in the slide-box. A crucial point was that retrieval of a slide provided access to a wealth of information about the object it represented, including appropriate courses of action.
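The update-or-retrieve cycle of the slide-box metaphor can be caricatured in a few lines. This is only a sketch under loud assumptions: the feature dictionaries, the match-counting `fit` score, and the names `Slide`, `SLIDE_FILE` and `perceive` are all hypothetical, and nothing here reproduces the model of Didday and Arbib (1971).

```python
from dataclasses import dataclass


@dataclass
class Slide:
    name: str
    features: dict          # what the object is expected to look like (toy encoding)
    actions: list           # courses of action made available by retrieving the slide
    position: tuple = (0, 0)


# Long-term memory as a "slide file" (hypothetical contents).
SLIDE_FILE = {
    "tree": Slide("tree", {"shape": "trunk", "moving": False}, ["detour"]),
    "dog": Slide("dog", {"shape": "quadruped", "moving": True}, ["approach", "avoid"]),
}


def fit(slide, observed):
    """Toy fit score: the number of observed features the slide matches."""
    return sum(1 for key, value in observed.items() if slide.features.get(key) == value)


def perceive(slide_box, slide_file, region, observed, position):
    """Update the slide held for a region, or swap in a better-fitting one."""
    current = slide_box.get(region)
    best = max(slide_file.values(), key=lambda s: fit(s, observed))
    if current is None or fit(best, observed) > fit(current, observed):
        # Retrieve a copy from the slide file into short-term memory.
        current = Slide(best.name, dict(best.features), list(best.actions))
        slide_box[region] = current
    current.position = position  # cheap parameter update, no redrawing
    return current


slide_box = {}  # short-term memory as the "slide-box"
first = perceive(slide_box, SLIDE_FILE, "middle", {"shape": "trunk"}, (3, 0))
moved = perceive(slide_box, SLIDE_FILE, "middle", {"shape": "trunk"}, (4, 0))
replaced = perceive(slide_box, SLIDE_FILE, "middle",
                    {"shape": "quadruped", "moving": True}, (4, 0))
```

In this toy run the second call only repositions the slide already in the box, while the third replaces it because a different slide fits the input better, and the retrieved slide brings its associated actions with it.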

I cite this background to stress that (3) is a pale approximation of the slide-box metaphor, which is in turn a pale approximation to the multi-level modeling methodology that unifies the functional schemas of schema theory (Arbib, 1981; Arbib et al., 1998) with the dynamics of detailed neural networks. For example, one schema in the visuomotor system of a frog (Arbib, 1987) might correspond to a pattern of neural activity signaling the likelihood of a small moving object in a region x1 of the visual field, another schema might signal the likelihood that a large object is moving with velocity v in region x2, while a third might indicate the likelihood that a barrier of extent w is located around region x3. Thus, rather than being predicates that return 0 or 1, these schemas are functions or likelihood distributions over a multi-dimensional parameter space. Moreover, the frog’s actual course of action (the “choice” of motor schema to guide action, and the setting of control parameters for that action) cannot be directly inferred from these schemas, but rather depends on the interplay of the activity of their neural instantiations as they play upon the brainstem, determining whether the frog will snap at its apparent prey or jump to escape an apparent predator (modulating its direction of escape according to the perceived trajectory of the predator), and whether or not it will attempt to detour around a barrier in doing so.
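The contrast between a 0/1 predicate and a graded schema can be sketched as follows. Everything in the sketch is an assumption made for illustration: the Gaussian shape, the one-dimensional position parameter, the thresholds, and the names `bump`, `as_predicate` and `choose_action`. In particular, `choose_action` is a toy stand-in for the interplay of schema activity, not a model of the frog.

```python
import math


def bump(center, width):
    """Return a graded activity function peaking at `center` (toy Gaussian)."""
    def activity(x):
        return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))
    return activity


# Hypothetical schemas over a single parameter: position in the visual field.
small_moving_object = bump(center=10.0, width=5.0)    # prey-like
large_moving_object = bump(center=-30.0, width=15.0)  # predator-like


def as_predicate(schema, x, threshold=0.5):
    """Collapse graded activity into the 0/1 form PREDICATE(x)."""
    return schema(x) >= threshold


def choose_action(prey_level, predator_level, floor=0.3):
    """Toy competition between activity levels (an assumed rule, not a frog model)."""
    if predator_level >= floor and predator_level > prey_level:
        return "jump"
    if prey_level >= floor:
        return "snap"
    return "sit"


# Two clearly different activity levels receive the same truth value.
strong = small_moving_object(10.0)
weak = small_moving_object(15.0)
same_truth_value = as_predicate(small_moving_object, 10.0) == as_predicate(small_moving_object, 15.0)
```

The point of the sketch is that `strong` and `weak` differ while the thresholded predicate treats them alike, and that the action chosen depends on the levels jointly rather than on any single predicate.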

In summary, (2) is a fine answer to the question “What objects does the animal see, and where does it see them?”, and Hurford provides an interesting analysis of relevant neural data. Moreover, I think it useful to debate whether representations in the ventral stream are "more prelinguistic" than those in the dorsal stream. But I answer the question of my title, “Predicates: External Description or Neural Reality?”, by saying that predicates like (2) are, in general, our external descriptions, not the animal's neural reality. It is a highly evolved skill of humans to be able to name an indicated object, and I suggest that PREDICATE(x) is best seen as a description of human naming behavior, rather than as a conceptual structure preexisting language that is part of the causality of neural circuits.

References

New References

Arbib, M.A. (1972) The Metaphorical Brain: An Introduction to Cybernetics as Artificial Intelligence and Brain Theory, New York: Wiley-Interscience.

Arbib, M.A. (1981) Perceptual Structures and Distributed Motor Control, in Handbook of Physiology, Section 2: The Nervous System, Vol. II, Motor Control, Part 1 (V. B. Brooks, Ed.), American Physiological Society, pp.1449-1480.

Arbib, M.A. (1987) Levels of Modeling of Visually Guided Behavior (with peer commentary and author's response), Behavioral and Brain Sciences, 10:407-465.

Arbib, M.A., Érdi, P. and Szentágothai, J. (1998) Neural Organization: Structure, Function, and Dynamics, Cambridge, MA: The MIT Press.

Didday, R.L., and Arbib, M.A. (1971) The Organization of Action-Oriented Memory for a Perceiving System I. The Basic Model, Journal of Cybernetics, 1:3-18.

Reference Repeated from Target Article

index model. Cognition 32:65-97.

Rizzolatti, G., and Arbib, M.A. (1998) Language within our grasp. Trends in Neurosciences 21(5):188-194.

Acknowledgements

Preparation of this Commentary was supported in part by a Fellowship from the Center for Interdisciplinary Research of the University of Southern California.