In their Interface Theory of Perception, Hoffman, Singh, and Prakash argue for the role of evolution in human perception. This claim is undoubtedly true, and few modern perceptual scientists would disagree with it; but neither is it particularly novel. The authors then follow this claim to its logical extreme, ending in solipsism (the philosophical position that nothing but the existence of one’s own mind can be known; while ultimately correct, such a position is unproductive, and in the case of the authors’ thesis it also proves self-contradictory). There they arrive at a theory with which, once again, no one could disagree. But then, neither could they agree, nor hold any opinion at all, since solipsism, which ultimately seems to be the fundamental tenet of Interface Theory, is fundamentally unfalsifiable.
As a fan of both evolutionary theory and the necessary fallibility of vision, I approached this paper with optimism. But several aspects of the authors’ arguments are problematic: (1) many of their arguments attack straw-man alternative accounts of perception; (2) the authors ignore prior work that seems to take theirs as a special case; (3) the authors’ review of evolutionary and genetic-algorithm models confuses perception with choice/behavior; and perhaps most importantly, (4) their argument, taken seriously, degenerates into solipsism, a problem that appears early and raises its head higher and higher as the paper goes on.
Many of the authors’ arguments are attacks on straw men: The authors take pains to define various “veridical” approaches to perception (e.g., Definitions 3–5), using them as a foil for their own theory. But I do not know any perceptual scientist who takes any “veridical” (isomorph-based) account seriously. Even most Bayesians, who assume that the goal of perception is to infer the most likely world given an image, would not assume that the resulting percept is veridical, only that it is likely to be correct (for some definition of “correct” that is much weaker than an isomorph).
Prior work takes Interface Theory as a special case: The authors’ Definition 6, a “hybrid realist strategy”, resembles what Holland and colleagues, in their 1986 book, would call a quasi-homomorphism, or q-morph. Next, they present Interface Theory in Definition 7, arguing that Interface Theory does not propose that perceptions, X, are a mapping of the world, W. The problem with this definition is that “… the probabilities of perceptions are systematically related to probabilities of events in W” is the definition of a mapping (i.e., an isomorphism, homomorphism, or q-morph): Interface Theory proposes that perception produces a goal-directed q-morph of the world, much as Holland and colleagues proposed in 1986.
Confusing perception with behavior: A more serious limitation of the argument is that it confuses perception with behavior. This error is evident in the simulations the authors cite. Perception is equated with behavior in the sense that what the simulated organism sees is equivalent to what it does. For example, in their first evolutionary game (Figures 2 and 3), their “veridical” organism (Figure 2) and their “interface” organism (Figure 3) are both granted four perceptual categories that map onto different quantities of some resource, and both organisms’ behavior strictly follows this perception.
The first problem with this simulation is that the “veridical” organism is saddled with a useless perceptual system: the two central categories of the “veridical” organism’s perception straddle the optimal level of the organism’s payoff function, so all the relevant action in its environment occurs within a range it cannot discriminate (see Figure 2). By contrast, the “interface organism” has a perceptual function with a category straddling the peak of the payoff function (Figure 3). Not surprisingly, the “interface” organism outperforms the “veridical” organism. But if we (or evolution) were to move the “veridical” organism’s perceptual categories half a bar’s width to the right, it would likely perform much better. For example, if the “veridical” organism’s “yellow” perception straddled resource quantity 50 (as the “interface” organism’s “blue” perception does), then the “veridical” organism could guide its behavior as optimally as the “interface” organism does. But as illustrated in their Figure 2 (reproduced below), with yellow and green straddling 90% or more of the integral of the payoff function, the “veridical” organism simply cannot see the properties that matter in its environment. Such an organism can hardly be described as “veridical”.
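The tuning point can be made concrete with a minimal sketch. The Gaussian payoff function, its peak at quantity 50, and the exact category boundaries below are my own illustrative assumptions, not the article’s parameters; the logic is simply that an organism which always approaches the best-looking category earns, in expectation, the mean payoff within that category, so shifting a monotone category scheme until one bin straddles the peak closes most of the gap.

```python
import math

def payoff(q):
    # Hypothetical unimodal payoff peaked at quantity 50; the Gaussian
    # shape and width are my assumptions, not the article's parameters.
    return math.exp(-((q - 50.0) ** 2) / (2 * 20.0 ** 2))

def mean_payoff(lo, hi, n=200):
    # Mean payoff over one perceptual category (midpoint sampling).
    qs = (lo + (hi - lo) * (i + 0.5) / n for i in range(n))
    return sum(payoff(q) for q in qs) / n

def best_category_payoff(categories):
    # An organism that always approaches the best-looking category
    # earns, in expectation, the mean payoff within that category.
    return max(mean_payoff(lo, hi) for lo, hi in categories)

# Quartile categories whose central boundary sits on the peak, as with
# the article's "veridical" organism...
original = [(0, 25), (25, 50), (50, 75), (75, 100)]
# ...versus the same monotone scheme shifted half a bin's width, so
# that one category straddles the peak.
shifted = [(0, 12.5), (12.5, 37.5), (37.5, 62.5), (62.5, 87.5), (87.5, 100)]

print(best_category_payoff(original))   # about 0.79
print(best_category_payoff(shifted))    # about 0.94
```

On these assumptions, the shifted-but-still-monotone organism recovers nearly all of the “interface” organism’s advantage, which is the point: the handicap lies in where the category boundaries fall, not in the monotone (“veridical”) character of the mapping.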
The deeper problem is the authors’ confusing perception with action: They assume that what their simulated organism sees is identical to what it does. This may seem a harmless simplification, but it has sweeping implications for their argument. Imagine that we separate perception from action by allowing these organisms to map their perceptions to their behaviors flexibly (as most animals do). Play the evolution game, not just on the organisms’ perceptions, but also on the mapping from perception to action. Suddenly, the “veridical” organism would do much better (especially if its perceptual categories were tuned differently). Now let the organisms learn the mapping from perception to action and the “veridical” organism will do better still: Place the animals in a world where the payoff contingencies might change (today the best quantity of the resource is 50; tomorrow, it’s 75) and the “veridical” organism will eat the “interface” organism’s lunch.
The “veridical” organism can adapt because it sees the world, not “correctly” (with its four perceptual categories), but in terms of a stronger q-morph than the “interface” organism: Where a quantity of resource in the world varies monotonically, so does the “veridical” animal’s perception (modulo the coarseness of its categories). But whereas the “interface” animal was ideally adapted to the world of Wednesday (with a perceptual system tuned to say “right amount of resource!”), this is Thursday and the “right amount” (75) is perceptually indistinguishable from the “woefully wrong” amount of resource (25).
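The claim that the “veridical” organism carries a stronger q-morph can be given a toy measure. In the sketch below (the quantity grid and category boundaries are my own illustrative choices), I count how many “more resource than” orderings in the world survive into each organism’s percepts: a score of 1.0 means the mapping is a homomorphism on the order structure, and a score near 1.0 a q-morph of it.

```python
from itertools import combinations

# World states: resource quantities (my illustrative grid).
quantities = range(0, 101, 5)

def veridical(q):
    # Monotone coarse categories: more resource always looks like more.
    return min(3, q // 25)

def interface(q):
    # Tuned to "near the payoff peak at 50": 25 and 75 look alike.
    return 1 if 40 <= q <= 60 else 0

def order_preserved(percept):
    # Fraction of "more resource than" orderings in the world that
    # survive into the percepts (modulo the coarseness of the
    # categories, ties are allowed).
    pairs = list(combinations(quantities, 2))
    kept = sum(percept(a) <= percept(b) for a, b in pairs)
    return kept / len(pairs)

print(order_preserved(veridical))   # 1.0: every ordering preserved
print(order_preserved(interface))   # about 0.81: many orderings inverted
```

The “interface” mapping inverts every pair in which a near-peak quantity is compared with a larger off-peak one, which is exactly why Thursday’s 75 is indistinguishable from Wednesday’s “woefully wrong” 25.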
In response to this argument, the authors of the target article might object, “That’s not playing fair! You can’t just throw a separation between perception and behavior into the mix and declare victory.” But evolution does not “play fair” and it did throw a distinction between perception and action into the mix. My argument is that it is better to evolve an animal who can see that 100 gallons of water is more than 50 gallons of water but nonetheless learn to choose 50 over 100 when appropriate (a “veridical” organism with learning) than to evolve an organism that sees 50 gallons of water as more than 100 gallons, which it sees as equivalent to no water at all (an “interface” organism).
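The Wednesday/Thursday argument can be checked directly, without any claims about learning dynamics, by asking what each organism could earn after fully relearning its percept-to-action mapping under a shifted payoff. The triangular payoff, the peak locations, and the category tunings below are my assumptions, chosen to mirror the argument above.

```python
from collections import defaultdict

def payoff(q, peak):
    # Triangular payoff peaked at the currently optimal quantity; the
    # shape and peak values are my assumptions.
    return max(0.0, 1.0 - abs(q - peak) / 50.0)

def veridical(q):
    return min(3, q // 25)              # monotone coarse categories

def interface(q):
    return 1 if 40 <= q <= 60 else 0    # tuned to Wednesday's peak of 50

def best_relearned_payoff(percept, peak):
    # Expected payoff once the organism has relearned its percept-to-
    # action mapping: it approaches whichever percept category pays
    # best under the current peak.
    bins = defaultdict(list)
    for q in range(101):
        bins[percept(q)].append(payoff(q, peak))
    return max(sum(v) / len(v) for v in bins.values())

# Wednesday the best quantity is 50; Thursday it moves to 75.
for peak in (50, 75):
    print(peak,
          round(best_relearned_payoff(veridical, peak), 2),
          round(best_relearned_payoff(interface, peak), 2))
```

On Thursday the “interface” organism’s best achievable payoff collapses (its prized category now contains only mediocre quantities, and no relearning can fix that), while the “veridical” organism’s barely moves: its monotone categories still separate 75 from 25, so only the percept-to-action mapping needs updating.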
The authors’ argument degenerates into solipsism: None of the preceding strongly argues that the authors’ theory is wrong. And at the end of the day, it might be right. But if it is, then we’ll never know. As the authors say on p. 41, (and even more strongly thereafter), “So what justification do we have to believe that the representational spaces employed by human perceptual systems correspond to objective reality?” The answer they suggest is None. In the authors’ defense, in this sense at least, their proposal diverges from simply being a special case of Holland et al.’s (1986) q-morphs. They are going Full Monty and claiming an even weaker linkage between what’s in the world and what’s in the head (except insofar as the icons in the head are useful for dealing with the world’s OS).
The problem with this version of their proposal is that it amounts to solipsism: The idea that we have no idea of what’s actually out in the world. (Lest you think I’m exaggerating, re-read their symmetry proof.) At some level, this claim is correct: We have no awareness of the very small (molecules, atoms, electrons, quarks, etc.) or the very large (galaxies, galaxy clusters, parallel universes, etc.). But this is not their claim. Their claim is that we have no connection to the mid-sized stuff: Tables, chairs, cars (their example), each other, etc. are not an illusion exactly, but, well, not really there. Or rather, if they really are there, then they aren’t what they appear to be. Instead, their appearance is crafted (like the computer icon) to be useful to our survival and reproduction. Like the “interface” animal of Figure 3, we don’t see 100 gallons of water as greater than 50 gallons of water unless 100 gallons is somehow more likely to get us a date than 50 gallons is.
The problem with this thesis is that, if it is true, then we can never know it (because its very truth implies that reality is beyond our grasp); neither can we know whether it’s false. In response to this troubling implication of their claims, the authors appeal to various kinds of objective measurements of the world (with rulers, timers, etc.). The problem with this appeal, of course, is that our perception of these devices must be as fallible as our perception of everything else: On the authors’ thesis, the car we “observe” is not what we think it is; why should the ruler, scale or timer be any different? The authors claim a degree of separation between ourselves and the world that might be true, but if it is, then you cannot even be confident that the authors made it or that I am denying it.
Summary: In the end, I think there is a great deal that is deeply right about the authors’ thesis: Our visual system does not deliver an isomorph of the world (to do so is mathematically impossible, since inverse optics is ill-posed), and what it delivers instead is undoubtedly shaped by what is relevant to our survival and reproduction as a species. But this part of their thesis is not especially novel (I have been teaching it to my students for 25 years). What appears more novel is the claim that what our perceptual systems deliver is shaped exclusively by our immediate needs: That if 50 gallons of water has a higher immediate payoff than 100 gallons, then we will see 50 gallons as more than 100 gallons.
Evolution is short-sighted, but as evidenced by the cases of sexual reproduction and learning, it is not that short-sighted: Solutions that are more likely to keep working tomorrow are more adaptive than those that only work today. Such is the case with perceptual systems that develop better q-morphs of the world: If I see 50 gallons of water as more than 100 gallons just because 50 gallons has a better payoff for now, but you are smart enough to see that, even though 50 gallons is currently “better” than 100, 100 is still more than 50, then you are the one more likely to survive and reproduce.
 An isomorphism is a mapping between systems (here, W and X) such that every state in W has a corresponding state in X, every transformation (e.g., state transition) in W has a corresponding transformation in X, and vice versa. A homomorphism is the same as an isomorphism, except that not every state or transformation in W must have a corresponding state or transformation in X; only the important ones are required to (where importance is defined roughly in terms of goal attainment, or in the authors’ terms, evolutionary selection). And a q-morph relaxes this constraint further, requiring only that most important states/transformations in W have corresponding states/transformations in X. As the value of most (important states/transformations) approaches the value of all (important states/transformations), the q-morph approaches being a homomorphism. All three of these kinds of representations—isomorphisms, homomorphisms, and q-morphs—are defined exclusively in terms of the correspondences between states/transformations in W and those in X.
 I interpret “correspond” in this context liberally to mean “is a q-morph” and not conservatively to mean “is an isomorph”. As noted previously, if the authors mean that a correspondence is an isomorphism, then the authors’ statement attacks a straw man, since no one who has studied perception thinks that it produces an isomorph of reality.