When today’s grass is greener than tomorrow’s gold: Modeling temporal discounting
We value the present more than the future. When given the choice, very few people would prefer to wait a month to receive $51 if the alternative were to receive $50 today, even though the accrual during this delay would correspond to a whopping annual interest rate of nearly 27%.
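The 27% figure follows from compounding the one-month gain over a full year. A minimal sketch of that arithmetic, assuming monthly compounding (the function name is purely illustrative):

```python
def annualized_rate(amount_now: float, amount_later: float, delay_months: float) -> float:
    """Annual rate implied by waiting `delay_months` to get `amount_later` instead of `amount_now`."""
    periods_per_year = 12.0 / delay_months
    # Compound the per-period growth factor over one year.
    return (amount_later / amount_now) ** periods_per_year - 1.0


rate = annualized_rate(50.0, 51.0, 1.0)
print(f"{rate:.1%}")  # 26.8%
```

A 2% gain per month compounds to about 26.8% per year, which is the "nearly 27%" quoted above.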
This entrenched preference for the present, and the discounting of the future it entails, appears to be an immutable aspect not just of human cognition but of organisms more generally. When given the choice between a smaller reward now or a larger reward later, animals generally prefer the immediate reward.
In humans, decisions relating to the present involve regions of the brain (viz. limbic and paralimbic cortical structures) that are also consistently implicated in impulsive behavior and cravings such as heroin addiction, whereas decisions that pertain to the future involve brain regions (viz. lateral prefrontal and parietal areas) known to support deliberative processing and numerical computation.
Our strong preference for immediate rewards may therefore reflect the proverbial “reptilian brain,” which competes with our “rational brain” that is telling us to consider and plan for the future.
Temporal discounting is not just an interesting phenomenon in its own right: it is also a crucial component of decision making. For example, will you spend your savings on a holiday now, or will you use your savings to increase your retirement pension in a few decades’ time? Should society consume fossil fuels now or maintain a stable climate in the long run?
Learning how exactly people discount future rewards is thus crucial across many fields of study, from decision making to economics and public policy.
The standard experimental task to examine delay discounting consists of participants repeatedly answering questions of the form “would you prefer $A now or $B in X days?”, where the relative magnitudes of A and B and the delay (X) are manipulated across trials. This task is simple to administer and participants have no difficulty responding to the stimuli.
There are, however, several challenges relating to interpretation of the data from this task. First, obtaining stable estimates of discounting may require more trials than participants have time or patience for. Second, because people’s responses are inevitably accompanied by error, estimates of discounting preferences may be quite labile and unreliable. In light of the importance of discounting for decision making and public policy, this unreliability is more than an irritation.
A recent article in the Psychonomic Society’s journal Behavior Research Methods tackled these challenges by proposing a new method to estimate discounting in the inter-temporal choice task. Researcher Benjamin Vincent proposed a method that relies on what is called a “hierarchical Bayesian” approach. We have touched on Bayesian methods here before, and in a nutshell the advantage of Bayesian statistics and modeling is that it provides an elegant and entirely rational framework for the updating of one’s knowledge in light of the evidence.
Unlike conventional frequentist statistics and modeling—which suffer from problems we recently explored in a digital event—a Bayesian approach requires that we make our “prior” knowledge explicit. That is, before we can interpret the data of an experiment we need to formalize our existing knowledge about what we expect to happen. Sometimes we may wish to assume that we are ignorant about what to expect in an experiment, but often we do have relevant prior knowledge. In the case of discounting, Vincent exploited the known fact that participants’ discounting attenuates as the amount in question increases. This is known as the “magnitude effect” and its size can be quite striking, as shown in the figure below which summarizes a large corpus of data:
The more money is at stake, the more people are willing to wait, as reflected in their decreasing discount rate (the discount rate is something akin to an “interest rate” that an amount has to earn in order for people to be indifferent about delaying it).
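To make the idea of a discount rate concrete, the sketch below assumes the hyperbolic form often used in this literature, in which a delayed amount B is worth B / (1 + k × delay) today, and lets the rate k fall as the amount grows to mimic the magnitude effect. The functional forms and the slope and intercept values are illustrative assumptions, not numbers taken from the data summarized above.

```python
import math


def discount_rate(amount, slope=-0.25, intercept=-3.0):
    """Hypothetical magnitude effect: log(k) falls linearly with log(amount)."""
    return math.exp(slope * math.log(amount) + intercept)


def present_value(amount, delay_days, k):
    """Assumed hyperbolic discounting: value today of `amount` received after `delay_days`."""
    return amount / (1.0 + k * delay_days)


for amount in (10, 100, 1000):
    k = discount_rate(amount)
    share = present_value(amount, 30, k) / amount
    print(f"${amount}: k = {k:.4f} per day, worth {share:.0%} of face value after 30 days")
```

With a negative slope, larger amounts get a smaller k and so keep more of their face value over the same delay, which is the qualitative pattern the magnitude effect describes.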
As a general rule, our estimates of people’s performance improve by consideration of prior knowledge—if we already know that people discount larger amounts less, then it would be wasteful not to consider this information in our interpretation of a new experiment involving a wide variety of amounts.
The second important attribute of Vincent’s approach was its reliance on a “hierarchical” model. Hierarchical approaches (which are also known as “multi-level” modeling) accommodate and exploit individual variation in order to improve our overall estimate of a variable. We can illustrate the magic of hierarchical estimates using a famous example known as Stein’s paradox. Stein’s paradox holds that the best estimate of a person’s true ability is not their own performance, but an adjusted measure that brings an individual’s performance estimate more in line with the observations for all other individuals. Yes, that’s right—to estimate Fred’s ability, don’t look at Fred’s performance alone but also at Joe’s and Phil’s scores.
A famous illustration of Stein’s paradox involves the batting averages of baseball players. It turns out that each player’s batting average during the first half of a season is a relatively poor predictor of the same player’s performance during the second half of the season. The prediction is much improved by computing a composite measure that adjusts each player’s first-half score towards the grand mean across all players. The reason behind this counter-intuitive result is that the grand mean is less susceptible to measurement error than individual scores—hence we should blend an individual’s score with the more reliable knowledge about the sample overall. Vincent’s hierarchical model of discounting performs precisely this blending: Each individual participant’s discounting responses are considered in the estimate, but they are nudged slightly in the direction of the grand mean to counteract the measurement error that necessarily distorts individual scores.
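The blending described above can be sketched with a James-Stein-style shrinkage estimator. This is only an analogy to the hierarchical model, not Vincent's implementation; the batting averages and the noise variance below are made-up values for illustration.

```python
def james_stein(scores, noise_var):
    """Shrink each score toward the grand mean (positive-part James-Stein form).

    `noise_var` is the assumed measurement-error variance of a single score.
    """
    k = len(scores)
    if k < 4:
        raise ValueError("shrinkage toward the grand mean needs at least 4 scores")
    grand_mean = sum(scores) / k
    spread = sum((y - grand_mean) ** 2 for y in scores)
    if spread == 0:
        return list(scores)
    # Noisier scores (relative to their spread) are pulled harder toward the mean.
    shrink = max(0.0, 1.0 - (k - 3) * noise_var / spread)
    return [grand_mean + shrink * (y - grand_mean) for y in scores]


# Hypothetical first-half batting averages for six players.
first_half = [0.400, 0.350, 0.300, 0.250, 0.200, 0.150]
for raw, adjusted in zip(first_half, james_stein(first_half, noise_var=0.0044)):
    print(f"raw {raw:.3f} -> adjusted {adjusted:.3f}")
```

Every adjusted value sits between the player's own score and the grand mean, and the amount of pull grows as the assumed measurement error grows.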
Illustrative results of the hierarchical Bayesian modeling by Vincent are shown in the figure below:
The top row represents group-level inferences in an experiment in which people’s discounting was estimated through 27 different intertemporal-choice questions as described above. The remaining three rows show the data from 3 representative individuals.
The left-hand column of panels shows the conventional magnitude effect: as the amount in question increases, the discounting that people apply to that amount diminishes, although the steepness of that decline differs across individuals.
The right-hand column of panels presents inferences about people’s so-called “psychometric function”, which expresses their choice probabilities (in this case, choosing the delayed, larger amount) as a function of the difference in psychological “utility” of the two amounts. Utility is the psychologically-relevant pleasure that one derives from a monetary reward, and the function should hover around the indifference point (.5) when the utility of the delayed amount is equal to the utility of the present amount (i.e., their difference is zero). Whenever the former exceeds the latter, we should choose the delayed amount, and vice versa. It can be seen that all functions are quite sharply tuned around 0, as expected, and they rapidly converge on 0 and 1 with only small differences in utility favoring one or the other reward. This result confirms that people’s decisions are consistent and highly reliable, once they have worked out by how much to discount the future amount to equate psychological utility with the present.
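A psychometric function of this kind can be sketched as a smooth S-shaped curve over the utility difference. The version below assumes a cumulative-normal shape with a single hypothetical `acuity` parameter that controls how sharply the curve is tuned around zero; the text above describes only the qualitative shape, so treat the exact form as an assumption.

```python
import math


def p_choose_delayed(utility_delayed, utility_now, acuity=1.0):
    """Probability of choosing the delayed reward given the two utilities.

    Assumed form: cumulative normal of the utility difference, where a
    smaller `acuity` gives a steeper (more consistent) choice curve.
    """
    diff = utility_delayed - utility_now
    return 0.5 * (1.0 + math.erf(diff / (acuity * math.sqrt(2.0))))


for diff in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"utility difference {diff:+.1f}: P(delayed) = {p_choose_delayed(diff, 0.0):.3f}")
```

At a difference of zero the function returns exactly 0.5, the indifference point, and it approaches 0 or 1 as the difference moves away from zero in either direction.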
These inferences about people’s psychometric function represent one of the strengths of the hierarchical Bayesian approach: We can gather knowledge about the unobservable—but identifiable—internal variables that drive people’s decision making in a discounting task. We can do so at the level of individuals, taking into account individual differences but without being led astray by the measurement error that affects individual estimates more.
Vincent’s work provides us with a powerful tool, available for download here, to infer more about how people discount the future than has been possible to date. Small differences in discount rate can make a large difference in the future: $1,000,000 in 300 years is worth roughly $50,000 today if it is discounted at 1% per year, but only about $7.76 if discounted at 4%. A tool that estimates discount rates more reliably is therefore welcome news.
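The 300-year comparison can be checked with a standard present-value calculation. This sketch assumes annual compounding, which is the convention that reproduces figures close to those quoted for 1% and 4%.

```python
def present_value(future_amount: float, annual_rate: float, years: float) -> float:
    """Value today of `future_amount` received after `years`, discounted annually."""
    return future_amount / (1.0 + annual_rate) ** years


for rate in (0.01, 0.04):
    value = present_value(1_000_000, rate, 300)
    print(f"discounted at {rate:.0%}: ${value:,.2f}")
```

A three-percentage-point change in the rate shrinks the present value by a factor of several thousand over this horizon.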
Article focused on in this post:
Vincent, B. T. (2016). Hierarchical Bayesian estimation and hypothesis testing for delay discounting tasks. Behavior Research Methods. DOI: 10.3758/s13428-015-0672-2.