The Neuroscience of Reward, or Why Surprise Matters in Horse Training

WRITTEN BY: HARRIE PHILLIPS

PGCertClinEd, BAdVocEd (VocEd&Trng), DipVN (Surgical, ECC), DipBus, DipTAE (Development & Design), TAA

PUBLISHED: 12 June 2026

Harrie Phillips, Founder and Managing Director of the Australian College of Animal Care

Horses dislike surprises. Anyone who has been around horses for even a split second knows this. The plastic bag in the hedge that wasn’t there yesterday, the new horse in the next paddock, the wheelie bin that has somehow become threatening overnight, all of these produce the same response: head up, eyes wide, body braced. And if we are really unlucky, some fancy footwork, a spin and speed even Usain Bolt would be proud of. As prey animals, horses have evolved to treat the unexpected as dangerous until proven otherwise, and we spend a lot of our training lives trying to convince them that the world is more boring than it looks. So why am I saying “reward” and “surprise” in the same sentence as “horse”?

Because there is another kind of surprise, the kind that happens inside the horse’s brain when a reward arrives that they were not quite expecting. This kind of surprise does not produce alarm. It produces learning. It is, in fact, the chemical mechanism by which any horse learns anything from any of us, and the horse world has been remarkably slow to grasp how much it matters.

This article is about that second kind of surprise: where it comes from, why it matters, and what the most current research from primates, mice and horses tells us about how it works. I am hoping I can translate that into something understandable and useful for the way you train, ride and handle your horse from tomorrow onwards.

What does dopamine actually do in your horse’s brain?

Dopamine has a reputation it does not quite deserve. Most people, if they know anything about it, know it as the “pleasure chemical”, the brain’s reward signal, the molecule that makes things feel good. This is not exactly wrong but it is misleading enough to obscure what dopamine is really doing in any mammalian brain, including your horse’s.

The actual function of phasic dopamine release is to signal a prediction error¹,². Stay with me, this will be useful, I promise. When something good happens that the brain was not expecting, dopamine neurons in the midbrain fire in a brief, sharp burst. Think about how happy you’d be if you finally won lotto after many years of buying ‘broken’ tickets. When something good happens that the brain was expecting, those same neurons stay quiet. Much like getting an Easter egg on Sunday morning when you’re an adult. When something good was expected and did not happen, the neurons briefly suppress their firing below baseline. Disappointment, anyone? Dopamine, in other words, does not encode “yum”. It encodes “Oh!”, and the size of the “Oh!” depends entirely on how surprising the reward was.
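For readers who like to see an idea written down, the three cases above can be sketched in a few lines of Python. This is only an illustration of the textbook "reward received minus reward expected" idea, not a model of real neurons. The function name and the 0-to-1 reward scale are my own assumptions.

```python
def prediction_error(reward_received: float, reward_expected: float) -> float:
    """Return how much better (positive) or worse (negative) the outcome
    was than predicted. The 0.0 to 1.0 scale is an illustrative assumption."""
    return reward_received - reward_expected


# The lotto win: a reward arrives that was not expected at all.
surprise_win = prediction_error(reward_received=1.0, reward_expected=0.0)

# The adult's Easter egg: the reward arrives exactly as expected.
expected_egg = prediction_error(reward_received=1.0, reward_expected=1.0)

# The letdown: a reward was expected and did not arrive.
letdown = prediction_error(reward_received=0.0, reward_expected=1.0)

print(surprise_win, expected_egg, letdown)  # prints: 1.0 0.0 -1.0
```

A positive number stands in for the burst, zero for the quiet neurons, and a negative number for the dip below baseline.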

This was first shown in the 1990s by Wolfram Schultz and his colleagues, who recorded directly from dopamine neurons in monkeys learning a simple task³,⁴. The pattern they found has been replicated across decades of work, in primates, rodents, and now in human imaging studies, and it is one of the more solid findings in modern neuroscience. The dopamine system is a teaching signal, and what it teaches the brain is how to better predict the world.

Until recently, applying this finding to horses required a degree of extrapolation. The neuroscience was clear in mammals broadly, but direct equine evidence was thin. That has changed. Recent work from researchers at Nottingham Trent University has shown that horses with lower physiological arousal perform significantly better on learning tasks⁵. Earlier work from Andrew Hemmings’s group at the Royal Agricultural University laid the foundation for understanding how the equine basal ganglia and dopamine system underpin learning, with direct implications for training⁶. Cathrynne Henshall and colleagues at Charles Sturt University have argued that even aversive learning in horses involves dopamine-driven prediction errors in the same way it does in other mammals⁷.

The horse’s brain, in other words, is doing exactly what the rat’s, the monkey’s, and your own brain are doing when learning is happening. There is no equine exception to this. What we now understand about dopamine and prediction error applies directly, and the implications for training are significant.

Why does this matter for everyday horse training?

So, if you’ve stuck with me after that last bit, well done. Here is the part that should change how you ride.

Because dopamine fires for the unexpected reward and stays quiet for the predicted one, a reward that arrives reliably, in the same way, at the same intensity, on the same schedule, every time the horse offers the same behaviour, generates progressively weaker dopamine responses with each repetition²,⁸. The brain has stopped being surprised. The reward, neurologically speaking, has stopped being a reward. It has become a piece of predictable information that the prediction error system has nothing left to learn from.
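The fading response described above can be sketched with a simple error-driven learning rule of the Rescorla-Wagner type. This is a rough illustration under stated assumptions, not a simulation of a horse: the learning rate, the reward sizes, and the hand-picked irregular sequence (standing in for rewards the horse cannot predict) are all made up for the example.

```python
def run_trials(rewards, learning_rate=0.2):
    """Return the prediction error on each trial.

    After every trial the expectation moves a little towards the reward
    that actually arrived. The learning rate of 0.2 is an assumption.
    """
    expected = 0.0
    errors = []
    for reward in rewards:
        error = reward - expected          # the "Oh!" on this trial
        errors.append(error)
        expected += learning_rate * error  # update the prediction
    return errors


# Same reward, same size, every time (30 repetitions).
fixed = run_trials([1.0] * 30)

# Hypothetical irregular rewards with the same average size (1.0).
irregular_pattern = [0.0, 2.0, 1.0, 0.0, 2.0, 2.0, 1.0, 0.0, 1.0, 1.0]
varied = run_trials(irregular_pattern * 3)

# Average size of the surprise over the last ten trials of the varied run.
mean_abs = sum(abs(e) for e in varied[-10:]) / 10

print(f"fixed reward, first trial error: {fixed[0]:.2f}")
print(f"fixed reward, last trial error:  {fixed[-1]:.4f}")
print(f"varied reward, mean |error| over last 10 trials: {mean_abs:.2f}")
```

With the fixed reward the error starts at 1.0 and shrinks towards zero, while the irregular rewards keep producing a sizeable error on the final trials. In this toy rule that is the whole difference between a reward that still teaches and one that has gone quiet.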

This is why the horse who has heard “good boy” four hundred times in the same session, in the same tone, after the same movement, stops responding to “good boy”. The phrase has not lost its meaning in any abstract sense. It has lost its dopamine. The same thing happens with the wither scratch that is delivered every single time the horse halts square at the centre line, or the treat that comes out reliably after every successful stretch in groundwork. The behaviour holds for a while because the horse has learned the pattern, but the learning system itself has gone quiet, and the response begins to drift downward in quality unless something disturbs the predictability. Like many marriages I suppose.

Wolfram Schultz’s group at Cambridge has shown that dopamine signalling is exquisitely sensitive to uncertainty. When rewards become fully predictable, the response weakens substantially, and unpredictable rewards generate the strongest learning signals⁹. The finding has been replicated and extended many times since, and it does not appear to be a quirk of monkey neurophysiology. It appears to be how mammalian learning systems work.

The behavioural version of this has been known since the 1950s. Skinner and Ferster demonstrated that variable reinforcement schedules, where the reward arrives unpredictably, produce behaviour that is more persistent and harder to extinguish than behaviour reinforced on a fixed schedule¹⁰. Slot machine designers have known this for decades. Modern dopamine research has now given us the neural mechanism for what behaviourists were observing nearly seven decades ago.

For horse training, the implication is straightforward and a little uncomfortable. The reliable, well-timed, perfectly consistent reward that we have all been told to deliver, while not wrong, is not the whole story. A reward that arrives in exactly the same way every time eventually stops driving learning. To keep the dopamine system engaged and the response improving, the reward itself needs an element of variability and surprise.

What’s the trap most people fall into?

Now, before anyone reading this starts varying their cues or their timing in pursuit of more dopamine, let me be very clear, because this is the part that goes badly wrong if it is misunderstood.

The variability belongs in the reward, not in the cue, and not in the timing of the release.

The cue you give your horse for forward should mean the same thing every time. The cue for halt should mean the same thing every time. The pressure-release/positive reinforcement window should be as tight and consistent as you can make it, ideally a fraction of a second from correct response to release. These three pillars, clarity of cue, timing, and consistency, are the foundation on which all training rests, and they do not change because I am now banging on about varying something.

What can vary, and what should vary to keep learning sharp, is what comes after the release or the marker. The reward type. The reward value. The reward frequency. Whether you give a wither scratch, a piece of food, a long rest, an enthusiastic word, or simply move on to the next thing. These are the elements that should not become so predictable that the horse’s brain stops registering them.

A recent narrative review by Bradshaw-Wiley, Henshall, McLean and Freire, published in 2025, looked at the use of combined positive and negative reinforcement in modern equine training and reached a similar conclusion¹¹. Combining a release of pressure with a varied positive reinforcer is more powerful than pressure-release alone, in part because the positive component can be varied in ways the pressure-release cannot. The cue stays clean. The release stays clean. The reward varies, and the dopamine system keeps doing its job.

If you take only one thing from this article, take this: vary your rewards, not your cues. The cues are the language. The rewards are the grammar. Mess with the language and the horse cannot understand you. Vary the grammar and you keep the conversation interesting. Don’t believe me? Go try to understand a Gen Z text conversation.

What does Janet Jones add to this?

Dr Janet Jones is a cognitive neuroscientist who applies brain research to horse training, and she can explain these concepts much better than I can. If I have piqued even a little of your interest whilst reading this, her books definitely should be on your bookshelf.

In her 2020 book Horse Brain, Human Brain and in her widely read Psychology Today column, Jones argues that food rewards should be used rarely, not routinely¹². Her reasoning is grounded in the same dopamine prediction error research. As she put it directly in her 2022 column on the topic, “When it comes to neural activation, a key element of positive reinforcement is surprise. Frequent edible treats destroy surprise and waste your training power”¹³.

Her recommended approach is to lean on non-edible rewards (scratches, praise, rest, the cessation of work) as the everyday currency of training, and to save food for the rarer, higher-value moments where the surprise factor multiplies the dopamine response. A treat that comes out of nowhere after a particularly good attempt at something difficult, in her framing, is a more powerful learning event than a treat that is dispensed reliably after every repetition of an established behaviour.

It also means you should not “waste” a treat by giving your horse one for nothing. I get it, they’re cute and all, but are you giving them the treat for them, or for you? That’s what I thought. Save it for something much more meaningful.

Now this is a stronger position than the consensus in the equine positive reinforcement community, where regular food rewards remain the standard practice. The underlying neuroscience that Jones cites is, however, robust. Whether her specific application of “use food rarely” is the right operational conclusion for every trainer and every horse is a question for another researcher somewhere. Not me, I have Fjords, and they like their treats.

What is not in dispute is the principle she is building on. Predictability weakens dopamine. Variability strengthens it. Whether you take that as a reason to use food rarely, or as a reason to vary the type, value and frequency of your rewards more generally, you are using the dopamine system the way it was built to be used.

Does the same principle apply to pressure-release?

So much of this research looks at positive reinforcement, and especially food rewards. But here is the thing that will most likely change how you handle your horse on the ground and in the saddle: the dopamine prediction error story does not stop at food rewards. It applies equally, and through largely the same neural machinery, to negative reinforcement. The systems are not identical at the cellular level, and there is still active research on the differences (which we will definitely leave to the experts), but the core principle holds for both.

Dopamine signals the surprise, surprise drives learning.

The neuroscience

Until recently, the role of dopamine in negative reinforcement learning was much less well-studied than its role in positive reinforcement. That gap has been closing rapidly, and what the science is showing is actually really interesting. Trust me. You must, because you are still reading.

In 2021, a group of researchers led by Zhijun Diao recorded directly from substantia nigra dopamine neurons in mice learning to escape mild foot shocks¹⁴. They found that the dopamine system was bidirectionally engaged throughout negative reinforcement learning, not silent as some earlier theories had assumed. In the early phase of learning, when the mice had not yet figured out how to escape, the aversive stimulus suppressed dopamine activity, as you would expect. Like when we accidentally touch the electric fence, it’s not exactly fun. But once the mice had learned the escape response, the same aversive stimulus began to produce a brief burst of dopamine activity, because it now reliably predicted the relief that was about to follow. Kind of like how when you let go of the fence, you know the shock will stop.

A 2025 study by Cheng and colleagues extended this work substantially¹⁵. Using more sensitive techniques, they tracked exactly how the dopamine signal shifts across the stages of negative reinforcement learning, and found a beautifully clean pattern:

First, before the animal can escape, the dopamine response tracks the aversive stimulus itself.
Then, as the animal begins to learn that the aversive stimulus reliably ends, the dopamine response shifts to track the termination of the stimulus. This is the neural signature of relief, and it functions exactly like the dopamine response to an unexpected food reward.
Finally, once the animal has fully learned to escape on cue, the dopamine response shifts again, this time to track the onset of the aversive stimulus, because that onset now reliably predicts the relief that the animal can produce by performing the escape response.
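The shift from the second stage to the third, from the moment of relief back to the onset of the pressure, is the kind of pattern that textbook temporal-difference learning produces. The sketch below is a generic illustration under my own assumptions, not the analysis used in the study: relief is treated as a reward of 1.0, there is no discounting, the pressure lasts five time steps, and the function name is hypothetical. It does not model the first stage, where escape is not yet possible.

```python
def simulate(n_trials, steps_to_relief=5, learning_rate=0.3):
    """Return (onset_error, relief_error) on the final trial.

    values[i] is the learned prediction of upcoming relief, i steps after
    the pressure starts. Before the pressure starts nothing is predicted,
    so the surprise at onset is simply values[0].
    """
    values = [0.0] * steps_to_relief
    onset_error = 0.0
    relief_error = 0.0
    for _ in range(n_trials):
        # Pressure arrives unannounced: the error is the jump in prediction.
        onset_error = values[0]
        for i in range(steps_to_relief):
            is_last = i == steps_to_relief - 1
            reward = 1.0 if is_last else 0.0            # relief on the last step
            next_value = 0.0 if is_last else values[i + 1]
            error = reward + next_value - values[i]     # temporal-difference error
            if is_last:
                relief_error = error
            values[i] += learning_rate * error
    return onset_error, relief_error


early = simulate(n_trials=1)
late = simulate(n_trials=200)
print(f"first trial:  onset error {early[0]:.2f}, relief error {early[1]:.2f}")
print(f"after 200:    onset error {late[0]:.2f}, relief error {late[1]:.2f}")
```

On the first trial all of the surprise sits at the moment of relief. After many repetitions the relief is fully predicted, and the surprise has moved to the onset of the pressure, which is now the earliest reliable sign that relief is on its way.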

In other words, pressure-release training is not running on some separate, lesser learning system. It runs on the same dopamine prediction error whizz-bang magic as food reward training does. The relief from pressure functions, in the brain, exactly like a positive reward, which means the same predictability problem applies.

And yes, this has been argued for horses too. Henshall and colleagues have applied the prediction-error framework to equine learning, including aversive learning⁷, and the equine neurophysiology literature has reached the same conclusion from a different angle: pressure-release works because it engages the dopamine system, and it fails for the same reason any reward-based training fails⁶. The reward has become too predictable to teach.

What does this mean for the horse you ride?

This is where Haagen comes in.

Haagen is the horse I would put a beginner on any day of the week. He is honest, generous, tolerant of fumbled cues, and unflappable. He is also the horse who, perhaps inevitably given those qualities, has the dullest leg aids on the property by a comfortable margin.

I do dressage with him, working equitation, and mounted archery. The first two require a horse who responds reliably to all the cues: seat, leg and rein. The third requires a horse who can be steered entirely by seat and leg with the reins dropped, while I do something slightly more interesting with a bow and arrow in my hands. I depend on Haagen taking the leg seriously, and roughly once a week I have to set aside time to remind him that the leg is, in fact, a thing. And when it doesn’t work for me, my coach Jen has to step in. Thanks Jen.

For a long time I assumed this was just Haagen being Haagen, a personality quirk of a tolerant horse who has decided that effort is optional. The neuroscience tells a different story. Haagen’s leg aids have not dulled because he is lazy or because he is testing me. They have dulled because they have become, from his nervous system’s point of view, perfectly predictable noise. I sit on him, I squeeze with my legs, he moves forward, the squeeze ends. Same cue, same response, same release, every time, I hope. The dopamine system has, quite reasonably, stopped paying attention.

This is the rider’s version of the four-hundredth “good boy”. The cue has not lost its meaning. It has lost its surprise, and with it, its power to drive any further learning or refinement. Haagen knows what the leg means. He just no longer has any neurological reason to respond to it more sharply than he is currently responding, because the response that gets the same release every time is the response his brain has settled on as sufficient.

The fix is not to use more leg every time, which most riders try first and which only confirms to the horse that the cue means “press a bit then press harder then press harder still”. Escalating the pressure to get a response is perfectly valid, but the real fix is to vary what comes after the release. Sometimes the release alone. Sometimes the release plus a wither scratch. Sometimes the release plus a brief halt and a long rest. Sometimes the release plus moving immediately into something Haagen finds more interesting, like a faster pace or a change of direction. He loves a good side pass. Much more interesting than trotting a circle.

What I am doing, in dopamine terms, is restoring the prediction error. The leg cue stays the same. The timing of the release stays the same. That is all consistent. What varies is the consequence of getting it right, and that variability is enough to bring the dopamine system back online and remind Haagen’s nervous system that this is, after all, a learning situation, not a habit one.

The same principle applies to seat aids, voice cues, halt cues, anything that has gone quiet in your everyday work. If you find yourself needing more of an aid than you used to need to get the same response, the cue has not “blunted” in any mystical sense. It has become predictable enough that the dopamine system has stopped responding to it. Adding more pressure does not fix this. Varying the reward does. Consistently.

The cue has not lost its meaning. It has lost its dopamine.

Is your horse always learning?

Something that I find myself saying to my fellow horse peeps all the time, going right back to my first article on Rory at the float, is that every interaction with your horse is training, whether you intend it to be or not. The dopamine prediction error story is the neurological version of that. The learning system is always running. It is always tracking what predicts what, what reliably produces what, and what is worth paying attention to.

What you can choose, as the human in the partnership, is whether the rewards you deliver still have the capacity to drive learning, or whether they have settled into a pattern so predictable that the horse’s nervous system has filed them under “background information” and stopped responding to them.

So, it’s not about whether you are using positive or negative reinforcement, food or scratches, voice or seat aids. The question is whether your horse’s brain is still surprised, in the small but neurologically meaningful sense, by what happens after they get the response right.

A note on what makes this all possible

Everything in this article assumes a horse who is in a state to learn at all. The dopamine prediction error system is exquisitely sensitive to surprise, but it is also exquisitely sensitive to disruption. A horse who is frightened, flooded, exhausted, in pain, or chronically stressed is not in a learning state, regardless of how varied or surprising your rewards are.

Stress hormones interfere directly with the cognitive processes required for new learning to occur¹⁶, and even when stressed animals appear to comply, they may not be laying down the same kind of learning as their relaxed counterparts. Recent equine work has shown that horses exposed to uncontrollable stress acquired an instrumental task no better than inactive horses, while moderate exercise significantly enhanced learning¹⁷. Horses with lower physiological arousal also perform better on learning tasks than those in higher arousal states⁵. The relaxed, curious horse, in other words, learns better than the wound-up, anxious one.

If your horse is not learning, before you reach for any of the techniques described in this article, the first question to ask is whether the horse is in a state where any learning can happen.

References

1. Schultz, W. (2016). Dopamine reward prediction-error signalling: a two-component response. Nature Reviews Neuroscience, 17(3), 183–195. https://doi.org/10.1038/nrn.2015.26
2. Schultz, W. (2024). A dopamine mechanism for reward maximization. Proceedings of the National Academy of Sciences USA, 121(20), e2316658121. https://doi.org/10.1073/pnas.2316658121
3. Schultz, W. (1998). Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80(1), 1–27. https://doi.org/10.1152/jn.1998.80.1.1
4. Mirenowicz, J., & Schultz, W. (1994). Importance of unpredictability for reward responses in primate dopamine neurons. Journal of Neurophysiology, 72(2), 1024–1027. https://doi.org/10.1152/jn.1994.72.2.1024
5. Evans, L., Cameron-Whytock, H., & Ijichi, C. (2024). Eye understand: Physiological measures as novel predictors of adaptive learning in horses. Applied Animal Behaviour Science, 271, 106152. https://doi.org/10.1016/j.applanim.2023.106152
6. McBride, S.D., Parker, M.O., Roberts, K., & Hemmings, A. (2017). Applied neurophysiology of the horse; implications for training, husbandry and welfare. Applied Animal Behaviour Science, 190, 90–101. https://doi.org/10.1016/j.applanim.2017.02.014
7. Henshall, C., Randle, H., Francis, N., & Freire, R. (2022). Habit formation and the effect of repeated stress exposures on cognitive flexibility learning in horses. Animals, 12(20), 2818. https://doi.org/10.3390/ani12202818
8. Bech, P., Crochet, S., Dard, R., Ghaderi, P., Liu, Y., Malekzadeh, M., Petersen, C.C.H., Pulin, M., Renard, A., & Sourmpis, C. (2023). Striatal dopamine signals and reward learning. Function, 4(6), zqad056. https://doi.org/10.1093/function/zqad056
9. Fiorillo, C.D., Tobler, P.N., & Schultz, W. (2003). Discrete coding of reward probability and uncertainty by dopamine neurons. Science, 299(5614), 1898–1902. https://doi.org/10.1126/science.1077349
10. Ferster, C.B., & Skinner, B.F. (1957). Schedules of Reinforcement. Appleton-Century-Crofts.
11. Bradshaw-Wiley, E., Henshall, C., McLean, A., & Freire, R. (2025). Combined reinforcement: Poisoned cue or a panacea for modern equine training? A narrative review. Applied Animal Behaviour Science, 291, 106718. https://doi.org/10.1016/j.applanim.2025.106718
12. Jones, J.L. (2020). Horse Brain, Human Brain: The Neuroscience of Horsemanship. Trafalgar Square Books. ISBN 978-1-57076-948-1.
13. Jones, J.L. (2022). New Ways of Using Food to Train Horses. Psychology Today, “Horse Brain, Human Brain” column, February 2022. www.psychologytoday.com/au/blog/horse-brain-human-brain
14. Diao, Z., Yao, L., Cheng, Q., Wu, M., Di, Y., Qian, Z., Wei, C., Liu, Y., Tian, Y., & Ren, W. (2021). Involvement of midbrain dopamine neuron activity in negative reinforcement learning in mice. Molecular Neurobiology, 58(11), 5667–5681. https://doi.org/10.1007/s12035-021-02515-6
15. Cheng, Q., Liu, W., Yao, L., Xu, S., Wei, C., Zheng, Q., Wu, M., Han, J., Liu, Z., Ren, W., & Sun, Z. (2025). Dynamic changes of dopamine neuron activity and plasticity at different stages of negative reinforcement learning. Proceedings of the National Academy of Sciences USA, 122(45), e2509072122. https://doi.org/10.1073/pnas.2509072122
16. Mendl, M. (1999). Performing under pressure: stress and cognitive function. Applied Animal Behaviour Science, 65(3), 221–244. https://doi.org/10.1016/S0168-1591(99)00088-X
17. Henshall, C., Randle, H., Francis, N., & Freire, R. (2022). The effect of stress and exercise on the learning performance of horses. Scientific Reports, 12, 1918. https://doi.org/10.1038/s41598-021-03582-4

Every due care has been taken to ensure the information herein is based on sources Veterinary Nurse Solutions believes to be reliable, but is not guaranteed by us and does not purport to be complete or error-free. As such, we do not warrant, endorse or guarantee the completeness, accuracy, and integrity of the information. You must evaluate, and bear all risks associated with, the use of any information provided hereunder, including any reliance on the accuracy, completeness, safety or usefulness of such information. As part of our quality control of information contained within this document, it has been peer-reviewed by qualified animal care professionals.

Veterinary Nurse Solutions acknowledges that there is more than one way to carry out many of the tasks described within this website, and techniques omitted are not necessarily incorrect.
