
Extinction and Intermittent Reinforcement

By Wendy Williams on 01/01/2006

Filed in - Training Theory

Intermittent reinforcement is an interesting procedure. In many ways, it is hard to distinguish between "no-food trials in an intermittent reinforcement schedule" and "extinction". In both cases, no food is delivered following the target response. More importantly, the removal or prevention of a reinforcer contingent on a particular response (response cost, also called negative punishment) adds another twist to the question. Here is how I would address it:

In extinction, the unwanted (target) behavior is first analyzed to determine what reinforcer is maintaining it. Then THAT reinforcer is functionally disconnected from the behavior. In short, the behavior is no longer effective in producing the reinforcer. When done correctly, the frequency, rate, magnitude or probability of the behavior declines.
In response cost (aka negative punishment), the unwanted (target) behavior is followed immediately by the removal of a reinforcer, or of the opportunity to earn reinforcers. To be truly effective, response costs must be delivered on a continuous schedule: a price/cost for EVERY instance of the target response. So instead of simply failing to reinforce the behavior (as in extinction), the subject loses something of value - every time!
When response cost and extinction contingencies become intermittent, they functionally turn into intermittent reinforcement schedules that can strengthen behavior instead of weakening it. (Note: if you only get punished once out of every 10 times you engage in the response, you functionally get 9 reinforcers, right?) So you have to be incredibly consistent when using extinction or ANY form of punishment (response cost or true punishment) if you want to be effective.
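To make the arithmetic in that note concrete, here is a minimal sketch. It assumes that every response that escapes the consequence still produces the reinforcer maintaining the behavior, which is the simplification the note itself makes. The function name `unpunished_fraction` is hypothetical, chosen only for this illustration.

```python
from fractions import Fraction


def unpunished_fraction(punished: int, total: int) -> Fraction:
    """Share of responses that escape the consequence.

    Assumption for illustration: each unpunished response still pays off,
    so this share is the effective intermittent reinforcement the
    behavior keeps receiving.
    """
    if total <= 0 or not 0 <= punished <= total:
        raise ValueError("need 0 <= punished <= total and total > 0")
    return Fraction(total - punished, total)


# The example from the text: punished once out of every 10 responses.
share = unpunished_fraction(punished=1, total=10)
print(f"{share.numerator} of every {share.denominator} responses still pay off")

# Only a fully continuous schedule leaves nothing to reinforce the behavior.
print(unpunished_fraction(punished=10, total=10) == 0)
```

Under this assumption, anything short of a consequence on every response leaves some responses paying off, which is why consistency matters so much.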
Intermittent reinforcement involves occasional trials when reinforcers are not delivered. I like to think of these trials as opportunities for the subject to learn about persistence. In some ways, intermittent reinforcement schedules train two behaviors: the target behavior of interest and the behavior of persistence. For example, after 3 no-food trials, the critter earns reinforcement (RFT) both for the behavior and for being persistent.
[Note: I believe the Baileys would refer to this as a shaping schedule, a necessary part of raising criteria while developing a new behavior. KP]
Can intermittent schedules be confused with extinction or response cost? Sure. If the trainer moves into a very lean reinforcement schedule too quickly, the subject may stop responding. This is very similar to the ratio strain that occurs when the response criterion is raised too quickly. It is important to remember that you must reinforce the behavior of persistence if you want the subject to keep trying. Amazingly, if you move gradually to an intermittent schedule, animals (and people) can learn to emit extraordinary amounts of behavior for very small amounts of reinforcement. In my dissertation, I managed to train several pigeons to respond on concurrent variable ratio schedules that required as many as 600 keypecks before moving them to a variable interval (VI) food schedule!
It is true that it is not necessary to infer conscious thought on the part of your subject. We will probably never really know what the "critter knows". All we have to go on is the overt behavior. So, if your subject's behavior begins to take a nosedive, it is possible that the animal is responding as if on a response cost or extinction schedule. That would tell me that I am raising the bar too quickly. In Karen Pryor's words, I would "go back to kindergarten": return to a lower criterion for the behavior, reestablish it, and then move forward more slowly. (In the case of moving from continuous reinforcement (RFT) to intermittent, you might want to increase only to a requirement of 2 responses for several trials or even sessions, and then to 3, then to 5, 7, etc.) In the end, the take-home message is to use the subject's behavior as a guide to how fast you can move to a very lean RFT schedule. We don't know what the animal "infers", but the subject's behavior will tell you whether you're moving too quickly through the transition from continuous to intermittent RFT.
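The thinning procedure described above can be sketched as a simple decision rule. This is only an illustration, not a prescribed protocol: the step sizes 1, 2, 3, 5, 7 come from the text, but the function name `plan_next`, the `min_sessions` count, and the `drop_ratio` threshold for deciding that behavior has taken a "nosedive" are all hypothetical values chosen for the example.

```python
# Responses required per reinforcer, taken from the example in the text.
STEPS = [1, 2, 3, 5, 7]


def plan_next(index: int, sessions_at_step: int, rate: float, baseline: float,
              min_sessions: int = 3, drop_ratio: float = 0.7) -> int:
    """Return the index into STEPS to use for the next session.

    index            -- current position in STEPS
    sessions_at_step -- sessions already completed at this requirement
    rate, baseline   -- current and earlier response rates (same units)
    min_sessions     -- assumed number of sessions before advancing
    drop_ratio       -- assumed cutoff: below this share of baseline,
                        treat the behavior as breaking down
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    if rate < drop_ratio * baseline:
        # "Go back to kindergarten": drop to the previous, easier requirement.
        return max(index - 1, 0)
    if sessions_at_step >= min_sessions:
        # Behavior is holding up, so lean out the schedule by one step.
        return min(index + 1, len(STEPS) - 1)
    # Otherwise stay put and keep reinforcing at the current requirement.
    return index


# Currently at 3 responses per reinforcer, steady for 3 sessions: advance.
print(STEPS[plan_next(index=2, sessions_at_step=3, rate=10.0, baseline=10.0)])
# Same step, but responding has dropped sharply: step back.
print(STEPS[plan_next(index=2, sessions_at_step=3, rate=4.0, baseline=10.0)])
```

The point of the rule is that the subject's measured behavior, not a fixed timetable, decides whether the requirement goes up, stays, or comes back down.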

Wendy Williams

help!

Submitted by frankieskat on Wed, 2010/08/18 - 3:12am.

Hello, thanks for the article. I am trying to find out if my trainer has given me the right info on intermittent reinforcement with the clicker. This article has helped a bit but does not address my specific question, which is: can you pair the click with intermittent reinforcement, i.e. withholding the treat until after a series of clicks? In my reactivity class I was clicking my dog for doing a heel and eye contact. Then, because the dog seemed to be chaining the heel and eye contact with staring back at the stooge dog, I was told to do multiple clicks before the payout (about 4 or 5 clicks). It did seem to work. He had that persistence you talk about, and he tried harder and harder with his behaviour, i.e. better heel, staring more intently, focus entirely on me and not the dog. It all happened very quickly, so I think he understood that the clicks were linked. I am just concerned because I have been told a treat must come after every single click regardless.

I've read this article over

Submitted by Kaitlyn Heideman on Fri, 2010/04/02 - 2:33am.

I've read this article over and over but I can't understand any of it! Something just isn't clicking here...hehe

Thanks
