Why clicking and correction don't mix.
Behavior analysts refer to a learned stimulus that triggers an operant behavior as a 'discriminative stimulus.' The behaviorists do not, as far as I know, differentiate between a discriminative stimulus that was trained through positive reinforcement and one that was trained through negative reinforcement.
In practice, however, there is a distinct difference. In clicker training (operant conditioning with a marker signal) the behavior is developed first, as an operant freely offered in expectation of positive reinforcement. The discriminative stimulus is then paired with that operant in order to function as an indicator of a reinforcement opportunity. Each discriminative stimulus signals the opportunity to earn reinforcement for one particular behavior or suite of behaviors.
This positively trained discriminative stimulus always 'opens the door' to positive reinforcement. If the behavior does not occur, the only result is that no reinforcement occurs. When the behavior occurs, reinforcement is guaranteed. (We clicker trainers sometimes call this kind of signal a cue, to differentiate it from the traditional term, a command.)
As soon as the animal understands what a given cue means, the cue, or positive discriminative stimulus, becomes in itself a conditioned positive reinforcer, like the click. Thus a cue can be used as a reinforcer for behavior that occurs as the cue is being given. One may, for example, use the well-established positive cue for one behavior to shape another behavior, or to reinforce previous behavior in a chain. The cue can also be used as a marker signal, just as if it were a click, to pinpoint especially good aspects of another behavior. It seems likely, too, that the desirable emotional response that we know to be associated with the click also accompanies the presentation of these positively conditioned stimuli.
Behavior that has been trained by correction may also have associated discriminative stimuli, which indicate when the specific behavior is to occur. However, these discriminators, or commands, may or may not lead to positive reinforcement. If the animal fails to perform the behavior, or performs it incorrectly, the stimulus may lead to punishment (usually called 'correction'). The negative discriminative stimulus, usually called a command, is now a conditioned negative reinforcer, signaling the opportunity for avoiding punishment.
Even if the behavior was trained entirely with positive reinforcement, if one now clicks for correct behavior following a discriminator (a cue, command, or signal) but also gives aversive correction (leash pop, verbal reprimand, etc.) for incorrect behavior following that same stimulus, the stimulus immediately loses its value as a positive reinforcer. It is, at best, ambiguous in terms of reinforcement. It is not a click. It no longer automatically triggers the positive emotions associated with conditioned positive reinforcers. It can no longer be predictably used inside a chain to reinforce previous behavior.
Even if primary reinforcers, such as approval, toys, and treats, are supplied in abundance during or after training or performance, the discriminative stimuli themselves, the commands, are now threats as well as promises. Behavior tends to break down, interestingly, both preceding and following these ambivalent stimuli: preceding, because the earlier behavior may begin to extinguish once the now-aversive stimulus no longer serves as a positive conditioned reinforcer for it, and following, because the behavior that might be punished tends to be avoided. The shift becomes visible in the learner's attitude, which switches from attentive eagerness to reluctance, often with visible manifestations of stress. Even though successful response to a given discriminative stimulus is still followed by reward, if failure is now followed by punishment, you have made that discriminative stimulus ambiguous in terms of predictable outcome. It is no longer 'safe.' You have poisoned your cue.
Poisoned cue & Non-Reward Markers
14 years ago, at the very first dog training seminar I ever attended, the lecturer promoting the use of a clicker & treats was also using NRMs (Non-Reward Markers, i.e., "wrong" or "uh-uh" or "eh-eh") to "mark" when a dog didn't respond to the cue & then re-cue the dog. I didn't like doing it but did it because what did I know? I never did see a positive response to it with my own dogs or those dogs in class & private lessons & eventually quit using it... (I now only use it to interrupt an unwanted behavior, not to "mark" when the dog didn't respond to the cue.) Thank you for making this distinction so clear, for explaining why cues should be clear & concise &, if you're having trouble with them, how to go back to basics before you decide to add one. I know I drive my students crazy with "when" to add the cue but when they do it's working! Yippeeeee! Lyne C.
Thanks, Chris Bond
Thanks, Chris, for your additional explanation of the emotional effects of following a cue with a correction. As you wisely point out, this is more severe with naive animals. Enlarging the repertoire and building more criteria (locations, distractions, etc.) will help too. Karen Pryor
I don't mean to imply from my question that I promote the use of punishment; I merely want to understand this concept of poisoned cues.
If a behavior has been trained using positive reinforcement, the dog has an expectation that response leads to reward. I get that. However, if the dog now fails to respond, and the handler "corrects" the dog, I fail to understand how the cue has now become ambiguous. All that has been done is provide information to the dog: correct response still leads to reward, but failure to respond, or incorrect response, does not. If the dog then responds appropriately, it is still reinforced. For example, a dog has been trained using positive reinforcement to sit on the verbal cue "sit". Now, on one trial, the dog fails to sit and the handler places the dog into a sit position (i.e. "corrects" the dog). I understand that the cue "sit" now means "either sit on your own and get a treat, or if you don't sit, I'll put you there", which could be considered a threat. However, the cue "sit" is not ambiguous... the act of sitting still produces the reward, and "sit" would still be a conditioned reinforcer. It is the not sitting that takes the reward away. If the dog fails to sit when given the cue, and nothing happens, no correction and no reward, is that not effectively the same thing? The dog does not get the reward, which is the only thing that has made the cue the conditioned reinforcer. You have just given a cue without reward, which should cause the cue to lose reinforcing properties just as the click would, should you fail to treat.
What am I missing?
This is an excellent question, Roxanne.
The answer lies in the power of classical conditioning, which is always occurring during operant conditioning.
Emotion is a reflexive response. It comes from the fast-reacting primal part of the brain. It occurs before the operant response, which comes from the logical part of the brain (cortex).
Avoidance is a strong survival response, so it can win out over attraction. If the cue is only associated with good things, attraction results. If the cue is associated with both good and bad (aversive) things, avoidance can result... despite the possibility of good things happening.
If 99 good things happen in your day, then 1 bad thing happens, what do you tend to remember most when you go to sleep that night, and what emotion do you feel? If it's something you really don't want to happen again, how much thought do you put into how you can avoid it, even if it means you miss out on some good things? Does it cause conflict, even if you feel it was a result of something you chose to do or not do?
The result of using both positive reinforcement and aversives can be a feeling of conflict or doubt, which is also aversive, and which slows behavior down. (Slow recall? Slow sit?)
Positive reinforcement, when used alone, builds confidence. The reflexive response is absolute, without doubt, since the association is absolute.
Options if the dog "gets it wrong", that will avoid poisoning the cue:
- Identify why. Too difficult, too many distractions, need more practice, dog's brain is fatigued, unclear cue from the dog's perspective, other...? So many reasons a cue can be missed.
- Pause for a second, then try a different (easier) cued behavior and reward that.
- Go back to basics and practice, building proficiency and generalizing the cue. The benefit of this is, you're building an even stronger history of reward into the cue.
I have seen this process followed by very skilled, experienced trainers with amazing results.
This article totally explains how I ruined the cue "come" with my first dog. Even after I switched to clicker training, he already had the negative association, so I had to select a new recall word. To this day, eight years later, he still thinks "come" means run away from momma!
Kudos to you for your success in changing the recall cue! :)