Behavior chains and back-chaining
During the first season of ClickerExpo, it seemed to me that a lot of people had questions about behavior chains and back-chaining. I'd like to shed a little light on the subject. A behavior chain is an event in which units of behavior occur in sequences and are linked together by learned cues. Back-chaining, which means teaching those units in reverse order and reinforcing each unit with the cue for the next, is a training technique. We use this technique to take advantage of the intrinsic nature of the event.
The cue as a reinforcer
The key to understanding what's going on in a behavior chain—and why it creates reliable behavior—is to know that a cue is also a conditioned reinforcer. Put another way, a cue, which is the "green light" for a clickable behavior that leads to some kind of treat, becomes in itself a good and rewarding event. By carefully timing the instant in which you give the cue, you can reinforce some other behavior that's going on at that time. In training a behavior chain, you can mark a behavior and reinforce it and cue the next behavior simultaneously.
Example of a behavior chain
There are many behavior chains in everyday life. When I take my dog out of his crate in the morning, I immediately take him for a walk. This involves many little units of previously learned behavior: standing still to have his leash put on (instead of romping and playing, as he'd like to do); waiting politely at the door, and again, if I ask for it, at the top of the porch stairs—I often have to go back for gloves or some other forgotten item—then walking without pulling, and so on. Each of these behavioral units was taught individually at first. Now they are linked, with each cue reinforcing the previous behavior. For example, when he's waiting quietly at the open door I reinforce that by saying "Let's go." The actual reinforcer is the walk itself. The cue to go through the door reinforces the polite waiting.
Here's the important point: when I was developing waiting politely at the open door, I didn't need to click and treat for the wait before I said "Let's go." I didn't need to say "Good boy" as praise for the wait before I said "Let's go." Unless the known cue, "Let's go" in this case, is also associated with punishment, it is in itself a powerful positive reinforcer (for more on this topic see "Poisoning the Cue"). It can function to mark the behavior, just like a click.
In building my go-for-a-walk chain, and all the units inside it, I got the job done just using cues as reinforcers in the natural course of our daily routine. Life offers many opportunities—going into and out of the car, visiting other houses, and going to the vet—for indicating and reinforcing behaviors that add up to "good manners" by using the cues "Wait" and "Let's go" in a timely fashion.
What if I had a dog not yet attentive to learning new cues? If I were, for example, temporarily taking care of someone else's rambunctious, clueless door-dasher, a few contingent cues, even if they are conditioned reinforcers, might not be enough to get the job done efficiently. In that case, I would certainly bring out a clicker and treats to put in a bunch of brief but intensive C/T sessions dedicated to learning that "Sit" at the door is the only way to get the door to open; to learning that waiting for the cue "Let's go" is the best way to get permission to move forward, and so on. I would also want to make sure that I myself did not carelessly "break" the chain, for example by putting the leash on, sitting the dog at the door, and then going off to make a phone call or something, leaving the dog unreinforced for the front end of the chain.
Many of the behaviors we train our dogs to do are really behavior chains. Heeling, retrieving, running an obstacle course, almost all obedience exercises, tracking, gaiting, and stacking in the show ring… all chains. While the various units of a chain can be trained individually in no particular order, linking them together is far more easily done if you work from the end of the chain forward.
Back-chaining the retrieve with the clicker
Start with the drop or give. Establish a cue for that, then back up to the take, hold, and give, then the take, carry, hold, and give. After that, train "go over and find it" with the object stationary on the floor. Last of all, introduce the throw, watch, and chase (or chase and catch, for Frisbee) part of the retrieve. Doing this with clicks and treats is fast and fun and can be taught to puppies as soon as they can see, hear, and totter about on four legs. If you back-chain the retrieve, you will always have a zesty, eager partner who will never try to play "keep-away" instead of fetching the object back to you.
That is, you start with the last item in the chain—in the retrieve, it would be the give. You shape that behavior, put it on cue, and then insert the next part: hold until I say "give." Then you back up one more step, and teach the take, first from your hand, then from the ground. Building the chain backward ensures that you are always moving toward reinforcement—the prize at the end of the chain—and that each part in the chain is strengthened, every time, by the cue for the next part.
In building a behavior chain or inserting new behaviors into the front end of a chain, you don't need to click and treat every unit. Direct reinforcement of the new behavior may not be necessary, since you are already using the next cue as a click. Continuing to the next link in the chain is more reinforcing than interrupting the chain with a minor reinforcer, such as a food treat. Going toward a known way to succeed can be so important that the dog would rather keep working toward the goal than stop to eat or to acknowledge praise.
Uses of back-chaining
If you are starting to build complex chains for competition, you will go faster and your dog will understand better if you build each unit separately, join the units up from back to front, and always practice the chains in groups of units rather than running through the whole chain over and over. For example, in the articles exercise in utility, one might build and practice the mini-chain of "pick up, bring, hold, give" separately from the mini-chain of "go out, find, and select the right article." Of course, you could also occasionally practice the mini-chain of "select, pick up, and bring," clicking as the dog turns back with the right one, then going to him with the treat, to reinforce his good selection quickly. A side benefit of training in mini-chains is that if one unit goes wrong in performance, you can take that chunk out, shape that unit and its associated behaviors up again, and then put the mini-chain back into the long chain. And of course, each of these mini-chains should also be built backward. An example: "Back-chaining the retrieve with the clicker."
Skills that benefit from back-chaining include the retrieve, tracking, search and scent work (start with the reporting behavior), and any performance task that happens at a distance, including field trials and herding. Incidentally, it's not just dog performance that benefits from back-chaining. If you ever have to memorize a piece of music, or a poem, or a speech, or a dance routine, it will go much faster if you break it into little chunks and learn the last chunk first, then the next to last, and so on, backing up to the start.
A common misconception is that a behavior chain is a series of behaviors that are initiated by a single cue. In fact, that's the way some behaviors "look" to us, because we tend to ignore any information the dog gets which does not come directly from the handler. Take, for example, the obedience exercise of retrieving the dumbbell over a jump. Some dogs whip through it with accuracy and panache. It certainly looks as if the dog has memorized the whole sequence and is doing it on a single cue, the owner's send-out from the starting position of sitting at heel. However, this cluster of behaviors is riddled with object-related cues, or what the bird trainers call prop cues.
The initial unit, leaving heel position and taking the jump, is cued by the handler. The sight of the dumbbell on the other side, however, is the cue for picking up the dumbbell, and also the reinforcer for taking the jump. The feel of the dumbbell in the mouth is the cue to turn back to the owner (taking the dumbbell home) and then, when the dog turns back, the sight of the jump is the cue to take the jump—and the sight of the owner standing there in a particular pose reinforces the jump and also cues the "front" behavior, and so on.
What if the dumbbell isn't there when the dog gets over the jump? What if it took a bad bounce and went out of the ring? That can happen. It's the rare dog that turns and takes the jump back anyway; mostly they just wander around looking confused. There is no cue (sight of dumbbell), so no pickup behavior occurs. No cue (dumbbell in mouth), so no turn-and-jump behavior. The loss of a cue in mid-chain is not the only way a behavior chain can go to pieces, but it's a common one.
Some people maintain that the best way to get "reliability" in performing a series of behaviors is to train with many, many repetitions of the same sequence over and over, sometimes called "patterning." It's hard, it's boring, and the resulting behavior is very vulnerable to changes in the environment. However, sometimes it seems to work. Why? In fact, if the sequence is holding up, it's probably not because of the many repetitions, but because there are cues within the chain that are reinforcing the pieces of the pattern. We just don't recognize them as cues because they are environmental; they aren't deliberate words or signals from us.
Different kinds of chains
Repeating a single behavior
Even some very experienced trainers consider that a behavior chain can only consist of a series of the same behaviors repeated over and over. That is one kind of chain. For example, running a horse or dog down a jump chute over a series of identical jumps is a chain; the sight of each jump is the reinforcer for the last jump and the cue for the next one. When the jumps stop, the jumping stops too.
Many behaviors, always in the same sequence
Some canine sports, such as Flyball, involve a variety of behaviors that always occur in the same order. Freestyle, heeling to music, or dancing with dogs (these terms are synonymous) is another example. Routines are choreographed and performed in a given sequence. That's a behavior chain. Each cue, whether an object cue (a jump in front of you) or a handler's cue (a word or movement), signals the shift to a new behavior and also reinforces the behavior that is going on at that moment.
Clicker cure for lost dumbbell problem
"What can you do to train against the mishap of a dumbbell bouncing out of sight? Here's one recipe. Teach the dog to hunt for and find the dumbbell by scent, for a click and treat, indoors, around the house—then outside; under furniture, in clumps of grass, under ring gates. Then establish that if the dumbbell is in sight, pick it up and bring it; if it's not, find it by scent, then pick it up."
In cases where the animal appears to know the sequence by heart, very often he is still responding to cues, too. They might be position cues: we always canter when we reach this end of the arena. They may be superstitious cues from the handler such as weight shifts of which the handler is unaware. Or they may be environmental cues, such as music or jumps. The result is still a behavior chain.
There are intrinsic hazards in building a chain that will always be performed in the same sequence. If the animal actually memorizes the sequence—"First I always do this, then that, then the other"—he may begin doing it on his own, anticipating the next behavior. When the animal "jumps the gun" and acts without the cue (a common occurrence in roping horses) behaviors inside the chain fail to be reinforced, and start to break down. We see this happen frequently when training for Flyball competition and with the Drop on Recall obedience exercise. It's vital to deal with anticipation immediately, retraining that unit or cluster of behaviors to make sure that the animal waits for the cues; otherwise, problems will multiply.
Of course the biggest and most important chains in dog training are the performance chains: long sequences of many behaviors, linked, reinforced, and thus maintained by cues, in which the individual units may come in virtually random sequences. Running an agility course is an example. The crossing of obstacles occurs in a continuous stream, but the obstacles may be in any sequence and in any location. Running the course is a flexible chain, and one in which the function of cue as reinforcer is particularly obvious.
Take, for example, the challenge of contact zones. Some obstacles, such as the A-frame and the dog walk, have contact areas at the start and finish. The dog must touch those contact areas on the way up and again on the way down. The requirement keeps the dog safe; if he passes through the contact areas correctly, he can't jump onto the obstacle from a bad angle or bail out early from too high up, risking injury.
Because the course is different in every trial, every time the dog takes one obstacle, the handler has to give a cue to identify the next obstacle. Common sense might lead the handler to wait until the dog completes one obstacle before telling him where to go next; but common sense is wrong in this case because of those contact zones. If you habitually give the next cue when the dog is already on the grass, guess what. He's going to start leaping over those contact zones to get on the grass because that is where he is reinforced with the cue for the next behavior. If you always give the next cue while the dog is on the contact zones, you reinforce being on the contact zone, and the dog will be certain to hit that spot.
It really doesn't matter what sequence the obstacles come in, but it does matter very much when the handler gives the cue. If the cue comes late, you have lost the opportunity to reinforce the previous task with precision. And if the cue comes way too late, so that the animal meanwhile acts independently and goes off on its own, you have broken the whole chain. All the previous behaviors are now at risk, especially if this event is repeated often; and it is not the dog (usually assumed to be easily distracted), but the handler's timing of the cues that is at fault.
The linking of behaviors by well-timed cues is the essential factor in maintaining "reliability" in all long, complex, flexible chains. This includes obedience, tracking, search and rescue, field trials, hunting, retrieving, service work, and police work. I don't mean that the dog shouldn't work on its own initiative; of course it must, but always under direction as well. When the work is "on cue" the chains stay reinforced, because the cues are reinforcers.