Defining Behavioral Terminology
by Ron Harris
As practical dog trainers we tend to use the vocabulary of modern behavioral scientists rather loosely. To the scientists, terms like "reinforcement," "reward," and "punishment" have rather precise meanings, which we often distort. I have found that beginning to understand how the scientists use these terms, and particularly the concepts behind them, has helped me to think more clearly about training and to be a better trainer. With that said, let me define the basic terms and concepts as I understand them.
Before I jump into the model, I'd like to point out that it is just a model. It approximates the learning process of higher forms of life like our dogs. It is a useful simplification of a very complex process, which we can use to help us understand what we are doing as we work with our dogs. Having said that, when it is not clear to me exactly how the model applies in a given situation, I usually conclude that nature is a good deal more subtle than the model and stop worrying about definitions.
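I visualize the fundamental model of learning, which applies to essentially all living creatures, from the simplest to the most intelligent, as a chart with four quadrants, one for each pairing of positive or negative with reinforcement or punishment:
- R+ (Positive Reinforcement)
- R- (Negative Reinforcement)
- P+ (Positive Punishment)
- P- (Negative Punishment)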
Reinforcement and punishment are the tools we have to effect learning. "Positive" and "Negative" mean "to add" and "to take away." There is no good/bad or emotional meaning attached. "Reinforcement" acts to increase a behavior in frequency, duration, and/or intensity. "Punishment" acts to decrease a behavior, or to extinguish it altogether. "R+," "P+," and so on are shorthand for "Positive Reinforcement," "Positive Punishment," and so on. The easiest to understand are the two positives. R+ means add a reinforcer, for example me telling you how smart and clever you must be to have read this far into this article. P+ means add an aversive, for example your spouse yelling at you to stop wasting your time surfing the net and clean the parakeet cage. R- and P- are a little harder to understand, so we'll deal with them later.
There are two types of reinforcers, primary and conditioned. A primary reinforcer affects the physiology of the organism. Examples are: food, water, air, sex, heat, cold, and many more. Some are easy to use in training, some are hard, some probably impossible. Think of a conditioned reinforcer as something the subject learns to accept as a reinforcer. It is "conditioned" by being paired with a primary reinforcer. The clicker is a good example of an effective conditioned reinforcer. Praise is another example.
It is important to remember that what is reinforcement, and how valuable it is to a subject, is entirely up to the subject. If you love pizza, and pizza makes me ill, then offering me a slice is not going to reinforce me, regardless of how you would feel about it. And offering you a slice right after Thanksgiving dinner and your 2nd slice of pumpkin pie with real whipped cream is probably not going to light your candle, either.
Something, a treat for example, is only a reinforcer when two conditions are met:
- It is delivered soon enough after the behavior so that the subject associates it with the behavior, and
- The subject thinks it is reinforcing. A dry, old biscuit may not be very attractive to a thirsty dog on a hot day.
This is why timing is so important. Your dog will doubtless appreciate a yummy piece of deli roast beef which you extract from your pocket after a couple minutes of heeling, but he will associate it with whatever he happens to be doing at the instant you deliver it. No part of the heeling was reinforced by that treat, therefore no learning. In my opinion, this explains why so many people have so many bad things to say about "food training." When you do it wrong, no learning takes place and it really doesn't work.
To create a conditioned reinforcer (CR) by pairing it with a primary reinforcer, the trainer offers the primary reinforcer immediately after the reinforcer he/she is trying to condition. Think "click and treat." The trainer clicks and then gives the animal a bit of food. Once the clicker, or a word, or any other CR is conditioned, it takes the place of a primary reinforcer for marking the correct behavior. This means that the food, or ball, or breeding for that matter, can be delayed. The dog still learns because he has been reinforced for the correct behavior. It does not mean that the primary reinforcer can be eliminated. A CR which is not followed frequently by a primary reinforcer (the experts say essentially every time) will not remain conditioned for very long.
We usually think of punishment as something aversive, either physically or psychologically. Once again, what is aversive, and particularly how severe it is, are determined by the perceptions of the subject. What I think is very painful may be only annoying to you.
How to Tell Them Apart
Behaviorists assert that we can only observe behavior. We can't observe emotions, or reasons, or drives. We see the behaviors that result, but we can't really know why. Experienced trainers are good at inferring emotions and drives from behavior, but, as my friend Sheila Booth likes to say, "The dog's got all the facts, we got all the opinions." Insofar as reinforcement and punishment are concerned, what this means is that the only way we know whether we are reinforcing or punishing is by observing the resulting behavior. If, over time, the behavior increases in frequency, duration, and/or intensity, we are reinforcing. If it decreases or extinguishes, we are punishing.
The negatives are commonly misunderstood. Remember that negative means only to take away. It doesn't mean do something bad or evil. People often use the term "negative reinforcement" to mean what the scientists call "positive punishment." "Negative reinforcement" is taking away something aversive. The ear pinch is a good example. The trainer pinches the ear, offers the dumbbell, then releases the pinch when the dog takes the dumbbell in his mouth. We know it is reinforcement because it increases the behavior of taking the dumbbell. It is negative because we are removing something, in this case the painful stimulus to the ear.
"Negative punishment" is taking away something reinforcing. I can't think of any good examples in dog training, but I did see a good example with humans one morning. I was in the locker room at our local community center one Sunday morning and it was full of fathers and sons there to go swimming. One young boy and his friend were running around the locker room flinging towels and generally being badly behaved. The father warned them that if they didn't cut it out they weren't going swimming. When they didn't cut it out he made them get dressed and they left.
Now the only way for me to really know whether this was perceived by the boys as punishment would have been to observe their behavior the next time they were there, and I learned not to work out Sunday morning, but this appears to me to be an example of P-.
One word of caution about human examples and analogies. While the principles apply, humans (even young ones) reason and have long-term memories which permit us to associate events which are widely separated in time. A "nice job on the World Wide Widget account, here's a bonus" is likely to be quite reinforcing, even if it happens days or weeks after the work. Dogs simply can't do this. In dog training we are working with simple, immediate, non-verbal associations.
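For readers who like to see a rule written down mechanically, the four quadrants can be sketched as a small Python function. This is only an illustration of the model as described above, not anything taken from the scientists. The function name and the labels "added," "removed," "increased," and "decreased" are my own assumptions for the sketch.

```python
def quadrant(stimulus_change: str, behavior_trend: str) -> str:
    """Name the quadrant from two observations.

    stimulus_change: "added" or "removed" (what the trainer did)
    behavior_trend:  "increased" or "decreased" (what the behavior did over time)
    Both sets of labels are hypothetical, chosen only for this sketch.
    """
    # Positive means something was added, negative means something was taken away.
    sign = {"added": "+", "removed": "-"}[stimulus_change]
    # Reinforcement or punishment is known only from the resulting behavior.
    kind = {"increased": "R", "decreased": "P"}[behavior_trend]
    return kind + sign


# Ear pinch: the pinch is removed and dumbbell-taking increases.
print(quadrant("removed", "increased"))  # R-
# Locker room: swimming is removed and, we assume, towel flinging decreases.
print(quadrant("removed", "decreased"))  # P-
```

Notice that the function never asks what the trainer intended or how anyone felt about it. It needs only what was added or taken away and what the behavior did afterward.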
The ABC's of Learning
Ted Turner (head trainer for many years for Sea World, not the Mouth of the South) summarizes the requirements for learning as follows:
- Antecedent
- Behavior
- Consequence
in that order.
The "antecedent" cues or triggers the behavior. The simplest example is a verbal command from the handler, but there are many other examples, such as the helper running away on the escape, or the scent on an article on the track. The "behavior" is whatever it is we want the subject to learn. The "consequence" is the reinforcement (or punishment, for that matter) that follows the behavior.
Turner's model has some interesting implications for us as dog trainers. Probably most important, it predicts that always luring (with food, for example) or guiding (with hands or a collar) interferes with learning. When we lure or guide we are doing "consequence, behavior, consequence." This is not to say that luring and guiding are always bad. You might wait a very long time for a dog to offer certain behaviors on his own so that you could reinforce them. It seems to me that some of the things the ABC model tells us are:
- Try to figure out a way to get the behavior with little or no interference. Armin's puppy square is a very good example of this.
- When you do lure or guide, fade it (not the consequent reinforcement) as quickly as possible.
- If you can, vary how you guide the behavior. For example, if you begin to teach the basic position using a wall or a fence to guide the dog straight, also use the corner of a chair in the living room, trees, steps, curbs, whatever.
The progression Armin uses to teach a young dog the hold and bark is a very good application of the second point, fading the guide. (See The Hold and Bark) Initially the handler restrains the dog on a leash as the helper approaches him to guard. When the dog begins to show the correct behavior, the helper comes close enough that the leash goes slack. In the vocabulary we are using here, we are starting to fade the guide. Next, the dog is allowed to pull the handler toward the helper, and finally the dog is allowed to go to the helper on his own. When the dog makes a mistake (biting the helper when and/or where he is not supposed to), there are no consequences. No reinforcement, no punishment. From the dog's point of view (OK, in my opinion of the dog's point of view) that behavior didn't work. Eventually (very soon if we have brought him along correctly) he will offer the behavior we want and get reinforced. The helper, bless his bruised and battered body, is allowing the dog to learn. As a further aside, punishing the "mistake" is risky because it is extremely unlikely that the dog will associate the punishment with only the "wrong" behavior.
If I have piqued your interest in behaviorism, there are several books worth reading. Karen Pryor's Don't Shoot the Dog formally introduced these concepts to the dog training world. Excel-erated Learning by Pamela J. Reid is another excellent book which applies these concepts directly to dog training. Visit Alleydog for a more rigorous definition of these terms, and an understandable explanation of classical vs. operant conditioning. Two of the foundational scientific papers in the whole field are How to Teach Animals by B.F. Skinner, Scientific American, 1951, and The Misbehavior of Organisms by Breland, K. & Breland, M., American Psychologist, 1961, 16, 681-684. For a little light bedtime reading try Two Types of Conditioned Reflex and a Pseudo Type by B.F. Skinner, Journal of General Psychology, 12, 66-77.
For me, all of this distills down to one simple rule: Watch for behavior you want and reinforce it.