Clicker training and rewarding with food



accipitterpatter
08-02-2012, 04:36 PM
I am currently training a female RT and I'd like to incorporate some clicker training with the "usual" method of training a falconry bird. I've done some research but one area that I haven't been able to find a lot of information about is exactly how to manage tidbits when the hawk does something right. I don't want to inadvertently cause aggression towards me by handling food rewards incorrectly.

So I have a few questions.

Once I click a behavior I want, what is a good way to present the food reward? Does it depend on the situation, such as having a high level tidbit for hood training? I know that some people throw tidbits on the ground. Does a combination of methods work?

It seems to me that it's important to have multiple tidbits readily available during the training session. Is there too long a delay between the click and food reward if I have to reach into my hawking vest for a tidbit each time? Should I have multiple tidbits hidden in my gloved hand?

At the moment I can't think of any more questions, but knowing me I will probably have some later. If there is anyone who'd like to offer some advice on these or similar aspects of clicker training, I would greatly appreciate it. Thanks.

Ally
08-02-2012, 06:02 PM
Hi Maureen,

It really doesn't matter how you give the tidbits; the important part is the time between the click and the tidbit. Generally, for a reward to be associated with a particular behavior (with dogs, at least), the reward has to come within about 3 seconds of the behavior. The clicker gives you more time to reward by "marking" the correct behavior and giving you time to get the reward to the bird. High-level tidbitting, tossing to the ground, whatever you prefer.

I kind of like tossing the tidbit, because it gives you a chance to "reload" with a new piece while the hawk is busy getting the tidbit. You can click and toss--the reward is presented almost immediately, and the bird still makes that association with whatever behavior you just "clicked."

Just remember that when you click, have a reward ready. Click = reward to the bird, so if you click and DON'T have a reward ready, you can confuse the bird about the meaning of the click. If you need to space out the behavior you are trying to capture/mark, it's another good reason to toss the tidbit :)

If the bird is on the fist, I usually palm the tidbit in the same hand as the clicker. If the bird is on a perch, I hold the clicker with one hand and a tidbit in my closed fist with the other. With my imprint gos, I have a dish of tidbits in plain sight and just grab one out of there to toss or hand-feed. I don't know that I would do that with a passage redtail though...

accipitterpatter
08-02-2012, 07:33 PM
Ally, thanks for the informative reply. I like the idea of having time to reload with a tidbit in the glove while the hawk is busy with one on the ground. That really helps a lot. After all the reading I'd done, things weren't clicking (horrible pun, I know) for me until you put it to me that way.

BTW, it's cool that you're able to have tidbits visible to your imprint gos while training. I thought that would be the last thing you'd be able to do with an imprint gos. How are you able to do that, if you don't mind me asking? A gos, and an imprint gos, are in my future falconry plans, so I'm curious.

andy hall
08-03-2012, 12:05 AM
I put a line of tidbits across my gloved palm and as the bird does behaviors I open up more fingers starting with the pointer so they get a piece with each finger. I can usually get 3-5 reps with this method. Then I will set them down somewhere or ask them to go to another perch and reload the palm.

Andy Hall

thebooster99
08-03-2012, 12:59 AM
Ally,

Where are you getting your information about the "3 second rule?" Unfortunately, this just is not correct information that you are giving out.

christopher.vly
08-03-2012, 01:22 AM
Toby, what would be the correct way to do it then?

JRedig
08-03-2012, 07:51 AM
I would say that depends on the point you are at in training. Do you agree, Toby?

IME, when establishing the CR, the reward needs to be given quickly. As training progresses, the time delay can increase once there is a level of understanding established with the bird/dog etc. They know they performed the right action because it was marked with the CR, and you can take more time to get a reward. The time gap will depend on many things; I doubt raptors have as long a gap as dogs, given their mental processing speeds.

kitana
08-03-2012, 08:22 AM
Actually, Jesus Rosales-Ruiz and other skilled trainers demonstrated that with a delay as short as 5 seconds between the marker and the reinforcer, unwanted displacement behaviors would crop up and contaminate the training, even with very highly trained animals. That is why most clicker trainers suggest no more than a 3-second delay between the marker and the delivery of the reinforcer.

http://www.clickertraining.com/node/65

thebooster99
08-03-2012, 03:33 PM
Jeff,

I totally agree with you. However, the woman that I was responding to was not talking about establishing the conditioned reinforcer. She was talking about training specific behaviors, which would assume the CR was already established. As you know, this process is usually a completely different step at the beginning of the training process, even though I personally have started skipping that step.

Also, I know from talking to you at length that one can even use the CR without a reward to "encourage" a behavior that you want. (I'm just writing this for the others reading.)

I just simply disagree with anybody that states there is a time limit.

Don't get me wrong. Obviously you can't wait a week, but I have seen fantastic results that have happened as a result of approximately a minute or more between the CR and the reward. For example, I have been watching my bird from my balcony and when he did a certain behavior I would whistle, go to my fridge and grab a tidbit, then walk downstairs and out to him to deliver the reward. I have shaped several behaviors using this technique. Of course this is well after the bird has an established pattern of being shaped.


Ally
08-03-2012, 04:00 PM
Toby, you were not reading carefully. I also would appreciate being referred to by my name, as it is signed at the bottom of my post per forum rules, and not 'that woman.'

I said the time between the REWARD and the behavior is only about 3 seconds to make the association. I then followed said statement with one that said using a clicker, or CR, gives you more time to reward.

My meaning came across a little skewed when I said the timing between the click and the tidbit was important. It still is: there should be a tidbit following each click, but it doesn't have to be nearly as fast as without the CR.

Kitana provided a great source for the information, and I received my information through my education at Animal Behavior College and through my behavioral psychology degree, since you asked.

Maureen -
The behavior has been shaped since the day I got him, to be patient and wait for me to feed him. He is a full-food imprint using Steve Layman's method--my log is under the Imprinting forum, titled "Blackjack, 2012 tiercal NA goshawk" or something similar. Good luck with your RT!

goshawkr
08-03-2012, 04:11 PM
accipitterpatter wrote: "BTW, it's cool that you're able to have tidbits visible to your imprint gos while training. I thought that would be the last thing you'd be able to do with an imprint gos. How are you able to do that, if you don't mind me asking? A gos, and an imprint gos, are in my future falconry plans, so I'm curious."

I have watched Steve Layman working with his imprint goshawks. When he delivers a tidbit, he hands them a rabbit leg in his bare hand, and the goshawk tears off one tidbit. Then Steve puts the rabbit leg out of reach.

It's an example of what you can accomplish when you are very careful about which behaviors you are rewarding.

thebooster99
08-03-2012, 07:49 PM
You are absolutely correct. I didn't read that carefully. I apologize.

However, after reading it again several times, I still disagree with your "3 second" statement, but this time the opposite way. I personally don't think the reward can be delivered fast enough without a CR. How on earth will the bird know what it is doing correctly if you aren't hitting a CR the moment the behavior happens? 3 seconds is an eternity. If you are shaping a sharpy, there could easily be 20 behaviors in 3 seconds.

From my experience, timing is SOOOO important that any time between the behavior and the CR is too long. The CR needs to occur at the very same moment the behavior does. I believe it is so important that when I'm shaping, I will form my mouth so all I have to do is blow for the whistle to happen. Otherwise, the time it takes me to form my mouth, inhale and blow is far too long to properly reinforce the behavior that I was looking for.

With regard to you being offended by me calling you a "woman," you may want to lighten up. I'm sure you've been called much worse over the course of your life. If you haven't, you are very lucky. I definitely have. And by the way, you didn't read it carefully. I didn't call you "that woman." I called you "the woman." This should lessen how much I've offended you, but if you're still offended by my use of the word "woman," I truly apologize.

All jabbing aside, the fact of the matter is, I was in the middle of writing the post and when it came to writing your name, I wasn't able to see it above the text on my screen and I couldn't remember it (too many blows to the head from women that I've offended over the course of my life). Furthermore, I was concerned that if I hit my "back" button, it would have erased my writing and I would have had to start over.




Ally
08-06-2012, 11:19 PM
No problem Toby, apology accepted!

Yep, been called many worse things, but there's no need to be rude. I totally understand not wanting to start over writing your post. That's why I use the "CTRL + C" command on my keyboard to copy what I've already written if I need to go back.

Or, I just refer to previous posts, like your first one, where someone has already mentioned the name of the person I am addressing.

If I didn't make it clear before, the "3 second rule" is NOT, in fact, a rule, but a general guideline that I have found works with dogs and seems to carry over pretty well for making the exact association I want. It can be difficult to do without a CR, which is why operant conditioning and conditioned reinforcers are so incredibly powerful in shaping behavior. They capture the exact moment or behavior you are looking for without the delay that reaching for a tidbit (or other treat) can cause. I love operant conditioning, and I am a firm believer that it can turn a "good" trainer, falconer, teacher, whatever, into a great one.

goshawkr
08-07-2012, 03:31 PM
The absolute rule is actually more like ~15 minutes. However, the longer the time between the CR and the actual delivery of the reward, the weaker the association will be. Although it should be noted that the subject animal will frequently pick up on cues, and be rewarded by them, knowing that the treat is coming. A classic example is when you click a dog, then fumble in a pocket for the actual treat. They are rewarded by your hand in the pocket, in anticipation of what will be coming out, and all of that gets associated to the behavior that you tagged with the CR. Hawks do just as well with that anticipation being a reward of sorts.

Now to go back and re-examine the 3 second...shall we call it a guideline?... a bit. One thing that is very important to remember, especially when working with a well-armed animal from the solitary side of the social/non-social spectrum like hawks, is that they frequently have less than friendly reactions to rivals for their food, and these behaviors typically express themselves when they get impatient about expecting food to appear and not actually having it appear. This means that you need to be very careful about delivering the food after the CR is given, or you will have some ugly anti-social behaviors show up, and THOSE will be reinforced, sometimes instead of the behavior you are trying to reinforce.

The key is to wait out the ugly anti-social behaviors, preferably until they are gone altogether, but sometimes you may need to settle for them being reduced. The hawk will quickly put it together that it only actually gets the reward if it minds its manners.

There are very strong instinctive impulses backing these anti-social behaviors, and they don't take very much reinforcement to become very strongly expressed.

accipitterpatter
08-10-2012, 06:05 PM
Thanks to everyone who has offered advice and information. I actually have another question that is related to the subject. What do you use for a CR? I would like to be able to use something that is hands free and that I can use over longer distances. I would prefer not to use a spoken word for it. In the past, when I was not incorporating clicker training in my training, I have used one blast on a whistle to call the bird to the fist and six quick blasts on the whistle to indicate I'm swinging the lure and the hawk should come over to get it. I would still like to use these cues. Can I use my regular whistle to sound the CR in a different pattern (such as two quick blasts) from those mentioned above or will that confuse the hawk? I bought a shepherd's whistle and am experimenting with that but it does not sound loud enough for use in the field (plus I am having a tough time figuring out how to get it to work right). I thought I could use two different tones, one for my cues and one for my CR, but again, I don't think it will be loud enough to be absolutely reliable in the field.

Geoff, those anti-social behaviors are EXACTLY what I want to avoid, and because of my inexperience with clicker training, I am a little apprehensive. If you don't mind, could you elaborate on how you correct or "wait out" anti-social behaviors? What are some of the behaviors I should be looking out for to reinforce or not reinforce? I wanted to do clicker training so that I could hunt with a better-mannered hawk, so I don't want to create problems. Again, I really appreciate the information. If anyone else would like to chime in on ways to make sure the hawk retains good manners, I would appreciate that also.

alessia55
08-26-2012, 01:34 PM
I don't know much about clicker training with birds, but I'm a very experienced dog trainer and use clicker training there. The time between the "click" and the food reward depends on the scenario. When you are first building the association between the clicker and the food reward, you want to simply click and toss/give food immediately. This is so that the animal understands "click = food is coming". Once the association has been made, the time between the click and the reward can be expanded. Done properly, I have dogs that can wait 60 seconds for a reward and still understand what behavior was marked. The clicker means food is coming, though it may not be immediately. Eventually you can have the animal perform more behaviors before actually rewarding with food. So, the "3-second" rule may apply when you are first teaching the animal the association between the clicker and the food reward, but doesn't necessarily apply once the association has been made and the animal understands the clicker is marking good behavior, and that a reward will come later.

goshawkr
08-28-2012, 06:09 PM
accipitterpatter wrote: "Geoff, those anti-social behaviors are EXACTLY what I want to avoid, and because of my inexperience with clicker training, I am a little apprehensive. If you don't mind, could you elaborate on how you correct or "wait out" anti-social behaviors? What are some of the behaviors I should be looking out for to reinforce or not reinforce?"

I am sorry, but it's very hard to describe well. Really, it's something that you can learn on the fly. If you see bad manners, wait till they go away to give the reward. Focus on good mellow behavior, like a relaxed pose, with fluffed out feathers.