An Analysis of Edward Thorndike's Law of Effect
Thorndike’s law of effect concerns human behavior and rests on his belief that behavior can be analyzed in stimulus-response units. He concluded that a response to a problem is likely to be repeated in similar future situations if it produces a feeling of satisfaction, whereas responses that do not produce satisfaction become less likely to recur when a similar situation arises. The law of effect thus suggests that animals, humans included, are adaptive: they adopt the behaviors that benefit them most. As they do so, they refine their responses in search of one that is stronger and more efficient. Thorndike categorized behavior in terms of stimulus-response connections, with responses triggered by a stimulus; connections exercised repeatedly over time are strengthened, while those rarely used are weakened. The law has faced opposition from psychologists and physiologists who find it indefinite, since it sets aside explanations of behavior based on purpose, and parts of it have been disputed by later research.
In his study of behavior, Thorndike examined both animals and human beings. He carried out the cat-in-a-puzzle-box experiment to establish a pattern of behavior for the cat (Thorndike, 1911). A cat was placed in a puzzle box with food outside the box, visible to the cat, to provide an incentive. Initially the cat tried to escape by trial and error and got out by chance, whereupon it could reach the food. On repetition, the cat no longer acted by trial and error but followed an established pattern to get out, showing that the repetition of the behavior was due to the initial success of reaching the reward, in this case the food outside the box. With each trial the cat took less time to escape than the time before. Thorndike used this experiment to show that the repetition of behavior is indeed influenced by whether an action leads to success or failure.
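The puzzle-box dynamic can be illustrated with a minimal simulation. This is only an illustrative sketch with hypothetical response names and strengthening parameters, not a model from Thorndike's own work: a simulated "cat" samples among candidate responses in proportion to their current strengths, and each success strengthens the rewarded response, so escape latency falls across trials as the law of effect predicts.

```python
import random

# Hypothetical sketch of the law of effect: the successful response
# ("press lever") is strengthened after each rewarded trial, so it is
# chosen sooner and sooner on later trials.
random.seed(0)

responses = ["scratch", "mew", "press lever"]  # candidate behaviors
strength = {r: 1.0 for r in responses}         # initial response strengths

def choose(strength):
    # Sample a response with probability proportional to its strength.
    total = sum(strength.values())
    pick = random.uniform(0, total)
    for r, s in strength.items():
        pick -= s
        if pick <= 0:
            return r
    return r  # guard against floating-point edge cases

escape_latencies = []
for trial in range(200):
    attempts = 0
    while True:
        attempts += 1
        if choose(strength) == "press lever":
            # Satisfaction "stamps in" the successful response.
            strength["press lever"] += 0.5
            break
    escape_latencies.append(attempts)

# Early trials typically need several attempts; late trials need few.
print(escape_latencies[0], escape_latencies[-1])
```

With these assumed parameters the lever-press strength grows from 1.0 to 101.0 over 200 rewarded trials, so the probability of choosing it on the first attempt rises from one third toward near certainty, mirroring the cat's shrinking escape times.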
Thorndike’s law of effect provided the groundwork for B. F. Skinner’s principle of operant conditioning, which states that all behaviors are accompanied by consequences and that these consequences dictate whether the behavior is repeated (Skinner, 1940). In the puzzle-box experiment, for instance, the cat escaped faster on each successive trial. The law is teleological in that an action is performed for the sake of its final outcome: the consequence of finding a way out was escaping the box and obtaining the food, so the escape behavior was a means to an end, the end being access to the food.
Thorndike also carried out experiments on behavioral responses based on a human being’s intellect. He established that neural connections form between a stimulus and a response, and that intellect perpetuates the formation of these bonds (Thorndike, 1911). People with higher intellectual capacity would therefore form neural bonds more easily than those with lower capacity. He believed that the ability to create these bonds depended on an individual’s genetic potential, which shapes the structure of the person’s brain.
The law of effect bears on learning, since an individual must discover which reaction to a stimulus produces satisfactory results and, having discovered it, must reinforce that reaction to improve it. In the cat experiment, for example, the cat’s reactions became more streamlined as it abandoned those that did not yield satisfactory results and strengthened the one that led to its freedom, pulling the lever.
Another experiment by Ferster and Skinner (1957) showed that when an animal was moved from an interval to a ratio schedule, its rate of responding changed. Under stamping-in, by contrast, the rate would be expected to remain the same whenever a reaction is repeated, with all response rates equally likely to recur. Ferster and Skinner also experimented with pigeons: when the program changed from a variable interval to a variable ratio matched for the number of responses per reinforcement, one subject’s rate of response increased while the other subject’s rate decreased (Ferster & Skinner, 1957).
Both the increase and the decrease in response rate on the change from an interval to a ratio schedule defy the Thorndikian law of effect, since only the increase can be read as adaptive. One subject increased its reinforcements per unit time by responding quickly on the ratio schedule; the other subject’s slow responding on the ratio schedule reduced its rate of reinforcement. Thus neither stamping-in nor adaptation can fully account for what is being termed the strength of behavior.
Reinforcement as strengthening
Thorndike’s notion of stamping-in as reinforcement may have sufficed to explain the acquisition of new behavior. But for behavior that is already learned, stamping-in seems unnecessary, since the form of the response no longer changes; yet reinforcement still affects what might be thought of as the strength of the behavior. For Skinner, who preferred measurement, to say that behavior is strengthened is to say that some dimension of behavior is modified during its repetition, and with it the behavior’s strength. The measurement problem turned out to be experimental rather than theoretical, and was not meant to refute the plain line of thinking projected by Thorndike’s law. Rather, it showed that the only persuasive argument for any measure of response strength is to declare defined boundaries and relations between measures of reinforcement (Ferster & Skinner, 1957).
Beyond this experiment, several others were conducted to measure reinforcement strength and choice over time. One such experiment is that of Shull and Pliskoff (1967), who used albino rats to determine whether delaying reinforcement would affect the rats’ responding. Reinforcement here is the reward the animal expects from a given action, which is responsible for the animal repeating the action in subsequent trials. For the albino rats, the reinforcement was electrical stimulation delivered to the posterior hypothalamus.
The experiment established that matching of the previous response occurred as long as the delay was not significantly longer than the time the incentive had taken to arrive in the previous trial. As the lapse before the incentive grew, however, the observed response changed, meaning the rats were adapting and reacting differently than before. This disputes the notion of stamping-in: behavioral response to an incentive is not stamped in but varies with the conditions presented (Shull & Pliskoff, 1967).
In this experiment the distribution of responses matched the distribution of reinforcements, varying in proportion to it: as reinforcement increased, responding increased, and when reinforcement was delayed, responding diminished proportionally. In most experiments demonstrating the matching of responses to a given reinforcement, the magnitude of the reward also helps dictate whether a response is repeated; a larger reinforcement produced a stronger effect of repetition and matching.
For instance, if obtaining the reinforcement required standing in a certain location for a certain period, the length of time the animal under investigation stood in that position was directly proportional to the amount of reinforcement received. The response strength of a subject is directly controlled by the relative reinforcement presented; the strength of the response varies in direct proportion to variations in the reinforcement (Shull & Pliskoff, 1967). The response therefore varies with the parameters of the reinforcement.
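The proportional relation described above is commonly known as the matching relation: the share of responding allocated to an alternative equals the share of reinforcement obtained there. The sketch below, with illustrative reinforcement rates chosen for this example rather than taken from the experiments cited, computes the predicted allocation.

```python
# Sketch of the matching relation described above: the proportion of
# responses given to each alternative equals the proportion of
# reinforcements obtained on it. The rates below are hypothetical
# illustration values, not data from Shull and Pliskoff.

def predicted_response_share(reinforcements):
    """Given reinforcements earned on each alternative, return the
    share of responding the matching relation predicts for each."""
    total = sum(reinforcements)
    return [r / total for r in reinforcements]

# e.g. two schedules delivering 40 and 20 reinforcements per hour:
shares = predicted_response_share([40, 20])
print(shares)  # the richer schedule draws two-thirds of responding
```

The same function also expresses why delayed or diminished reinforcement pulls responding away proportionally: shrinking one entry in the list shrinks that alternative's predicted share by the same relative amount.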
Where there are two probable outcomes of reinforcement, the subject tends to produce the response most likely to bring about the greatest incentive. This probability rule differs from the matching observed in the experiments and instead indicates maximization: the subject takes the option it expects to yield the greatest reward. Although the resulting response is influenced by the reinforcement, the rate of responding is also important in determining subsequent responses; as the response varies with the varying reinforcement, the rate is bound to vary as well. This is an important factor in the free-operant method, where matching ceases to occur and the subject’s response changes with the incentive provided.
In the operant method the output, or response, is free to vary, so the rate of responding varies with the rate at which reinforcement is received. The basic assumption in each experiment is that for every possible action confronting a subject there exist alternatives (Skinner, 1940). In performing an action, the animal makes a choice rather than taking the only option available. Hence one problem facing the accuracy of these experiments is identifying and measuring the alternatives, which may not be easy to pin down. The rate of responding is therefore influenced by the choices available to the subject, since some time is spent deciding which behavior to enact. This shows that a response is influenced not only by the reinforcement the subject receives but also by the choices on offer.
The rate of reinforcement also influences responding. In the experiments by Catania and Reynolds, rates of reinforcement were varied over the same time period (Catania & Reynolds, 1968). Two kinds of determinants were used: in one, the rate of reinforcement determined the rate of response, since the subject switched to it only when reinforcement was presented; in the second, choice was the major factor influencing the response.
Therefore, where matching is not obtained in an experiment, either the asymptotic rates of concurrent experiments differ or the reinforcement contexts differ. The experiments make evident a discrepancy between single-response and multiple-response situations: single-response situations depend on only one factor, mostly reinforcement, while multiple-response situations depend on several factors of reinforcement, choice, and rate. It was therefore impossible for matching to occur in multiple-response situations.
The frequency of response is directly proportional to the reinforcements, as the multiple experiments mentioned above established (Catania, 1966). This follows from the principle of operant conditioning, which shows that successive conditions of reinforcement interact. One experiment demonstrating this interaction is Reynolds’s, with pigeons as subjects. He used a red and a green light to signal a period of reinforcement. When, after some time, one of the reinforcements was discontinued, the pigeons’ response to the other reinforcement was twice the initial rate, showing that they compensated for the discontinued reinforcement by responding more often to the one remaining (Reynolds, 1963).
The contrast effect, as he called it, has been widely confirmed in many situations. The current question, however, is whether such changes in response rate belong in the present account of response strength. Multiple schedules, in which options follow one another, differ in some ways from choices that run simultaneously as regards the analysis of response strength. Choices that run at the same time are thought to exert their full effect on every response alternative. While this assumption holds for simultaneous procedures, it holds less for multiple schedules, where the different sources of reinforcement are not concurrently at work. The relations across the parts of a multiple schedule are likely to weaken as the parts become increasingly different, and if the separation is pushed far enough the interaction might be disrupted entirely, leaving no interaction at all (Reynolds, 1963). These experiments show that many of the effects of multiple and simultaneous procedures are drawn from a single conception of response strength.
The contrast effect depends on the reinforcement in the other components, because reinforcement in the interacting component, with or without a response, prevents contrast. As Reynolds later concluded, “the frequency of reinforcement in the presence of a given stimulus, relative to the frequency during all of the stimuli that successively control an organism’s behavior, in part determines the rate of responding that the given stimulus controls.”
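Reynolds's conclusion can be read as a relative-rate rule, and a short sketch makes the arithmetic behind the contrast effect concrete. The reinforcement counts below are hypothetical illustration values, not data from the pigeon experiments.

```python
# Sketch of Reynolds's relative-rate idea (illustrative numbers): the
# response rate a stimulus controls tracks the reinforcement frequency
# in its presence relative to all successively presented stimuli.

def relative_rate(reinforcement_in_stimulus, reinforcement_elsewhere):
    total = reinforcement_in_stimulus + reinforcement_elsewhere
    return reinforcement_in_stimulus / total

# Equal reinforcement in both components of a multiple schedule:
before = relative_rate(30, 30)   # 0.5
# The other component is switched to extinction (0 reinforcements):
after = relative_rate(30, 0)     # 1.0
# The relative rate doubles, mirroring the doubled responding to the
# remaining component described in the contrast experiment above.
print(before, after)
```

The doubling falls directly out of the formula: when the alternative source of reinforcement drops to zero, the remaining stimulus accounts for all of the reinforcement and so commands a proportionally larger share of responding.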
Rachlin and Baum’s experiment included two keys, one of which was for reinforcement and governed access to food. The other key was not lit until a schedule programmer had completed an interval; it then stayed lit until the next response collected the reinforcement (Baum & Rachlin, 1969). During the experiment the reinforcements were varied widely, and the responding on one key varied inversely with the size of reinforcement, insofar as the quantity and frequency of reinforcement could be compared against each other.
Terrace (1966) discovered a small contrast effect that appears to refute the effect mainly with regard to transitory effects: he found a transient difference in going from a simple variable-interval schedule to the same variable interval alternating with extinction. With continued exposure, or with repeated switching between the variable interval in isolation and the multiple schedule containing variable interval and extinction, Terrace’s contrast effect disappeared. He also experimented with four pigeons trained in the presence of a given stimulus, and the response proved positive. The stimulus was changed every minute and a half for another stimulus, but the reinforcement was not changed. For the experimental pigeons, the rate of response fell during the stimulus common to the first and second procedures. He concluded that the temporary contrast rests on the aversiveness of extinction (Terrace, 1966). In his experiments, Terrace observed that contrast arose with any procedure that produced a peak shift in responding, whereas procedures that prevented a peak shift reduced the occurrence of contrast.
In some individuals, a failure to respond to certain stimuli leads to small differences in the delay gradient during early brain development, which may produce greater differences in behavioral responses later on. Short delays of reinforcement can be used to build higher-order behavioral units, as the individual is forced to learn to adapt (Catania, 1995). This is necessary for building cognitive and social skills, since it helps individuals maintain longer periods of attention and so raises the attention span. It can be applied in a remedial program for an individual with autism: programs that require rapid responsiveness, such as computer games, can help embed minimal responsive behaviors into coordinated units of a higher order.
The law also applies to the growth and development of adolescents. At this age, children do not yet understand themselves fully, so their actions are often strengthened by the responses that follow them. If an action is met with acceptance and praise from peers, for instance, the adolescent is likely to repeat it; if it is followed by an aversive stimulus, the teenager will repeat it less often in the future. The effect of the negative stimulus depends on the termination of the action: once individuals realize that the negative effect will not occur if they refrain from acting a certain way, they tend to abstain from the action that causes it.
References
Baum, W. M. & Rachlin, H. C. (1969). Choice as time allocation. Journal of the Experimental Analysis of Behavior, 861-874.
Catania, A. C. (1995). Higher-order behavior classes: Contingencies, beliefs, and verbal behavior. Journal of Behavior Therapy and Experimental Psychiatry, 26, 191-200.
Catania, A. C. (1966). Concurrent operants. In W. K. Honig (Ed.), Operant behavior: Areas of research and application (pp. 213-270). New York: Appleton-Century-Crofts.
Catania, A. C. & Reynolds, G. S. (1968). A quantitative analysis of the responding maintained by interval schedules of reinforcement. Journal of the Experimental Analysis of Behavior, 327-383.
Ferster, C. B. & Skinner, B. F. (1957). Schedules of reinforcement. New York: Appleton-Century-Crofts.
Reynolds, G. S. (1963). On some determinants of choice in pigeons. Journal of the Experimental Analysis of Behavior, 53-59.
Shull, R. L. & Pliskoff, S. S. (1967). Changeover delay and concurrent performances: some effects on relative performance measures. Journal of the Experimental Analysis of Behavior, 517-527.
Skinner, B. F. (1940). The nature of the operant reserve. Psychological Bulletin, 423.
Terrace, H. S. (1966). Behavioral contrast and the peak shift: effects of extended discrimination training. Journal of the Experimental Analysis of Behavior, 613-617.
Thorndike, E. L. (1911). Animal intelligence. New York: Macmillan.