23 Apr What Skinner’s Rats and Value Investors have in common
In the study of psychology there is a phenomenon called Operant Conditioning.
Operant conditioning shapes behaviour, and the underlying principle is that behaviour that is positively reinforced tends to be repeated, while behaviour that is not reinforced tends to be extinguished.
Think of a child crying for attention. He wants to be picked up and cuddled. When the mother picks up the child, she is actually reinforcing the crying. The action (crying) leads to the desired outcome (being picked up). If a child is picked up every time he cries, he is being continuously reinforced and in no time will associate the desired outcome with the required action.
Operant conditioning was studied extensively by the psychologist B.F. Skinner. He experimented with rats in an apparatus called a Skinner Box. A hungry rat would be placed in a box with a lever by the side. As the rat moved about the box it would accidentally knock into the lever. When that happened, a food pellet would drop into a container next to the lever.
In no time the rats became conditioned and would go straight to the lever the moment they were placed in the box. Once that was established, Skinner explored schedules of reinforcement to see which were most effective. He studied four different schedules.
Schedules of Reinforcement
For a Fixed Ratio reinforcement schedule, the rat would be given a pellet after a specific number of presses on the lever. If a rat is exposed to a three-press schedule, it would soon learn to make three quick presses of the lever in a row to obtain its food. The fixed ratio schedule is predictable and the best way to condition a new behaviour.
A salesperson who is paid according to the number of products he sells is fixed ratio reinforced. For every product he sells or every quota he meets, a reward is promised. This conditions him to sell more. Another real world example is loyalty cards. Buy five cups of bubble tea and you get a free drink. Vendors are reinforcing our purchasing behaviour in order to sell more drinks.
A Variable Ratio schedule calls for reinforcement to be applied after a random number of responses. Instead of being fixed (e.g. a pellet after every two presses of the lever), the reinforcement might come after one press, then four presses, followed by three presses.
A classic everyday example of this is the jackpot machine. Our behaviour (feeding hard earned money into the machine) is reinforced (monetary payouts) when we pull the handle. Sometimes we need to pull the handle many times before getting a payout. If we are lucky, we get a payout after one or two pulls.
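For readers who like to see things concretely, the difference between the two ratio schedules can be sketched in a few lines of Python. This is only an illustration, not Skinner's procedure: the function names, the ratio of three, and the choice to draw the random gap uniformly between 1 and 5 presses (so that it averages three) are all assumptions made for the example.

```python
import random

def fixed_ratio(presses, ratio=3):
    """Press numbers on which a pellet drops: every `ratio`-th press."""
    return [n for n in range(1, presses + 1) if n % ratio == 0]

def variable_ratio(presses, mean_ratio=3, seed=0):
    """Press numbers on which a pellet drops when the gap between
    rewards is random (assumed uniform on 1 .. 2*mean_ratio - 1)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    rewarded = []
    next_reward = rng.randint(1, 2 * mean_ratio - 1)
    for n in range(1, presses + 1):
        if n == next_reward:
            rewarded.append(n)
            # Schedule the next reward a random number of presses away.
            next_reward = n + rng.randint(1, 2 * mean_ratio - 1)
    return rewarded

print(fixed_ratio(12))     # [3, 6, 9, 12]
print(variable_ratio(12))  # irregular press numbers, depends on the seed
```

On the fixed schedule the rat can count its way to the next pellet. On the variable schedule every press might be the one that pays, which is exactly the pull of the jackpot machine.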
The reinforcement for Fixed Interval is time dependent. The pellet would drop at a fixed time interval, regardless of the number of presses on the lever. As a result, the lever pressing behaviour of rats tends to become weaker over time.
Receiving a paycheck every month is a form of Fixed Interval reinforcement. Come to think of it, if our jobs are secure and the pay is assured every month, where is the impetus to work hard?
Finally, a Variable Interval is when behaviour is reinforced at random times. Pellets would be dispensed at random intervals regardless of lever presses. Checking email is an example of a variable interval schedule. Whether you check consistently or not, emails will be delivered to and remain in your inbox at irregular intervals.
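The two interval schedules, as described here, can be sketched the same way. Again this is a rough illustration under stated assumptions: the function names, the 60-second interval, and the uniform spread of the random gaps (0 to 120 seconds, averaging 60) are invented for the example and are not taken from Skinner's experiments.

```python
import random

def fixed_interval(duration, interval=60):
    """Times (in seconds) at which a pellet drops on a fixed clock,
    regardless of how often the lever is pressed."""
    return list(range(interval, duration + 1, interval))

def variable_interval(duration, mean_interval=60, seed=0):
    """Times (in seconds) at which a pellet drops when the wait between
    drops is random (assumed uniform on 0 .. 2*mean_interval)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    times = []
    t = rng.uniform(0, 2 * mean_interval)
    while t <= duration:
        times.append(round(t, 1))
        t += rng.uniform(0, 2 * mean_interval)
    return times

print(fixed_interval(300))     # [60, 120, 180, 240, 300]
print(variable_interval(300))  # irregular drop times, depends on the seed
```

In both cases the lever never appears in the code. The clock alone decides when the reward arrives, which is why the link between action and outcome is so much weaker than on a ratio schedule.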
Value Investing as a behaviour
Think of value investing as a behaviour we want to establish. We want to consistently select value stocks with a good margin of safety, then buy and hold them until their value is realised before selling them. To invest intelligently and consistently, we need to be rewarded accordingly.
Ideally, the best way to establish the behaviour is to be continuously reinforced. That is, every time we buy a stock, we make money. The reinforcement would be powerful and we would all exhibit stable investing behaviour. Of course we all know that is impossible.
Otherwise, a fixed ratio might just work. For every three stocks you buy, one would be a massive winner. If it happens often enough, we would constantly be looking to buy more stocks.
If not, our behaviour could also be reinforced by a fixed interval schedule. After a fixed period of time we are able to realise profits. This spurs us to continue investing. Dividends are a good example of fixed interval reinforcement. Many investors swear by dividends because otherwise they are unable to convince themselves to stay invested while remaining unrewarded for years.
Unfortunately, the market reinforces in a Variable Ratio cum Variable Interval fashion. We have to buy stocks and stay invested (push the lever) to be rewarded (pellet drops). As we are all keenly aware, markets are random. They move in a random fashion, totally without regard for our needs as investors. The market does not care if we remain invested or if we get fed up with the lack of reinforcement and drop out of the game.
As a result, investing behaviour is not reinforced. There is no clear link between action (buying good stocks) and reaction (profits). Even if there is, the time interval is too long and too random for most investors to form a clear association between the two. It is as though we are rats in the Skinner Box set up by an experimenter bent on extinguishing any investing behaviour on our part.
For these two reasons alone, value investing remains one of the hardest behaviours to shape, reinforce and maintain. And because of that, those who are able to overcome this barrier remain highly successful and profitable in their forays. Are you one of them?