Learning Throw-and-Catch — The (Re-)Awakening

Anand Jagadeesh
7 min read · Jan 12, 2024

While on vacation, I was explaining a technical concept to someone, and a couple of technical terms came up in the discussion. That ended up re-triggering the way I used to explain technology to students during guest sessions at colleges and universities in my previous professional role. Here’s what happened!

Some Background

For years, I have tried to explain technology concepts with stories and jokes to students and friends who asked me for explanations. Finding analogies has been my strong suit: I compare and contrast computing ideas with more visible things around me. Since the COVID days, though, I had been running away from information-sharing sessions for students, as I was going through quite stressful times. A couple of Git training sessions I ran in 2021 for a technical institution in Bangalore did not go well, which led me to hit the brakes on this, though I still kept helping students with final-year projects and regular technology explanations. Recently, while I was on vacation, a similar technology discussion happened, and this article is derived from that.

The game of Throw-and-Catch

(I hope) everyone is familiar with the game of Throw-and-Catch, but let me put it in simple words. You know that game where you toss a ball back and forth? Yeah, that’s the classic Throw-and-Catch! Basically, it’s just a bunch of folks having a good time tossing a ball around. One person throws it, and the other tries to catch it and then tosses it back. It’s all about coordination and quick moves. Whether you’re goofing off in the backyard or getting a bit competitive on the field, it’s a game for everyone. Not only does it get you moving, but it’s also a great way to connect with others and work as a team. For this article, let’s consider the simplest version, with two players: a teacher and a learner. Let’s call them T and L.

T decides to teach L this game and L is a learner with 0 prior experience. We are going to explore two different scenarios in this learning process.

#1: The left-handed learner problem

Let’s say that L is a left-handed individual who always tries to catch the ball with the left hand, and the left hand alone. T turns out to be a careless instructor who never really focuses on L’s technique and never tells them to use both hands or to be more agile. As a result, once the training is complete, L keeps missing balls thrown to the right side. This is called BIAS. Bias is about how well L catches the ball on average: high bias means they consistently miss in a particular way, so they are never very good at catching balls thrown in different directions. This is a real-life picture of bias in machine learning. For example, let’s say a weather-prediction system wasn’t trained on the correlation between one specific combination of cloud thickness and wind. We end up with a system that consistently mispredicts rain or no rain whenever that particular configuration occurs. :)
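To make the left-handed catcher concrete, here is a minimal Python sketch. The names `left_handed_catch` and `catch_rate`, and the idea of reducing every throw to a "left" or "right" side, are assumptions made purely for illustration. The point it tries to show is that the misses are not random: they all pile up on one side.

```python
import random


def left_handed_catch(ball_side: str) -> bool:
    """Hypothetical biased learner: only ever reaches with the left hand."""
    return ball_side == "left"


def catch_rate(catcher, throws) -> float:
    """Fraction of throws the catcher manages to catch."""
    return sum(catcher(side) for side in throws) / len(throws)


# Assumption: T throws to either side with equal probability.
rng = random.Random(0)
throws = [rng.choice(["left", "right"]) for _ in range(1000)]
left_throws = [side for side in throws if side == "left"]
right_throws = [side for side in throws if side == "right"]

print("left side :", catch_rate(left_handed_catch, left_throws))   # 1.0
print("right side:", catch_rate(left_handed_catch, right_throws))  # 0.0
print("overall   :", catch_rate(left_handed_catch, throws))
```

The overall catch rate looks mediocre, but the per-side numbers reveal the real story: a systematic error in one direction, which is what bias looks like.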

#2: The all-over-the-place technique problem

In this scenario, the learner has no bias and uses both hands. But T is still a careless instructor who never really focuses on L’s technique and doesn’t train L well. Once the training is over, L and T do some trials and, as expected, there is a problem with L’s play. L catches the ball differently each time, using either the right or the left hand at random, with no recognisable pattern of any sort. What does this mean? The ball gets thrown wherever the other player pleases, right? So whenever a ball comes to the right and L happens to reach with the left arm (or the other way around), it’s a miss. And since L picks an arm at random, that happens a lot, making them the most unreliable player!

Do you see the problem here? This problem is called VARIANCE. Variance is about how consistent, or reliable, L is in catching the ball. If L has high variance, their catching style is all over the place, and we can never predict how well they will catch the ball on any given throw. It’s the same in machine learning: if a weather model’s technique is all over the place, its predictions might not be really accurate! ;)
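The all-over-the-place catcher can be sketched the same way. This is a loose illustration of the analogy rather than a formal definition of variance, and the function names, the 20-throw session length, and the number of sessions are all assumptions. L picks a hand at random, and we look at how much the catch rate swings from one short session to the next.

```python
import random
import statistics


def random_hand_catch(ball_side: str, rng: random.Random) -> bool:
    """Hypothetical erratic learner: picks a hand at random for every throw."""
    hand = rng.choice(["left", "right"])
    return hand == ball_side


def session_catch_rate(rng: random.Random, n_throws: int = 20) -> float:
    """Catch rate over one short practice session."""
    throws = [rng.choice(["left", "right"]) for _ in range(n_throws)]
    return sum(random_hand_catch(side, rng) for side in throws) / n_throws


rng = random.Random(42)
rates = [session_catch_rate(rng) for _ in range(200)]

print("average catch rate:", round(statistics.mean(rates), 2))
print("spread (std dev)  :", round(statistics.pstdev(rates), 2))
print("best / worst      :", max(rates), "/", min(rates))
```

On average L is not leaning to either side, yet no single session tells you much about the next one. That unpredictability is the part of the story the analogy calls variance.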

The ideal player

Now, who would be the ideal player? Let’s say T is a great teacher and L a great learner. T watches L throughout the training, providing corrective measures whenever L misses, training them well. Finally L is a trained player with excellent playing techniques. So who would be this excellent player?

L is a great player if they end up with the right balance, that is low bias (catches the ball well on average) and low variance (catches the ball consistently). Now given this, let’s project this to machine learning:

“finding the right balance between bias and variance helps create accurate and reliable models”
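For readers who like numbers, the balance can be sketched with a small simulation. Here each catch attempt is reduced to a single number: where L’s hand ends up relative to the ball, which sits at position 0. The three players, their offsets and spreads, and the helper `describe` are made-up assumptions for illustration. The sketch relies on one standard identity: the mean squared miss distance equals bias squared plus variance.

```python
import random
import statistics


def describe(reaches, target=0.0):
    """Return (bias, variance, mean squared error) of hand positions."""
    bias = statistics.mean(reaches) - target
    variance = statistics.pvariance(reaches)
    mse = statistics.mean([(r - target) ** 2 for r in reaches])
    return bias, variance, mse


rng = random.Random(7)
n = 5000
# Assumed (mean offset, spread) for each hypothetical player.
players = {
    "biased (scenario #1)": [rng.gauss(2.0, 0.2) for _ in range(n)],
    "erratic (scenario #2)": [rng.gauss(0.0, 2.0) for _ in range(n)],
    "ideal player": [rng.gauss(0.0, 0.2) for _ in range(n)],
}

results = {name: describe(reaches) for name, reaches in players.items()}
for name, (bias, variance, mse) in results.items():
    print(f"{name:22s} bias^2={bias ** 2:.2f}  variance={variance:.2f}  mse={mse:.2f}")
```

The biased and the erratic player end up with a similar overall error for very different reasons, while the ideal player keeps both terms small. Real models usually cannot shrink both for free, which is why people talk about a balance.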

The Throw-and-Catch ML Impact

I realised how easy it is to explain some ideas when we simplify them and map them to things we see in the real world. This really helped me look at the concepts from a different angle compared to the deeply technical way I have been talking in recent years. This could be the beginning of a series on such topics or a standalone write-up. But I wish to get back to the system that helped me build “An ‘Idiot’s Guide’ for Explaining Things” and “How does DNS work?” and look at technology with my idiot mind’s curiosity and analogy-seeking behaviour.

Bonus thoughts from a sleepless night

0.0 The self-eating machine learning systems

Recently I have been reading a ton about how training data impacts modern ML, especially when ML output is used to train more ML. So here are some bonus thoughts about that. The idea is from my own brain, and I could be wrong in more than one way in trying to analyse this complex scenario and its associated problems with my idiotic mind.

Before we begin, here’s a fun fact about rabbits: Rabbits Eat Their Own Poop.

For simplicity, we can call the instructions provided by T the dataset that L learnt from. Now let’s think of a Scenario #x where L from scenario #1 later becomes an indirect teacher for a kid named L1. Let’s assume L has a flawed, lightly biased technique that wasn’t much of a problem in the initial days of the game, causing only slightly lower catching accuracy. L1 watches and learns from L’s play. So L1’s data is the lightly flawed output of L, and L1 eventually assumes that’s how the game is to be played. L1 may not be as good as L, and ends up being, say, 90% as good as L. Now let’s say this goes on: L1 teaches an L2, L2 teaches an L3, and so on, with every learner in the series ending up 90% as good as the one before. See where this is going? This is what recent research has been describing: irreversible defects creep into systems as generative models learn from datasets created by other generative models, or by a chain of that sort. For instance, see this paper as food for thought: “The Curse of Recursion: Training on Generated Data Makes Models Forget”.
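The arithmetic of that chain is easy to sketch. The 90% figure is just the illustrative number from the story, and `skill_after` is a hypothetical helper; degradation in real generative models is not a simple fixed percentage per generation, so treat this only as a picture of how small losses compound.

```python
def skill_after(generations: int, retention: float = 0.9, initial: float = 1.0) -> float:
    """Skill left after a chain of learners, each keeping `retention` of the previous one."""
    return initial * retention ** generations


for generation in (0, 1, 5, 10, 20):
    print(f"L{generation}: {skill_after(generation):.3f}")
```

Under this assumption, the tenth learner in the chain is already down to roughly 35% of L’s skill, and the twentieth to about 12%, even though no single hand-over looked dramatic.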

Note on the images: The images used in this article are all AI-generated using Leonardo.Ai. Try it out and explore GenAI image generation using many different pre-trained and fine-tuned models. Find these images and more on my Leonardo.Ai profile.

That’s how I created the story and explained bias and variance to my friend. And with this, the support system in my mind that creates analogies when explaining technological ideas to people got re-awakened.

Anand Jagadeesh

⌨ Writes about: ⎇DevOps, 🧠ML/AI, 🗣️XAI & 💆Interpretable AI, 🕸️Edge Computing, 🌱Sustainable AI