Machine Learning Explained : Introduction
If you’re curious about the buzzwords taking over the tech world and want to dive into the fascinating world of AI, start here 👉 Explore AI with me.
In my previous blog, I broke down the hierarchy of AI technologies and explained Large Language Models (LLMs) in detail:
đź”— Introduction to Large Language Models (LLMs)
In this post, we will understand Machine Learning. The goal of Machine Learning is to discover patterns in data. More specifically, it’s about finding a pattern that describes the relationship between an input and an outcome. Whether it’s predicting a movie genre, identifying an image, or determining the sentiment behind a sentence, Machine Learning aims to solve these problems by learning from examples. In this article, we will explore how Machine Learning can tackle different types of tasks, why these examples are important, and what conclusions we can draw from them. By the end, you’ll also have an idea of what other concepts are key to understanding Machine Learning.
What We’ll Cover
- Introduction to Machine Learning with examples from movies
- Image classification explained
- Sentiment analysis in text
- Complexity in Machine Learning models
- Conclusion and key takeaways
Introduction: Understanding Machine Learning
To understand Machine Learning, it’s best to start with something relatable. Let’s say we would like to distinguish between two of my favorite movie genres: action and romantic comedies. If you’re not familiar with those genres, here’s a very quick intro that will help us understand the task. Action movies are known for their explosions, car chases, and over-the-top stunts, while romantic comedies (rom-coms) are characterized by sappy love stories, comedic misunderstandings, and more hugs and kisses than one might be able to stomach on a rainy Sunday afternoon.
Machine Learning in Practice: Predicting Movie Genres
Predicting movie genre is an example of a classification problem. Suppose we have 20 movies, and we know each movie’s “explosion count” and “love confession density,” two metrics that can be easily measured or estimated. In addition, we’ve labeled each one with a genre, either action or rom-com. When we visualize the data, we notice that high explosion count, low love density movies are mostly action, whereas movies with minimal explosions and an abundance of love confessions are mostly rom-coms. Makes sense, right?
But here’s the thing—labeling every movie by hand takes way too much time (trust me, I tried, and it wasn’t pretty). Instead, we could learn the relationship between the metrics (explosions, love confessions) and the genre so that we can predict future movies based purely on these characteristics.
In Machine Learning terms, this is a classification problem, because the outcome variable (the genre) can only take on one of a fixed set of classes—in this case, action or rom-com. This is different from a regression problem, where the outcome is a continuous value, like the length of time you’ll be wondering why on earth you watched that three-hour blockbuster.
We can now “train” a Machine Learning model (or “classifier”) using our labeled dataset, i.e., our collection of movies with known genres. Visually speaking, what training the model does here is find the line that best separates the two classes.
How is that useful? Well, now that we know this line, we can make a prediction for any new movie based on where it falls—explosive action, or lovey-dovey rom-com? The further away a movie is from the line, the more confident we are that our prediction is right—like if there are 100 explosions and no love confessions, I’m pretty sure it’s not “Sleepless in Seattle.”
In Reality, Things Are Often More Complex
Of course, reality is more complex than two simple metrics. The boundary that separates action from rom-com might not be a straight line—it could be curved, or look like a wiggly spaghetti noodle. In addition, we might have more than just two genres: throw in horror, sci-fi, and musical, and suddenly our classifier has to deal with more than two classes. We might even have a hundred input variables: number of songs, presence of aliens, average velocity of car chases, etc.
The key takeaway here is that as the relationships become more complex, we need more powerful Machine Learning models to capture the patterns—and we need more data too. After all, it’s hard to make a solid prediction about a musical-action-horror-rom-com with only a few examples.
Image Classification Example
Let’s switch gears to a different problem: image classification. For instance, let’s say we have an image of a dog wearing sunglasses (because why not). We want to figure out if this cool pup is actually a “dog,” a “wolf,” or a “cat” (it’s 2024, we have to be open-minded).
We know this is a classification problem because the output can only be one of a few fixed classes. But how would a computer figure out whether our suave friend is a dog, wolf, or cat? It turns out that images are already numeric—an image is made up of pixels, and each pixel has a value representing color intensity. So, in theory, we could feed these pixel values directly into a Machine Learning model.
But here’s the catch: even a small 224×224 image has more than 150,000 pixels, and these are all potential inputs for our model. Plus, the relationship between raw pixels and the image label (dog, wolf, or cat) is extremely complex. Imagine trying to determine if an animal is a dog or a cat just by looking at one pixel at a time—you’d probably just end up very confused (and so would the model).
Sentiment Classification Example
Now let’s move on to something even more challenging—understanding emotions. Imagine you’re reading a movie review that says, “This movie was a blast… if you enjoy watching paint dry.” The sentiment here is negative, though the words could be misleading if taken out of context.
In this scenario, our input is a sentence, and our outcome is the sentiment—positive or negative. To make this work for a Machine Learning model, we first need to turn words into numbers. We do this using word embeddings, which are like little numeric representations that capture the meaning of each word. These embeddings can be fed into a model that tries to learn the complex relationship between the words and the sentiment of the whole sentence. Think of it as teaching a model not only how words sound but how they “feel” in different combinations.
Of course, this leads to the same issues as before: longer sentences mean more inputs, and the relationship between language and sentiment is complex—especially when sarcasm gets involved. We need powerful models and lots of data, which is where Deep Learning comes into play.
Why Are These Examples Important?
So, why did we go through all these examples? Each of these examples illustrates a different kind of problem that Machine Learning can solve. Whether it’s classification (like identifying a movie genre, an image, or sentiment) or understanding complex relationships, Machine Learning is versatile and can be applied to a wide variety of domains. These examples give you an idea of the power of Machine Learning and its potential applications.
The main takeaway here? Machine Learning is all about finding relationships between inputs and outcomes and making predictions based on those patterns. The more complex these relationships become, the more sophisticated our models need to be. Whether it’s predicting a movie genre, identifying a wolf in sunglasses, or figuring out if someone loved a film or hated every second of it, Machine Learning helps us uncover patterns and make sense of the chaos.
Conclusion and What Comes Next
Machine Learning isn’t magic—it’s about finding patterns and applying them in practical ways. We looked at movie genres, image classification, and sentiment analysis to illustrate the versatility of Machine Learning. The more complex the problem, the more sophisticated the approach we need.
Further, we’ll dive into more advanced topics like neural networks, different types of Machine Learning (supervised, unsupervised, and reinforcement learning), and how to choose the right model for a given problem. So stick around, because the world of Machine Learning is full of surprises (and more dogs wearing sunglasses).
Now, if you’ll excuse me, I have to train a model to identify the difference between a thriller and my mother-in-law’s annual Thanksgiving speech—wish me luck!