If you don’t happen to spend all of your free time obsessing over logistic regressions, gradient descents, and big data, but still want to know what machine learning is and why everyone’s going crazy over it, then this guide is for you. Read on to find out what all the hype is about and why machine learning is one of the coolest things to get into right now.
What is machine learning?
It seems like every time I open my browser there’s a new article or email telling me that machine learning is where it’s at and that it’s going to solve pretty much every problem out there.
And I’m like, “no need to tell me, this is why I get out of bed in the morning.”
Before diving into the crazy cool world of machine learning though, let’s first talk about some of the things machine learning is and some of the things it isn’t.
Machine learning isn’t:
🔮 A crystal ball
⛲ The fountain of youth
🤖 Robots taking over the world
Machine learning is:
💪 An incredibly powerful way of making predictions
📚 A way of making machines smarter by learning from their mistakes
🧠 The path towards creating more sophisticated artificial intelligence
Machine learning is a sub-field of artificial intelligence and uses algorithms to make computers learn from data without being explicitly programmed to do so.
When you feed these algorithms huge amounts of data, they can adjust and improve themselves without step-by-step human input. This is great if you’re into doing IRL stuff away from your computer and letting the machine do the legwork (or brainwork).
Making a computer learn basically means you “teach” it by feeding it loads of data over and over until it’s able to solve problems with a high degree of accuracy on its own.
Generally, this data takes the form of observations of real-world interactions, such as website clicks, online transactions, or search queries.
With data in hand, we can get down to the serious business of making the machine accurately tell us whether the images below are of a) a cuddly feline friend or b) the thing you grab from the freezer after a hard day at work:
FYI – this technology isn’t just limited to kittens vs ice cream though. Computers are now learning to tell the difference between labradoodles and fried chicken, chihuahuas and muffins, and pugs and loaves.
While these are undoubtedly important applications of developments in this field, there are, of course, also plenty of non-animal related uses for machine learning that are getting people seriously excited.
Machine learning: the basics
Machine learning is effectively based on two different techniques: supervised learning, which works with labeled data to make future predictions, and unsupervised learning, which is used to find hidden patterns in data.
A supervised learning algorithm takes a known set of input data and the known responses (outputs) that go with it, then trains a model to generate predictions for new data. Supervised learning is the right choice when you already have known examples of the outcome you are trying to predict.
Let’s say you want to quit your job and become a full-time blogger. You have a list of how much money other bloggers have made and how long each blog has been live and you want to know how many months it will take before you’re earning enough money to start feeling flush.
Linear regression is a great example of a supervised learning algorithm that you could use to answer this question because you already have data with labels ready to go (number of months and amount of money earned).
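To make the blogger question a bit more concrete, here is a minimal sketch of simple linear regression in plain Python. The numbers are made up for illustration, and the names fit_line and months_to_reach, as well as the 2000-per-month target, are assumptions rather than anything from a real data set.

```python
# Hypothetical data: how many months each blog has been live,
# and how much that blog currently earns per month.
months = [1, 3, 6, 9, 12, 18]
earnings = [50, 180, 400, 560, 790, 1150]


def fit_line(xs, ys):
    """Ordinary least squares with one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def months_to_reach(target, slope, intercept):
    """Invert the fitted line to estimate when earnings hit the target."""
    return (target - intercept) / slope


slope, intercept = fit_line(months, earnings)
print(f"Earnings grow by roughly {slope:.0f} per month")
print(f"Months to reach 2000 per month: {months_to_reach(2000, slope, intercept):.1f}")
```

The labeled data (months and earnings) is what makes this supervised: the line is fitted to known answers and then used to predict an answer you don’t have yet.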
Types of supervised learning
There are two main methods of supervised learning that you’ll want to get familiar with: regression and classification.
Regression is used to predict continuous responses that are measured along a sliding numeric scale. This could be changes in house prices, temperature fluctuations, or profit. A common regression algorithm is linear regression.
Classification methods predict discrete responses, which means the outcome can be placed into a category or classification. For example, a classification method could be used to determine whether an email is or is not spam, whether it may or may not rain tomorrow, or whether a political candidate is or is not likely to win.
In contrast to regression methods such as linear regression, which look at the relationship between variables on a sliding scale, classification works best when data can be categorized or separated into specific, distinct groups. Two popular classification algorithms are logistic regression and support vector machines (SVMs).
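As a rough illustration of classification, the sketch below trains a tiny logistic regression spam filter with gradient descent in plain Python. The two features (exclamation marks and mentions of "free"), the six training emails, and the function names are all hypothetical; a real spam filter would use far more data and features.

```python
import math

# Hypothetical training emails: (exclamation marks, mentions of "free").
# Label 1 means spam, label 0 means not spam.
features = [(0, 0), (1, 0), (0, 1), (5, 2), (4, 3), (6, 1)]
labels = [0, 0, 0, 1, 1, 1]


def sigmoid(z):
    """Squash any number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(weights, bias, x):
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)


def train(features, labels, learning_rate=0.1, epochs=2000):
    """Batch gradient descent on the logistic (log) loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict_probability(weights, bias, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def classify(weights, bias, x):
    """Return 1 (spam) when the predicted probability is at least 0.5."""
    return 1 if predict_probability(weights, bias, x) >= 0.5 else 0


weights, bias = train(features, labels)
print(classify(weights, bias, (7, 2)))  # a shouty email full of "free"
print(classify(weights, bias, (0, 0)))  # a calm email
```

Notice that the output is a category (spam or not spam) rather than a number on a sliding scale, which is exactly what separates classification from regression.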
Unsupervised learning works by taking unlabeled data and finding intrinsic structures, or patterns, within that data.
Let’s say that, after running our (supervised learning) linear regression analysis, you went ahead and quit your job and have been running your blog for a while now.
Since you’re obviously into data, you’ve been collecting information about your readers from day 1 such as their gender, age, location and other attributes that might help you build a clearer picture of who your readers are.
You could use an unsupervised learning algorithm like k-means clustering to split this data into the natural clusters that exist within it, such as 20-something fashion lovers or 30-something dog owners.
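Here is a minimal sketch of how k-means might group readers, written in plain Python. The reader data (age and dog-related posts read per month) is invented for illustration, and the starting centroids are passed in by hand to keep the example repeatable; real implementations usually pick starting points at random.

```python
import math

# Hypothetical reader data: (age, dog-related posts read per month).
readers = [(22, 1), (24, 0), (26, 2), (23, 1),
           (33, 9), (35, 11), (31, 8), (36, 10)]


def closest_centroid(point, centroids):
    """Index of the centroid nearest to the point (straight-line distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


def mean_point(points):
    """Average position of a group of points."""
    return tuple(sum(values) / len(points) for values in zip(*points))


def k_means(points, initial_centroids, max_iterations=100):
    """Alternate between assigning points and moving centroids until stable."""
    centroids = list(initial_centroids)
    for _ in range(max_iterations):
        assignments = [closest_centroid(p, centroids) for p in points]
        new_centroids = []
        for index, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == index]
            # Keep the old centroid if no points were assigned to it.
            new_centroids.append(mean_point(members) if members else old)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assignments


centroids, assignments = k_means(readers, [readers[0], readers[4]])
print(centroids)
print(assignments)
```

With this made-up data, the first four readers land in one cluster and the last four in the other, and nobody had to label them as "20-something" or "30-something" first. That is the unsupervised part.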
Types of unsupervised learning
There are two primary methods of unsupervised learning: clustering and data compression.
Clustering is used to explore data and find hidden patterns. This could be used by companies to better understand their customers, or by scientists trying to develop new drugs. A common algorithm for clustering is k-means clustering.
Data compression techniques perform something called dimensionality reduction, which basically removes redundant and unhelpful information from your data set so the algorithm only has to process the most important data. Principal component analysis (PCA) is a really common data compression method.
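To give a feel for dimensionality reduction, the sketch below squeezes two columns of data down to one using the idea behind PCA. It is a simplified, two-column-only version in plain Python that relies on a closed-form angle for a 2x2 covariance matrix; the data and function names are hypothetical, and real PCA code would typically standardize the columns and handle any number of them.

```python
import math

# Hypothetical reader data: (posts read per week, minutes on site per week).
# The two columns rise and fall together, so one of them is mostly redundant.
data = [(2, 21), (4, 39), (6, 62), (8, 79), (10, 101)]


def first_principal_component(points):
    """Return the mean and the unit direction of greatest variance (2D only)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # For a 2x2 covariance matrix, the main axis sits at this angle.
    angle = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    direction = (math.cos(angle), math.sin(angle))
    return (mean_x, mean_y), direction


def project(points, mean, direction):
    """Replace each 2D point with a single number along the main direction."""
    return [(x - mean[0]) * direction[0] + (y - mean[1]) * direction[1]
            for x, y in points]


mean, direction = first_principal_component(data)
compressed = project(data, mean, direction)
print(direction)
print(compressed)
```

Each reader is now described by one number instead of two, and because the original columns moved together, very little information is lost.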
Why machine learning matters
In case you weren’t convinced that interest in machine learning is currently skyrocketing, check out this chart that shows Google searches for the term “machine learning”:
Machines that learn are all the rage right now because they can often find patterns in large data sets much faster and more consistently than humans can, and in turn, can help us solve more problems and make better-informed decisions on some pretty complex challenges.
As the amount of data that humans produce continues to grow at an ever-increasing rate (to the tune of 2.5 quintillion bytes of data every day), the ability to work quickly and accurately through massive quantities of data is becoming a necessity.
Because these algorithms are playing an ever-increasing role in our daily lives, it’s important that as many diverse voices as possible are involved with machine learning and artificial intelligence.
Since algorithms only do what they’re trained to do, these systems need to learn from as many different sets of data as possible, which relies on the discretion of the person selecting and feeding in the data.
With this in mind, having a wide spectrum of people involved in selecting, processing, and analyzing the data will help to make algorithms that can effectively respond to the infinite complexities, situations, and possibilities of real-world scenarios.
Otherwise, they will make potentially costly, offensive and even life-threatening mistakes.
And those are not outcomes anybody wants.
The future of machine learning
Today, machine learning algorithms are used to:
- Suggest songs to you on Spotify, shows on Netflix, and posts on Facebook
- Help enable cars to drive themselves
- Write and publish sport match reports
- Identify faces in Facebook photos
- Keep spam out of your inbox
- Finish your sentence in Gmail
- Recommend products on Amazon
- Prevent fraudsters from using your PayPal account
- Let you have conversations with Siri and Alexa
And the list goes on. Basically, machine learning is increasingly all around us.
Despite significant advances though, there are some people who argue that the hype around machine learning is overblown and unwarranted. And to be fair, many of the algorithms aren’t even that new: they date from the 1970s.
What has changed though is the power and speed of our modern-day computers. In terms of processing power, your smartphone alone is more powerful than even the most OG of supercomputers from 40 years ago.
At the same time, we’re producing more data than in all of human history before us combined. And of course, machine learning is fueled by data.
The combination of more data and faster, more powerful computers has led to a surge in AI, but what does the future hold for machine learning?
Many are predicting more of the same, but bigger, faster, and cooler. This means more companies adopting machine learning techniques, with more real-world applications.
As the power of our computers increases so does the potential of machine learning algorithms.
At the same time, there is a push to improve interactions between humans and AI to the point where we don’t even know we’re interacting with a bot. Google Duplex is nearly there.
While we’re still a ways away from reaching the singularity (if indeed we ever will), machine learning and AI continue to transform the way we work, consume information, and even the way we understand the world around us.
Luckily, my friends, this is the best time to join the AI revolution and stake your claim on this wild frontier.
Interested in learning more about machine learning? Here’s a list of some of the best guides and tutorials. Elle Knows Machines is always growing, so check back regularly for more resources to help you up your machine learning game.
What is Artificial Intelligence?
What is Natural Language Processing?
Linear Regression: The Beginner’s Machine Learning Algorithm
DIY: Simple Linear Regression
Logistic Regression: Are You In or Out?