Have you ever used your voice to ask Siri, Alexa, Cortana, or Google Assistant a question?
If so, then congrats, you’re already a natural language processing expert!
Just kidding, but you’re well on the way to getting started.
Natural language processing (or NLP) is actually pretty complex (surprise, surprise), but voice command assistants like Alexa and Siri have made the ability to analyze and synthesize natural (aka human) language and speech more mainstream than ever before.
Thanks to some significant improvements in NLP software, computers have gotten a lot better lately at communicating with humans in our own languages, and they’re just getting started 🎉🎉
What is natural language processing?
Natural language processing is a sub-field of artificial intelligence that combines computer science and linguistics in order to enable computers to understand and process human language.
Basically since computers became a thing, scientists have wanted them to be able to understand natural language.
Computers have their own language, but it’s nearly impossible for humans to understand in any practical sort of way. So rather than try to decipher the strings of 1’s and 0’s that all programming languages are ultimately reduced to, our best bet for mass communicating with machines is to get computers to understand human-speak.
Since you can’t just give a computer a dictionary that covers every possible scenario related to real-world communication, scientists have had to look at ways of solving the language problem that are both flexible and scalable.
Unfortunately, language is really messy.
To start with, each human language (of which there are over 6,000) contains a huge amount of vocabulary and a basically infinite number of ways in which to arrange words in a sentence.
Next, you have to add in grammar and all of the different rules that inform how to structure a sentence.
Don’t forget all those pesky exceptions to the rule (I’m looking @ you “i before e”), words that sound the same but have different meanings, and of course, the words that sound different but actually have the same meaning. There’s wordplay when you’re feeling fancy and slang when you’re not. Then add in different accents and (mis)pronunciations, vernacular and regional dialects, speech impediments, slurred speech, and those vague sentences that could mean any number of things depending on the speaker and the context.
Generally speaking, these factors aren’t a problem for us humans.
After all, we’ve spent our whole lives training to communicate with other humans. Plus, there’s a wide range of signals that our brains factor in when processing speech like body language, intonation, and contextual clues.
This complexity is hard to capture though, so researchers use natural language processing techniques to help computers try and understand exactly what we mean when we communicate with them.
Language processing: the basics
Breaking information down into smaller chunks of data allows computers to more easily access information and predict the likely relationship between words. By treating language as interchangeable blocks, computers have become pretty good at understanding human language.
So, for example, in English there are 9 basic types of words: conjunctions, verbs, adjectives, nouns, interjections, pronouns, adverbs, articles, and prepositions. These are known as parts of speech.
Knowing the type of words being spoken or written is the first step in untangling the complexities of language, but words alone aren’t enough for a computer as a single word can take on multiple different meanings.
Take the words rose and leaves. Are we talking plants or describing someone getting up and departing from a location?
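To make the ambiguity concrete, here’s a toy sketch in Python. A real part-of-speech tagger learns contextual rules like this from huge amounts of tagged text; this hand-coded version uses just one made-up rule (an ambiguous word after a determiner is probably a noun) to show the idea:

```python
# Toy illustration: the same word can be a different part of speech
# depending on context. Real taggers learn such rules from data;
# here we hard-code a single contextual rule for demonstration.

AMBIGUOUS = {"leaves", "rose"}

def toy_tag(sentence):
    """Tag each word as NOUN, VERB, or OTHER using one context rule:
    an ambiguous word right after a determiner is a noun, otherwise a verb."""
    words = sentence.lower().split()
    tagged = []
    for i, word in enumerate(words):
        if word in AMBIGUOUS:
            prev = words[i - 1] if i > 0 else ""
            tagged.append((word, "NOUN" if prev in {"the", "a", "an"} else "VERB"))
        else:
            tagged.append((word, "OTHER"))
    return tagged

print(toy_tag("the leaves fell"))   # 'leaves' after 'the' -> NOUN
print(toy_tag("she leaves early"))  # 'leaves' after a pronoun -> VERB
```

The same string, “leaves”, comes out as a noun in one sentence and a verb in the other, purely because of the word next to it. That’s context doing the heavy lifting.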
That’s where grammar rules come in. These inform how to create sentences that actually make sense in different languages.
In English, there’s a pretty big difference between:
“The cat sat on the hat” and “Sat the on hat cat the”.
Grammar provides additional contextual information necessary to construct comprehensible sentences.
Once you have the word types and the grammar rules in place, it’s relatively easy to construct what’s known as a parse tree, which tags every word with its part of speech.
A parse tree is basically an illustration of the grammatical structure of a sentence.
Presto – you can now see how a sentence is constructed.
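In code, a parse tree is just a nested structure. Here’s a hand-built tree for “The cat sat on the hat”, using standard phrase-structure labels (S = sentence, NP = noun phrase, VP = verb phrase, PP = prepositional phrase); a real parser would produce something like this automatically:

```python
# A parse tree represents a sentence's grammatical structure as nested
# phrases. This one is written by hand for illustration; a natural
# language parser would build it automatically from the raw sentence.

tree = ("S",
        ("NP", ("DT", "The"), ("NN", "cat")),
        ("VP", ("VBD", "sat"),
               ("PP", ("IN", "on"),
                      ("NP", ("DT", "the"), ("NN", "hat")))))

def leaves(node):
    """Walk the tree and collect the words at the leaves, in order."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]  # a leaf: (part-of-speech tag, word)
    words = []
    for child in children:
        words.extend(leaves(child))
    return words

print(" ".join(leaves(tree)))  # The cat sat on the hat
```

Reading the nesting from the top down, you can see exactly how the sentence hangs together: a noun phrase (“the cat”) doing a verb phrase (“sat on the hat”).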
Natural language parser programs work out the grammatical structure of sentences in order to produce the most likely analysis of new sentences – their development was one of the biggest breakthroughs in NLP.
There are many different techniques available, but ultimately, NLP is about analyzing and synthesizing natural language and is used for things like chatbots and voice command interfaces.
Why natural language processing matters
Data comes in two forms, structured and unstructured.
Structured data is really well organized and looks like a spreadsheet with rows and columns. Computers love structured data.
Unfortunately (for computers) a lot of the information in the world is unstructured. It looks like emails and tweets, snippets of conversations, Instagram captions, and texts to friends.
It looks like the stuff that language is made up of.
In short, unstructured data is messy.
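A quick sketch of the divide: a spreadsheet-like record can be queried directly, while free text needs at least one NLP step before a computer can do anything useful with it. Here that step is just normalizing and counting words (the review text and field names are made up for illustration):

```python
# Structured data: direct lookup, no interpretation needed.
import collections
import string

structured = {"name": "Ada", "city": "London", "orders": 3}
print(structured["orders"])  # 3 -- just read the field

# Unstructured data: a messy blob of human language.
unstructured = "Loved the coffee, loved the service. Will be back!"

# A first NLP step: strip punctuation, lowercase, and count words,
# turning free text into structured (word, count) pairs.
words = unstructured.lower().translate(
    str.maketrans("", "", string.punctuation)).split()
counts = collections.Counter(words)
print(counts.most_common(2))  # [('loved', 2), ('the', 2)]
```

Word counts are about the crudest structure you can impose on text, but even that is enough to start asking questions like “what do customers mention most?”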
Still, we continue to produce and output unstructured data in staggering amounts, every minute of every day. NLP provides the tools that allow us to analyze and use this data.
At the same time, machine learning and AI algorithms continue to become more and more popular. NLP allows non-programmers to get useful and personalized information from computers.
Take Google’s search engine as an example. Remember when you used to have to search for something based on keywords? It worked but it was clunky and made you pose your question in an unnatural way. Now you can search for something online in much the same way you would in person.
As voice-based virtual assistants like Alexa and Siri continue to develop, the ability to use natural language to get information from our increasingly intelligent machines will get better and better.
But voice controlled systems are just the beginning. Companies are investing a lot of time and money into developing more advanced NLP algorithms that not only understand human language, but are able to produce correct responses in a range of complex situations.
Get ready to say hello to the next wave of virtual assistants 👋
Future of natural language processing
Parsing and generating text works well for sentences and phrases with relatively straightforward construction, but things begin to get a bit funky when you speak casually, the way you would in a real conversation.
It’s still very hard to have a natural conversation with a computer.
Google, however, has taken this challenge on and has made some remarkable breakthroughs with what they call Duplex, a technology that conducts natural “real world” conversations on the phone for you, like scheduling certain types of appointments or booking tables at restaurants.
Like a robo-PA.
The remarkable thing is that it actually sounds like a real person talking. Google has shared recordings of Duplex in action, including one of it scheduling a hair salon appointment and another of it calling a restaurant to make a reservation.
Duplex makes “the conversational experience as natural as possible, allowing people to speak normally, like they would to another person, without having to adapt to a machine.”
Not only does the system sound almost indistinguishable from an actual human talking, but the system is also able to accurately respond to a range of conversational challenges, including foreign accents, hesitations, and general disruptions you’d expect to encounter in regular, everyday speech.
It’s likely that in the coming years, chatbots and virtual assistants will continue to develop to the point where we don’t realize that we’re actually interacting with a machine and not with another human.
Want to learn more about natural language processing? Here are some useful guides: