3 Ways Watson Manifests the Future of Search
In my last post, I made the claim that it won’t be long before people have a Watson in their pockets, a system users can access from mobile devices through the Internet that gives accurate answers to just about every human question. I said that this system represents some sort of future ideal state for search. But I didn’t elaborate on how it works or why it is so good at answering human questions. Perhaps if we can better understand how it works, we can understand what the likely future of search technology looks like. That’s what I want to do in this post.
Like most of my IBM colleagues, I made a point of watching the PBS program The Smartest Machine on Earth, about IBM’s quest to invent a computer that can beat the top Jeopardy! champions in the history of the show. As Watson competes on Jeopardy! next week, we are all eagerly awaiting the outcome, as I’m sure hundreds of thousands of IBMers did when Deep Blue took on World Chess Champion Garry Kasparov almost 14 years ago.
Win or lose, Watson has already made history. Jeopardy! producers would only allow a computer on the show if it played the game like a champion. I want to tell a part of that history here and, along the way, explain three facets of Watson that manifest the future of search. If you’re interested, please read on.
Semantic Smarts
To avoid the ridiculous scenarios highlighted in those annoying Bing commercials, one of the first things the IBM Research team did when it began building Watson was to give it semantic smarts. Rather than simply looking at strings of letters and spaces in the clues and matching them with other strings of letters and spaces in its vast data bank, Watson needed to parse the language of the clues and the language in its data bank. It looks at parts of speech, phraseology, and other semantic clues from the structure of the English language that tend to turn meaningless strings of letters and spaces into meaningful words, phrases and sentences.
One of the requirements for a Jeopardy! player is that it cannot receive any outside help, including from the Internet. So all the knowledge Watson needs has to be loaded into its permanent memory: encyclopedias, thesauruses, info from pop culture sites, etc. When the clue is given, Watson searches through all of this data to find the most likely question to which the clue is an answer. If that search algorithm lacked semantic smarts, the results would look like what happens in those Bing commercials: random strings of text that show up in the system with the most frequency but with little relevance.
As I have written elsewhere on this site, most search engines today build some form of semantic smarts into their algorithms. But the practice is somewhat limited. As far as I can tell, the Google crawler scans sites mostly by brute force and indexes their assets according to a lot of simple rules, which mimic the way users scan and engage with web content. Once all that data is indexed, the engine tries to match the query to the sites in that index that best satisfy the mix of rules. Most of the changes to the algorithm deal with how the rules are weighted, not with the rules themselves. For example, we think Google tends to weight the rule that favors pages with more external links pointing to them more heavily as time goes on.
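To make the idea of weighted rules concrete, here is a minimal sketch. It is not Google's algorithm. The rule names, signal values and weights are all invented for illustration. It only shows how re-weighting the same fixed set of rules can reorder results without changing any rule.

```python
from typing import Dict, List, Tuple

# Hypothetical per-page rule signals in the range 0..1 (made up for this sketch).
PAGES: Dict[str, Dict[str, float]] = {
    "page_a": {"keyword_match": 0.9, "inbound_links": 0.2, "freshness": 0.5},
    "page_b": {"keyword_match": 0.6, "inbound_links": 0.9, "freshness": 0.4},
}

# Two hypothetical weightings of the same rules; the later one favors inbound links.
EARLY_WEIGHTS: Dict[str, float] = {"keyword_match": 0.6, "inbound_links": 0.2, "freshness": 0.2}
LATER_WEIGHTS: Dict[str, float] = {"keyword_match": 0.3, "inbound_links": 0.5, "freshness": 0.2}


def score(signals: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of rule signals; rules without a weight contribute nothing."""
    return sum(weights.get(rule, 0.0) * value for rule, value in signals.items())


def rank(pages: Dict[str, Dict[str, float]], weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (page, score) pairs sorted from best to worst."""
    scored = [(name, score(signals, weights)) for name, signals in pages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print("early:", [name for name, _ in rank(PAGES, EARLY_WEIGHTS)])
    print("later:", [name for name, _ in rank(PAGES, LATER_WEIGHTS)])
```

With these made-up numbers, the page with the stronger keyword match ranks first under the early weights, and the page with more inbound links overtakes it once the link rule is weighted more heavily.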
If Google wanted to add more semantic smarts to its algorithm, it could look beyond keyword density, proximity and other counting mechanisms. It could look at how the keywords are related to other words within the grammar of the content. It must do some of this now because it seems to be able to distinguish, for example, the noun form of table from the verb form of table. But it can’t do a lot of it yet because it is just too computationally grueling. Watson does quite a bit more semantic analysis than any commercially available search engine I know of. By showing the power of semantic smarts, Watson is bound to spark smarter search engine development.
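The counting mechanisms mentioned above can be sketched in a few lines. The functions and example sentences below are hypothetical and are not taken from any real search engine. They only illustrate why counting alone cannot separate the noun "table" from the verb "table".

```python
import re
from typing import List, Optional


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_density(tokens: List[str], keyword: str) -> float:
    """Fraction of tokens that are exactly the keyword."""
    if not tokens:
        return 0.0
    return tokens.count(keyword) / len(tokens)


def min_proximity(tokens: List[str], first: str, second: str) -> Optional[int]:
    """Smallest distance in tokens between two words, or None if either is absent."""
    first_positions = [i for i, tok in enumerate(tokens) if tok == first]
    second_positions = [i for i, tok in enumerate(tokens) if tok == second]
    if not first_positions or not second_positions:
        return None
    return min(abs(a - b) for a in first_positions for b in second_positions)


# Made-up sentences: "table" is a noun in the first and a verb in the second.
NOUN_TEXT = "We sat at the table for dinner"
VERB_TEXT = "They voted to table the motion"

if __name__ == "__main__":
    for text in (NOUN_TEXT, VERB_TEXT):
        tokens = tokenize(text)
        print(text, keyword_density(tokens, "table"), min_proximity(tokens, "table", "the"))
```

Both sentences contain "table" exactly once, and in both it sits one token away from "the", so these counts look nearly identical. Telling the two uses apart requires looking at the grammatical role of the word, which is the extra semantic analysis the paragraph describes.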
Natural Language Processing
Jeopardy! provides all kinds of challenges that computers tend to struggle with. In particular, the clues in a Jeopardy! game are nothing like a search query. Savvy searchers tend to make simple and clear queries because search engines will return better results if the query is clear. Jeopardy! clues contain all kinds of puns and other ambiguities that could trip up the best search engines.
Part of the natural language processing stems from the semantic smarts built into Watson. But humans don’t parse complex English through semantics alone. Humans rely on pragmatics to resolve ambiguities and to parse fragments and other linguistic weirdness. Pragmatics is the study of the reasoning humans perform when they weigh many possible semantic interpretations of an ambiguous chunk of language, in order to determine the most likely interpretation.
We don’t even recognize most of this reasoning as we do it. We learn it as we learn to communicate in our formative years. But the science of how this reasoning works is by no means complete. In our book, we follow Sperber and Wilson’s brand of pragmatics, which uses relevance as the guiding assumption for all pragmatic reasoning. That is, when humans try to parse linguistic weirdness, they tend to assume that the reason someone says something or writes something is to maximize its relevance for the audience. This is a particularly fruitful assumption for our purposes because it is also the guiding assumption of search engines.
In the case of Jeopardy!, the clues involve more complexity than simple relevance can account for. So the Watson team needed to study the pragmatics of the clues from thousands of past Jeopardy! games, and build more sophisticated pragmatic assumptions into Watson. These assumptions can help Watson become not just a search engine but a sophisticated information system that approximates human understanding.
The way the team develops these reasoning mechanisms and plugs them into a search engine will enable a whole new brand of search, one that is tailored to the type of inquiry needed, not just relevance. For example, a medical diagnosis uses clues from a patient’s symptoms and circumstances to determine likely causes. With the right pragmatics, Watson could be tuned as the diagnostician’s dream desk reference. Still, I’d be happy for a natural language processor based on the assumption of intended relevance for the purposes of search within the web medium.
Machine Learning
In early tests, even with the latest semantic and natural language technologies, Watson did not compete well with past Jeopardy! players. The team needed to help Watson learn, just as children need to learn the subtle rules of their complex social and cultural environments in order to communicate. For most of us, this is a lifelong journey. Watson seems to have accomplished much of it in a few short months. How? Well, for one, Watson never sleeps. While the test players were resting their synapses, Watson was cranking through past Jeopardy! clues to build more sophisticated pragmatic processing.
Machine learning is not new. It is used in many programs that mimic human language to continually improve the way the program approximates humanity. One of humanity’s greatest strengths lies in our diversity. We each have our own unique way of saying things. We each have our own unique voice when we write. But programs need to take paradigm cases of users and mimic their behavior. This leads to a lot of bad pattern matching when someone speaks or writes in an inimitable way.
Thus speech recognition programs used to force users to speak in a particular way: the way their test users spoke when the programs were built. If you haven’t noticed, speech recognition programs have improved tremendously over the last decade. How? The more people used them, the more they learned the range of use cases, and the better they were able to recognize patterns among a diverse group of users.
This is how Watson has learned so much in such a short time. He has been cranking through the clues and learning patterns among the tremendous diversity of language use hidden within them. It’s not unlike how Google learns through continuous A/B testing. The pages that get the most engagement and the lowest bounce rates will rise in their rankings. Those with high bounce rates will fall in their rankings. If Google weren’t constantly adding pages to its index that use nefarious means to trick it into ranking them highly, its results would continuously improve. Because Watson has a finite universe of clues, eventually, he will master Jeopardy!, perhaps well enough to win the big match.
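The feedback loop described above, in which low bounce rates push a page up and high bounce rates push it down, might be sketched as follows. This is an illustration under stated assumptions, not a description of how Google or Watson actually learns. The `PageStats` class, the target bounce rate and the learning rate are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageStats:
    """Hypothetical record of a page's ranking score and observed engagement."""
    name: str
    score: float
    visits: int = 0
    bounces: int = 0

    def bounce_rate(self) -> float:
        """Share of visits that left immediately; 0.0 when there is no data."""
        return self.bounces / self.visits if self.visits else 0.0


def update_scores(pages: List[PageStats],
                  target_bounce_rate: float = 0.5,
                  learning_rate: float = 0.1) -> None:
    """Nudge each score up when the bounce rate is below the target, down when above."""
    for page in pages:
        if page.visits == 0:
            continue  # no evidence yet, so leave the score alone
        page.score += learning_rate * (target_bounce_rate - page.bounce_rate())


def ranked(pages: List[PageStats]) -> List[str]:
    """Page names ordered from highest to lowest score."""
    return [p.name for p in sorted(pages, key=lambda p: p.score, reverse=True)]


if __name__ == "__main__":
    pages = [
        PageStats("sticky", score=1.00, visits=100, bounces=20),
        PageStats("bouncy", score=1.05, visits=100, bounces=90),
    ]
    print("before:", ranked(pages))
    update_scores(pages)
    print("after: ", ranked(pages))
```

In this made-up example the page that visitors abandon starts slightly ahead, and a single round of feedback is enough for the page that holds visitors to overtake it. A real system would need far more care, for example to resist the nefarious tricks the paragraph mentions.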
Watson has shown that using machine learning can greatly improve how it understands both the semantics and the pragmatics of human language. Google’s program of machine learning shows the way for more pervasive use of this practice to continually improve search results for users. I expect machine learning to be an integral part of all future search engines.