Walt Tetschner, longtime editor of the monthly ASRNews newsletter, has come out with a rather curmudgeonly review of Siri in the October issue (published today, November 9, 2011). Here is an excerpt from the section of the newsletter that focuses on Siri, titled “SIRI positioning is a bad mistake”:
Apple has positioned SIRI as a personal assistant. By doing so, they are setting expectations that will be a challenge for SIRI to achieve. If your assistant makes an error, it might not bother you the 1st time. You correct it, and expect the assistant to learn to do the chore correctly the next time. When the next error occurs, you probably aren’t as forgiving. By the 3rd error, you are thinking of firing the assistant. Data exists that indicates that a similar thing happens with a speech-enabled assistant. As time goes on and the speech-enabled assistant keeps making errors, users simply stop using it. As a general purpose assistant for dealing with all communications, SIRI is simply inappropriate. Aside from the lack of robustness, speaking to a mobile phone is often totally inappropriate. When other people are around, it invariably is wrong. It isn’t private and can disrupt and irritate others. Talking to a machine is perceived as socially weird behavior. Speech is totally appropriate and most effective for use in a hands-eyes busy environment. It is often the only safe way of communicating. Apple would have more appropriately positioned SIRI as a tool for hands-eyes busy communications. One of the weak spots of SIRI is that it requires the user to push a button to get it to recognize speech. SIRI needs to add the Sensory Truly Hands Free technology. The recent 2-day SIRI power outage made it clear that SIRI needs a data connection to do local tasks like play a song or schedule an appointment. This further limits its utility. Siri has gotten a lot of visibility since it was made available. The primary utility appears to be amusement, though. Most users have found it more entertaining than actually helpful. It’s amusing, but how much can it handle your day to day tasks? Two highly publicized speech-enabled personal assistants have failed in the past. Wildfire failed in the late 1990s and General Magic failed in 2002. Users claimed that they loved them. 
They had high expectations for the products. Over time, these expectations were not met and the users simply stopped using them.
First, having witnessed Mr. Tetschner’s decades of sustained, shrill complaints about how badly speech technology has underperformed, I find it surprising that he did not bother to mention that Siri’s speech recognition is remarkably accurate. I have owned an iPhone 4S for almost 3 weeks now and have been using Siri daily to send email, text, voice dial, look up stuff, or just goof off, and my awe at the level of accuracy has yet to wear off. It is not perfect, but it sure is close; so close that I wonder if its error rate is comparable to that of a human (and the human error rate is NOT zero). And I love the fact that I now type and peck and swipe a lot less than I used to.
Second, Mr. Tetschner seems to miss some pretty basic aspects of Siri that put it in a unique position compared to what has come before. First and foremost is the fact that Apple is behind it. Why is that important? For one, Apple cares about the user experience, and so it will make it a mission to do all that it can to improve it. For another, because it is Apple, it has the resources needed to invest in such improvement. And finally, Apple cares a lot about its brand and will not let any of its products tarnish it. Apple will not let Siri fail, nor will the legions of Apple users who love Apple’s daring vision and understand fully why Apple is moving in the direction of Siri.
Third, Mr. Tetschner betrays the outlines of the small box within which he seems to have confined his vision of what we should ultimately strive for in a speech interface: let’s use speech only when our eyes and hands are busy. Really? If I can reliably dictate an email, even when my eyes and hands are NOT busy, do you think I will bother typing it? To be sure, I will type that email if I can’t dictate it privately, but when I can, I will dictate. And I will voice dial in most scenarios that I can think of, except in meetings, where I shouldn’t be dialing anyone in the first place. Apple is going for the real thing, the most natural user interface that humans can interact with: naturally spoken language. When Steve Jobs kept repeating “It’s that simple” and “It just works” in his last keynote, at WWDC back in June, he really meant it, and the ultimate interface that is that simple and just works is the spoken word.
Fourth, it seems to me that Mr. Tetschner is missing (or is not aware of) the fact that Siri is a service in the cloud that continuously learns and improves. His comparisons with Wildfire and General Magic are off the mark. Even if those products are to be called failures, past failures are not necessarily predictors of future failure. Was the Apple Newton (a cousin of General Magic) a failure? To some it was, but in my eyes it was simply technology before its time, and in any case the technology was at the very least the conceptual glint in Apple’s eye of what would later become the iPad. But more crucially, the key difference between Wildfire and General Magic on one side and Siri on the other is that Siri trains against the user’s voice, lives in the cloud, and is continually learning. None of the previous “Assistants” can make that claim.
Siri has a long way to go. There will be outages, it will behave stupidly, and its recognition will fall short of perfection. But for those in the field who have been dreaming of the day when we can just talk to our machines naturally, without having to peck and swipe and tap, Siri is a monumental step forward. The birth of Siri is an occasion to rejoice and to cheer, rather than to pretend that it’s business as usual, because it is not business as usual.