Microsoft Office Copilot Initial Reactions
Microsoft announced that its large language model (LLM) technologies are coming to Microsoft Office. As with all things AI these days, I’m deeply ambivalent.
On one hand, I can clearly see how language technologies like ChatGPT could save time and effort by scheduling meetings, fielding mundane and routine office requests, or summarizing meetings. If I could say, “Copilot, schedule a meeting with Jane next week at our earliest convenience, but preferably Tuesday” and have that just happen, my life would be measurably better.
On the other hand, this technology simulates human language and has a dubious connection to truth. Just as a video game simulates the world, these models simulate language. Gamers don’t, or at least shouldn’t, take a landscape or character as real, no matter how realistic it looks. When we play video games, we suspend our disbelief to be immersed in an imaginary world. With the possible exception of simulation games, we don’t expect our games to conform to the constraints of truth and reality. Even in flight simulators, where aviation enthusiasts go so far as to build replicas of cockpits in their garages, nobody really believes they are flying an actual P-38, F-16, or Airbus.
Why should I believe that the language or images produced by generative models are anything more than simulations? These models simulate in a precise sense: they sample, drawing examples from a high-dimensional joint probability distribution that reflects their training set and human inputs. As with every other simulation, I don’t mistake the output for the genuine article. In fact, we all know that these models are wrong, sometimes hilariously or shockingly wrong, and that they “hallucinate.”
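To make “simulate, i.e. sample” concrete, here is a minimal sketch in Python of the sampling step at the heart of these models. The vocabulary, the hard-coded logits, and the next_token_distribution function are hypothetical stand-ins for a trained network with billions of parameters; only the sampling mechanics are the point.

```python
import numpy as np

# Hypothetical toy setup: a six-word vocabulary and hard-coded logits
# standing in for a real model's output layer.
vocab = ["the", "meeting", "is", "on", "Tuesday", "Thursday"]

def next_token_distribution(context):
    """Stand-in for a trained LLM.

    A real model would compute logits from the context using billions
    of learned parameters; here we just return fixed numbers.
    """
    return np.array([0.1, 0.5, 0.4, 0.9, 2.0, 1.6])

def sample_next_token(context, temperature=1.0):
    # Softmax over the logits gives a probability distribution,
    # then we draw one token from it at random.
    logits = next_token_distribution(context) / temperature
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    probs /= probs.sum()
    return np.random.choice(vocab, p=probs)

context = ["the", "meeting", "is", "on"]
print(sample_next_token(context))  # might print "Tuesday"
print(sample_next_token(context))  # ...or, just as confidently, "Thursday"
```

Run twice on the same context, it can assert different “facts” with equal confidence; the model samples what is probable, not what is true. Even Microsoft acknowledges this: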
“We know AI gets things wrong, we know it hallucinates and we know it does it confidently,” admits Friedman. “We continue to work on making it better at doing that less, but also that the user experience really empowers people and puts them in the driver’s seat.”
I’m not comforted by reading that a technology “confidently” asserts things based on “hallucinations,” and I certainly wouldn’t bestow upon it the label of “intelligent.” Maybe we should start calling these “AI” systems something closer to what they seem to be: “lunatic language simulators.”