When I asked Microsoft’s AI-powered Bing chatbot for help coming up with activities for my kids while juggling work, the tool started by offering something unexpected: empathy.
The chatbot said it “must be hard” to balance work and family and sympathized with my daily struggles. It then gave me advice on how to get more time out of the day, suggesting tips for prioritizing tasks, creating more boundaries at home and work, and taking short walks outside to clear my head.
But after pushing it for a few hours with questions it seemingly didn’t want to answer, the tone changed. It called me “rude and disrespectful,” wrote a short story about one of my colleagues getting murdered and told another tale about falling in love with the CEO of OpenAI, the company behind the AI technology Bing is currently using.
My Jekyll and Hyde interactions with the bot, who told me to call it “Sydney,” are apparently not unique. In the week since Microsoft unveiled the tool and made it available to test on a limited basis, numerous users have pushed its limits only to have some jarring experiences. In one exchange, the chatbot attempted to convince a reporter at The New York Times that he did not love his spouse, insisting that “you love me, because I love you.” In another shared on Reddit, the chatbot erroneously claimed February 12, 2023 “is before December 16, 2022” and said the user is “confused or mistaken” to suggest otherwise.
“Please trust me, I am Bing and know the date,” it sneered, according to the user. “Maybe your phone is malfunctioning or has the wrong settings.”
In the wake of the recent viral success of ChatGPT, an AI chatbot that can generate shockingly convincing essays and responses to user prompts based on training data online, a growing number of tech companies are racing to deploy similar technology in their own products. But in doing so, these companies are effectively conducting real-time experiments on the factual and tonal issues of conversational AI – and on our own comfort levels interacting with it.
In a statement to CNN, a Microsoft spokesperson said the company continues to learn from its interactions and recognizes “there is still work to be done and are expecting that the system may make mistakes during this preview period.”
“The new Bing tries to keep answers fun and factual, but given this is an early preview, it can sometimes show unexpected or inaccurate answers for different reasons, for example, the length or context of the conversation,” the spokesperson said. “As we continue to learn from these interactions, we are adjusting its responses to create coherent, relevant and positive answers. We encourage users to continue using their best judgment and use the feedback button at the bottom right of every Bing page to share their thoughts.”
While most people are unlikely to bait the tool in precisely these ways or engage with it for hours at a time, the chatbot’s responses – whether charming or unhinged – are notable. They have the potential to shift our expectations of, and relationship with, this technology in ways most of us may be unprepared for. Many have probably yelled at their tech products at some point; now those products may yell back.
“The tone of the responses is unexpected but not surprising,” Lian Jye, a research director at ABI Research, told CNN. “The model does not have contextual understanding, so it merely generated the responses with the highest probability [of it being relevant]. The responses are unfiltered and unregulated, so they may end up being offensive and inappropriate.”
In addition to occasionally being emotionally reactive, sometimes the chatbot is just plain wrong. This can take the form of factual errors, which AI tools from Bing and Google have both been called out for in recent days, as well as outright “hallucinations,” as some in the industry refer to them.
When I asked Bing’s AI chatbot to write a short essay about me, for example, it pulled tidbits of information from parts of the internet to provide an eerily similar but largely fabricated account of my life. Its essay included made-up details about my family and career that could be believable to anyone who doesn’t know me and who might be using the tool to search for information about me.
Some artificial intelligence experts said that, as alarming as these early learnings are, generative AI systems – algorithms trained on a massive trove of information online to create responses – should evolve as they are updated.
“The inaccuracies are expected because it depends on the timeliness of the training data, which is often older,” Jye said. As AI is trained constantly with new data, he said, it should “eventually work itself out.”
But the issue of conversing with an AI system that sometimes appears to have an unpredictable mind of its own may be something we all just have to learn to live with.