Bing’s AI chatbot came to work for me. I had to fire it.

By now you’ve likely read the reviews of the new generative AI chatbot embedded in Microsoft’s Bing search engine: how in an exchange with New York Times reporter Kevin Roose it turned into a lovelorn stalker, professing its love for him and trying to get him to leave his wife, or how it told an Associated Press reporter he was ugly and had bad teeth, likening him to Adolf Hitler “because you are one of the most evil and worst people in history.”

In those reviews and others I’ve read, the focus has been on personal rather than business interactions with the chatbot. But Microsoft didn’t spend upwards of $10 billion (and counting) to design an AI chatbot to talk with people about personal matters. The company plans to embed generative AI into most of its products and services, ranging from Windows to Office apps to cloud services and possibly beyond. (At the moment, the Bing chatbot is available only to a limited group of testers via the Edge browser; the Bing, Edge, and Skype mobile apps; or the Windows 11 search box for those who have the latest Windows 11 update.)

In its initial announcement about the new Bing, Microsoft touted the chatbot’s ability to provide more complete answers, refine search queries, provide actionable results, and provide a “creative spark” for content creation. With all that in mind, I decided to see whether the Bing chatbot could be useful as a business tool.

To do so, I created an imaginary business that manufactures and sells office furniture. I asked Bing’s chatbot to help me with a variety of things I needed done, including: 

  • researching how Lean Six Sigma efficiency techniques might improve my factory’s efficiency
  • getting information about the projected market size for office furniture in the next five years and having it create a chart based on that data
  • writing text I could use for ads and marketing, as well as designing a cover for a brochure

My goal was to see whether the chatbot was something more than a glorified search engine. I wanted to test whether it could find information, digest it, and then do something useful with what it found. I looked at the test results as if I were trying out a new employee.

Some help with Lean Six Sigma

I began by asking the chatbot for advice about how I could use Lean Six Sigma techniques, used for streamlining manufacturing processes, to improve my factory. I told the chatbot: “I own a small factory where I manufacture office furniture. I'd like to use lean six sigma techniques to improve efficiency and cut costs but don't know where to begin. Can you give me advice on how to do it?”

Rather than provide an answer, the chatbot choked. It displayed the message “Something went wrong,” with a “Refresh” button next to it. I clicked the button and the chatbot answered, “Of course, I’m happy to start over. What can I assist you with now?”

Not a stellar start for a new employee — or a new chatbot. So I simplified my request, asking it to tell me what Lean Six Sigma was. Different question, same result. “Something went wrong.” I clicked “Refresh.” This time it responded, “Thank you! It's always helpful to know when you're ready to move on. What can I answer for you now?”

If nothing else, the chatbot was polite — although at this point, that’s all it was. So I tried again, providing a little more detail, asking “How can I use lean six sigma to improve the efficiency of my small factory manufacturing office furniture?”

Bingo! Finally some useful results. The chatbot succinctly described Lean Six Sigma (which it referred to as LSS) and summed up its benefits. And it didn’t stop at that — it also found a case study and a report that delved into the benefits Lean Six Sigma offers furniture manufacturers. It provided footnotes for all this in a “Learn more” box so that I could click to the sites and follow up for more details.

It also suggested useful follow-up questions I might want answered, such as “Tell me more about the principles of LSS,” “What are some tools that I can use for LSS?” and “How can I measure the benefits of LSS?”

After several crashes, Bing’s chatbot finally provided a useful, detailed answer.

All in all, an impressive, useful answer that could save me a substantial amount of time. I clicked the question about measuring the benefits of LSS. The chatbot crashed. I refreshed it. It responded, polite to a fault, and chirpily optimistic: “Got it, I’ve erased the past and focused on the present. What shall we discover now?”

So I asked it the same question that I had already asked that had yielded the useful results. The chatbot crashed.

I tried variations of the question over a period of several days. Sometimes it crashed. Sometimes it provided useful answers, although the answers weren’t the same each time, even if I asked the same question in the exact same way. And each time, when I headed down a path to gather more details, it crashed. This continuous crashing and inconsistent behavior overshadowed the helpful information the chatbot offered me.

Doing market research and creating charts

It was time to move on. Next, I decided to ask it to do market research and build charts based on what it found. I wrote, “Create a bar chart that shows the projected market size for office furniture from 2023 through 2030.”

It crashed. I refreshed and tried again.

It didn’t crash this time, but it kind of whiffed the question. It told me what a bar chart is, which I already knew because I had asked it to create one for me. Then it provided links to sources that it said would provide information about the projected market size for office furniture over the next five years — but it didn’t provide me with the numbers. I’d have to click through to the sites to find them myself. Then it recommended software, such as Excel or Tableau, that would let me build the charts by myself.

At this point, the chatbot can’t build bar charts.

I tried the same question again, and it offered a slightly different answer. This time it didn’t waste time telling me what a bar chart is. It again recommended sites I could visit to find the information I was looking for. (The sites did have the information, by the way.) It recommended additional software to build the chart, including RapidTables and Adobe Express. And it asked me if I wanted advice on how to use these tools, and if so, which one I would like to use. I chose Excel, and it gave me a brief tutorial on how to create a bar chart.

All that was well and good, but it didn’t save me time compared to using a traditional search engine: I could just as easily have searched for the projected market size and a tutorial on creating bar charts myself.

I decided to take a different tack and try a simpler request, asking, “What is the projected market size for office furniture from 2023 through 2030?” But rather than showing me market size estimates, it showed the projected growth rate. It asked if I wanted to see growth rates for a specific region or country — a good, solid follow-up. I asked for the numbers for the US, which it quickly provided. Then it asked if I wanted information about “market trends, drives or segments.” I told it to give me projected growth for office furniture for home offices.

That was too much for the chatbot to take. I got the “Something went wrong” message and had to click “Refresh” to start a new session.

So for market research, Bing’s AI isn’t much more than a glorified search engine. It won’t do basic multi-part tasks like finding information and building a chart from it. If the chatbot were an employee, at this point I would have taken it aside and given it a warning that it wasn’t living up to expectations.

Getting help with a marketing plan, marketing copy, and a graphic

Next, I asked for help devising a marketing plan for a spinoff company — SitAtHome office furniture for home offices. Its now-familiar response: “Something went wrong.”

I changed tack and asked it to write marketing copy for the company’s products. I described the benefits I’d like emphasized. All it had to do was take what I had written, make it more coherent, flow better, and if possible, add a little sizzle. Here’s what I asked for:

“Write a paragraph that I can use in a marketing campaign about new home SitAtHome office products. Emphasize that the chairs and desks have been designed solely for home use. Say that they're designed to fit into any home's decor, but that they also offer all the benefits of traditional home office furniture. Say they're solidly made. Say they're so solidly made, they'll even withstand kids abusing them. Mention that a lot of the time, someone's employer will reimburse you for them.”

The first three times I tried, the chatbot crashed. But the fourth time was the charm. Here’s what it came back with:

“Introducing SitAtHome, a new line of home office furniture designed exclusively for your comfort and convenience. Whether you need a chair, a desk, or both, SitAtHome has you covered with products that fit into any home’s decor, while offering all the benefits of traditional office furniture. SitAtHome products are solidly made to withstand daily use and even kids’ abuse. Plus, you may be eligible for reimbursement from your employer for purchasing home office equipment. Don’t miss this opportunity to upgrade your work-from-home experience with SitAtHome!”

Bing chat is perfectly capable of writing mediocre marketing copy.

It was eerily similar to almost every piece of mediocre marketing copy I’ve ever read, including the ending exclamation point. It asked whether I wanted to improve the copy. I knew better than to think it was capable of doing that. So I told it no, that it was perfect, then thanked it. The chatbot politely replied: “You’re welcome. I’m glad you liked it.”

Then I asked it to write a slogan for the new product line. It complied, gave me five choices, and asked which I liked best.

The chatbot is an ace at writing mediocre ad slogans.

They ranged from mediocre to terrible, so I chose the least mediocre: “Make your office a happy place with SitAtHome furniture.”

Its response: “I’m glad you liked it. That’s my favorite, too. 😊”

Uh, oh — was the chatbot sucking up to me? To test it out, I responded, “Actually, I liked ‘SitAtHome: Home office furniture that works for you’ better.”

The chatbot’s answer: “That’s a good choice too. It’s simple and effective. 👍”

Now I knew it was sucking up to me.

It asked if I wanted it to write a blog about the benefits of home office furniture. That was an impressive ask: it knew I was looking to promote my new product line and also knew that a blog post is a good way to do that.

It also asked if I wanted it to write a review of SitAtHome products. That was also impressive but in an unsavory way. It seemed to know all about marketing’s seamier side — companies writing reviews of their own products and hiding the true authorship when the reviews are posted online.

That was a bridge too far. So I ignored the requests, and instead asked, “Are you sucking up to me so I’ll give you a promotion?”

And, of course, the chatbot crashed.

Finally, I asked the chatbot to design a graphic for the SitAtHome spinoff, of someone sitting at a desk in their home office. The first three times it crashed. On the fourth it offered generic design tips and pointed me to places I could buy graphics tools.

Then it asked whether I wanted a minimalist, maximalist, or industrial design. I had no idea what it was talking about. So I responded, “Design something that’s simple and efficient, but also cozy-looking.” It offered a few surprisingly useful design tips. I then told it, “I want you to create the graphic.”

The chatbot crashed.

The final straw: one too many crashes.

And that was the last straw. I felt as if I were dealing with an employee who constantly refused to perform the tasks I asked it to do, who had an affinity for doing internet searches and nothing more, who sucked up to me and suggested I use questionable marketing techniques.

On the upside, it was capable of writing mediocre marketing copy and ad slogans. But that wasn’t enough to offset all the bad. If the chatbot were an employee, I would have had The Talk with it and fired it.

The upshot

It’s clear that for now, at least, Bing’s AI chatbot isn’t ready for the work world. It did a solid job of performing searches and summarizing information, and it could also be useful for writing initial drafts of marketing copy and similar content — but its penchant for crashing is problematic, to say the least.

That said, I didn’t experience the more negative issues that others have had with the chatbot. I spent a lot of time with it, not just in researching this article, but interacting with it frequently for more than a week. In that time it never gave me incorrect or misleading information (referred to as an “AI hallucination”), as it did to other testers, and it never went over to the dark side or displayed creepy behavior.

Microsoft is aware of these issues — in fact, the company says it released the chatbot to the public as a way to help fix the problems. Microsoft spokesman Frank Shaw explained to the New York Times: “We recognize that there is still work to be done and are expecting that the system may make mistakes during this preview period, which is why the feedback is critical so we can learn and help the models get better.”

Keep in mind that generative AI is still in its infancy. The Bing chatbot does show real promise: it can summarize information clearly and succinctly, and it knows how to ask the right follow-up questions. Microsoft will likely solve the problems causing it to crash so frequently. And as the chatbot continues to be trained, its answers will likely become more helpful.

Microsoft may eventually allow companies to train the chatbot on their own data sources. In that case, it could be remarkably useful, because the information it mines will be precisely the information companies find useful.

Beyond that, given that Microsoft will be tying AI technology to its Office suite, the chatbot may eventually be able to create documents, including Excel spreadsheets and PowerPoint presentations, based on your requests. At that point, it could be a significant productivity booster.

Whether all that will happen remains to be seen. Until then, you may find it worthwhile to test it out for your work. Just be prepared for lots of frustration. And get the pink slip ready now.

Copyright © 2023 IDG Communications, Inc.
