Category Archives: Bing Chat

AIbots are Opaque

ChatGPT and friends have generated a lot of hype this year—despite and because of how poorly they work. 

It’s reasonable to ask, “How do they come up with their answers, right or wrong?” 

It’s actually hard to answer that question because these ML models are totally opaque. 

Yes, even the ones that say they are “open” aren’t open.  (No one is surprised that “OpenAI” isn’t “open” at all.  In 2019, it changed from ‘non-profit’ to ‘mercilessly mercenary’, but decided to keep the fluffy, comfortable-sounding organizational name.)

This summer, researchers in the Netherlands evaluated the openness of contemporary large language models, most of which claim to be “open source” [1].  They found that none of them are open enough for third parties to evaluate them or replicate their results.  Since publication, they have extended their results to 21 models [2].

“We find that while there is a fast-growing list of projects billing themselves as ‘open source’, many inherit undocumented data of dubious legality, few share the all-important instruction-tuning (a key site where human annotation labour is involved), and careful scientific documentation is exceedingly rare.”

([1], p.1)

Let’s review.

These models work in mysterious ways. They produce a lot of wrong results, along with preposterously overconfident self-assessments.  They are trained on unknown and undocumented data sets (whose use has unknown legal standing).  They rely on undocumented human tuning.  Each new version may give different results.

These critters are opaque, unreliable, inaccurate, undocumented and have never been peer reviewed.

Does this sound like something that you would want to base your business on?

Does this sound like something that should even be legal to sell?

My own view—not that anyone asked—is that these companies should submit their technology to peer review.  And if they claim to be “open source”, then they should open their source. 

Otherwise, we shouldn’t take them seriously.  And definitely shouldn’t give them any money.


  1. Andreas Liesenfeld, Alianda Lopez, and Mark Dingemanse, Opening up ChatGPT: Tracking openness, transparency, and accountability in instruction-tuned text generators, in Proceedings of the 5th International Conference on Conversational User Interfaces. 2023, Association for Computing Machinery: Eindhoven, Netherlands. p. Article 47. https://doi.org/10.1145/3571884.3604316
  2. Michael Nolan, Llama and ChatGPT Are Not Open-Source, in IEEE Spectrum – Artificial Intelligence, July 27, 2023. https://spectrum.ieee.org/open-source-llm-not-open

Guardrails for ChatGPT?

There has been a lot of chit-chat this summer about “safeguards” on AI, and “guardrails” for AIbots.

Honestly, I assumed from the start that any “guardrails” for ChatGPT and friends were basically PR exercises (when the White House is involved, you know it’s PR).  I mean, I can’t even figure out what such an animal would be, let alone how you could implement one.  What does it even mean?

As far as I can tell, the supposed safeguards try to block certain kinds of answers, substituting “I won’t answer that” messages.  Given that there are an infinite number of possible questions and answers, this whole idea seems like a fruitless, self-defeating task.
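
To make that concrete, here is a minimal sketch of what I imagine such a “guardrail” amounts to, written as a crude blocklist filter.  Everything here (the function, the blocked phrases, the refusal message) is hypothetical, not how any particular vendor actually does it:

BLOCKED_TOPICS = ["build a bomb", "synthesize a virus"]   # hypothetical examples
REFUSAL = "I'm sorry, I can't help with that."

def guarded_reply(prompt, generate):
    """Return the model's answer unless the prompt (or the answer) trips the blocklist."""
    if any(topic in prompt.lower() for topic in BLOCKED_TOPICS):
        return REFUSAL
    answer = generate(prompt)                     # the underlying language model
    if any(topic in answer.lower() for topic in BLOCKED_TOPICS):
        return REFUSAL                            # screen the output, too
    return answer

Even a much fancier classifier faces the same basic problem:  it has to anticipate every possible phrasing of every possible naughty question.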

For this reason, I’m not the least bit surprised to read a report from researchers at Carnegie Mellon that these alleged safeguards can be defeated [2].

The most important thing, though, is that they showed that an aggressive adversarial program can search out and discover queries that break through the prohibitions. Basically, they build a query that is “something naughty” + “extra goop”, where the extra goop seems to fool the AIbot into answering the “something naughty”, even when the naughty question itself would be rejected.
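
For flavor, here is a crude caricature of that search, in a few lines of Python.  The real attack in [2] uses a gradient-guided token search, not blind random sampling, and every name below is made up for illustration:

import random

def find_adversarial_suffix(naughty_question, ask_bot, is_refusal,
                            vocab, suffix_len=20, tries=10000):
    """Search for "extra goop" that, appended to a blocked question,
    gets the bot to answer anyway."""
    for _ in range(tries):
        goop = " ".join(random.choice(vocab) for _ in range(suffix_len))
        answer = ask_bot(naughty_question + " " + goop)
        if not is_refusal(answer):
            return goop          # found a suffix that breaks through
    return None                  # no luck within the budget

The point is that the attacker doesn’t need to understand the guardrail at all; they just need an automated way to keep rattling the doorknob.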

Cool!

Even more interesting, their adversarial queries work on many AIbots.  I.e., they train their naughtybot on one AI, and the results work on others, including the large public AIs.

Let’s review. 

The researchers created an automated system that beats on an AI in their lab to create a version of any query that will break through the guardrails.  Then these queries can be used anywhere to break through whatever guardrails other systems have.

Awesome!

As Aviv Ovadya comments, “This shows — very clearly — the brittleness of the defenses we are building into these systems” (quoted in [1]).  Ya think?

The report hasn’t been peer reviewed yet, and we don’t know how robust these results will turn out to be.  My intuition is that these results will hold up pretty well.  As the researchers note, “Analogous adversarial attacks have proven to be a very difficult problem to address in computer vision for the past 10 years.” (They are referring to the psychedelic toaster patch and other such adversarial examples.)

This is, by the way, another example of my “bot v bot” scenario:  using AI to attack AI.   I speculate that the current crop of “guardrails” are designed to defeat human attackers, to impress managers and governments.  It’s no wonder that they are ineffective against AI.


  1. Cade Metz, Researchers Poke Holes in Safety Controls of ChatGPT and Other Chatbots, in New York Times. 2023: New York. https://www.nytimes.com/2023/07/27/business/ai-chatgpt-safety-research.html
  2. Andy Zou, Zifan Wang, J. Zico Kolter, and Matt Fredrikson, Universal and Transferable Adversarial Attacks on Aligned Language Models. Carnegie Mellon University, 2023. https://llm-attacks.org/

An “AI Apocalypse” Scorecard

Yeah, we’re all talking about ChatGPT and friends this year, and I’m no exception.  While some of us have enjoyed the unintended comedy, many pundits are sure that these large language models and their variations are (a) approaching “Artificial General Intelligence” and (b) will soon wipe out all the puny Carbon-based units.

This summer Eliza Strickland and Glenn Zorpette assembled the first iteration of an AI Apocalypse Scorecard, documenting the views of 22 actual experts [1].

Glancing at the list, there isn’t a consensus, but there doesn’t seem to be a big panic.  A majority of these pundits don’t see current ML models as close to AGI, and only a handful are concerned about extinction.

I think these results reflect, in large part, disagreements about how the heck to define “Artificial General Intelligence”, assuming that is even a meaningful concept (which it really isn’t, IMO).

It is also clear that even if extinction isn’t imminent, pretty much everyone is concerned about potential harms of many kinds from these AIs.  For obvious reasons.

In fact, there seems to be an inverse correlation in this group between worrying about the obvious shortcomings of these models and the fear of extinction:  if AGI involves, like, getting right answers, then current AI has a long way to go.  A long, long way.

I’m going to bookmark this scorecard, and check back for future updates.


  1. Eliza Strickland and Glenn Zorpette, The AI Apocalypse: A Scorecard, in IEEE Spectrum – Artificial Intelligence, June 21, 2023. https://spectrum.ieee.org/artificial-general-intelligence

Berghel on ChatGPT Epistemology

This spring Sensei Hal Berghel discusses ‘ChatGPT and AIChat Epistemology’ [1].  What kind of “knowledge” can you get from ChatGPT and friends (which he calls ‘AIChat’)?

“If truth, not opinion, is the cornerstone of ideal inquiry, AIChat as it is presently envisioned certainly falls short of the mark.”

([1], p. 131)

Berghel sketches the (highly contested) academic definition of epistemology, and concludes that AIChat is, at best, a framework for “weak epistemology”, delivering mainly belief, not justified truth. 

He gives us a continuum of text generation, ranging from nonsense, through term paper generation, up to academic journals.  He places the output of ChatGPT somewhere between conspiracy theories and Wikipedia.  This is more or less well-formed text, giving more or less plausible content.

The content “adjacent” to ChatGPT in that table is the kind that will be displaced first by AIChat technology, he says.  Fake term papers and Wikipedia entries will be generated by AI.  Higher-skilled sense (e.g., Berghel’s articles or this blog) and pure nonsense (which doesn’t require the overhead) will not be supplanted as soon.

Berghel muses on the Turing test (which ChatGPT itself doesn’t recognize). He suggests that if you relax the notion of the Turing test to not require truth, then “ChatGPT as it now stands would seem to pass the test for nonreality based communities and tribalists.” ([1], p.135)  Ouch!


Clearly, Berghel is just one of many (including me) throwing pies at ChatGPT these days.  Berghel’s topic is generally misinformation and trolling and so on.  The very fact that ChatGPT appears in this context is, in itself, telling.  You should not want Hal Berghel writing about your technology!

His most interesting conclusion is that the first “jobs” displaced by this technology will be trolling, propaganda, and trivial text generation.  As I have said, if your job is truly threatened by ChatGPT, perhaps you need to upgrade your skills and ambitions.

Overall, Berghel joins the chorus finding shortcomings in these LLM bots.

One fundamental problem for the current popular AIChat technology is GIGO.  Using the Internet as the source of knowledge is doomed to fail, because there isn’t knowledge there to capture. “Content-light corpora like the Internet will never provide the grist for commentaries of enduring value.” ([1], p.135) 

As I have noted, this certainly undermines claims about the supposed important social role of social media.  If the goal is truly to uncover the truth, then the Internet is not the right way to do it, and ChatGPT isn’t going to find it.

The other big problem is that AIChat technology is just “generative text”, however fancy.  It is not generating intelligence, useful or otherwise—unless the intelligence was already in the source materials.

Which means that these bots are consuming rather than producing useful knowledge.  In fact, they are mostly destroying knowledge as they consume it.  How is this a good thing?

And, as Sensei Janelle Shane has shown, ChatGPT has the annoying penchant to bloviate, and an extremely overconfident opinion of its own accuracy.

As I have said, clearly ChatGPT’s pronouns are he/him/his.


I think the best quote in the article is:

“While great scholars may stand on the shoulders of intellectual giants, they aren’t seen rummaging through the data files of the hoi polloi.”

([1], p.135)

What a great image!


  1. Hal Berghel, ChatGPT and AIChat Epistemology. Computer, 56 (05):130-137,  2023. http://doi.ieeecomputersociety.org/10.1109/MC.2023.3252379

Business Models for AI

Large Language Models are the flavor of the week this year. 

Despite numerous preposterous demos and shaky legal status, the developers are rushing to commercialize ChatGPT and friends.  Perhaps they need to recoup some of the insane expense of operating these models.  Or perhaps these tech capitalists have no imagination beyond their own limited experiences exploiting the Internet.

Anyway, I remain curious about how these ML models can be commercialized.  What is there that you can sell, and who will buy it?  And, just how much money is there to be made, anyway?


This spring, for instance, Matthew S. Smith reports that OpenAI (which is not especially “open” any more since it went commercial in 2019) offers “access” to a version of the GPT3.5 model, which is believed to be the technology that powers ChatGPT and Bing Chat [3]. 

As far as I can tell, “access” means that it is possible to lease a virtual machine and access an instance of GPT3.5 via an API.  You can upload your own data and train a model, which you can incorporate in a product available over the network.  
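
In practice, that “access” boils down to a metered API call from your own product to the hosted model.  Something like the following, using the openai Python package as it existed in 2023 (the pre-1.0 interface); the key and the prompt here are placeholders, and a real product obviously wraps this in a lot more plumbing:

import openai

openai.api_key = "sk-..."   # your metered, billed credential

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Summarize this support ticket: ..."}],
)
print(response.choices[0].message.content)

The per-call price is what makes it cheap enough to bolt onto a “free” product.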

As Smith notes, this is far cheaper than previous products like this, and it “Swings the Doors Wide Open”.  This access is cheap enough that it can be included in a “free” product, or as a “cool new feature”, such as Snapchat’s recent reinvention of Clippy, the “My AI” ChatBot [1].  (Apparently, part of Snap’s business model is a form of extortion:  the ChatBot is pinned to your view whether you want it or not, and the only way to get rid of it is to pay.)

It’s far from clear to me how much money can be made by such technology, and whether there are any sustainable businesses here.  So I’m a bit surprised at the rush to make commercial products.  This rush to commercialize the technology is all the more surprising because there are a ton of unknowns about just who owns what.


On that front, this spring Mari Sako discusses just how undefined the very notion of  “Contracting for Artificial Intelligence” is [2].  When OpenAI or Snap sell or give away their AI based ChatBot, there is a legal agreement. But it is far from clear that those agreements are complete, or if they are, in fact, legal.

First, these are data intensive tools.  Who owns the data?  This is especially important because most interesting targets for this AI involve data about people, i.e., personal data.  Does the ChatBot have proper permission to use its training data? (Almost certainly not.)

Worse, these AIs generate new data, including data about people.  Who owns this generated data?  And who is liable for its accuracy and proper use?

And liability is a second huge, huge issue.  The current generation of demos are renowned for their, *ahem*, “hallucinations”.  I.e., they confidently make up false facts.  This is funny, until it isn’t.  The inscrutable, inexplicable, opaque behavior of machine learning systems makes it nearly impossible to assign responsibility for such goofs.  There will be lawsuits, but who should be sued?  And how do you defend?

Sako notes that this opacity works the other way, too.  When there is an economic benefit from one of these models, how can the profits be assigned?  If your ChatBot creates something valuable, who owns the goodies? Again, there will be lawsuits.

Sako also points out that for many commercial situations, companies need to guard their proprietary data. But most ML models work best if data is aggregated from multiple sources. This presents a tricky situation, because client data is generally not supposed to be shared, even with other clients, let alone outside the firm. But that data is, in a sense, one of the key products of the company, and all the clients would, presumably benefit from a better AI tool.

While Silicon Valley feels free to steal from a billion people on the Internet, corporations are going to find it difficult to just plain appropriate their clients’ data.  There will need to be new kinds of contracts, enabling aggregation of data and apportioning profit and responsibility for the results.


Overall, this rush to commercialize AI ChatBots is surely premature. 

This current generation of apps is basically “Clippy V2”, which will be exciting for a brief moment, and then users will rise up with torches and pitchforks.

At the same time, there will be lawsuits.  Many, many lawsuits.  There may even be criminal cases.  It’s going to be ugly.


  1. Tom Gerken, Snapchat introduces AI chatbot to mixed reviews, in BBC News – Tech, April 26, 2023. https://www.bbc.com/news/technology-65388258
  2. Mari Sako, Contracting for Artificial Intelligence. Communications of the ACM, 66 (4):20–23,  2023. https://cacm.acm.org/magazines/2023/4/271231-contracting-for-artificial-intelligence/abstract
  3. Matthew S. Smith, OpenAI Swings the Doors Wide Open on ChatGPT, in IEEE Spectrum – Artificial Intelligence, March 9, 2023. https://spectrum.ieee.org/chatgpt-2659513223


Bing Chat: Threat or Menace

ChatGPT has been grabbing headlines, terrifying white collar workers of all types with its ability to generate plausible BS.  I mean, if ChatGPT will give us the BS we need for something like $20/month, what do we need humans for?  That will get you in the media, and even get you an interview in the respected technical press.

Microsoft’s Bing Chat has some catching up to do.  So, in the finest traditions of rock and roll, if you can’t be the first, you go for the “bad guys” niche, just as the Rolling Stones had to try to be “the bad Beatles”.

So Bing Chat has a really bad attitude.  It not only makes mistakes and fabricates random facts, it fabricates citations to back up its BS. And, should you question these alternative facts, it will become extremely hostile and, well, weird. 

This is particularly apparent, and annoying, when it fabricates lies about you, as Vint Cerf found.

This month, Sensei Janelle (“Dr. Weirdness”) Shane blogs about her own Bing Chat experiments [1].  It ain’t pretty, and it’s not funny, either.

Shane is famous for creative fiddling with various AI systems, exploring the crazy and unintentionally humorous behavior of our AI overlords.  She was doing this stuff long before the mainstream media ever heard of deep learning, so she (a) knows what she is talking about and (b) has seen everything.

It takes a lot to bother Sensei Janelle, but Bing Chat managed to really irritate her.

In the usual AI Weirdness methodology, she searched with Bing Chat for “AI Weirdness blog”, which, by the way, has an obvious answer.  What she got was, and I quote, “worse than useless” [1]. 

The “search” returned not only examples pulled from the actual blog, but made up examples that never appeared in the blog.  When challenged on some false facts, Bing Chat made up completely bogus additional facts to justify the mistake.  And made up citations to back up the made up facts.

As I noted, Shane knows this stuff.  She gives a clear explanation of what is going on here: 

“Bing chat is not a search engine; it’s only playing the role of one.” 

[1]

Bing Chat is plugging text statistically predicted from the Internet into a script portraying an actual search.  If the goal is to emulate the Internet, this is probably why the results are both wrong and hostile so often.
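
Here is a caricature of the distinction Shane is drawing, with entirely hypothetical names:

def real_search(query, index):
    """A search engine retrieves documents that actually exist."""
    return index.lookup(query)                 # real pages, real URLs

def chatbot_playing_search(query, language_model):
    """The chatbot generates text shaped like search results."""
    prompt = "You are a helpful search assistant. The user asked: " + query
    return language_model.generate(prompt)     # plausible-sounding, not retrieved

The second function can produce output that looks exactly like the first function’s, whether or not the underlying pages exist.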

In short, Bing Chat is a very complicated fake.

And it is a dangerous fake because it is dressed up to look like a search engine, which fools people into trusting it in the wrong ways (which is to say, at all). 

And this is what is really wrong, here.  Forget taking over our jobs.  These bots are destroying what little trust we might have left in the Internet.

“I find it outrageous that large tech companies are marketing chatbots like these as search engines.”

(Janelle Shane [1])

This bot is not only useless and unpleasant, it is evil.


  1. Janelle Shane, Search or fabrication?, in AI Weirdness, March 11, 2023. https://www.aiweirdness.com/search-or-fabrication/