
Machine Learning Does Not Necessarily Mean Getting Smarter

I guess it is always possible to learn the wrong stuff.

ChatGPT and friends have been providing unintentional comedy this year, delivering preposterously wrong answers, while pompously bot-splaining how brilliant their answers are.  In this, they successfully mimic the Internet.  But they are utterly useless if you want to do something important.


But, wait.  These machine learning models do not have to be set in stone.  They can learn new stuff!  Heck, even dunces like me learn new stuff all the time!  However shabby the first versions may be, they should get better and better, right?

This summer, some nasty, skeptical researchers from Stanford and UC Berkeley reported a comparative study of several versions of the GPT large language models accessible on the Internet [1].  The study posed the same questions to the March 2023 and June 2023 versions of GPT-3.5 and GPT-4, the models that power ChatGPT.
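For the curious, here is roughly what such a comparison looks like in code.  This is a minimal sketch, not the paper’s actual harness: it assumes OpenAI’s current Python SDK, an API key in the environment, and the dated model snapshots the study compared (OpenAI has since retired those snapshots, so the names may no longer be served).

    # Sketch of the paper's comparison: ask the same question of the
    # March and June 2023 snapshots and eyeball the difference.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # The dated GPT-4 snapshots the study compared (now retired).
    SNAPSHOTS = ["gpt-4-0314", "gpt-4-0613"]

    # One of the paper's math tasks; any fixed question works.
    QUESTION = "Is 17077 a prime number? Think step by step."

    for model in SNAPSHOTS:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": QUESTION}],
            temperature=0,  # reduce sampling noise, so differences reflect the model
        )
        print(f"{model}: {resp.choices[0].message.content[:200]}")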

The results showed that even over the three-month period covered by the study, the “same” model produced different answers to the same questions.  Some of the differences were slight improvements, but others were definite degradations.

As Andrew Paul put it, “ChatGPT’s accuracy has gotten worse” [2].

Ouch!

Some of the changes over time were clearly due to deliberate policy decisions not to answer sensitive questions.  Other differences have no obvious explanation.

The upshot is that these models not only give wrong answers, they may give a different wrong answer next time.  Even if ChatGPT seems to be giving good answers for you, don’t count on that lasting.  Soon, it may “drift” into inaccurate answers to the same questions.

As the researchers comment, this study “highlights the need to continuously evaluate and assess the behavior of LLMs in production applications” ([1], p. 7).
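In practice, that means treating the model like any other unreliable dependency: keep a pinned set of prompts with known-good answers and re-run them on a schedule.  Here is a minimal sketch of such a drift check; the prompts, expected answers, and the ask() stub are all hypothetical placeholders for whatever setup you actually use.

    # Minimal drift check: re-ask a pinned set of prompts and flag any
    # answer that no longer contains the recorded reference.
    REFERENCE = {
        "Is 17077 a prime number? Answer yes or no.": "yes",
        "What is 7 * 8? Answer with just the number.": "56",
    }

    def ask(prompt: str) -> str:
        """Call whatever model you are monitoring; stubbed out here."""
        raise NotImplementedError

    def check_drift() -> list[str]:
        """Return the prompts whose answers have drifted from the reference."""
        drifted = []
        for prompt, expected in REFERENCE.items():
            if expected not in ask(prompt).strip().lower():
                drifted.append(prompt)
        return drifted

    if __name__ == "__main__":
        for prompt in check_drift():
            print("DRIFT:", prompt)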

This seems like a serious drawback to me.

And while this sort of unreliability is somewhat “human”, it doesn’t seem to be what you’d hope for in an “Artificial General Intelligence” that is destined to exterminate us.


  1. Lingjiao Chen, Matei Zaharia, and James Zou, How is ChatGPT’s behavior changing over time?, arXiv:2307.09009, 2023. https://arxiv.org/abs/2307.09009
  2. Andrew Paul, ChatGPT’s accuracy has gotten worse, study shows, PopSci, July 19, 2023. https://www.popsci.com/technology/chatgpt-human-inaccurate/