Would You Notice the Quality of Your AI Dropping?

You know that ChatGPT is getting more politically correct. But did you know that it is also getting dumber? Researchers have repeatedly asked it to perform tasks such as generating code to solve math problems. In March 2023, ChatGPT 4 could generate functioning code about 50% of the time. By June, that figure had dropped to 10%. If you’re not paying, you are stuck with ChatGPT 3.5. This version managed about 20% correct code in March but was down to approximately zero in June 2023.

This phenomenon is known to AI researchers as “drift.” It happens when developers don’t like the answers the machine gives and take the shortcut of tweaking the parameters instead of expensively re-training the model on a more appropriate data set. Twisting the arm of an AI to generate more socially acceptable answers has been shown to have unpredictable and sometimes negative consequences.

If you are using any AI-based services, do you know what the engine behind the solution is? If you ask, and your vendor is willing to tell you, you will find that most SaaS AI solutions today are running ChatGPT with a thin veneer of fine-tuning. Unless you continually test your AI solution with a suite of standard tests, you will never notice that the quality of your AI solution has gone down the drain because OpenAI engineers are pursuing the goal of not offending anyone.
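What such a suite of standard tests could look like is sketched below in Python. This is a minimal illustration, not any vendor's actual tooling: the names `pass_rate`, `has_regressed`, the two sample prompts, and the stub models are all hypothetical, and in practice the `model` argument would wrap whatever call your AI service exposes.

```python
from typing import Callable, List, Tuple

# Hypothetical test case: a fixed prompt plus a function that judges the answer.
TestCase = Tuple[str, Callable[[str], bool]]


def pass_rate(model: Callable[[str], str], suite: List[TestCase]) -> float:
    """Return the fraction of test cases whose answer passes its checker."""
    if not suite:
        raise ValueError("suite must contain at least one test case")
    passed = sum(1 for prompt, check in suite if check(model(prompt)))
    return passed / len(suite)


def has_regressed(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """True when the pass rate fell by more than `tolerance` since the baseline."""
    return (baseline - current) > tolerance


# A tiny illustrative suite; a real one would hold many more cases.
SUITE: List[TestCase] = [
    ("What is 2 + 2?", lambda answer: "4" in answer),
    ("Is 17 prime? Answer yes or no.",
     lambda answer: answer.strip().lower().startswith("yes")),
]


# Stub models standing in for the same service at two points in time (assumed).
def stub_model_earlier(prompt: str) -> str:
    return {"What is 2 + 2?": "4", "Is 17 prime? Answer yes or no.": "Yes"}[prompt]


def stub_model_later(prompt: str) -> str:
    return {"What is 2 + 2?": "4", "Is 17 prime? Answer yes or no.": "No"}[prompt]


if __name__ == "__main__":
    baseline = pass_rate(stub_model_earlier, SUITE)
    current = pass_rate(stub_model_later, SUITE)
    print(f"baseline={baseline:.2f} current={current:.2f}")
    if has_regressed(baseline, current):
        print("Quality has dropped since the baseline run.")
```

Run on a schedule with the same prompts each time, a check like this turns a silent quality drop into a number you can watch. The 0.05 tolerance is an arbitrary placeholder; pick one that reflects how much run-to-run variation your service normally shows.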
