7 dark secrets of generative AI

They are a copyright nightmare

There are some nice LLMs on the market already that can handle general chores like doing high school homework assignments or writing college admissions essays that emphasize a student’s independence, drive, writing ability, and moral character — oh, and their ability to think for themselves.

But most enterprises don’t have these kinds of general chores for AI to undertake. They need to customize the results for their specific business. The basic LLMs can provide a foundation but there’s still a great deal of training and fine-tuning needed.
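
What that customization usually looks like in practice is assembling the company’s own question-and-answer pairs in whatever training format the chosen model expects. Here is a minimal sketch in Python, assuming a generic JSONL layout with invented field names rather than any particular vendor’s schema:

    import json

    # Hypothetical in-house examples: the enterprise's own prompts and approved answers.
    examples = [
        {"instruction": "Summarize our returns policy for a customer.",
         "response": "Items can be returned within 30 days with a receipt for a full refund."},
        {"instruction": "Draft a reply declining an out-of-warranty repair request.",
         "response": "Thanks for reaching out. This device is outside its warranty period, but here are your options."},
    ]

    # One JSON object per line is a common layout for fine-tuning datasets.
    with open("train.jsonl", "w", encoding="utf-8") as f:
        for example in examples:
            f.write(json.dumps(example) + "\n")

Every one of those records has to come from somewhere, which is exactly where the trouble starts.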

Few have figured out the best way to assemble this training data. Some enterprises are lucky enough to have big datasets they control. Most, however, are discovering that they don’t have all the legal issues settled regarding copyrights. Some authors are suing because they weren’t consulted about their writing being used to train an AI. Some artists feel plagiarized. Issues of privacy are still being sorted out.

Can you train your AI on your customers’ data? Are the copyright issues settled? Do you have the right legal forms in place? Is the data available in the right format? There are so many questions that stand in the way of creating a great, customized AI ready to work in your enterprise.

They may invite vendor lock-in

In theory, AI algorithms are generalized tools that have abstracted away all the complexity of user interfaces. They’re supposed to be standalone, independent, and able to handle what life — or the idiot humans they serve — throws their way. In other words, they’re not supposed to be as rigid and inflexible as an API. In theory, this means it should be easy to switch vendors quickly because the AIs will just adapt. There won’t be a need for a team of programmers to rewrite the glue code and do all the things that cause trouble when it comes time to switch vendors.

In reality, though, there are still differences. The APIs may be simple, but they still differ in details like the JSON structures used for invocations. And the real differences are buried deeper: writing prompts for generative AIs is a real art form, and the AIs don’t make it easy to get the best performance out of them. There’s already a job description for smart people who understand the idiosyncrasies and can write better prompts that deliver better answers. Even if the API differences are minor, the weird differences in prompt structure make it hard to switch AIs quickly.
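
To make the lock-in concrete, here is a minimal sketch in Python of the kind of adapter layer teams end up writing. The provider names, model names, and payload fields below are invented for illustration, not any real vendor’s API; the point is that both the JSON shape of the request and the prompt conventions have to change when you switch.

    import json

    def build_request(provider: str, system_prompt: str, user_prompt: str) -> dict:
        """Assemble a chat request for a hypothetical provider.

        Each provider wants a differently shaped JSON body, and each one
        responds best to differently phrased prompts, so both layers need
        rework when you change vendors.
        """
        if provider == "acme-llm":
            # Hypothetical vendor A: one flat prompt string, a sampling knob called "temperature".
            return {
                "model": "acme-chat-1",
                "prompt": f"{system_prompt}\n\n{user_prompt}",
                "temperature": 0.2,
            }
        if provider == "globex-ai":
            # Hypothetical vendor B: role-tagged message list, output cap required.
            return {
                "model": "globex-large",
                "messages": [
                    {"role": "system", "content": system_prompt},
                    {"role": "user", "content": user_prompt},
                ],
                "max_output_tokens": 512,
            }
        raise ValueError(f"unknown provider: {provider}")

    # The same question produces two different request bodies:
    for name in ("acme-llm", "globex-ai"):
        body = build_request(name, "You are a terse analyst.", "Summarize Q3 churn in two sentences.")
        print(name, json.dumps(body, indent=2), sep="\n")

Multiply that by every prompt template a team has tuned for one model’s quirks, and the “thin” adapter starts to look like the very glue code the AIs were supposed to make unnecessary.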

Their intelligence remains shallow

The gap between a casual familiarity with the material and a deep, intelligent understanding has long been a theme in universities. Alexander Pope wrote, “A little learning is a dangerous thing; / Drink deep, or taste not the Pierian spring.” That was in 1709.

Other smart people have noted similar problems with the limits of human intelligence. Socrates concluded that for all his knowledge, he really knew nothing. Shakespeare thought that the wise man knows himself to be a fool.

The list is long, and most of these insights into epistemology apply in one form or another to the magic of generative AI, often to a much greater extent. CIOs and tech leadership teams have a difficult challenge ahead of them. They need to leverage the best that generative AIs can, well, generate, while trying to avoid running aground on the intellectual shoals that have long been a problem for intelligences of every kind, whether human, alien, or computational.
