Ever since ChatGPT launched, I have had a hypothesis: because it is a probability-calculating machine trained on the text contents of the Internet, it should return the "average result" for Web-content-based questions. For example, if I want the average of all blog posts on some common keyword, I should be able to get an LLM to generate it.
Yesterday I tried this. I directed ChatGPT to give me common keywords for a hypothetical blog, then to produce a standard blog post based on one of the keywords.
ChatGPT gave me a list that looked like common keywords. It did not give me a "full blog post." Rather, it gave me what I'd call an outline for a blog post. But I didn't specify word count or format, so I'll accept that "blog post" was somewhat subjective there.
As I poked at it, several things occurred to me.
Markov chains are probability-based text generators. That's the tech behind things like Botnik's Voicebox, which I used to hilarious effect in the 2010s.
Botnik analyzes each word and then predicts the most likely word-units to follow it. Your phone's autosuggest does the same thing.
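To make the word-level idea concrete, here is a minimal sketch of a Markov-chain autosuggest in Python. To be clear, this is not Botnik's code or anything your phone actually runs. The function names (`build_model`, `suggest`, `generate`) and the sample sentence are made up for illustration, and it assumes the simplest possible version: only the single previous word is used to predict the next one.

```python
from collections import Counter, defaultdict


def build_model(text):
    """Map each word to a Counter of the words seen right after it."""
    words = text.lower().split()
    model = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        model[current][following] += 1
    return model


def suggest(model, word, k=3):
    """Return up to k followers of `word`, most frequent first."""
    followers = model.get(word.lower(), Counter())
    return [w for w, _ in followers.most_common(k)]


def generate(model, start, max_words=8):
    """Greedy autosuggest: always take the top suggestion until none is left."""
    output = [start.lower()]
    while len(output) < max_words:
        options = suggest(model, output[-1], k=1)
        if not options:
            break
        output.append(options[0])
    return " ".join(output)


# Made-up sample text, for illustration only (hypothetical corpus).
corpus = "the cat sat on the mat and the cat ate the fish"
model = build_model(corpus)
print(suggest(model, "the"))   # the words most often seen after "the"
print(generate(model, "the"))  # keep accepting the top suggestion
```

Always taking the top suggestion tends to run in circles on a small corpus, which is presumably why tools in this family hand you a few options and let a human pick.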
I don't think ChatGPT is doing that, however. That is, I don't think it's analyzing units at the word level. I think it's doing it at the character level. Which would explain how it gets individual numbers wrong, despite the correct number being all over the Internet. (Like Google AI, ChatGPT confidently informed me yesterday that Calhoun County, Michigan is in USDA Zone 5a. It's in 6a.) It would also explain how it can confidently generate hallucinated citations. It's plugging in what seems like the most likely *character* to come next, within the context of its pattern-based understanding of what citations look like.
I have yet to see anyone examine whether its generated citations are technically correct. For instance, if I ask for a legal citation in Bluebook style, is that what I get? If I ask for Chicago format versus APA, what comes out? I'm tempted to do this one myself except I don't want to get any closer to the slop machine.
I had been treating ChatGPT like a search engine. Asking it for the top 20 longtail keywords for a given topic was exactly that.
But as soon as I realized it was probably just generating the next most likely character given a particular pattern (here, "longtail keywords"), I realized: It can't *possibly* be searching the Web for that.
This means any question one asks an LLM is just pissing into the wind. You're not going to get the same results you'd get from, say, clicking a link returned by a search engine. It's not "reading" the Internet. It's generating probabilities from the corpus of the Web. Those are two different things.
Google's AI Search may work differently, given that it "cites" what it says. But I don't think it does. I have had that annoying AI Search box give me confidently wrong answers too many times, then link to a "source" that contained the correct answer. Even if the AI "Search" is narrowing its corpus to the content it "cites," it's still not reading that content back. It's still doing its own per-character guessing game.
(Edit: I don't even think it's doing that. I think it's combining a bog-standard LLM, trained on the Web corpus, with Google's ability to pull up relevant-looking links. Basically, I think it's ChatGPT with the top five Google Search results pasted to the end to make it look like the AI "search" function actually read those links for you.)
At best, a Google AI "search result" is *closer* to what actual sources would tell you than, say, whatever ChatGPT generates. But it's still not reading the thing and telling you what it says. It's still making shit up.
I did ten minutes of my own longtail keyword research this morning, using the kind of tools I used for years before LLMs arrived on the scene. I don't know if I'd call the results better or worse - keyword lists have always seemed kind of silly to me - but there is a definite difference in the quality of focus. ChatGPT definitely is not analyzing Google Search or any other search data. It's just doing a one-bot rap battle over here.
There are so many articles and blog posts and interviews now where techbros say things like "it's intelligent" or "it's like talking to someone with a PhD in the subject" (so it cries when you ask it how the writing is going?) or anything else that implies the text extruder has a mind of its own. Every single one of them is complete and utter bullshit. This is a MACHINE. It is DESIGNED. And it's so, so obvious that is the case.
At one point, I fed ChatGPT a mock blog content strategy. I then directed it to "tell me all the reasons this plan is stupid and will never work."
It didn't. It instead output something like "you're not stupid, and this will work."
Later, I asked it to generate ten possible blog titles and rank them by how well they would do from an SEO standpoint. It ranked the name I'd already chosen first on the list, and it generated a bunch of praise for that name. It was actually eerie until I realized I'd already put the proposed name in the mock content strategy.
Also, ChatGPT is clearly designed to keep the user using ChatGPT. Nearly every output ended with a three-bullet-point list of other things ChatGPT could do, like "generate a complete 90-day content calendar" and "create the full text of the first three blog posts." This was followed by: "Do you want me to do any of these?"
THESE ARE ALL DESIGN CHOICES. The character-probability-analyzer doesn't NEED to blow smoke up the user's ass. It doesn't NEED to praise the user for an idea they've already had. It definitely doesn't NEED to offer "additional services" of any kind.
The outputs are deliberately designed to keep the user using. Full stop.
And this thing didn't design itself. Humans created this thing. Humans are making these decisions. Humans are choosing to make a sycophantic text machine, dressed up like a chat box, that purposely extrudes language intended to shape behavior.
Silicon Valley has been using language and design to keep users hooked for decades, of course. But it's somehow worse when their CEOs sit there and claim the machine their companies created and designed is a magic box that is just doing these things, they have no control over it, they created life or some nonsense.
ChatGPT isn't even usable for the one thing I thought it might be usable for. It is just a tech company lying to me to make itself richer. I now have absolutely zero use for this thing, and I will not ever be touching it again.