No, ChatGPT is not taking your PM job.
Recently I have been seeing a lot of hoopla, mostly from self-declared AI experts on LinkedIn, about the capability of AI, and specifically large language models like ChatGPT, to replace the Product Management function. According to some, ChatGPT is not only already superior to PMs at accomplishing certain tasks, it is rapidly approaching the point of mastering the most ambiguous and complicated aspects of the PM role. For the doomsday preppers among us, the release of GPT-5 will signal the beginning of the end for PMs, with nothing left to do but update the resume (or have GPT-5 do it for you). PM already stands for so many things, so why not Prompt Manager?
Not so fast. Should ChatGPT and other LLMs be a tool in the proverbial PM tool belt? Almost certainly. Should you already be using it to efficiently supplement or even “outsource” aspects of your PM role? Probably, although like any tool or framework we should be cautious about over-reliance; the fact that something is beneficial in many scenarios does not preclude it from being deleterious in others. But I see four critical blockers, with no clear path to resolution that I’m aware of, that currently prevent ChatGPT and other LLMs from fully replacing you as a PM.
Product Management is an inherently creative job. I don’t mean “creative” in the hand-wavy sense in which it’s often used - I’m not over here finger painting and interpretive dancing my way to launching impactful products. Rather, I mean creative in the way that it is discussed by two experts in the field (to whom I may or may not be related): the ability to synthesize novel ideas, approaches, or solutions by combining knowledge, skills, and insights from various disciplines. While ChatGPT can generate potentially novel ideas by synthesizing information from various sources, it lacks the intentionality that true creativity demands. Moreover, creativity lives in the margins: a creative idea is one that combines concepts that are neither universally obvious nor trivial to connect. By contrast, ChatGPT and other LLMs return probabilistic results that inherently skew towards the common or average - otherwise the variability of the results would render the tool virtually unusable. While prompt engineering can help skew the outputs of LLMs away from the mean, a result being improbable is not sufficient to make it creative. The work required to validate that an idea is not just new but also valuable is - yep, you guessed it - still PM work.
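To make the “skew towards the average” point concrete, here is a toy sketch of temperature sampling - the mechanism most LLMs use to pick the next token. The distribution and numbers are invented for illustration: even when a higher temperature flattens the distribution, the most probable token still dominates, and an improbable draw is merely random, not creative.

```python
import math
import random

def sample_with_temperature(logits, temperature, rng):
    """Sample one token index from logits after temperature scaling."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]  # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1

# Toy "next token" distribution: one common choice, several rare ones.
logits = [4.0, 1.0, 1.0, 0.5, 0.5]
rng = random.Random(0)

for temp in (0.2, 1.0, 2.0):
    draws = [sample_with_temperature(logits, temp, rng) for _ in range(10_000)]
    common_share = draws.count(0) / len(draws)
    print(f"temperature={temp}: most common token chosen {common_share:.0%} of the time")
```

Raising the temperature shifts probability mass toward the rare tokens, but the sampler still has no notion of which rare combination is *valuable* - that judgment is exactly the validation work described above.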
Product Management requires logical reasoning. One of the most worrisome aspects of evaluating LLMs-as-PMs is the sheer ease with which you can get ChatGPT to give you the wrong answer…er, I’m sorry, hallucinate. This took me all of three tries to get:
You don’t even need to work out the logic of this problem to see that ChatGPT is wrong here, just skip to the last line where it concludes that 38 + 2 = 50.
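As a sketch of what “validating the logic” can look like in the very simplest case, here is a toy checker that extracts integer-addition claims like the one above from a model’s output and verifies them. (The surrounding sentence is invented; only the 38 + 2 = 50 claim comes from the example. A real validator would need far more than a regex.)

```python
import re

def check_addition_claims(text):
    """Find simple 'a + b = c' claims in text and verify each one.

    Returns a list of (claim, is_correct) pairs. Only handles integer
    addition; anything more complex needs a proper expression parser.
    """
    pattern = re.compile(r"(\d+)\s*\+\s*(\d+)\s*=\s*(\d+)")
    results = []
    for match in pattern.finditer(text):
        a, b, claimed = (int(g) for g in match.groups())
        results.append((match.group(0), a + b == claimed))
    return results

# A hypothetical model conclusion containing the faulty claim.
model_output = "Therefore, 38 + 2 = 50."
for claim, ok in check_addition_claims(model_output):
    print(f"{claim!r}: {'OK' if ok else 'WRONG'}")  # → '38 + 2 = 50': WRONG
```

The catch, of course, is that arithmetic is the easy part: this kind of mechanical checking does not generalize to the fuzzier chains of reasoning a PM actually has to validate.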
Would you hire a PM who made critical logic errors on a regular basis? (Would ChatGPT recognize that this is a rhetorical question?) While the above is a very simple and understandable example, we should almost certainly be more worried about the potential for logical errors within more complex and less understood problems. If we’re not sure what the right answer looks like, it is exceptionally risky to trust the answer provided by a notorious hallucinator at face value, and so we must in some form or another analyze the series of logical arguments that support the answer. And so again, we’re back to somebody doing PM work to validate the logic of a complex solution.
Now, can the issue of LLMs failing at logical reasoning…sorry, I’ll get the hang of it eventually - hallucinating be fixed going forward? Possibly, although most solutions to this problem acknowledge that the solution isn’t likely to lie within the LLM itself and instead propose the creation of a distinct, and as-yet-undefined, logic engine for LLMs to interface with. Perhaps a better question is, can we improve an LLM’s logical reasoning capabilities without doing any harm to other aspects of its performance?
You cannot continuously improve all of the outputs of a complex system simultaneously. Without getting too wonky, this is simply an immutable law of any complex system whose objective functions are not completely aligned: you cannot optimize everything when optimizing one function degrades the performance of another. An appropriate (and subtly punny) metaphor here is Dr. Dolittle’s Pushmi-Pullyu: the llama’s two heads form a complex system with distinct goals, and unless both heads’ goals are perfectly aligned, at a certain point one head cannot get closer to its goal without pulling the other further away from its own. And yet, the AI hype machine would have us believe not only that every subsequent release of ChatGPT has improved performance in every conceivable way, but that this comprehensive improvement will continue ad infinitum. Given the pace at which new iterations of LLMs are released, these claims are challenging to test thoroughly, but where testing has been performed it has already revealed that optimizing some aspects of an LLM’s performance comes at the cost of declines in others. For example, this study conducted on GPT-3.5 and GPT-4 by researchers at Stanford analyzed the models’ performance across a variety of tasks - answering logic problems, multi-hop knowledge questions that require combining information from disparate sources, and reasoning questions - and found significant variations in the two models’ behavior over time, with performance on some tasks improving while others deteriorated. Most notably, GPT-4’s ability to follow instructions and demonstrate reasoning declined over time.
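The trade-off can be sketched with a toy pair of objectives. The functions and the “reasoning”/“instruction-following” labels are purely illustrative, but they capture the Pushmi-Pullyu dynamic: once the system sits between the two optima, any step that lowers one loss necessarily raises the other.

```python
def reasoning_loss(x):
    """Toy 'reasoning quality' loss, minimized at x = 1."""
    return (x - 1) ** 2

def instruction_loss(x):
    """Toy 'instruction following' loss, minimized at x = -1."""
    return (x + 1) ** 2

# Between the two optima (-1 < x < 1), every move that improves one
# objective worsens the other - the system is on its Pareto frontier.
for x, x_next in [(0.0, 0.5), (0.5, 0.9)]:
    d_reason = reasoning_loss(x_next) - reasoning_loss(x)
    d_instr = instruction_loss(x_next) - instruction_loss(x)
    print(f"moving x from {x} to {x_next}: "
          f"reasoning loss {d_reason:+.2f}, instruction loss {d_instr:+.2f}")
```

Real LLMs optimize over billions of parameters rather than one, but the same constraint applies whenever the training objectives are not perfectly aligned: past the Pareto frontier, “better at everything” is not on the menu.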
Contrast this with humans: you are fully capable of learning how to improve your writing or presentation skills without causing any corresponding decline in your logical reasoning or analytical capabilities, and vice versa. Moreover, while your skills may decline over time if unused, there is no scenario, barring a medical emergency, in which you wake up one morning suddenly and dramatically less capable of performing a task than you were the day before. I will not claim to understand the human brain’s objective functions; suffice it to say they allow us to generate accretive performance in multiple areas of our “brain system” simultaneously in ways that do not harm - and, from the lens of creativity, in fact enhance - other parts of the system.
Product Management requires trust. Finally, let’s imagine all of the above issues are magically solved as we hurtle towards the singularity. For us to outsource our critical thinking and creativity to an AI we must be able to trust in the result, and in order to trust AI we must also be able to hold it accountable. Can we really imagine a future in which an AI generates our product vision, roadmap and specifications and humans (in whatever capacity they remain useful) simply execute? What happens when the results are bad - how exactly would the CEO hold employees accountable, or the board hold the CEO accountable, when an unfeeling and indifferent AI is ultimately steering the direction of the company? Or are we naive enough to believe that everything we build will suddenly become successful simply because it was planned by an AI instead of a human?
It almost doesn’t matter what AI is capable of if we’re not capable of trusting it as we would a human. The best idea, the optimal strategy, and the perfect product have no value if a company cannot align around them and make them reality, and that, once you strip away all of the ceremony and artifacts of the role, is the core responsibility of the PM.
Perhaps I sound like a Luddite. Perhaps this post will age very poorly over the next decade, or year, or even month. But I am simply doing what I do as a PM, which is to reason my way to a set of strong beliefs while remaining open to having them changed. This topic brings to mind something one of my leaders at Amazon once said: “Let’s automate our way out of this work so we can go do something more interesting.” If I’m wrong about any or all of the above, whatever comes next will be a new challenge, requiring rapid development of skills and solutions to bridge the gap between technology and humans, filled with ambiguity. Sounds perfect for a PM. Come on ChatGPT, take my job!