Forecasting Change in the Time of GPT
It is impossible to predict what technology will look like in five years, who will develop it, or what it will do. Five years ago, almost no one outside academia and a few technology labs was concerned about large language models (LLMs) or Generative Pre-trained Transformers (GPTs). For most of us, the two decades between the early 2000s and 2020 were spent being excited about neural networks and deep learning for image classification, speech recognition, and natural language processing. Around 2021, that scenario flipped. Instead of classifying images, technology was deftly generating original images and art. By the end of 2022, ChatGPT, DALL-E, and Midjourney had become everyday names, and GPTs were on our desktops, ready to transform industries and hinting at a profound change in our lives. By 2023, the technology was creating original music, speech, videos, and code. In less than five years, Generative AI (GAI) had arrived. Who could have known?
Today, the AI Index report from Stanford University's Institute for Human-Centered Artificial Intelligence (HAI) tells us that, as of 2023, AI has surpassed human performance on several benchmarks in image classification, visual reasoning, and natural language understanding. Among several notable changes in AI, the report pointed out that until 2014, it was primarily academia that released Machine Learning models. That has changed. In 2023, academia released 15 models while industry released a massive 51.
In May 2024, OpenAI unveiled GPT-4o (Omni), its new flagship model that “can reason across audio, vision, and text in real time.” The model could listen, watch, talk, sing, question, joke, laugh, reprimand, and role-play. For one demo, OpenAI invited Salman Khan of Khan Academy to showcase GPT-4o’s capabilities. Khan used the model to tutor his son Imran on a math problem simply by asking GPT-4o to “visually” examine a screenshot of the problem. GPT-4o combined its ability to listen, see, and read with impressive dexterity, showing that it could understand the problem and be a great tutor.
What could the next five years bring? The question is important because, as Gartner predicts, “By 2026, more than 80% of enterprises will have used generative artificial intelligence (GenAI) application programming interfaces (APIs) or models, and/or deployed GenAI-enabled applications in production environments, up from less than 5% in 2023.” So, here are our top three predictions (spoiler alert: the third one is a no-brainer):
One, a vast amount of expertise will be free. The Khan Academy demo should be an early indication of what is coming. Children will find AI-driven tutors. Doctors, lawyers, financial advisors, and citizens will have access to affordable AI-based experts.
Two, personal AIs will know everything about us. Crunching mountains of data in real time is no longer a challenge. Apple is already doing this with Apple Intelligence, which understands personal context to simplify tasks and take action. In a recent TED Talk titled “With spatial intelligence, AI will understand the real world,” Fei-Fei Li, the AI pioneer, computer science professor at Stanford University, and founding director of the Stanford Institute for HAI, said, “The urge to act is innate to all beings with spatial intelligence, which links perception with action. And if we want to advance AI beyond its current capabilities, we want more than AI that can see and talk. We want AI that can do.” She was speaking in the context of embodied AI, but her thinking also holds for digital, cloud-based GAI. Personal AIs will step in with action because they will have access to our health records, our bank accounts and investments, our educational backgrounds, our journals and emails, and our likes and dislikes about organizations, businesses, news sites, movies, books, food, places, and people. Businesses and other institutions have been using APIs for years. Soon, APIs designed for personal AIs will be available. These APIs will have a public interface, letting us selectively make data about ourselves available for specific types of relationships.
Three, people will become the key differentiators of a service or a brand. The reason: nothing is more potent than human experience and intuition, nothing more comforting than a promise made by a human, and nothing more reassuring than the human touch. Human connection is a craving firmly etched into our DNA, one that will take centuries to change. At work, we will always want to speak to the head of HR to sort out our problems. We will always want to talk to a human when a car or watch service goes awry. And we will always want to hear the reassuring voice of an actual human airline executive when we reach the airport late and our flight has taken off.
Surpassing human performance is a legitimate goal. But human connection matters more than productivity, efficiency, and the power of logic. GAI and personal AIs cannot create organizational cultures and social contracts. Humans can. So organizations will keep AI from independently managing several types of processes. Instead, far-sighted organizations will work towards creating an environment where humans and AI can work together. Machines will, thankfully, do what we find boring; we will do what we have always excelled at: building and maintaining reputation and trust.
Author:
Sandeep Kumar,
Sr. VP & Head, Global Consulting