The world was amazed by ChatGPT’s natural language abilities, powered by the GPT-3.5 large language model. But the arrival of GPT-4, the highly anticipated next generation, has redefined what we thought possible with AI, with some even calling it the dawn of AGI (artificial general intelligence).
GPT-4 is the newest language model from OpenAI, capable of generating human-like text. It builds on the technology behind ChatGPT, which originally ran on GPT-3.5 but has since been updated. GPT stands for Generative Pre-trained Transformer, a deep learning architecture that uses artificial neural networks to emulate human writing. According to OpenAI, the new model surpasses its predecessor in three key areas: creativity, visual input, and longer context.
GPT-4 exhibits significantly enhanced creativity, collaborating with users on creative projects such as music, screenplays, and technical writing, and even learning a user’s writing style. This is aided by its ability to process up to 128,000 tokens of text from the user, enabling extended conversations and long-form content. GPT-4 can even interact with the text of a website when given a link.
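To put that context window in perspective, here is a minimal sketch, assuming OpenAI’s open-source tiktoken tokenizer (the sample text is our own illustration), that counts how many tokens a piece of text consumes:

```python
# pip install tiktoken
import tiktoken

# GPT-4 models use the cl100k_base encoding; encoding_for_model resolves it.
encoding = tiktoken.encoding_for_model("gpt-4")

text = "GPT-4 can process up to 128,000 tokens in a single conversation."
tokens = encoding.encode(text)

# In English prose, one token is roughly three-quarters of a word.
print(f"{len(tokens)} tokens")
```

At that ratio, 128,000 tokens works out to roughly 300 pages of text in a single conversation.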
Adding to its versatility, GPT-4 can now accept images as input. In an example on the GPT-4 website, the chatbot is given an image of baking ingredients and asked what can be made with them. Whether it can also process video remains unknown, but the introduction of visual input marks a significant step toward multimodal AI.
OpenAI also emphasizes GPT-4’s improved safety compared to its predecessor. It is reportedly 40% more likely to produce factual responses and 82% less likely to respond to requests for disallowed content. OpenAI attributes this advancement to extensive training with human feedback, involving more than 50 experts in domains such as AI safety and security.
In the initial weeks following its launch, users shared impressive feats accomplished with GPT-4, including inventing new languages, devising plans to escape into the real world, and creating complex animations for apps from scratch. One user reportedly even got GPT-4 to build a working version of Pong in just 60 seconds using HTML and JavaScript.
GPT-4 is accessible to all users across OpenAI’s subscription tiers. Free-tier users get limited access to the full GPT-4o model (around 80 messages within a three-hour window) before being switched to the smaller, less capable GPT-4o mini until the cooldown timer resets. For expanded access to GPT-4 along with image generation via DALL-E 3, users can upgrade to ChatGPT Plus for $20 per month.
For those who prefer not to pay, Microsoft’s Bing Chat offers a free alternative. Also powered by GPT-4, Bing Chat lets users engage with the advanced language model; while some GPT-4 features are missing, it still offers access to the expanded LLM and its capabilities. Bing Chat is limited, however, to 15 chats per session and 150 sessions per day.
Beyond ChatGPT and Bing Chat, GPT-4 is being integrated into various applications, including the question-and-answer site Quora.
GPT-4 was officially announced on March 14, 2023, as Microsoft had confirmed beforehand, and initially became available through a ChatGPT Plus subscription and Microsoft’s Bing Chat (since rebranded as Copilot). OpenAI also released an API version for developers to build applications and services on. Several companies have already integrated GPT-4, including Duolingo, Be My Eyes, Stripe, and Khan Academy. The first public demonstration of GPT-4 was livestreamed on YouTube, showcasing its new capabilities.
GPT-4o mini, the latest iteration of OpenAI’s GPT-4 model line, is a streamlined version of the larger GPT-4o model, optimized for simple, high-volume tasks that prioritize fast inference over the full model’s power. Released in July 2024, it replaced GPT-3.5 as the model ChatGPT falls back to once users reach their three-hour GPT-4o limit. According to Artificial Analysis, 4o mini outperforms similarly sized models like Google’s Gemini 1.5 Flash and Anthropic’s Claude 3 Haiku on the MMLU reasoning benchmark.
The free version of ChatGPT was originally based on the GPT-3.5 model. As of July 2024, however, ChatGPT runs on GPT-4o mini. This streamlined version of GPT-4o significantly surpasses even GPT-3.5 Turbo, with improved comprehension, stronger safeguards, more concise responses, and 60% lower operating costs.
GPT-4 is available as an API for developers who have made at least one successful payment to OpenAI. OpenAI offers various versions of GPT-4 and legacy GPT-3.5 models through its API. While GPT-3.5 will eventually be taken offline, OpenAI has not provided a specific timeline for this.
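For a sense of what building on the API looks like, here is a minimal sketch using OpenAI’s official Python SDK (the prompt is our own illustration, and the model name should be whichever GPT-4-family model your account can access):

```python
# pip install openai
from openai import OpenAI

# The client reads the OPENAI_API_KEY environment variable by default.
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",  # or another GPT-4-family model available to your account
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize what GPT-4 is in one sentence."},
    ],
)

print(response.choices[0].message.content)
```

Requests are billed per token, so developers typically test with short prompts like this before scaling up.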
While the API primarily caters to developers creating new apps, it has also caused some confusion for consumers. Plex, for example, allows users to integrate ChatGPT into the Plexamp music player, requiring a separate ChatGPT API key. This necessitates signing up for a developer account to gain API access.
Although GPT-4 initially impressed users, some have noticed a degradation in its answer quality over time, a concern raised by influential figures in the developer community and echoed on OpenAI’s forums. OpenAI initially attributed these reports to user perception, but a Stanford and UC Berkeley study later documented measurable regressions between March and June 2023: on one task, identifying prime numbers, GPT-4’s accuracy plummeted from 97.6% to 2.4%.
One of GPT-4’s most anticipated features is visual input, which lets ChatGPT Plus users interact with images in addition to text, making it a truly multimodal model. Uploading images for GPT-4 to analyze and manipulate is as simple as uploading documents: click the paperclip icon next to the text field, select the image source, and attach the image to the prompt.
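The same multimodal capability is exposed through the API. Here is a minimal sketch, again using the official Python SDK, that passes an image URL alongside a text prompt; the URL is a placeholder, and "gpt-4o" stands in for any vision-capable GPT-4-family model:

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # any vision-capable GPT-4-family model works here
    messages=[
        {
            "role": "user",
            # A multimodal message mixes text and image parts in one list.
            "content": [
                {"type": "text",
                 "text": "What could I bake with the ingredients in this photo?"},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/ingredients.jpg"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```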
Despite its advanced capabilities, GPT-4 still has limitations. Like previous versions, it struggles with social biases, hallucinations, and adversarial prompts; in other words, it is not perfect and will occasionally give incorrect answers. OpenAI acknowledges these issues and is actively working to address them, and GPT-4 is generally more conservative in its responses than its predecessor, which reduces the likelihood of fabricated information.
Another limitation is that GPT-4’s training data only extends to December 2023 (October 2023 for GPT-4o and GPT-4o mini). However, its ability to conduct web searches lets it access and retrieve more recent information, mitigating this limitation.
GPT-4o is the latest release, with GPT-5 still on the horizon, signaling continued advancements in AI. The future of AI is here, but as with any powerful technology, it’s essential to be aware of both its potential and limitations.