Anthropic, the generative AI startup known for its Claude models, has unveiled a major upgrade to its flagship Claude 3.5 Sonnet. The enhanced version does more than improve performance: it introduces a new capability, basic computer control.
The new Claude 3.5 Sonnet is especially strong at coding. It outperforms its predecessor and surpasses both Gemini 1.5 and GPT-4 on several industry benchmarks. The only model that bested the new Sonnet was Gemini 1.5 Pro, and then only on the MATH benchmark.
Anthropic also announced a smaller model, Claude 3.5 Haiku. Scheduled for release later this month, it outperforms Claude 3 Opus, previously the company’s largest model. Despite its compact size, Haiku excels at coding tasks, scoring higher on SWE-bench Verified than both GPT-4 and the original Claude 3.5 Sonnet.
But it’s the ‘Computer Use’ API that truly sets Claude 3.5 Sonnet apart. This new capability allows the AI to interact with desktop applications by generating the necessary keystrokes, mouse clicks, and movements to mimic human behavior. Think of it as an AI agent that can automate tasks within other software.
Anthropic acknowledges that this technology is still in its early stages and prone to errors. The public beta release aims to gather feedback from developers to rapidly improve the API’s performance. “We trained Claude to see what’s happening on a screen and then use the software tools available to carry out tasks,” Anthropic explained in a blog post. This means Claude analyzes screenshots, calculates precise cursor movements, and then executes the necessary clicks to complete tasks within software applications.
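The screenshot-then-act cycle described above can be pictured as a simple loop. The following is only a minimal sketch under stated assumptions: `FakeDesktop`, `Action`, `scripted_policy`, and `run_task` are hypothetical names invented for illustration, the "desktop" is an in-memory stand-in, and the decision step is a scripted stub rather than a call to Claude or to Anthropic's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Action:
    """One low-level step: "move", "click", or "type" (hypothetical schema)."""
    kind: str
    x: int = 0
    y: int = 0
    text: str = ""


class FakeDesktop:
    """In-memory stand-in for a real desktop; nothing touches the actual screen."""

    def __init__(self) -> None:
        self.cursor: Tuple[int, int] = (0, 0)
        self.log: List[Action] = []

    def screenshot(self) -> dict:
        # A real system would return image data; here we return a tiny state summary.
        return {"cursor": self.cursor, "events": len(self.log)}

    def perform(self, action: Action) -> None:
        if action.kind == "move":
            self.cursor = (action.x, action.y)
        self.log.append(action)


def scripted_policy(plan: List[Action]) -> Callable[[str, dict], Optional[Action]]:
    """Stub for the model: replays a fixed plan, then returns None when done."""
    steps = iter(plan)

    def decide(task: str, screenshot: dict) -> Optional[Action]:
        return next(steps, None)

    return decide


def run_task(task: str, desktop: FakeDesktop,
             decide: Callable[[str, dict], Optional[Action]],
             max_steps: int = 10) -> List[Action]:
    """Observe the screen, pick the next action, execute it, and repeat."""
    for _ in range(max_steps):
        shot = desktop.screenshot()
        action = decide(task, shot)
        if action is None:  # the policy signals that the task is finished
            break
        desktop.perform(action)
    return desktop.log


plan = [Action("move", x=120, y=45), Action("click"), Action("type", text="hello")]
desktop = FakeDesktop()
log = run_task("fill in the greeting field", desktop, scripted_policy(plan))
print(len(log), desktop.cursor)  # prints: 3 (120, 45)
```

The `max_steps` cap is a design choice for the sketch: because the technology is described as error-prone, a bounded loop keeps a confused policy from running indefinitely.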
This ‘Computer Use’ functionality could change the way we interact with computers. Imagine an AI assistant that can automatically generate marketing leads, analyze medical data, or even fill out forms online, all without your direct intervention.
Early adopters of this feature include Asana, Canva, Cognition, DoorDash, Replit, and The Browser Company. Replit, for instance, is using Computer Use to develop a feature that evaluates apps as they are being built for its Replit Agent product.
While the idea of an AI controlling our computers might raise concerns, Anthropic assures us that humans remain in control. “People enable access and limit access as needed. Claude breaks down the user’s prompts into computer commands (e.g., moving the cursor, clicking, typing) to accomplish that specific task,” an Anthropic spokesperson told TechCrunch.
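One way to picture "people enable access and limit access as needed" is an allow-list applied to the commands the model proposes. This is an illustrative sketch only, not Anthropic's mechanism: the command names (`move_cursor`, `click`, `type`) and the `filter_commands` helper are assumptions made up for the example.

```python
from typing import Dict, Iterable, List, Set, Tuple

# A proposed command: (kind, arguments). The schema is hypothetical.
Command = Tuple[str, Dict[str, object]]


def filter_commands(commands: Iterable[Command],
                    enabled: Set[str]) -> Tuple[List[Command], List[Command]]:
    """Split proposed commands into those the user has enabled and those blocked."""
    allowed: List[Command] = []
    blocked: List[Command] = []
    for kind, args in commands:
        if kind in enabled:
            allowed.append((kind, args))
        else:
            blocked.append((kind, args))
    return allowed, blocked


# A prompt broken down into low-level commands, as the spokesperson describes.
commands: List[Command] = [
    ("move_cursor", {"x": 200, "y": 80}),
    ("click", {"button": "left"}),
    ("type", {"text": "hello"}),
]

# The user has enabled pointer actions but not keyboard input.
allowed, blocked = filter_commands(commands, enabled={"move_cursor", "click"})
print([kind for kind, _ in allowed])  # prints: ['move_cursor', 'click']
print([kind for kind, _ in blocked])  # prints: ['type']
```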
Anthropic also acknowledges the potential for misuse, such as generating spam or spreading misinformation. To address these concerns, the company has developed new classifiers that can identify when the API is being used and whether that use is harmful.
The release of the upgraded Claude 3.5 Sonnet and its ‘Computer Use’ API marks a significant step forward for AI. With its strong benchmark results and its ability to operate software directly, Claude is positioned to become a powerful tool for automating tasks. Still, the technology is early and error-prone, and its responsible and ethical development remains paramount.