Silicon Valley-based startup Two Platforms Inc. has introduced Sutra, a multilingual large language model (LLM), and Geniya, an LLM-powered chatbot. Backed by Jio Platforms and Naver Corp, Two Platforms aims to bridge the language gap in AI for non-English markets. Sutra is built with a foundational transformer architecture to enhance language learning without fine-tuning, offering cost savings and improved efficiency compared to existing LLMs. The company’s upcoming Zappy messaging app will integrate AI capabilities. India has witnessed a surge in the development of local multilingual LLMs, with companies like Sarvam AI, Tech Mahindra, and Krutrim contributing to the space.
Results for: Large Language Models (LLMs)
Direct Preference Optimization (DPO), a novel approach to aligning large language models (LLMs) with human preferences, has emerged as a game-changer in the field of natural language processing. Developed by researchers at Stanford University, DPO offers a streamlined and efficient alternative to reinforcement learning from human feedback (RLHF), the method employed by OpenAI in its popular ChatGPT model.
DPO hinges on the mathematical observation that every LLM implicitly defines a reward model under which that LLM's own responses are the optimal ones. This lets the LLM be trained directly on human preference data, rather than through an intermediary reward model, so DPO eliminates the need for a separate LLM to act as a proxy for human feedback. This simplification results in significant efficiency gains, making DPO three to six times faster than RLHF.
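The idea can be made concrete with a small sketch of the DPO loss for a single preference pair. It assumes the summed log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses are already available from both the model being trained and a frozen reference copy. The function name, argument names, and the default `beta` are illustrative choices, not taken from the article or from any particular library.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) response pair.

    Each argument is the total log-probability a model assigns to a
    response. `beta` (an assumed default) controls how strongly the
    policy is held close to the reference model.
    """
    # Implicit rewards: how much more likely the policy makes each
    # response compared with the frozen reference model.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # Loss is -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# If the policy equals the reference, the margin is zero and the loss is log(2).
baseline = dpo_loss(-12.0, -15.0, -12.0, -15.0)

# If the policy has shifted probability toward the chosen response,
# the loss falls below that baseline.
improved = dpo_loss(-10.0, -17.0, -12.0, -15.0)
```

No separate reward model appears anywhere in this computation: the reward is read off the policy's own log-probabilities, which is the simplification the summary describes.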
The ease of use and effectiveness of DPO have made LLM alignment accessible to companies beyond the world-leading AI labs that previously dominated the field. Since the method was presented at the NeurIPS conference in December 2023, eight out of the ten highest-ranked LLMs on an industry leaderboard have been trained with DPO, including models from startups like Hugging Face and tech giants like Meta. While further improvements are anticipated from both the DPO method and proprietary algorithms developed by AI labs, DPO represents a major step forward in the quest to align LLMs with human expectations and desires.
Tencent and China’s automotive industry are collaborating to explore AI, cloud computing, and online mapping technologies to enhance supply chains and improve smart cockpit experiences. Large Language Models (LLMs) are expected to play a significant role in various aspects of the industry, including research, production, marketing, and customer service. Over 48% of new vehicles sold in China since 2021 have smart cockpit configurations, and this number is projected to reach 75% by 2025.
GPT-4, the latest multimodal large language model from OpenAI, has raised concerns due to its ability to exploit publicly disclosed but unpatched ("one-day") security vulnerabilities with minimal human assistance. Researchers have demonstrated that GPT-4 can analyze flaw descriptions and autonomously generate exploits, highlighting the potential for misuse and the democratization of cybercrime.
A recent assessment conducted by Tsinghua University found that two Chinese large language models (LLMs), Baidu’s Ernie Bot 4.0 and Zhipu AI’s GLM-4, rank high among domestic models but fall short of foreign counterparts in overall capabilities. The SuperBench assessment found that overseas models, such as OpenAI’s GPT-4 and Anthropic’s Claude 3, demonstrated superior performance in semantic comprehension, coding abilities, and alignment with human commands. The report highlights the need for Chinese LLMs to close gaps in code-writing and in operative abilities in real-world settings.