Large Language Models Archives - Gizmochina

Meta Unveils New Open-source Language Model, Llama 3.1 with an increased Context Length of 128K tokens

Anubhav — Tue, 23 Jul 2024 18:02:01 +0000

Meta unveiled its latest open-source language model, Llama 3.1, on July 23rd. This new iteration boasts several improvements, including stronger reasoning capabilities, broader multilingual support, and a significant increase in context length to 128K tokens.

The new LLM is comparable to GPT-4, GPT-4o, and Claude 3.5 Sonnet

The star of the show is the flagship Llama 3.1-405B, with 405 billion parameters. According to Meta, this model rivals the performance of leading closed-source models in areas like general knowledge, steerability, mathematics, tool use, and multilingual translation. Meta compares its capabilities to GPT-4, GPT-4o, and Claude 3.5 Sonnet.

But the improvements extend beyond the top tier. The 8B and 70B parameter versions of Llama 3.1 are also said to be highly competitive with other open-source and closed-source models of similar sizes.

For those eager to experiment, Llama 3.1 is now downloadable from Meta’s official website and Hugging Face. Additionally, over 25 major partners including cloud giants like AWS, Azure, and Google Cloud, alongside hardware manufacturers like Nvidia and Dell, have been confirmed as ready to support the new model.

The post Meta Unveils New Open-source Language Model, Llama 3.1 with an increased Context Length of 128K tokens appeared first on Gizmochina.

A New Open Source LLM, DBRX Claims to be the Most Powerful – Here are the Scores

Anubhav — Thu, 28 Mar 2024 01:50:47 +0000

A whole new contender has entered the ring of large language models (LLMs). Databricks, a company specializing in data processing, has unveiled DBRX, claiming it to be the most powerful open-source LLM yet. But is it backing those claims up? Let’s find out.

132 billion parameters is a big number – GPT-3, for comparison, has 175 billion parameters

DBRX uses a transformer architecture with 132 billion parameters in total. It is built as a Mixture-of-Experts (MoE) model made up of 16 expert networks. For any given input, only 4 of these experts are active, so about 36 billion parameters are used at a time, which keeps inference efficient. GPT-4 is also widely reported to use an MoE design, though OpenAI has not confirmed this.
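The active-parameter figures above can be checked with a little arithmetic. The sketch below uses only the numbers quoted in this article (16 experts, 4 active, 132 billion total and 36 billion active parameters). The variable names are illustrative, and the combination count simply assumes that any 4 of the 16 experts can be selected together.

```python
from math import comb

TOTAL_PARAMS_B = 132   # total parameters, in billions (from the article)
ACTIVE_PARAMS_B = 36   # parameters in use at a time, in billions (from the article)
NUM_EXPERTS = 16       # expert networks in the model
ACTIVE_EXPERTS = 4     # experts selected at a time

# Share of the model's weights that take part in processing one input.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Number of distinct expert subsets, assuming any 4 of the 16 may be picked.
expert_combinations = comb(NUM_EXPERTS, ACTIVE_EXPERTS)

print(f"Active share of parameters: {active_fraction:.1%}")
print(f"Possible expert combinations: {expert_combinations}")
```

In other words, roughly a quarter of the model's weights do the work for any one input, which is where the efficiency claim comes from.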

Databricks compares DBRX to other prominent open-source LLMs such as Meta’s Llama 2-70B, Mixtral (from France’s Mistral AI), and Grok-1 (developed by Elon Musk’s xAI), as well as OpenAI’s closed-source GPT-3.5. DBRX reportedly outperforms these rivals in several key areas:

Language Understanding: DBRX achieves a score of 73.7%, surpassing GPT-3.5 (70.0%), Llama 2-70B (69.8%), Mixtral (71.4%), and Grok-1 (73.0%).
Programming Ability: Here, DBRX demonstrates a significant lead with a score of 70.1%, compared to GPT-3.5’s 48.1%, Llama 2-70B’s 32.3%, Mixtral’s 54.8%, and Grok-1’s 63.2%.
Mathematics: DBRX takes another win with a score of 66.9%, edging out GPT-3.5 (57.1%), Llama 2-70B (54.1%), Mixtral (61.1%), and Grok-1 (62.9%).
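To see how wide each lead actually is, the scores listed above can be compared directly. This is a small sketch that only re-uses the percentages quoted in this article; the helper name `lead_over_runner_up` is made up for illustration.

```python
# Scores in percent, copied from the list above.
scores = {
    "Language Understanding": {
        "DBRX": 73.7, "GPT-3.5": 70.0, "Llama 2-70B": 69.8,
        "Mixtral": 71.4, "Grok-1": 73.0,
    },
    "Programming Ability": {
        "DBRX": 70.1, "GPT-3.5": 48.1, "Llama 2-70B": 32.3,
        "Mixtral": 54.8, "Grok-1": 63.2,
    },
    "Mathematics": {
        "DBRX": 66.9, "GPT-3.5": 57.1, "Llama 2-70B": 54.1,
        "Mixtral": 61.1, "Grok-1": 62.9,
    },
}


def lead_over_runner_up(results, model="DBRX"):
    """Return (best rival, margin in percentage points) for `model`."""
    rivals = {name: score for name, score in results.items() if name != model}
    runner_up = max(rivals, key=rivals.get)
    return runner_up, round(results[model] - rivals[runner_up], 1)


for category, results in scores.items():
    rival, margin = lead_over_runner_up(results)
    print(f"{category}: DBRX leads {rival} by {margin} points")
```

By these figures Grok-1 is the closest rival in all three categories, and the language understanding lead is much narrower than the programming one.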

Databricks attributes DBRX’s speed to its MoE architecture, built upon their MegaBlocks research and open-source projects. This allows the model to output tokens at a very high rate. Additionally, Databricks positions DBRX as the most advanced open-source MoE model currently available, potentially paving the way for future advancements in the field.

The open-source nature of DBRX allows for wider adoption and contribution from the developer community. This could accelerate further development and potentially solidify DBRX’s position as a leading LLM.

Apple’s New MM1 Large Language Model Blurs the Lines Between Image and Text

Anubhav — Sat, 16 Mar 2024 19:10:14 +0000

Apple’s research team has taken a huge step forward with their new “MM1” multi-modal large language model. This exciting development was detailed in a recent paper titled “MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training”, and it showcases a model with impressive capabilities in both image recognition and natural language reasoning.

The model is available in 3 billion, 7 billion and 30 billion parameter sizes

MM1 comes in three sizes: 3 billion, 7 billion, and 30 billion parameters. Researchers used these models to conduct experiments, pinpointing the key factors that influence performance. Interestingly, image resolution and the number of image tokens have a greater impact than the design of the vision-language connector, and the choice of pre-training data sets can significantly affect the model’s effectiveness.

The research team meticulously built MM1 using a “Mixture of Experts” architecture and a “Top-2 Gating” method. This approach not only yielded excellent results in pre-training benchmarks, but also translated to strong performance on existing multi-modal benchmarks. Even after fine-tuning for specific tasks, MM1 models maintained competitive performance.
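For readers unfamiliar with the term, top-2 gating means a small router scores every expert and only the two best-scoring experts process the input. The following is a generic, minimal sketch of that idea in plain Python, not Apple's implementation: the function names, the toy experts, and the choice to normalize weights over just the two selected experts are all assumptions made for illustration.

```python
import math


def top2_gate(router_logits):
    """Pick the two highest-scoring experts and return {expert_index: weight}.

    Weights are a softmax over just the two selected logits, so they sum to 1.
    """
    if len(router_logits) < 2:
        raise ValueError("need at least two experts")
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top2 = ranked[:2]
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = router_logits[top2[0]]
    exps = {i: math.exp(router_logits[i] - peak) for i in top2}
    total = sum(exps.values())
    return {i: value / total for i, value in exps.items()}


def moe_output(x, experts, router_logits):
    """Combine the outputs of the two selected experts, weighted by the gate."""
    gate = top2_gate(router_logits)
    return sum(weight * experts[i](x) for i, weight in gate.items())


# Toy example: four "experts" that are just simple functions of a number.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x ** 2]
logits = [0.1, 2.0, -1.0, 2.0]   # hypothetical router scores for one input

gate = top2_gate(logits)
print(gate)                             # experts 1 and 3 share the weight equally
print(moe_output(3.0, experts, logits))
```

In a real model the router scores come from a learned layer and the experts are neural networks, but the selection step has the same shape.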

Testing revealed that the MM1-3B-Chat and MM1-7B-Chat models outperform most similarly sized competitors. They particularly shine in tasks like VQAv2 (answering questions about an image), TextVQA (answering questions that require reading text in an image), and ScienceQA (scientific question answering). However, MM1’s overall performance doesn’t yet surpass Google’s Gemini or OpenAI’s GPT-4V. Even so, it is a significant leap forward for Apple in artificial intelligence. The company also recently acquired DarwinAI.

Google DeepMind’s SIMA is Training to Become Your New In-Game Teammate, Here’s How

Anubhav — Thu, 14 Mar 2024 01:42:14 +0000

Get ready for a new kind of gaming buddy! Google DeepMind has introduced SIMA, an AI agent being trained to become your in-game teammate. Is this what AI was meant for? Sounds about right.

The AI companion will perceive elements of both the map and the gameplay

SIMA, which stands for “Scalable, Instructable, Multiworld Agent,” is currently under development, but it has the potential to revolutionize the way we play games. Unlike traditional AI companions, SIMA won’t simply be another NPC. This model is designed to be a cooperative teammate, understanding your actions and adapting its own accordingly. Imagine getting a co-op buddy in Borderlands who lets you loot first before doing it themselves. How cool would that be?

To achieve this, SIMA relies on a combination of natural language processing and image recognition. This allows it to perceive the 3D game world and respond to your instructions and actions. To train this AI teammate, Google has partnered with eight game developers, including the studios behind titles like No Man’s Sky and Valheim.

Through these collaborations, SIMA is learning the fundamentals of gameplay, from basic actions like turning left and climbing ladders to using menus and maps. While complex tasks like resource gathering and camp building are beyond its current capabilities, Google expects SIMA’s skillset to expand significantly in the future. It may not be long before gamers can use a Google AI game buddy to fill the third slot in their Apex Legends lobby.

Claude 3 is the Newest AI Chatbot Competitor, Claims to Surpass ChatGPT & Google’s Gemini

Anubhav — Tue, 05 Mar 2024 03:13:22 +0000

A new challenger has emerged to shake up the landscape of AI and chatbots. Anthropic, an AI startup, has unveiled its “Claude 3” family, a trio of large language models (LLMs) claiming to surpass Google’s Gemini and OpenAI’s ChatGPT in various benchmarks.

Claude 3 has three different variations: Haiku, Sonnet and Opus

Claude 3 comes in three distinct flavors: Haiku, Sonnet, and Opus, each offering varying levels of capability. Anthropic boasts that the entire family delivers exceptional performance across multiple dimensions – multimodality (handling different data types), improved accuracy, enhanced context understanding, and faster response times. Additionally, the new models exhibit a greater willingness to tackle challenging questions, addressing a limitation found in earlier Claude versions that sometimes shied away from prompts deemed risky.

While all three models offer a significant performance boost, Opus takes center stage as the most potent member of the family. Anthropic claims it demonstrates “near-human levels of comprehension” for complex tasks, further showcasing its capabilities through a “Needle in a Haystack” evaluation, where it excelled at recalling information with near-perfect accuracy. Opus is also touted as a problem-solving whiz, adept at handling math challenges, generating computer code, and exhibiting superior reasoning abilities compared to GPT-4.
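A “Needle in a Haystack” test hides one specific fact (the needle) somewhere inside a long stretch of unrelated text (the haystack) and then asks the model to retrieve it. The sketch below illustrates the general shape of such a test under simple assumptions; it is not Anthropic's evaluation. All names are hypothetical, scoring is a plain substring match, and `toy_model` is a stand-in so the example runs without calling any real model.

```python
def build_haystack(filler_sentences, needle, depth):
    """Insert `needle` into the filler text at a relative depth from 0.0 to 1.0."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    position = round(depth * len(filler_sentences))
    sentences = filler_sentences[:position] + [needle] + filler_sentences[position:]
    return " ".join(sentences)


def recall_score(ask_model, filler_sentences, needle, question, expected, depths):
    """Fraction of depths at which the model's answer contains `expected`."""
    hits = 0
    for depth in depths:
        context = build_haystack(filler_sentences, needle, depth)
        answer = ask_model(context, question)
        hits += expected.lower() in answer.lower()
    return hits / len(depths)


def toy_model(context, question):
    """Stand-in for a real model call (hypothetical).

    It returns the first sentence that mentions the question's last word.
    """
    keyword = question.rstrip("?").split()[-1].lower()
    for sentence in context.split(". "):
        if keyword in sentence.lower():
            return sentence
    return "I don't know"


filler = [f"Filler sentence number {i}." for i in range(200)]
needle = "The secret passphrase is tangerine."
score = recall_score(
    toy_model, filler, needle,
    question="What is the secret passphrase?",
    expected="tangerine",
    depths=[0.0, 0.25, 0.5, 0.75, 1.0],
)
print(f"Recall across depths: {score:.0%}")
```

Swapping `toy_model` for a function that calls an actual chatbot, and lengthening the filler, would turn this into a rough version of the test described above.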

However, no technology is perfect, and Claude 3 is no exception. While Anthropic emphasizes improved accuracy, the issue of “hallucinations” – factually incorrect information generated by the models – persists, albeit at a significantly reduced rate compared to previous iterations. Additionally, Opus encounters some lag in responding to queries, exhibiting speeds comparable to the earlier Claude 2 model.

Despite these limitations, Haiku and Sonnet each have their own strengths. Haiku shines in delivering quick responses and extracting information from unstructured data, although it might stumble when faced with complex math problems. Sonnet, a larger-scale model, aims to assist users with mundane tasks, even parsing text from images. Opus, on the other hand, is ideally suited for handling large-scale operations.

Currently, Sonnet and Opus are available for purchase, while a free version of Claude remains accessible on Anthropic’s website. Haiku’s launch date is still under wraps, but the company says it will arrive soon. The primary target audience for Claude 3 appears to be businesses seeking to automate specific workflows. Users will likely encounter these models integrated into online chatbots.
