A whole new contender has entered the ring of large language models (LLMs). Databricks, a company specializing in data processing, has unveiled DBRX, claiming it to be the most powerful open-source LLM yet. But is it backing those claims up? Let’s find out.

132 billion parameters is a big number: for comparison, GPT-3.5 is widely estimated to have around 175 billion.

DBRX uses a transformer architecture with a massive 132 billion total parameters. It relies on a Mixture-of-Experts (MoE) design made up of 16 individual expert networks, of which only 4 are active for any given input, so roughly 36 billion parameters are used at a time, keeping inference efficient. GPT-4 is also widely reported to use an MoE approach, although OpenAI has not confirmed this.
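To make the routing idea concrete, here is a minimal, hypothetical PyTorch sketch of top-k expert routing with 16 experts and 4 active per token. The layer sizes are made up for illustration; this shows the general MoE pattern only and is not Databricks' actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Illustrative MoE layer: 16 experts, 4 active per token (sizes are hypothetical)."""
    def __init__(self, d_model=512, d_hidden=1024, n_experts=16, top_k=4):
        super().__init__()
        self.top_k = top_k
        # The router scores every expert for each token.
        self.router = nn.Linear(d_model, n_experts)
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden),
                nn.GELU(),
                nn.Linear(d_hidden, d_model),
            )
            for _ in range(n_experts)
        )

    def forward(self, x):
        # x: (num_tokens, d_model)
        logits = self.router(x)                          # (num_tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)   # keep the 4 best experts per token
        weights = F.softmax(weights, dim=-1)             # normalize their mixing weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e                 # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

if __name__ == "__main__":
    tokens = torch.randn(8, 512)                         # 8 token embeddings
    print(TopKMoE()(tokens).shape)                       # torch.Size([8, 512])
```

The key point is that although all 16 experts exist in memory, each token only pays the compute cost of the 4 experts it is routed to, which is how a 132B-parameter model can run with the cost profile of a much smaller one.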


Databricks compares DBRX to other prominent open-source LLMs such as Meta's Llama 2-70B, Mixtral (from France's Mistral AI), and Grok-1 (developed by Elon Musk's xAI). DBRX reportedly outperforms its rivals in several key areas:

  • Language Understanding (MMLU): DBRX achieves a score of 73.7%, surpassing GPT-3.5 (70.0%), Llama 2-70B (69.8%), Mixtral (71.4%), and Grok-1 (73.0%).
  • Programming Ability (HumanEval): Here, DBRX demonstrates a significant lead with a score of 70.1%, compared to GPT-3.5's 48.1%, Llama 2-70B's 32.3%, Mixtral's 54.8%, and Grok-1's 63.2%.
  • Mathematics (GSM8K): DBRX takes another win with a score of 66.9%, edging out GPT-3.5 (57.1%), Llama 2-70B (54.1%), Mixtral (61.1%), and Grok-1 (62.9%).

Databricks attributes DBRX's speed to its MoE architecture, built on the company's MegaBlocks research and open-source work, which lets the model generate tokens at a high rate. Databricks also positions DBRX as the most advanced open-source MoE model currently available, potentially paving the way for further advances in the field.

The open-source nature of DBRX allows for wider adoption and contribution from the developer community. This could accelerate further development and potentially solidify DBRX’s position as a leading LLM.
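Because the weights are openly released, developers can try DBRX directly. Below is a minimal, hedged sketch of loading DBRX Instruct with the Hugging Face transformers library; it assumes you have accepted the model's license on the Hub (repo id "databricks/dbrx-instruct") and have hardware able to serve a 132B-parameter MoE model, otherwise a quantized or hosted variant would be the practical route.

```python
# Sketch only: loading the openly released DBRX Instruct weights from the
# Hugging Face Hub. Assumes license acceptance on the Hub and sufficient
# GPU memory for a 132B-parameter MoE model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "databricks/dbrx-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",      # shard the model across available GPUs
    torch_dtype="auto",
    trust_remote_code=True,
)

inputs = tokenizer("What is a Mixture-of-Experts model?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```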
