Friday, September 20, 2024

Snowflake ropes in AI21’s Jamba-Instruct to help enterprises decode long documents




Today, data cloud giant Snowflake announced it is adding Israeli AI startup AI21 Labs' enterprise-focused Jamba-Instruct LLM to its Cortex AI service.

Available starting today, the model will enable Snowflake’s enterprise customers to build generative AI applications (like chatbots and summarization tools) capable of handling long documents without compromising on quality and accuracy. 

Given enterprises' massive dependence on large files and documents, Jamba-Instruct could be a great asset for teams. However, AI21 is not Snowflake's only large language model (LLM) partner. The Sridhar Ramaswamy-led company has been laser-focused on the gen AI category and has already struck several partnerships to build an ecosystem for developing highly performant, data-driven AI apps.

Just a couple of days ago, the company announced a partnership with Meta to bring the new Llama 3.1 family of LLMs to Cortex. Before that, it debuted its own enterprise model, Arctic. The approach closely mirrors that of rival Databricks, which acquired MosaicML last year and has since moved aggressively, building its own DBRX model and adding new LLMs and tools for customers to build upon.

What does Jamba-Instruct offer to Snowflake users?

Back in March, AI21 made headlines with Jamba, an open generative AI model combining the tried-and-tested transformer architecture with a novel, memory-efficient structured state space model (SSM). The hybrid model gave users access to a massive 256K-token context window (the amount of text an LLM can process at once) while activating just 12B of its 52B parameters, delivering a solution that is both powerful and efficient.
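The efficiency claim comes down to simple arithmetic on the figures above: only a fraction of the model's weights do work on any given forward pass. A minimal sketch of that calculation, using the parameter counts reported in the article:

```python
# Illustrative arithmetic for Jamba's sparse activation,
# using the figures reported in the article.
TOTAL_PARAMS_B = 52   # total parameters, in billions
ACTIVE_PARAMS_B = 12  # parameters active per forward pass, in billions

active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"~{active_fraction:.0%} of parameters active per token")
```

Roughly 23% of the weights participate per token, which is why a 52B-parameter model can run with compute costs closer to a much smaller dense model.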

According to AI21, Jamba delivered 3x the throughput of Mixtral 8x7B (another model in its size class) on long contexts, making it an enticing offering for enterprises. This led the company to debut Jamba-Instruct, an instruction-tuned version of the model with additional training, chat capabilities and safety guardrails to make it suitable for enterprise use cases.

The commercial model launched on AI21’s platform in May and is now making its way to Cortex AI, Snowflake’s no-code, fully managed service for building powerful gen AI applications on top of the data hosted on the platform.

“Due to its large context window capacity, Jamba-instruct has strong processing capabilities. It can handle up to 256K tokens, which is equivalent to approximately 800 pages of text. This makes Jamba-instruct a highly powerful model for a variety of use cases related to extensive document processing such as corporate financial history, earnings call transcripts or lengthy clinical trial interviews,” Baris Gultekin, the head of AI at Snowflake, told VentureBeat.
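The "roughly 800 pages" figure is a back-of-the-envelope conversion. A quick sanity check, assuming typical rules of thumb for English text (around 250 words per page and ~1.3 tokens per word, neither of which comes from the article):

```python
# Rough sanity check of the "256K tokens ≈ 800 pages" claim.
# Assumptions (not from the article): ~250 words per page and
# ~1.3 tokens per word, common rules of thumb for English text.
CONTEXT_WINDOW_TOKENS = 256_000
WORDS_PER_PAGE = 250
TOKENS_PER_WORD = 1.3

tokens_per_page = WORDS_PER_PAGE * TOKENS_PER_WORD  # ~325 tokens per page
pages = CONTEXT_WINDOW_TOKENS / tokens_per_page
print(round(pages))  # roughly in line with the "~800 pages" figure
```

Actual token counts vary with the tokenizer and the density of the text, so the page estimate is indicative rather than exact.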

For instance, financial analysts in investment banks or hedge funds can use a Q&A or summarization tool powered by the LLM to get quick and accurate insights from 10-K filings, which often contain more than 100 pages. Similarly, clinicians could go through long patient reports in a short period to extract relevant information and retailers could build chatbots capable of coherently sustaining long and reference-based conversations with customers.
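Cortex AI exposes its hosted models through SQL functions such as SNOWFLAKE.CORTEX.COMPLETE, so a 10-K summarization query like the one described above can be issued directly from SQL. The sketch below only builds the query string rather than executing it against a live account; the exact model identifier is an assumption based on Snowflake's documentation, not the article:

```python
# Illustrative only: builds a Cortex AI SQL query string for a 10-K
# summary. SNOWFLAKE.CORTEX.COMPLETE is a documented Cortex function,
# but the model identifier passed in is an assumption.

def cortex_summarize_sql(model: str, filing_text: str) -> str:
    """Return a SQL statement asking a Cortex-hosted LLM to summarize a filing."""
    escaped = filing_text.replace("'", "''")  # basic SQL single-quote escaping
    return (
        "SELECT SNOWFLAKE.CORTEX.COMPLETE("
        f"'{model}', 'Summarize the key risks in this 10-K: {escaped}');"
    )

sql = cortex_summarize_sql("jamba-instruct", "Item 1A. Risk Factors ...")
print(sql)
```

In practice the statement would be run through a Snowflake session (worksheet, driver, or connector) against data already governed inside the platform.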

Gultekin noted that the long-context window of the model can simplify the whole experience of building RAG pipelines by allowing a single large chunk of information to be retrieved, and even support “many-shot prompting” to guide the model into following a specific tone while generating.
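The simplification Gultekin describes is that a long-context model can take one large retrieved document in a single prompt, rather than forcing the pipeline to split, embed, and re-rank many small chunks. A hypothetical sketch of that pattern (the prompt template and helper are illustrative, not Snowflake's or AI21's API):

```python
# Hypothetical sketch of long-context RAG: with a 256K-token window,
# one large document can be packed into a single prompt instead of
# stitching together many small retrieved chunks.

def build_prompt(document: str, question: str) -> str:
    """Pack an entire retrieved document into one long-context prompt."""
    return (
        "Answer the question using only the document below.\n\n"
        f"--- DOCUMENT ---\n{document}\n--- END DOCUMENT ---\n\n"
        f"Question: {question}"
    )

prompt = build_prompt(
    "Full 10-K filing text goes here...",
    "What were the main risk factors?",
)
print(prompt)
```

The same structure extends to the "many-shot prompting" Gultekin mentions: the large window leaves room to prepend dozens of worked examples that steer the model's tone and format.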

Major cost benefits

In addition to the ability to handle long documents, Snowflake customers can also expect major cost benefits from Jamba-Instruct.

Essentially, the model's hybrid architecture, combined with mixture-of-experts (MoE) layers that activate only a subset of its parameters, makes its 256K context window more economical than that of other instruction-tuned transformer models of the same size. Further, Cortex AI's serverless inference with consumption-based pricing means enterprises pay only for the resources they use rather than maintaining dedicated infrastructure of their own.

“Organizations can balance performance, cost, and latency by leveraging Snowflake’s scalability and Jamba-Instruct’s efficiency. Cortex AI’s framework enables easy scaling of compute resources for optimal performance and cost benefits. Meanwhile, Jamba-Instruct’s architecture minimizes latency,” Pankaj Dugar, SVP & GM for North America at AI21 Labs, told VentureBeat.

As of now, along with Jamba-Instruct, the fully managed service covers a range of LLMs, including Snowflake's own Arctic and those from Google, Meta, Mistral AI and Reka AI.

“We aim to provide our customers with the flexibility to choose from either open source or commercial models, selecting the best model to meet their specific use case, cost, and performance requirements — without having to set up complex integrations or move data from where it is already governed within the AI Data Cloud,” Gultekin explained.

The list is expected to grow, with more large models, including those from AI21, launching on the platform in the coming months. However, the AI head pointed out that the company continuously weighs customer feedback when evaluating and integrating LLMs, to ensure it only includes models that address specific requirements and use cases.

“We have very strict guidelines and processes when it comes to bringing LLMs within Cortex AI…We want to make sure the model offering covers a broad range of use cases from automated BI to conversational assistants to text processing and summarization. And the model should have unique capabilities — for example, Jamba-instruct has the largest context window amongst the models that we offer to date,” he added. 

Snowflake also acquired TruEra a couple of months ago to help companies navigate the growing range of model choices. Gultekin said customers can use the startup's TruLens offering to run LLM experiments and evaluate which model works best for them.

Currently, over 5,000 enterprises are using Snowflake’s AI capabilities (Cortex and other related features), with the top use cases being automated BI, conversational assistants and text summarization.

