Exploring Snowflake Arctic: The Open-Source LLM for Enterprises

On April 24, 2024, Snowflake announced Arctic, its enterprise-grade large language model (LLM). We are often asked about new models as they are released, but it's not every day that one gets as much business attention as Arctic has generated. For Blend clients and AI enthusiasts wondering what impact Arctic will have on the LLM scene, this post is for you.

First, let’s cover the model’s highlights: Arctic emerged with a promise of robust performance, low training costs, a novel hybrid architecture, and a collaborative outlook towards open-source materials and code. Arctic takes an innovative step forward for enterprise LLMs.

It's worth noting that all the LLMs on the market are an incredible balance of tradeoffs:

Training data – All data, curated data, or passes of each
Target Parameters – Bigger is more comprehensive but notably more computing-intensive
Training Time – Is there a point of diminishing returns? When is it more beneficial to begin training a new version?
Model Architecture – Consider the size of context and types of outputs to optimize for when designing the model

Arctic's Architecture

Architecturally, Arctic is quite interesting. Snowflake has gone with a very large total parameter count spread across a mixture-of-experts (MoE) design with fairly small experts. Because every expert must be held in memory, this equates to large GPU requirements; because only a few experts are activated for each token, it also delivers high throughput.
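To make the memory-versus-throughput tradeoff concrete, here is a minimal Python sketch of top-k expert routing and the resulting parameter arithmetic. The function names are hypothetical, and the sizes (a 10B dense component, 128 experts of about 3.66B each, two experts per token) are figures as we recall them from Snowflake's public model card; treat them as assumptions to verify rather than as numbers established by this post.

```python
import math


def top_k_route(gate_logits, k=2):
    """Pick the k highest-scoring experts and normalize their weights."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over the selected experts only, so the weights sum to 1.
    peak = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    norm = sum(exps)
    return [(i, e / norm) for i, e in zip(top, exps)]


def moe_param_counts(dense_params, num_experts, params_per_expert, k):
    """Return (parameters held in memory, parameters used per token)."""
    total = dense_params + num_experts * params_per_expert
    active = dense_params + k * params_per_expert
    return total, active


# Assumed sizes (see the note above); not taken from this post.
total, active = moe_param_counts(
    dense_params=10e9, num_experts=128, params_per_expert=3.66e9, k=2
)
print(f"total:  {total / 1e9:.1f}B parameters must fit in GPU memory")
print(f"active: {active / 1e9:.1f}B parameters are used for each token")
```

Under these assumed sizes, the model holds roughly 478B parameters in memory but touches only about 17B per token, which is why the hardware footprint is large while per-token compute stays comparatively small.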

Functionally, Arctic’s scores stand out in SQL generation, coding, instruction following, and common sense. These are well aligned with the tradeoffs businesses are looking for in LLMs. When designing for business applications, we’re looking for a high degree of output reliability. As a result, we’re less concerned with coverage for general-purpose use cases. Snowflake calls this balance “Enterprise Intelligence.”

The result is a well-targeted set of tradeoffs: Arctic may not have the breadth of general knowledge of other LLMs, but it focuses on business-critical tasks and on doing them fast. Based on this, we predict the model will scale more efficiently for large enterprise concurrency needs (though it is not suited to deployment at the edge).

For context, the comparison scores Snowflake has shared are consistent with our initial testing, and they give a good overview of the performance and focus trade-offs.

We anticipate Arctic will excel in the following areas:

Highly accurate Snowflake co-pilot

Query Writing
Query Explanations
Object Creation
Access Control
Scripting and Procedural Code

API Code Writing

Snowpark Scripts
API Connector Code
Snowpipe API

Native App Development

Deployment of container services
Deployment of models
Object Dependency Identification
Object Naming, Comments, Code Formatting

Opening Doors with Open-Source

In recent years, Snowflake has expanded its services beyond a modern relational database product, evolving into a comprehensive cloud and data platform. The company has enriched its suite of features (e.g., native application development, ML Ops, and more) that pave the way for custom applications, machine learning, and AI models that have seamless access to the underlying data.

Thanks to the open-source release of Arctic, Snowflake enables clients to build custom stacks that combine this LLM with proprietary models (via partnerships) or any other open-source model to fit a specific use case. Arctic is also being built into Snowflake's new Cortex features.

Initial Arctic Use Cases

There are clear use cases of Arctic as an internal tool and a means to enhance data products that you can take advantage of today. For example, you can get started with:

Deploying Arctic internally to write syntax-accurate Snowflake queries and stored procedures more efficiently
Explaining queries to streamline code reviews and refactors
Creating scripts to roll out new Snowflake objects and grant permissions
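As a rough illustration of the query-explanation use case, the Python sketch below builds a review prompt and the SQL statement that would send it to Arctic through Snowflake's SNOWFLAKE.CORTEX.COMPLETE function. The helper names are hypothetical, the model identifier 'snowflake-arctic' should be checked against what is enabled in your account and region, and the %s placeholder assumes a client that uses pyformat-style parameter binding.

```python
import re

# Assumed model identifier; confirm it is available in your account and region.
DEFAULT_MODEL = "snowflake-arctic"


def build_explain_prompt(sql_text: str) -> str:
    """Wrap a query in instructions asking the model to explain it for review."""
    return (
        "You are reviewing Snowflake SQL. Explain in plain English what the "
        "following query does, then list anything a reviewer should check.\n\n"
        f"{sql_text.strip()}"
    )


def cortex_complete_statement(model: str = DEFAULT_MODEL) -> str:
    """Return a parameterized statement; the prompt is bound at execution time."""
    # The model name is interpolated into SQL, so restrict it to safe characters.
    if not re.fullmatch(r"[A-Za-z0-9._-]+", model):
        raise ValueError(f"unexpected model name: {model!r}")
    # %s is a bind placeholder (an assumption about the client's paramstyle).
    return f"SELECT SNOWFLAKE.CORTEX.COMPLETE('{model}', %s) AS response"


statement = cortex_complete_statement()
prompt = build_explain_prompt(
    "SELECT region, SUM(amount) FROM orders GROUP BY region"
)
print(statement)
print(prompt)
```

The statement and prompt would then be passed to whichever Snowflake client your team already uses. Binding the prompt as a parameter, rather than concatenating it into the SQL text, keeps quotes inside the reviewed query from breaking the call.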

And that’s just the beginning. Teams can get ahead of the curve by speeding up tedious tasks, allowing more time for design and decision-making. Ultimately, effective use of Arctic will increase the productivity of Snowflake development teams.

Beyond benefitting internal teams, we also see possibilities with data sharing in cleanrooms and custom applications in Streamlit.

In a data cleanroom, or any environment for secure data sharing, a data provider may gain consumers every day who are new to the datasets and to the analytics that can be derived from them. Snowflake Arctic could act as a virtual assistant that guides these consumers through experimentation, data comprehension, research, analysis, and data consumption. This removes overhead from the data provider and makes data democratization practical. A cleanroom is an ideal environment for granting functionality to external users while still enforcing model and data access guidelines.

In custom applications (perhaps created in Streamlit), more flexible data exploration can be supported by accepting dynamic query requests (visual or language-driven) and letting Arctic work out the query needed.
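When an application runs queries that a model has written, it is worth checking them before execution. The Python sketch below is one conservative way to do that, assuming the application should only ever run a single read-only statement. The function name and keyword list are hypothetical, and this is a heuristic rather than a SQL parser, so role-based permissions in Snowflake should remain the real control.

```python
import re

# Keywords that indicate a statement changes data, objects, or access.
FORBIDDEN = {
    "INSERT", "UPDATE", "DELETE", "MERGE", "DROP", "ALTER", "CREATE",
    "GRANT", "REVOKE", "TRUNCATE", "CALL", "COPY", "EXECUTE",
}


def is_read_only_select(sql: str) -> bool:
    """Return True only for a single SELECT/WITH statement with no write keywords."""
    text = sql.strip()
    # Reject comments outright: they can hide content from a simple check.
    if "--" in text or "/*" in text:
        return False
    # Blank out string literals so their contents are not mistaken for keywords.
    text = re.sub(r"'(?:[^']|'')*'", "''", text)
    # Allow trailing semicolons, but not a second statement.
    text = text.rstrip(";").strip()
    if not text or ";" in text:
        return False
    tokens = re.findall(r"[A-Za-z_]+", text.upper())
    if not tokens or tokens[0] not in {"SELECT", "WITH"}:
        return False
    return not any(token in FORBIDDEN for token in tokens)


candidate = "SELECT region, SUM(amount) FROM orders GROUP BY region;"
if is_read_only_select(candidate):
    print("safe to run:", candidate)
else:
    print("rejected:", candidate)
```

The check errs on the side of rejecting anything unusual, which suits an application exposed to external users: a false rejection costs a retry, while a false acceptance could change data.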

Finding the right use cases and leveraging a high-performing model within the Snowflake ecosystem creates a variety of possibilities. For enterprise teams, Arctic is another model we encourage you to experiment with and adapt to your specialized use cases. If you have the computing power, tune it and leverage it outside of the Snowflake ecosystem as well.

Snowflake: The Future of GenAI

The announcement of Snowflake Arctic, along with adjacent features such as Snowflake Cortex, Streamlit in Snowflake, Snowflake ML, and Snowflake Container Services, should excite artificial intelligence and Snowflake shops alike. Snowflake is gearing up to be an all-encompassing solution that binds data, artificial intelligence, and human interaction in a cohesive way. Arctic is another signal that Snowflake understands enterprise needs and how generative AI applies to them.

Blend can help you navigate the ins and outs of AI-leveraged problem solving, be it strategy, integration, or fine-tuning models to achieve the necessary performance. We believe the future belongs to those who grow with AI. Connect with us today.
