Exploring Snowflake Arctic: The Open-Source LLM for Enterprises

Luke Turanski & Rob Fuller
May 16, 2024

On April 24th, 2024, Snowflake announced its enterprise-grade large language model (LLM), Arctic. We are often asked about new models as they are released, but it's not every day that one gets as much business attention as Arctic has. For Blend clients and AI enthusiasts wondering what impact Arctic will have on the LLM scene, this post is for you.

First, let’s cover the model’s highlights: Arctic emerged with a promise of robust performance, low training costs, a novel hybrid architecture, and a collaborative outlook towards open-source materials and code. Arctic takes an innovative step forward for enterprise LLMs.    

It's worth noting that every LLM on the market represents a careful balance of tradeoffs.

Arctic's Architecture

Architecturally, Arctic is quite interesting. Snowflake has gone with a very large number of parameters spread across a mixture-of-experts design with fairly small experts. This translates to large GPU memory requirements but high throughput, because only a few experts run for any given token.
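To make that tradeoff concrete, here is a minimal Python sketch of the arithmetic behind a mixture-of-experts model, plus a toy top-k router. The parameter figures are the ones Snowflake has publicly reported for Arctic (a 10B dense component, 128 experts of 3.66B each, two experts selected per token); they are not stated elsewhere in this post, so treat them as assumptions. The function names are ours and purely illustrative, and the router is a simplified stand-in rather than Arctic's actual implementation.

```python
import math
import random

# Figures as publicly reported by Snowflake for Arctic; assumptions for this sketch.
DENSE_PARAMS_B = 10.0    # dense transformer component, in billions of parameters
NUM_EXPERTS = 128        # number of experts in the MoE layers
EXPERT_PARAMS_B = 3.66   # parameters per expert, in billions
TOP_K = 2                # experts selected for each token


def total_params_b():
    """Parameters that must be held in GPU memory (dense part plus every expert)."""
    return DENSE_PARAMS_B + NUM_EXPERTS * EXPERT_PARAMS_B


def active_params_b():
    """Parameters actually used to process one token (dense part plus TOP_K experts)."""
    return DENSE_PARAMS_B + TOP_K * EXPERT_PARAMS_B


def top_k_route(router_logits, k=TOP_K):
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_logits[i] for i in chosen)  # subtract max for numerical stability
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    norm = sum(exps)
    return [(i, e / norm) for i, e in zip(chosen, exps)]


if __name__ == "__main__":
    print(f"total parameters:  {total_params_b():.2f}B")
    print(f"active per token:  {active_params_b():.2f}B")
    rng = random.Random(0)
    logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
    print("experts chosen for one token:", top_k_route(logits))
```

The gap between the two totals is the whole story: memory has to hold every expert, while the compute per token only touches the dense component and the selected experts.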

Functionally, Arctic’s scores stand out in SQL generation, coding, instruction following, and common sense. These are well aligned with the tradeoffs businesses are looking for in LLMs. When designing for business applications, we’re looking for a high degree of output reliability. As a result, we’re less concerned with coverage for general-purpose use cases. Snowflake calls this balance “Enterprise Intelligence.”

The result is a well-targeted set of tradeoffs: Arctic may not have the breadth of general knowledge that other LLMs have, but it focuses on business-critical tasks and on doing them fast. Based on this, we predict this model will be more efficient to scale for large enterprise concurrency needs (but not for deployment at the edge).

For context, here are some of the comparison scores that Snowflake has shared, which are consistent with our initial testing. Below you’ll find a good overview of the performance and focus tradeoffs.

Image from Snowflake

We anticipate Arctic will excel in the following areas:

Highly accurate Snowflake co-pilot  

API Code Writing  

Native App Development  

Opening Doors with Open-Source

In recent years, Snowflake has expanded its services beyond a modern relational database product, evolving into a comprehensive cloud and data platform. The company has enriched its suite of features (e.g., native application development, MLOps, and more), paving the way for custom applications, machine learning, and AI models that have seamless access to the underlying data.

Thanks to the open-source release of Arctic, Snowflake enables clients to build custom stacks by combining this LLM with a selection of proprietary models (via partnerships) or any other open-source model to fit a specific use case. On top of that, Arctic is being built into Snowflake's new Cortex features.

Initial Arctic Use Cases

There are clear use cases of Arctic as an internal tool and a means to enhance data products that you can take advantage of today. For example, you can get started with:

And that’s just the beginning. Teams can get ahead of the curve by speeding up tedious tasks, allowing more time for design and decision-making. Ultimately, effective use of Arctic will increase the productivity of Snowflake development teams.

Beyond benefitting internal teams, we also see possibilities with data sharing in cleanrooms and custom applications in Streamlit.  

In a data cleanroom, or any general environment for secure data sharing, a data provider may get new consumers every day who are unfamiliar with the datasets and the analytics that can be derived from them. Snowflake Arctic could be a virtual assistant that guides new consumers through experimentation, data comprehension, research, analysis, and data consumption. This removes overhead from the data provider and makes data democratization more practical. A cleanroom is an ideal environment to grant functionality to external users while still imposing model and data access guidelines.

In custom applications (perhaps created in Streamlit), more flexible data queries can be facilitated by enabling dynamic query concepts (visual or language-driven), letting Arctic problem-solve the query needed.    
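As a rough illustration of the language-driven query idea, the Python sketch below turns a user's question and a table schema into a prompt and sends it to Arctic through Snowflake Cortex's `SNOWFLAKE.CORTEX.COMPLETE` SQL function. Everything else here is an assumption made for illustration: the helper names, the prompt wording, the example table, and the `run_sql` callable, which stands in for whatever executes SQL in your app (for example, a wrapper around a Snowflake connector cursor). Check model availability in your region and review any generated SQL before running it.

```python
def build_arctic_prompt(question, table_name, columns):
    """Build a text-to-SQL prompt from a user question and a table schema.

    columns is a list of (column_name, data_type) pairs.
    """
    schema = ", ".join(f"{name} {dtype}" for name, dtype in columns)
    return (
        "You are a Snowflake SQL assistant. "
        f"Table {table_name} has columns: {schema}. "
        f"Write one SQL query that answers: {question} "
        "Return only the SQL."
    )


def ask_arctic(run_sql, prompt, model="snowflake-arctic"):
    """Send the prompt to Cortex through a caller-supplied SQL runner.

    run_sql is a hypothetical callable taking (statement, params) and
    returning the completion text as a string.
    """
    statement = "SELECT SNOWFLAKE.CORTEX.COMPLETE(%s, %s)"
    return run_sql(statement, (model, prompt)).strip()


if __name__ == "__main__":
    def fake_run_sql(statement, params):
        # Offline stand-in so the sketch runs without a Snowflake account.
        return " SELECT region, SUM(amount) FROM sales GROUP BY region "

    prompt = build_arctic_prompt(
        "What are total sales by region?",
        "sales",  # hypothetical table
        [("region", "VARCHAR"), ("amount", "NUMBER")],
    )
    print(ask_arctic(fake_run_sql, prompt))
```

In a Streamlit app, the question would typically come from a text input widget, and the generated SQL would be shown to the user or validated before it is executed against the shared data.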

Finding the right use cases and leveraging a high-performing model within the Snowflake ecosystem creates a variety of possibilities.  For enterprise teams, Arctic is another model we encourage you to experiment with and adapt to your specialized use cases. If you have the computing power, tune it and leverage it outside of the Snowflake ecosystem, as well.

Snowflake: The Future of GenAI

The announcement of Snowflake Arctic, along with adjacent features such as Snowflake Cortex, Streamlit in Snowflake, Snowflake ML, and Snowflake Container Services, should excite artificial intelligence and Snowflake shops alike. Snowflake is gearing up to be an all-encompassing solution that binds data, artificial intelligence, and human interaction in a cohesive way. In conclusion, Arctic is another signal that Snowflake understands enterprise needs and how generative AI applies to enterprise use cases.

Blend can help you navigate the ins and outs of AI-leveraged problem solving, be it strategy, integration, or fine-tuning models to achieve the necessary performance. We believe the future belongs to those who grow with AI. Connect with us today.

