Snowflake has unveiled new enhancements for data scientists, engineers, and application developers to improve programmability.
The latest updates bring Python to the forefront with the launch of Snowpark for Python, now in public preview, and a native integration with Streamlit, currently in development, for rapid application development and iteration.
Additionally, Snowflake is streamlining access to more data with new updates for working with streaming data and for making data stored in open formats and on premises available in the Data Cloud.
The introduction of Snowpark, Snowflake’s developer framework, opens up a programming environment for data scientists, data engineers, and application developers to build scalable pipelines, applications, and machine learning (ML) workflows directly in Snowflake using their preferred languages and libraries.
Snowflake further extends what users can build with Snowpark for Python, making Python’s ecosystem of open-source packages and libraries accessible in the Data Cloud.
With a highly secure Python sandbox, Snowpark for Python runs on the same Snowflake computing infrastructure as Snowflake pipelines and applications written in other languages.
Developers now have the opportunity to streamline and modernize their data processing architecture by consolidating their Python-based data processing in Snowflake using Snowpark.
Additional updates that complement Snowpark for Python include:
Python’s ecosystem of open-source packages is a top choice for developers, and Snowflake’s continued partnership with Anaconda extends access to more Python packages in Snowflake, with all code running in a highly secure sandbox environment.
Snowflake Worksheets for Python, now in private preview, allows users to develop pipelines, ML models, and applications directly in Snowsight, Snowflake’s UI, using Python and Snowpark’s DataFrame APIs for Python, streamlining development with code auto-completion and the ability to produce custom logic in seconds.
Snowflake’s Streamlit integration, currently in development, brings Python-based application development directly into Snowflake, enabling users to build interactive applications and securely share, iterate, and collaborate with business teams to increase the impact of development.
Large Memory Warehouses, currently in development, will allow users to securely perform memory-intensive operations, such as feature engineering and model training, on large data sets using popular open-source Python libraries available through the Anaconda integration.
SQL Machine Learning, starting with time series forecasting now in private preview, enables SQL users to embed ML-powered forecasting into their day-to-day business intelligence and analytics to improve the quality and speed of decisions.
The Snowpark Accelerated program has also seen continued growth, thanks largely to advancements in Snowpark for Python, with more partners building with Python to extend the power of the Data Cloud in their language of choice, the company said.
Snowflake said that getting access to the right data quickly and efficiently is critical to improving developer productivity, building ML models with increased accuracy, and delivering more powerful applications. According to the company, the latest improvements allow teams to experiment faster, with more data at their fingertips, leading to more programming options and deeper user insights.
Innovations include:
Streaming data support eliminates the boundaries between streaming and batch pipelines with Snowpipe Streaming, now in private preview, for serverless ingestion of streaming data, and materialized tables, currently in development, that make it easy to transform streaming data declaratively.
Iceberg Tables in Snowflake, currently in development, enable users to work with Apache Iceberg, a popular open table format, in external storage while taking advantage of the Snowflake platform, simplifying overall data management and enabling architectural flexibility.
External tables for on-premises storage, now in private preview, give users access from Snowflake to their data in on-premises storage systems from vendors such as Dell Technologies, Pure Storage, and more, so they can take advantage of the elasticity of the Data Cloud without having to move this data.
Snowflake senior vice president Christian Kleinerman said, “We are investing heavily in Python to make it easier for data scientists, data engineers, and application developers to build even more in the Data Cloud without compromising governance.
“Our latest innovations increase the value of our customers’ data-driven ecosystems, giving them greater access to data and new ways to develop with it, right in Snowflake.
“These capabilities, combined with Snowflake’s best-in-class data security and privacy, are changing how teams experiment, iterate, and collaborate with data to generate value.”