Snowflake experts powering world-class AI teams

Transform messy data into clean, structured foundations that power your AI and analytics. Our elite engineers deliver cost-efficient Snowflake platforms fast – built right the first time.

Talk to an expert

Trusted
Snowflake Partner

Why us

Scale beyond legacy systems

AI, real-time analytics, and compliance demands are pushing legacy data stacks to their limits. Business leaders want results, but data teams are stuck firefighting.

Snowstack bridges that gap with its Snowflake-first architecture, built for performance, governance, and AI readiness from day one. We modernize your infrastructure so you can move fast, stay compliant, and make smarter decisions without the overhead and enterprise costs.

Why us

Fast and cost-efficient execution

Get enterprise-grade results fast and cost-efficiently

Delivered by best-in-class engineering team

Certified Snowflake experts who deliver results right the first time

Built-in security and compliance

Governance controls, access policies, and audit trails embedded into each project

Services

Turn your data into competitive advantage

From migration to ongoing maintenance and integration, we deliver the full spectrum of Snowflake expertise your team needs. Fast implementation, built-in security, and continuous support that adapts to your business growth

Enterprise
Data

Snowflake implementation

End-to-end setup of scalable, secure, and AI-ready data infrastructure — built natively on Snowflake.

Platform
Scalable

Platform team as a service

Get a dedicated, senior Snowflake team to manage, optimize, and scale your data platform without hiring in-house.

Fast
Trusted

Migrations & integrations

Seamlessly move from legacy systems and connect the tools your business relies on — with zero disruption.

Compliance
Governance

AI & data governance

Ensure your data is structured, secure, and compliant — ready to power AI, ML, and analytics at scale.

Cost
Transparency

FinOps

Gain full visibility and control over your Snowflake spend with cost monitoring, optimization, and forecasting.

Future-proof
Expert

Snowflake consulting

Strategic support to audit your current stack, design future-proof architecture, and align data with business goals.

Want to learn more?
Explore all services
Benefits

Why leading companies choose Snowflake

Get answers in seconds

Reports that used to take all day now complete in seconds, so your team can make faster decisions with current data.

Strong ROI with smart costs

Companies get their money's worth by paying only for what they use, with automatic optimization that cuts costs by up to 65%.

Scale with your business

Handle any data volume or user count without slowdowns—your platform automatically scales resources based on actual demand.

Solutions

Deep expertise in data-intensive industries

We understand the unique challenges of regulated sectors where data accuracy, security, and speed directly impact business outcomes.

Success stories

How our Snowflake consulting transforms data operations

Case study
5 min read

How a $45B FMCG leader regained control of their Snowflake platform with Snowstack

Companies that fail to master their data platforms in 2025 will not just fall behind. They will become irrelevant as AI-native competitors rewrite the rules of the market. One global FMCG manufacturer recognized this early on. By partnering with us, they turned their underperforming Snowflake environment into an innovation engine.

Key outcomes:

  • 30% reduction in Snowflake costs through intelligent optimization
  • 60% faster incident resolution with 24/7 monitoring
  • 5+ new AI/BI use cases unlocked from reliable, curated datasets
  • 100% audit readiness for SOC 2 and GDPR frameworks

Having a dedicated Snowflake team that truly understands our platform made all the difference. We no longer chase incidents or firefight pipeline issues - we’re focused on enabling the business. Their ownership, responsiveness, and expertise elevated our data platform from a bottleneck to a strategic asset.

- Senior Director, Data Platforms

Client overview

The client is a multinational Fast-Moving Consumer Goods (FMCG) manufacturer operating in over 180 countries through both corporate offices and an extensive franchise network. With global revenues exceeding $45 billion and more than 6,300 employees worldwide, they manage a diverse product portfolio distributed through complex regional supply chains.

The challenge

Despite investing in modern cloud infrastructure, the client was stuck. Their internal teams lacked the specialized expertise needed to run the platform. When key engineers left, so did the expertise. This resulted in growing technical debt. Critical pipelines regularly failed or ran late. Compliance and audit demands became difficult to satisfy due to inconsistent governance. Without proper optimization, Snowflake costs increased. As a result, the platform’s reputation fell from being seen as an innovation enabler to becoming a business blocker.

What made things even harder was the seasonal nature of FMCG operations. Demand for data engineering resources fluctuated throughout the year. Resource needs spiked during busy times and dropped during slow periods. This led to ongoing hiring and retention challenges. Meanwhile, competitors kept moving forward with steady expertise and data strategies.

Our solution

The client wanted a better way to manage their data and prepare for future growth. They asked us to provide a full Snowflake delivery team that could handle the project from start to finish. Instead of hiring separate contractors, they gained a team of Snowflake-certified experts who worked together to deliver the solution quickly.

| Role | Responsibility |
| --- | --- |
| Service Delivery Manager | Coordination, client communication, strategic alignment |
| Snowflake Platform Lead | Solution architecture, technical strategy, governance |
| L1/L2 Support Specialists | Incident response, 24/7 monitoring, routine maintenance |
| L3 Dev Team Experts | Complex integrations, enhancements, and advanced troubleshooting |
| Data Engineers | Pipeline development, data modelling, ETL optimization (optional based on client needs) |
| DevOps/FinOps Specialists | Infrastructure automation, cost optimization, performance tuning (optional based on client needs) |
| AI/BI Architects | Advanced analytics, machine learning enablement, dashboard strategy (optional based on client needs) |

Our execution

With our support, the client regained platform stability, resolved recurring system issues, and accelerated the delivery of new data solutions. The Snowflake environment became easier to manage, more predictable, and better aligned with business priorities.

Structured collaboration

The client led a phased rollout, supported by bi-weekly service reviews and backlog planning sessions. We worked directly within their workflows (Slack, Teams, Jira) and joined daily stand-ups and steering meetings. To help address long-standing challenges with knowledge retention, we introduced clear RACI ownership and thorough documentation practices.

SLA-Driven Support Model

The engagement featured a service model tailored to the client’s operational needs. Platform support was aligned to business hours, extended hours, or 24/7 coverage depending on requirements. SLAs were defined by incident severity, with guaranteed response and resolution times in place. To give the client real-time visibility and control, we implemented automated monitoring and alerting.

Platform optimisation and future-proofing

The client was committed to building a Snowflake environment that could scale with the business. With our support, they focused on optimising performance, controlling costs, and staying ahead of future demands.

Faster delivery, greater impact

We supported ongoing initiatives by onboarding new data sources, integrating BI tools and APIs, and maintaining platform standards across internal and third-party teams. Automation and reusable pipelines cut source-to-Snowflake integration time from weeks to days.

Continuous improvement and strategic reporting

Monthly platform reports provided clear visibility into KPIs, usage trends, incidents, and optimisation opportunities. This helped the client move from reactive support to proactive, data-driven platform management.

Governance and security practices

To support regulatory and internal compliance requirements, we implemented platform-wide governance controls. These included RBAC, data masking policies, access audits, and full alignment with SOC 2 and GDPR frameworks.

The results

| Area | Before | After |
| --- | --- | --- |
| Platform Stability | Recurring system issues, unpredictable performance | Stable Snowflake environment aligned with business priorities |
| Collaboration | External support created silos, unclear ownership, and poor knowledge retention | Embedded in client workflows, clear RACI ownership, and a strong documentation culture |
| Support Model | Reactive issue handling, manual oversight, and inconsistent coverage | SLA-driven with defined severity levels, automated monitoring, and 60% faster resolution |
| Cost & Performance | Unoptimised Snowflake costs with limited scalability planning | Up to 30% cost reduction with no performance loss, scalable architecture |
| Delivery Speed | Weeks to integrate new data sources, limited AI/BI enablement | 5+ new AI and BI use cases supported by production-ready datasets |
| Reporting & Visibility | Limited insight into platform health and usage trends | Monthly KPI-driven reports, proactive platform optimisation |
| Governance & Security | Gaps in compliance and audit readiness | Full SOC 2 and GDPR alignment, 100% audit readiness with traceability |

Strategic value

Owning a data platform is not the goal. Making it work for the business is.

This partnership showed how Team as a Service can turn a complex platform into a strategic asset. By working directly inside the client’s operations, our certified Snowflake experts turned a complex, high-maintenance platform into a scalable foundation for growth.

Now, they are ready to take on AI and advanced analytics, backed by an architecture built to grow with the business.

At Snowstack, we don’t just help companies manage Snowflake. Our model helps enterprises stay ahead in a data environment that keeps changing.

Ready to turn your Snowflake platform into a competitive advantage?

Let’s talk about how our team can help you get there

Project details

Industry

FMCG

Duration

Ongoing (Support Service)

Engagement Model

Team as a Service

Frequently Used Snowflake Components

Core Snowflake Data Cloud, Snowpipe, Tasks & Streams, Materialized Views, Secure Data Sharing, RBAC & Data Masking, Snowpark, Resource Monitors

Other Tools Integrated

dbt, Fivetran / Airbyte, Power BI / Tableau / Looker, Azure Blob / AWS S3 / GCP Storage, GitHub / GitLab, ServiceNow / Jira, Okta / Azure AD, Great Expectations / Monte Carlo

Case study
5 min read

How a top global logistics leader boosted BI performance by 65% with Snowflake

One wrong move during their Snowflake migration could have brought down hundreds of BI applications and reports. With legacy systems built over 15 years and rising maintenance costs putting operations at risk, this top-5 global logistics company faced its most critical data challenge yet.

Our experts at Snowstack stepped in to navigate this complex transformation. The outcome? A smooth migration that turned the company’s greatest risk into a long-term competitive advantage.

Key outcomes:

  • Report performance improved by 65%, with dashboards running in minutes instead of hours.
  • Infrastructure costs fell by 40% while system performance increased.
  • The migration achieved zero disruption, maintaining 100% uptime.
  • Over 65% of legacy SQL was converted automatically, saving months of effort.
  • More than 40 developers were trained and upskilled on Snowflake.

Over the years, our BI teams developed an effective approach to data modeling, which had long been a strength. However, with the ongoing migration of the central data warehouse to Snowflake, we knew that adopting new tools could take months, if not years. We urgently needed support from Snowflake professionals to guide the adoption process and help our BI teams incorporate the new technology into their workflows.

- Lead Data Architect

Client overview

The client operates as one of the top 5 key players in the industry, managing supply chains that span multiple continents and serve millions of customers worldwide. Their data ecosystem had evolved organically, supporting hundreds of BI applications that power everything from real-time shipment tracking to route optimization algorithms.

The client’s BI reports weren't just internal dashboards. They powered customer-facing systems that enterprise clients used to track shipments worth millions of dollars. Any disruption to these systems could trigger contract penalties and damage relationships with major accounts.

The challenge

15 years of business growth had created a BI environment that was difficult to manage. Hundreds of reports were built independently by different teams with varying skill levels. Although they all drew from the same data warehouse, each team applied its own transformation logic within separate BI systems. What began as team-specific solutions had grown into a web of technical debt that no one fully understood.

| Challenge Area | Client Experience |
| --- | --- |
| Technical debt and inconsistency | Over many years, different teams created their own data models. As a result, reports often contradicted each other — the same metric could show different results depending on the report. This inconsistency frustrated business users and eroded confidence in the data. |
| Escalating maintenance costs | Keeping the BI landscape running became increasingly expensive. Daily maintenance and constant support from the BI team drained resources, while longer service interruptions disrupted critical supply chain operations. |
| Migration risk | The decision to migrate the legacy data warehouse to Snowflake came with high stakes. Leaders worried that reporting would break during the transition, and teams were reluctant to commit knowing how many reports required refactoring. |
| Governance and scalability issues | The architecture had grown in silos. Each team worked with its own logic and processes, making it nearly impossible to apply consistent governance or scale analytics across the organization. Collaboration was limited, and data reusability suffered. |

Our solution

Recognizing the critical need for modernization, the client made the strategic decision to unify their data model and move it to Snowflake alongside their ongoing data warehouse migration. We guided the client through five steps.

Step 1: identifying the foundation

Together with the client, we analysed their extensive BI landscape to identify the datasets most frequently used across reports. This joint assessment defined a minimum viable product (MVP) scope that would deliver immediate value and build momentum for the broader transformation.

Step 2: building the Snowflake environment

We worked with the client to establish a dedicated Snowflake environment designed specifically for BI collaboration. Together, we implemented:

  • Standardized schemas and roles to ensure consistent data access patterns across teams
  • Compute scaling strategies optimized for BI workloads
  • Role-based access control (RBAC) to strengthen governance
  • BI-specific access patterns tailored to Snowflake datasets

Step 3: automating the migration process

To accelerate the transition and protect prior investments, we partnered with the client to implement automated migration scripts that converted legacy SQL into Snowflake SQL. This achieved a 65% automatic refactor success rate, dramatically reducing manual work while preserving business logic.

Step 4: orchestrating seamless integration

In close collaboration, we designed and deployed new orchestration pipelines that synchronized Snowflake model builds with BI report refreshes. These pipelines were integrated with the client’s existing technology stack, including:

  • Airflow Snowflake Operator for workflow management
  • AWS SNS for notifications
  • AWS S3 for data staging
  • Git for version control

Step 5: investing in the team

Recognizing that technology transformation must go hand-in-hand with people transformation, we partnered with the client to deliver training for more than 40 BI developers. This knowledge transfer ensured teams could confidently work with Snowflake as their new backend, embedding long-term value into the organization.

Foundation for Future Innovation

Still running hundreds of disconnected BI reports with inconsistent data models?

Upgrading your BI architecture is no longer a matter of if. The real question is how quickly you can create a single source of truth before competitors pull so far ahead you can’t catch up. The companies winning today are those replacing broken reporting with accurate, unified data that every team can trust. Each month you delay, they improve decision accuracy and grow their market share.

We help you close that gap fast. Our Snowflake-certified experts bring years of experience and a proven approach to modern BI transformation. We can take years of messy, disconnected systems and turn them into a single, reliable analytics platform in months. With one source of truth in place, your teams spend less time fixing reports and more time acting on accurate information, delivering faster business decisions.

Ready to unify your BI architecture on Snowflake?

Book your strategy session

Project details

Industry

Global Logistics & Supply Chain

Duration

3 months implementation

Engagement Model

Migration service with comprehensive training & support

Team Composition

Lead Architect, Data Engineers, Migration Specialists, BI Developers

Frequently Used Snowflake Components

Warehouses, RBAC, Snowpipe, Tasks & Streams, Secure Data Sharing, Materialized Views, Time Travel, Stored Procedures

Other Tools Integrated

Airflow Snowflake Operator, AWS SNS, AWS S3, dbt, Fivetran, Power BI, Azure AD

From 80% faster reporting to 65% cost savings, here's how our clients turned data into business results.

View all stories
Transparent and proven methodology

The expert-led delivery framework

No big bang. No black boxes. Our signature transparent methodology, refined through years of Snowflake experience, is designed to deliver fast, high-quality results.

Accelerators

Designed to move fast

Whether you’re building a modern data warehouse, governed data sharing, or AI-driven use cases - our Snowflake-native accelerators eliminate months of development while embedding enterprise-grade practices.

Ingestion templates

For batch, API, and streaming data sources with error handling and monitoring, built using Airflow, AWS Glue, or Snowflake OpenFlow.

Enable AI with your data

Cortex Agents and Snowflake Intelligence applied to your data using semantic models defined by the business.

Snowpark starter kits

Python-based ML and data engineering frameworks with optimized performance patterns for Snowflake compute.

Cost guardrails

To keep usage optimized and transparent with automated alerts and warehouse scaling rules.

CI/CD deployment frameworks

For repeatable, secure platform rollouts with GitOps workflows and automated testing pipelines.

Data product blueprints

Accelerates domain-aligned architecture and business adoption with built-in governance and access controls, built using dbt.

Enterprise-grade security

Enterprise security controls and governance frameworks built into every Snowflake implementation. Role-based access, data encryption, and audit trails configured from day one.

SOC 2
Compliance

Testimonials

What our clients say

What used to take us hours of manual clean-up across dozens of Excel files is now a seamless process. The Snowstack team didn't just give us technology – they gave us our time back. We now build better reports much faster, and can finally think about predictive analytics as a reality, not just a wish. They felt like part of our team from day one.

Head of Sales Intelligence

Having a dedicated Snowflake team that truly understands our platform made all the difference. We no longer chase incidents or firefight pipeline issues – we’re focused on enabling the business. Their ownership, responsiveness, and expertise elevated our data platform from a bottleneck to a strategic asset.

Senior Director, Data Platforms

Working with Snowstack was a game-changer. Their team came in with a clear methodology, deep Snowflake expertise, and zero handholding needed. We didn't have to move a muscle in-house – they brought it all, tailored it to our business, and delivered fast.

CTO, Regional Pharma Distributor

Over the years, our BI teams developed an effective approach to data modelling, which had long been a strength. However, with the ongoing migration of the central data warehouse to Snowflake, we knew that adopting new tools could take months, if not years. We urgently needed support from Snowflake professionals to guide the adoption process and help our BI teams incorporate the new technology into their workflows.

Lead Data Architect
Insights

Learnings for data leaders

Blog
5 min read

Understanding Snowflake: 7 core capabilities that set it apart from legacy databases in 2025

Most enterprise databases were built for monthly reports, not AI products that need fresh, reliable data every hour. This guide breaks down 7 core Snowflake capabilities, explains how they solve the typical limitations of Oracle, Teradata, SQL Server, and on-premises PostgreSQL or MySQL, and shows what they mean for your teams in real projects.

Let's be honest: your current database was most likely built for monthly reports, not AI products that demand fresh data and reporting around the clock. That is why, in 2025, innovative, data-driven businesses continue their migration away from legacy databases like Oracle, Teradata, SQL Server, and on-premises MySQL/PostgreSQL toward modern cloud-native architectures. Snowflake has become the industry leader, powering analytics and AI workloads across finance, retail, technology, and enterprise sectors.

This guide breaks down 7 core Snowflake capabilities and shows how the right Snowflake consulting can turn them into real results for your teams.

What is the legacy database challenge?

Before diving into Snowflake's capabilities, it's crucial to understand the limitations organisations face with traditional databases. Consider the scenario of a global FMCG company operating in multiple regions, where we helped transform the data infrastructure from legacy on-prem systems to a modern cloud platform.

With our expert Snowflake migration services, the company moved to Snowflake + dbt + Fivetran + Tableau as a modern data stack.

Talk to our Snowflake consultant →

| Challenge | Impact |
| --- | --- |
| Legacy on-prem SQL servers and siloed BI systems | Slow insights, high maintenance burden |
| Manual ETL pipelines | Inconsistent data accuracy |
| High infrastructure and scaling costs | Limits on reporting and forecasting |
| Slow experimentation for data science | Delays in business decisions |

The 7 core Snowflake capabilities in 2025

1. Multi-cluster shared data architecture

The fundamental differentiator: Snowflake's three-layer architecture completely separates storage from compute resources.

Key benefits:

  • Unlimited concurrency
  • Auto-scaling virtual warehouses
  • Near-zero locking and contention
  • Pay-as-you-use compute

This means analysts, data scientists, and applications can work in parallel on the same datasets without contention.

Business impact:

You no longer have to buy extra storage just to get more compute. You scale up when you need power, scale down when you don’t, and you can see what that means for your bill in minutes with our FinOps savings calculator.
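
To make this concrete, here is a rough SQL sketch of independent, auto-scaling warehouses sharing the same data. Warehouse names and sizes are illustrative, not a prescription:

-- BI warehouse: small, scales out automatically under concurrency, suspends when idle
CREATE WAREHOUSE IF NOT EXISTS BI_WH
  WAREHOUSE_SIZE = 'XSMALL'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 4
  SCALING_POLICY = 'STANDARD'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;

-- Data science warehouse: bigger compute, same data, zero contention with BI
CREATE WAREHOUSE IF NOT EXISTS DS_WH
  WAREHOUSE_SIZE = 'LARGE'
  AUTO_SUSPEND = 120
  AUTO_RESUME = TRUE;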

2. Cross-cloud & multi-region replication

This Snowflake capability is critical for regulated industries (financial services, healthcare, insurance) and companies with international operations requiring data sovereignty compliance.

Snowflake delivers:

  • Multi-cloud availability on AWS, Azure, and Google Cloud Platform
  • Easy cross-region replication and failover
  • Global application distribution
  • Built-in disaster recovery without complex configuration

Plan residency, failover, and recovery during platform architecture, then implement Snowflake like a pro.

Business impact:

A global FMCG company can maintain synchronised data across North American, European, and Asian markets while meeting local data residency requirements. This is difficult to achieve with legacy on-premises databases.
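
As a hedged sketch of how this can be wired up with database replication (the database, organization, and account names are illustrative):

-- On the primary account: allow the database to be replicated to another account
ALTER DATABASE ANALYTICS ENABLE REPLICATION TO ACCOUNTS myorg.emea_account;

-- On the secondary account: create the replica and refresh it on a schedule
CREATE DATABASE ANALYTICS AS REPLICA OF myorg.us_account.ANALYTICS;
ALTER DATABASE ANALYTICS REFRESH;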

3. Zero-copy cloning & time travel

Snowflake's innovative approach to data management enables instant environment creation with zero additional storage costs.

Game-changing features:

  • Clone terabyte-scale databases in seconds without duplicating data
  • Time Travel for historical queries and point-in-time recovery
  • Safe dev/test environment provisioning without impacting production

Development teams can spin up complete production-like environments instantly for testing, while legacy databases require duplicated environments that consume massive storage and take hours or days to provision.
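
For example (database and table names are illustrative):

-- Clone a production database instantly; storage is shared until the data diverges
CREATE DATABASE ANALYTICS_DEV CLONE ANALYTICS;

-- Query a table as it looked one hour ago using Time Travel
SELECT * FROM ANALYTICS.SALES.ORDERS AT (OFFSET => -3600);

-- Recover a table that was dropped by mistake
UNDROP TABLE ANALYTICS.SALES.ORDERS;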

Business impact:

Data engineers can test complex transformations on production-scale data without risk, dramatically accelerating development cycles and improving data reliability.

4. Built-in governance & RBAC security

In 2025, data governance and security are business-critical requirements for compliance and risk management.

Snowflake's security framework includes:

  • Fine-grained access control with row-level and column-level masking
  • Data lineage and classification for understanding data provenance
  • Policy-based access control with external tokenisation partner support
  • Automatic encryption at rest and in transit
  • Dynamic data masking to protect sensitive information
  • Audit logging and monitoring for compliance reporting

These are essential for organisations operating under SOC 2, HIPAA, GDPR, and PCI DSS.
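
As a brief sketch of what column-level protection looks like in practice (policy, role, table, and column names are hypothetical):

-- Mask email addresses for every role except a privileged one
CREATE MASKING POLICY PII_EMAIL_MASK AS (val STRING) RETURNS STRING ->
  CASE WHEN CURRENT_ROLE() IN ('PII_ANALYST') THEN val ELSE '*** MASKED ***' END;

ALTER TABLE CUSTOMERS MODIFY COLUMN EMAIL SET MASKING POLICY PII_EMAIL_MASK;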

5. Native AI & Python ecosystem

Snowflake has built-in support for Python and machine learning, so your team can build and run models where the data already lives instead of exporting it elsewhere. With solid AI and data governance in place, it becomes easier to try new ideas safely and move them into production. The key building blocks are:

| Feature | Value |
| --- | --- |
| Snowpark for Python | Run Python directly in Snowflake |
| Native ML inference | Zero data movement |
| UDFs / Stored Procedures | Custom logic at scale |
| ML ecosystem partners | Dataiku, H2O.ai, SAS integration |

Business impact:

This means that teams can train, deploy & serve ML models securely inside Snowflake. Data scientists spend less time on data engineering and infrastructure management and more time building models that drive business value.
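
As a minimal sketch, a Python function can be deployed and called directly in Snowflake with SQL DDL; the function name and logic below are placeholders, not a real model:

CREATE OR REPLACE FUNCTION SENTIMENT_SCORE(review STRING)
  RETURNS FLOAT
  LANGUAGE PYTHON
  RUNTIME_VERSION = '3.10'
  HANDLER = 'score'
AS
$$
def score(review):
    # Placeholder logic; a real model would be loaded from a stage or model registry
    return 1.0 if "great" in review.lower() else 0.0
$$;

-- Call it like any other SQL function
SELECT review_text, SENTIMENT_SCORE(review_text) FROM PRODUCT_REVIEWS;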

6. Marketplace & data sharing economy

The Snowflake Marketplace reshapes how enterprises access third-party data, functioning as the “App Store for data”. It offers:

  • Thousands of data providers covering financial data, geospatial information, retail insights, weather patterns, ESG metrics, and logistics intelligence
  • Live data feeds without pipelines (No ETL required)
  • Private data exchange across subsidiaries, partners, and customers

Business impact:

You can now achieve faster analytics, better forecasting, and smarter decisions by instantly accessing external data sources that would traditionally require weeks of negotiation, integration work, and ongoing pipeline maintenance.
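
A sketch of what consuming a share looks like once a provider grants access (provider, share, and table names are illustrative):

-- Mount the provider's share as a read-only database; no pipelines, no copies
CREATE DATABASE WEATHER_DATA FROM SHARE PROVIDER_ACCOUNT.WEATHER_SHARE;

-- Query the live data like any local table
SELECT region, forecast_date, precipitation_mm
FROM WEATHER_DATA.PUBLIC.DAILY_FORECAST
WHERE region = 'EMEA';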

7. Extensibility: unistore & native apps

Snowflake is no longer just a data warehouse. In 2025, it can also handle simple day-to-day transactions and apps that run directly on your data.

Next-generation capabilities:

  • Unistore for OLTP-lite workloads, enabling hybrid transactional/analytical processing
  • Snowflake Native Apps for custom application development
  • Streamlit integration for building interactive data applications
  • Real-time data pipelines via Kafka connectors and Snowpipe Streaming

Business impact:

Snowflake serves hybrid workloads that legacy databases struggle to handle without significant operational complexity. Organisations consolidate their data infrastructure rather than maintaining separate systems for transactional and analytical workloads.
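
As a hedged sketch, a lightweight operational table might look like this (names are illustrative; Hybrid Tables require a primary key):

CREATE HYBRID TABLE ORDER_STATUS (
  order_id   INT PRIMARY KEY,
  status     STRING,
  updated_at TIMESTAMP_NTZ
);

-- Fast point lookups and updates next to your analytics
UPDATE ORDER_STATUS
SET status = 'SHIPPED', updated_at = CURRENT_TIMESTAMP()
WHERE order_id = 1001;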

Real-world example: Snowflake consulting & migration results

Here’s what the shift looks like in practice. In a recent Snowflake project with a global FMCG company, we rebuilt the analytics backbone by establishing a governed core data model, automating ingestion and orchestration with native services and partner connectors, and reconnecting BI directly to a single, auditable source of truth. As seen in the table below, the result was a step-change in reliability and speed.

Documented results from migration to Snowflake:

| Before Snowflake | After Snowflake |
| --- | --- |
| Overnight BI refreshes | Same-day analytics refreshes |
| High ETL maintenance | 80% automation via Pipes & Streams or Snowflake partner integrations like Fivetran |
| Siloed regional reporting | Centralized data lakehouse |
| Manual Excel forecasting | Automated ML-powered forecasting |
| Slow KPI access for business | Real-time dashboards in Tableau |

Beyond the database

Snowflake’s strengths include a unique design, flexible scaling, strong access and security controls, built-in AI features, and safe sharing across regions, which make it more than a database. It is a modern cloud data platform that powers predictive analytics and self-service reporting, so product teams can trust the data and use it with ease. In business, the faster you get answers, the stronger your advantage, and Snowflake is setting the standard for enterprise data platforms.

If you are choosing a data platform in 2025, plan for what you will need next year as well as today. Snowflake’s design is built for an AI-ready cloud-based future. We help you make that future real by setting up Snowflake, connecting your data, putting clear access rules in place, and keeping costs under control with a simple 90-day plan that we build with your team.

Ready to turn Snowflake into results?

Book a 30-minute call with our Snowflake consultant →

FAQs

They decide how fast your teams can work, how often they’re blocked, and how much you pay every month. Features like multi-cluster compute, Time Travel, zero-copy cloning, governance, AI support, and Marketplace only help if they’re wired into a clear plan. That’s what our Advisory and architecture and Snowflake implementation projects are designed to do.

Yes. You can replicate data across regions and even across clouds (AWS, Azure, GCP) for disaster recovery, latency, and compliance needs. The important part is to plan this up front: which regions you need, what your RPO/RTO targets are, and how you will test failover. We design this as part of Advisory and architecture.

Yes. With Snowpark, Cortex, and support for unstructured data, you can build AI use cases (scoring, recommendations, search) directly on Snowflake. Vector search lets you work with embeddings for things like document or product search without moving data into a separate stack. We help you do this safely under one set of rules via AI and data governance.

The Snowflake Marketplace is a catalog of live third-party data and apps that you can plug straight into your account without building heavy ETL pipelines. It’s useful when you need external data such as demographics, weather, payments, ESG, or location data to enrich your own. We help you pick the right data products and wire them into your models and dashboards through Migrations and integrations.

Unistore and Hybrid Tables let Snowflake handle simple transactional or row-based workloads (for example, orders, events, or app states) close to your analytics. They matter when you want to keep both “what just happened” and “what does it mean” on the same platform, instead of running a separate operational database. We include them where it makes sense in Snowflake implementation projects.

Yes. Snowflake can read and write Apache Iceberg tables in external storage, which is helpful if you are building or keeping an open data lake or a hybrid “lakehouse” setup. That way you don’t have to lock everything into a single format or vendor. We usually design this as part of Migrations and integrations.

Blog
5 min read

Can Snowflake store unstructured data? How Snowflake handles documents, images, and other data in 2025

Snowflake isn’t just rows and columns anymore. In 2025 you can land PDFs, images, logs, and app data next to your tables, then query, enrich, and search them with SQL, Snowpark, and Cortex AI.

What if your PDFs, transcripts, and logs could live in the same place as your BI dashboards? For years, Snowflake was known primarily as a cloud native data warehouse built for structured analytics. It was the go-to solution for SQL analysts, BI teams, and data engineers working with neat rows and columns. Meanwhile, many teams dealing with documents, images, logs, and raw application data assumed they needed separate storage such as Amazon S3, Google Cloud Storage, Azure Blob, or NoSQL databases.

In 2025, that separation no longer has to exist. Snowflake is now a multimodal data platform that can store, process and query unstructured data.

So yes, Snowflake can store unstructured data, but more importantly, it can use it. This capability offers significant architectural advantages for modern data teams. In this blog post, we’ll break down exactly how and why it matters.

What is unstructured data?

Unstructured data refers to any information that doesn't fit neatly into traditional rows and columns. This includes:

  • Documents: PDF, DOCX, TXT files
  • Images: PNG, JPG, TIFF formats
  • Audio and video files: Media content and recordings
  • Logs and event data: Application and system logs
  • Communication data: Email threads and chat transcripts
  • Markup and structured text: HTML, XML, JSON blobs
  • Binary files: Application-specific file formats

As organisations increasingly generate massive volumes of this data, the need for unified platforms that can both store and analyse unstructured content has become critical.

How Snowflake stores unstructured data

Snowflake stages for unstructured data

Snowflake manages unstructured data through stages: storage locations that reference files either within Snowflake's managed infrastructure or in external cloud storage:

  • Internal Stages: Files are stored within Snowflake's managed storage, offering quick setup and seamless integration
  • External Stages: Files remain in external cloud locations (Amazon S3, Azure Blob Storage, Google Cloud Storage), with Snowflake accessing them via metadata references

You can also combine both approaches for optimal performance and scalability based on your specific requirements.
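
A minimal sketch of both options (stage, bucket, and integration names are illustrative):

-- Internal stage: files live in Snowflake-managed storage
CREATE STAGE DOCS_INTERNAL
  DIRECTORY = (ENABLE = TRUE);

-- External stage: files stay in your cloud bucket; Snowflake keeps metadata references
CREATE STAGE DOCS_EXTERNAL
  URL = 's3://my-company-docs/contracts/'
  STORAGE_INTEGRATION = MY_S3_INTEGRATION
  DIRECTORY = (ENABLE = TRUE);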

The FILE data type in Snowflake for unstructured files and metadata

Snowflake provides a dedicated FILE data type for unstructured data. A FILE value represents a reference to a file stored in an internal or external stage, without storing the actual file content in the table itself. This approach allows:

  • Efficient storage and cost management
  • Fast metadata querying
  • Seamless integration with processing pipelines

Accessing unstructured files in Snowflake

Snowflake provides familiar commands for file management:

  • PUT: Upload files to stages
  • GET: Download files from stages
  • LIST: View files stored in stages

These operations mirror cloud storage interactions while maintaining Snowflake's security and governance standards.
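
For example, from SnowSQL or another client (file and stage names are illustrative; PUT and GET work against internal stages):

-- Upload a local file into an internal stage
PUT file:///tmp/invoice_2025_001.pdf @DOCS_INTERNAL AUTO_COMPRESS = FALSE;

-- See what is on the stage
LIST @DOCS_INTERNAL;

-- Download the file back to the local machine
GET @DOCS_INTERNAL/invoice_2025_001.pdf file:///tmp/downloads/;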

Processing and querying unstructured data in Snowflake

Storage is just the beginning. Snowflake's real power lies in its ability to process and extract insights from unstructured data.

Snowflake Cortex AI and Document AI for PDFs, images and hybrid search

Cortex AI enables advanced analytics on unstructured data directly within Snowflake:

  • Document analysis: Extract text, summarise content, and perform batch LLM inference on PDFs and documents
  • Image processing: Run classification and analysis on stored images
  • Multimodal SQL functions: Query and transform documents, images, and audio using SQL-powered pipelines
  • Schema-aware extraction: Automatically extract structured tables from unstructured documents like invoices and reports
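
As a rough illustration of calling these capabilities from SQL (function availability and model names vary by region and edition; the table name is hypothetical):

-- Summarise extracted document text
SELECT SNOWFLAKE.CORTEX.SUMMARIZE(raw_text) AS summary
FROM PARSED_DOCUMENTS;

-- Prompt an LLM over the same rows
SELECT SNOWFLAKE.CORTEX.COMPLETE(
         'mistral-large',
         'Extract the invoice number and total amount from: ' || raw_text
       ) AS extraction
FROM PARSED_DOCUMENTS;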

Snowpark for custom processing

With Snowpark, you can:

  • Extract text from PDFs using Python
  • Perform image classification with embedded ML models
  • Parse JSON or log files into VARIANT columns
  • Run OCR, NLP, and generate embeddings via external functions
  • Build semantic search capabilities over document collections

VARIANT data type for semi-structured data

The VARIANT data type handles semi-structured data formats like JSON, XML, Parquet, and Avro:

  • Store complex, nested data structures
  • Query JSON fields directly using SQL
  • Maintain schema flexibility while preserving query performance
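
For instance (table and field names are illustrative):

CREATE TABLE APP_EVENTS (payload VARIANT);

-- Query nested JSON fields directly with path notation
SELECT payload:user.id::STRING    AS user_id,
       payload:event_type::STRING AS event_type
FROM APP_EVENTS
WHERE payload:event_type = 'checkout';

-- Explode nested arrays with LATERAL FLATTEN
SELECT f.value:sku::STRING AS sku
FROM APP_EVENTS,
     LATERAL FLATTEN(input => payload:items) f;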

Why unified data architecture matters

In most companies, data still lives in many places and tools. Dashboards sit on a legacy SQL warehouse, logs go to a separate observability stack, and documents and images disappear into unmanaged cloud buckets or shared drives.

Instead of stitching together a dozen point solutions, you can use Snowflake as the backbone of your data architecture and keep external systems only where they add unique value. The table below shows how data stack functions shift when you standardise on Snowflake in 2025:

| Function | Old architecture | Snowflake in 2025 |
| --- | --- | --- |
| Analytics | Separate SQL data warehouse | Snowflake core engine |
| File storage | S3, Google Cloud Storage, Azure Blob | Internal storage plus external tables and integrations |
| Processing | Spark clusters or ad hoc Python scripts | Snowpark running in the same Snowflake account |
| Semi-structured & unstructured | NoSQL database or object storage | Native support in Snowflake tables and stages |
| Search & retrieval | Elasticsearch or a separate search service | Cortex search and vector search |
| ML & AI | Separate ML platform and custom pipelines | Snowflake AI Studio and Snowpark ML |

Real-world use cases of handling unstructured data in Snowflake

Here is how this looks in practice. Below is our recent project, plus common patterns we see when teams bring documents, images, logs, and app data into Snowflake and put them to work.

Global finance, AI-ready in 90 days

A multinational finance firm spending more than $800K per month on cloud was battling rising costs and fragmented data. They needed a governed place for documents, logs, and tables. We used OpenFlow to ingest both structured and unstructured data into Snowflake, tracked lineage and policies in Horizon Catalog, set consistent business logic with semantic views, and enabled natural language querying through Cortex AI SQL. The result was about an 80% reduction in ingestion latency, real-time cost visibility with FinOps, and a platform ready for analytics, ML, and AI at scale.

Read how a global finance firm managed unstructured data in Snowflake →

Limitations and considerations of Snowflake

Snowflake’s unstructured data capabilities are strong, but it won’t fully replace your data lake or media platform. For B2B teams planning at scale, keep these practical constraints in mind:

  • Not a pure object storage replacement: Snowflake complements rather than replaces S3/GCS for massive-scale raw object storage
  • File retrieval performance: Binary object retrieval speed varies by file size and stage type
  • Compute costs: AI and ML workloads require careful resource management
  • Specialised use cases: For intensive video/audio editing, use specialised systems.

Best practices for managing unstructured data in Snowflake in 2025

1. Keep big binaries in external object storage, keep brains in Snowflake

Register S3, Blob, or GCS as external stages and reference files via the FILE type; keep only hot assets in internal stages for speed.

2. Standardize file layout and formats from day one

Use predictable paths (org/source/system/YYYY/MM/DD/id) and checksums; prefer compressed columnar formats like Parquet, with extracted text or page JSON beside PDFs and images.

3. Store metadata and embeddings in Snowflake, not in files

Put raw files in stages, but keep metadata, chunks, and embeddings in Snowflake tables linked by stable URIs for fast search and governance. Use directory tables to catalog staged files.
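
A small sketch of the pattern, reusing the illustrative DOCS_INTERNAL stage and assuming a DOC_METADATA table you maintain yourself:

-- Refresh the stage's directory table and browse the file catalog
ALTER STAGE DOCS_INTERNAL REFRESH;

SELECT d.relative_path, d.size, d.last_modified, m.owner, m.pii_flag
FROM DIRECTORY(@DOCS_INTERNAL) d
LEFT JOIN DOC_METADATA m
  ON m.relative_path = d.relative_path;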

4. Orchestrate ingest → extract → enrich → index → serve with Snowpark

Run OCR, NLP, and parsers as Snowpark tasks and UDFs; batch the work, log every run, and make jobs idempotent so reruns are safe. See the implementation flow in processing files with Snowpark.

5. Treat AI as a costed product

Separate warehouses for ELT and AI, strict auto-suspend, resource monitors, caching, and reuse of embeddings and summaries. Get a baseline with the FinOps savings calculator.
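
A minimal sketch of such guardrails (quota values and object names are placeholders):

-- Cap monthly credits for the AI warehouse and suspend it at the limit
CREATE RESOURCE MONITOR AI_MONTHLY_QUOTA
  WITH CREDIT_QUOTA = 200
  FREQUENCY = MONTHLY
  START_TIMESTAMP = IMMEDIATELY
  TRIGGERS ON 80 PERCENT DO NOTIFY
           ON 100 PERCENT DO SUSPEND;

ALTER WAREHOUSE AI_WH SET RESOURCE_MONITOR = AI_MONTHLY_QUOTA;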

6. Govern at the row, column, and file edge

Classify on arrival, enforce row and column policies with masking, and keep least-privilege stage access and full lineage. For role design patterns, see Snowflake role hierarchy best practices.

Need a hand?

Our Snowflake experts at Snowstack can audit your current setup, design a lean reference architecture, and prove value with a focused pilot. Read how we deliver in How we work or talk to a Snowflake expert.

Talk with a Snowflake consultant →

Final thoughts

Snowflake doesn’t just store unstructured data; it makes it usable for search, analytics, and AI. With stages, the FILE data type, VARIANT, Snowpark, and Cortex, you can land documents, images, and logs alongside your tables, extract text and entities, generate embeddings, and govern everything under a single security and policy model. The winning pattern is simple: keep raw binaries in low-cost object storage, centralise metadata and embeddings in Snowflake, and start with one focused, high-value use case you can scale.

Ready to try this in your stack?

Book a 30-minute call with our Snowflake consultant →

FAQs

Can Snowflake store unstructured data (PDFs, images, audio)?

Yes. Snowflake stores and processes unstructured files via stages (internal or external) and a FILE column type. You can access them with SQL and AI features. For setup help, see Snowflake implementation and AI and data governance.

Who can help me implement unstructured data on Snowflake?

Snowstack builds end-to-end pipelines for documents, images, logs, and app data. Start with Snowflake implementation or Contact.

What does a typical Snowstack pilot include?

A focused 4–6 week build: audit, reference architecture, secure stages and directory tables, ingest and extract jobs, embeddings and search, cost guards, and a demo with success metrics. See How we work.

What is Snowflake’s FILE data type?

FILE is a column type that holds a reference to a staged file (plus metadata like MIME type, size, etag, last modified, and URLs). It doesn’t store the binary itself—just a pointer with metadata and helper functions (e.g., FL_GET_SIZE). We design schemas that use FILE in Advisory and architecture.

How do I put PDFs or images “into” Snowflake?

Create a stage, enable a directory table, then map staged files into a FILE column. We set this up during Migrations and integrations and Snowflake implementation.

Should I use internal or external stages?

Use internal stages for simplicity and hot paths. Use external stages when files live in S3, Azure Blob, or GCS. We help you choose in Advisory and architecture.

How do I upload, list, and download files?

Use PUT to upload to internal stages, LIST to enumerate, and GET to download from internal stages. For external stages, upload with your cloud provider tools. At Snowstack, we standardise this in Migrations and integrations.

What are directory tables, and why do they matter?

A directory table catalogs files on a stage so you can query, join to metadata, and build pipelines that react to file changes (with refresh/auto-refresh).

Can Snowflake run AI over documents and images?

Yes. Use built-in services for document extraction, image understanding, and natural language queries. We enable safe usage through AI and data governance.

Does Snowflake support vector search and embeddings?

Yes. Snowflake provides a VECTOR data type, vector similarity functions, and embedding utilities for RAG/search over your files’ text.

What file sizes work best for loading in Snowflake?

Aim for mid-sized files to balance parallelism and overhead; split very large files and compact many tiny ones. Get a sizing plan via Advisory and architecture.

How do I share or serve files securely?

Use scoped URLs (time-limited ~24h) or file URLs (require stage privileges). You can also generate scoped URLs with BUILD_SCOPED_FILE_URL.

How is unstructured data billed in Snowflake?

Internal stage storage is billed by Snowflake; external stage storage is billed by your cloud provider; compute and any egress are separate. Start with the FinOps Savings Calculator and FinOps services.

Can I join unstructured files with tables?

Yes. Use a directory table (file catalog) and join it to tables holding metadata (e.g., owners, tags, PII flags) to power governance and pipelines.

Blog
5 min read

From zero to production: a comprehensive guide to managing Snowflake with Terraform

Manual clicks don’t scale. As Snowflake environments grow, managing them through the UI or ad-hoc scripts quickly leads to drift, blind spots, and compliance risks. What starts as a quick fix often becomes a challenge that slows delivery and exposes the business to security gaps.

Infrastructure as Code with Terraform solves these challenges by bringing software engineering discipline to Snowflake management. Using Terraform’s declarative language, engineers define the desired state of their Snowflake environment, track changes with version control, and apply them consistently across environments. Terraform communicates with Snowflake’s APIs through the official snowflakedb/snowflake provider, translating configuration into the SQL statements and API calls that keep your platform aligned and secure.

This guide provides a complete walkthrough of how to manage Snowflake with Terraform. From provisioning core objects like databases, warehouses, and schemas to building scalable role hierarchies and implementing advanced governance policies such as dynamic data masking.

Section 1: bootstrapping Terraform for secure Snowflake automation

The initial setup of the connection between Terraform and Snowflake is the most critical phase of the entire process. A secure and correctly configured foundation is paramount for reliable and safe automation. This section focuses on establishing this connection using production-oriented best practices, specifically tailored for non-interactive, automated workflows typical of CI/CD pipelines.

1.1 The principle of least privilege: the terraform service role

Terraform should not operate using a personal user account. Instead, a dedicated service user must be created specifically for Terraform automation. Before any Terraform code can be executed, a one-time manual bootstrapping process must be performed within the Snowflake UI or via SnowSQL. This involves using the ACCOUNTADMIN role to create the dedicated service user and a high-level role for Terraform's initial operations.

The following SQL statements will create a TERRAFORM_SVC user and grant it the necessary system-defined roles:

-- Use the highest-level role to create users and grant system roles
USE ROLE ACCOUNTADMIN;

-- Create a dedicated service user for Terraform
-- Replace the placeholder with the public key generated in the next step
CREATE USER TERRAFORM_SVC
  COMMENT = 'Service user for managing Snowflake infrastructure via Terraform.'
  RSA_PUBLIC_KEY = '<YOUR_PUBLIC_KEY_CONTENT_HERE>';

-- Grant the necessary system roles to the Terraform service user
GRANT ROLE SYSADMIN TO USER TERRAFORM_SVC;
GRANT ROLE SECURITYADMIN TO USER TERRAFORM_SVC;

Granting SYSADMIN and SECURITYADMIN to the service user is a necessary starting point for infrastructure management. The SYSADMIN role holds the privileges required to create and manage account-level objects like databases and warehouses. The SECURITYADMIN role is required for managing security principals, including users, roles, and grants.

1.2 Authentication: the key to automation

The choice of authentication method is important. The Snowflake provider supports several authentication mechanisms, including basic password, OAuth, and key-pair authentication. For any automated workflow, especially within a CI/CD context, key-pair authentication is the industry-standard and recommended approach.

A CI/CD pipeline, such as one running in GitHub Actions, is a non-interactive environment. Basic password authentication is a significant security risk and not recommended. This leaves key-pair authentication as the only method that is both highly secure, as it avoids transmitting passwords, and fully automatable.

The following table provides a comparative overview of the primary authentication methods available in the Snowflake provider, reinforcing the recommendation for key-pair authentication in production automation scenarios.

Table 1: Snowflake provider authentication methods

| Method | Primary Use Case | Security Profile | CI/CD Suitability |
| --- | --- | --- | --- |
| Password | Local development, quick tests | Low. Exposes credentials in state or environment variables. | Low. Requires secure secret management; often blocked by MFA. |
| OAuth | User-delegated access for third-party applications | High. Token-based, short-lived credentials. | Medium. Complex to set up for non-interactive server-to-server flows. |
| Key-Pair | Recommended for automation: service accounts, CI/CD pipelines | High. Asymmetric cryptography; no passwords transmitted. | High. Designed for secure, non-interactive authentication. |

To implement key-pair authentication, an RSA key pair must be generated. The following openssl commands will create a 2048-bit private key in the required PKCS#8 format and its corresponding public key:

Bash

# Navigate to a secure directory, such as ~/.ssh
cd ~/.ssh

# Generate an unencrypted 2048-bit RSA private key in PKCS#8 format
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out snowflake_terraform_key.p8 -nocrypt

# Extract the public key from the private key
openssl rsa -in snowflake_terraform_key.p8 -pubout -out snowflake_terraform_key.pub

After generating the keys, the content of the public key file (snowflake_terraform_key.pub), including the -----BEGIN PUBLIC KEY----- and -----END PUBLIC KEY----- headers, must be copied into the CREATE USER statement from the previous step (or applied afterwards with ALTER USER TERRAFORM_SVC SET RSA_PUBLIC_KEY = '...') to associate it with the TERRAFORM_SVC user. For enhanced security, the private key itself can be encrypted with a passphrase. The Snowflake provider supports this by using the private_key_passphrase argument in the provider configuration.

1.3 Provider configuration: connecting Terraform to Snowflake

With the service user created and the key-pair generated, the final step is to configure the Snowflake provider in the Terraform project. This is typically done in a providers.tf file.

The foundational configuration requires defining the snowflakedb/snowflake provider and setting the connection parameters.

terraform {
  required_providers {
    snowflake = {
      source  = "snowflakedb/snowflake"
      version = "~> 1.0" // Best practice: pin to a major version to avoid breaking changes
    }
  }
}

provider "snowflake" {
  organization_name = var.snowflake_org_name
  account_name      = var.snowflake_account_name
  user              = var.snowflake_user         // e.g., "TERRAFORM_SVC"
  role              = "SYSADMIN"                 // Default role for the provider's operations
  authenticator     = "SNOWFLAKE_JWT"
  private_key       = var.snowflake_private_key
}

It is critical that sensitive values, especially the private_key, are never hardcoded in configuration files. The recommended approach is to define them as input variables marked as sensitive = true and supply their values through secure mechanisms like environment variables (e.g., TF_VAR_snowflake_private_key) or integration with a secrets management tool like GitHub Secrets or AWS Secrets Manager.

A common source of initial connection failures is the incorrect identification of the organization_name and account_name. These values can be retrieved with certainty by executing the following SQL queries in the Snowflake UI: SELECT CURRENT_ORGANIZATION_NAME(); and SELECT CURRENT_ACCOUNT_NAME();. Running these two simple commands up front can prevent significant frustration.

For more mature IaC implementations that strictly adhere to the principle of least privilege, Terraform supports the use of aliased providers. This powerful pattern allows for the definition of multiple provider configurations within the same project, each assuming a different role. This mirrors Snowflake's own best practices, where object creation (SYSADMIN) is separated from security management (SECURITYADMIN).

The following example demonstrates how to configure aliased providers:

# Default provider uses SYSADMIN for object creation (e.g., databases, warehouses)
provider "snowflake" {
  alias             = "sysadmin"
  organization_name = var.snowflake_org_name
  account_name      = var.snowflake_account_name
  user              = var.snowflake_user
  private_key       = var.snowflake_private_key
  authenticator     = "SNOWFLAKE_JWT"
  role              = "SYSADMIN"
}

# Aliased provider for security-related objects (e.g., roles, users, grants)
provider "snowflake" {
  alias             = "securityadmin"
  organization_name = var.snowflake_org_name
  account_name      = var.snowflake_account_name
  user              = var.snowflake_user
  private_key       = var.snowflake_private_key
  authenticator     = "SNOWFLAKE_JWT"
  role              = "SECURITYADMIN"
}

When using aliased providers, individual resource blocks must explicitly specify which provider to use via the provider meta-argument (e.g., provider = snowflake.securityadmin). This ensures that each resource is created with the minimum necessary privileges, enforcing a robust security posture directly within the code.

Section 2: provisioning core Snowflake infrastructure

Once the secure connection is bootstrapped, Terraform can be used to define and manage the fundamental building blocks of the Snowflake environment. This section provides code examples for creating databases, virtual warehouses, and schemas - the foundational components for any data workload.

2.1 Laying the foundation: databases

The database is the top-level container for schemas and tables in Snowflake. The snowflake_database resource is used to provision and manage these containers.

The following HCL example creates a primary database for analytics workloads, demonstrating the use of the aliased sysadmin provider and an optional parameter for data retention.

resource "snowflake_database" "analytics_db" {
  provider = snowflake.sysadmin // Explicitly use the sysadmin provider for object creation

  name    = "ANALYTICS"
  comment = "Primary database for analytics workloads managed by Terraform."

  // Optional: Configure Time Travel data retention period.
  // This setting can have cost implications.
  data_retention_time_in_days = 30
}

A core strength of Terraform is its ability to manage dependencies implicitly through resource references. In this example, once the analytics_db resource is defined, other resources, such as schemas, can reference its attributes (e.g., snowflake_database.analytics_db.name).

2.2 Compute power: warehouses

Virtual warehouses are the compute engines in Snowflake, responsible for executing queries and data loading operations. The snowflake_warehouse resource provides comprehensive control over their configuration, enabling a balance between performance and cost.

This example defines a standard virtual warehouse for analytics and business intelligence tools, showcasing parameters for cost optimization and scalability.

resource "snowflake_warehouse" "analytics_wh" {  
provider = snowflake.sysadmin  

name    = "ANALYTICS_WH"  
comment = "Warehouse for the analytics team and BI tools."  

// Define the compute capacity of the warehouse.  
warehouse_size = "X-SMALL"  

// Cost-saving measures: suspend the warehouse when idle.  
auto_suspend = 60 // Suspend after 60 seconds of inactivity.  
auto_resume  = true  

// Optional: Configure for multi-cluster for higher concurrency.  
min_cluster_count = 1  
max_cluster_count = 4  
scaling_policy    = "ECONOMY" // Prioritize conserving credits over starting clusters quickly.
}

The parameters in this resource directly impact both performance and billing. warehouse_size determines the raw compute power and credit consumption per second. auto_suspend is a critical cost-control feature, ensuring that credits are not consumed when the warehouse is idle. For workloads with high concurrency needs, the min_cluster_count, max_cluster_count, and scaling_policy parameters allow the warehouse to dynamically scale out to handle query queues, and then scale back in to conserve resources. Managing these settings via Terraform ensures that cost and performance policies are consistently applied and version-controlled.
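As a brief sketch of how these settings can be standardized across environments (the environment variable and the sizing map below are illustrative assumptions, not part of the examples above):

variable "environment" {
  type    = string
  default = "dev"
}

locals {
  // Assumed sizing policy; tune to your own workload profile.
  warehouse_sizes = {
    dev  = "X-SMALL"
    prod = "MEDIUM"
  }
}

resource "snowflake_warehouse" "elt_wh" {
  provider = snowflake.sysadmin

  name           = "ELT_WH_${upper(var.environment)}"
  comment        = "ELT warehouse sized per environment."
  warehouse_size = local.warehouse_sizes[var.environment]

  auto_suspend = 60
  auto_resume  = true
}

Changing the sizing policy then becomes a reviewed code change rather than an ad-hoc console edit.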

2.3 Organizing your data: schemas

Schemas are logical groupings of database objects like tables and views within a database. The snowflake_schema resource is used to create and manage these organizational units.

The following HCL creates a RAW schema within the ANALYTICS database defined earlier.

resource "snowflake_schema" "raw_data" {  
provider = snowflake.sysadmin  

// Create an explicit dependency on the database resource.  
database = snowflake_database.analytics_db.name  

name    = "RAW"  
comment = "Schema for raw, unprocessed data ingested from source systems."
}

It is important to note that when a new database is created in Snowflake, it automatically includes a default schema named PUBLIC. While this schema is created outside of Terraform's management, administrators should be aware of its existence. For environments that require strict access control, it is common practice to immediately revoke all default privileges from the PUBLIC schema to ensure it is not used inadvertently. Terraform can be used to manage this revocation if desired, but the schema itself will not be in the Terraform state unless explicitly imported.

Section 3: mastering access control with role hierarchies

Effective access control is a cornerstone of data governance and security. Snowflake's Role-Based Access Control (RBAC) model is exceptionally powerful, particularly its support for role hierarchies. Managing this model via Terraform provides an auditable, version-controlled, and scalable approach to permissions management. This section details how to construct a robust RBAC framework using a best-practice model of functional and access roles.

3.1 The building blocks: creating account roles

The foundation of the RBAC model is the creation of roles. A recommended pattern is to create two distinct types of roles:

  • Functional roles: These roles represent a job function or a persona, such as DATA_ANALYST or DATA_ENGINEER. Users are granted these roles.
  • Access roles: These roles represent a specific set of privileges on a specific set of objects, such as SALES_DB_READ_ONLY or RAW_SCHEMA_WRITE. These roles are granted to functional roles, not directly to users.

This separation decouples users from direct permissions, making the system vastly more scalable and easier to manage. The snowflake_account_role resource is used to create both types of roles.

// Define a functional role representing a user persona.
resource "snowflake_account_role" "data_analyst" {
  provider = snowflake.securityadmin // Use the securityadmin provider for role management

  name    = "DATA_ANALYST"
  comment = "Functional role for users performing data analysis and reporting."
}

// Define an access role representing a specific set of privileges.
resource "snowflake_account_role" "analytics_db_read_only" {
  provider = snowflake.securityadmin

  name    = "ANALYTICS_DB_READ_ONLY"
  comment = "Grants read-only access to all objects in the ANALYTICS database."
}

3.2 Constructing the hierarchy: granting roles to roles

The true power of Snowflake's RBAC model is realized by creating hierarchies of roles. By granting access roles to functional roles, a logical and maintainable privilege structure is formed. If a data analyst needs access to a new data source, the corresponding access role is granted to the DATA_ANALYST functional role once, rather than granting privileges to every individual analyst. This pattern is essential for managing permissions at scale.

The snowflake_grant_account_role resource is used to create these parent-child relationships between roles. It is important to use this resource, as the older snowflake_role_grants resource is deprecated.

The following example demonstrates how to grant the ANALYTICS_DB_READ_ONLY access role to the DATA_ANALYST functional role, and then nest the functional role under the system SYSADMIN role to complete the hierarchy.

// Grant the access role to the functional role.
// This gives all members of DATA_ANALYST the privileges of ANALYTICS_DB_READ_ONLY.
resource "snowflake_grant_account_role" "grant_read_access_to_analyst" {
  provider = snowflake.securityadmin

  role_name        = snowflake_account_role.analytics_db_read_only.name
  parent_role_name = snowflake_account_role.data_analyst.name
}

// Grant the functional role to SYSADMIN to create a clear role hierarchy.
// This allows system administrators to manage and assume the functional role.
resource "snowflake_grant_account_role" "grant_analyst_to_sysadmin" {
  provider = snowflake.securityadmin

  role_name        = snowflake_account_role.data_analyst.name
  parent_role_name = "SYSADMIN"
}

3.3 Assigning privileges to access roles

With the role structure in place, the final step is to grant specific object privileges to the access roles. The snowflake_grant_privileges_to_account_role resource is a consolidated and powerful tool for this purpose. This resource has evolved significantly in the Snowflake provider; older versions required separate grant resources for each object type (e.g., snowflake_database_grant), which resulted in verbose and repetitive code. The modern resource uses a more complex but flexible block structure (on_account_object, on_schema, etc.) to assign privileges. Users migrating from older provider versions may find this a significant but worthwhile refactoring effort.

This example grants the necessary USAGE and SELECT privileges to the ANALYTICS_DB_READ_ONLY access role.

// Grant USAGE privilege on the database to the access role.
resource "snowflake_grant_privileges_to_account_role" "grant_db_usage" {
  provider          = snowflake.securityadmin
  account_role_name = snowflake_account_role.analytics_db_read_only.name
  privileges        = ["USAGE"]

  on_account_object {
    object_type = "DATABASE"
    object_name = snowflake_database.analytics_db.name
  }
}

// Grant USAGE privilege on the schema to the access role.
resource "snowflake_grant_privileges_to_account_role" "grant_schema_usage" {
  provider          = snowflake.securityadmin
  account_role_name = snowflake_account_role.analytics_db_read_only.name
  privileges        = ["USAGE"]

  on_schema {
    // Use the fully_qualified_name for schema-level objects.
    schema_name = snowflake_schema.raw_data.fully_qualified_name
  }
}

// Grant SELECT on all existing tables in the schema.
resource "snowflake_grant_privileges_to_account_role" "grant_all_tables_select" {
  provider          = snowflake.securityadmin
  account_role_name = snowflake_account_role.analytics_db_read_only.name
  privileges        = ["SELECT"]

  on_schema_object {
    all {
      object_type_plural = "TABLES"
      in_schema          = snowflake_schema.raw_data.fully_qualified_name
    }
  }
}

// Grant SELECT on all FUTURE tables created in the schema.
resource "snowflake_grant_privileges_to_account_role" "grant_future_tables_select" {
  provider          = snowflake.securityadmin
  account_role_name = snowflake_account_role.analytics_db_read_only.name
  privileges        = ["SELECT"]

  on_schema_object {
    future {
      object_type_plural = "TABLES"
      in_schema          = snowflake_schema.raw_data.fully_qualified_name
    }
  }
}

A particularly powerful feature demonstrated here is the use of the future block. Granting privileges on future objects ensures that the access role will automatically have the specified permissions on any new tables created within that schema. This dramatically reduces operational overhead, as permissions do not need to be manually updated every time a new table is deployed. However, it is important to understand Snowflake's grant precedence: future grants defined at the schema level will always take precedence over those defined at the database level. This can lead to "insufficient privilege" errors if not managed carefully across different roles and grant levels.

3.4 An optional "Audit" role for bypassing data masks

In certain scenarios, such as internal security audits or compliance reviews, it may be necessary for specific, highly trusted users to view data that is normally protected by masking policies. Creating a dedicated "audit" role for this purpose provides a controlled and auditable mechanism to bypass data masking when required.

This role should be considered a highly privileged functional role and granted to users with extreme care.

// Define a special functional role for auditing PII data.
resource "snowflake_account_role" "pii_auditor" {
  provider = snowflake.securityadmin

  name    = "PII_AUDITOR"
  comment = "Functional role for users who need to view unmasked PII for audit purposes."
}

Crucially, creating this role is not enough. For it to be effective, every relevant masking policy must be explicitly updated to include logic that unmasks data for members of the PII_AUDITOR role. This ensures that the ability to view sensitive data is granted on a policy-by-policy basis. An example of how to modify a masking policy to incorporate this audit role is shown in the following section.

Section 4: advanced data governance with dynamic data masking

Moving beyond infrastructure provisioning, Terraform can also codify and enforce sophisticated data governance policies. Snowflake's Dynamic Data Masking is a powerful feature for protecting sensitive data at query time. By managing these policies with Terraform, organizations can ensure that data protection rules are version-controlled, auditable, and consistently applied across all environments.

4.1 Defining the masking logic

A masking policy is a schema-level object containing SQL logic that determines whether a user sees the original data in a column or a masked version. The decision is made dynamically at query time based on the user's context, most commonly their active role.

The snowflake_masking_policy resource is used to define this logic. The policy's body contains a CASE statement that evaluates the user's session context and returns the appropriate value.

The following example creates a policy to mask email addresses for any user who is not in the DATA_ANALYST or PII_AUDITOR role.

resource "snowflake_masking_policy" "email_mask" {  
provider = snowflake.sysadmin // Policy creation often requires SYSADMIN or a dedicated governance role  n

ame     = "EMAIL_MASK"  
database = snowflake_database.analytics_db.name  
schema   = snowflake_schema.raw_data.name    

// Defines the signature of the column the policy can be applied to.  
// The first argument is always the column value to be masked.  
argument {    
name = "email_val"    
type = "VARCHAR"  }    

// The return data type must match the input data type.  
return_type = "VARCHAR" 

// The core masking logic is a SQL expression.  
body = <<-EOF    
CASE      
WHEN IS_ROLE_IN_SESSION('DATA_ANALYST') OR IS_ROLE_IN_SESSION('PII_AUDITOR') THEN email_val      
ELSE '*********'   
END  
EOF  

comment = "Masks email addresses for all roles except DATA_ANALYST and PII_AUDITOR."
}

The SQL expression within the body argument offers immense flexibility. It can use various context functions (like CURRENT_ROLE() or IS_ROLE_IN_SESSION()) and even call User-Defined Functions (UDFs) to implement complex logic. However, this flexibility means the logic itself is not validated by Terraform's syntax checker; it is sent directly to Snowflake for validation during the terraform apply step. It is also a strict requirement that the data type defined in the argument block and the return data type must match the data type of the column to which the policy will eventually be applied.
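For illustration, a partial-masking variant could preserve the email domain while hiding the local part. The following is a sketch that assumes the same VARCHAR signature; the policy name EMAIL_PARTIAL_MASK is hypothetical:

resource "snowflake_masking_policy" "email_partial_mask" {
  provider = snowflake.sysadmin

  name     = "EMAIL_PARTIAL_MASK"
  database = snowflake_database.analytics_db.name
  schema   = snowflake_schema.raw_data.name

  argument {
    name = "email_val"
    type = "VARCHAR"
  }

  return_data_type = "VARCHAR"

  // Keep the domain visible while hiding the local part for non-privileged roles.
  body = <<-EOF
    CASE
      WHEN IS_ROLE_IN_SESSION('DATA_ANALYST') OR IS_ROLE_IN_SESSION('PII_AUDITOR') THEN email_val
      ELSE CONCAT('*****', SUBSTR(email_val, POSITION('@' IN email_val)))
    END
  EOF

  comment = "Partially masks email addresses, preserving the domain."
}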

4.2 Applying the policy to a column

Creating a masking policy is only the first step; it does not protect any data on its own. The policy must be explicitly applied to one or more table columns. This crucial second step is often a point of confusion for new users, who may create a policy and wonder why data is still unmasked. The snowflake_table_column_masking_policy_application resource creates this essential link between the policy and the column.

The following example demonstrates how to apply the EMAIL_MASK policy to the EMAIL column of a CUSTOMERS table.

// For this example, we assume a 'CUSTOMERS' table with an 'EMAIL' column
// already exists in the 'RAW' schema. In a real-world scenario, this table
// might also be managed by Terraform or by a separate data loading process.
// We use a data source to reference this existing table.
data "snowflake_table" "customers" {  
database = snowflake_database.analytics_db.name  
schema   = snowflake_schema.raw_data.name  
name     = "CUSTOMERS"
}

// Apply the masking policy to the specific column.
resource "snowflake_table_column_masking_policy_application" "apply_email_mask" {
  provider = snowflake.sysadmin

  table_name  = "\"${data.snowflake_table.customers.database}\".\"${data.snowflake_table.customers.schema}\".\"${data.snowflake_table.customers.name}\""
  column_name = "EMAIL" // The name of the column to be masked

  masking_policy_name = snowflake_masking_policy.email_mask.fully_qualified_name

  // An explicit depends_on block ensures that Terraform creates the policy
  // before attempting to apply it, preventing race conditions.
  depends_on = [
    snowflake_masking_policy.email_mask
  ]
}

This two-step process - defining the policy logic and then applying it - provides a clear and modular approach to data governance. The same policy can be defined once and applied to many different columns across multiple tables, ensuring that the masking logic is consistent and centrally managed.
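As a brief sketch of that reuse, assuming hypothetical CONTACTS and ORDERS tables with email columns in the same RAW schema, a for_each loop can attach one policy to several columns:

// Hypothetical map of additional columns that should carry the EMAIL_MASK policy.
locals {
  masked_email_columns = {
    contacts = { table = "CONTACTS", column = "CONTACT_EMAIL" }
    orders   = { table = "ORDERS", column = "BUYER_EMAIL" }
  }
}

resource "snowflake_table_column_masking_policy_application" "email_masks" {
  for_each = local.masked_email_columns
  provider = snowflake.sysadmin

  table_name  = "\"${snowflake_database.analytics_db.name}\".\"${snowflake_schema.raw_data.name}\".\"${each.value.table}\""
  column_name = each.value.column

  masking_policy_name = snowflake_masking_policy.email_mask.fully_qualified_name

  depends_on = [snowflake_masking_policy.email_mask]
}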

Conclusion: the path to mature Snowflake IaC

This guide has charted a course from the initial, manual bootstrapping of a secure connection to the automated provisioning and governance of a production-grade Snowflake environment. To ensure the long-term success and scalability of managing Snowflake with Terraform, several key practices should be adopted as standard procedure:

  • Version control: All Terraform configuration files must be stored in a version control system like Git. This provides a complete, auditable history of all infrastructure changes and enables collaborative workflows such as pull requests for peer review before any changes are applied to production.
  • Remote state management: The default behavior of Terraform is to store its state file locally. In any team or automated environment, this is untenable. A remote backend, such as an Amazon S3 bucket with a DynamoDB table for state locking, must be configured. This secures the state file, prevents concurrent modifications from corrupting the state, and allows CI/CD pipelines and team members to work from a consistent view of the infrastructure.
  • Modularity: As the number of managed resources grows, monolithic Terraform configurations become difficult to maintain. Code should be refactored into reusable modules. For instance, a module could be created to provision a new database along with a standard set of access roles and default schemas. This promotes code reuse, reduces duplication, and allows for more organized and scalable management of the environment.
  • Provider versioning: The Snowflake Terraform provider is actively evolving. To prevent unexpected breaking changes from new releases, it is crucial to pin the provider to a specific major version in the terraform block (e.g., version = "~> 1.0"). This allows for intentional, planned upgrades. When upgrading between major versions, it is essential to carefully review the official migration guides, as significant changes, particularly to grant resources, may require a concerted migration effort. A minimal example combining the version pin with a remote state backend follows this list.
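The following sketch ties the remote backend and provider pinning practices together; the bucket, lock table, and region are placeholders, and the snowflakedb/snowflake registry source is an assumption that should be verified against your own configuration:

terraform {
  required_providers {
    snowflake = {
      source  = "snowflakedb/snowflake" // Assumed registry namespace; verify against your setup.
      version = "~> 1.0"                // Pin the major version; upgrade deliberately.
    }
  }

  // Remote state with locking; all names below are placeholders.
  backend "s3" {
    bucket         = "example-terraform-state"
    key            = "snowflake/terraform.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "example-terraform-locks"
    encrypt        = true
  }
}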

With this robust foundation in place, the path is clear for expanding automation to encompass even more of Snowflake's capabilities. The next logical steps include using Terraform to manage snowflake_network_policy for network security, snowflake_row_access_policy for fine-grained data filtering, and snowflake_task for orchestrating SQL workloads. Ultimately, the entire workflow should be integrated into a CI/CD pipeline, enabling a true GitOps model where every change to the Snowflake environment is proposed, reviewed, and deployed through a fully automated and audited process. By embracing this comprehensive approach, organizations can unlock the full potential of their data platform, confident in its security, scalability, and operational excellence.
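To give a sense of those next steps, a network policy could be sketched along these lines; the CIDR range is a placeholder, and the exact attribute set should be checked against your provider version:

// Restrict logins to approved corporate networks.
resource "snowflake_network_policy" "corporate_only" {
  provider = snowflake.securityadmin

  name            = "CORPORATE_ONLY"
  allowed_ip_list = ["203.0.113.0/24"] // Placeholder CIDR; replace with your own ranges.
  comment         = "Managed by Terraform. Limits access to approved networks."
}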

Why Snowstack for Terraform and Snowflake

Automation without expertise can still fail. Terraform gives you the tools, but it takes experience and the right design patterns to turn Snowflake into a secure, cost-efficient, and scalable platform.

Managing Snowflake with Terraform is powerful, but putting it into practice at enterprise scale requires experience, discipline, and the right patterns. That is where Snowstack comes in. As a Snowflake-first consulting partner, we help organizations move beyond trial-and-error scripts to fully automated, production-grade environments. Our engineers design secure architectures, embed Terraform best practices, and ensure governance and cost controls are built in from day one.

👉 Book a strategy call with Snowstack and see how we can take your Snowflake platform from manual operations to enterprise-ready automation.

Want to learn more?
View more insights

Ready to modernize your data strategy?

Book a free consultation with our founder to discuss your current data setup and explore if Snowflake is the right solution for your business.

Talk to an expert