Insights

Explore Snowflake best practices and data insights

Learn from our certified Snowflake experts to transform your data into commercial success.

Governance

5 min read

Best practices for protecting your data: Snowflake role hierarchy

One stolen password can bring down an entire enterprise. As businesses move more of their data to the cloud and centralize it on platforms like Snowflake, a critical question emerges: who should have access, and how do you manage it at scale without slowing the business or weakening security?

One stolen password can bring down an entire enterprise. The 2024 Snowflake breaches revealed how fragile weak access controls are, with 165 organizations and millions of users affected. The breaches were not the result of advanced attacks. They happened because stolen passwords went unchecked, and multi-factor authentication was missing. As businesses move more of their data to the cloud and centralize it on platforms like Snowflake, a critical question emerges: who should have access, and how do you manage it at scale without slowing the business or weakening security?

In this article, we’ll break down the Snowflake Role Hierarchy, explain why it matters, and share best practices for structuring roles that support security, compliance, and day-to-day operations.

What is Snowflake’s role hierarchy?

Snowflake’s role hierarchy is a structured framework that defines how permissions and access controls are organized within the platform. In Snowflake, access to data and operations is governed entirely by roles. Using the Role-Based Access Control (RBAC) model, you grant privileges to roles, and then assign users to those roles, simplifying administration, ensuring consistency, and making audit access easier. RBAC is generally recommended for production environments and enterprise-level governance.

The hierarchy operates on a parent-child relationship model where higher-level roles inherit privileges from subordinate roles, creating a tree-like structure. This structure provides granularity, clarity, and reusability, but it requires thoughtful planning to avoid sprawl or over-permissioned users.

Core components of Snowflake RBAC

Roles: The fundamental building blocks that encapsulate specific privileges
Privileges: Defined levels of access to securable objects (databases, schemas, tables)
Users: Identities that can be assigned roles to access resources
Securable Objects: Entities like databases, tables, views, and warehouses that require access control
Role Inheritance: The mechanism allowing roles to inherit privileges from other roles

Understanding Snowflake's system-defined roles

Understanding the default role structure is crucial for building secure hierarchies:

ACCOUNTADMIN

Root-level access to all account operations
Can view and manage billing and credit data
Should be tightly restricted to emergency use only
Not a "superuser" - still requires explicit privileges for data access

SYSADMIN

Full control over database objects and users
Recommended parent for all custom roles
Manages warehouses, databases, and schemas

SECURITYADMIN

Manages user and role grants
Controls role assignment and privilege distribution
Essential for maintaining RBAC governance

Custom roles

Created for specific teams or functions within an organization (e.g ANALYST_READ_ONLY, ETL_WRITER).

Best practices for designing a secure Snowflake role hierarchy

A well-structured role hierarchy minimizes risk, supports compliance, and makes onboarding/offboarding easier. Here’s how one should do it right:

1. Follow the Principle of Least Privilege

Grant only the minimum required permissions for each role to perform its function. Avoid blanket grants like GRANT ALL ON DATABASE.

Do this:

Specific, targeted grants
Avoid cascading access down the role tree unless absolutely needed
Regularly audit roles to ensure they align with actual usage

GRANT SELECT ON TABLE SALES_DB.REPORTING.MONTHLY_REVENUE TO ROLE ANALYST_READ;
GRANT USAGE ON SCHEMA SALES_DB.REPORTING TO ROLE ANALYST_READ;
GRANT USAGE ON DATABASE SALES_DB TO ROLE ANALYST_READ;

Not this:

Overly broad permissions

GRANT ALL ON DATABASE SALES_DB TO ROLE ANALYST_READ;

Why does it matter?

Least privilege prevents accidental (or malicious) misuse of sensitive data. It also supports data governance and compliance with various regulations like GDPR or HIPAA.

2. Use a layered role design

Design your roles using a layered and modular approach, often structured like this:

Functional Roles (what the user does):

CREATE ROLE ANALYST_READ;
CREATE ROLE ETL_WRITE;
CREATE ROLE DATA_SCIENTIST_ALL;

Environment Roles (where the user operates)

CREATE ROLE DEV_READ_WRITE;
CREATE ROLE PROD_READ_ONLY;

Composite or Team Roles (Group users by department or team, assigning multiple functional/environment roles under one umbrella)

CREATE ROLE MARKETING_TEAM_ROLE → includes PROD_READ_ONLY + ANALYST_READ

3. Avoid granting privileges directly to users

Always assign privileges to roles and not users. Then, assign users to those roles.

Why it matters?

This keeps access transparent and auditable. If a user leaves or changes teams, simply revoke or change the role. There’s no need to hunt down granular permissions.

4. Establish consistent naming conventions

Enforce naming conventions as consistent role and object naming makes automation and governance far easier to scale.

Recommended Naming Pattern:

Access Roles: {ENV}_{DATABASE}_{ACCESS_LEVEL} (e.g., PROD_SALES_READ)
Functional Roles: {FUNCTION}_{TEAM} (e.g., DATA_ANALYST, ETL_ENGINEER)
Service Roles: {SERVICE}_{PURPOSE}_ROLE (e.g., FIVETRAN_LOADER_ROLE)

5. Use separate roles for Administration vs. Operations

Split roles that manage infrastructure (e.g., warehouses, roles, users) from roles that access data.

Admins: SYSADMIN, SECURITYADMIN‍
Data teams: DATA_ENGINEER_ROLE, ANALYST_ROLE, etc.

Why it matters? This separation of duties limits the potential impact of security incidents and supports audit compliance. Administrators should not have access to sensitive data unless it's absolutely necessary for their role.

6. Secure the top-level roles

Roles like ACCOUNTADMIN and SECURITYADMIN should be assigned to the fewest people possible, protected with MFA, and monitored for any usage.

Implementation Checklist:

Limit ACCOUNTADMIN to 2-3 emergency users maximum
Enable MFA for all administrative accounts
Set up monitoring and alerting for admin role usage
Regular access reviews and privilege audits
Document and justify all administrative access

Monitoring, auditing & compliance: keeping your Snowflake hierarchy healthy

Even the best-designed role trees can get messy over time. Here’s how to maintain security:

1. Regular access reviews

Implement quarterly access reviews to maintain security hygiene:

Role Effectiveness Analysis: Identify unused or over-privileged roles
User Access Validation: Verify users have appropriate role assignments
Privilege Scope Review: Ensure roles maintain least privilege principles
Compliance Mapping: Document role mappings to business functions

2. Logging and monitoring

Enable Access History and Login History in Snowflake to track activity and implement automation tools for role assignments during employee transitions.

3. Onboarding/offboarding automation

Implement automation tools or scripts to efficiently manage role assignments during employee transitions.

4. Object Tagging for enhanced security

Use object tagging to classify sensitive data and control access accordingly.

Measuring RBAC Success: Key Performance Indicators

1. Security Metrics

Access Review Coverage: % of roles reviewed quarterly
Privilege Violations: Number of excessive privilege grants identified
Failed Authentication Attempts: Monitor for unauthorized access patterns
Role Utilization Rate: % of active roles vs. total created roles

2. Operational Metrics

User Onboarding Time: Average time to provision new user access
Role Management Efficiency: Time to modify/update role permissions
Audit Response Time: Speed of access review and remediation
Automation Coverage: % of role operations automated vs. manual

3. Compliance Metrics

SOC 2 Readiness: Role hierarchy documentation completeness
GDPR/Data Privacy: Data access control effectiveness
Industry Compliance: Sector-specific requirement adherence
Change Management: Role modification approval and documentation

Future-Proofing Your RBAC Strategy

The way you manage access today will define how secure and scalable your Snowflake environment is tomorrow. The strength of Snowflake’s RBAC model lies in its flexibility, but that power comes with responsibility. As AI features mature, as multi-cloud deployments become the norm, and as regulators tighten expectations around data privacy, static role hierarchies quickly fall behind. A poorly structured role hierarchy can lead to data leaks, audit failures, higher operational costs, and stalled innovation.

At Snowstack, we specialize in building RBAC strategies that are not only secure today but ready for what’s next. Our team of Snowflake-first engineers has designed role models that scale across continents, safeguard sensitive data for regulated industries, and enable AI without exposing critical assets. We continuously monitor Snowflake’s roadmap and fold new security capabilities into your environment before they become business risks.

Don’t wait for the next breach to expose the cracks in your access controls. Let’s design an RBAC strategy that keeps you secure, compliant, and future-ready.

👉 Book a free RBAC assessment

‍

FAQs

RBAC provides scalability and centralized control by granting privileges to roles, which are then assigned to users. UBAC allows privileges to be assigned directly to individual users and is intended for collaborative scenarios like building Streamlit applications.

Follow the "Role of Three" principle: create Access Roles (data-centric), Functional Roles (business-centric), and Service Roles (system-centric). This approach avoids role explosion while maintaining necessary granularity.

Always assign privileges to roles and not users. Then, assign users to those roles. This keeps access transparent and auditable.

Design with hierarchy in mind – role ownership and grant structure should align with your intended control model. Map business functions to role layers and ensure clear inheritance paths.

Create emergency “break-glass” roles with elevated privileges that are heavily monitored and logged, require additional approval workflows, automatically expire after a set period of time, and immediately notify the security team when activated.

Conduct comprehensive access reviews every quarter, perform monthly spot checks on high-privilege administrative roles, service accounts, recently modified permissions, and roles tied to employees who have left the company.

Yes, automation is critical for scaling. Create stored procedures for role provisioning, use CI/CD pipelines for role deployment, and integrate with identity providers for user lifecycle management.

Classify and tag sensitive data, enforce row-level security and column masking, maintain detailed audit logs with supporting access documentation, run regular compliance assessments and gap analyses, and document the business justification for every access role granted.

Cloud

5 min read

Understanding Snowflake: 7 core capabilities that set it apart from legacy databases in 2025

Most enterprise databases were built for monthly reports, not AI products that need fresh, reliable data every hour. This guide breaks down 7 core Snowflake capabilities, explains how they solve the typical Oracle, Teradata, SQL Server and on premises PostgreSQL or MySQL limitations, and shows what they mean for your teams in real projects.

Let's be honest. Your current database was most likely built for monthly reports, not AI products that demand regular updates and reports all the time. This is the reason why, in 2025, really innovative and data-driven businesses continue their migration away from legacy databases like Oracle, Teradata, SQL Server, and on-premises MySQL/PostgreSQL toward modern cloud-native architectures. Snowflake has become the industry leader, powering analytics and AI workloads across finance, retail, technology, and enterprise sectors.

This guide breaks down 7 core Snowflake capabilities and shows how the right Snowflake consulting can turn them into best results for your teams.

‍

What is the legacy database challenge?

Before diving into Snowflake's capabilities, it's crucial to understand the limitations organisations face with traditional databases. Therefore, let’s consider the scenario of a global FMCG company operating in multiple regions, where we helped transform the data infrastructure from legacy on-prem systems to

With our expert Snowflake migration services, the company moved to Snowflake + dbt + Fivetran + Tableau as a modern data stack.

Talk to our Snowflake consultant →

Challenge	Impact
Legacy on-prem SQL servers and siloed BI systems	Slow insights, high maintenance burden
Manual ETL pipelines	Inconsistent data accuracy
High infrastructure and scaling costs	Limits on reporting and forecasting
Slow experimentation for data science	Delays in business decisions

‍

The 7 core Snowflake capabilities in 2025

1. Multi-cluster shared data architecture

The fundamental differentiator: Snowflake's three-layer architecture completely separates storage from compute resources.

Key benefits:

Unlimited concurrency
Auto-scaling virtual warehouses
Near-zero locking and contention
Pay-as-you-use compute

This means analysts, data scientists, and applications can work in parallel on the same datasets without contention.

Business impact:

You no longer have to buy extra storage just to get more compute. You scale up when you need power, scale down when you don’t, and you can see what that means for your bill in minutes with our FinOps savings calculator

2. Cross-cloud & multi-region replication

This Snowflake capability is critical for regulated industries (financial services, healthcare, insurance) and companies with international operations requiring data sovereignty compliance.

Snowflake delivers:

Multi-cloud availability on AWS, Azure, and Google Cloud Platform
Easy cross-region replication and failover
Global application distribution
Built-in disaster recovery without complex configuration

Plan residency, failover, and recovery during **platform architecture,** then implement Snowflake like a pro.

Business impact:

A global FMCG company can maintain synchronised data across North American, European, and Asian markets while meeting local data residency requirements. This is difficult to achieve with legacy on-premises databases.

3. Zero-copy cloning & time travel

Snowflake's innovative approach to data management enables instant environment creation with zero additional storage costs.

Game-changing features:

Clone terabyte-scale databases in seconds without duplicating data
Time Travel for historical queries and point-in-time recovery
Safe dev/test environment provisioning without impacting production

Development teams can spin up complete production-like environments instantly for testing, while legacy databases require duplicated environments that consume massive storage and take hours or days to provision.

Business impact:

Data engineers can test complex transformations on production-scale data without risk, dramatically accelerating development cycles and improving data reliability.

4. Built-in governance & RBAC security

In 2025, data governance and security are business-critical requirements for compliance and risk management.

Snowflake's security framework includes:

Fine-grained access control with row-level and column-level masking
Data lineage and classification for understanding data provenance
Policy-based access control with external tokenisation partner support
Automatic encryption at rest and in transit
Dynamic data masking to protect sensitive information
Audit logging and monitoring for compliance reporting

These are essential for organisations operating under SOC 2, HIPAA, GDPR, PCI DSS.

5. Native AI & Python ecosystem

Snowflake has built-in support for Python and machine learning, so your team can build and run models where the data already lives instead of exporting them elsewhere. With solid AI and data governance in place, it becomes easier to try new ideas safely and move them into production. The key building blocks are:

Feature	Value
Snowpark for Python	Run Python directly in Snowflake
Native ML inference	Zero data movement
UDFs / Stored Procedures	Custom logic at scale
ML ecosystem partners	Dataiku, H2O.ai, SAS integration

‍

Business impact:

This means that teams can train, deploy & serve ML models securely inside Snowflake. Data scientists spend less time on data engineering and infrastructure management and more time building models that drive business value.

6. Marketplace & data sharing economy

The Snowflake Marketplace reshapes how enterprises access 3rd-party data (functioning as the "App Store for data"). We are looking at:

Thousands of data providers covering financial data, geospatial information, retail insights, weather patterns, ESG metrics, and logistics intelligence
Live data feeds without pipelines (No ETL required)
Private data exchange across subsidiaries, partners, and customers

Business impact:

You can now achieve faster analytics, better forecasting, and smarter decisions by instantly accessing external data sources that would traditionally require weeks of negotiation, integration work, and ongoing pipeline maintenance.

7. Extensibility: unistore & native apps

Snowflake is no longer just a data warehouse. In 2025, it can also handle simple day-to-day transactions and apps that run directly on your data.

Next-generation capabilities:

Unistore for OLTP-lite workloads, enabling hybrid transactional/analytical processing
Snowflake Native Apps for custom application development
Streamlit integration for building interactive data applications
Real-time data pipelines via Kafka connectors and Snowpipe Streaming

Business impact:

Snowflake serves hybrid workloads that legacy databases struggle to handle without significant operational complexity. Organisations consolidate their data infrastructure rather than maintaining separate systems for transactional and analytical workloads.

Real-world example: Snowflake consulting & migration results

Here’s what the shift looks like in practice. In a recent Snowflake project with a global FMCG company, we rebuilt the analytics backbone by establishing a governed core data model, automating ingestion and orchestration with native services and partner connectors, and reconnecting BI directly to a single, auditable source of truth. As seen in the table below, the result was a step-change in reliability and speed.

Documented results from migration to Snowflake:

Before Snowflake	After Snowflake
Overnight BI refreshes	Same-day analytics refreshes
High ETL maintenance	80% automation via Pipes & Streams or Snowflake partner integrations like Fivetran
Siloed regional reporting	Centralized data lakehouse
Manual Excel forecasting	Automated ML-powered forecasting
Slow KPI access for business	Real-time dashboards in Tableau

‍

Beyond the database

Snowflake’s strengths include a unique design, flexible scaling, strong access and security controls, built-in AI features, and safe sharing across regions, which make it more than a database. It is a modern cloud data platform that powers predictive analytics, self-service reporting so product teams can trust the data and use it with ease. In business, the faster you get answers, the stronger your advantage, and Snowflake is setting the standard for company data platforms.

If you are choosing a data platform in 2025, plan for what you will need next year as well as today. Snowflake’s design is built for an AI-ready cloud-based future. We help you make that future real by setting up Snowflake, connecting your data, putting clear access rules in place, and keeping costs under control with a simple 90-day plan that we build with your team.

Ready to turn Snowflake into results?

Book a 30 minute call with our Snowflake consultant →

‍

FAQs

They decide how fast your teams can work, how often they’re blocked, and how much you pay every month. Features like multi-cluster compute, Time Travel, zero-copy cloning, governance, AI support, and Marketplace only help if they’re wired into a clear plan. That’s what our advisory and architecture and Snowflake implementation projects are designed to do.

Yes. You can replicate data across regions and even across clouds (AWS, Azure, GCP) for disaster recovery, latency, and compliance needs. The important part is to plan this up front: which regions you need, what your RPO/RTO targets are, and how you will test failover. We design this as part of Advisory and architecture.

Yes. With Snowpark, Cortex, and support for unstructured data, you can build AI use cases (scoring, recommendations, search) directly on Snowflake. Vector search lets you work with embeddings for things like document or product search without moving data into a separate stack. We help you do this safely under one set of rules via AI and data governance.

The Snowflake Marketplace is a catalog of live third-party data and apps that you can plug straight into your account without building heavy ETL pipelines. It’s useful when you need external data such as demographics, weather, payments, ESG, or location data to enrich your own. We help you pick the right data products and wire them into your models and dashboards through Migrations and integrations.

Unistore and Hybrid Tables let Snowflake handle simple transactional or row-based workloads (for example, orders, events, or app states) close to your analytics. They matter when you want to keep both “what just happened” and “what does it mean” on the same platform, instead of running a separate operational database. We include them where it makes sense in Snowflake implementation projects.

Yes. Snowflake can read and write Apache Iceberg tables in external storage, which is helpful if you are building or keeping an open data lake or a hybrid “lakehouse” setup. That way you don’t have to lock everything into a single format or vendor. We usually design this as part of Migrations and integrations.

Cloud

5 min read

Can Snowflake store unstructured data? How Snowflake handles documents, images, and other data in 2025

Snowflake isn’t just rows and columns anymore. In 2025 you can land PDFs, images, logs, and app data next to your tables, then query, enrich, and search them with SQL, Snowpark, and Cortex AI.

What if your PDFs, transcripts, and logs could live in the same place as your BI dashboards? For years, Snowflake was known primarily as a cloud native data warehouse built for structured analytics. It was the go-to solution for SQL analysts, BI teams, and data engineers working with neat rows and columns. Meanwhile, many teams dealing with documents, images, logs, and raw application data assumed they needed separate storage such as Amazon S3, Google Cloud Storage, Azure Blob, or NoSQL databases.

In 2025, that separation no longer has to exist. Snowflake is now a multimodal data platform that can store, process and query unstructured data.

So yes, Snowflake can store unstructured data, but more importantly, it can use it. This capability offers significant architectural advantages for modern data teams. In this blog post, we’ll break down exactly how and why it matters.

What is unstructured data?

Unstructured data refers to any information that doesn't fit neatly into traditional rows and columns. This includes:

Documents: PDF, DOCX, TXT files
Images: PNG, JPG, TIFF formats
Audio and video files: Media content and recordings
Logs and event data: Application and system logs
Communication data: Email threads and chat transcripts
Markup and structured text: HTML, XML, JSON blobs
Binary files: Application-specific file formats

As organisations increasingly generate massive volumes of this data, the need for unified platforms that can both store and analyse unstructured content has become critical.

How Snowflake stores unstructured data?

Snowflake stages for unstructured data

Snowflake manages unstructured data through stages. This means through storage locations that reference files either within Snowflake's managed infrastructure or in external cloud storage:

Internal Stages: Files are stored within Snowflake's managed storage, offering quick setup and seamless integration
External Stages: Files remain in external cloud locations (Amazon S3, Azure Blob Storage, Google Cloud Storage), with Snowflake accessing them via metadata references

You can also combine both approaches for optimal performance and scalability based on your specific requirements.

The FILE data type in Snowflake for unstructured files and metadata

Snowflake provides a dedicated FILE data type for unstructured data. A FILE value represents a reference to a file stored in an internal or external stage, without storing the actual file content in the table itself. This approach allows:

Efficient storage and cost management
Fast metadata querying
Seamless integration with processing pipelines

Accessing unstructured files in Snowflake

Snowflake provides familiar commands for file management:

PUT: Upload files to stages
GET: Download files from stages
LIST: View files stored in stages

These operations mirror cloud storage interactions while maintaining Snowflake's security and governance standards.

Processing and querying unstructured data in Snowflake

Storage is just the beginning. Snowflake's real power lies in its ability to process and extract insights from unstructured data.

Snowflake Cortex AI and Document AI for PDFs, images and hybrid search

Cortex AI enables advanced analytics on unstructured data directly within Snowflake:

Document analysis: Extract text, summarise content, and perform batch LLM inference on PDFs and documents
Image processing: Run classification and analysis on stored images
Multimodal SQL functions: Query and transform documents, images, and audio using SQL-powered pipelines
Schema-aware extraction: Automatically extract structured tables from unstructured documents like invoices and reports

Snowpark for custom processing

With Snowpark, you can:

Extract text from PDFs using Python
Perform image classification with embedded ML models
Parse JSON or log files into VARIANT columns
Run OCR, NLP, and generate embeddings via external functions
Build semantic search capabilities over document collections

VARIANT data type for semi-structured data

The VARIANT data type handles semi-structured data formats like JSON, XML, Parquet, and Avro:

Store complex, nested data structures
Query JSON fields directly using SQL
Maintain schema flexibility while preserving query performance

Why unified data architecture matters?

In most companies, data still lives in many places and tools. Dashboards sit on a legacy SQL warehouse, logs go to a separate observability stack, and documents and images disappear into unmanaged cloud buckets or shared drives.

Instead of stitching together a dozen point solutions, you can use Snowflake as the backbone of your data architecture and keep external systems only where they add unique value. The table below shows how data stack functions shift when you standardise on Snowflake in 2025:

Function	Old architecture	Snowflake in 2025
Analytics	Separate SQL data warehouse	Snowflake core engine
File storage	S3, Google Cloud Storage, Azure Blob	Internal storage plus external tables and integrations
Processing	Spark clusters or ad hoc Python scripts	Snowpark running in the same Snowflake account
Semi-structured & unstructured	NoSQL database or object storage	Native support in Snowflake tables and stages
Search & retrieval	Elasticsearch or a separate search service	Cortex search and vector search
ML & AI	Separate ML platform and custom pipelines	Snowflake AI Studio and Snowpark ML

‍

Real-world use cases of handling unstructured data in Snowflake

Here is how this looks in practice. Below is our recent project, plus common patterns we see when teams bring documents, images, logs, and app data into Snowflake and put them to work.

Global finance, AI-ready in 90 days

A multinational finance firm spending more than 800K per month on cloud was battling rising costs and fragmented data. They needed a governed place for documents, logs, and tables. We used OpenFlow to ingest both structured and unstructured data into Snowflake, tracked lineage and policies in Horizon Catalog, set consistent business logic with semantic views, and enabled natural language querying through Cortex AI SQL. The result was about an 80% reduction in ingestion latency, real-time cost visibility with FinOps, and a platform ready for analytics, ML, and AI at scale.

Read how a global finance managed unstructured data in Snowflake →

Limitations and considerations of Snowflake

Snowflake’s unstructured data capabilities are strong, but it won’t fully replace your data lake or media platform. For B2B teams planning at scale, keep these practical constraints in mind:

Not a pure object storage replacement: Snowflake complements rather than replaces S3/GCS for massive-scale raw object storage
File retrieval performance: Binary object retrieval speed varies by file size and stage type
Compute costs: AI and ML workloads require careful resource management
Specialised use cases: For intensive video/audio editing, use specialised systems.

Best practices for managing unstructured data in Snowflake in 2025

1. Keep big binaries in external object storage, keep brains in Snowflake

Register S3, Blob, or GCS as external stages and reference files via the FILE type; keep only hot assets in internal stages for speed.

2. Standardize file layout and formats from day one

Use predictable paths (org/source/system/YYYY/MM/DD/id) and checksums; prefer compressed columnar formats like Parquet, with extracted text or page JSON beside PDFs and images.

3. Store metadata and embeddings in Snowflake, not in files

Put raw files in stages, but keep metadata, chunks, and embeddings in Snowflake tables linked by stable URIs for fast search and governance. Use directory tables to catalog staged files.

4. Orchestrate ingest → extract → enrich → index → serve with Snowpark

Run OCR, NLP, and parsers as Snowpark tasks and UDFs; batch, log runs, and make jobs idempotent so reruns are safe. Implementation flow in processing files with Snowpark.

5. Treat AI as a costed product

Separate warehouses for ELT and AI, strict auto-suspend, resource monitors, caching, and reuse of embeddings and summaries. Get a baseline with the FinOps savings calculator.

6. Govern at the row, column, and file edge

Classify on arrival, enforce row and column policies with masking, and keep least-privilege stage access and full lineage. For role design patterns, see Snowflake role hierarchy best practices.

Need a hand?

Our snowflake experts at Snowstack can audit your current setup, design a lean reference architecture, and prove value with a focused pilot. Read how we deliver in How we work or talk to a Snowflake expert.

Talk with a Snowflake consultant→

Final thoughts

Snowflake doesn’t just store unstructured data; it makes it usable for search, analytics, and AI. With stages, the FILE data type, VARIANT, Snowpark, and Cortex, you can land documents, images, and logs alongside your tables, extract text and entities, generate embeddings, and govern everything under a single security and policy model. The winning pattern is simple: keep raw binaries in low-cost object storage, centralise metadata and embeddings in Snowflake, and start with one focused, high-value use case you can scale.

Ready to try this in your stack?

Book a 30-minute call with our Snowflake consultant →

‍

FAQs

Can Snowflake store unstructured data (PDFs, images, audio)?

Yes. Snowflake stores and processes unstructured files via stages (internal or external) and a FILE column type. You can access them with SQL and AI features. For setup help, see Snowflake implementation and AI and data governance.

Who can help me implement unstructured data on Snowflake?

Snowstack builds end-to-end pipelines for documents, images, logs, and app data. Start with Snowflake implementation or Contact.

What does a typical Snowstack pilot include?

A focused 4–6 week build: audit, reference architecture, secure stages and directory tables, ingest and extract jobs, embeddings and search, cost guards, and a demo with success metrics. See How we work.

What is Snowflake’s FILE data type?

FILE is a column type that holds a reference to a staged file (plus metadata like MIME type, size, etag, last modified, and URLs). It doesn’t store the binary itself—just a pointer with metadata and helper functions (e.g., FL_GET_SIZE). We design schemas that use FILE in Advisory and architecture.

How do I put PDFs or images “into” Snowflake?

Create a stage, enable a directory table, then map staged files into a FILE column. We set this up during Migrations and integrations and Snowflake implementation.

Should I use internal or external stages?

Use internal stages for simplicity and hot paths. Use external stages when files live in S3, Azure Blob, or GCS. We help you choose in Advisory and architecture.

How do I upload, list, and download files?

Use PUT to upload to internal stages, LIST to enumerate, and GET to download from internal stages. For external stages, upload with your cloud provider tools. At Snowstack, we standardise this in Migrations and integrations.

What are directory tables, and why do they matter?

A directory table catalogs files on a stage so you can query, join to metadata, and build pipelines that react to file changes (with refresh/auto-refresh).

Can Snowflake run AI over documents and images?

Yes. Use built-in services for document extraction, image understanding, and natural language queries. We enable safe usage through AI and data governance.

Does Snowflake support vector search and embeddings?

Yes. Snowflake provides a VECTOR data type, vector similarity functions, and embedding utilities for RAG/search over your files’ text.

What file sizes work best for loading in Snowflake?

Aim for mid-sized files to balance parallelism and overhead; split very large files and compact many tiny ones. Get a sizing plan via Advisory and architecture.

How do I share or serve files securely?

Use scoped URLs (time-limited ~24h) or file URLs (require stage privileges). You can also generate scoped URLs with BUILD_SCOPED_FILE_URL.

How is unstructured data billed in Snowflake?

Internal stage storage is billed by Snowflake; external stage storage is billed by your cloud provider; compute and any egress are separate. Start with the FinOps Savings Calculator and FinOps services.

Can I join unstructured files with tables?

Yes. Use a directory table (file catalog) and join it to tables holding metadata (e.g., owners, tags, PII flags) to power governance and pipelines.

Governance

5 min read

From zero to production: a comprehensive guide to managing Snowflake with Terraform

Manual clicks don’t scale. As Snowflake environments grow, managing them through the UI or ad-hoc scripts quickly leads to drift, blind spots, and compliance risks. What starts as a quick fix often becomes a challenge that slows delivery and exposes the business to security gaps.

Infrastructure as Code with Terraform solves these challenges by bringing software engineering discipline to Snowflake management. Using Terraform’s declarative language, engineers define the desired state of their Snowflake environment, track changes with version control, and apply them consistently across environments. Terraform communicates with Snowflake’s APIs through the official snowflakedb/snowflake provider, translating configuration into the SQL statements and API calls that keep your platform aligned and secure.

This guide provides a complete walkthrough of how to manage Snowflake with Terraform. From provisioning core objects like databases, warehouses, and schemas to building scalable role hierarchies and implementing advanced governance policies such as dynamic data masking.

Section 1: bootstrapping Terraform for secure Snowflake automation

The initial setup of the connection between Terraform and Snowflake is the most critical phase of the entire process. A secure and correctly configured foundation is paramount for reliable and safe automation. This section focuses on establishing this connection using production-oriented best practices, specifically tailored for non-interactive, automated workflows typical of CI/CD pipelines.

1.1 The principle of least privilege: the terraform service role

Terraform should not operate using a personal user account. Instead, a dedicated service user must be created specifically for Terraform automation. Before any Terraform code can be executed, a one-time manual bootstrapping process must be performed within the Snowflake UI or via SnowSQL. This involves using the ACCOUNTADMIN role to create the dedicated service user and a high-level role for Terraform's initial operations.

The following SQL statements will create a TERRAFORM_SVC user and grant it the necessary system-defined roles:

-- Use the highest-level role to create users and grant system roles
USE ROLE ACCOUNTADMIN;

-- Create a dedicated service user for Terraform
-- The RSA_PUBLIC_KEY will be set in the next step
CREATE USER TERRAFORM_SVC    
COMMENT = 'Service user for managing Snowflake infrastructure via Terraform.'    
RSA_PUBLIC_KEY = '<YOUR_PUBLIC_KEY_CONTENT_HERE>';

-- Grant the necessary system roles to the Terraform service user
GRANT ROLE SYSADMIN TO USER TERRAFORM_SVC;
GRANT ROLE SECURITYADMIN TO USER TERRAFORM_SVC;

‍

Granting SYSADMIN and SECURITYADMIN to the service user is a necessary starting point for the infrastructure management. The SYSADMIN role holds the privileges required to create and manage account-level objects like databases and warehouses. The SECURITYADMIN role is required for managing security principals, including users, roles, and grants.

1.2 Authentication: the key to automation

The choice of authentication method is important. The Snowflake provider supports several authentication mechanisms, including basic password, OAuth, and key-pair authentication. For any automated workflow, especially within a CI/CD context, key-pair authentication is the industry-standard and recommended approach.

A CI/CD pipeline, such as one running in GitHub Actions, is a non-interactive environment. Basic password authentication is a significant security risk and not recommended. This leaves key-pair authentication as the only method that is both highly secure, as it avoids transmitting passwords, and fully automatable.

The following table provides a comparative overview of the primary authentication methods available in the Snowflake provider, reinforcing the recommendation for key-pair authentication in production automation scenarios.

Table 1: Snowflake provider authentication methods

Method	Primary Use Case	Security Profile	CI/CD Suitability
Password	Local development, quick tests	Low. Exposes credentials in state or environment variables.	Low. Requires secure secret management; often blocked by MFA.
OAuth	User-delegated access for third-party applications	High. Token-based, short-lived credentials.	Medium. Complex to set up for non-interactive server-to-server flows.
Key-Pair	Recommended for Automation. Service accounts, CI/CD pipelines.	High. Asymmetric cryptography; no passwords transmitted.	High. Designed for secure, non-interactive authentication.

‍

To implement key-pair authentication, an RSA key pair must be generated. The following openssl commands will create a 2048-bit private key in the required PKCS#8 format and its corresponding public key:

Bash

# Navigate to a secure directory, such as ~/.ssh
cd ~/.ssh

# Generate an unencrypted 2048-bit RSA private key in PKCS#8 format
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out snowflake_terraform_key.p8 -nocrypt

# Extract the public key from the private key
openssl rsa -in snowflake_terraform_key.p8 -pubout -out snowflake_terraform_key.pub

‍

After generating the keys, the content of the public key file (snowflake_terraform_key.pub), including the -----BEGIN PUBLIC KEY----- and -----END PUBLIC KEY----- headers, must be copied and pasted into the ALTER USER statement from the previous step to associate it with the TERRAFORM_SVC user. For enhanced security, the private key itself can be encrypted with a passphrase. The Snowflake provider supports this by using the private_key_passphrase argument in the provider configuration.

1.3 Provider configuration: connecting Terraform to Snowflake

With the service user created and the key-pair generated, the final step is to configure the Snowflake provider in the Terraform project. This is typically done in a providers.tf file.

The foundational configuration requires defining the snowflakedb/snowflake provider and setting the connection parameters.

terraform {  
required_providers {    
snowflake = {      
source  = "snowflakedb/snowflake"      
version = ">= 1.0.0" // Best practice: pin to a major version to avoid breaking changes    
    }  
  }
}

provider "snowflake" {  
organization_name = var.snowflake_org_name  
account_name      = var.snowflake_account_name  
user              = var.snowflake_user         // e.g., "TERRAFORM_SVC"  
role              = "SYSADMIN"                 // Default role for the provider's operations  
authenticator     = "SNOWFLAKE_JWT"  
private_key       = var.snowflake_private_key
}

‍

It is critical that sensitive values, especially the private_key, are never hardcoded in configuration files. The recommended approach is to define them as input variables marked as sensitive = true and supply their values through secure mechanisms like environment variables (e.g., TF_VAR_snowflake_private_key) or integration with a secrets management tool like GitHub Secrets or AWS Secrets Manager.

A common source of initial connection failures is the incorrect identification of the organization_name and account_name. These values can be retrieved with certainty by executing the following SQL queries in the Snowflake UI: SELECT CURRENT_ORGANIZATION_NAME(); and SELECT CURRENT_ACCOUNT_NAME();. Providing these simple but effective commands can prevent significant user frustration.

For more mature IaC implementations that strictly adhere to the principle of least privilege, Terraform supports the use of aliased providers. This powerful pattern allows for the definition of multiple provider configurations within the same project, each assuming a different role. This mirrors Snowflake's own best practices, where object creation (SYSADMIN) is separated from security management (SECURITYADMIN).

The following example demonstrates how to configure aliased providers:

# Default provider uses SYSADMIN for object creation (e.g., databases, warehouses)
provider "snowflake" {  
alias             = "sysadmin"  
organization_name = var.snowflake_org_name  
account_name      = var.snowflake_account_name  
user              = var.snowflake_user  
private_key       = var.snowflake_private_key  
authenticator     = "SNOWFLAKE_JWT"  
role              = "SYSADMIN"
}

# Aliased provider for security-related objects (e.g., roles, users, grants)
provider "snowflake" {  
alias             = "securityadmin"  
organization_name = var.snowflake_org_name  
account_name      = var.snowflake_account_name  
user              = var.snowflake_user  
private_key       = var.snowflake_private_key  
authenticator     = "SNOWFLAKE_JWT"  
role              = "SECURITYADMIN"
}

‍

When using aliased providers, individual resource blocks must explicitly specify which provider to use via the provider meta-argument (e.g., provider = snowflake.securityadmin). This ensures that each resource is created with the minimum necessary privileges, enforcing a robust security posture directly within the code.

Section 2: provisioning core Snowflake infrastructure

Once the secure connection is bootstrapped, Terraform can be used to define and manage the fundamental building blocks of the Snowflake environment. This section provides code examples for creating databases, virtual warehouses, and schemas - the foundational components for any data workload.

2.1 Laying the foundation: databases

The database is the top-level container for schemas and tables in Snowflake. The snowflake_database resource is used to provision and manage these containers.

The following HCL example creates a primary database for analytics workloads, demonstrating the use of the aliased sysadmin provider and an optional parameter for data retention.

‍resource "snowflake_database" "analytics_db" {  
provider = snowflake.sysadmin // Explicitly use the sysadmin provider for object creation  

name    = "ANALYTICS"  
comment = "Primary database for analytics workloads managed by Terraform."  

// Optional: Configure Time Travel data retention period.  
// This setting can have cost implications.  
data_retention_time_in_days = 30
}

‍

A core strength of Terraform is its ability to manage dependencies implicitly through resource references. In this example, once the analytics_db resource is defined, other resources, such as schemas, can reference its attributes (e.g., snowflake_database.analytics_db.name).

2.2 Compute power: warehouses

Virtual warehouses are the compute engines in Snowflake, responsible for executing queries and data loading operations. The snowflake_warehouse resource provides comprehensive control over their configuration, enabling a balance between performance and cost.

This example defines a standard virtual warehouse for analytics and business intelligence tools, showcasing parameters for cost optimization and scalability.

resource "snowflake_warehouse" "analytics_wh" {  
provider = snowflake.sysadmin  

name    = "ANALYTICS_WH"  
comment = "Warehouse for the analytics team and BI tools."  

// Define the compute capacity of the warehouse.  
warehouse_size = "X-SMALL"  

// Cost-saving measures: suspend the warehouse when idle.  
auto_suspend = 60 // Suspend after 60 seconds of inactivity.  
auto_resume  = true  

// Optional: Configure for multi-cluster for higher concurrency.  
min_cluster_count = 1  
max_cluster_count = 4  
scaling_policy    = "ECONOMY" // Prioritize conserving credits over starting clusters quickly.
}

‍

The parameters in this resource directly impact both performance and billing. warehouse_size determines the raw compute power and credit consumption per second. auto_suspend is a critical cost-control feature, ensuring that credits are not consumed when the warehouse is idle. For workloads with high concurrency needs, the min_cluster_count, max_cluster_count, and scaling_policy parameters allow the warehouse to dynamically scale out to handle query queues, and then scale back in to conserve resources. Managing these settings via Terraform ensures that cost and performance policies are consistently applied and version-controlled.

2.3 Organizing your data: schemas

Schemas are logical groupings of database objects like tables and views within a database. The snowflake_schema resource is used to create and manage these organizational units.

The following HCL creates a RAW schema within the ANALYTICS database defined earlier.

resource "snowflake_schema" "raw_data" {  
provider = snowflake.sysadmin  

// Create an explicit dependency on the database resource.  
database = snowflake_database.analytics_db.name  

name    = "RAW"  
comment = "Schema for raw, unprocessed data ingested from source systems."
}

‍

It is important to note that when a new database is created in Snowflake, it automatically includes a default schema named PUBLIC. While this schema is created outside of Terraform's management, administrators should be aware of its existence. For environments that require strict access control, it is a common practice to immediately revoke all default privileges from the

PUBLIC schema to ensure it is not used inadvertently. Terraform can be used to manage this revocation if desired, but the schema itself will not be in the Terraform state unless explicitly imported.

Section 3: mastering access control with role hierarchies

Effective access control is a cornerstone of data governance and security. Snowflake's Role-Based Access Control (RBAC) model is exceptionally powerful, particularly its support for role hierarchies. Managing this model via Terraform provides an auditable, version-controlled, and scalable approach to permissions management. This section details how to construct a robust RBAC framework using a best-practice model of functional and access roles.

3.1 The building blocks: creating account roles

The foundation of the RBAC model is the creation of roles. A recommended pattern is to create two distinct types of roles:

Functional roles: These roles represent a job function or a persona, such as DATA_ANALYST or DATA_ENGINEER. Users are granted these roles.
Access roles: These roles represent a specific set of privileges on a specific set of objects, such as SALES_DB_READ_ONLY or RAW_SCHEMA_WRITE. These roles are granted to functional roles, not directly to users.

This separation decouples users from direct permissions, making the system vastly more scalable and easier to manage. The snowflake_account_role resource is used to create both types of roles.

// Define a functional role representing a user persona.
resource "snowflake_account_role" "data_analyst" {  
provider = snowflake.securityadmin // Use the securityadmin provider for role management 

name    = "DATA_ANALYST"  
comment = "Functional role for users performing data analysis and reporting."
}

// Define an access role representing a specific set of privileges.
resource "snowflake_account_role" "analytics_db_read_only" {  
provider = snowflake.securityadmin  

name    = "ANALYTICS_DB_READ_ONLY"  
comment = "Grants read-only access to all objects in the ANALYTICS database."
}

‍

3.2 Constructing the hierarchy: granting roles to roles

The true power of Snowflake's RBAC model is realized by creating hierarchies of roles. By granting access roles to functional roles, a logical and maintainable privilege structure is formed. If a data analyst needs access to a new data source, the corresponding access role is granted to the DATA_ANALYST functional role once, rather than granting privileges to every individual analyst. This pattern is essential for managing permissions at scale.

The snowflake_grant_account_role resource is used to create these parent-child relationships between roles. It is important to use this resource, as the older snowflake_role_grants resource is deprecated.

The following example demonstrates how to grant the ANALYTICS_DB_READ_ONLY access role to the DATA_ANALYST functional role, and then nest the functional role under the system SYSADMIN role to complete the hierarchy.

// Grant the access role to the functional role.
// This gives all members of DATA_ANALYST the privileges of ANALYTICS_DB_READ_ONLY.
resource "snowflake_grant_account_role" "grant_read_access_to_analyst" {  
provider = snowflake.securityadmin  

role_name        = snowflake_account_role.analytics_db_read_only.name  
parent_role_name = snowflake_account_role.data_analyst.name
}

// Grant the functional role to SYSADMIN to create a clear role hierarchy.
// This allows system administrators to manage and assume the functional role.
resource "snowflake_grant_account_role" "grant_analyst_to_sysadmin" {  
provider = snowflake.securityadmin  
role_name        = snowflake_account_role.data_analyst.name  
parent_role_name = "SYSADMIN"
}

‍

3.3 Assigning privileges to access roles

With the role structure in place, the final step is to grant specific object privileges to the access roles. The snowflake_grant_privileges_to_account_role resource is a consolidated and powerful tool for this purpose. This resource has evolved significantly in the Snowflake provider; older versions required separate grant resources for each object type (e.g., snowflake_database_grant), which resulted in verbose and repetitive code. The modern resource uses a more complex but flexible block structure (on_account_object, on_schema, etc.) to assign privileges. Users migrating from older provider versions may find this a significant but worthwhile refactoring effort.

This example grants the necessary USAGE and SELECT privileges to the ANALYTICS_DB_READ_ONLY access role.

// Grant USAGE privilege on the database to the access role.
resource "snowflake_grant_privileges_to_account_role" "grant_db_usage" {  
provider          = snowflake.securityadmin  
account_role_name = snowflake_account_role.analytics_db_read_only.name  
privileges        =    on_account_object {    

object_type = "DATABASE"    
object_name = snowflake_database.analytics_db.name  
  }
 }
 
 // Grant USAGE privilege on the schema to the access role.
 resource "snowflake_grant_privileges_to_account_role" "grant_schema_usage" {  
 provider          = snowflake.securityadmin  
 account_role_name = snowflake_account_role.analytics_db_read_only.name  
 privileges        =  
 
 on_schema {    
 // Use the fully_qualified_name for schema-level objects.    
 schema_name = snowflake_schema.raw_data.fully_qualified_name  
  }
 }
 
 // Grant SELECT on all existing tables in the schema.
 resource "snowflake_grant_privileges_to_account_role" "grant_all_tables_select" {    
 provider          = snowflake.securityadmin    
 privileges        =    
 account_role_name = snowflake_account_role.analytics_db_read_only.name    
 on_schema_object {        
 all {            
 object_type_plural = "TABLES"            
 in_schema          = snowflake_schema.raw_data.fully_qualified_name    
   }  
  }
 }
 
 // Grant SELECT on all FUTURE tables created in the schema.
 resource "snowflake_grant_privileges_to_account_role" "grant_future_tables_select" {  
 provider          = snowflake.securityadmin  
 account_role_name = snowflake_account_role.analytics_db_read_only.name  
 privileges        =  
 
 on_schema_object {    
 future {      
 object_type_plural = "TABLES"      
 in_schema          = snowflake_schema.raw_data.fully_qualified_name   
   }  
  }
 }

‍

A particularly powerful feature demonstrated here is the use of the future block. Granting privileges on future objects ensures that the access role will automatically have the specified permissions on any new tables created within that schema. This dramatically reduces operational overhead, as permissions do not need to be manually updated every time a new table is deployed. However, it is important to understand Snowflake's grant precedence: future grants defined at the schema level will always take precedence over those defined at the database level. This can lead to "insufficient privilege" errors if not managed carefully across different roles and grant levels.

3.4 An optional "Audit" role for bypassing data masks

In certain scenarios, such as internal security audits or compliance reviews, it may be necessary for specific, highly-trusted users to view data that is normally protected by masking policies. Creating a dedicated "audit" role for this purpose provides a controlled and auditable mechanism to bypass data masking when required.

This role should be considered a highly privileged functional role and granted to users with extreme care.

// Define a special functional role for auditing PII data.
resource "snowflake_account_role" "pii_auditor" {  
provider = snowflake.securityadmin  

name    = "PII_AUDITOR"  
comment = "Functional role for users who need to view unmasked PII for audit purposes."
}

‍

Crucially, creating this role is not enough. For it to be effective, every relevant masking policy must be explicitly updated to include logic that unmasks data for members of the PII_AUDITOR role. This ensures that the ability to view sensitive data is granted on a policy-by-policy basis. An example of how to modify a masking policy to incorporate this audit role is shown in the following section.

Section 4: advanced data governance with dynamic data masking

Moving beyond infrastructure provisioning, Terraform can also codify and enforce sophisticated data governance policies. Snowflake's Dynamic Data Masking is a powerful feature for protecting sensitive data at query time. By managing these policies with Terraform, organizations can ensure that data protection rules are version-controlled, auditable, and consistently applied across all environments.

4.1 Defining the masking logic

A masking policy is a schema-level object containing SQL logic that determines whether a user sees the original data in a column or a masked version. The decision is made dynamically at query time based on the user's context, most commonly their active role.

The snowflake_masking_policy resource is used to define this logic. The policy's body contains a CASE statement that evaluates the user's session context and returns the appropriate value.

The following example creates a policy to mask email addresses for any user who is not in the DATA_ANALYST or PII_AUDITOR role.

resource "snowflake_masking_policy" "email_mask" {  
provider = snowflake.sysadmin // Policy creation often requires SYSADMIN or a dedicated governance role  n

ame     = "EMAIL_MASK"  
database = snowflake_database.analytics_db.name  
schema   = snowflake_schema.raw_data.name    

// Defines the signature of the column the policy can be applied to.  
// The first argument is always the column value to be masked.  
argument {    
name = "email_val"    
type = "VARCHAR"  }    

// The return data type must match the input data type.  
return_type = "VARCHAR" 

// The core masking logic is a SQL expression.  
body = <<-EOF    
CASE      
WHEN IS_ROLE_IN_SESSION('DATA_ANALYST') OR IS_ROLE_IN_SESSION('PII_AUDITOR') THEN email_val      
ELSE '*********'   
END  
EOF  

comment = "Masks email addresses for all roles except DATA_ANALYST and PII_AUDITOR."
}

‍

The SQL expression within the body argument offers immense flexibility. It can use various context functions (like CURRENT_ROLE() or IS_ROLE_IN_SESSION()) and even call User-Defined Functions (UDFs) to implement complex logic. However, this flexibility means the logic itself is not validated by Terraform's syntax checker; it is sent directly to Snowflake for validation during the

terraform apply step. It is also a strict requirement that the data type defined in the argument block and the return_type must match the data type of the column to which the policy will eventually be applied.

4.2 Applying the policy to a column

Creating a masking policy is only the first step; it does not protect any data on its own. The policy must be explicitly applied to one or more table columns. This crucial second step is often a point of confusion for new users, who may create a policy and wonder why data is still unmasked. The snowflake_table_column_masking_policy_application resource creates this essential link between the policy and the column.

The following example demonstrates how to apply the EMAIL_MASK policy to the EMAIL column of a CUSTOMERS table.

// For this example, we assume a 'CUSTOMERS' table with an 'EMAIL' column
// already exists in the 'RAW' schema. In a real-world scenario, this table
// might also be managed by Terraform or by a separate data loading process.
// We use a data source to reference this existing table.
data "snowflake_table" "customers" {  
database = snowflake_database.analytics_db.name  
schema   = snowflake_schema.raw_data.name  
name     = "CUSTOMERS"
}

// Apply the masking policy to the specific column.resource "snowflake_table_column_masking_policy_application" "apply_email_mask" {  
provider = snowflake.sysadmin  

table_name  = "\"${data.snowflake_table.customers.database}\". \"${data.snowflake_table.customers.schema}\". \"${data.snowflake_table.customers.name}\""  
column_name = "EMAIL" // The name of the column to be masked  

masking_policy_name = snowflake_masking_policy.email_mask.fully_qualified_name    

// An explicit depends_on block ensures that Terraform creates the policy  
// before attempting to apply it, preventing race conditions.  
depends_on = [    
snowflake_masking_policy.email_mask  
]
}

‍

This two-step process—defining the policy logic and then applying it - provides a clear and modular approach to data governance. The same policy can be defined once and applied to many different columns across multiple tables, ensuring that the masking logic is consistent and centrally managed.

Conclusion: the path to mature Snowflake IaC

This guide has charted a course from the initial, manual bootstrapping of a secure connection to the automated provisioning and governance of a production-grade Snowflake environment. To ensure the long-term success and scalability of managing Snowflake with Terraform, several key practices should be adopted as standard procedure:

Version control: All Terraform configuration files must be stored in a version control system like Git. This provides a complete, auditable history of all infrastructure changes and enables collaborative workflows such as pull requests for peer review before any changes are applied to production.
Remote state management: The default behaviour of Terraform is to store its state file locally. In any team or automated environment, this is untenable. A remote backend, such as an Amazon S3 bucket with a DynamoDB table for state locking, must be configured. This secures the state file, prevents concurrent modifications from corrupting the state, and allows CI/CD pipelines and team members to work from a consistent view of the infrastructure.
Modularity: As the number of managed resources grows, monolithic Terraform configurations become difficult to maintain. Code should be refactored into reusable modules. For instance, a module could be created to provision a new database along with a standard set of access roles and default schemas. This promotes code reuse, reduces duplication, and allows for more organized and scalable management of the environment.
Provider versioning: The Snowflake Terraform provider is actively evolving. To prevent unexpected breaking changes from new releases, it is crucial to pin the provider to a specific major version in the terraform block (e.g., version = "~> 1.0"). This allows for intentional, planned upgrades. When upgrading between major versions, it is essential to carefully review the official migration guides, as significant changes, particularly to grant resources, may require a concerted migration effort.

With this robust foundation in place, the path is clear for expanding automation to encompass even more of Snowflake's capabilities. The next logical steps include using Terraform to manage snowflake_network_policy for network security, snowflake_row_access_policy for fine-grained data filtering, and snowflake_task for orchestrating SQL workloads. Ultimately, the entire workflow should be integrated into a CI/CD pipeline, enabling a true GitOps model where every change to the Snowflake environment is proposed, reviewed, and deployed through a fully automated and audited process. By embracing this comprehensive approach, organizations can unlock the full potential of their data platform, confident in its security, scalability, and operational excellence.

Why Snowstack for Terraform and Snowflake

Automation without expertise can still fail. Terraform gives you the tools, but it takes experience and the right design patterns to turn Snowflake into a secure, cost-efficient, and scalable platform.

Managing Snowflake with Terraform is powerful, but putting it into practice at enterprise scale requires experience, discipline, and the right patterns. That is where Snowstack comes in. As a Snowflake-first consulting partner, we help organizations move beyond trial-and-error scripts to fully automated, production-grade environments. Our engineers design secure architectures, embed Terraform best practices, and ensure governance and cost controls are built in from day one.

👉 Book a strategy call with Snowstack and see how we can take your Snowflake platform from manual operations to enterprise-ready automation.

‍

Cloud

5 min read

Databricks vs Snowflake: Which one is better in 2025?

A few years ago, choosing a data platform was about storage limits and running reports. Databricks and Snowflake are the two biggest names in this space. The real challenge is deciding which one fits your strategy better in 2025.

A few years ago, choosing a data platform was about storage limits and running reports. In 2025, the game has changed. Data speed is now business speed, and the platform running your analytics and AI determines how fast you can innovate, control Snowflake costs, and outpace competitors. Databricks and Snowflake are the two biggest names in this space, each offering a different path to turning data into a competitive edge. The real challenge is deciding which one fits your strategy better and how it fits into a modern implementation.

Picking between Databricks and Snowflake is less about comparing features and more about deciding how your business will compete. This guide shows you which platform can give you the advantage and where expert Snowflake consulting can help you in your data projects.

‍

What is Databricks?

Created by the team behind Apache Spark, Databricks unifies data engineering, data science, and machine learning in a single “lakehouse” platform. It handles structured and unstructured data at scale, excelling in complex pipelines, streaming analytics, and AI/ML workloads. By 2025, new features like Agent Bricks for domain-specific AI agents, Lakebase for AI-native applications, and expanded Unity Catalog governance have turned it into a full data intelligence platform for both technical and business users.

‍

What is Snowflake?

Snowflake redefined cloud data warehousing with its separate compute and storage architecture, making it easy to scale and manage. Originally built for SQL analytics, it has evolved into an AI Data Cloud supporting BI and advanced AI applications. In 2025, enhancements like Cortex AISQL, the Arctic LLM, document AI, and improved Python integration extend its reach to data scientists, while keeping its automation and strong data governance.

Databricks vs Snowflake: similarities

Both platforms have matured significantly by 2025, converging on several key capabilities that make them viable options for modern data architectures. Both offer:

Cloud-native architecture with automatic scaling and multi-cloud deployment options
Enterprise-grade security including encryption, compliance certifications, and granular access controls
Data sharing capabilities for secure collaboration across teams and organizations
Support for both structured and unstructured data with varying degrees of optimization
Integration ecosystems connecting to popular BI tools, data orchestration platforms, and cloud services
Pay-as-you-consume pricing models with cost optimization features
Streaming data ingestion for real-time analytics and decision-making
Machine learning capabilities though with different approaches and levels of sophistication

Databricks vs Snowflake: differences

While these platforms share similarities, their design and intended uses provide each with advantages in specific scenarios.

Performance

Snowflake is built for fast, predictable SQL at high concurrency. Multi-cluster warehouses and automatic optimization keep dashboards responsive. In June 2025, Snowflake introduced Adaptive Compute and Gen2 warehouses to further boost price-performance for interactive analytics.

Databricks is strongest on heavy transformations, ML, and streaming; Photon closes much of the SQL gap but still benefits from tuning.

Aspect	Snowflake	Databricks
Typical workloads	Interactive SQL, BI, dashboards	Complex ETL, ML, real-time streaming
Notable features	Multi-cluster warehouses, automatic optimization	Photon engine, Structured Streaming
Trade-offs	Less control over low-level tuning	More tuning to hit top SQL speed

‍

Winner: Snowflake for interactive SQL/BI and concurrent users; Databricks for heavy data processing, ML, and low-latency streaming.

Scalability

Snowflake scales with virtual warehouses and multi-cluster warehouses that add or remove clusters automatically, suspend when idle, and resume on demand, which makes high-concurrency BI straightforward with little operational overhead. It is simple to run for many concurrent users and to hand over to a dedicated platform team as a service when internal capacity is limited.

Databricks scales massive distributed jobs and offers autoscaling and serverless options across jobs, SQL, and pipelines.

“Snowflake had great performance consistency and easier scaling… Databricks gave us the best bang for buck on large-scale transformations and streaming.”

Aspect	Snowflake	Databricks
Concurrency	Scales out automatically for many users	Requires plan per workload pattern
Long jobs	Pays for bursts, easy to pause	Efficient for sustained large pipelines
Ops effort	Low	Higher, with more control

‍

Winner: Snowflake for easy, high-concurrency analytics; Databricks for large-scale data processing and ML.

Ease of Use

Snowflake is SQL-first with a clean web UI, so analysts can start fast, and most tuning is automatic.

Databricks is notebook- and code-centric, great for engineers and data scientists, but it asks more from the team. Across the data community, the pattern is consistent:

“Snowflake seems so much easier to manage … the fastest way to deliver stakeholder value,” while Databricks earns favour with teams that have deep technical know-how.

Aspect	Snowflake	Databricks
Onboarding	Hours for analysts	Days to weeks depending on setup
Primary users	BI and data analysts	Data engineers and data scientists
Admin workload	Light	Moderate to heavy

‍

Winner: Snowflake for business users and quick deployment; Databricks for technical teams requiring flexibility

Security

Snowflake ships enterprise controls out of the box, including RBAC, dynamic masking, row access, encryption, and detailed usage history. In 2025, updates added Trust Centre email alerts for policy violations, and Access History plus built-in lineage views support auditing. These map closely to the control models used in Snowflake AI & data governance.

Databricks centralises security and lineage in Unity Catalog with fine-grained policies and customer-managed keys, now including attribute-based access control (ABAC) policies.

Aspect	Snowflake	Databricks
Defaults	Strong, turnkey governance	Strong, more setup
Policy model	RBAC, masking, row policies	Unity Catalog, fine-grained and ABAC options
Keys and encryption	Managed, always-on	Managed and customer-managed keys

‍

Winner: Snowflake for turnkey, compliance-ready governance; Databricks for flexible, policy-rich control across data and AI when you have the engineering depth.

Integration

Snowflake connects cleanly to the BI stack and runs data and native apps inside the platform. Its Marketplace and Native App Framework let vendors ship apps that run inside Snowflake, and 2025 updates expanded in-market apps and data products. These patterns are common in enterprise Snowflake implementations where BI is the primary interface.

Databricks, on the other hand, leans on open formats and APIs, integrating broadly with Spark tools, ML frameworks, and engines that read Delta or Iceberg (and even Snowflake for reads).

Aspect	Snowflake	Databricks
BI tools	Native connectors for Tableau, Power BI, Looker	Works, but BI is not the core
Data apps	Marketplace and Native Apps	Open APIs, broad OSS ecosystem
Open formats	Growing Iceberg support	Delta Lake and Iceberg first-class

‍

Winner: Snowflake for BI and in-platform apps; Databricks for ML/AI ecosystem depth and open, cross-engine interoperability.

AI

Snowflake integrates AI directly into the platform, allowing teams to call large language models (LLMs) directly from SQL through Cortex AISQL. It also offers its own Arctic LLM family and, starting in 2025, supports running Snowflake ML models within Native Apps.

Meanwhile, Databricks focuses on end-to-end AI application development. Its Mosaic AI Agent Framework enables retrieval-augmented generation (RAG) and agent workflows, and it recently launched DBRX, an open LLM designed for enterprise customisation.

Aspect	Snowflake	Databricks
Primary approach	AI inside analytics and SQL	Full AI app development and MLOps
Core tools	Cortex AISQL, Document AI, Arctic LLMs, ML in Native Apps	Mosaic AI Agent Framework, Vector Search, Model Serving, DBRX
Governance & ops	Built into the Data Cloud, usage metered in-platform	Integrated with Unity Catalog and platform security
Best fit	AI-augmented BI, governed NLQ, document extraction	Custom models, agents, large-scale RAG and real-time AI

‍

Winner: Snowflake for AI in analytics with governance and low MLOps overhead. Databricks for custom AI apps, agents, and RAG at scale.

Cost

Snowflake charges per-second compute with auto-suspend and clear usage views, which makes BI spend predictable when set up well. Cost visibility is built in through Snowsight dashboards, usage views, resource monitors, and new cost-anomaly detection, and Cortex AI features are metered by tokens with documented credit rates and guardrails like the 10% cloud-services threshold. Many teams add a layer of Snowflake FinOps and the Snowflake Savings Calculator to keep spend under tight control.

Databricks uses DBUs that vary by workload and tier; it can be cheaper for large, long-running pipelines if you actively tune and monitor. The company is phasing out the Standard tier on AWS and GCP with Premium becoming the base on October 1, 2025, which makes governance features standard but still requires active monitoring and optimisation for steady costs.

As one user said:

“DBU pricing is confusing; you need active monitoring to understand what work maps to which cost.”

Aspect	Snowflake	Databricks
Billing model	Per-second compute, separate storage	DBUs by workload and tier
Cost control	Auto-suspend, usage views, resource monitors	Needs active optimization and tracking
Where it saves	Short, spiky BI workloads	Sustained ETL and ML at scale

‍

Winner: Snowflake for clearer, more predictable analytics spend and native cost controls; Databricks for cost efficiency on large, long-running data engineering and ML when tuned well.

‍

So, which one is better in 2025?

Use case	Snowflake	Databricks
Enterprise BI & reporting	✅ Fast SQL, concurrency, near-zero ops	✔️ Works, but not its sweet spot
Self-service analytics for business users	✅ Simple UI, governed access, AISQL	✔️ Notebooks require more expertise
Data sharing & collaboration	✅ Native Data Marketplace and sharing	✔️ Delta Sharing available
Governed analytics at scale	✅ Strong RBAC, masking, turnkey governance	✔️ Unity Catalog strong, more setup
Highly regulated environments	✅ Compliance-first, tight governance	✔️ Capable with more configuration
Cost predictability for BI	✅ Per-second warehouse billing, easy to model	✔️ Can be efficient with tuning
Custom ML / deep learning	✔️ Possible via Snowpark/Python, Cortex	✅ Native ML/AI stack with Spark/MLflow
AI agents and GenAI apps	✔️ AISQL, Arctic, Document AI streamline AI-in-analytics	✅ Mosaic AI Agent Framework, Vector Search, eval tooling
Large-scale data engineering (ETL/ELT)	✔️ Good for SQL ELT	✅ Spark/Delta excel at heavy pipelines
Streaming & real-time pipelines	✔️ Integrates via connectors	✅ Built-in streaming with Spark/Delta Live Tables
Unstructured data processing	✔️ Improving with Cortex & Document AI	✅ Strong with lakehouse and AI tooling
Open table formats (Iceberg/Delta)	✔️ Interop improving; strong for SQL workloads	✅ Native Delta; expanding Iceberg support
Lake modernization	✅ If goal is SQL-first analytics with governance	✅ If goal is open formats + ML/AI at scale
Hybrid architecture (both platforms)	✅ Great as the governed analytics layer	✅ Great as the engineering/AI layer

‍

The decision between Databricks vs Snowflake ultimately depends on your organization's primary use cases, team composition, and strategic priorities.

Choose Snowflake if:

Your primary focus is business intelligence, reporting, and governed analytics.
You have mixed technical teams, including business analysts who need self-service capabilities on a managed Snowflake platform.
You prioritise ease of use, quick deployment, and minimal maintenance overhead.
Data governance, compliance, and security are top priorities with limited dedicated resources, making AI & data governance a core requirement.
You need predictable, transparent pricing for analytical workloads with clear FinOps guardrails.
Your AI initiatives involve augmenting existing analytics rather than building custom models from scratch.

Consider a hybrid approach if:

You have both heavy ML/data science workloads AND extensive BI requirements
Different teams have varying technical capabilities and use case requirements
You're transitioning between platforms and need time to migrate workloads, often via staged migrations & integrations.
Specific regulatory or data residency requirements dictate platform choice by region

‍

Need expert guidance for your data platform decision?

Your data platform is not an IT purchase. It is a strategy decision. Our Snowflake consultants help data leaders design, build, and run modern platforms with a core focus on Snowflake and the surrounding stack. We handle migrations, performance tuning, FinOps, AI readiness and governance so your team spends smarter and stays compliant. We use the same delivery patterns proven in our success stories.

Let’s align your Snowflake platform to your strategy.

Talk to our Snowflake consultants →

‍

Databricks vs Snowflake FAQs

FAQs

Can you use both Databricks and Snowflake together? +

Absolutely. A common architecture uses Databricks for ETL and AI workloads, then loads into Snowflake for SQL analytics and business-level insights.

Is Databricks cheaper than Snowflake? +

Neither platform is always cheaper; Databricks can be more cost-efficient for long-running, well-tuned pipelines, while Snowflake often wins on short, spiky BI workloads where auto-suspend and clear warehouse sizing keep spend predictable. Real costs come from platform spend + engineering time.

The FinOps Savings Calculator for Snowflake and FinOps services break your bill into compute, storage, and services, then surface concrete optimization opportunities before you consider a re-platform.

How should a company choose between Databricks and Snowflake? +

Start from your use cases, team, and risk profile. If most value comes from governed reporting, self-service analytics, and finance-grade dashboards, Snowflake as the primary platform (sometimes plus Databricks for specialist workloads) is usually the safer bet.

If your roadmap is dominated by advanced ML, streaming, and data-science-heavy products, Databricks often plays a bigger role. Our consulting services are designed exactly for this decision: we audit your current stack and model costs. When you’re ready, you can book a consultation to walk through options with a senior Snowflake architect consultant.

Is it easy to migrate between Databricks and Snowflake? +

Migrating between Databricks and Snowflake is technically feasible, but the real challenge is cost and disruption. Teams usually switch from Databricks to Snowflake for simpler SQL analytics, stronger governance, and easier day-to-day operations.

Our Snowflake consultants help companies with platform migration by first modelling the business case through Snowflake consulting & advisory , then, if a move to Snowflake is justified, running a structured, low-risk migration using our migrations & integrations playbooks and proven timelines from real Snowflake case studies .

Does Snowflake support unstructured data like Databricks? +

Yes. Snowflake supports unstructured data and handles semi-structured data (JSON, Avro, Parquet) through the VARIANT type. It now supports unstructured data (documents, images, logs, media files) via external volumes, directory tables, and features like Document AI.

Is Snowflake better than Databricks for AI? +

For AI workloads in 2025, the answer depends on your specific implementation approach. Snowflake is better for adding AI into analytics. With Cortex AISQL, the Arctic LLM, Document AI, and stronger Python support, it lets teams use AI for insights, governed deployments, and SQL-based applications without deep ML expertise.

‍

Finance

5 min read

Snowflake in 2025: 5 real-world use cases that could transform your business

What if you never had to wait for answers again, whether you are searching 800,000 documents, tracking a global supply chain, or reacting to real-time sales? And what if you could forecast business demand weeks in advance with nothing more than a few lines of SQL?

In 2025, this is not a future vision. It is how leading companies already use Snowflake’s Data Cloud to make faster and smarter decisions. Snowstack helps organizations get there. As certified Snowflake experts, we help organizations go beyond using Snowflake as a warehouse, turning it into a secure, scalable, and AI-ready data platform.

In this blog, we explore five use cases that show how companies are driving results today.

What is Snowflake?

Snowflake is a cloud-native data platform that brings all of your organization’s data together in a secure, scalable, and easy-to-use environment. Traditional systems often lock you into a single vendor and require heavy infrastructure. Snowflake avoids these limits by running on AWS, Azure, and Google Cloud, giving you the flexibility to scale resources up or down as your business needs evolve.

At the heart of Snowflake’s performance is its unique architecture that separates compute from storage. This means you can scale performance and capacity independently, ensuring you only pay for what you use.

What Snowflake dose?

At its core, Snowflake is built to store, integrate, and analyse large volumes of data. It handles both structured data such as sales transactions and semi-structured formats such as JSON logs, all without the burden of hardware management, database tuning, or restrictive licensing.

By 2025, Snowflake has become much more than a warehouse for storage and analytics:

It is an AI-ready platform with capabilities like Snowflake Cortex, which brings natural-language queries, predictive modelling, and generative AI directly into the platform.
It enables real-time data sharing with partners, suppliers, and customers while keeping governance and security intact.
It delivers advanced business intelligence by making insights instantly accessible to both technical and non-technical users.

In practice, Snowflake is used to turn raw data into decisions that matter. An engineer can optimize a turbine setting in seconds, a retailer can respond to changing demand in real time, and a government agency can shape policy backed by timely, reliable information.

As the success stories below show, Snowflake is no longer just a tool for data teams. It is a strategic platform that changes how entire organizations collaborate, innovate, and grow.

Use case 1: Siemens Energy – turning 800,000 documents into instant answers

The challenge:

Siemens Energy operates in one of the most complex industries in the world - power generation and infrastructure. Their teams relied on over 800,000 technical documents: safety manuals, engineering diagrams, and operational reports. Searching for critical information could take hours or even days, slowing down maintenance and decision-making.

The solution:

Using Snowflake Cortex AI and retrieval-augmented generation (RAG), Siemens Energy deployed a document chatbot on its document repository. Engineers simply ask, “What’s the recommended torque for this turbine component?” and get back a precise, instant answer.

The result:

Faster access to knowledge means reduced downtime, quicker troubleshooting, and better-informed field operations, all while keeping sensitive data secure inside Snowflake’s governed environment.

Use case 2: Sainsbury’s – data insights for every store manager

The challenge:

With over 1,400 stores and thousands of employees, Sainsbury’s needed to put live performance data in the hands of managers on the shop floor — without requiring them to be data analysts. Traditional reports were static, delayed, and inaccessible during the daily rush.

The solution:

Sainsbury’s built a mobile-friendly analytics platform powered by Snowflake’s real-time data processing. Sales, staffing, waste management, and customer feedback are streamed into Snowflake, processed, and made available through intuitive dashboards and mobile apps.

The result:

Store managers can now make same-day staffing adjustments, reduce waste by acting on live inventory alerts, and respond to customer trends before they impact sales. The initiative has saved over 150,000 labour hours annually and boosted responsiveness at every level of the organization.

Use case 3: Deloitte – modernizing public sector data for the AI era

The challenge:

Government agencies often operate with siloed systems, outdated infrastructure, and strict compliance requirements. Integrating data for cross-departmental analysis is slow and expensive, making it harder to respond to citizens’ needs.

The solution:

‍Deloitte partnered with Snowflake to create the AI-Ready Data Foundation, a framework that enables secure, scalable, and compliant data sharing across public sector organizations. The platform is designed to support advanced analytics and generative AI workloads, enabling predictive services and faster policy decisions.

The result:

Agencies can now connect previously isolated datasets, generate real-time insights, and deploy AI applications without compromising security. This modernization has improved efficiency, transparency, and service delivery — earning Deloitte recognition as Snowflake’s 2025 Public Sector Data Cloud Services Partner of the Year.

Use case 4: Global retailer – harmonizing product data across brands

The challenge:

A global retail group managing multiple brands struggled with inconsistent product data across catalogs. The same product might appear under different names, SKUs, or descriptions, making inventory analysis, pricing strategies, and supplier negotiations a nightmare.

The solution:

Using Snowflake notebooks and embedded AI/ML models, the retailer developed a product data harmonization pipeline. The system cleans raw product data, generates vector embeddings for matching, and unifies records across different brand catalogs.

The result:

Unified product intelligence allows teams to analyse portfolio performance holistically, optimize pricing, and spot cross-brand sales opportunities. Supplier management has improved, and decision-makers finally trust that they’re working from a single, accurate source of truth.

Use case 5: Douglas – cutting analytics time from 2 hours to 40 seconds

The challenge:

Douglas, a leading European beauty retailer, relied on batch-processed reports that took up to two hours to compile. By the time teams received the data, it was already outdated - too late for fast-moving e-commerce campaigns and in-store promotions.

The solution:

By migrating to Snowflake and optimizing data pipelines, Douglas transformed their analytics process into a near real-time system. Inventory levels, sales performance, and customer engagement data are refreshed continuously, accessible within seconds.

The result:

Processing time dropped from 2 hours to just 40 seconds. Marketing teams can now adapt campaigns instantly, inventory managers can react to stock shortages in real-time, and the business can run more targeted promotions that actually align with current demand.

Why These Results Matter for Your Organization

Cross-Industry Platform Versatility: From energy infrastructure to retail operations to government services, Snowflake adapts to unique industry challenges while maintaining enterprise-grade security and compliance.
Measurable Business Impact, Not Theoretical Benefits: Every example demonstrates quantifiable improvements: Siemens' instant document retrieval, Sainsbury's 150,000 saved labour hours annually, Douglas' 99.7% performance improvement (2 hours to 40 seconds). They're production systems delivering ROI today.
AI and Analytics Integration at Enterprise Scale :These implementations showcase Snowflake's evolution beyond traditional data warehousing into AI-native operations. Organizations can implement advanced AI capabilities without replacing existing infrastructure or managing complex integrations.

Ready to Write Your Own Success Story?

The organizations in this analysis didn’t transform by chance. They worked with experts who understood how to align technology with business priorities and deliver lasting impact.

Explore our case studies to see how Snowstack has helped companies modernize their data, reduce costs, and build a sharper competitive edge. These stories show what becomes possible when Snowflake is turned from a warehouse into a true growth platform.

Schedule a strategic assessment and discover how we can design the same advantage for you:

Document intelligence at scale (Siemens Energy)
Real-time operational dashboards (Sainsbury’s)
Modern data foundations built for growth (Deloitte)
Harmonized product data across brands (Global retailer)
Analytics in seconds, not hours (Douglas)

Your competitors are already moving in this direction. The sooner you act, the sooner you can move past them.

‍

FAQs

Snowflake is moving beyond data warehousing into a full data platform. Its future lies in powering analytics, governance, and modern applications while supporting new AI-driven workloads.

Snowflake is trusted across industries by many major brands, including Deloitte, Capital One, Fidelity Investments, Amazon, Walmart, ExxonMobil, Apple, CVS Health, and UnitedHealth Group, among others. It is also used by organizations across the Forbes Global 2000 and processed over 4.2 billion daily queries in 2024.

Key features include Cortex integration, stronger governance tools, support for Apache Iceberg, and improvements in pipelines, lineage, and developer tooling.

The Snowflake Data Cloud is built on a cloud-native architecture that is not limited by legacy technology. Snowflake's architecture enables a variety of workloads across public clouds and regions, and it can handle near-unlimited amounts and types of data with low latency.

Yes. Snowflake is positioned to benefit as companies demand trusted, governed data for AI. Its platform already integrates AI capabilities that drive faster adoption and growth.

‍

Finance

5 min read

How Snowflake cost is calculated: 5 steps to optimize your data warehouse costs before your next renewal

For data teams, the pattern is almost always the same. You move to Snowflake for performance and scale. But then the first bill lands, and suddenly your Snowflake warehouse costs are far higher than forecast. What went wrong?

The first step to regaining control is understanding how Snowflake costs are calculated. This guide breaks down the cost structure and gives you five practical steps to optimize spend, so you only pay for the resources you actually need and can design a sustainable Snowflake FinOps practice before your next renewal.

But first, what Snowflake is used for?

Snowflake is a cloud data platform that enables organisations to store, process, and analyse data at scale. It operates on the three leading cloud providers (Amazon Web Services, Google Cloud Platform, and Microsoft Azure), giving businesses flexibility in how they deploy and expand their environments, whether as a greenfield implementation or as part of a larger Snowflake data platform rollout.

As a fully managed service, Snowflake removes the burden of infrastructure management. Users do not need to handle hardware, software updates, or tuning. Instead, they can focus entirely on working with their data while the platform manages performance, security, and scalability in the background - often with a lean internal team supported by a specialised Snowflake platform team.

One of Snowflake’s defining features is the separation of storage and compute, which allows each to scale independently. This design supports efficient resource usage, quick provisioning of additional capacity when needed, and automatic suspension of idle compute clusters known as virtual warehouses. These capabilities reduce costs while maintaining high performance when they’re configured with a cost-optimisation strategy.

Why your Snowflake cost keeps growing?

Before we discuses optimization, let's decode what you're actually paying for. Because if you're like most data teams, you're probably overpaying for things you didn't know you were buying.

Snowflake’s pay-as-you-go model is built on two primary components: compute and storage, along with a smaller component, cloud services.

1. Compute costs

This is typically the largest portion of your Snowflake bill. Compute is measured in Snowflake credits, an abstract unit that's consumed when a virtual warehouse is active.

Here's how the math works:

Virtual Warehouses: These are the compute clusters (EC2 instances on AWS, for example) that run your queries, data loads, and other operations.
"T-Shirt" Sizing: Warehouses come in sizes like X-Small, Small, Medium, Large, etc. Each size up doubles the number of servers in the cluster and, therefore, doubles the credit consumption per hour.
Per-Second Billing: You're billed for credits on a per-second basis after the first 60 seconds of a warehouse running.

The formula for calculating the cost:

Credits Consumed = (Credits per hour for the warehouse) × (Total runtime in seconds) ÷ 3600

Real example: Running a Large warehouse (8 credits/hour) for 30 minutes (1800 seconds) would consume (8 * 1800)÷3600 = 4 credits. If you're paying $3 per credit, that half-hour just cost you $12. Scale that across dozens of queries per day, and you can see how costs spiral.

2. Storage Costs

At first, storage looks inexpensive compared to compute, but as data grows costs can rise quickly. Snowflake calculates storage charges based on the average monthly volume of data you store in terabytes. Because your data is automatically compressed, you’re billed on the compressed size, which most teams overlook.

You're paying for three different types of storage:

Active Storage: The live data in your databases and tables.
Time-Travel: Data kept to allow you to query or restore historical data from a specific point in the past. The default retention period is 1 day, but it can be configured up to 90 days for Enterprise editions.
Fail-safe: A 7-day period of historical data storage after the Time-Travel window closes, used for disaster recovery by Snowflake support. This is not user-configurable.

3. Cloud services costs

The cloud services layer provides essential functions like authentication, query parsing, access control, and metadata management. For the most part, this layer is free. You only begin to incur costs if your usage of the cloud services layer exceeds 10% of your daily compute credit consumption. This is rare but can happen with an extremely high volume of very simple, fast queries.

5 steps to optimize your Snowflake warehouse costs

Now that you know what you're paying for, here are five steps to significantly reduce your spend.

Step 1: right-size your virtual warehouses

Running an oversized warehouse is like using a sledgehammer to crack a nut - it's expensive and unnecessary.

Start small: Don't default to a Large warehouse. Begin with an X-Small or Small and only scale up if performance is inadequate. It's often more efficient to run a query for slightly longer on a smaller warehouse than for a few seconds on a larger one. Look for slow queries in the Query History that generate a lot of “Bytes spilled to local storage”. For large joins or window functions, going for a Small to Large warehouse might be 4 times more expensive, but 10x faster, resulting in a 60% cost reduction.
Set aggressive auto-suspend policies: An active warehouse consumes credits even when it's inactive. Configure your warehouses to auto-suspend quickly when not in use. A setting of 1 to 5 minutes is a good starting point for most workloads. This single change can have a massive impact on your bill.
Separate your workloads: Don't use one giant warehouse for everything. Create separate warehouses for different teams and tasks: (e.g., ELT_WH for data loading, BI_WH for analytics dashboards, DATASCIENCE_WH for ad-hoc exploration). This prevents a resource-intensive data science query from slowing down critical business reports and allows you to tailor the size and settings for each specific workload.
Use multi-cluster warehouses for high concurrency: If you have many users running queries simultaneously (like a popular BI dashboard), instead of using a larger warehouse (scaling up), configure a multi-cluster warehouse (scaling out). This will automatically spin up additional clusters of the same size to handle the concurrent load and spin them down as demand decreases.

Step 2: optimize your queries and workloads

Inefficient queries are a primary driver of wasted compute credits. A poorly written query can run for minutes on a large warehouse when a well-written one could finish in seconds on a smaller one.

Use the Query Profile: This is your best friend for optimization. Before trying to fix a slow query, run it and then analyse its performance in the Query Profile. This tool provides a detailed graphical breakdown of each step of query execution, showing you exactly where the bottlenecks are (e.g., a table scan that should be a prune, an exploding join).
Avoid SELECT *: Only select the columns you actually need. Pulling unnecessary columns increases I/O and can prevent Snowflake from performing "column pruning," a key optimization technique.
Be careful with JOINs: Ensure you are joining on keys that are well-distributed. Accidental Cartesian products (cross-joins) are a notorious cause of runaway queries that can burn through credits.
Materialize complex views: If you have a complex view that is queried frequently, consider materializing it into a table. While this uses more storage, the compute savings from not having to re-calculate the view on every query can be substantial. Use Materialized Views for this, as Snowflake will automatically keep them up-to-date.

Step 3: manage your data storage lifecycle

While cheaper than compute, storage costs can creep up. Proactive data management is key.

Configure Time-Travel Sensibly: Do you really need 90 days of Time-Travel for every table? For staging tables or transient data, a 1-day retention period is often sufficient. Align the Time-Travel window with your actual business requirements for data recovery.
Use Transient and Temporary Tables: For data that doesn't need to be recovered (like staging data from an ELT process), use transient tables. These tables do not have a Fail-safe period and only have a Time-Travel period of 0 or 1 day. This can significantly reduce your storage footprint for intermediate data.
Periodically Review and Purge Data: Implement a data retention policy and periodically archive or delete data that is no longer needed for analysis.

Step 4: maximize caching to get free compute

Snowflake has multiple layers of caching that can dramatically reduce credit consumption if leveraged correctly. When a query result is served from a cache, it consumes zero compute credits.

The Result Cache: Snowflake automatically caches the results of every query you run. If another user submits the exact same query within 24 hours (and the underlying data has not changed), Snowflake returns the cached result almost instantly without starting a warehouse. This is perfect for popular dashboards where many users view the same report.
Local Disk Cache (Warehouse Cache): When a warehouse executes a query, it caches the data it retrieved from storage on its local SSD. If a new query requires some of the same data, it can be read from this much faster local cache instead of remote storage, speeding up the query and reducing compute time. This cache is cleared when the warehouse is suspended.

Step 5: implement robust governance and monitoring

You can't optimize what you can't measure. Use Snowflake's built-in tools to monitor usage and enforce budgets.

Set up Resource Monitors: This is your primary safety net. A Resource Monitor can be assigned to one or more warehouses to track their credit consumption. You can configure it to send alerts at certain thresholds (e.g., 75% of budget) and, most importantly, to suspend the warehouse when it hits its limit, preventing runaway spending.
Analyse your usage data: Snowflake provides a wealth of metadata in the SNOWFLAKE database, specifically within the ACCOUNT_USAGE schema. Views like WAREHOUSE_METERING_HISTORY, QUERY_HISTORY, and STORAGE_USAGE are invaluable. Query this data to find your most expensive queries, identify your busiest warehouses, and track your storage costs over time.
Tag everything for cost allocation: Use Snowflake's tagging feature to assign metadata tags to warehouses, databases, and other objects. You can tag objects by department (finance, marketing), project, or user. This allows you to query the usage views and accurately allocate costs back to the teams responsible, creating accountability.

Bringing it all together

So what’s your next step? These five practices will help you reduce costs and build smarter habits, but turning them into measurable savings at scale takes more than a checklist. It requires the right expertise and execution.

For example, a leading financial services company was spending more than $800K per month on cloud costs with no clear view of where the money was going. Within 90 days of working with our experts, they gained full visibility, reduced ingestion latency by 80%, and built a governed, AI-ready platform while bringing costs back under control.

👉 Read the full case study here

At Snowstack, we bring certified Snowflake expertise and proven delivery methods to help enterprises cut spend, improve performance, and prepare their platforms for AI and advanced analytics.

Ready to make your Snowflake environment cost-efficient and future-proof?

‍

Snowflake FinOps FAQs

FAQs

What is Snowflake FinOps, and who should own it? +

Snowflake FinOps applies cloud FinOps principles specifically to Snowflake: making engineers, data teams, and finance jointly accountable for usage, budgets, and optimisation. It combines technical levers (warehouse design, query tuning, data lifecycle) with financial controls (budgets, alerts, chargeback/showback).

In practice, high-performing teams treat FinOps as a shared function between a data platform lead, finance/controlling, and business owners.

Do I need a Snowflake consultant to optimise costs? +

Smaller teams can often get early wins with basic best practices, but once Snowflake spend reaches tens of thousands per month, involves multiple business units, or sits ahead of a major renewal or migration, a specialised Snowflake consultant usually pays for itself through avoided waste and better contract decisions.

How can I estimate Snowflake costs before a migration or renewal? +

You can use the ROI calculator with a Snowflake consultant . You can also estimate your Snowflake costs yourself, but you need to (1) baseline current or expected workloads, (2) translate them into warehouse sizes, runtimes, and storage growth, and (3) compare on-demand vs. capacity scenarios with your real credit price.

Many pricing guides recommend modelling this by workload type (batch, BI, AI, data sharing) rather than using a single “average” number.

Why does my Snowflake cost keep growing? +

For most teams, 80–90% of Snowflake spend comes from compute, and that’s where costs quietly creep up: oversized warehouses, no auto-suspend, inefficient queries, long-running jobs, and generous Time-Travel or Fail-safe settings that bloat storage.

Does Snowflake charge for data transfer? +

Yes. Snowflake does not charge to bring data into your account, but it does charge per-byte for data egress when you move data to another region or another cloud, and those charges vary by region.

When Snowstack designs multi-region or cross-cloud setups as part of Snowflake consulting & architecture , data placement and sharing patterns are modelled up front so your FinOps plan includes egress, not just warehouse and storage costs.

Why is Snowflake so expensive for some teams? +

Snowflake feels expensive when usage patterns are inefficient, not because the platform is inherently overpriced. Common cost drivers are always-on or oversized warehouses, multi-cluster configurations that never scale down, poorly tuned queries, overly long Time-Travel retention, and cross-region data movement.

Talk to Snowflake consultants for guidance.

Cloud

5 min read

How Snowflake is different from other databases: 3 architecture advantages for modern data teams

Some companies still run databases like it’s 1999. Others have adopted cloud-native architectures that cut costs in half and double performance. Guess who’s winning?

Traditional databases force a trade-off between performance and budget. Collaboration still means passing around CSVs. Forward-thinking organizations have shifted to Snowflake’s cloud-native architecture, which scales instantly, operates securely, and keeps costs under control. But what truly sets Snowflake apart from traditional databases or even other cloud data platforms?

In this blog, we’ll break down three key architectural advantages that make Snowflake a game-changer for businesses that want to migrate to the cloud.

But first, what is a cloud-native database?

A cloud-native database is designed from the ground up for the cloud. Unlike traditional databases that were adapted from on-premise systems, cloud-native platforms are purpose-built to take advantage of the cloud’s strengths: scalability, flexibility, and resilience.

They scale horizontally by adding capacity in parallel instead of relying on bigger machines. They automatically adjust resources up or down based on demand, so you only pay for what you use. They also come with built-in high availability through data replication and automated recovery.

In short, a cloud-native database removes the rigid trade-offs of legacy systems and gives modern businesses the performance, efficiency, and reliability they need to stay competitive.

Snowflake's architecture: 3 strategic advantages

Snowflake isn’t just faster or cheaper. It’s built differently. The three architectural choices below explain why modern data teams trust Snowflake to scale, collaborate, and deliver insights in ways legacy systems never could.

1. Separation of storage and compute: elasticity without trade-offs

Most databases tie storage and compute together. Need more power to run quarterly reports? You’ll also pay for storage you don’t use. Want to keep historical data at a lower cost? You’re still paying for compute you don’t actually need.

Snowflake's Solution: Snowflake's architecture fundamentally decouples storage and compute layers, creating unprecedented flexibility for modern data teams.

You can scale compute resources up or down independently of your data storage.
Multiple workloads (e.g., data ingestion, analytics queries, and reporting) can run simultaneously on isolated compute clusters without performance conflicts.
You can assign different warehouses (compute clusters) to different teams or departments without worrying about concurrency issues or resource contention.

Business impact: Imagine a BI team that runs heavy dashboards while a data science team trains models on the same data. The beauty behind this separation is that both can operate without stepping on each other’s toes. This translates to faster time-to-insight, cost control, and happy teams who aren’t waiting for resources to free up.

2. Multi-cluster shared data architecture: built for collaboration and scale

Traditional databases become performance challenge as more users access the system. Query response times degrade, teams queue for resources, and data silos emerge as different departments seek workarounds.

Snowflake's Solution: Snowflake’s multi-cluster shared data model allows any number of users and tools to access the same single source of truth without performance degradation. The platform automatically manages concurrency through intelligent multi-cluster compute scaling.

What this means for data teams:

Unlimited concurrency: Teams don’t have to wait in line to access the warehouse. Snowflake automatically adds compute clusters as needed and scales them back down when demand drops.
Cross-team collaboration: Data Engineers, analysts, and ML engineers can work off the same dataset in real time, using SQL, Python, or third-party tools.
Data sharing across organizations: Snowflake’s architecture supports secure data sharing with external partners or vendors without copying or moving data. You simply grant access.

Business impact: This makes Snowflake not just a warehouse but a collaboration platform for data. Whether your team is distributed across continents or collaborating with external partners, Snowflake enables fast, consistent, and secure access to data.

3. Zero management with cloud-native infrastructure

Managing a traditional database means dealing with provisioning, tuning, indexing, patching, and more. These tasks require specialized DBAs and often lead to downtime, delays, and human error.

Snowflake flips the script with a “zero-management” approach.

Thanks to its fully managed SaaS model:

No infrastructure to manage. Snowflake runs entirely in the cloud (on AWS, Azure, or GCP), abstracting away the underlying hardware.
Automatic tuning and optimization. No need to manually set indexes or optimize queries, Snowflake handles that under the hood.
Security and compliance out of the box. Features like automatic encryption, role-based access control, and compliance with standards (HIPAA, GDPR, SOC 2) are built-in.

Business impact: This lets your team focus on data and insights, not on maintenance. IT teams no longer need to waste time on low-value operational tasks. Instead, they can accelerate innovation and reduce costs.

Snowflake vs. the competition: why architecture matters

In 2025, your data architecture is more than a technical choice. It is a strategic decision that defines how quickly your organization can compete, innovate, and scale. When you compare modern data platforms, Snowflake's architectural advantages become clear when compared to alternatives:

How Snowflake’s architecture drives results?

Snowflake’s architecture solves the trade-offs that hold traditional databases back and delivers flexibility that many cloud platforms still lack. But technology alone is not enough. The difference comes from how you implement it.

Take the case of a $200M pharmaceutical distributor. Their teams were stuck with siloed on-prem systems, compliance risks, and reports that took hours to run. Our Snowflake-certified experts helped them migrate to Snowflake’s cloud-native architecture with a single governed data layer, dedicated compute clusters, and built-in role-based access. In just 90 days, reporting was 80% faster, the architecture was ready for AI and advanced analytics, and teams finally worked from the same source of truth.

👉 Read the full case study here

Making Snowflake’s architecture work for your business

Every organization’s data challenges look different, but the goal is the same: to turn Snowflake into a platform that delivers measurable results. That’s where Snowstack comes in. We bring proven experience from complex projects in finance, pharma, and FMCG. This gives clients confidence that their architecture is designed for scale, collaboration, and compliance from day one. Our role goes beyond implementation. We act as a long-term partner who helps data teams adapt, optimize, and grow with Snowflake as business needs evolve.

Are you getting the full value from Snowflake’s architecture?

‍

FAQs

Snowflake's architecture separates storage and compute into independent layers, unlike traditional databases that tightly couple these resources. This means you can scale processing power without paying for additional storage, and store massive amounts of data without impacting query performance. Snowflake also provides unlimited concurrency through multi-cluster compute, automatic optimization, and zero infrastructure management.

Snowflake focuses on data warehousing, BI, and analytics with SQL-first approach and zero management overhead . Databricks specializes in data science, machine learning, and complex analytics with notebook-based development. Check our blog to explore the differences.

Snowflake uses a consumption-based pricing model with separate charges for storage and compute . You pay for data storage based on the amount stored (compressed), and compute costs based on the size and duration of warehouse usage. Credits are consumed only when warehouses are actively running queries. Check our blog to find out how you can optimize your data warehouse costs.

No, Snowflake is a cloud-native platform that runs exclusively on AWS, Azure, and Google Cloud Platform. However, this cloud-only approach is actually an advantage. It eliminates the infrastructure management overhead, provides automatic scaling, and ensures you always have access to the latest features and security updates without manual maintenance.

Yes, Snowflake natively supports semi-structured and unstructured data formats including JSON, XML, Parquet, Avro, and even binary data like images and documents.

Implementation timelines vary based on data complexity and organizational requirements. Simple migrations can be completed in 4–8 weeks, while comprehensive enterprise transformations typically take 3–6 months. Using proven frameworks and experienced implementation partners like Snowstack can significantly accelerate timelines while reducing risks and ensuring best practices from the start.

Snowflake runs natively on AWS, Azure, and Google Cloud Platform, using each cloud provider's infrastructure while maintaining a consistent experience across all platforms. You can even replicate data across different cloud regions or providers for disaster recovery and compliance requirements. Snowflake handles all the underlying infrastructure complexity, so you focus on your data, not cloud management.

Yes, Snowflake integrates with SQL Server, Oracle, and virtually any database through various methods: direct connectors, ETL tools like Fivetran or Informatica, custom APIs, and batch file transfers. Many organizations use Snowflake as their central data warehouse while keeping operational systems on SQL Server or Oracle, replicating data through automated pipelines.

Governance

5 min read

Best practices for protecting your data: Snowflake role hierarchy

In this article, we’ll break down the Snowflake Role Hierarchy, explain why it matters, and share best practices for structuring roles that support security, compliance, and day-to-day operations.

What is Snowflake’s role hierarchy?

Core components of Snowflake RBAC

Roles: The fundamental building blocks that encapsulate specific privileges
Privileges: Defined levels of access to securable objects (databases, schemas, tables)
Users: Identities that can be assigned roles to access resources
Securable Objects: Entities like databases, tables, views, and warehouses that require access control
Role Inheritance: The mechanism allowing roles to inherit privileges from other roles

Understanding Snowflake's system-defined roles

Understanding the default role structure is crucial for building secure hierarchies:

ACCOUNTADMIN

Root-level access to all account operations
Can view and manage billing and credit data
Should be tightly restricted to emergency use only
Not a "superuser" - still requires explicit privileges for data access

SYSADMIN

Full control over database objects and users
Recommended parent for all custom roles
Manages warehouses, databases, and schemas

SECURITYADMIN

Manages user and role grants
Controls role assignment and privilege distribution
Essential for maintaining RBAC governance

Custom roles

Created for specific teams or functions within an organization (e.g ANALYST_READ_ONLY, ETL_WRITER).

Best practices for designing a secure Snowflake role hierarchy

A well-structured role hierarchy minimizes risk, supports compliance, and makes onboarding/offboarding easier. Here’s how one should do it right:

1. Follow the Principle of Least Privilege

Grant only the minimum required permissions for each role to perform its function. Avoid blanket grants like GRANT ALL ON DATABASE.

Do this:

Specific, targeted grants
Avoid cascading access down the role tree unless absolutely needed
Regularly audit roles to ensure they align with actual usage

GRANT SELECT ON TABLE SALES_DB.REPORTING.MONTHLY_REVENUE TO ROLE ANALYST_READ;
GRANT USAGE ON SCHEMA SALES_DB.REPORTING TO ROLE ANALYST_READ;
GRANT USAGE ON DATABASE SALES_DB TO ROLE ANALYST_READ;

Not this:

Overly broad permissions

GRANT ALL ON DATABASE SALES_DB TO ROLE ANALYST_READ;

Why does it matter?

Least privilege prevents accidental (or malicious) misuse of sensitive data. It also supports data governance and compliance with various regulations like GDPR or HIPAA.

2. Use a layered role design

Design your roles using a layered and modular approach, often structured like this:

Functional Roles (what the user does):

CREATE ROLE ANALYST_READ;
CREATE ROLE ETL_WRITE;
CREATE ROLE DATA_SCIENTIST_ALL;

Environment Roles (where the user operates)

CREATE ROLE DEV_READ_WRITE;
CREATE ROLE PROD_READ_ONLY;

Composite or Team Roles (Group users by department or team, assigning multiple functional/environment roles under one umbrella)

CREATE ROLE MARKETING_TEAM_ROLE → includes PROD_READ_ONLY + ANALYST_READ

3. Avoid granting privileges directly to users

Always assign privileges to roles and not users. Then, assign users to those roles.

Why it matters?

This keeps access transparent and auditable. If a user leaves or changes teams, simply revoke or change the role. There’s no need to hunt down granular permissions.

4. Establish consistent naming conventions

Enforce naming conventions as consistent role and object naming makes automation and governance far easier to scale.

Recommended Naming Pattern:

Access Roles: {ENV}_{DATABASE}_{ACCESS_LEVEL} (e.g., PROD_SALES_READ)
Functional Roles: {FUNCTION}_{TEAM} (e.g., DATA_ANALYST, ETL_ENGINEER)
Service Roles: {SERVICE}_{PURPOSE}_ROLE (e.g., FIVETRAN_LOADER_ROLE)

5. Use separate roles for Administration vs. Operations

Split roles that manage infrastructure (e.g., warehouses, roles, users) from roles that access data.

Admins: SYSADMIN, SECURITYADMIN‍
Data teams: DATA_ENGINEER_ROLE, ANALYST_ROLE, etc.

6. Secure the top-level roles

Roles like ACCOUNTADMIN and SECURITYADMIN should be assigned to the fewest people possible, protected with MFA, and monitored for any usage.

Implementation Checklist:

Limit ACCOUNTADMIN to 2-3 emergency users maximum
Enable MFA for all administrative accounts
Set up monitoring and alerting for admin role usage
Regular access reviews and privilege audits
Document and justify all administrative access

Monitoring, auditing & compliance: keeping your Snowflake hierarchy healthy

Even the best-designed role trees can get messy over time. Here’s how to maintain security:

1. Regular access reviews

Implement quarterly access reviews to maintain security hygiene:

Role Effectiveness Analysis: Identify unused or over-privileged roles
User Access Validation: Verify users have appropriate role assignments
Privilege Scope Review: Ensure roles maintain least privilege principles
Compliance Mapping: Document role mappings to business functions

2. Logging and monitoring

Enable Access History and Login History in Snowflake to track activity and implement automation tools for role assignments during employee transitions.

3. Onboarding/offboarding automation

Implement automation tools or scripts to efficiently manage role assignments during employee transitions.

4. Object Tagging for enhanced security

Use object tagging to classify sensitive data and control access accordingly.

Measuring RBAC Success: Key Performance Indicators

1. Security Metrics

Access Review Coverage: % of roles reviewed quarterly
Privilege Violations: Number of excessive privilege grants identified
Failed Authentication Attempts: Monitor for unauthorized access patterns
Role Utilization Rate: % of active roles vs. total created roles

2. Operational Metrics

User Onboarding Time: Average time to provision new user access
Role Management Efficiency: Time to modify/update role permissions
Audit Response Time: Speed of access review and remediation
Automation Coverage: % of role operations automated vs. manual

3. Compliance Metrics

SOC 2 Readiness: Role hierarchy documentation completeness
GDPR/Data Privacy: Data access control effectiveness
Industry Compliance: Sector-specific requirement adherence
Change Management: Role modification approval and documentation

Future-Proofing Your RBAC Strategy

Don’t wait for the next breach to expose the cracks in your access controls. Let’s design an RBAC strategy that keeps you secure, compliant, and future-ready.

👉 Book a free RBAC assessment

‍

FAQs

Always assign privileges to roles and not users. Then, assign users to those roles. This keeps access transparent and auditable.

Design with hierarchy in mind – role ownership and grant structure should align with your intended control model. Map business functions to role layers and ensure clear inheritance paths.

Yes, automation is critical for scaling. Create stored procedures for role provisioning, use CI/CD pipelines for role deployment, and integrate with identity providers for user lifecycle management.

Finance

5 min read

Choosing the right Snowflake partner: what to look for in 2025

In 2025 Snowflake is more than a database. It has become the foundation for data, AI, and applications. With almost 10,000 active Snowflake customers** globally and more than 850 certified services partners, the challenge isn't finding a partner. It's finding the right partner who can deliver tangible results while building a sustainable, cost-effective data platform.

In this blog, we outline the key criteria to evaluate when selecting a Snowflake partner in 2025 and explain how the choice you make will directly shape the success of your data initiatives.

What is a Snowflake consulting partner?

A Snowflake consulting partner is a certified services provider that specializes in implementing, optimizing, and managing Snowflake's Data Cloud platform. These partners range from global system integrators managing petabyte-scale deployments to boutique firms focusing on specific industries or Snowflake features.

Snowstack is built for this role. As a Snowflake-first partner, our focus is entirely on helping organizations succeed with the platform. We design and deliver environments that are secure, cost-efficient, and ready for AI. Because we focus exclusively on Snowflake, we bring a level of technical depth, delivery discipline, and industry knowledge that generalist consultancies cannot match.

Best criteria for selecting your Snowflake partner in 2025:

In 2025, not every Snowflake partner delivers the same results. Your choice can determine whether your data projects drive real business value or slip into delays, cost overruns, and a loss of confidence across the organization. Here is what to look for when evaluating a partner’s approach:

1. Delivery methodology as the deciding factor

The single biggest predictor of Snowflake implementation success isn't the partner's brand recognition or size. It's how they deliver. In our analysis of successful Snowflake projects, delivery methodology consistently emerges as the most critical differentiator.

Ask prospective partners:

What is their delivery rhythm? Look for agile methodologies with short, business-visible delivery cycles rather than waterfall approaches with big reveals at the end
How do they balance technical debt vs. time to market? The best partners prioritize early wins while building sustainable architecture
Do they work in short iterations with quick business feedback? Partners should deliver "first dashboard live in 4 weeks" rather than 6-month black box projects
Can they balance governance and speed? Avoid partners who treat governance as an afterthought or create excessive bottlenecks

What to look for: Partners with repeatable, transparent, and well-documented processes that adapt to your internal structure while maintaining consistent quality standards.

2. Snowflake-native thinking vs. generic cloud advice

The difference between Snowflake specialists and generalist cloud consultants becomes evident in architecture decisions, cost optimization strategies, and feature utilization.

Depth of platform knowledge matters:

Do they understand Snowflake's native capabilities? Look for expertise in Streams & Tasks, Snowpark, Secure Sharing, Cortex AI, and Dynamic Tables
Do they optimize for platform strengths? The best partners design for Snowflake's unique architecture rather than forcing legacy patterns
Can they demonstrate platform-specific know-how? Ask about credit optimization, role hierarchy design, cost guardrails, and performance tuning strategies
Are they current with latest features? Snowflake releases new capabilities quarterly partners should stay updated

Evaluation technique: Ask candidates to walk through a specific Snowflake architecture decision and explain their reasoning. Generic answers reveal generalist thinking.

A leading financial services firm was spending more than 800,000 dollars per month on cloud costs with little visibility into where the money was going. Within 90 days, we delivered a governed Snowflake platform that reduced data ingestion latency by 80%, enabled AI readiness, and put full cost controls in place.

3. Time to value: shipping early and often

The era of 6-month data projects with big reveals is over. Modern Snowflake implementations should deliver value incrementally, building momentum and stakeholder confidence throughout the process.

Measurement criteria: Ask to see examples of their delivery cadence, backlog management practices, and documentation standards. Partners should have concrete examples of incremental value delivery. For instance, one of our clients, a regional pharma distributor, moved from legacy on-premises systems to a Snowflake-native platform. Instead of a single large rollout, we delivered in focused iterations. Dashboards came first, followed by finance and supply chain integrations, and advanced governance policies were in place before production go-live. This approach kept stakeholders engaged and satisfied.

5. Team structure and location strategy

The 2025 landscape offers multiple delivery models, each with distinct advantages and trade-offs. However critical questions beyond geography:

Will you get named engineers or a rotating bench? Consistency matters for knowledge retention
Is there a lead you can trust? Avoid partners who channel everything through project managers without technical depth
How do they ensure knowledge retention over time? Look for documentation practices and handover procedures

6. Embedded Support vs. one-and-done projects

Snowflake is a living platform that evolves continuously. Your partner relationship shouldn't end at go-live. Successful implementations require ongoing optimization, new source integration, and platform evolution support.

Post-implementation needs include:

Onboarding new data sources as business requirements evolve
Evolving data models based on changing business logic
Performance optimization as data volumes and user counts grow
Feature adoption as Snowflake releases new capabilities
Cost optimization through usage pattern analysis

Partner support models to evaluate:

Embedded engineers: Dedicated resources working as extended team members
Managed services: Full platform management with SLA guarantees
Retainer arrangements: On-demand expertise for specific needs
Training and enablement: Knowledge transfer to build internal capabilities

Key consideration: Partners offering only project-based work may leave you stranded when you need ongoing support most. Unlike project-only vendors, our experts stay engaged long after go-live. Our model ensures that as your data platform grows, you have continuous access to the same experts who built it, ready to integrate new sources, optimize costs, and adopt new Snowflake features.

7. Governance, cost control, and trust

Platform ownership extends far beyond delivering functional pipelines. Successful Snowflake implementations require robust governance frameworks, proactive cost management, and enterprise-grade security practices.

Essential governance capabilities:

Role-based access control and masking policies aligned with your security requirements
Cost observability and alerting systems to prevent budget surprises
Compliance framework alignment (SOC 2, GDPR, HIPAA, PCI-DSS)
CI/CD and documentation practices for long-term maintainability
Data quality and lineage tracking for trustworthy analytics

Without a solid governance foundation, a Snowflake platform may appear to work at first but will not scale sustainably. In our blog you can explore this topic in depth, but here is a snapshot of the cost control practices we recommend.

Warehouse auto-suspend and auto-resume configuration
Query result caching optimization
Clustering key recommendations
Storage optimization strategies
Credit usage monitoring and alerting

8. AI Readiness and responsible adoption

Snowflake is rapidly evolving into a core platform for AI and machine learning, but realizing its potential requires more than connecting models to data. Successful implementations demand partners who can design secure, scalable, and responsible AI foundations inside Snowflake.

Essential AI readiness capabilities:

Integration of Cortex AI for LLM-based applications with enterprise controls
Snowpark ML workflows for efficient model training and deployment
Feature store design for consistent and reusable machine learning pipelines
AI governance frameworks to manage bias, privacy, and ethical use

Without a clear AI strategy built on trusted data, organizations face wasted investment, compliance risks, and a loss of stakeholder confidence. One regional pharma distributor overcame these challenges by migrating to Snowflake with us. With Snowpark ML workflows and governed feature stores, they got accurate demand forecasting and optimized their supply chain while ensuring responsible AI adoption.

Industry-Specific Considerations

Different industries have unique requirements that affect partner selection:

Financial Services: Emphasis on regulatory compliance, data residency, audit trails, and risk management frameworks.

Healthcare & Life Sciences: Focus on HIPAA compliance, data privacy, clinical data standards, and FDA validation support.

Manufacturing: Requirements for IoT data integration, real-time analytics, supply chain optimization, and operational intelligence.

Retail & E-commerce: Need for customer 360 views, real-time personalization, inventory optimization, and marketing analytics.

Technology Companies: Emphasis on developer productivity, API integrations, event streaming, and product analytics.

Snowflake partner red flags to avoid in 2025

Watch for these warning signs during partner evaluation:

1. Methodology Red Flags	2. Technical Red Flags	3. Operational Red Flags	4. Cultural Red Flags
• Cannot articulate clear delivery methodology • No examples of iterative delivery • Promises unrealistic timelines • Treats governance as optional or “phase 2”	• Generic cloud advice without Snowflake-specific insights • Limited knowledge of recent Snowflake features • Cannot demonstrate cost optimization strategies • No examples of performance tuning success	• Lack of transparency about team structure • No named resources or clear escalation paths • Poor references from similar-sized implementations • Inflexible contract terms or scope definitions	• Poor communication during sales process • Misaligned expectations about collaboration style • No industry-specific examples or case studies • Dismissive of your current technology investments

‍

Who is the right Snowflake partner for you and your business in 2025?

Most data migrations don’t fail because of the technology. They fail because of poor execution and weak partner choices. When projects stall, the real cost is not just overspending. It is delayed initiatives, frustrated stakeholders, and lost confidence in the value of data.

In 2025, choosing a Snowflake partner is no longer about ticking boxes for certifications or chasing the lowest cost. It is a strategic decision that will shape whether your data initiatives deliver real business impact or fall short. At Snowstack, we combine deep Snowflake expertise with proven delivery methods, transparent team structures, and a focus on long-term governance and optimization. We help organizations move beyond one-off implementations to build scalable, AI-ready platforms that deliver measurable results and lasting trust in data.

👉 Book a strategy session with our experts now.

‍

FAQs

Look for partners with multiple SnowPro Core, Advanced, and Solution Architect certified professionals. More important than individual certifications is demonstrated project success, current platform knowledge, and proven delivery methodology. Ask for specific examples of recent implementations and results achieved.

Industry experience can be valuable but isn't always essential. Partners with deep Snowflake expertise can often adapt to new industries effectively. However, for highly regulated industries (healthcare, financial services) or complex compliance requirements, industry-specific experience becomes more critical.

Good partners offer multiple support options: embedded engineers, managed services, retainer arrangements, or training programs. Clarify support expectations upfront, including response times, escalation procedures, and ongoing optimization services. Avoid partners who only offer project-based work without ongoing support options.

Request technical deep-dive sessions where partners walk through specific Snowflake architecture decisions, cost optimization strategies, and performance tuning approaches. Ask for code samples, architecture diagrams, and examples of problem-solving in previous implementations. Consider conducting a limited proof-of-concept with top candidates.

‍

Stay up to date

Top data insights, delivered to your inbox

Oops! Something went wrong while submitting the form.

Snowflake savings calculator

Save on your Snowflake costs

Use our Snowflake Savings Calculator to cut costs, boost efficiency, and drive higher profitability.

Calculate my savings

Transform your data with Snowflake

You don't need to hire a data army or wait months to see results. Our Snowflake specialists will get you up and running fast, so you can make better decisions, cut costs, and beat competitors who are still stuck with spreadsheets and legacy systems.

Learn more