Manual clicks don’t scale. As Snowflake environments grow, managing them through the UI or ad-hoc scripts quickly leads to drift, blind spots, and compliance risks. What starts as a quick fix often becomes a challenge that slows delivery and exposes the business to security gaps.
Infrastructure as Code with Terraform solves these challenges by bringing software engineering discipline to Snowflake management. Using Terraform’s declarative language, engineers define the desired state of their Snowflake environment, track changes with version control, and apply them consistently across environments. Terraform communicates with Snowflake’s APIs through the official snowflakedb/snowflake provider, translating configuration into the SQL statements and API calls that keep your platform aligned and secure.
This guide provides a complete walkthrough of how to manage Snowflake with Terraform, from provisioning core objects like databases, warehouses, and schemas to building scalable role hierarchies and implementing advanced governance policies such as dynamic data masking.
Section 1: bootstrapping Terraform for secure Snowflake automation
The initial setup of the connection between Terraform and Snowflake is the most critical phase of the entire process. A secure and correctly configured foundation is paramount for reliable and safe automation. This section focuses on establishing this connection using production-oriented best practices, specifically tailored for non-interactive, automated workflows typical of CI/CD pipelines.
1.1 The principle of least privilege: the terraform service role
Terraform should not operate using a personal user account. Instead, a dedicated service user must be created specifically for Terraform automation. Before any Terraform code can be executed, a one-time manual bootstrapping process must be performed within the Snowflake UI or via SnowSQL. This involves using the ACCOUNTADMIN role to create the dedicated service user and grant it the high-level roles needed for Terraform's initial operations.
The following SQL statements create a TERRAFORM_SVC user and grant it the necessary system-defined roles:
-- Use the highest-level role to create users and grant system roles
USE ROLE ACCOUNTADMIN;
-- Create a dedicated service user for Terraform
-- Replace the RSA_PUBLIC_KEY placeholder with the public key generated in section 1.2
CREATE USER TERRAFORM_SVC
  COMMENT = 'Service user for managing Snowflake infrastructure via Terraform.'
  RSA_PUBLIC_KEY = '<YOUR_PUBLIC_KEY_CONTENT_HERE>';
-- Grant the necessary system roles to the Terraform service user
GRANT ROLE SYSADMIN TO USER TERRAFORM_SVC;
GRANT ROLE SECURITYADMIN TO USER TERRAFORM_SVC;
Granting SYSADMIN and SECURITYADMIN to the service user is a necessary starting point for infrastructure management. The SYSADMIN role holds the privileges required to create and manage account-level objects like databases and warehouses. The SECURITYADMIN role is required for managing security principals, including users, roles, and grants.
1.2 Authentication: the key to automation
The choice of authentication method is important. The Snowflake provider supports several authentication mechanisms, including basic password, OAuth, and key-pair authentication. For any automated workflow, especially within a CI/CD context, key-pair authentication is the industry-standard and recommended approach.
A CI/CD pipeline, such as one running in GitHub Actions, is a non-interactive environment. Basic password authentication is a significant security risk and is not recommended, while OAuth flows typically depend on token issuance and refresh that add friction to unattended runs. That leaves key-pair authentication as the method that is both highly secure, since no password is ever transmitted, and fully automatable.
The following table provides a comparative overview of the primary authentication methods available in the Snowflake provider, reinforcing the recommendation for key-pair authentication in production automation scenarios.
Table 1: Snowflake provider authentication methods

| Method | How it works | Suitability for automation |
| --- | --- | --- |
| Password | The service user's password is passed to the provider on every run | Not recommended; credentials must be stored and transmitted, offering no protection beyond the password itself |
| Key-pair (JWT) | The provider signs a short-lived token with a locally held RSA private key (authenticator = "SNOWFLAKE_JWT") | Recommended for CI/CD; no password is transmitted and the key can be passphrase-protected |
| OAuth | An access token obtained from an identity provider is presented to Snowflake | Workable, but token issuance and refresh add moving parts to non-interactive pipelines |
To implement key-pair authentication, an RSA key pair must be generated. The following openssl commands will create a 2048-bit private key in the required PKCS#8 format and its corresponding public key:
# Navigate to a secure directory, such as ~/.ssh
cd ~/.ssh
# Generate an unencrypted 2048-bit RSA private key in PKCS#8 format
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out snowflake_terraform_key.p8 -nocrypt
# Extract the public key from the private key
openssl rsa -in snowflake_terraform_key.p8 -pubout -out snowflake_terraform_key.pub
After generating the keys, the content of the public key file (snowflake_terraform_key.pub) must be copied into the CREATE USER statement from the previous step, replacing the placeholder, to associate it with the TERRAFORM_SVC user. Note that Snowflake expects only the key body: exclude the -----BEGIN PUBLIC KEY----- and -----END PUBLIC KEY----- delimiter lines. For enhanced security, the private key itself can be encrypted with a passphrase; the Snowflake provider supports this via the private_key_passphrase argument in the provider configuration.
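If the TERRAFORM_SVC user was created before the key pair existed, the key can be attached (or later rotated) with an ALTER USER statement. The snippet below is a minimal sketch; the key content is truncated and should be replaced with your own key body:

USE ROLE ACCOUNTADMIN;
-- Paste only the key body, without the BEGIN/END delimiter lines
ALTER USER TERRAFORM_SVC SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';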
1.3 Provider configuration: connecting Terraform to Snowflake
With the service user created and the key-pair generated, the final step is to configure the Snowflake provider in the Terraform project. This is typically done in a providers.tf
file.
The foundational configuration requires defining the snowflakedb/snowflake
provider and setting the connection parameters.
terraform {
  required_providers {
    snowflake = {
      source  = "snowflakedb/snowflake"
      version = "~> 1.0" // Best practice: pin to a major version to avoid breaking changes
    }
  }
}

provider "snowflake" {
  organization_name = var.snowflake_org_name
  account_name      = var.snowflake_account_name
  user              = var.snowflake_user // e.g., "TERRAFORM_SVC"
  role              = "SYSADMIN"         // Default role for the provider's operations
  authenticator     = "SNOWFLAKE_JWT"
  private_key       = var.snowflake_private_key
}
It is critical that sensitive values, especially the private_key, are never hardcoded in configuration files. The recommended approach is to define them as input variables marked as sensitive = true and supply their values through secure mechanisms such as environment variables (e.g., TF_VAR_snowflake_private_key) or integration with a secrets management tool like GitHub Secrets or AWS Secrets Manager.
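As a minimal sketch of that pattern, the input variables referenced by the provider block above could be declared as follows, with the private key marked as sensitive:

variable "snowflake_org_name" {
  type        = string
  description = "Snowflake organization name."
}

variable "snowflake_account_name" {
  type        = string
  description = "Snowflake account name within the organization."
}

variable "snowflake_user" {
  type        = string
  description = "Service user used by Terraform, e.g. TERRAFORM_SVC."
}

variable "snowflake_private_key" {
  type        = string
  sensitive   = true // Keeps the key out of plan output and logs
  description = "PKCS#8 private key used for key-pair authentication."
}

The key can then be injected at runtime, for example with export TF_VAR_snowflake_private_key="$(cat ~/.ssh/snowflake_terraform_key.p8)" in a local shell, or from a CI secret store in a pipeline.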
A common source of initial connection failures is incorrect identification of the organization_name and account_name. These values can be retrieved with certainty by executing SELECT CURRENT_ORGANIZATION_NAME(); and SELECT CURRENT_ACCOUNT_NAME(); in the Snowflake UI. Running these two queries up front takes seconds and avoids a frustrating round of trial and error.
For more mature IaC implementations that strictly adhere to the principle of least privilege, Terraform supports the use of aliased providers. This pattern allows multiple provider configurations to be defined within the same project, each assuming a different role. It mirrors Snowflake's own best practices, where object creation (SYSADMIN) is separated from security management (SECURITYADMIN).
The following example demonstrates how to configure aliased providers:
# Provider aliased as "sysadmin" for object creation (e.g., databases, warehouses)
provider "snowflake" {
  alias             = "sysadmin"
  organization_name = var.snowflake_org_name
  account_name      = var.snowflake_account_name
  user              = var.snowflake_user
  private_key       = var.snowflake_private_key
  authenticator     = "SNOWFLAKE_JWT"
  role              = "SYSADMIN"
}

# Provider aliased as "securityadmin" for security-related objects (e.g., roles, users, grants)
provider "snowflake" {
  alias             = "securityadmin"
  organization_name = var.snowflake_org_name
  account_name      = var.snowflake_account_name
  user              = var.snowflake_user
  private_key       = var.snowflake_private_key
  authenticator     = "SNOWFLAKE_JWT"
  role              = "SECURITYADMIN"
}
When using aliased providers, individual resource blocks must explicitly specify which provider to use via the provider meta-argument (e.g., provider = snowflake.securityadmin). This ensures that each resource is created with the minimum necessary privileges, enforcing a robust security posture directly within the code.
Section 2: provisioning core Snowflake infrastructure
Once the secure connection is bootstrapped, Terraform can be used to define and manage the fundamental building blocks of the Snowflake environment. This section provides code examples for creating databases, virtual warehouses, and schemas - the foundational components for any data workload.
2.1 Laying the foundation: databases
The database is the top-level container for schemas and tables in Snowflake. The snowflake_database resource is used to provision and manage these containers.
The following HCL example creates a primary database for analytics workloads, demonstrating the use of the aliased sysadmin provider and an optional parameter for data retention.
resource "snowflake_database" "analytics_db" {
provider = snowflake.sysadmin // Explicitly use the sysadmin provider for object creation
name = "ANALYTICS"
comment = "Primary database for analytics workloads managed by Terraform."
// Optional: Configure Time Travel data retention period.
// This setting can have cost implications.
data_retention_time_in_days = 30
}
A core strength of Terraform is its ability to manage dependencies implicitly through resource references. In this example, once the analytics_db resource is defined, other resources, such as schemas, can reference its attributes (e.g., snowflake_database.analytics_db.name).
2.2 Compute power: warehouses
Virtual warehouses are the compute engines in Snowflake, responsible for executing queries and data loading operations. The snowflake_warehouse resource provides comprehensive control over their configuration, enabling a balance between performance and cost.
This example defines a standard virtual warehouse for analytics and business intelligence tools, showcasing parameters for cost optimization and scalability.
resource "snowflake_warehouse" "analytics_wh" {
provider = snowflake.sysadmin
name = "ANALYTICS_WH"
comment = "Warehouse for the analytics team and BI tools."
// Define the compute capacity of the warehouse.
warehouse_size = "X-SMALL"
// Cost-saving measures: suspend the warehouse when idle.
auto_suspend = 60 // Suspend after 60 seconds of inactivity.
auto_resume = true
// Optional: Configure for multi-cluster for higher concurrency.
min_cluster_count = 1
max_cluster_count = 4
scaling_policy = "ECONOMY" // Prioritize conserving credits over starting clusters quickly.
}
The parameters in this resource directly impact both performance and billing. warehouse_size determines the raw compute power and credit consumption per second. auto_suspend is a critical cost-control feature, ensuring that credits are not consumed when the warehouse is idle. For workloads with high concurrency needs, the min_cluster_count, max_cluster_count, and scaling_policy parameters allow the warehouse to dynamically scale out to handle query queues, and then scale back in to conserve resources. Managing these settings via Terraform ensures that cost and performance policies are consistently applied and version-controlled.
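To illustrate how these parameters encode different cost and performance trade-offs, the following sketch defines a hypothetical second warehouse dedicated to batch loading; the name and sizing here are illustrative assumptions rather than part of the setup above:

resource "snowflake_warehouse" "loading_wh" {
  provider = snowflake.sysadmin
  name     = "LOADING_WH"
  comment  = "Warehouse dedicated to scheduled ELT and batch loading jobs."

  // Larger nodes for heavier transformations; a single cluster is usually
  // enough because batch jobs run sequentially rather than concurrently.
  warehouse_size    = "MEDIUM"
  min_cluster_count = 1
  max_cluster_count = 1

  // Batch jobs tolerate a brief resume delay, so suspend aggressively.
  auto_suspend = 60
  auto_resume  = true
}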
2.3 Organizing your data: schemas
Schemas are logical groupings of database objects like tables and views within a database. The snowflake_schema resource is used to create and manage these organizational units.
The following HCL creates a RAW schema within the ANALYTICS database defined earlier.
resource "snowflake_schema" "raw_data" {
provider = snowflake.sysadmin
// Create an explicit dependency on the database resource.
database = snowflake_database.analytics_db.name
name = "RAW"
comment = "Schema for raw, unprocessed data ingested from source systems."
}
It is important to note that when a new database is created in Snowflake, it automatically includes a default schema named PUBLIC. While this schema is created outside of Terraform's management, administrators should be aware of its existence. For environments that require strict access control, it is common practice to immediately revoke all default privileges from the PUBLIC schema so it is not used inadvertently. Terraform can help manage this if desired, but the schema itself will not be in the Terraform state unless explicitly imported, as shown in the sketch below.
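One possible approach, sketched here under the assumption of Terraform 1.5+ import blocks, is to bring the PUBLIC schema under Terraform management so its privileges can be governed, or the schema dropped, in code. Verify the exact import ID format against the snowflake_schema documentation for your provider version:

// Import the automatically created PUBLIC schema into Terraform state.
import {
  to = snowflake_schema.public
  id = "\"ANALYTICS\".\"PUBLIC\""
}

resource "snowflake_schema" "public" {
  provider = snowflake.sysadmin
  database = snowflake_database.analytics_db.name
  name     = "PUBLIC"
}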
Section 3: mastering access control with role hierarchies
Effective access control is a cornerstone of data governance and security. Snowflake's Role-Based Access Control (RBAC) model is exceptionally powerful, particularly its support for role hierarchies. Managing this model via Terraform provides an auditable, version-controlled, and scalable approach to permissions management. This section details how to construct a robust RBAC framework using a best-practice model of functional and access roles.
3.1 The building blocks: creating account roles
The foundation of the RBAC model is the creation of roles. A recommended pattern is to create two distinct types of roles:
- Functional roles: These roles represent a job function or persona, such as DATA_ANALYST or DATA_ENGINEER. Users are granted these roles.
- Access roles: These roles represent a specific set of privileges on a specific set of objects, such as SALES_DB_READ_ONLY or RAW_SCHEMA_WRITE. These roles are granted to functional roles, not directly to users.
This separation decouples users from direct permissions, making the system vastly more scalable and easier to manage. The snowflake_account_role resource is used to create both types of roles.
// Define a functional role representing a user persona.
resource "snowflake_account_role" "data_analyst" {
  provider = snowflake.securityadmin // Use the securityadmin provider for role management
  name     = "DATA_ANALYST"
  comment  = "Functional role for users performing data analysis and reporting."
}

// Define an access role representing a specific set of privileges.
resource "snowflake_account_role" "analytics_db_read_only" {
  provider = snowflake.securityadmin
  name     = "ANALYTICS_DB_READ_ONLY"
  comment  = "Grants read-only access to all objects in the ANALYTICS database."
}
3.2 Constructing the hierarchy: granting roles to roles
The true power of Snowflake's RBAC model is realized by creating hierarchies of roles. By granting access roles to functional roles, a logical and maintainable privilege structure is formed. If a data analyst needs access to a new data source, the corresponding access role is granted to the DATA_ANALYST functional role once, rather than granting privileges to every individual analyst. This pattern is essential for managing permissions at scale.
The snowflake_grant_account_role resource is used to create these parent-child relationships between roles. It is important to use this resource, as the older snowflake_role_grants resource is deprecated.
The following example demonstrates how to grant the ANALYTICS_DB_READ_ONLY access role to the DATA_ANALYST functional role, and then nest the functional role under the system-defined SYSADMIN role to complete the hierarchy.
// Grant the access role to the functional role.
// This gives all members of DATA_ANALYST the privileges of ANALYTICS_DB_READ_ONLY.
resource "snowflake_grant_account_role" "grant_read_access_to_analyst" {
  provider         = snowflake.securityadmin
  role_name        = snowflake_account_role.analytics_db_read_only.name
  parent_role_name = snowflake_account_role.data_analyst.name
}

// Grant the functional role to SYSADMIN to create a clear role hierarchy.
// This allows system administrators to manage and assume the functional role.
resource "snowflake_grant_account_role" "grant_analyst_to_sysadmin" {
  provider         = snowflake.securityadmin
  role_name        = snowflake_account_role.data_analyst.name
  parent_role_name = "SYSADMIN"
}
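Completing the chain, functional roles are what individual users actually receive. As a hedged illustration, the same snowflake_grant_account_role resource can grant the functional role to a user; JANE_DOE below is a hypothetical user that would normally be managed via a snowflake_user resource or SCIM provisioning:

// Assign the functional role to an individual user (hypothetical user name).
resource "snowflake_grant_account_role" "grant_analyst_to_jane" {
  provider  = snowflake.securityadmin
  role_name = snowflake_account_role.data_analyst.name
  user_name = "JANE_DOE"
}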
3.3 Assigning privileges to access roles
With the role structure in place, the final step is to grant specific object privileges to the access roles. The snowflake_grant_privileges_to_account_role resource is a consolidated and powerful tool for this purpose. This resource has evolved significantly in the Snowflake provider; older versions required separate grant resources for each object type (e.g., snowflake_database_grant), which resulted in verbose and repetitive code. The modern resource uses a more complex but flexible block structure (on_account_object, on_schema, etc.) to assign privileges. Users migrating from older provider versions may find this a significant but worthwhile refactoring effort.
This example grants the necessary USAGE and SELECT privileges to the ANALYTICS_DB_READ_ONLY access role.
// Grant USAGE privilege on the database to the access role.
resource "snowflake_grant_privileges_to_account_role" "grant_db_usage" {
  provider          = snowflake.securityadmin
  account_role_name = snowflake_account_role.analytics_db_read_only.name
  privileges        = ["USAGE"]

  on_account_object {
    object_type = "DATABASE"
    object_name = snowflake_database.analytics_db.name
  }
}

// Grant USAGE privilege on the schema to the access role.
resource "snowflake_grant_privileges_to_account_role" "grant_schema_usage" {
  provider          = snowflake.securityadmin
  account_role_name = snowflake_account_role.analytics_db_read_only.name
  privileges        = ["USAGE"]

  on_schema {
    // Use the fully_qualified_name for schema-level objects.
    schema_name = snowflake_schema.raw_data.fully_qualified_name
  }
}

// Grant SELECT on all existing tables in the schema.
resource "snowflake_grant_privileges_to_account_role" "grant_all_tables_select" {
  provider          = snowflake.securityadmin
  account_role_name = snowflake_account_role.analytics_db_read_only.name
  privileges        = ["SELECT"]

  on_schema_object {
    all {
      object_type_plural = "TABLES"
      in_schema          = snowflake_schema.raw_data.fully_qualified_name
    }
  }
}

// Grant SELECT on all FUTURE tables created in the schema.
resource "snowflake_grant_privileges_to_account_role" "grant_future_tables_select" {
  provider          = snowflake.securityadmin
  account_role_name = snowflake_account_role.analytics_db_read_only.name
  privileges        = ["SELECT"]

  on_schema_object {
    future {
      object_type_plural = "TABLES"
      in_schema          = snowflake_schema.raw_data.fully_qualified_name
    }
  }
}
A particularly powerful feature demonstrated here is the use of the future block. Granting privileges on future objects ensures that the access role will automatically have the specified permissions on any new tables created within that schema. This dramatically reduces operational overhead, as permissions do not need to be manually updated every time a new table is deployed. However, it is important to understand Snowflake's grant precedence: future grants defined at the schema level always take precedence over those defined at the database level. This can lead to "insufficient privilege" errors if not managed carefully across different roles and grant levels.
3.4 An optional "Audit" role for bypassing data masks
In certain scenarios, such as internal security audits or compliance reviews, it may be necessary for specific, highly-trusted users to view data that is normally protected by masking policies. Creating a dedicated "audit" role for this purpose provides a controlled and auditable mechanism to bypass data masking when required.
This role should be considered a highly privileged functional role and granted to users with extreme care.
// Define a special functional role for auditing PII data.
resource "snowflake_account_role" "pii_auditor" {
  provider = snowflake.securityadmin
  name     = "PII_AUDITOR"
  comment  = "Functional role for users who need to view unmasked PII for audit purposes."
}
Crucially, creating this role is not enough. For it to be effective, every relevant masking policy must be explicitly updated to include logic that unmasks data for members of the PII_AUDITOR role. This ensures that the ability to view sensitive data is granted on a policy-by-policy basis. An example of how to incorporate this audit role into a masking policy is shown in the following section.
Section 4: advanced data governance with dynamic data masking
Moving beyond infrastructure provisioning, Terraform can also codify and enforce sophisticated data governance policies. Snowflake's Dynamic Data Masking is a powerful feature for protecting sensitive data at query time. By managing these policies with Terraform, organizations can ensure that data protection rules are version-controlled, auditable, and consistently applied across all environments.
4.1 Defining the masking logic
A masking policy is a schema-level object containing SQL logic that determines whether a user sees the original data in a column or a masked version. The decision is made dynamically at query time based on the user's context, most commonly their active role.
The snowflake_masking_policy resource is used to define this logic. The policy's body contains a CASE statement that evaluates the user's session context and returns the appropriate value.
The following example creates a policy to mask email addresses for any user who is not in the DATA_ANALYST or PII_AUDITOR role.
resource "snowflake_masking_policy" "email_mask" {
provider = snowflake.sysadmin // Policy creation often requires SYSADMIN or a dedicated governance role n
ame = "EMAIL_MASK"
database = snowflake_database.analytics_db.name
schema = snowflake_schema.raw_data.name
// Defines the signature of the column the policy can be applied to.
// The first argument is always the column value to be masked.
argument {
name = "email_val"
type = "VARCHAR" }
// The return data type must match the input data type.
return_type = "VARCHAR"
// The core masking logic is a SQL expression.
body = <<-EOF
CASE
WHEN IS_ROLE_IN_SESSION('DATA_ANALYST') OR IS_ROLE_IN_SESSION('PII_AUDITOR') THEN email_val
ELSE '*********'
END
EOF
comment = "Masks email addresses for all roles except DATA_ANALYST and PII_AUDITOR."
}
The SQL expression within the body argument offers immense flexibility. It can use various context functions (like CURRENT_ROLE() or IS_ROLE_IN_SESSION()) and even call User-Defined Functions (UDFs) to implement complex logic. However, this flexibility means the logic itself is not validated by Terraform's syntax checker; it is sent directly to Snowflake for validation during the terraform apply step. It is also a strict requirement that the data type defined in the argument block and the return_data_type must match the data type of the column to which the policy will eventually be applied.
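As a hedged illustration of that flexibility, the following variant (a sketch, not part of the setup above) keeps the email domain visible for non-privileged roles instead of masking the whole value:

resource "snowflake_masking_policy" "email_partial_mask" {
  provider = snowflake.sysadmin
  name     = "EMAIL_PARTIAL_MASK"
  database = snowflake_database.analytics_db.name
  schema   = snowflake_schema.raw_data.name

  argument {
    name = "email_val"
    type = "VARCHAR"
  }
  return_data_type = "VARCHAR"

  body = <<-EOF
    CASE
      WHEN IS_ROLE_IN_SESSION('DATA_ANALYST') OR IS_ROLE_IN_SESSION('PII_AUDITOR') THEN email_val
      -- Mask the local part but keep the domain, e.g. '*****@example.com'
      ELSE REGEXP_REPLACE(email_val, '^[^@]+', '*****')
    END
  EOF

  comment = "Partially masks email addresses, preserving the domain."
}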
4.2 Applying the policy to a column
Creating a masking policy is only the first step; it does not protect any data on its own. The policy must be explicitly applied to one or more table columns. This crucial second step is often a point of confusion for new users, who may create a policy and wonder why data is still unmasked. The snowflake_table_column_masking_policy_application resource creates this essential link between the policy and the column.
The following example demonstrates how to apply the EMAIL_MASK policy to the EMAIL column of a CUSTOMERS table.
// For this example, we assume a 'CUSTOMERS' table with an 'EMAIL' column
// already exists in the 'RAW' schema. In a real-world scenario, this table
// might also be managed by Terraform or by a separate data loading process.
// We use a data source to reference this existing table.
data "snowflake_table" "customers" {
  database = snowflake_database.analytics_db.name
  schema   = snowflake_schema.raw_data.name
  name     = "CUSTOMERS"
}

// Apply the masking policy to the specific column.
resource "snowflake_table_column_masking_policy_application" "apply_email_mask" {
  provider            = snowflake.sysadmin
  table_name          = "\"${data.snowflake_table.customers.database}\".\"${data.snowflake_table.customers.schema}\".\"${data.snowflake_table.customers.name}\""
  column_name         = "EMAIL" // The name of the column to be masked
  masking_policy_name = snowflake_masking_policy.email_mask.fully_qualified_name

  // An explicit depends_on block ensures that Terraform creates the policy
  // before attempting to apply it, preventing race conditions.
  depends_on = [
    snowflake_masking_policy.email_mask
  ]
}
This two-step process of defining the policy logic and then applying it provides a clear and modular approach to data governance. The same policy can be defined once and applied to many different columns across multiple tables, ensuring that the masking logic is consistent and centrally managed.
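For instance, assuming a hypothetical ORDERS table with a VARCHAR CONTACT_EMAIL column in the same schema, a second application resource could reuse the policy unchanged:

// Hypothetical second application of the same policy to another column.
resource "snowflake_table_column_masking_policy_application" "apply_email_mask_orders" {
  provider            = snowflake.sysadmin
  table_name          = "\"ANALYTICS\".\"RAW\".\"ORDERS\""
  column_name         = "CONTACT_EMAIL"
  masking_policy_name = snowflake_masking_policy.email_mask.fully_qualified_name

  depends_on = [snowflake_masking_policy.email_mask]
}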
Conclusion: the path to mature Snowflake IaC
This guide has charted a course from the initial, manual bootstrapping of a secure connection to the automated provisioning and governance of a production-grade Snowflake environment. To ensure the long-term success and scalability of managing Snowflake with Terraform, several key practices should be adopted as standard procedure:
- Version control: All Terraform configuration files must be stored in a version control system like Git. This provides a complete, auditable history of all infrastructure changes and enables collaborative workflows such as pull requests for peer review before any changes are applied to production.
- Remote state management: The default behaviour of Terraform is to store its state file locally. In any team or automated environment, this is untenable. A remote backend, such as an Amazon S3 bucket with a DynamoDB table for state locking, must be configured (a minimal example follows this list). This secures the state file, prevents concurrent modifications from corrupting the state, and allows CI/CD pipelines and team members to work from a consistent view of the infrastructure.
- Modularity: As the number of managed resources grows, monolithic Terraform configurations become difficult to maintain. Code should be refactored into reusable modules. For instance, a module could be created to provision a new database along with a standard set of access roles and default schemas. This promotes code reuse, reduces duplication, and allows for more organized and scalable management of the environment.
- Provider versioning: The Snowflake Terraform provider is actively evolving. To prevent unexpected breaking changes from new releases, it is crucial to pin the provider to a specific major version in the terraform block (e.g., version = "~> 1.0"). This allows for intentional, planned upgrades. When upgrading between major versions, it is essential to carefully review the official migration guides, as significant changes, particularly to grant resources, may require a concerted migration effort.
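A minimal backend sketch for the remote-state recommendation above, assuming an existing S3 bucket and DynamoDB lock table (both names are placeholders):

terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"  // placeholder bucket name
    key            = "snowflake/terraform.tfstate"
    region         = "eu-west-1"
    dynamodb_table = "terraform-state-locks"      // placeholder lock table
    encrypt        = true
  }
}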
With this robust foundation in place, the path is clear for expanding automation to encompass even more of Snowflake's capabilities. The next logical steps include using Terraform to manage snowflake_network_policy for network security, snowflake_row_access_policy for fine-grained data filtering, and snowflake_task for orchestrating SQL workloads. Ultimately, the entire workflow should be integrated into a CI/CD pipeline, enabling a true GitOps model where every change to the Snowflake environment is proposed, reviewed, and deployed through a fully automated and audited process. By embracing this comprehensive approach, organizations can unlock the full potential of their data platform, confident in its security, scalability, and operational excellence.
Why Snowstack for Terraform and Snowflake
Automation without expertise can still fail. Terraform gives you the tools, but it takes experience and the right design patterns to turn Snowflake into a secure, cost-efficient, and scalable platform.
Managing Snowflake with Terraform is powerful, but putting it into practice at enterprise scale requires experience, discipline, and the right patterns. That is where Snowstack comes in. As a Snowflake-first consulting partner, we help organizations move beyond trial-and-error scripts to fully automated, production-grade environments. Our engineers design secure architectures, embed Terraform best practices, and ensure governance and cost controls are built in from day one.