How to Connect Amazon AWS S3 to Snowflake and Load Data: A Complete Beginner’s Guide
Connecting Amazon S3 with Snowflake is one of the most essential skills for any aspiring data engineer or Snowflake student. This integration forms the backbone of modern data pipelines where raw data lands in cloud storage and Snowflake acts as the analytical engine.
This guide walks you through the complete workflow: creating the S3 bucket, preparing Snowflake objects, setting up a storage integration, and capturing required AWS details. Each step also includes placeholders for screenshots to help you insert visuals later.
1. Introduction: Why Connect S3 and Snowflake
Amazon S3 is one of the most widely used cloud storage systems for landing raw data. Many organizations store logs, CSV files, JSON files, and external datasets in S3 before loading them into Snowflake.
Connecting S3 and Snowflake allows you to:
- Access data stored in S3 without manual downloads
- Run SQL queries over files using external stages
- Load structured data directly into Snowflake tables
- Build scalable, automated ingestion pipelines
This S3 ↔ Snowflake integration is foundational for batch pipelines, incremental ingestion, and ELT workflows used in real-world data engineering.
2. Step-1: Create S3 Bucket and Landing Folder
Before Snowflake can read data, you must prepare your S3 bucket structure.
Bucket Structure Overview
A standard ingestion bucket includes:
- A bucket (top-level container)
- A folder (often named landing/, raw/, or input/)
- One or more CSV files placed inside the landing folder
What You Will Do
- Create an S3 bucket.
- Create a folder inside it named
landing/. - Upload CSV files such as:
- Brazil_Customer.csv
- India_Customer.csv
- USA_Customer.csv
These files will be used later when testing the connection from Snowflake.

Fig: AWS S3 bucket and landing folder view
This corresponds to the visual shown in Step-1 where S3 items like file name, type, last modified, and size are displayed.
3. Step-2: Create Table and File Format in Snowflake
Now that the data is in S3, Snowflake needs matching structures to load the data properly.
3.1 Create the Customer Table
Below is the schema used in the workflow:
CREATE OR REPLACE TABLE CUSTOMER (
C_CUSTKEY number,
C_NAME varchar,
C_ADDRESS varchar,
C_NATIONKEY number,
C_ACCTBAL number,
C_MKTSEGMENT varchar,
C_COMMENT varchar,
load_date date
);
This table will store customer details after being loaded from S3.
3.2 Create CSV File Format
To load CSVs correctly, define a file format:
CREATE OR REPLACE FILE FORMAT CSVTYPE
TYPE='CSV'
SKIP_HEADER=1
FIELD_DELIMITER=','
RECORD_DELIMITER='\n'
FIELD_OPTIONALLY_ENCLOSED_BY='"'
DATE_FORMAT='DD-MMM-YYYY'
COMPRESSION = NONE;
This file format ensures Snowflake knows how to interpret CSV delimiters, headers, date formats, and quoting.

Fig : Snowflake worksheet with table & file format
This matches the worksheet shown in Step-2 where the table and file format were created.
4. Step-3: Create Storage Integration in Snowflake
Storage Integration is a secure Snowflake object that allows Snowflake to read from external storage without requiring you to store AWS keys inside Snowflake.
It provides:
- IAM role-based access
- Secure credential forwarding
- Governance of allowed S3 locations
4.1 What Is a Storage Integration?
A storage integration is Snowflake’s recommended method of authorizing Snowflake to access S3. Instead of access keys, Snowflake uses:
- An IAM role ARN
- A controlled list of allowed S3 URL paths
4.2 Create the Integration
Use this template inside Snowflake:
CREATE OR REPLACE STORAGE INTEGRATION AWS_INT
TYPE = EXTERNAL_STAGE
ENABLED = TRUE
STORAGE_PROVIDER = 'S3'
STORAGE_AWS_ROLE_ARN = '<your_role_arn>'
STORAGE_ALLOWED_LOCATIONS = ('<your_s3_url>');
What Do These Fields Mean?
- STORAGE_AWS_ROLE_ARN: The IAM role Snowflake will assume
- STORAGE_ALLOWED_LOCATIONS: A list of S3 paths Snowflake is allowed to read from
Both values will be retrieved in later steps.
This corresponds to Step-3 in the workflow where Snowflake prepares to connect to S3.
5. Step-4: Get the S3 Allowed Location URL
Snowflake requires an explicit allowed location so it does not gain access to the entire S3 bucket accidentally.
How to Obtain the Allowed Location
- Open your S3 bucket
- Navigate to the
landing/folder - Select any file
- Click Copy S3 URL
The URL will look something like:
s3://mybucket1995/landing/
This value needs to be pasted into the integration created earlier.

Fig : Allowed location copied from S3)
This matches the visual shown in Step-4 where the Copy S3 URL button is highlighted.
Conclusion
By completing these four steps, you have prepared all foundations required to establish a secure connection between Amazon S3 and Snowflake:
- You created an S3 bucket and uploaded sample CSV files
- You created a Snowflake table and CSV file format
- You set up a storage integration that allows Snowflake to access S3 securely
- You retrieved the S3 allowed location URL for integration configuration
These steps form the essential first half of a complete S3-to-Snowflake ingestion pipeline. In the next stages, you will finalize IAM roles, update trust policies, test the integration, and load data into Snowflake tables—fully automating your cloud data movement.

