CI/CD Pipeline for Snowflake Using Jenkins and Schemachange

Nischal Vooda · Published in SelectFrom · 5 min read · Apr 25, 2022


Introduction

The pipeline is scheduled in a Dockerized Jenkins instance that fetches SQL files committed to an AWS CodeCommit repository. Once a SQL file has been committed successfully, we can run the job through Jenkins to execute the changes. Schemachange helps Jenkins deploy the changes described in the loaded SQL file to the configured Snowflake account.

Execution flow diagram: AWS CodeCommit, Dockerized Jenkins, Schemachange, and Snowflake.

Deployment Steps:

  • Create the pipeline.
  • Add the AWS CodeCommit repository with credentials.
  • Provide a Jenkinsfile in the pipeline (a sketch of the schemachange call it wraps is shown after this list).
  • Apply the changes.
  • Execute the job from Jenkins once the SQL files have been committed successfully.
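
The Jenkinsfile itself is not reproduced here, but its deployment step ultimately boils down to a schemachange invocation along the lines of the sketch below. This is a minimal sketch, not the author's exact script: the migrations folder name is an assumption, and flag names can vary slightly between schemachange versions. SNOWFLAKE_PASSWORD is read from the environment, and the SF_* values come from the pipeline parameters described later in this post.

# Install schemachange and deploy every versioned script in the migrations folder.
# Assumes SNOWFLAKE_PASSWORD is exported as an environment variable.
pip install schemachange

schemachange \
  -f migrations \
  -a "$SF_ACCOUNT" \
  -u "$SF_USERNAME" \
  -r "$SF_ROLE" \
  -w "$SF_WAREHOUSE" \
  -d "$SF_DATABASE" \
  --create-change-history-table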

Execution Steps

  • Create a new instance in AWS EC2.
  • Log in to the instance using PuTTY and install Docker and Git.
  • Create a repository in AWS CodeCommit.
  • Go to IAM and generate credentials for AWS CodeCommit.
  • Clone the repository.
  • Build an image from the Dockerfile.
  • Run the image using the below command:

docker run -p 8080:8080 -v /var/run/docker.sock:/var/run/docker.sock --name jenkins jenkins
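
Assuming the container is named jenkins as above, the initial admin password needed in the next steps can be read straight from the running container:

docker exec jenkins cat /var/jenkins_home/secrets/initialAdminPassword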

  • Copy the password to log in to Jenkins.
  • Note down the password located at /var/jenkins_home/secrets/initialAdminPassword (the docker exec command above prints it).
  • Paste the copied password into the “Unlock Jenkins” screen.
  • Click on “Install suggested plugins.”
  • Jenkins will automatically restart after this step.
  • In this step, we install the plugins necessary to set up pipelines.
  • Create the Jenkins admin (root) user here.
  • Once the Jenkins container is running, run a command to give Jenkins access to the Docker engine (a typical command is sketched after this list).
  • Provide your Snowflake credentials as pipeline parameters.
  • To connect Snowflake with the pipeline we are creating in Jenkins, the following parameters need to be provided:
  • SF_ACCOUNT
  • SF_USERNAME
  • SNOWFLAKE_PASSWORD
  • SF_ROLE
  • SF_WAREHOUSE
  • SF_DATABASE → RAW_DATALAKE
  • Provide the CodeCommit repository URL here and click on “Save.”
  • Create a new file in the migration directory.
  • It will be a .sql file containing all the SQL script we need to execute with the commit (a sample file is shown after this list).
  • Push the changes to CodeCommit.
  • Click on “Build with Parameters” in Jenkins and then click on “Build.” The job will start running.
  • We can see that the fourth job has picked up one commit.
  • After the job execution completes, we can see the results in Snowflake.
  • Result: the database will be created inside Snowflake.
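
The exact command used to grant Jenkins access to the Docker engine is not shown in the list above. A common (if blunt) approach is to relax the permissions on the Docker socket that was mounted into the container; treat this as a sketch rather than the author's exact command:

# Let the jenkins user inside the container talk to the Docker daemon through
# the socket mounted with -v /var/run/docker.sock:/var/run/docker.sock.
docker exec -u root jenkins chmod 666 /var/run/docker.sock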
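
The migration file mentioned above follows schemachange's versioned naming convention, V<version>__<description>.sql. The file name, schema, and table below are purely illustrative:

# Create a sample versioned migration; schemachange picks it up by file name.
mkdir -p migrations
cat > migrations/V1.1.1__create_demo_table.sql <<'SQL'
CREATE DATABASE IF NOT EXISTS RAW_DATALAKE;
CREATE TABLE IF NOT EXISTS RAW_DATALAKE.PUBLIC.DEMO_TABLE (
    ID NUMBER,
    NAME VARCHAR
);
SQL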

Testing Scenario:

  • Execute the CI/CD flow by committing a SQL file to the AWS CodeCommit remote repository from a local system.

Expected result: After the SQL file has been committed and pushed to the remote repository, the SQL scripts should be executed automatically against the Snowflake account whose credentials were provided when the pipeline was set up in Jenkins.

Prerequisites :

  • AWS Code-Commit
  • Jenkins set up on localhost or EC2
  • Snowflake Account

Steps to execute:

  1. Pull from the remote repository to make sure the local copy includes any changes already committed to the remote, and open the updated folder in the IDE.
  2. Create a .sql file in the IDE with the appropriate SQL script and save it.
  3. Execute “git add .” to stage the file in the local directory.
  4. Execute “git commit -m "Any message with commit"” to commit the changes in the local directory.
  5. Execute “git push origin” to push all the changes to the remote repository (the full sequence is consolidated after these steps).
  6. Go to the Jenkins dashboard and execute the job we created earlier.
  7. Wait for the job execution to complete inside Jenkins.
  8. After the process completes, confirm the changes in your Snowflake account through the History tab, where the last executed SQL scripts can be seen.
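
Put together, steps 1 to 5 look roughly like the following when run from the local clone. The branch name and migration file name are assumptions for illustration:

# Sync the local clone, add a new migration, and push it to CodeCommit.
git pull origin master
git add .
git commit -m "Add V1.1.2__new_change.sql migration"
git push origin master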

And that’s all, folks! I hope you found this helpful. Please let me know if you have any queries or feedback in the comments section below.

