- Blog
- 06.24.2024
- Data Fundamentals
Sentiment Analysis in Snowflake with Anthropic Claude 3 Sonnet using Amazon Bedrock

Sentiment analysis, the process of determining the emotional tone behind a piece of text, is a powerful tool for businesses seeking to understand customer feedback and public opinion. This article will guide you through various techniques for performing sentiment analysis in Snowflake, starting with Python.
We will also explore Anthropic Claude 3 Sonnet, a cutting-edge language model, and begin by explaining the fundamentals of sentiment analysis.
What is Sentiment Analysis?
Sentiment Analysis is a method used to extract a numeric sentiment score from unstructured text, enabling the quantification of opinions, emotions, or attitudes expressed in data. It transforms subjective language into objective metrics. Large Language Models (LLMs), such as Claude 3 Sonnet, are a great way to perform Sentiment Analysis by leveraging extensive training on diverse text corpora to understand and interpret the nuances of human language. These models utilize deep learning architectures, specifically transformers, to capture context and semantic meaning, facilitating accurate sentiment scoring.
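To make the idea of "subjective language into objective metrics" concrete, here is a deliberately naive illustration: a toy lexicon-based scorer that counts positive and negative words. This is not the approach used later in this article (which relies on an LLM), and the word lists are invented for the example; it simply shows what a numeric sentiment score looks like.

```python
# Toy lexicon-based sentiment scorer, for illustration only.
# Real systems (including the LLM approach in this article) capture
# context and nuance that a fixed word list cannot.
POSITIVE = {"great", "love", "excellent", "happy", "good"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "broken"}

def toy_sentiment_score(text):
    """Return a score in [-1, 1]: +1 if all sentiment words are positive,
    -1 if all are negative, 0 if none are found."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return 0.0
    return (pos - neg) / (pos + neg)

print(toy_sentiment_score("I love this product, it works great!"))  # 1.0
print(toy_sentiment_score("Terrible quality, broken on arrival"))   # -1.0
```

An LLM replaces the word lists with learned understanding of context, sarcasm, and negation, but the output is the same kind of artifact: a number that can be stored, aggregated, and queried.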
Reliable data preparation is vital for effective Sentiment Analysis. Data engineers play a critical role in this process, managing data pipelines that clean, preprocess, and transform raw text data into forms suitable for LLM input. They ensure the data interface between storage solutions and computational environments is seamless, maintaining the integrity and consistency of datasets. This data preparation directly affects the performance and reliability of the sentiment models, underscoring the indispensable alliance between data engineering and advanced language models.
Business examples of Sentiment Analysis
- E-commerce: Analyze customer reviews for product sentiment, enabling targeted marketing and product improvements.
- Social Media Monitoring: Gauge public opinion on brands, campaigns, or events by analyzing sentiment in social media posts.
- Customer Service: Automatically route negative sentiment queries to prioritized queues for prompt resolution, improving customer experience.
What is Anthropic Claude 3 Sonnet?
Anthropic Claude 3 Sonnet is a large language model trained by Anthropic using a novel approach called "constitutional AI." The technical details of this approach are not publicly disclosed, but it aims to instill the model with certain values and behaviors during training. Claude 3 Sonnet is capable of understanding and generating human-like text across a wide range of topics and tasks.
Pros:
- Strong language understanding and generation capabilities
- Claimed to have robust ethical and truthful behavior
- Can handle open-ended tasks and follow-up questions
Cons:
- Potential for biases or inconsistencies due to the opaque training process
- Limited transparency into the model's inner workings and decision-making processes
- Reliance on Anthropic's infrastructure for accessing the model
Ideal use cases include open-ended dialogue, task assistance, creative writing, research and analysis, and tutoring and education.
How to perform Sentiment Analysis in Snowflake with Anthropic Claude 3 Sonnet using Python with the Amazon Bedrock SDK
Prerequisites for the boto3 Amazon Bedrock Python SDK
Start by installing the prerequisite libraries:
python3 -m pip install snowflake-connector-python boto3
Afterwards, load your source data into Snowflake.
Python boto3 for Anthropic Claude 3 Sonnet
The example below involves product reviews, and assumes that the data has been loaded into a database table named "stg_sample_reviews" with four columns: id (the primary key), stars, product and review.
The Python script is shown below. Note that it is good practice to handle credentials more securely than in this simple example: you might use a secrets management service rather than environment variables or hardcoded values.
Also note that fetching large result sets with fetchall() can be inefficient and may cause memory issues. For large datasets, fetch rows incrementally with the cursor's fetchmany() method instead.
import os
import boto3
import snowflake.connector

# Create the Bedrock runtime client once, rather than on every call
bedrock = boto3.client(service_name="bedrock-runtime", region_name="us-east-1")
model_id = "anthropic.claude-3-sonnet-20240229-v1:0"

# Use the Amazon Bedrock Converse API to score a single review
def analyze_sentiment(text):
    prompt = f"""Provide a numeric rating that reflects the overall sentiment of the review.
The rating should be a single number between 1 and 5, where 1 represents the most negative sentiment and 5 represents the most positive sentiment.
Respond with only your numeric rating. Do not include any justification of the rating.
Review: {text}
"""
    response = bedrock.converse(
        modelId=model_id,
        messages=[{"role": "user", "content": [{"text": prompt}]}],
        system=[{"text": "Your job is to analyze online product reviews."}],
        inferenceConfig={"temperature": 0.5},
        additionalModelRequestFields={"top_k": 200},
    )
    return response["output"]["message"]["content"][0]["text"]

# Establish a Snowflake connection
conn = snowflake.connector.connect(
    user=os.environ["SF_USER"],
    password=os.environ["SF_PASSWORD"],
    account=os.environ["SF_ACCOUNT"],
    warehouse=os.environ["SF_WH"],
    database=os.environ["SF_DB"],
    schema=os.environ["SF_SCHEMA"],
    role=os.environ["SF_ROLE"],
)

cur = None
try:
    # Create a cursor object using the connection
    cur = conn.cursor()
    # Create the destination table and clear any previous results
    cur.execute('''CREATE TABLE IF NOT EXISTS "stg_sample_reviews_genai"
        ("id" NUMBER(6,0) NOT NULL, "ai_score" VARCHAR(1024) NOT NULL)''')
    cur.execute('DELETE FROM "stg_sample_reviews_genai"')
    # Select source rows from the table
    cur.execute('SELECT "id", "review" FROM "stg_sample_reviews"')
    # Fetch all rows from the executed query
    rows = cur.fetchall()
    # Score each review and insert the result; bind values rather than
    # interpolating them into the SQL string, which avoids quoting and
    # injection problems
    for row in rows:
        ai_score = analyze_sentiment(row[1])
        cur.execute(
            'INSERT INTO "stg_sample_reviews_genai" ("id", "ai_score") VALUES (%s, %s)',
            (row[0], ai_score),
        )
finally:
    # Close the cursor and connection
    if cur:
        cur.close()
    conn.close()
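As noted above, fetchall() loads every row into memory at once. One way to process a large result set incrementally is a small helper built on fetchmany(). The helper and the stand-in cursor class below are this article's illustration (the stand-in exists only so the batching logic can run without a live Snowflake connection; in the script above you would pass the real cur object):

```python
def fetch_in_batches(cursor, batch_size=500):
    """Yield rows one at a time from a DB-API cursor,
    pulling them from the server in fixed-size batches."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        yield from batch

# Stand-in cursor so the batching logic can be demonstrated without a
# live Snowflake connection; a real cursor object would be used instead.
class FakeCursor:
    def __init__(self, rows):
        self._rows = list(rows)

    def fetchmany(self, n):
        batch, self._rows = self._rows[:n], self._rows[n:]
        return batch

rows = list(fetch_in_batches(FakeCursor([(1, "a"), (2, "b"), (3, "c")]), batch_size=2))
print(rows)  # [(1, 'a'), (2, 'b'), (3, 'c')]
```

In the main loop you would then write `for row in fetch_in_batches(cur):` instead of calling fetchall(), keeping at most one batch of rows in memory at a time.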
After running the above script, you should find a new table has been created, which contains the AI-generated review score for every input record. Join this table to the original on the common id column to compare the AI-generated sentiment scores against the original star review.
The LLM was asked to score between 1 and 5, so you may choose to classify the scores more broadly as follows:
- 4 or 5 - Positive
- 3 - Neutral
- 1 or 2 - Negative
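A small helper can apply this banding, and also guard against the occasional reply where the LLM returns something other than a clean number despite the prompt's instructions. The function name and the "Unknown" fallback are this article's illustration, not a library API:

```python
def classify_sentiment(ai_score):
    """Map a 1-5 rating string from the LLM to a sentiment band."""
    try:
        score = int(str(ai_score).strip())
    except ValueError:
        # The model replied with something other than a number
        return "Unknown"
    if score >= 4:
        return "Positive"
    if score == 3:
        return "Neutral"
    if score >= 1:
        return "Negative"
    return "Unknown"

print(classify_sentiment("5"))    # Positive
print(classify_sentiment("3"))    # Neutral
print(classify_sentiment(" 2 "))  # Negative
```

This classification could equally be done in SQL with a CASE expression when querying the results table; doing it in Python is convenient if you want to validate the LLM output at load time.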
Sentiment Analysis in Snowflake using Matillion to run Anthropic Claude 3 Sonnet via Amazon Bedrock
In the Matillion Data Productivity Cloud, orchestration pipelines like the one shown in the screenshot below can:
- Directly extract and load data, or call other pipelines to do so (as shown)
- Invoke Anthropic Claude 3 Sonnet, with a nominated prompt, against all rows from a nominated table
Sentiment Analysis in Snowflake using Matillion
Data pipelines such as this manage all the connectivity and plumbing between the Snowflake source and target tables, and the LLM.
This allows you to focus on the overall design and architecture, and the data analysis. To compare the AI-generated sentiment scores against the original star review, use a transformation pipeline like the one in the next screenshot.
Checking the results of Sentiment Analysis in Snowflake using Matillion
The data sample shows two of the records. In one case the LLM's decision matches the original sentiment identically, but in the other record the ratings differ slightly. This is an example of the subjective nature of sentiment analysis.
Summary
Matillion is a data pipeline platform enabling data teams to rapidly build and manage pipelines for AI and analytics at scale. Its code-optional UI with pre-built components accelerates productivity, while still allowing coding in SQL, Python, or dbt.
Matillion integrates with cloud data platforms, CDPs, LLMs, and offers AI-generated documentation, no-code connectors, and REST API connectivity. It provides data lineage, pushdown ELT, vector store integration, and AI components for generative prompting and retrieval-augmented generation.
Leveraging Git, capable of hosting parameterized, dynamic data pipelines, and with an optional hybrid SaaS deployment, Matillion enables augmented data engineering, with AI capabilities seamlessly incorporated into your data pipelines.
For more examples of Matillion's AI components in action, check out our library of AI Videos and Demos.
To try Matillion yourself, using your own data, sign up for a free trial.
If you are already a Matillion user or trial customer, you can download the sentiment analysis example shown in the screenshots earlier, and run it on your own platform.