Blog | 06.24.2024 | Data Fundamentals
Sentiment Analysis in Snowflake with Cohere Command using Amazon Bedrock

Sentiment Analysis is a powerful technique that enables organizations to gain valuable insights from unstructured text data, such as product reviews, social media posts, and customer feedback. This article begins with the fundamentals of Sentiment Analysis and the Cohere Command large language model, then walks through two approaches to performing Sentiment Analysis in Snowflake: first with Python, then with Matillion.
What is Sentiment Analysis?
Sentiment Analysis is a crucial text mining technique that assigns a numeric sentiment score to unstructured text. At its core, it involves extracting subjective information to determine if the expressed sentiment is positive, negative, or neutral. Large language models (LLMs), such as Cohere Command, have revolutionized this field by leveraging vast amounts of data to understand contextual nuances and sentiments more accurately. Utilizing transformers, these models consider the sequential nature of language, capturing dependencies and subtleties that previous models struggled with.
The reliability of Sentiment Analysis hinges on the quality of the prepared data. Data engineers play a critical role in this pipeline by ensuring the seamless integration of clean, labeled datasets from databases to LLMs. They must preprocess data, handle inconsistencies, and ensure balanced class distributions. This careful data preparation enables the LLMs to perform optimally, thus yielding accurate sentiment scores that can reliably drive critical business insights and decisions.
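To make the core idea concrete before bringing in an LLM, here is a deliberately naive, lexicon-based scorer. The word lists are hypothetical and tiny; unlike the transformer-based approach described above, this sketch ignores context entirely, which is exactly the limitation LLMs address.

```python
# Toy illustration of sentiment scoring: count matches against small,
# hypothetical positive/negative word lists. No context awareness.
POSITIVE = {"great", "love", "excellent", "good"}
NEGATIVE = {"bad", "terrible", "hate", "poor"}

def naive_sentiment(text: str) -> str:
    words = text.lower().split()
    # Net score: positive hits minus negative hits
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A phrase like "not good" defeats this scorer, which is why contextual models are the practical choice.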
Business Examples of Sentiment Analysis
- E-commerce: Analyze customer reviews to gauge product satisfaction, identify pain points, and improve offerings.
- Social Media Monitoring: Track public sentiment towards brands, campaigns, or events by analyzing social media posts.
- Customer Support: Automatically categorize support tickets based on sentiment to prioritize and route urgent or negative issues.
What is Cohere Command?
The Cohere Command LLM is a large language model developed for natural language processing tasks. It utilizes transformer architecture to generate human-like text and perform summarization, translation, sentiment analysis, and question-answering. Built for scalability and efficiency, it offers a range of pre-trained models that can be fine-tuned for specific tasks.
Pros:
- High-quality text generation.
- Scalable and efficient.
- Versatile across various NLP tasks.
- Customizable through fine-tuning.
Cons:
- Requires substantial computational resources.
- Potential biases in generated text.
- Prone to occasional inaccuracies or irrelevant outputs.
- Can be costly for extensive usage.
Ideal Use Cases:
- Automating customer support responses.
- Generating digital content.
- Data analysis and extraction.
- Language translation services.
How to perform Sentiment Analysis in Snowflake with Cohere Command using Python with the Amazon Bedrock SDK
Prerequisites for the boto3 Amazon Bedrock Python SDK
Start by installing the prerequisite libraries:
python3 -m pip install snowflake-connector-python boto3
Afterwards, load your source data into Snowflake.
Python boto3 for Cohere Command
The example below involves product reviews, and assumes that the data has been loaded into a database table named "stg_sample_reviews" with four columns: id (the primary key), stars, product, and review.
The Python script is shown below. Note that it is good practice to handle credentials more securely than in this simple example: consider a secret management service rather than environment variables or hardcoded values.
Also please note that handling large amounts of data using fetchall() can be inefficient, and may result in memory issues. For large datasets, you should use a cursor to fetch rows incrementally using the fetchmany() method instead.
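The fetchmany() pattern mentioned above can be sketched as follows. FakeCursor is a stand-in so the sketch runs without a Snowflake connection; with a real snowflake-connector-python cursor the loop body is identical.

```python
# Sketch of incremental fetching with fetchmany(), using a stand-in cursor.
class FakeCursor:
    """Stand-in for a database cursor holding pre-canned rows."""
    def __init__(self, rows):
        self._rows = list(rows)

    def fetchmany(self, size):
        # Return up to `size` rows and drop them from the backlog
        batch, self._rows = self._rows[:size], self._rows[size:]
        return batch

def process_in_batches(cur, batch_size=100):
    processed = []
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:  # empty list means the result set is exhausted
            break
        for row in rows:
            processed.append(row)  # e.g. call analyze_sentiment(row[1]) here
    return processed

cur = FakeCursor([(1, "great"), (2, "bad"), (3, "ok")])
result = process_in_batches(cur, batch_size=2)
```

Only one batch is held in memory at a time, which keeps the footprint bounded regardless of table size.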
import os
import json
import logging

import boto3
import snowflake.connector
from botocore.exceptions import ClientError

logger = logging.getLogger("demo")

# Use the Amazon Bedrock InvokeModel API to score a single review
def analyze_sentiment(text):
    client = boto3.client(service_name="bedrock-runtime", region_name="us-east-1")
    model_id = "cohere.command-text-v14"
    prompt = f"""Your job is to analyze online product reviews.
Provide a numeric rating that reflects the overall sentiment of the review.
The rating should be a single number between 1 and 5, where 1 represents the most negative sentiment and 5 represents the most positive sentiment.
Respond with only your numeric rating. Do not include any justification of the rating. Use only numbers in your response.
Review: {text}
"""
    body = json.dumps({"prompt": prompt, "temperature": 0.9})
    try:
        response = client.invoke_model(
            body=body,
            modelId=model_id,
            accept="application/json",
            contentType="application/json",
        )
    except ClientError:
        logger.exception("Bedrock InvokeModel call failed")
        raise
    response_body = json.loads(response.get("body").read())
    return response_body.get("generations")[0].get("text").strip()

# Establish a Snowflake connection
conn = snowflake.connector.connect(
    user=os.environ["SF_USER"],
    password=os.environ["SF_PASSWORD"],
    account=os.environ["SF_ACCOUNT"],
    warehouse=os.environ["SF_WH"],
    database=os.environ["SF_DB"],
    schema=os.environ["SF_SCHEMA"],
    role=os.environ["SF_ROLE"],
)

cur = None
try:
    # Create a cursor object using the connection
    cur = conn.cursor()
    # Create the destination table
    cur.execute('CREATE OR REPLACE TABLE "stg_sample_reviews_genai" ("id" NUMBER(6,0) NOT NULL, "ai_score" VARCHAR(1024) NOT NULL)')
    # Select source rows from the table
    cur.execute('SELECT "id", "review" FROM "stg_sample_reviews"')
    # Fetch all rows from the executed query
    rows = cur.fetchall()
    # Loop through the fetched rows and call the analyze_sentiment function
    for row in rows:
        ai_score = analyze_sentiment(row[1])
        # Use bind variables rather than string formatting to avoid SQL injection
        cur.execute('INSERT INTO "stg_sample_reviews_genai" ("id", "ai_score") VALUES (%s, %s)', (row[0], ai_score))
finally:
    # Close the cursor and connection
    if cur:
        cur.close()
    conn.close()
After running the above script, you should find a new table has been created, which contains the AI-generated review score for every input record. Join this table to the original on the common id column to compare the AI-generated sentiment scores against the original star review.
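The comparison join on the id column can be sketched in plain Python over hypothetical sample rows, mirroring the SQL join you would run in Snowflake between the source table and the AI-scored table:

```python
# In-memory join on "id" between sample source rows and AI-generated scores.
# Row values are hypothetical; in practice both sides come from Snowflake.
reviews = [(1, 5, "Widget", "Love it"), (2, 2, "Widget", "Broke fast")]  # id, stars, product, review
ai_scores = [(1, "5"), (2, "1")]  # id, ai_score (stored as VARCHAR)

# Index the AI scores by id for O(1) lookup
score_by_id = {row_id: score for row_id, score in ai_scores}

# Pair each original star rating with the AI-generated rating
comparison = [
    (row_id, stars, int(score_by_id[row_id]))
    for row_id, stars, _product, _review in reviews
    if row_id in score_by_id
]
```

Each tuple in the result holds the id, the original star rating, and the AI-generated score side by side.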
The LLM was asked to score between 1 and 5, so you may choose to classify the scores more broadly as follows:
- 4 or 5 - Positive
- 3 - Neutral
- 1 or 2 - Negative
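The broader classification above is a one-line mapping in code, for example:

```python
# Map the 1-5 LLM rating to the broader sentiment classes described above.
def classify(score: int) -> str:
    if score >= 4:
        return "Positive"
    if score == 3:
        return "Neutral"
    return "Negative"  # scores of 1 or 2
```

Since the script stores ai_score as a VARCHAR, cast it with int() before calling classify().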
Sentiment Analysis in Snowflake using Matillion to run Cohere Command via Amazon Bedrock
In the Matillion Data Productivity Cloud, orchestration pipelines like the one shown in the screenshot below can:
- Directly extract and load data, or call other pipelines to do so (as shown)
- Invoke Cohere Command, with a nominated prompt, against all rows from a nominated table
Sentiment Analysis in Snowflake using Matillion
Data pipelines such as this manage all the connectivity and plumbing between the Snowflake source and target tables, and the LLM.
This allows you to focus on the overall design and architecture, and the data analysis. To compare the AI-generated sentiment scores against the original star review, use a transformation pipeline like the one in the next screenshot.
Checking the results of Sentiment Analysis in Snowflake using Matillion
The data sample shows two of the records. In one case, the LLM's rating matches the original star rating exactly; in the other, the two ratings differ slightly. This illustrates the subjective nature of sentiment analysis.
Summary
Matillion is a data pipeline platform empowering teams to build and manage pipelines rapidly for AI and analytics at scale. It offers a code-optional UI with pre-built components, alongside coding in SQL, Python, and dbt. Matillion integrates with hyperscalers, CDPs, and LLMs, and provides AI-generated documentation. Key features include no-code connectors, custom REST API connectors, parameterization with variables, Git integration, data lineage, pushdown ELT, vector store connectivity, and reverse ETL for AI insights.
Matillion introduces unique AI components for generative AI prompting, retrieval-augmented generation, and a copilot to build pipelines using natural language. It enables highly productive augmented data engineering through collaboration, flexibility, scalability, and built-in AI capabilities.
For more examples of Matillion's AI components in action, check out our library of AI Videos and Demos.
To try Matillion yourself, sign up for a free trial using your own data.
If you are already a Matillion user or trial customer, you can download the sentiment analysis example shown in the screenshots earlier and run it on your own platform.