- Blog
- 06.24.2024
- Data Fundamentals, Product
Sentiment Analysis in Amazon Redshift with Cohere Command using Amazon Bedrock

Sentiment Analysis is a powerful technique that enables organizations to gain valuable insights from textual data, such as customer reviews, social media posts, and survey responses.
This article will guide you through various approaches to performing Sentiment Analysis on data stored in Amazon Redshift, starting with a Python-based solution. We will then explore Cohere Command, a large language model service offered through Amazon Bedrock, and how it can streamline and enhance your Sentiment Analysis workflows.
To begin, we will provide an overview of Sentiment Analysis, its applications, and the underlying principles that drive this analytical technique.
What is Sentiment Analysis?
Sentiment Analysis is a sophisticated natural language processing (NLP) technique used to extract a numeric sentiment score from unstructured text. By analyzing phrases, sentences, or entire documents, it assigns a quantifiable value to the emotional tone—ranging from positive to negative sentiments. Large language models (LLMs), such as Cohere Command, play a pivotal role in performing Sentiment Analysis. These models are pre-trained on vast corpora and fine-tuned on sentiment-labeled datasets, enabling them to grasp intricate linguistic nuances and contextual subtleties.
The backbone of effective Sentiment Analysis lies in the reliability of data preparation. Data engineers are tasked with extracting, transforming, and loading (ETL) textual data from diverse databases into formats amenable for LLM processing. They must ensure the data is clean, contextually relevant, and annotated correctly to train the models effectively. Moreover, data engineers must manage data pipelines that maintain robust interaction between the stored raw data and the LLM, ensuring high throughput, low latency, and data integrity throughout the Sentiment Analysis workflow.
Business examples of Sentiment Analysis
- E-commerce: Analyze customer reviews to gauge product satisfaction and identify areas for improvement.
- Brand Monitoring: Track online conversations about a company's products or services to understand public sentiment and respond accordingly.
- Customer Support: Automatically categorize support tickets based on sentiment to prioritize and route negative feedback for immediate attention.
What is Cohere Command?
Cohere Command is a large language model (LLM) designed for natural language understanding and generation tasks. It leverages transformer architecture, similar to GPT-3, to perform a variety of linguistic tasks such as text completion, summarization, translation, and more. The model is trained on extensive datasets and is available through an API, making it accessible for integration into various applications.
Pros:
- High proficiency in understanding and generating human-like text.
- Versatility in performing a wide range of tasks.
- API accessibility simplifies integration.
Cons:
- Computationally intensive, requiring significant resources for deployment.
- Potential biases inherited from training data.
- May require fine-tuning for specialized tasks.
Ideal Use Cases:
- Customer support automation through chatbots.
- Content creation, including articles and social media posts.
- Data analysis and summarization.
- Personalization in recommendations and search queries.
Overall, Cohere Command provides robust capabilities for developers looking to enhance their applications with advanced natural language processing features.
How to perform Sentiment Analysis in Redshift with Cohere Command using Python with the Amazon Bedrock SDK
Prerequisites for the boto3 Amazon Bedrock Python SDK
Start by installing the prerequisite libraries:
python3 -m pip install psycopg2-binary boto3
Afterwards, load your source data into Redshift.
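If your review data is sitting in Amazon S3, a common way to load it is Redshift's COPY command. The sketch below only constructs the COPY statement (running it requires a live Redshift connection via psycopg2); the bucket path and IAM role ARN are placeholders you would replace with your own.

```python
# Sketch: build a Redshift COPY statement to load CSV review data from S3.
# The S3 path and IAM role ARN below are hypothetical placeholders.
def build_copy_sql(table, s3_path, iam_role):
    return (
        f'COPY "{table}" '
        f"FROM '{s3_path}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS CSV IGNOREHEADER 1"
    )

sql = build_copy_sql(
    "stg_sample_reviews",
    "s3://your-bucket/reviews/",                        # placeholder path
    "arn:aws:iam::123456789012:role/RedshiftCopyRole",  # placeholder role
)
print(sql)
```

You would pass the resulting string to cursor.execute() on an open Redshift connection.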
Python boto3 for Cohere Command
The example below involves product reviews, and assumes that the source data has been loaded into a table named "stg_sample_reviews" with four columns: id (the primary key), stars, product and review.
The Python script is shown below. Note that it is good practice to handle credentials more securely than in this simple example; consider a secrets management service rather than environment variables or hardcoded values.
Also note that fetching large result sets with fetchall() can be inefficient and may cause memory issues. For large datasets, fetch rows incrementally with the cursor's fetchmany() method instead.
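The incremental pattern looks like this. The sketch below uses an in-memory SQLite database as a stand-in for Redshift so that it runs anywhere; with a psycopg2 cursor the fetchmany() loop is identical.

```python
import sqlite3

# Stand-in for the Redshift connection; the cursor loop is the same with psycopg2.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE stg_sample_reviews (id INTEGER, review TEXT)")
cur.executemany("INSERT INTO stg_sample_reviews VALUES (?, ?)",
                [(i, f"review {i}") for i in range(10)])

cur.execute("SELECT id, review FROM stg_sample_reviews")
processed = 0
while True:
    batch = cur.fetchmany(4)   # fetch at most 4 rows per round trip
    if not batch:
        break
    for row in batch:
        processed += 1         # call analyze_sentiment(row[1]) here
print(processed)               # 10
```

Only one batch of rows is held in memory at a time, which keeps the footprint flat no matter how large the source table is.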
If you are working in a schema other than "public" you will need the -c connection option to specify the object search path. Set your RS_OPTIONS environment variable to "-c search_path=yourSchemaName,public" replacing the schema name with your own. The newly created table will be added to this named schema.
import os
import psycopg2
import logging
import json
import boto3
from botocore.exceptions import ClientError

logger = logging.getLogger("demo")

# Use the Amazon Bedrock InvokeModel API
def analyze_sentiment(text):
    client = boto3.client(service_name="bedrock-runtime", region_name="us-east-1")
    model_id = "cohere.command-text-v14"
    prompt = f"""Your job is to analyze online product reviews.
Provide a numeric rating that reflects the overall sentiment of the review.
The rating should be a single number between 1 and 5, where 1 represents the most negative sentiment and 5 represents the most positive sentiment.
Respond with only your numeric rating. Do not include any justification of the rating. Use only numbers in your response.
Review: {text}
"""
    # Consider a lower temperature for more deterministic ratings
    body = json.dumps({"prompt": prompt, "temperature": 0.9})
    response = client.invoke_model(body=body, modelId=model_id,
                                   accept='application/json',
                                   contentType='application/json')
    response_body = json.loads(response.get('body').read())
    return response_body.get('generations')[0].get('text').strip()

# Database connection parameters
db_params = {
    'dbname': os.environ["RS_DBNAME"],
    'user': os.environ["RS_USER"],
    'password': os.environ["RS_PASSWORD"],
    'host': os.environ["RS_HOST"],
    'port': os.environ["RS_PORT"],
    'options': os.environ["RS_OPTIONS"]
}

# Create a connection to the Redshift database
try:
    conn = psycopg2.connect(**db_params)
except Exception as e:
    print(f"Unable to connect to the database: {e}")
    exit(1)
conn.autocommit = True

# Create a cursor object
cur = conn.cursor()

# Query to fetch all the rows from the source table
query = 'SELECT "id", "review" FROM "stg_sample_reviews"'
try:
    # Create the table to hold the results, if it does not already exist
    cur.execute('''CREATE TABLE IF NOT EXISTS "stg_sample_reviews_genai"
                   ("id" INT NOT NULL, "ai_score" VARCHAR(1024) NOT NULL)''')
    cur.execute('DELETE FROM "stg_sample_reviews_genai"')
    # Execute the query
    cur.execute(query)
    # Fetch and process each row, scoring the review text (second column)
    for row in cur.fetchall():
        ai_score = analyze_sentiment(row[1])
        # A parameterized insert avoids quoting and SQL injection problems
        cur.execute('INSERT INTO "stg_sample_reviews_genai" ("id", "ai_score") VALUES (%s, %s)',
                    (row[0], ai_score))
except Exception as e:
    print(f"SQL error: {e}")
finally:
    cur.close()
    conn.close()
After running the above script, you should find a new table has been created, containing an AI-generated review score for every input record. Join this table to the original on the common id column to compare the AI-generated sentiment scores against the original star ratings.
The LLM was asked to score between 1 and 5, so you may choose to classify the scores more broadly as follows:
- 4 or 5 - Positive
- 3 - Neutral
- 1 or 2 - Negative
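The three-way classification above can be expressed as a small helper function, applied to the ai_score values returned by the LLM:

```python
def classify_score(score):
    """Map a 1-5 LLM rating to a coarse sentiment label."""
    score = int(score)
    if score >= 4:
        return "Positive"
    if score == 3:
        return "Neutral"
    return "Negative"

print([classify_score(s) for s in ["5", "3", "1"]])  # ['Positive', 'Neutral', 'Negative']
```

Because the script stores ai_score as a VARCHAR, the int() conversion also acts as a sanity check: any response from the model that is not a plain number will raise a ValueError rather than being silently misclassified.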
Sentiment Analysis in Redshift using Matillion to run Cohere Command via Amazon Bedrock
In the Matillion Data Productivity Cloud, orchestration pipelines like the one shown in the screenshot below can:
- Directly extract and load data, or call other pipelines to do so (as shown)
- Invoke Cohere Command, with a nominated prompt, against all rows from a nominated table
Sentiment Analysis in Redshift using Matillion
Data pipelines such as this manage all the connectivity and plumbing between the Redshift source and target tables, and the LLM.
This allows you to focus on the overall design and architecture, and the data analysis. To compare the AI-generated sentiment scores against the original star review, use a transformation pipeline like the one in the next screenshot.
Checking the results of Sentiment Analysis in Redshift using Matillion
The data sample shows two of the records. In one case the LLM's score matches the original sentiment exactly; in the other, the ratings differ slightly. This illustrates the subjective nature of sentiment analysis.
Summary
Matillion is a data pipeline platform that empowers teams to build and manage data pipelines rapidly for AI and analytics at scale. It offers a code-optional UI with pre-built components, or users can code in SQL, Python, or dbt.
Matillion integrates with cloud platforms, CDPs, LLMs, and provides AI-generated documentation. It has numerous no-code connectors, supports custom REST API connectors, and allows parameterization with variables. Components work seamlessly within Matillion's unified platform, enabling data lineage, pushdown ELT, and hybrid SaaS deployment. AI capabilities include generative AI prompting, vector store connectivity, reverse ETL for insights, and a natural language copilot for building pipelines.
For more examples of Matillion's AI components in action, check out our library of AI Videos and Demos.
To try Matillion yourself, using your own data, sign up for a free trial.
If you are already a Matillion user or trial customer, you can download the sentiment analysis example shown in the screenshots earlier, and run it on your own platform.