- Blog
- 06.24.2024
- Data Fundamentals
Sentiment Analysis in Snowflake with Meta Llama3 70B using Amazon Bedrock

Sentiment Analysis is a powerful technique that enables organizations to gain valuable insights from textual data, such as product reviews, social media posts, and customer feedback. This article will explore various methods for performing Sentiment Analysis in Snowflake, starting with Python-based approaches. We will delve into the capabilities of Meta Llama3 70B, a state-of-the-art language model, and how it can be employed through Amazon Bedrock for Sentiment Analysis tasks. To begin, we will provide an overview of Sentiment Analysis, explaining its significance and the underlying principles that drive this analytical technique.
What is Sentiment Analysis?
Sentiment Analysis extracts numerical sentiment scores from unstructured text, providing a quantifiable measure of opinions and emotions. This process involves processing textual data to classify its sentiment polarity, whether positive, negative, or neutral. Large language models (LLMs), such as Meta's Llama3 70B, have markedly improved Sentiment Analysis because they capture context, nuance, and linguistic subtleties.
Large language models are trained on vast datasets, which lets them discern sentiment from context, often without any task-specific training. They are built on deep learning architectures, specifically transformer models, which capture long-range dependencies in text. This allows for accurate sentiment predictions even in the face of complex and idiomatic expressions.
Reliable data preparation is paramount for optimal performance in sentiment analysis. Data engineers play a critical role in this process by ensuring that data from various sources are clean, consistent, and well-structured. They must efficiently interface between the database and LLMs, implementing pipelines for data ingestion, preprocessing, and storage. This ensures the models receive high-quality input, maintaining the integrity and accuracy of sentiment scores.
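As a rough illustration of the preprocessing step described above, the sketch below normalizes a review before it is sent to an LLM. The function name clean_review and the MAX_CHARS limit are hypothetical choices made for this example, not part of any Snowflake or Bedrock API, and the cleaning rules would need adapting to your own data.

```python
import re
import unicodedata

MAX_CHARS = 4000  # hypothetical cap to keep prompts short; tune for your model


def clean_review(text):
    """Return a normalized review string, or None if nothing usable remains."""
    if text is None:
        return None
    # Normalize Unicode so visually identical characters compare equal
    text = unicodedata.normalize("NFKC", text)
    # Drop simple HTML tags left over from web scraping (tag names start with a letter)
    text = re.sub(r"</?[A-Za-z][^>]*>", " ", text)
    # Collapse runs of whitespace (including newlines) into single spaces
    text = re.sub(r"\s+", " ", text).strip()
    if not text:
        return None
    return text[:MAX_CHARS]


print(clean_review("  Great   product!<br/>\nWould buy again.  "))
```

Returning None for empty input gives the calling pipeline a simple way to skip records that would otherwise waste an LLM call.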
Business examples of Sentiment Analysis:
- Social media monitoring: Analyze user-generated content to gauge public sentiment towards brands, products, or campaigns.
- Customer feedback analysis: Automatically categorize customer reviews, support tickets, or survey responses to identify pain points and areas for improvement.
- Financial market prediction: Analyze news articles, social media posts, and earnings call transcripts to predict stock price movements based on sentiment towards companies or industries.
What is Meta Llama3 70B?
Meta Llama3 70B is a large language model developed by Meta AI, and a member of the Llama model family. It's a transformer-based architecture with approximately 70 billion parameters, trained on a massive dataset of text from the internet. The model uses a decoder-only architecture, similar to the GPT family of models and unlike encoder-only models such as BERT and RoBERTa. It's designed to generate human-like text and can be fine-tuned for various natural language processing tasks.
Pros:
- High-quality text generation capabilities
- Can be fine-tuned for specific tasks like chatbots, language translation, and text summarization
- Openly available model weights, released under the Meta Llama 3 Community License, for research and development
Cons:
- Requires significant computational resources and memory for training and inference
- May exhibit biases and inaccuracies due to its training data
- Can be challenging to fine-tune and adapt to specific use cases
Ideal use cases:
- Building conversational AI systems like chatbots and virtual assistants
- Generating content for websites, social media, and marketing materials
- Developing language translation and localization tools
- Creating automated text summarization and analysis systems
How to perform Sentiment Analysis in Snowflake with Meta Llama3 70B using Python with the Amazon Bedrock SDK
Prerequisites for the boto3 Amazon Bedrock Python SDK
Start by installing the prerequisite libraries:
python3 -m pip install snowflake-connector-python boto3
Afterwards, load your source data into Snowflake.
Python boto3 for Meta Llama3 70B
The example below involves product reviews, and assumes that the data has been loaded into a database table named "stg_sample_reviews" with four columns: id (the primary key), stars, product and review.
The Python script is shown below. Note it is good practice to handle credentials more securely than shown in this simple example. You might choose to use a secret management service instead of environment variables or hardcoding.
Also please note that handling large amounts of data using fetchall() can be inefficient, and may result in memory issues. For large datasets, you should use a cursor to fetch rows incrementally using the fetchmany() method instead.
import os
import json
import logging

import boto3
import snowflake.connector
from botocore.exceptions import ClientError

logger = logging.getLogger("demo")

# Create the Bedrock runtime client once and reuse it for every call
bedrock = boto3.client(service_name="bedrock-runtime", region_name="us-east-1")
MODEL_ID = "meta.llama3-70b-instruct-v1:0"


# Use the Amazon Bedrock InvokeModel API
def analyze_sentiment(text):
    # The prompt lines are deliberately not indented: they are part of the string
    prompt = f"""<|begin_of_text|><|start_header_id|>user<|end_header_id|>
Analyze the sentiment of the following text and return a score from 1 to 5, where 1 represents the most negative sentiment and 5 represents the most positive sentiment: {text}
Respond with a single number only. Do not include any notes, justification, explanation or confidence level, just the number.
<|eot_id|><|start_header_id|>assistant<|end_header_id|>"""
    body = json.dumps({"prompt": prompt, "top_p": 0.9, "temperature": 0.5})
    try:
        response = bedrock.invoke_model(body=body, modelId=MODEL_ID, accept="application/json", contentType="application/json")
    except ClientError as err:
        # Log the failure, then let it propagate so the run stops visibly
        logger.error("Bedrock InvokeModel failed: %s", err)
        raise
    response_body = json.loads(response["body"].read())
    return response_body["generation"].strip()


# Establish a Snowflake connection
conn = snowflake.connector.connect(
    user=os.environ["SF_USER"],
    password=os.environ["SF_PASSWORD"],
    account=os.environ["SF_ACCOUNT"],
    warehouse=os.environ["SF_WH"],
    database=os.environ["SF_DB"],
    schema=os.environ["SF_SCHEMA"],
    role=os.environ["SF_ROLE"]
)
cur = None  # Defined up front so the finally block is safe if cursor() fails
try:
    # Create a cursor object using the connection
    cur = conn.cursor()
    # Create the destination table
    cur.execute('CREATE OR REPLACE TABLE "stg_sample_reviews_genai" ("id" NUMBER(6,0) NOT NULL, "ai_score" VARCHAR(1024) NOT NULL)')
    # Select source rows from the table
    cur.execute('SELECT "id", "review" FROM "stg_sample_reviews"')
    # Fetch all rows from the executed query
    rows = cur.fetchall()
    # Loop through the fetched rows and call the analyze_sentiment function
    for row in rows:
        ai_score = analyze_sentiment(row[1])
        # Bind the values as parameters instead of formatting them into the SQL text,
        # so that unexpected LLM output cannot break or alter the statement
        cur.execute('INSERT INTO "stg_sample_reviews_genai" ("id", "ai_score") VALUES (%s, %s)', (row[0], ai_score))
finally:
    # Close the cursor and connection
    if cur:
        cur.close()
    conn.close()
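As a variation on the script above, the fetchmany() approach mentioned in the earlier note might look like the following minimal sketch. The iter_rows helper and the batch size are hypothetical, and an in-memory sqlite3 database stands in for Snowflake purely so that the example runs on its own. The same generator should work with a Snowflake cursor, since both follow the Python DB-API and expose fetchmany().

```python
import sqlite3


def iter_rows(cursor, batch_size=500):
    """Yield rows from an already-executed DB-API cursor, one batch at a time."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        yield from batch


# Stand-in database so the sketch runs without a Snowflake account
demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()
demo_cur.execute('CREATE TABLE "stg_sample_reviews" ("id" INTEGER, "review" TEXT)')
demo_cur.executemany(
    'INSERT INTO "stg_sample_reviews" VALUES (?, ?)',
    [(i, f"review text {i}") for i in range(1, 8)],
)

# Only batch_size rows are held in memory at any one time
demo_cur.execute('SELECT "id", "review" FROM "stg_sample_reviews" ORDER BY "id"')
streamed_ids = [row[0] for row in iter_rows(demo_cur, batch_size=3)]
print(streamed_ids)
demo_conn.close()
```

If you adopt this pattern in the main script, use a second cursor for the INSERT statements, because executing a new statement on the cursor that is being read discards its pending result set.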
After running the above script, you should find a new table has been created, which contains the AI-generated review score for every input record. Join this table to the original on the common id column to compare the AI-generated sentiment scores against the original star review.
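The join described above could be sketched as follows. The query text uses the table and column names from this article and assumes that "stars" holds a whole-number rating from 1 to 5. The agreement_rate helper is a hypothetical addition for this example, and the sample rows are made up to show the expected shape of the query result.

```python
# Join the AI-generated scores back to the source rows on the shared id column
COMPARE_SQL = '''
SELECT s."id", s."stars", g."ai_score"
FROM "stg_sample_reviews" s
JOIN "stg_sample_reviews_genai" g ON g."id" = s."id"
ORDER BY s."id"
'''


def agreement_rate(rows):
    """Return the fraction of (id, stars, ai_score) rows where both scores match.

    Rows whose ai_score is not a whole number are counted as mismatches.
    """
    if not rows:
        return 0.0
    matches = 0
    for _id, stars, ai_score in rows:
        try:
            if int(str(ai_score).strip()) == int(stars):
                matches += 1
        except (TypeError, ValueError):
            # Non-numeric LLM output or a missing star rating: treat as a mismatch
            pass
    return matches / len(rows)


# Made-up rows in the shape that executing COMPARE_SQL would return
sample_rows = [(1, 5, "5"), (2, 2, "3"), (3, 4, "4"), (4, 1, "n/a")]
print(agreement_rate(sample_rows))
```

Because "ai_score" is stored as VARCHAR, the comparison is done after converting it to an integer, which also surfaces any responses where the model ignored the "single number only" instruction.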
The LLM was asked to score between 1 and 5, so you may choose to classify the scores more broadly as follows:
- 4 or 5 - Positive
- 3 - Neutral
- 1 or 2 - Negative
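The classification above can be expressed as a small helper. This is a minimal sketch: the function name classify_score is hypothetical, and it assumes the LLM response contains a standalone digit from 1 to 5, returning None otherwise so that unexpected responses can be reviewed separately.

```python
import re


def classify_score(raw_score):
    """Map a 1-5 score (as text or number) to Positive, Neutral or Negative.

    Returns None when no standalone digit from 1 to 5 is found in the input.
    """
    match = re.search(r"\b[1-5]\b", str(raw_score))
    if match is None:
        return None
    score = int(match.group())
    if score >= 4:
        return "Positive"
    if score == 3:
        return "Neutral"
    return "Negative"


for raw in ["5", " 3\n", "2", "no idea"]:
    print(repr(raw), "->", classify_score(raw))
```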
Sentiment Analysis in Snowflake using Matillion to run Meta Llama3 70B via Amazon Bedrock
In the Matillion Data Productivity Cloud, orchestration pipelines like the one shown in the screenshot below can:
- Directly extract and load data, or call other pipelines to do so (as shown)
- Invoke Meta Llama3 70B, with a nominated prompt, against all rows from a nominated table
[Screenshot: Sentiment Analysis in Snowflake using Matillion]
Data pipelines such as this manage all the connectivity and plumbing between the Snowflake source and target tables, and the LLM.
This allows you to focus on the overall design and architecture, and the data analysis. To compare the AI-generated sentiment scores against the original star review, use a transformation pipeline like the one in the next screenshot.
[Screenshot: Checking the results of Sentiment Analysis in Snowflake using Matillion]
The data sample shows two of the records. In one case the LLM's score matches the original star rating exactly, but in the other record the ratings differ slightly. This is an example of the subjective nature of sentiment analysis.
Summary
Matillion is a data pipeline platform that empowers teams to build and manage pipelines rapidly for AI and analytics at scale. It offers a code-optional UI with pre-built components, or you can code in SQL, Python, or dbt. Matillion integrates with hyperscalers, CDPs, and LLMs, and has Git integration for asynchronous collaboration. It can use AI to generate documentation, provides no-code connectors and custom REST API connectors, and allows parameterization with variables. All components work within one platform. Matillion supports hybrid SaaS deployment, data lineage, pushdown ELT, vector store connectivity, reverse ETL for AI insights, natural language pipeline building with Copilot, and no-code Generative AI prompting and retrieval-augmented generation.
For more examples of Matillion's AI components in action, check out our library of AI Videos and Demos.
To try Matillion yourself, using your own data, sign up for a free trial.
If you are already a Matillion user or trial customer, you can download the sentiment analysis example shown in the screenshots earlier, and run it on your own platform.