Simple CLI for Categorizing Text and Analyzing Sentiment
Now that I’ve spent some time with huggingface.co, specifically their NLP (natural language processing) Course, I wanted to combine a couple of the things I learned into a simple Python script.
What I ended up with was a script that can both categorize text using a zero-shot-classification model and gauge its sentiment using a sentiment-analysis model.
You can interact with this script in one of two ways. The first is to pass a string as an argument when you run it:
% python main.py "Riding bikes on the beach is glorious"
Loading...
Sentiment: POSITIVE 99.99%
Categories:
Beauty 44.21%
Sports 17.98%
The second is more efficient, since it loads the models once and then lets you enter as many texts as you like:
% python main.py
Loading...
Text: My favorite tv show lately has been the Mandalorian
Sentiment: POSITIVE 99.60%
Categories:
Television 89.12%
Text: I wish I had more pizza at home :(
Sentiment: NEGATIVE 99.61%
Categories:
Food 60.00%
Home 19.30%
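The two modes above really only differ in where the text comes from. Here is a minimal sketch of that idea as a small generator. Note that `iter_texts` and its `read` parameter are hypothetical names I made up for illustration, they are not part of main.py, and skipping blank input lines is an extra assumption on my part.

```python
from typing import Callable, Iterator, Sequence


def iter_texts(argv: Sequence[str], read: Callable[[str], str] = input) -> Iterator[str]:
    """Yield the texts to analyze (hypothetical helper, not in main.py).

    With a command-line argument, yield it once and stop.
    Without one, keep prompting until Ctrl+C or end of input.
    """
    if len(argv) > 1:
        # One-shot mode: python main.py "some text"
        yield argv[1]
        return

    # Interactive mode: the models stay loaded between prompts.
    while True:
        try:
            text = read("Text: ")
        except (KeyboardInterrupt, EOFError):
            return
        # Assumption: blank lines are skipped rather than classified.
        if text.strip():
            yield text
```

With something like this, the main loop could be written as `for text in iter_texts(sys.argv): ...`, and the prompt can be swapped for a fake `read` function when testing.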
And that is it, that’s the post ✌️
main.py
#!/usr/bin/env python3
import sys

from transformers import pipeline

# List of categories used in zero_shot_classification
categories = [
    'Automotive', 'Beauty', 'Books', 'Literature', 'Business', 'Careers', 'Education', 'Family', 'Parenting', 'Food', 'Gaming', 'Health', 'Hobbies',
    'Interests', 'Home', 'Garden', 'Law', 'Government', 'Politics', 'Life', 'Movies', 'Television', 'Music', 'Radio', 'Finance', 'Pets',
    'Science', 'Sports', 'Fashion', 'Technology', 'Computing', 'Travel'
]

print("Loading...")

# load our language models into memory, this can take time.
zero_shot_classification = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
sentiment_analysis = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")

while True:
    print()
    try:
        # read the text inside the try block so Ctrl+C (or Ctrl+D)
        # at the prompt exits cleanly instead of printing a traceback
        if len(sys.argv) > 1:
            text = sys.argv[1]
        else:
            text = input("Text: ")
            print()

        # run zero_shot_classification on text using our categories
        zsc = zero_shot_classification(
            text,
            candidate_labels=categories
        )

        # run sentiment_analysis on the same text
        sa_results = sentiment_analysis(text)

        # display sentiment
        sentiment = sa_results[0]['label']
        sentiment_score = '%.2f%%' % (sa_results[0]['score'] * 100)
        print('Sentiment: %s %s' % (sentiment, sentiment_score))

        # display every category that scored above 10%
        print('Categories:')
        for label, score in zip(zsc['labels'], zsc['scores']):
            if round(score * 100, 2) > 10:
                print(' %s %.2f%%' % (label, score * 100))

        # one-shot mode: exit after handling the command-line argument
        if len(sys.argv) > 1:
            sys.exit()
    except (KeyboardInterrupt, EOFError):
        sys.exit()
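
If you want to play with the "Categories" step without loading any models, here is a rough sketch of the filtering and formatting on its own. It assumes the zero-shot pipeline returns a dict with parallel `labels` and `scores` lists, which is how main.py uses it. The `top_categories` helper, its `threshold` and `limit` parameters, and the `sample` dict are all hypothetical and hand-written for illustration, not real model output. It compares the raw score against the threshold, where main.py rounds to two decimals first.

```python
def top_categories(result, threshold=0.10, limit=None):
    """Return (label, 'NN.NN%') pairs scoring above `threshold`, best first.

    Hypothetical helper: `result` is assumed to look like the zero-shot
    pipeline output, a dict with parallel 'labels' and 'scores' lists.
    """
    pairs = [
        (label, score)
        for label, score in zip(result["labels"], result["scores"])
        if score > threshold
    ]
    # Sort explicitly so the helper does not rely on the input order.
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    if limit is not None:
        pairs = pairs[:limit]
    return [(label, f"{score * 100:.2f}%") for label, score in pairs]


# Hand-written sample in the assumed shape (not real model output).
sample = {
    "sequence": "I wish I had more pizza at home :(",
    "labels": ["Food", "Home", "Travel"],
    "scores": [0.6, 0.193, 0.02],
}

print("Categories:")
for label, pct in top_categories(sample):
    print(f" {label} {pct}")
```

Passing `limit=4` would cap the list at the top four labels, if you would rather have a fixed-size list than a score cutoff.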