Sentiment Analysis for Songs¶
In this assignment you're going to try assigning sentiments to songs from the songs dataset.
Loading the Song Lyrics Dataset¶
from google.colab import drive
import pandas as pd
drive.mount('/content/gdrive')
df = pd.read_csv('/content/gdrive/My Drive/datasets/songs.csv')
Getting the terms out of every song¶
all_terms = df['Lyrics'].str.cat(sep=" ").split()
Getting the terms out of a specific song¶
taylor_df = df[df["Artist"] == "Taylor Swift"]
# Build the title mask from taylor_df itself so its index matches the frame being filtered
lover_song_lyrics = taylor_df[taylor_df["Title"] == "Lover"].iloc[0]["Lyrics"]
lover_song_terms = lover_song_lyrics.split()
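Some lyrics strings in this dataset end with page residue such as `108EmbedShare URLCopyEmbedCopy` (it is visible at the end of the "Lover" lyrics printed later in this notebook), and splitting on whitespace turns that residue into terms. One possible cleanup is sketched below. It assumes the residue always has this shape (optional digits followed by that exact text at the very end), and the helper name `strip_embed_residue` is hypothetical.

```python
import re

# Assumed shape of the residue: optional digits, then this exact text, at the very end.
EMBED_RESIDUE = re.compile(r"\d*EmbedShare URLCopyEmbedCopy\s*$")


def strip_embed_residue(lyrics):
    """Remove the trailing embed residue if present (hypothetical helper)."""
    return EMBED_RESIDUE.sub("", lyrics)


sample = "Darling, you're my, my, my, my lover108EmbedShare URLCopyEmbedCopy"
cleaned = strip_embed_residue(sample)
print(cleaned)
print(cleaned.split()[-1])
```

One caveat: because the pattern also removes digits directly before the residue, a song whose last line really ends in a number would lose that number.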
The Sentiment Analysis Example from our Slides¶
import spacy
from textblob import TextBlob
nlp = spacy.load("en_core_web_sm")
sentence = "I am very unhappy with this product."
doc = nlp(sentence)
blob = TextBlob(doc.text)
# Note, positive polarity means positive sentiment, negative means negative sentiment.
print(f"Sentiment Polarity: {blob.sentiment.polarity}")
print(f"Sentiment Subjectivity: {blob.sentiment.subjectivity}")
print(f"Assessments: {blob.sentiment_assessments.assessments}")
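Since the goal is to assign a sentiment to each song, it can help to turn the numeric polarity (a value between -1.0 and 1.0) into a label. The sketch below shows one way to do that. The function name `polarity_to_label` and the 0.05 cutoff are assumptions chosen for illustration; neither comes from TextBlob or the slides.

```python
def polarity_to_label(polarity, cutoff=0.05):
    """Map a polarity score in [-1.0, 1.0] to a coarse sentiment label.

    Hypothetical helper: the 0.05 cutoff is an arbitrary choice,
    not something defined by TextBlob.
    """
    if polarity > cutoff:
        return "positive"
    if polarity < -cutoff:
        return "negative"
    return "neutral"


for score in (0.31, -0.30, 0.01):
    print(score, polarity_to_label(score))
```

A wider cutoff sends more songs to "neutral", so it is worth trying a few values against songs whose mood you already know.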
Applying a Function to Every Row to Create a new Series¶
You may want to treat sentiment as a new column in your dataset. You can do this with the pandas .apply method!
Here's an example that calculates a simple word count column for the songs dataframe.
def lyrics_count(row):
    words = row["Lyrics"].split()
    num_words = len(words)
    return num_words

df["Lyrics Word Count"] = df.apply(lyrics_count, axis=1)
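When the new column depends on a single existing column, an alternative to row-wise `.apply` is to call `.apply` on that column directly. The sketch below uses a two-row toy dataframe (made up for illustration, not the songs dataset) to show that both approaches give the same counts.

```python
import pandas as pd

# Toy stand-in for the songs dataframe; these rows are made up.
toy_df = pd.DataFrame({
    "Title": ["Song A", "Song B"],
    "Lyrics": ["la la la", "one two three four"],
})


def lyrics_count(row):
    # Row-wise version: receives a whole row, like the example above.
    return len(row["Lyrics"].split())


row_wise = toy_df.apply(lyrics_count, axis=1)

# Column-wise version: the function receives one lyrics string at a time.
column_wise = toy_df["Lyrics"].apply(lambda lyrics: len(lyrics.split()))

print(row_wise.tolist())
print(column_wise.tolist())
```

The row-wise form is the one to reach for when a function needs several columns at once, for example both the artist and the lyrics.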
Preprocessing¶
Copied and pasted from our last assignment
import spacy
from nltk.stem import PorterStemmer
nlp = spacy.load("en_core_web_sm")
stemmer = PorterStemmer()
def preprocess(doc_str, with_stemming=False, with_lemmatization=False):
    """preprocess takes a string, doc_str, and returns the string preprocessed.
    By default, preprocessing means lowercasing, removing punctuation, and
    removing stop words.
    Optionally, you may stem or lemmatize as well by passing with_stemming=True
    or with_lemmatization=True.
    """
    # Lowercase
    doc_str = doc_str.lower()
    doc = nlp(doc_str)  # Initialize as a spaCy object (list of tokens)
    words = []
    for token in doc:
        # Skip punctuation and stop words
        if not token.is_punct and not token.is_stop:
            text = token.text
            if with_lemmatization:
                text = token.lemma_
            if with_stemming:
                text = stemmer.stem(text)
            words.append(text)
    # Turn them back into one string
    doc_str = " ".join(words)
    return doc_str
In [4]:
from google.colab import drive
import pandas as pd
drive.mount('/content/gdrive')
df = pd.read_csv('/content/gdrive/My Drive/datasets/songs.csv')
df
Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).
Out[4]:
 | Artist | Title | Lyrics |
---|---|---|---|
0 | Taylor Swift | cardigan | Vintage tee, brand new phone\nHigh heels on co... |
1 | Taylor Swift | exile | I can see you standing, honey\nWith his arms a... |
2 | Taylor Swift | Lover | We could leave the Christmas lights up 'til Ja... |
3 | Taylor Swift | the 1 | I'm doing good, I'm on some new shit\nBeen say... |
4 | Taylor Swift | Look What You Made Me Do | I don't like your little games\nDon't like you... |
... | ... | ... | ... |
740 | George Michael | The First Time Ever I Saw Your Face | The first time ever I saw your face\nI thought... |
741 | George Michael | Waiting For That Day/You Can’t Always Get What... | Now every day I see you in some other face\nTh... |
742 | George Michael | Shoot the Dog | GTI, Hot Shot\nHe parks it there, just to piss... |
743 | George Michael | Star People | Maybe your mama gave you up, boy\nMaybe your d... |
744 | George Michael | Tonight | Tonight\nDo we have to fight again\nTonight?\n... |
745 rows × 3 columns
In [5]:
def get_lyrics(songs_df, artist, title):
    """Given the songs.csv dataframe, pulls out the lyrics for a particular artist and song.
    """
    # Build one mask from songs_df itself. Chaining a second filter built from
    # df["Title"] ignores the songs_df argument and makes pandas warn that the
    # "Boolean Series key will be reindexed to match DataFrame index".
    mask = (songs_df["Artist"] == artist) & (songs_df["Title"] == title)
    return songs_df[mask].iloc[0]["Lyrics"]

get_lyrics(df, "Taylor Swift", "Lover")
Out[5]:
"We could leave the Christmas lights up 'til January\nAnd this is our place, we make the rules\nAnd there's a dazzling haze, a mysterious way about you, dear\nHave I known you twenty seconds or twenty years?\n\nCan I go where you go?\nCan we always be this close?\nForever and ever, ah\nTake me out, and take me home\nYou're my, my, my, my lover\n\nWe could let our friends crash in the living room\nThis is our place, we make the call\nAnd I'm highly suspicious that everyone who sees you wants you\nI've loved you three summers now, honey, but I want 'em all\n\nCan I go where you go?\nCan we always be this close?\nForever and ever, ah\nTake me out, and take me home (Forever and ever)\nYou're my, my, my, my lover\nLadies and gentlemen, will you please stand?\nWith every guitar string scar on my hand\nI take this magnetic force of a man to be my lover\nMy heart's been borrowed and yours has been blue\nAll's well that ends well to end up with you\nSwear to be overdramatic and true to my lover\nAnd you'll save all your dirtiest jokes for me\nAnd at every table, I'll save you a seat, lover\n\nCan I go where you go?\nCan we always be this close?\nForever and ever, ah\nTake me out, and take me home (Forever and ever)\nYou're my, my, my, my\nOh, you're my, my, my, my\nDarling, you're my, my, my, my lover108EmbedShare URLCopyEmbedCopy"
In [11]:
#!spacy download en_core_web_lg
import spacy
from textblob import TextBlob

nlp = spacy.load("en_core_web_lg")
song_lyrics = get_lyrics(df, "Taylor Swift", "Lover")
doc = nlp(song_lyrics)
blob = TextBlob(doc.text)
# Note, positive polarity means positive sentiment, negative means negative sentiment.
print(f"Sentiment Polarity: {blob.sentiment.polarity}")
print(f"Sentiment Subjectivity: {blob.sentiment.subjectivity}")
print(f"Assessments: {blob.sentiment_assessments.assessments}")

def get_polarity(row):
    # TextBlob works on the raw string, so no spaCy pass is needed here.
    blob = TextBlob(row["Lyrics"])
    return blob.sentiment.polarity

# .copy() makes taylor_df an independent dataframe, so adding a column
# does not trigger pandas' SettingWithCopyWarning.
taylor_df = df[df["Artist"] == "Taylor Swift"].copy()
taylor_df["Polarity"] = taylor_df.apply(get_polarity, axis=1)
taylor_df.sort_values("Polarity")
# assessments_df = pd.DataFrame(blob.sentiment_assessments.assessments, columns=["words", "polarity", "subj", "unknown"])
# assessments_df
Sentiment Polarity: 0.3085714285714286
Sentiment Subjectivity: 0.5985714285714286
Assessments: [(['dazzling'], 0.75, 1.0, None), (['mysterious'], 0.0, 1.0, None), (['highly'], 0.16, 0.5399999999999999, None), (['wants'], 0.2, 0.1, None), (['loved'], 0.7, 0.8, None), (['blue'], 0.0, 0.1, None), (['true'], 0.35, 0.65, None)]
Out[11]:
 | Artist | Title | Lyrics | Polarity |
---|---|---|---|---|
29 | Taylor Swift | mad woman | What did you think I'd say to that?\nDoes a sc... | -0.299194 |
47 | Taylor Swift | Bad Blood | ’Cause baby, now we've got bad blood\nYou know... | -0.258543 |
23 | Taylor Swift | this is me trying | I've been having a hard time adjusting\nI had ... | -0.210370 |
32 | Taylor Swift | The Man | I would be complex, I would be cool\nThey'd sa... | -0.148677 |
5 | Taylor Swift | betty | Betty, I won't make assumptions\nAbout why you... | -0.140934 |
42 | Taylor Swift | epiphany | Keep your helmet, keep your life, son\nJust a ... | -0.118681 |
16 | Taylor Swift | Cruel Summer | (Yeah, yeah, yeah, yeah)\n\nFever dream high i... | -0.104993 |
48 | Taylor Swift | Cornelia Street | We were in the backseat\nDrunk on something st... | -0.088902 |
7 | Taylor Swift | End Game | I wanna be your end game\nI wanna be your firs... | -0.086001 |
22 | Taylor Swift | illicit affairs | Make sure nobody sees you leave\nHood over you... | -0.083847 |
14 | Taylor Swift | my tears ricochet | We gather here, we line up, weepin' in a sunli... | -0.039423 |
4 | Taylor Swift | Look What You Made Me Do | I don't like your little games\nDon't like you... | -0.028656 |
6 | Taylor Swift | august | Salt air, and the rust on your door\nI never n... | -0.018182 |
21 | Taylor Swift | Style | Midnight\nYou come and pick me up, no headligh... | -0.007077 |
10 | Taylor Swift | Blank Space | Nice to meet you, where you been?\nI could sho... | -0.003736 |
8 | Taylor Swift | You Need To Calm Down | You are somebody that I don't know\nBut you're... | 0.010728 |
45 | Taylor Swift | Miss Americana & The Heartbreak Prince | You know I adore you, I'm crazier for you\nTha... | 0.011639 |
15 | Taylor Swift | invisible string | Green was the color of the grass\nWhere I used... | 0.014601 |
12 | Taylor Swift | champagne problems | You booked the night train for a reason\nSo yo... | 0.015000 |
18 | Taylor Swift | Delicate | This ain't for the best\nMy reputation's never... | 0.022523 |
39 | Taylor Swift | hoax | My only one\nMy smoking gun\nMy eclipsed sun\n... | 0.024566 |
33 | Taylor Swift | Don't Blame Me | Don't blame me, love made me crazy\nIf it does... | 0.042442 |
24 | Taylor Swift | Love Story | We were both young when I first saw you\nI clo... | 0.050000 |
11 | Taylor Swift | Ready for It? | Knew he was a killer first time that I saw him... | 0.057604 |
41 | Taylor Swift | All Too Well | I walked through the door with you, the air wa... | 0.066892 |
28 | Taylor Swift | peace | Our coming-of-age has come and gone\nSuddenly ... | 0.098810 |
35 | Taylor Swift | Dress | Our secret moments in a crowded room\nThey got... | 0.110345 |
0 | Taylor Swift | cardigan | Vintage tee, brand new phone\nHigh heels on co... | 0.113588 |
26 | Taylor Swift | Gorgeous | Gorgeous\n\nYou should take it as a compliment... | 0.122212 |
49 | Taylor Swift | Wildest Dreams | He said, "Let's get out of this town\nDrive ou... | 0.135847 |
40 | Taylor Swift | Getaway Car | No, nothing good starts in a getaway car\n\nIt... | 0.139484 |
1 | Taylor Swift | exile | I can see you standing, honey\nWith his arms a... | 0.145321 |
37 | Taylor Swift | ivy | How's one to know?\nI'd meet you where the spi... | 0.148543 |
19 | Taylor Swift | Call It What You Want | My castle crumbled overnight\nI brought a knif... | 0.166099 |
44 | Taylor Swift | Mr. Perfectly Fine (Taylors Version) [From the... | Mr. "Perfect face"\nMr. "Here to stay"\nMr. "L... | 0.168296 |
13 | Taylor Swift | willow | I'm like the water when your ship rolled in th... | 0.172024 |
38 | Taylor Swift | gold rush | Gleaming, twinkling\nEyes like sinking ships o... | 0.181973 |
43 | Taylor Swift | The Archer | Combat, I'm ready for combat\nI say I don't wa... | 0.202183 |
20 | Taylor Swift | seven | Please picture me in the trees\nI hit my peak ... | 0.209750 |
9 | Taylor Swift | ME! | I promise that you'll never find another like ... | 0.216667 |
31 | Taylor Swift | tolerate it | I sit and watch you reading with your head low... | 0.217000 |
25 | Taylor Swift | evermore | Gray November\nI've been down since July\nMoti... | 0.228651 |
36 | Taylor Swift | no body, no crime | He did it\nHe did it\n\nEste's a friend of min... | 0.254545 |
27 | Taylor Swift | happiness | Honey, when I'm above the trees\nI see this fo... | 0.263108 |
30 | Taylor Swift | mirrorball | I want you to know\nI'm a mirrorball\nI'll sho... | 0.263131 |
17 | Taylor Swift | the last great american dynasty | Rebekah rode up on the afternoon train, it was... | 0.282994 |
3 | Taylor Swift | the 1 | I'm doing good, I'm on some new shit\nBeen say... | 0.291388 |
34 | Taylor Swift | I Did Something Bad | I never trust a narcissist, but they love me\n... | 0.308197 |
2 | Taylor Swift | Lover | We could leave the Christmas lights up 'til Ja... | 0.308571 |
46 | Taylor Swift | London Boy | We can go driving in, on my scooter\nUh, you k... | 0.354444 |
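It is worth seeing where a song-level score comes from. For "Lover", the polarity and subjectivity printed above match the plain averages of the per-word assessments, which the sketch below recomputes by hand. The tuples are copied from that output; reading the fourth element as a label slot (`None` here) is an assumption.

```python
# Assessments printed above for "Lover": (words, polarity, subjectivity, label)
assessments = [
    (["dazzling"], 0.75, 1.0, None),
    (["mysterious"], 0.0, 1.0, None),
    (["highly"], 0.16, 0.5399999999999999, None),
    (["wants"], 0.2, 0.1, None),
    (["loved"], 0.7, 0.8, None),
    (["blue"], 0.0, 0.1, None),
    (["true"], 0.35, 0.65, None),
]

# Average the second and third element of every tuple.
polarity = sum(a[1] for a in assessments) / len(assessments)
subjectivity = sum(a[2] for a in assessments) / len(assessments)

print(f"Polarity: {polarity:.4f}")
print(f"Subjectivity: {subjectivity:.4f}")
```

Only seven words out of the whole song contribute to the score, which is one reason a song like "I Did Something Bad" can land near the positive end of the table: words the lexicon does not know are simply ignored.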
In [16]:
# !spacy download en_core_web_lg
import spacy
import numpy as np
nlp = spacy.load("en_core_web_lg") # Load spaCy model
rome = nlp("Rome").vector
italy = nlp("Italy").vector
france = nlp("France").vector
guess_paris = rome - italy + france
actual_paris = nlp("Paris").vector
print(f"Distance is {np.linalg.norm(guess_paris - actual_paris)}")
Distance is 43.222904205322266
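A Euclidean distance such as 43.22 is hard to interpret on its own because it depends on how long the vectors are. Cosine similarity compares direction only and is a common complement. The sketch below uses made-up 3-dimensional vectors in place of spaCy's vectors so that it runs without the model; the numbers are purely illustrative and were chosen so the analogy works out exactly.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up toy vectors standing in for word embeddings.
rome = [1.0, 2.0, 0.0]
italy = [0.0, 2.0, 0.0]
france = [0.0, 0.0, 2.0]
paris = [1.0, 0.0, 2.0]

# Same arithmetic as the cell above: capital - country + other country.
guess_paris = [r - i + f for r, i, f in zip(rome, italy, france)]

print(guess_paris)
print(cosine_similarity(guess_paris, paris))
```

With the real spaCy vectors, the same quantity can be computed with NumPy as `np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))`. Values close to 1 mean the guess points in nearly the same direction as the actual vector.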