UVM Theses and Dissertations

Format:

Online

Author:

Reagan, Andrew J.

Title:

Towards a Science of Human Stories : Using Sentiment Analysis and Emotional Arcs to Understand the Building Blocks of Complex Social Systems

Dept./Program:

Mathematics and Statistics

Year:

2017

Degree:

PhD

Abstract:

We can leverage data and complex systems science to better understand society and human nature on a population scale through language -- utilizing tools that include sentiment analysis, machine learning, and data visualization. Data-driven science and the sociotechnical systems that we use every day are enabling a transformation from hypothesis-driven, reductionist methodology to complex systems sciences. Namely, the emergence and global adoption of social media has rendered possible the real-time estimation of population-scale sentiment, with profound implications for our understanding of human behavior. Advances in computing power, natural language processing, and digitization of text now make it possible to study a culture's evolution through its texts using a "big data" lens. Given the growing assortment of sentiment measuring instruments, it is imperative to understand which aspects of sentiment dictionaries contribute to both their classification accuracy and their ability to provide richer understanding of texts. Here, we perform detailed, quantitative tests and qualitative assessments of 6 dictionary-based methods applied to 4 different corpora, and briefly examine a further 20 methods. We show that while inappropriate for sentences, dictionary-based methods are generally robust in their classification accuracy for longer texts. Most importantly they can aid understanding of texts with reliable and meaningful word shift graphs if (1) the dictionary covers a sufficiently large enough portion of a given text's lexicon when weighted by word usage frequency; and (2) words are scored on a continuous scale. Our ability to communicate relies in part upon a shared emotional experience, with stories often following distinct emotional trajectories, forming patterns that are meaningful to us. By classifying the emotional arcs for a filtered subset of 4,803 stories from Project Gutenberg's fiction collection, we find a set of six core trajectories which form the building blocks of complex narratives. We strengthen our findings by separately applying optimization, linear decomposition, supervised learning, and unsupervised learning. For each of these six core emotional arcs, we examine the closest characteristic stories in publication today and find that particular emotional arcs enjoy greater success, as measured by downloads. Within stories lie the core values of social behavior, rich with both strategies and proper protocol, which we can begin to study more broadly and systematically as a true reflection of culture. Of profound scientific interest will be the degree to which we can eventually understand the full landscape of human stories, and data driven approaches will play a crucial role. Finally, we utilize web-scale data from Twitter to study the limits of what social data can tell us about public health, mental illness, discourse around the protest movement of #BlackLivesMatter, discourse around climate change, and hidden networks. We conclude with a review of published works in complex systems that separately analyze charitable donations, the happiness of words in 10 languages, 100 years of daily temperature data across the United States, and Australian Rules Football games.

Request print copy from Annex

Search Website

Search Directory

A to Z

Search Website

Search Directory

Collections

Research

Services

About

Help

Ask a Librarian

Threre are lots of ways to contact a librarian. Choose what works best for you.

10:00 am - 4:00 pm

Reference Desk

(802) 656-2022

Voice

(802) 503-1703

Text

Meet with a librarian or subject specialist for in-depth help.

Submit a question for reply by e-mail.

WANT TO TALK TO SOMEONE RIGHT AWAY?

Library Hours for Thursday, November 21st

All of the hours for today can be found below. We look forward to seeing you in the library.

HOURS TODAY

MAIN LIBRARY

WITHIN HOWE LIBRARY

OTHER DEPARTMENTS

CATQuest

Search the UVM Libraries' collections

UVM Theses and Dissertations