ELGL Co-Founder Kent Wyatt poses a question and a guest columnist answers it. This week, Jason Jones, Assistant Director, Budget Management & Evaluation, Guilford County, NC, writes about how we can understand how people feel when they read city communications.
Maybe if I tried really hard I could dazzle you with quotes about how important the words we choose are. As a general baseline, I will assume that we agree on that point and let you search for quotes yourself. I will argue that although we recognize how important words are, we routinely ignore this simple fact when communicating with the public.
On occasion, we explicitly recognize the tremendous power of words, such as when people say things that are meant to cause a reaction. At other times, we don’t realize how words are subconsciously affecting us as we read or hear them. If you don’t buy in to that second concept, this isn’t the article for you. If you do buy in, I am going to walk through some examples to demonstrate how this plays out in actual local government documents.
I won’t dive too deep into the technical details of this work since there are different methods for exploring this topic. Also, if you aren’t interested, I could lose you really quickly. I do want to do some level setting to let you know what tools I used.
I work in the R programming language and usually stick to just a few packages when processing structured and unstructured text data. For working with PDF documents, I love pdftools from rOpenSci. The rest of my analysis pipeline is usually handled with the tidyverse and tidytext packages. You can actually check out an implementation of some of this for yourself here: The Emotionizer. This is a Shiny app that I built to help other people process text data from PDF documents. For the bonus example, I rely heavily on the koRpus R package. It requires some heavy setup to use, but it is all laid out in the documentation.
Also, for the sentiment and emotion examples, I am doing my word matching with the NRC lexicon. You can read up on that here: NRC Word-Emotion Association Lexicon.
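To make "word matching" a little more concrete, here is a minimal sketch in Python rather than R, so it is not my actual pipeline. It assumes a tiny made-up lexicon (TOY_LEXICON) standing in for the real NRC data, and the function names are hypothetical. The idea is simply to split text into words, look each word up, and tally every category it is associated with. A single word can belong to more than one category, which is also true of the NRC lexicon.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration only; this is NOT the real NRC data.
# Each word maps to the sentiment/emotion categories it is associated with.
TOY_LEXICON = {
    "safe": {"positive", "trust"},
    "help": {"positive", "trust"},
    "threat": {"negative", "fear", "anger"},
    "lost": {"negative", "sadness"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_associations(text, lexicon=TOY_LEXICON):
    """Tally how many matched words fall into each category."""
    counts = Counter()
    for word in tokenize(text):
        # Words that are not in the lexicon contribute nothing.
        for category in lexicon.get(word, ()):
            counts[category] += 1
    return counts


sample = "Officers offered help after the threat, and the lost child was safe."
print(count_associations(sample))
```

With a real lexicon, the only change would be loading the word-to-category table from the lexicon file instead of typing it in by hand.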
I am going to walk through some examples arranged on a spectrum, from most basic to most difficult, that I’ve arbitrarily created based on my experience. This is absolutely not comprehensive, so if something piques your interest, reach out and we can nerd out on this together.
People usually hear me talk about this type of analysis with strategic plans, budget documents, and press releases, but I thought it would be interesting to show you another way this can be used. For my first two examples, I grabbed some law enforcement data from an open data portal and ran some analysis on the arrest narratives. There is a lot more you can do with this, such as looking at the geographic distribution of emotion and sentiment, but I kept it short and sweet for these examples. I’m also not revealing what city this came from, since these were quick and I don’t want anyone jumping to conclusions based on these examples.
Law Enforcement Narrative Sentiment
What you are seeing in this chart is positive and negative word association in law enforcement arrest narratives disaggregated by reported gender. The bars represent the counts of positively and negatively associated words as they appeared in the arrest narratives. I’m actually pleasantly surprised by this since you can see that there is very little difference in the use of positively and negatively associated words in the narratives written for Males and Females. What about you? Do you find anything particularly interesting here?
Law Enforcement Narrative Emotion
What you are seeing in this chart is word-emotion association in the same law enforcement narratives, again disaggregated by reported gender. The bars represent the counts of the different word-emotion associations as they appear in the arrest narratives. I also left the positive and negative word associations from the prior sentiment example in this chart for you. Maybe I will just let you decide if you see anything out of the ordinary here.
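For anyone curious how counts like the ones in these two charts could be produced, here is a rough Python sketch of the disaggregation step. It is an illustration under stated assumptions, not the code behind the charts: the lexicon, the records, and the function name counts_by_group are all made up, and the narratives are invented sentences, not real arrest data.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy lexicon; a real analysis would load the NRC lexicon instead.
TOY_LEXICON = {
    "calm": {"positive"},
    "cooperative": {"positive", "trust"},
    "threat": {"negative", "fear"},
}

# Invented (group, narrative) pairs purely for illustration.
RECORDS = [
    ("Female", "Subject was calm and cooperative."),
    ("Male", "Subject made a threat."),
    ("Male", "Subject was cooperative."),
]


def counts_by_group(records, lexicon):
    """Return {group: Counter of category counts} across all narratives."""
    grouped = defaultdict(Counter)
    for group, text in records:
        for word in re.findall(r"[a-z']+", text.lower()):
            for category in lexicon.get(word, ()):
                grouped[group][category] += 1
    return grouped


for group, counts in sorted(counts_by_group(RECORDS, TOY_LEXICON).items()):
    print(group, dict(counts))
```

Each bar in a chart like the ones above would correspond to one group-and-category count from a table like this.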
If you want to check out another implementation of sentiment and emotion analysis, you can check out my exploration of The City of Raleigh’s budget document. This example actually includes all of my associated R code. I did this back in early 2018 so it was one of my early attempts at this.
One additional thing that I have been working on for a little while now is assessing, in a standardized way, the reading comprehension level of our communications. Consider for a second your jurisdiction’s educational attainment statistics. In Guilford County, 35.7% of the population 25 years and over has a bachelor’s degree or higher. That means that 64.3%, or approximately 227,261 people 25 and over, do not.
When we publish something, it really should be our responsibility to make that publication as broadly accessible as possible. I generally argue for an 8th- to 9th-grade comprehension level, which the Flesch Reading Ease test classifies as “plain English – easily understood by 13- to 15-year-old students.” Check out how Guilford County’s most recent adopted budget document did against that formula. It looks like we are generally writing at a college comprehension level, which potentially alienates a lot of people in the county.
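If you have never seen the formula itself, the Flesch Reading Ease score is 206.835 minus 1.015 times the average number of words per sentence, minus 84.6 times the average number of syllables per word. Higher scores mean easier reading. Below is a minimal Python sketch of that calculation. It is not the koRpus implementation I used: the syllable counter is a crude vowel-group heuristic that I am assuming is good enough for illustration, and the function names are hypothetical, so expect scores to differ somewhat from what a dedicated package reports.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption, not a dictionary lookup):
    count vowel groups and drop a likely-silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text):
    """Flesch Reading Ease: 206.835 - 1.015 * (words / sentences)
    - 84.6 * (syllables / words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


simple = "The cat sat on the mat."
dense = "Municipalities should systematically evaluate publication comprehensibility."
print(round(flesch_reading_ease(simple), 2))
print(round(flesch_reading_ease(dense), 2))
```

Short sentences made of short words push the score up, and long sentences full of multi-syllable words push it down. Very simple or very dense text can even land above 100 or below 0.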
I want to give you one last quick example of reading comprehension, one that compares text written without reading level in mind to text written with it very much in mind. I first looked at this press release from The City of Los Angeles, promoting its DASH to Class program. I compared it against some patient-level educational material on high blood pressure and hypertension created by UpToDate to assist medical providers. I have to credit my wife for providing this text and recommending it as a great example of something that carefully considers reading comprehension.
The City of Los Angeles press release had a Flesch Reading Ease score of 42.53, which equates to a college reading level or higher. The patient education material had a Flesch Reading Ease score of 80.22, which equates to a 6th-grade reading level. I know the consequences associated with reading comprehension in these two examples are dramatically different, but we certainly could start evaluating our publications in more comprehensive ways.
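To show how a score turns into a school level, here is a small Python sketch that maps a Flesch Reading Ease score onto the commonly published interpretation bands. The function name flesch_band is hypothetical, and treating each band's lower bound as inclusive is my assumption, since published tables are not always explicit about the edges.

```python
def flesch_band(score):
    """Map a Flesch Reading Ease score to its commonly published school level.
    Lower bounds are treated as inclusive (an assumption)."""
    bands = [
        (90.0, "5th grade"),
        (80.0, "6th grade"),
        (70.0, "7th grade"),
        (60.0, "8th and 9th grade"),
        (50.0, "10th to 12th grade"),
        (30.0, "college"),
        (10.0, "college graduate"),
    ]
    for floor, label in bands:
        if score >= floor:
            return label
    return "professional"


print(flesch_band(42.53))  # the press release score
print(flesch_band(80.22))  # the patient education score
```

The 8th and 9th grade band, scores from 60 to 70, is the "plain English" target I argue for.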
I know there is a lot crammed in here and it is difficult to wrap up with a nice neat bow. If you have any lingering questions, really don’t like something I’ve done here, or want to nerd out on this with me, please feel free to just yell at me on Twitter (@packpridejones).