In this special guest feature, Michael Burke, Director of Science & Technology at MSR Communications, takes a look at the state of the PR industry and its growing relationship with data science and machine learning technology. Machine learning may not replace the art of PR anytime soon, but there are countless areas where machine learning can refine and support the intuition and creativity that are critical to PR. Michael has worked with some of the world’s top brands on marketing and PR strategy, including The Myers-Briggs Company and Airbnb, as well as dozens of cutting-edge technology clients. As a director and data scientist at MSR Communications, he’s living his dream of applying data science to PR and marketing communications.
As nearly all industries race to become data-driven, there are a few that notably lag. Among those is Public Relations (PR), which is an inherently difficult discipline to quantify. While it’s easy enough to aggregate outlet circulation and audience numbers for positive media coverage, or social shares for company content, measuring the impact of that coverage, much less predicting outcomes, remains elusive. For one thing, the effects of media coverage can be difficult to determine through standard attribution models because there is usually no direct trail between someone reading an article and someone taking a desired action. Furthermore, the forces that shape media coverage trends remain mysterious. If we understood them, we’d understand a lot more about what makes us tick as a society.
Currently, analytics in PR is almost entirely reporting-focused, answering questions like: How many people read articles about our company? How many times were they shared? There is no model that explains why one company is a media darling and another continually receives the scorn of the press, or why some executives seem to get a free pass while others have every word they say nitpicked and used against them.
Machine learning may not replace the art of PR anytime soon. In fact, the algorithm advanced enough to replace the instinct and creativity of a seasoned PR pro may very well be the last thing mankind invents (see the singularity). With that said, there are countless areas where machine learning can refine and support the intuition and creativity that are critical to PR.
Uncovering relationships between standard metrics
To start, machine learning can shed light on the relationships between the various metrics that inform most PR reporting. For instance, Unique Monthly Visitors and social shares are common metrics used to retrospectively measure the effectiveness of a campaign. But how do they affect one another? Linear and logistic regression, as well as neural networks, can be used to determine the relationship between such metrics and, possibly even more importantly, to determine whether tactics can be implemented that boost one or the other. For instance, my team analyzed about 400 articles and discovered that search engine ranking is related to the number of social shares an article receives, indicating that if companies invest in sharing their articles through social channels, they may see a 3 to 7 point boost in the article’s Moz score (one of the better measures of SEO impact).
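To make the regression idea concrete, here is a minimal sketch in plain Python. It is not the analysis described above: the numbers are made up for illustration, and the names `fit_line`, `r_squared`, `social_shares` and `seo_score` are hypothetical. It fits a straight line relating shares to an SEO score and reports how much of the variation the line explains.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up numbers for illustration only: one entry per article.
social_shares = [12, 40, 95, 150, 310, 480]
seo_score = [31, 33, 34, 37, 40, 44]

slope, intercept = fit_line(social_shares, seo_score)
print(f"Each additional share is associated with {slope:.3f} score points")
print(f"R^2 = {r_squared(social_shares, seo_score, slope, intercept):.2f}")
```

A fit like this only describes an association. Whether pushing more shares actually causes a higher score is something a controlled test would have to confirm.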
NLP to determine what kind of content has the most impact
Almost every PR professional has had the cringeworthy experience of an executive issuing a mandate to create content that ‘goes viral’. It’s the holy grail of content marketing, yet no one really knows why something goes viral. In fact, one of the most perplexing questions in PR is why certain media content performs well. Natural Language Processing (NLP) is a potentially powerful tool for gaining understanding in this area. Algorithms such as Naive Bayes and Random Forests can be trained on the text of previous content, together with a record of how it performed, to predict the impact of future content.
Currently, platforms like Meltwater and Cision use forms of NLP to analyze the sentiment of articles toward a brand. This is still in its early stages, probably because the same model is applied across the board to numerous brands, so the algorithms don’t pick up the nuances in the language around certain brands. I suspect that the next phase in this technology will involve training the models in a way that incorporates industry and brand nuances for greater accuracy.
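As a rough illustration of what training on previous content can look like, the sketch below implements a tiny multinomial Naive Bayes classifier in plain Python. Everything in it is an assumption made for the example: the `NaiveBayes` class, the four toy headlines, and the ‘high’ and ‘low’ impact labels. A real project would need far more text and a carefully defined measure of impact.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                score += math.log((counts[word] + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)

# Toy training data, invented for this example.
headlines = [
    "ceo reveals surprising data on remote work",
    "surprising study reveals hidden costs",
    "company announces quarterly results",
    "company appoints new regional manager",
]
impact = ["high", "high", "low", "low"]

model = NaiveBayes().fit(headlines, impact)
print(model.predict("new study reveals surprising trend"))
print(model.predict("company announces new manager"))
```

The model simply learns which words appeared more often in content labeled high impact, so its predictions are only as good as the labels and the volume of past content behind them.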
Better targeting of journalists
Much of a PR practitioner’s effort is spent determining which journalists to approach about a story. This laborious process can be improved by techniques like market basket analysis with Association Rules. In marketing, this technique is applied to determine which products a consumer is likely to purchase based on past shopping behavior. In PR, the same method can help identify which reporters are more likely to cover a topic based on other topics that they’ve covered.
For instance, we found that a reporter who had covered antibiotics and constipation was 3.7 times as likely to cover probiotics. If you’re working for a probiotics company, having this kind of information allows you to leverage tools such as Trendkite to identify reporters who have covered those topics in the past, thus removing some of the guesswork from journalist targeting.
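A figure like ‘3.7 times as likely’ corresponds to what association rule mining calls lift: how much more often the target topic appears among reporters who covered the other topics than among reporters overall. The sketch below computes it in plain Python for a made-up set of eight reporters. The data and the function names `support`, `confidence` and `lift` are illustrative assumptions, not the dataset behind the finding above.

```python
def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Share of baskets containing the antecedent that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)

def lift(baskets, antecedent, consequent):
    """Confidence of the rule divided by the baseline rate of the consequent."""
    return confidence(baskets, antecedent, consequent) / support(baskets, consequent)

# Each set is the list of topics one (hypothetical) reporter has covered.
reporters = [
    {"antibiotics", "constipation", "probiotics"},
    {"antibiotics", "constipation", "probiotics"},
    {"antibiotics", "constipation"},
    {"antibiotics", "vaccines"},
    {"probiotics", "nutrition"},
    {"nutrition", "fitness"},
    {"vaccines", "fitness"},
    {"fitness"},
]

rule_lift = lift(reporters, {"antibiotics", "constipation"}, {"probiotics"})
print(f"lift = {rule_lift:.2f}")  # 16/9, about 1.78, for this toy data
```

A lift above 1 means reporters who covered the first two topics are more likely than average to have covered the third. Note that `confidence` would divide by zero if no reporter had covered the antecedent topics, so a real pipeline should filter out such rules first.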
Unsupervised learning for prioritizing target outlets
One consistent challenge in PR involves determining which media outlets are the most relevant to your audiences. It’s not an easy question, because no single metric provides a satisfactory answer. Ranking outlets by circulation is easy enough, but a high circulation doesn’t mean that an article in the outlet will have a great impact for your brand. Standard metrics such as circulation, number of backlinks, authority score and number of significant search terms can provide insight, but because each publication scores differently on each of these, it’s difficult to come up with a single measure that encompasses all of them.
Furthermore, PR practitioners are typically asked to determine optimum media targets without any previous outcome data; very rarely is there data that describes the value a past article had for a brand. In cases like this, unsupervised learning techniques like clustering can group outlets by multiple attributes, so you can begin to gain an understanding of the kind of value that a publication brings. Algorithms such as K-means can divide your target media outlets into groups according to similarity, allowing you to tier them and prioritize your efforts.
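Here is a minimal K-means sketch in plain Python to show the mechanics. The six outlets and their circulation, backlink and authority figures are invented, and `standardize`, `sq_dist` and `kmeans` are hypothetical helper names. The features are rescaled first so that circulation, which has the largest raw numbers, does not dominate the distance calculation.

```python
import random
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # avoid dividing by zero
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100, seed=0):
    """Basic K-means: returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels

# Invented data: [circulation, number of backlinks, authority score]
outlets = ["Outlet A", "Outlet B", "Outlet C", "Outlet D", "Outlet E", "Outlet F"]
features = [
    [900_000, 12_000, 90],
    [850_000, 11_000, 88],
    [950_000, 13_000, 92],
    [40_000, 800, 45],
    [55_000, 600, 40],
    [30_000, 900, 50],
]

labels = kmeans(standardize(features), k=2)
for name, label in zip(outlets, labels):
    print(name, "-> group", label)
```

The group numbers themselves are arbitrary. It is up to the practitioner to inspect each group and decide which one deserves to be treated as the top tier.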
Uncovering hidden patterns in media coverage
Possibly the most exciting frontier for machine learning in PR involves mining for hidden patterns in media coverage. One of the more interesting areas of exploration involves looking at how certain terms in articles are related to other terms. In our analysis of articles on the 2020 election, for example, we used Association Rules to reveal that an article containing the terms “racist”, “sexist” and “Trump” was 57 times more likely to contain the term “homophobic”. If that’s not surprising, consider that an article containing both “Mike Pence” and “Kamala Harris” was 7 times more likely to contain the term “racist”, or that an article containing “Biden”, “Castro” and “Klobuchar” was 6.8 times more likely to contain the term “socialist” (something the Biden campaign would do well to consider).
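For readers who want to try this kind of term mining, the sketch below scans a small corpus and ranks every pair of terms by lift, meaning how much more often the two terms appear together than they would if they were unrelated. It is a simplified stand-in for a full Association Rules analysis, since it only looks at pairs rather than multi-term rules. The corpus and the `pair_lifts` function are invented for illustration.

```python
from collections import Counter
from itertools import combinations

def pair_lifts(articles, min_count=2):
    """Rank term pairs by lift; each article is a set of terms."""
    n = len(articles)
    term_counts = Counter(term for terms in articles for term in terms)
    pair_counts = Counter(
        pair for terms in articles for pair in combinations(sorted(terms), 2)
    )
    results = []
    for (a, b), together in pair_counts.items():
        if together < min_count:
            continue  # skip pairs too rare to say anything about
        expected = (term_counts[a] / n) * (term_counts[b] / n)
        results.append(((together / n) / expected, a, b))
    return sorted(results, reverse=True)

# Invented corpus: the key terms extracted from six articles.
corpus = [
    {"tariffs", "steel", "jobs"},
    {"tariffs", "steel"},
    {"jobs", "wages"},
    {"wages", "inflation"},
    {"jobs", "inflation", "wages"},
    {"steel", "tariffs", "inflation"},
]

for pair_lift, a, b in pair_lifts(corpus):
    print(f"{a} + {b}: lift {pair_lift:.2f}")
```

The `min_count` threshold matters in practice. Very rare pairs can produce enormous lift values from a handful of articles, so it is worth checking how many articles stand behind any striking number.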
These are, of course, just a few of the areas where machine learning could be applied to PR. Others that are primed for it include media crisis prediction and news cycle analysis. True, PR is a difficult discipline to quantify because it is so inherently ‘human’, but in my opinion that’s exactly why it offers the greatest potential reward.