What is social media data scraping?
Social media data scraping is the practice of using third-party tools to automatically collect data from a website such as Twitter, Instagram, or Facebook.
With third-party data scraping tools such as Octoparse or ParseHub, you’ll receive your data as a tidy Excel file that you can then analyze however you wish.
Popular analyses of social media data sets include sentiment analysis, the volume and frequency of certain words or symbols, and patterns in individual words combined with location data.
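To make the word-frequency idea concrete, here is a minimal Python sketch. It assumes your scraped posts are already loaded as a list of text strings; the function name `term_frequencies` and the sample posts are made up for illustration and are not taken from any scraping tool.

```python
import re
from collections import Counter


def term_frequencies(posts, terms):
    """Count how often each tracked term appears across a list of post texts."""
    tracked = {t.lower() for t in terms}
    counts = Counter()
    for text in posts:
        # Keep a leading # or @ so hashtags and mentions count separately from plain words.
        for token in re.findall(r"[#@]?\w+", text.lower()):
            if token in tracked:
                counts[token] += 1
    return counts


# Hypothetical sample data standing in for a scraped export.
sample_posts = [
    "Loving the #sunset tonight, best sunset ever",
    "Coffee first. #coffee #monday",
    "Another #sunset, another coffee",
]
freqs = term_frequencies(sample_posts, ["#sunset", "coffee", "#coffee"])
print(freqs.most_common())
```

Counting hashtags separately from plain words is a design choice here; if you would rather treat "#coffee" and "coffee" as the same term, strip the leading symbol before counting.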
Next, let’s dive into how you can use the data you scrape as inspiration for future content.
How to Use Social Media Data as a Source for Your Content
1. Peruse Instagram to come up with fun content ideas.
In our first example, we’ll look at a content campaign called #SexiestLocations on Instagram.
The execution of this project was fairly simple: our research team collected over 4 million posts on Instagram that contained the hashtag #sexy. They then analyzed the posts that included a geolocation tag. From this, they were able to glean the “sexiest” countries in the world, as well as U.S. states.
Safe to say, publishers ate it up.
While it’s impossible to truly learn what the sexiest place in the world is (sexy is a subjective term), our team produced a fun campaign for our client that used geo-bait to appeal to light-hearted online sites like Glamour, E! Online, Women’s Health, and Elite Daily.
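The geotag tally behind a ranking like this can be sketched in a few lines of Python. This is only an illustration, not the team's actual pipeline: it assumes each scraped post is a dictionary with hypothetical `hashtags` and `location` fields, and the sample posts are invented.

```python
from collections import Counter


def rank_locations(posts, hashtag, top_n=3):
    """Rank locations by the number of geotagged posts that carry the hashtag."""
    tag = hashtag.lower()
    tally = Counter()
    for post in posts:
        location = post.get("location")
        if not location:
            continue  # skip posts without a geolocation tag
        hashtags = {h.lower() for h in post.get("hashtags", [])}
        if tag in hashtags:
            tally[location] += 1
    return tally.most_common(top_n)


# Hypothetical posts; the field names are assumptions for this sketch.
sample_posts = [
    {"hashtags": ["#sexy", "#beach"], "location": "Florida"},
    {"hashtags": ["#Sexy"], "location": "Florida"},
    {"hashtags": ["#sexy"], "location": "Nevada"},
    {"hashtags": ["#sexy"], "location": None},
    {"hashtags": ["#food"], "location": "Nevada"},
]
ranking = rank_locations(sample_posts, "#sexy")
print(ranking)
```

A raw tally like this favors large, populous places, so you may want to normalize the counts before calling any location a winner.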
2. Explore Twitter to learn more about a pressing topic.
In our second example, we’ll look at a campaign covering a much different topic: college drinking habits.
For this campaign, the team analyzed tweets rather than Instagram posts. The researchers looked at tweets posted within a 1.5-mile radius of the center of small, four-year colleges and universities that included the keywords “drunk,” “drinking,” “alcohol,” “booze,” “beer,” or “wine.”
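A radius-plus-keyword filter like the one described above might look like the following Python sketch. It is an assumption-laden illustration rather than the researchers' code: the tweet fields (`text`, `lat`, `lon`), the campus coordinates, and the sample tweets are all hypothetical, and distance is computed with the standard haversine formula.

```python
import math
import re

EARTH_RADIUS_MILES = 3958.8
KEYWORDS = ("drunk", "drinking", "alcohol", "booze", "beer", "wine")
# Whole-word match so that "wine" does not match inside a word like "swine".
KEYWORD_PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def matching_tweets(tweets, campus_lat, campus_lon, radius_miles=1.5):
    """Keep tweets inside the radius whose text mentions at least one keyword."""
    return [
        t for t in tweets
        if haversine_miles(campus_lat, campus_lon, t["lat"], t["lon"]) <= radius_miles
        and KEYWORD_PATTERN.search(t["text"])
    ]


# Hypothetical campus center and tweets for illustration only.
sample_tweets = [
    {"text": "Cheap beer night!", "lat": 40.01, "lon": -75.0},       # nearby, keyword
    {"text": "Studying all night", "lat": 40.005, "lon": -75.0},     # nearby, no keyword
    {"text": "Wine tasting downtown", "lat": 40.05, "lon": -75.0},   # keyword, too far
]
hits = matching_tweets(sample_tweets, campus_lat=40.0, campus_lon=-75.0)
print([t["text"] for t in hits])
```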
There are official rankings that come out every year that pit universities in America against each other for the title of top “party school” in the nation. This project speaks to this notion in a new way, by looking to data from Twitter to back up those claims.
This campaign speaks to the ongoing conversation about the problem and prevalence of dangerous levels of college drinking in America. Again, using geo-bait and highly targeted digital PR outreach, this campaign was able to earn coverage at the Huffington Post, Adweek, Elite Daily, and BroBible.
3. Don’t overlook niche social platforms, like Yelp.
You don’t just have to stick to Twitter, Facebook, and Instagram when creating content from social media data. There is a world of niche community platforms that can give you so much unique, interesting information that you can’t find anywhere else.
For instance, this content example uses popular restaurant review platform Yelp to glean insights about Americans’ dining preferences. What are the most popular cuisines in different cities across America? Using Yelp’s Fusion API, this study analyzed more than 120,000 restaurants in the U.S. with their ratings, pricing, and restaurant categories.
Yelp turned out to be a treasure trove of solid data. This visualization shows viewers the most unique restaurants for each city, the number of restaurants, and more. From this, you can see that Boston has more bagel restaurants than other cities, per capita.
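A per-capita comparison like the bagel example can be sketched in Python as follows. The record layout (`city`, `categories`), the city names, and the population figures are placeholders invented for this sketch; they are not Yelp Fusion API output or real census numbers.

```python
from collections import Counter


def per_capita_counts(restaurants, populations, per=100_000):
    """Return restaurants per `per` residents for each (city, category) pair."""
    counts = Counter()
    for restaurant in restaurants:
        for category in restaurant["categories"]:
            counts[(restaurant["city"], category)] += 1
    return {
        (city, category): n / populations[city] * per
        for (city, category), n in counts.items()
    }


# Placeholder data: two bagel shops in a small city, three in a larger one.
sample_restaurants = [
    {"city": "City A", "categories": ["bagels"]},
    {"city": "City A", "categories": ["bagels", "cafes"]},
    {"city": "City B", "categories": ["bagels"]},
    {"city": "City B", "categories": ["bagels"]},
    {"city": "City B", "categories": ["bagels", "delis"]},
]
sample_populations = {"City A": 500_000, "City B": 1_000_000}

rates = per_capita_counts(sample_restaurants, sample_populations)
print(rates[("City A", "bagels")], rates[("City B", "bagels")])
```

In this made-up data, City B has more bagel shops in absolute terms, but City A has more per resident, which is the kind of distinction a per-capita ranking is meant to surface.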
This project saw a lot of success very quickly — once the exclusive went live on the Temple University section of ULOOP, it quickly syndicated to other U.S. university sites and earned over 100 pieces of unique media coverage.
4. Analyze tweets for advanced textual insights.
There are times when using a Twitter scrape just isn’t enough, and you need external analysis. In one of the coolest uses of Twitter data I’ve seen, a campaign called “Most Powerful Women” does just that.
IBM Watson Personality Insights is a free online tool by IBM that allows you to analyze text for prevalence of character traits. Typically you might use this tool to analyze speeches that people have given, or articles they’ve written. In the absence of those, you can use a person’s Twitter timeline as a sample of their writing.
The study sought to find out what the similarities and differences are between some of the top 100 “most powerful” women in the world. From Oprah to Queen Elizabeth, the takeaways gleaned from this study are numerous.
The exclusive for this project went to Bravo, a site that already covers powerful women for its audience.
For those of us who want to be powerful women too, we can learn which traits will take us all the way to the top, based on the most common traits shared by these famous and powerful women.
5. Conduct a survey for more actionable insights.
In my final example, we’ll look at #AdAnalysis, a campaign that combines an Instagram scrape methodology with a survey of 1,000 Americans to derive fascinating insights on the topic of influencer marketing on Instagram.
The campaign researchers sought to answer a few questions: What types of photos are popular for advertisements, and which demographics respond to promoted posts the most positively?
The first question was answered with a data scrape, and the second was answered with a survey.
Combining the two methods of research allowed the campaign to offer more well-rounded and actionable insights to journalists and news publishers.
Best Practices for Scraping Social Media for Content Marketing Campaigns
After producing over a hundred social media scrape campaigns over the past seven years, we’ve learned first-hand what types of social media campaigns excel during digital PR outreach.
Twitter, Instagram, Facebook, LinkedIn, Reddit, and Yelp all have their unique benefits and can offer valuable insight into topics across the spectrum.
In this post, I walked through five campaigns that used social scraping as a methodology. Together, the five campaigns covered diverse subject matter: college education, sex and relationships, food, leadership, and advertising. This methodology can clearly be used across all verticals, for nearly any brand in any niche.
Here are some tips to keep in mind when producing content with a social media scrape.
- Hashtags are typically subjective, so keep projects lighthearted in nature in order to earn major coverage.
- Stay away from using social scrape methodologies to talk about things that are scientific or close to health topics — people looking for health advice should get information from licensed professionals.
- Make sure that no matter the topic, whatever you produce contributes to an ongoing conversation.
- Exercise caution when combining newsjacking and the scrape methodology, because trending news topics can become old very quickly if you don’t earn coverage immediately.
Originally published Jul 8, 2020 7:00:00 AM, updated July 08 2020