The ultimate guide on unstructured data management. Improve your business by learning the best ways to process and analyze emails, social media, and more.
Over the past decade, strategic management of unstructured data has become essential. By definition, unstructured data doesn’t have any predefined structure. In simple terms, this is text-heavy information that can contain useful numbers, such as statistics, dates, and facts (thanks Wikipedia!).
Can you think of any examples? Social media feeds? Pictures? What about Google Maps? Even email? Yes, yes, yes, and yes. Unstructured data is in every corner of your everyday life, and many businesses aren’t taking advantage of its benefits.
As an introduction, let’s start with a pretty basic example: Google Maps. As most of us know, Google Maps is a web mapping service. It’s great for locating local marketing and sales targets. (It’s also great for finding your way if you get lost a lot, like me.)
So, say you’re a marketer working for a fictional DTC brand that sells mega-muscle, the next big thing in muscle building supplements (bear with me). When you search "gyms in nyc" for potential clients in New York City, you’ll probably end up with this:
As a DTC brand marketer, you'll need to collect some contact information, like the phone number, email, website URL, and address of the gym. To get any of this information, you’d probably click on the business of interest and find this page:
Easy, right? Well, if you want another 10+ businesses, you'll have to click on 10+ more listings and copy the information from each. This shows us a valuable lesson:
Pages with useful information aren't always structured in a way that makes it easy to collect large amounts of that information. This is where managing unstructured data becomes valuable.
For instance, instead of manually gathering data from Google Maps, with the right apps and code, you could simply type in ‘gyms in nyc’ and automatically generate a spreadsheet of the information you need.
Google Maps is only the tip of the iceberg. Some types of unstructured data are even harder to collect than that on Google Maps and can take even longer to manually input. But, with certain tools, you can create spreadsheets like this in seconds for a wide variety of platforms.
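To make the "listing text in, spreadsheet out" idea concrete, here is a minimal Python sketch. It does not talk to Google Maps at all. It assumes you already have the text of a few listings (the two sample listings, their phone numbers, and their URLs are invented for illustration) and shows how simple patterns can pull phone numbers and websites into CSV rows. The function names `extract_contact` and `listings_to_csv` are hypothetical helpers, not part of any product.

```python
import csv
import io
import re

# Hypothetical listing text, standing in for what you might copy from a business page.
RAW_LISTINGS = [
    "Iron Temple Gym - 12 Example St, New York, NY 10001 - (212) 555-0101 - https://irontemple.example",
    "Downtown Fitness Club, 98 Sample Ave, New York, NY 10002. Call 212-555-0199 or visit http://downtownfit.example",
]

# Simple patterns for US-style phone numbers and http(s) links.
# Real listings are messier; treat these as a starting point.
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")
URL_RE = re.compile(r"https?://[^\s,]+")


def extract_contact(text):
    """Pull the first phone number and URL out of one block of listing text."""
    phone = PHONE_RE.search(text)
    url = URL_RE.search(text)
    return {
        "phone": phone.group(0) if phone else "",
        "website": url.group(0) if url else "",
        "raw_text": text,
    }


def listings_to_csv(listings):
    """Return CSV text with one row per listing, ready to open as a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["phone", "website", "raw_text"])
    writer.writeheader()
    for text in listings:
        writer.writerow(extract_contact(text))
    return buffer.getvalue()


if __name__ == "__main__":
    print(listings_to_csv(RAW_LISTINGS))
```

The raw text is kept in its own column so that anything the patterns miss can still be checked by hand.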
Think about a mention on Twitter, or a statistic within the text of an article that you’re reading. This data exists in every aspect of the internet, and its rise has corresponded with the explosive growth of data creation in our daily lives. To put this into perspective:
Much of the change in data generation has occurred recently. Even a few years ago, far less data existed, and most of what did exist was structured. Today, large amounts of (mainly unstructured) data exist, and businesses can gain an edge by using it. Here are the things we’ll go over today:
What Kinds of Unstructured Internet Data Are Out There?
Email is a tool that’s used by (basically) everyone. However, email isn’t optimized to reflect the vast amount of unstructured data created every day through text-heavy communications. Most people know and use Gmail, but this section also covers email marketing sites like Klaviyo and MailChimp.
Gmail and Outlook are the most straightforward versions of email, and typically contain useful information. But, if you’re receiving hundreds of emails a day, it’s hard to filter and analyze which data is the most important. This is where data management strategies come in handy. Certain templates, senders, and times can be optimized, and emails containing important data can be isolated and quantified. For example, a salesperson can target certain outgoing email templates and automate tracking the response rate on these emails. They can also target certain keywords and measure that variable’s success.
On the flip side, incoming emails can also be tracked and sorted. For example, filtering out legitimate product inquiries from co-worker messages via the sender type can be incredibly time-saving. In short, certain structural factors and keywords are easily searchable and can help busy people save time on email sorting, without the need for expensive email optimization products.
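As a rough illustration of sorting incoming email by sender type and keywords, here is a small Python sketch. The company domain, the keyword list, and the `classify_email` function are all assumptions made up for this example, and the keyword check is a plain substring match, so it will occasionally misfire (for instance, "order" also matches "border").

```python
from email.utils import parseaddr

# Assumptions for illustration only: swap in your own domain and keywords.
INTERNAL_DOMAIN = "yourcompany.example"
INQUIRY_KEYWORDS = ("pricing", "quote", "demo", "order")


def classify_email(sender, subject, body):
    """Label a message as 'co-worker', 'product inquiry', or 'other'."""
    # parseaddr handles both "Name <user@host>" and bare "user@host" forms.
    _, address = parseaddr(sender)
    domain = address.rpartition("@")[2].lower()

    # Sender type comes first: anything from the internal domain is a co-worker.
    if domain == INTERNAL_DOMAIN:
        return "co-worker"

    # Otherwise, look for inquiry keywords anywhere in the subject or body.
    text = f"{subject} {body}".lower()
    if any(keyword in text for keyword in INQUIRY_KEYWORDS):
        return "product inquiry"
    return "other"


if __name__ == "__main__":
    print(classify_email("lee@customer.example", "Question", "Could you send pricing for 50 units?"))
```

Checking the sender before the keywords means an internal message that happens to mention pricing still lands in the co-worker pile.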
Social media networks
Social media networks were probably the first examples of unstructured data that you thought of. You (hopefully) know what social media networks are, but a few examples include Instagram, Facebook, LinkedIn, and Twitter. While there are currently analytical tools built into social media networks to monitor audience engagement, unstructured data allows for the more granular insights that are critical to generating actionable to-dos.
Take consumer review posts for example. These posts rarely allow marketers to easily interpret the data without extensive manual labor. But, with unstructured data management, you could aggregate Facebook comments that mention the name of your product in combination with certain positive or negative keywords.
If you’re a DTC brand marketer that sells boots, and are promoting the boots as "rugged," you could aggregate mentions of these boots that include keywords like "rugged," "tough," "durable," etc. Thus, you can optimize unstructured social media data to gauge product perception and alter product characteristics without the time-intensive process of reading through customer reviews and comments online.
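Here is one way the keyword aggregation described above might look in Python. It is a sketch under stated assumptions: the product name "TrailMaster boots", the sample comments, and the negative keyword list are invented, and the comments are assumed to be already collected as plain strings (how you obtain them from a social network is a separate problem).

```python
import re
from collections import Counter

# Hypothetical product name and keyword lists; adjust to your own campaign.
PRODUCT = "trailmaster boots"
POSITIVE = {"rugged", "tough", "durable"}
NEGATIVE = {"flimsy", "uncomfortable", "leaky"}

# Invented sample comments for illustration.
COMMENTS = [
    "My TrailMaster boots are so rugged, two winters and counting.",
    "TrailMaster boots felt flimsy after a month.",
    "Love my new sneakers, very durable!",
    "Tough and durable. TrailMaster boots for life.",
]


def tally_mentions(comments):
    """Count product mentions and how many include positive or negative keywords."""
    counts = Counter()
    for comment in comments:
        text = comment.lower()
        if PRODUCT not in text:
            continue  # skip comments that never mention the product
        words = set(re.findall(r"[a-z']+", text))
        counts["mentions"] += 1
        if words & POSITIVE:
            counts["positive"] += 1
        if words & NEGATIVE:
            counts["negative"] += 1
    return counts


if __name__ == "__main__":
    print(dict(tally_mentions(COMMENTS)))
```

Keyword matching like this is blunt (it cannot tell "not rugged" from "rugged"), so treat the counts as a first pass rather than a verdict.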
Even before a launch, marketers can use social media data to gauge customer engagement with different features and create a marketing strategy that is tailored to the product. For example, you could aggregate Tweets that include direct mentions of your brand, and filter for certain keywords to determine which product features consumers want the most.
Further, marketers can use Instagram to filter for posts that either tag or hashtag your brand. The information contained within the post and the accompanying Instagram bio, such as handles, emails, and other listed personal details, can be dropped into a spreadsheet. The information could then be analyzed to identify consumers who already frequently use your product, making them natural brand ambassadors.
Beyond marketers and product managers, there are also other important use cases. For instance, recruiters can use unstructured LinkedIn data to determine which candidates have the best chances of being long-term fits. A recruiter could screen for candidates who have held no more than a certain number of positions within a certain period to immediately rule out candidates with a history of job-hopping. While LinkedIn has similar tools under its recruiter subscription, good unstructured data management doesn’t cost hundreds of dollars per year.
Another use case is an investment professional who looks at social media trends to source fast-growing investments and accurately gauge the product-market fit of their portfolio companies. This person could use trends in the brand's direct mentions on Twitter, Facebook, and Instagram to monitor the trajectory of the company. In summary, granular, consumer-level data can be much more useful than surface-level financial metrics.
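The recruiter screen above boils down to counting positions inside a time window. A minimal Python sketch, assuming each candidate's job start dates have already been gathered into a list (the candidate names, dates, and thresholds below are invented, and both function names are hypothetical):

```python
from datetime import date

# Invented data: each candidate maps to the start dates of their positions.
CANDIDATES = {
    "candidate_a": [date(2015, 3, 1), date(2019, 6, 1)],
    "candidate_b": [date(2020, 1, 1), date(2021, 2, 1), date(2022, 3, 1), date(2023, 4, 1)],
}


def positions_started_since(start_dates, cutoff):
    """Count the positions that began on or after the cutoff date."""
    return sum(1 for started in start_dates if started >= cutoff)


def passes_screen(start_dates, cutoff, max_positions):
    """True when the candidate started no more than max_positions roles since the cutoff."""
    return positions_started_since(start_dates, cutoff) <= max_positions


if __name__ == "__main__":
    cutoff = date(2019, 1, 1)
    for name, dates in CANDIDATES.items():
        print(name, passes_screen(dates, cutoff, max_positions=2))
```

A count like this is only a coarse filter; contract roles or internal promotions would need extra handling before anyone is ruled out.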
Business intelligence sites
Business intelligence sites are the next example that we’ll go through. When you think of a business intelligence site, think of Crunchbase, Owler, and Pitchbook. These companies store structured data but also release articles and reports that contain unstructured, yet highly useful, sources of information. Further, their structured data is often only accessible with a costly subscription.
However, unstructured data from these sources can be used in a variety of ways. For example, if you’re a business development manager doing market research, Crunchbase and Pitchbook can give you a top-level view of the industry, along with major industry participants. If you operate in this industry, it can give you intel on competitors that may be more detailed than other sources. If you're an investor, these sites can be sources for new investments and can help find detailed metrics on private companies that would otherwise be unavailable.
Many times, helpful information on these companies is also available in the text of company descriptions. An example of a company description is below:
The data is easily readable, but it’s hard to gather into a spreadsheet, especially across multiple companies. As in the original Google Maps example, the right platform or tools can export the needed data into a spreadsheet without a costly subscription.
The example at the beginning of this article touched on Google Maps, but even more of the unstructured data available there can be used to optimize and monitor a wide variety of processes and critical KPIs. Insights from this data are valuable in tailoring advertising strategies to specific areas that are most likely to use your product. This is really useful for local business owners who are looking to use traditional advertising tools, such as social media and search engine advertising, but would like a more detailed secondhand method of evaluating the success of their strategy.
On top of marketing plan evaluation, Google Maps is also great for targeting leads. For instance, a list of new business leads can be automatically generated in a spreadsheet from specific search queries on Maps. For example, a business owner could search “restaurants in New York City” and receive a list of relevant businesses with addresses, descriptions, ratings, etc. This drastically reduces the time it takes to collect the information manually from Maps.
E-commerce is the final area that we’ll touch upon. Unstructured e-commerce data can be incredibly useful in areas from product development to competitive research. A few common e-commerce sites are Amazon and Shopify, with many more platforms becoming increasingly prevalent.
More specifically, Shopify can be used for product development by filtering product listings by characteristic, then transferring the search results into a usable database. Next, a product manager could look for product categories that have low customer satisfaction ratings, and build a product that fulfills those unmet needs.
On a competitive level, team members can use this data to optimize pricing and product characteristics, without the need to manually research the market. For example, on Amazon, a product manager could easily filter products by category and cost into a usable database and compile summary statistics on the competitive products listed in the spreadsheet.
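Once product listings are sitting in a usable database, the summary statistics mentioned above are straightforward to compute. The Python sketch below assumes the listings have already been exported as simple records. The product names, prices, and ratings are invented, and `summarize_by_category` is a hypothetical helper, not a feature of Amazon or Shopify.

```python
from collections import defaultdict
from statistics import mean, median

# Invented product records, standing in for exported listing data.
PRODUCTS = [
    {"name": "Protein Powder A", "category": "supplements", "price": 29.99, "rating": 4.4},
    {"name": "Protein Powder B", "category": "supplements", "price": 39.99, "rating": 3.1},
    {"name": "Shaker Bottle", "category": "accessories", "price": 9.99, "rating": 4.7},
]


def summarize_by_category(products):
    """Group products by category and compute count, price, and rating statistics."""
    grouped = defaultdict(list)
    for product in products:
        grouped[product["category"]].append(product)

    summary = {}
    for category, items in grouped.items():
        prices = [item["price"] for item in items]
        ratings = [item["rating"] for item in items]
        summary[category] = {
            "count": len(items),
            "mean_price": round(mean(prices), 2),
            "median_price": round(median(prices), 2),
            "mean_rating": round(mean(ratings), 2),
        }
    return summary


if __name__ == "__main__":
    for category, stats in summarize_by_category(PRODUCTS).items():
        print(category, stats)
```

Sorting the result by `mean_rating` would surface the low-satisfaction categories that a product manager might want to target.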
Who is This Useful For?
We’ve talked about what sort of unstructured data is out there, and the next topic on our list is who can benefit most from managing this data. Generally, using unstructured internet data is perfect for any organization that wants an efficient and effective way to spend their marketing budget. In today’s world, it isn’t enough to just advertise through Facebook or Google, or generate inbound leads off of time-consuming manual searches. Regardless of the role, anyone can gain a competitive advantage by optimizing the way they gather the data. Here are some key examples of who should consider using unstructured data:
Salespeople are the first people that we think of. Generating leads is crucial to sales success, and using unstructured data can take lead generation to another level. For example, a salesperson for niche accounting software could isolate social media posts that mention needs for "accounting" and "software" and populate a list of associated usernames. Used in conjunction with Hunter.io, GitHub, and other platforms, a unified process for managing unstructured data can jump-start lead generation. Overall, unstructured data can make leads more qualified due to its more detailed nature, which means salespeople have a better chance of conversion. Further, gathering leads can be done faster, leaving salespeople doing more of what they do best: selling.
Marketers are next on the list. The internet stores a vast amount of information that marketers can use to develop better marketing campaigns and monitor existing ones. Facebook, Twitter, and Instagram insights can be great for this specific use. You can isolate mentions of keywords that are especially important to the campaign. For instance, the number of posts that contain the words "easy" and "fast" can reflect the success of a campaign focused on convenience. Marketers can also manage data from these sites to find the next brand ambassador, filtering for people who frequently use and/or mention the brand organically.
With unstructured data, product managers can gain a deeper understanding of their company's products at a level inaccessible with traditional data. As mentioned earlier, Amazon and Shopify can be great places for a product manager to scope out competitive products, and optimize current and future products to fit customer needs. Facebook, Twitter, and Instagram can also be great places to generate insights into customer satisfaction and future customer needs.
Business Owners & Managers
Business owners & managers can benefit from unstructured data because they typically cover all of the job functions in the prior three sections. Though a small business owner wouldn’t focus heavily on any of these three aspects, unstructured data management can be especially useful because it optimizes for time, which many owners lack.
A great example is Google Maps. Business owners can use this tool to produce sales, marketing, and product insights across their business. For example, the owner of an up-and-coming bagel shop can use a service that monitors foot traffic, such as Blix, to quantify the best times to use in-person marketing promotions. This same business can use Google Maps to target nearby businesses that might purchase catering services from the bagel shop. Finally, if the owner is considering expansion, they can use Google Maps to generate competitor location data and quality ratings to determine locations that need a new bagel shop.
In today’s world, it isn’t enough to just advertise through Facebook or Google, or generate inbound leads off of time-consuming manual searches. Regardless of the role, anyone can gain their company a competitive advantage by optimizing the way they gather the data.
Where Are the Tools to Use This?
Now that we’ve explored the types of unstructured data, as well as some of the people who can benefit from a good strategy to deal with it, it’s time to talk about how to build this strategy. The first thing to do is to take a look at a few other introductory sources:
If you’re looking for solutions to internal unstructured data management, IBM is a good resource.
Finally, Igneous released a great report on the state of unstructured data in 2018.
The next step is to either set up your own APIs, or utilize another service to automate the process for you. We have a convenient guide (link) for why APIs are so important and how to set them up. Otherwise, you may be able to hire a freelance developer to do it for you. The service you choose can primarily be segmented into two different options: standalone APIs or integrated solutions.
Standalone APIs are essentially what they sound like. Choose any website that you think has valuable data sets, and implement an API to pull specific unstructured data into a spreadsheet. This method is less time-consuming to set up than an integrated solution and works well for businesses that really only need a few sources of data, and don’t plan on using other tools, like CRMs or payments and email platforms, alongside it. The main downside to this strategy is that, if you’d like data from multiple different sites or need to integrate the data with other platforms, creating standalone APIs for multiple platforms can become time-consuming, and transferring the data to your other platforms can become even more time-consuming. You can typically configure these yourself, or contract a developer to do it for you. Eventually, the data would flow into Google Sheets, which we have a great general cheat sheet for here (link to google sheets article).
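To give a feel for the last step of a standalone setup, here is a small Python sketch that turns an API-style JSON response into CSV text, a format Google Sheets can import. Everything about the payload is an assumption: the `results` and `contact` field names and the sample businesses are invented for illustration and do not describe any real API's response format.

```python
import csv
import io
import json

# Hypothetical JSON payload; a real API will use its own field names.
SAMPLE_RESPONSE = """
{"results": [
  {"name": "Iron Temple Gym",
   "contact": {"phone": "(212) 555-0101", "website": "https://irontemple.example"},
   "rating": 4.6},
  {"name": "Downtown Fitness Club",
   "contact": {"phone": "212-555-0199"},
   "rating": 4.1}
]}
"""

FIELDS = ["name", "phone", "website", "rating"]


def flatten(record):
    """Turn one nested record into a flat row, leaving missing fields blank."""
    contact = record.get("contact", {})
    return {
        "name": record.get("name", ""),
        "phone": contact.get("phone", ""),
        "website": contact.get("website", ""),
        "rating": record.get("rating", ""),
    }


def response_to_csv(raw_json):
    """Parse the JSON text and return CSV text with one row per result."""
    records = json.loads(raw_json)["results"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(flatten(record) for record in records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(response_to_csv(SAMPLE_RESPONSE))
```

Each new data source needs its own `flatten` step, which is a small-scale picture of why standalone setups get time-consuming as the number of sites grows.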
Platforms are standalone APIs that are integrated into a broader platform surrounding the unstructured data you’re using. For example, the data from your Google Maps API would flow automatically into your Salesforce or Hubspot CRM. Any contacts you pull from Google Maps could also potentially flow into Gmail, MailChimp, and Klaviyo.
Another example is finding a target company using unstructured data from Crunchbase or Pitchbook, using an integration with Hunter.io to generate email addresses for company employees, and then using unstructured data from social media sites like Facebook, Twitter, and Instagram to fill in personal information, and make the cold email less cold.
These are just two of many examples; the ideal platform will not only integrate with the sites mentioned above, but also have functionality for your payments platforms (Stripe and Plaid), helpdesk software (Zendesk), e-commerce sales channels (Amazon, Shopify), back-office software (QuickBooks), and many more.
If you’re excited about the potential benefit a platform like this could have on your business, then we’re in the same boat! At this point, we’d be remiss not to mention that Lido offers a platform similar to the one above, including dashboards that track the flow of data and information through the platform. Check it out here!