Data scraping for social media analysis and data collection

Data scraping with Data Miner as a browser extension: Analyze social media channels for post reactions over time

Data Miner is a browser extension that can scrape data from website interfaces. For trend analyzes in the social media area, likes, comments and texts can be extracted from posts in order to display the data in a timeline.

Data scraping is a simple but very effective approach to analyzing social media channels. Data from the surface of a website is collected with the scraper and then output in an Excel or CSV file. In the case of social media posts, this would be, for example, likes, comments, post text and the date of the post. This allows us to examine the performance of companies’ and competitors’ social media channels in a simple and relaxed manner and find out which accounts use particularly successful strategies and record the most interactions on the posts.

There is a big difference to the conventional social media analyzes that are available from the various software in this area: they do not work with scraping, but are API-based. In other words, they connect to the programming interface of the respective platform and call up the data for the examined social media account directly from there. This usually has two disadvantages: First, the platforms (Facebook, Instagram, Twitter) limit the query period for posts and accounts relatively severely, so that, depending on the price model, you can see a performance retrospectively 7 days or 28 days. And secondly, depending on the depth of the data, the tools cost more and more because API calls are subject to fees that the social media tool pays to the platforms. If you book a good package for a fee, you can perhaps analyze a competitor’s account for one year retrospectively for performance – an even longer period would cost even more money or not exist at all.

In-depth analysis and full independence via data scraping with data miners

Data scraping approaches for analyzing posts over time are much cheaper and in most cases free because we can operate and control them ourselves. In addition, trends can be measured in unlimited retrospective periods because data scrapers can simply go through each post to the very first post sent. In other words – they are not limited by a paid platform API, which in turn gives the user a high degree of independence.

With the browser extension Data Miner for Google Chrome, this is already possible in the free package. As of 2022, Data Miner is characterized by the fact that the application does not run via an installed program, but as an extension directly via the browser. At the same time, operation is much easier than in the scraping applications of recent years. However, the range of functions in the free package is also lower than, for example, with the web scraping tool Octoparse, which offers the possibility for complex and advanced scraping functions – such as clicking on elements, filling out contact forms, entering data and the automated execution of commands on different days. However, data miner is perfectly adequate for analyzing post reactions on LinkedIn, and it is very practical that beginners in particular are not overwhelmed with enormous complexity.

Data scraping applications always require our configuration from our side. In other words: We have to give the scraper a roadmap for website operation so that it can navigate through the website elements. The scraping works in real time and the application extracts the data post by post.

Typically, scrapers need three key inputs:

  • Rows are the data content of the later Excel table and result from the fields of the website elements that we set for scraping
  • Columns are the concrete data whose collection we set in the scraper and which should appear sorted in the columns of the later Excel spreadsheet
  • Navigation is important when there is a need to perform indefinite scrolling motions or switch pages (pagination) to scrape the full range of social media posts. A loop function may be required so that the scraper runs again and again when it comes into contact with a navigation element until it finds no more navigation elements.

Step 1 - Install and launch Data Miner

Once you have installed Data Miner as a free browser extension in Google Chrome, you can go to a desired social media account and call up the extension. Then click on the blue Scrape This Page button.

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut elit tellus, luctus nec ullamcorper mattis, pulvinar dapibus leo.

Click on New Recipe in the left menu bar and then on the blue button in the main window + New Recipe.

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut elit tellus, luctus nec ullamcorper mattis, pulvinar dapibus leo.

Step 2 - Multiple Rows by Data Miner

In the window that opens, go to the second tab at the top with the title Type. You don’t have to do anything else here, the default selection for “multiple rows” is already the right one for social media accounts.

Ad-Netzwerk im Lead Syncing von HubSpot verknüpfen.

Step 3 - Page & Easy Row Finder by Data Miner

Now go to the third tab at the top titled Page and click on the Easy Row Finder button. This is now one of three crucial steps to give the scraper the timetable for operation.

Neue Liste in HubSpot erstellen, um die Leads aus Leadformularen vorzusortieren für den späteren Mailversand oder Anschluss an Mailchimp.

Now select an entire post with the mouse. Meanwhile, the selection of the surface is shown in red. Then press the number 1 to confirm the selection.

Neue Liste in HubSpot erstellen, um die Leads aus Leadformularen vorzusortieren für den späteren Mailversand oder Anschluss an Mailchimp.

Now select another element on a second post with the mouse and press the number 2 to confirm the selection. This is to ensure that all posts are included in the selection for the rows.

Kontakt-basierte, aktive List erstellen: Diese sorgt dafür, dass in Echtzeit automatisch neue Leads aus dem Lead Syncing eingespielt werden.

Step 4 - Columns & Easy Column Finder by Data Miner

Now we need to set Columns. These save the specific data to be extracted from the posts in the later Excel or CSV table. In this case, this would be: The number of post reactions and the dates of each post to understand performance over time. To do this, click on the 5 Cols tab at the top and create two columns.

Wenn ihr euren Filter finalisiert habt, klickt oben Rechts auf Liste speichern.

Now click the Easy Column Finder button, and click hover over the item in the LinkedIn post that contains the data for each of the columns and press the letter C.

1) Column 1: Post reactions (move the mouse pointer and type C)

2) Column 2: Date (move the mouse pointer and type C)

Erstellt als Ordnungskriterium einen Filter mit Formulareinsendungen. Wählt entweder euer jeweiliges Leadformular aus (muss zuvor snychronisiert sein, und das kann ein paar Stunden dauern, von Erstellung des Formulars bis zur Synchronisierung) oder mehrere Leadformulare oder Alle Formulareinsendungen, wenn sich alle eure Leadformulare sowieso um denselben Anmeldungsgegenstand drehen.

Step 5 - Set Navigation with Easy Nav Finder by Data Miner

Now all that’s left to do is find the navigation control to scrape more posts than just the ones currently visible on the loaded website. The navigation element is defined via the 6 Nav tab at the top of the Data Miner menu bar.

To do this, click on the Easy Nav Finder button. In the case of LinkedIn posts, you can briefly disconnect from the Internet for this purpose, scroll down until the “Show more results” button is visible, and then use the mouse pointer and the N key to set the navigation element as such.

Wählt unter App Event die Anwendung HubSpot aus, nachdem ihr im Hauptdashboard oben links auf Create Zap geklickt habt.

This is what it looks like when the navigation element has been found successfully.

Wählt euren HubSpot Account aus und autorisiert die Verknüpfung mit eurem HubSpot-Account.

Almost done: The scraping schedule can now be named, saved and started in the above tab 8 Save. Scroll down as deep as possible in the LinkedIn posts beforehand so that as many posts as possible are loaded.

Wählt die entsprechende HubSpot-Liste aus, die ihr zuvor erstellt habt.

Now click on Scrape and Append.

Wenn ihr möchtet oder es erforderlich sein sollte, könnt ihr weitere HubSpot-Properties in der Datenübertragung abrufen. Diese könnt ihr dann in den Benachrichtigungsmails als dynamische Felder einstellen, sodass die Mails, die an den Vertrieb gehen, zu jedem Lead diese Daten dynamisch einspeisen und übermitteln.

If everything went well, the data miner should have scraped the posts at least to the data depth that you loaded in the LinkedIn posts website.

Testet den Zapier-Trigger oder überspringt den Test, falls der Trigger – richtigerweise oder irrtümlicherweise – nicht funktioniert.

Step 6 - Arrange table entries

Especially with LinkedIn posts, entries for the same posts are included in the table several times. To clean this up, you should automatically remove the duplicates and automatically remove the blank lines accordingly. If you are using Google Sheets:

Remove Duplicates – go to the top menu under Data > Data Cleanup > Remove Duplicates

Remove empty rows – creates a filter in the header > selects the filter dropdown menu at Column 1 > unchecks (Empty)

Then sort the data along the timeline hierarchy and you’re done: we already have a breakdown of the post reactions of a LinkedIn account.

Wählt das Event Send Mail aus.

For a more stringent display, we can now create the totals for individual months so that the data can be displayed in charts in a clearer and more meaningful way.

Wählt euren Gmail-Account aus und autorisiert die Verknüpfung mit Zapier.

Data & Web
Data Analytics

Create reports with Data Exploration in Google Analytics 4

Many report views that were preconfigured in Google Analytics 3 must be manually created in Google Analytics 4.

Thank you page tracking with Google Analytics 4: Instructions for conversion tracking

The thank you page is the one of the oldest tracking methods and classic in conversion tracking for lead generation.

Track phone number redirects from Google Ads campaigns

With Google Tag Manager you can install phone number forwarding for Google Ads campaigns on your website.

SEO opportunity analysis in international marketing based on Keyword Difficulty

Where the SEO competitor density is very high in Germany, it can be much lower in other countries  – and otherwise.

Data scraping with Data Miner browser extension: Analyze social media channels

Data Miner is a browser extension that can scrape data from website interfaces.

UX & A/B-Tests
Google Optimize

Reduce bounces with website A/B testing via Google Optimize

What improvements need to be made to the website to reduce the bounce rate?

Increase conversion rates with website A/B testing via Google Optimize

If we want to find out which website version gives the best conversion rate, we can do that with an A/B test.

Improve scroll depth with website A/B testing via Google Optimize

An increased scroll depth ensures that more landing page users actually read the content.

Paid Media
LinkedIn Ads

LinkedIn Ads Cost: Why Lead Gen Forms are usually the best choice

LinkedIn Ads Lead Gen Forms are essential in lead generation as their conversion rate is on average five to ten times higher than landing pages.

B2B Marketing: Franchisee Acquisition with LinkedIn Ads

It’s always difficult to find potential franchisees online. The biggest hurdle is above all the production of a stable monthly mileage.

LinkedIn for B2B: How do Lead Gen Forms work?

Marketing with LinkedIn Ads is primarily ideal for generating warm leads for warm acquisition sales – less for direct inquiries.

Paid Media
Google Ads

The artificial budget increase in the target share of possible impressions - Google Ads

There is a way to get ads to the top search position: by artificially increasing the daily budget.

Google Ads Lead Forms with Performance Max

Lead forms are particularly successful in push marketing, not so much with users with search intent. This makes the lead forms very effective for the first time, but also causes a lot of wastage.

Acquisition for franchise companies: Find franchisees with Google Ads

How to find potential franchisees effectively – with warm acquisition leads and direct inquiries via Google Ads.

Paid Media
Facebook Ads

Set up Facebook Ads Lead Gen Forms and use them successfully

Lead forms have a much higher conversion rate than landing pages, so the cost per lead is much lower.

Facebook Ads for Event Promotion: Win commitments with Event Ads

A prospect or a promise can cost as little as €2 – €5 with good targeting. However, lead quality can vary greatly.

Das könnte dich auch interessieren:

Analyze Amazon keywords by country with Helium 10

If we want to examine the awareness and interest of an e-commerce product in international marketing, we can use Helium 10.

Youtube SEO with TubeBuddy and Ahrefs: Optimize descriptions, titles and tags

Keywords with search volume data are not only available for organic Google searches, but also for YouTube keywords.

Set up Lead Syncing for lead forms from Google Ads, LinkedIn Ads and Facebook Ads

This guide is useful for companies that don’t use CRM but still want to receive leads from lead forms via email.