Author: Jack Telford
Screaming Frog’s custom search and extraction features help you find specific elements, and what’s inside them, in the source code of web pages. Using them in creative ways can quickly generate a lot of information from your site and those of competitors, feeding into both quick wins and longer-term strategy. Here I’ll go through three very simple ways I’ve found to make use of them.
At a Glance
|What you do|Use custom search and extraction to find elements on pages|
|Benefits|Inform strategy and understand that of competitors|
|Time|Max 2 hours per task|
What is Screaming Frog Custom Search?
Custom search is extremely simple to understand. Enter the text or code you want to find in the HTML of your pages and Screaming Frog will flag every page that contains it, along with the number of instances on each page. You can carry out a number of different custom searches in one crawl, with each search appearing in its own column of your Screaming Frog export.
What is Screaming Frog Custom Extraction?
Custom extraction is a little more complicated to understand, but once you grasp the basics it’s very easy to make use of. Like custom searches, custom extractions look through the HTML of a page for specific elements. However, rather than looking for a specified piece of text or code, a custom extraction returns whatever is inside the elements you select, using CSSPath, XPath or regex to define the selection. For example, it could return what’s in H2 tags, or in a paragraph element on a page. You can extract either the inner HTML of an element or just its text. There are also additional functionalities, which you won’t need to know for this post but can read about here.
Use Case 1: Find Instances of Specific Text on Your Site With Custom Search
Maybe you ran an offer last year which has expired. Maybe your phone number has changed since you built the site, or maybe you want to change the way you refer to your brand. Custom search is perfect for identifying pages containing specific text strings, so you know where you need to make changes. Just select “Custom” > “Search” under “Configuration”, then add the text you’re after. Once you crawl the URLs, the number of instances found on each page will show up in the “custom search” tab.
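To make the idea concrete, here is a minimal Python sketch of what a custom search reports: a count of how often a string appears in each page’s source. It is not how Screaming Frog works internally. The URLs, the page HTML and the phone number are made up for illustration, and case-insensitive matching is an assumption of the sketch.

```python
import re

def count_matches(html: str, needle: str) -> int:
    """Count case-insensitive occurrences of a literal string in page source."""
    return len(re.findall(re.escape(needle), html, flags=re.IGNORECASE))

# Hypothetical pages: URL -> raw HTML source (stand-ins for crawled pages)
pages = {
    "https://example.com/": "<p>Call us on 020 7946 0000</p><footer>020 7946 0000</footer>",
    "https://example.com/about": "<p>Our story</p>",
}

old_phone = "020 7946 0000"  # the outdated text we want to track down

# One count per URL, similar to one column of a custom search export
report = {url: count_matches(html, old_phone) for url, html in pages.items()}

# Pages that still need editing
flagged = [url for url, n in report.items() if n > 0]
```

Here `flagged` holds only the pages that still contain the old number, which is the list you would work through when making changes.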
Two other quick uses for this:
- Link Disavowal: Review pages which link to your site by searching their source code for illicit words or those associated with spam. This way you avoid clicking any dodgy links.
- Competitor Analysis: Find articles created by competitors about a specific topic by crawling their site for specific words (see Coronavirus example here).
Use Case 2: Find all Text on a Site using Custom Extraction for Communication Insights
Wouldn’t it be useful to see the words which are most used on a new client’s site, or those of our competitors, in order to understand at a broad level how they like to talk about things? Custom extractions are perfect for this. Here’s how to do it:
Before running a crawl, set the crawler to pull all of the text in paragraph tags (essentially all on-page body content) by going to “Configuration” > “Custom” > “Extraction” in Screaming Frog, then adding “//p” as an XPath and setting it to “Extract Text”. You can also add others for the likes of H1 tags, H2s or titles, but I’ll keep it simple here and stick to body content.
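If you want a feel for what “//p” with “Extract Text” returns, the rough Python sketch below pulls the text out of every paragraph tag in a snippet of HTML. It uses the standard library’s `HTMLParser` rather than a real XPath engine, so treat it as an approximation. The class name, function name and sample HTML are all made up for illustration.

```python
from html.parser import HTMLParser

class ParagraphText(HTMLParser):
    """Collect the text content of each <p> element, ignoring inner tags."""

    def __init__(self):
        super().__init__()
        self.in_p = False
        self.paragraphs = []   # one string per paragraph found
        self._buffer = []      # text pieces of the paragraph being read

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self.in_p:
            # Join the pieces and collapse runs of whitespace
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self.in_p = False

    def handle_data(self, data):
        if self.in_p:
            self._buffer.append(data)

def extract_paragraphs(html: str) -> list:
    parser = ParagraphText()
    parser.feed(html)
    parser.close()
    return parser.paragraphs

# Hypothetical page source
sample = "<h1>Title</h1><p>First <a href='#'>linked</a> paragraph.</p><div><p>Second one.</p></div>"
paragraphs = extract_paragraphs(sample)
```

Note that the heading text is left out and the link text inside the first paragraph is kept, which matches the “body content only” aim of the extraction.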
Crawl the site you want to review. If it’s worked properly, you will see the page text in the “custom extraction” tab of the Screaming Frog dashboard, spread between the “extractor” columns. Export the report to Excel, then copy all of the text pulled in the “extractor” columns. What you’ve done is copy all paragraph text on all pages at once.
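If you would rather not copy and paste by hand, the same step can be sketched in Python against a CSV version of the export. The column headers below (“Address”, “Extractor 1 1” and so on) are assumptions for illustration, so check them against your own file. The data is a stand-in, not a real crawl.

```python
import csv
import io

# Stand-in for an exported crawl file; header names and rows are assumed
export = io.StringIO(
    "Address,Extractor 1 1,Extractor 1 2\n"
    "https://example.com/,First paragraph.,Second paragraph.\n"
    "https://example.com/about,About us text.,\n"
)

def collect_extracted_text(handle, prefix="Extractor"):
    """Join every non-empty cell from columns whose header starts with prefix."""
    chunks = []
    for row in csv.DictReader(handle):
        for column, value in row.items():
            # Skip unnamed overflow columns and empty cells
            if column and column.startswith(prefix) and value:
                chunks.append(value)
    return "\n".join(chunks)

all_text = collect_extracted_text(export)
```

With a real export you would pass an opened file instead of the in-memory `export` object.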
Now paste the text into a word cloud generation tool, which will show the frequency of different words on the site. I ran a quick one for a few Vice pages below to give you an idea of what you can expect to find. We could infer from this that the language is casual, people-centric and direct. You can also see there is a lot of talk about coronavirus, as well as drugs, relationships and politics.
Sometimes unnecessary words show up through this process: those which all sites use a lot but which don’t teach us much (words like “also”, “are” and “and”). If they do, go back to the extracted text and use find and replace to swap these words for nothing. In the online word cloud tool I use, you can also go to the “word list” tab and view the number of instances of each word for ideas.
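The counting and filtering a word cloud tool does can be approximated in a few lines of Python. This is a minimal sketch, not the tool I use: the stop word list is a short illustrative sample and the text is made up.

```python
import re
from collections import Counter

# A small, illustrative stop word list; extend it to suit your own text
STOP_WORDS = {"also", "are", "and", "the", "a", "of", "to", "in", "is"}

def word_frequencies(text, stop_words=STOP_WORDS):
    """Count lowercase words, skipping anything in the stop word list."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(w for w in words if w not in stop_words)

# Hypothetical extracted paragraph text
text = "The people are direct and the people are casual. Also, people talk."

freq = word_frequencies(text)
top = freq.most_common(1)  # the single most frequent remaining word
```

`freq.most_common(20)` would give the equivalent of the top of a “word list” tab for a larger body of text.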
Use Case 3: Find JSON Schema Across Multiple Pages
Sometimes it’s useful to understand which types of structured data your top competitors are using, in order to see what’s likely to be relevant for you to implement. It’s again a matter of pulling a custom extraction here. Like last time, go to “Configuration” > “Custom” > “Extraction” in Screaming Frog, then add “["']@type["']: *["'](.*?)["']” as an extractor, with the type set to Regex. Make sure the quotes inside the pattern are plain straight quotes, as curly ones won’t match anything in the source code.
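To see what that pattern picks up, here it is written with straight quotes and run in Python against a made-up snippet of JSON-LD. The markup is an assumption for illustration, not taken from any real site.

```python
import re

# The extractor pattern: the capture group grabs the value of each "@type" key
TYPE_PATTERN = re.compile(r"""["']@type["']: *["'](.*?)["']""")

# Hypothetical page source containing a JSON-LD block
html = '''
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "NewsArticle",
 "author": {"@type": "Person", "name": "Example Author"}}
</script>
'''

schema_types = TYPE_PATTERN.findall(html)
```

Nested types such as the author’s “Person” are returned as well as the top-level type. One limitation worth knowing: the pattern allows spaces after the colon but not before it, so markup written as `"@type" : "Article"` would be missed.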
In the “Custom Extraction” tab you can see your results in “extractor” columns. I did this again for a few pages on the Vice site. If I operated a publisher site, I’d look to implement these schema types too.
I hope you found this an interesting read. Feel free to drop a comment below if you have any questions or feedback. Of course, I’ve only scratched the surface here of what you can do with Screaming Frog custom features, but I hope that there was something to pique your interest. I also recommend taking a look at this full guide to custom extraction if you want to go a little deeper.
Behind the Site
I’m Jack Telford, an Owned Strategy Director at Publicis Media. I’ve been in the SEO industry for the last 6 years and love the collaborative nature of the space. This site is my little contribution to the community.