
AI Categorization With Screaming Frog & ChatGPT

In an ideal situation, your website CMS or inventory management system would have a strong taxonomy that categorizes thousands of pages accurately and efficiently. That is not always the case, however, so what can you do about it as an in-house SEO or consultant? Give Screaming Frog’s custom JavaScript feature a try and pair it with an LLM of your choice, such as ChatGPT. I’ll share a method I came up with as an example (at least the parts I’m able to share publicly).

What is Screaming Frog SEO Spider?

The Screaming Frog SEO Spider is a desktop SEO crawler used by many SEOs. You can use it to crawl for standard SEO issues such as meta title length, response codes, broken links and missing image alt text. You can also use it to extract specific lines of code or content from the URLs crawled. One interesting feature, released in May 2024, is the ability to execute custom JavaScript during a crawl, which makes it possible to call the ChatGPT API for each page as it is crawled.

Categorizing Pages at Scale With ChatGPT API

An issue I had was being stuck with a legacy system where products were categorized manually, either by users or by the in-house team. There were simply far more products being uploaded than could realistically be categorized. As a result, the make / model feature was also rarely used because of the effort required. So I started experimenting on my own with this new Screaming Frog functionality to see whether I could get AI to categorize the PDPs (product detail pages) at scale, including adding make / model.

The first thing I did was go to ChatGPT and experiment with some manual prompting, using up to 20 URLs and the site taxonomy, and asking it to add a make / model where relevant. It took some trial and error to figure out which series of prompts would give me the output I wanted. There’s a limit to how much data ChatGPT will crunch for you manually, so I knew that Screaming Frog was likely the route I had to take in order to scale this. Once I was happy with the results, I went about creating the custom JavaScript snippet in Screaming Frog.

First, I had to set up the custom JavaScript snippet for ChatGPT. You can read how to do this in Screaming Frog’s tutorial, How to Crawl with ChatGPT. Once I had registered for and funded my OpenAI API account, I entered this prompt into the JavaScript editor:
Categorize this page within one of the following categories: (insert 1,500+ category names). Then give me the make followed by the model. Here is an example of the format that I want: Category: Riding Mowers | Make: Cub Cadet | Model: XT2 LX46
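With 1,500+ category names, pasting the list into the prompt by hand is easy to get wrong. Here is a minimal Python sketch of generating the prompt text from a list of categories so it can be pasted into the snippet. The function name `build_prompt` and the three sample categories are my own illustration, not part of the original setup.

```python
def build_prompt(categories):
    """Assemble the single combined prompt from a list of category names."""
    category_list = ", ".join(categories)
    return (
        "Categorize this page within one of the following categories: "
        f"{category_list}. Then give me the make followed by the model. "
        "Here is an example of the format that I want: "
        "Category: Riding Mowers | Make: Cub Cadet | Model: XT2 LX46"
    )


# Hypothetical three-item taxonomy for illustration only;
# the real list contained 1,500+ category names.
sample_categories = ["Riding Mowers", "Zero-Turn Mowers", "Chainsaws"]
prompt = build_prompt(sample_categories)
```

Generating the prompt from the same taxonomy export that the backend uses keeps the category names in the prompt identical to the ones you will match against later.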

Next, I chose to limit Screaming Frog to crawling a single XML sitemap that consisted only of products. OpenAI API calls cost money, so I didn’t want to waste any on pages that didn’t need categorization. From this point, I let Screaming Frog crawl about 2,000 URLs for this test and had it output the category, make & model into a single column within the Custom JavaScript tab.

Initially I tried to put category, make and model into separate columns, but I kept exceeding OpenAI’s limit on API calls, so I put everything into a single prompt. This was also less expensive, since each URL crawled needed one prompt rather than 3 separate prompts.

From here, I downloaded the CSV file and used some batch Find & Replace to separate category, make and model into individual columns. Upon reviewing the results, I was pretty impressed. The AI categorization made few errors, and the pages that could not be categorized lacked enough on-page content for the main topic to be clear.
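The Find & Replace step can also be scripted. This is a rough Python sketch, not what I actually ran, that splits the single response column into category, make and model and flags any row that does not follow the requested format. It assumes the export has an `Address` column for the URL and that the response column is named `ChatGPT`; that second name is a placeholder, so swap in whatever your snippet is called.

```python
import csv
import io
import re

# Matches the format requested in the prompt, e.g.
# "Category: Riding Mowers | Make: Cub Cadet | Model: XT2 LX46"
RESPONSE_PATTERN = re.compile(
    r"Category:\s*(?P<category>.*?)\s*\|\s*"
    r"Make:\s*(?P<make>.*?)\s*\|\s*"
    r"Model:\s*(?P<model>.*?)\s*$"
)


def split_response(text):
    """Split one model response into category, make and model."""
    match = RESPONSE_PATTERN.search(text.strip())
    if match is None:
        # Response did not follow the format; leave it for human review.
        return {"category": "", "make": "", "model": "", "needs_review": True}
    row = match.groupdict()
    row["needs_review"] = False
    return row


def split_export(csv_text, response_column="ChatGPT"):
    """Parse exported CSV text; the column names are assumptions."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for record in reader:
        parsed = split_response(record.get(response_column) or "")
        rows.append({"Address": record.get("Address", ""), **parsed})
    return rows


# Made-up sample rows for illustration.
sample = (
    "Address,ChatGPT\n"
    "https://example.com/p/1,Category: Riding Mowers | Make: Cub Cadet | Model: XT2 LX46\n"
    "https://example.com/p/2,I could not determine the category.\n"
)
result = split_export(sample)
```

Rows with `needs_review` set are the ones worth a manual look, such as pages with too little on-page content for the model to answer in the expected format.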


Since my Product IDs were built into the URLs, I was able to easily extract them into a separate column. This makes it easy to match the auto-categorization data to the products within the backend system.
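As a sketch of that extraction step in Python: the example below assumes the product ID is the final run of digits in the URL path, which is a hypothetical pattern chosen for illustration. The real URL structure is not shown here, so adjust the regular expression to match your own URLs.

```python
import re
from urllib.parse import urlparse

# Assumption: the product ID is the last run of digits at the end of the path,
# e.g. /products/cub-cadet-xt2-lx46-123456 (hypothetical URL structure).
PRODUCT_ID_PATTERN = re.compile(r"(\d+)/?$")


def extract_product_id(url):
    """Return the trailing numeric ID from the URL path, or None if absent."""
    path = urlparse(url).path  # drops the query string and fragment
    match = PRODUCT_ID_PATTERN.search(path)
    return match.group(1) if match else None


product_id = extract_product_id(
    "https://example.com/products/cub-cadet-xt2-lx46-123456"
)
```

Returning `None` when no ID is found makes it easy to spot URLs that do not follow the expected pattern before joining the results to backend data.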


After Using Screaming Frog + ChatGPT, I cleaned up the AI categorization output in Excel

What Are The Benefits of AI Categorization?

  1. AI reduces the amount of manual labor necessary to categorize vast quantities of pages.
  2. Improves efficiency. In my test, I categorized about 2,000 pages in about 20 minutes. Assigning that many categories, makes and models manually would probably take me a day, assuming I knew the products well and didn’t have to spend much time researching. At that rate I could do 30k in a day. You can let artificial intelligence do the heavy lifting and then have humans review / edit the results.
  3. Improves user discoverability. Unless a user is using the site search function, the main way someone will typically find products on your site is either through Google or via categories and facets. When products are not categorized or are under-categorized, there are far fewer opportunities for users to discover those pages.
  4. Improves SEO performance by making the products discoverable. If your PDPs are not linked from your core navigation, it is much more difficult for search engines to discover them, since the most common way for bots to discover URLs is through links.
  5. AI categorization accuracy can be pretty good; certainly better than offshore resources.
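The throughput figure in point 2 can be checked with some quick arithmetic. This sketch uses only the approximate numbers from my test and assumes the crawl rate stays constant at larger volumes, which API limits may not allow in practice. It also compares the number of API calls for one combined prompt per URL versus three separate prompts.

```python
pages_tested = 2_000   # approximate number of URLs in the test crawl
minutes_taken = 20     # approximate crawl duration

# Observed rate, assumed to hold at larger volumes.
pages_per_minute = pages_tested / minutes_taken

target_pages = 30_000
hours_needed = target_pages / pages_per_minute / 60

# One combined prompt per URL versus three separate prompts per URL.
calls_combined = pages_tested * 1
calls_separate = pages_tested * 3

print(f"{pages_per_minute:.0f} pages per minute")
print(f"{hours_needed:.1f} hours for {target_pages:,} pages")
print(f"{calls_separate - calls_combined:,} API calls saved on the test crawl")
```

At roughly 100 pages per minute, 30k pages works out to about 5 hours of crawling, so "30k in a day" leaves room for review time.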

I was able to categorize product pages with make & model using only off-the-shelf tools (Screaming Frog SEO Spider, ChatGPT & Microsoft Excel). When we are limited by engineering resources or product features, we are forced to find creative solutions to business problems until we get the resourcing we need to do our jobs. AI allows us to do things that were not previously possible without substantial resources.
