Python script: SEO content analysis of your competitor

Stefan Neefischer

Posted on Feb 3, 2021
at 9:38 am

Analyzing the content of your competitors can provide you with valuable insights about your own operations and goals. This basic Python script can give you information on N-rams in seconds.

On this post

Python script: SEO content analysis of your competitor

How to analyze your competitors’ content with Python script

This Python script is a basic version of a content analysis of your competitor. The main idea is to get a quick overview of what the writing focus looks like. A lean approach is to fetch all URLs in the sitemap, parse the URL slugs and run an n-gram analysis on it.

If you want to know more about n-gram analysis, please also have a look at our Free N-Gram Tool. You can apply it not only for URLs but also keywords, titles, etc.

As a result, you’ll get a list of used n-grams in the URL slugs together with the amount of pages that used this n-gam. This analysis will only take a few seconds even on large sitemaps and will run with less than 50 lines of code.

Additional approaches

If you want to get deeper insights, I can recommend going on with these approaches:

Fetch the content of each URL in the sitemap.
Create n-grams found in headlines.
Create n-grams found in content.
Extract keywords with Textrank or Rake.
Extract known entities for your SEO business.

However, let’s start simple and take a first look into the rabbit hole with this script. Based on your feedback, we may add some more sophisticated approaches. Before running the script, you just need to enter the sitemap URL you want to analyze. After running the script, you’ll find your results in sitemap_ngrams.csv. Open it in Excel or Google Sheets and have fun with analyzing the data.

Bonus Tip:

If you run it twice—once for your competitor and once for your own sitemap—you can easily create a Gap-Analysis within Excel.

Here is the Python code:

# Pemavor.com Sitemap Content Analyzer
# Author: Stefan Neefischer

import advertools as adv
import pandas as pd


def sitemap_ngram_analyzer(site):

    sitemap = adv.sitemap_to_df(site)
    sitemap = sitemap.dropna(subset=["loc"]).reset_index(drop=True)

    # Some sitemaps keeps urls with "/" on the end, some is with no "/"
    # If there is "/" on the end, we take the second last column as slugs
    # Else, the last column is the slug column
    slugs = sitemap['loc'].dropna()[sitemap['loc'].dropna().str.endswith('/')].str.split('/').str[-2].str.replace('-', ' ')
    slugs2 = sitemap['loc'].dropna()[~sitemap['loc'].dropna().str.endswith('/')].str.split('/').str[-1].str.replace('-', ' ')

    # Merge two series
    slugs = list(slugs) + list(slugs2)

    # adv.word_frequency automatically removes the stop words
    word_counts_onegram = adv.word_frequency(slugs)
    word_counts_twogram = adv.word_frequency(slugs, phrase_len=2)

    output_csv = pd.concat([word_counts_onegram, word_counts_twogram], ignore_index=True)\
                    .rename({'abs_freq':'Count','word':'Ngram'}, axis=1)\
                    .sort_values('Count', ascending=False)

    #Save input csv with scores
    output_csv.to_csv('sitemap_ngrams.csv', index=False)
    print("csv file saved")


# Provide the Sitemap that should be analyzed
site = "https://searchengineland.com/sitemap_index.xml"
sitemap_ngram_analyzer(site)
#the results will be saved to sitemap_ngrams.csv file

Do you need a custom Python script?

With our experienced Python developers, you can bring your ideas to life. We understand the power and flexibility of Python. Contact us today. And let’s discuss how we can help you.

Feb 3, 2021
at 9:38 am

Python script: SEO content analysis of your competitor

How to analyze your competitors’ content with Python script

Additional approaches

Here is the Python code:

Do you need a custom Python script?

More Similar Posts

Scrape the Google Scholar Autosuggest with Python

How to use Google, Bing and Amazon Autocomplete APIs

Cluster massive amounts of keywords in PPC

N-gram analysis in PPC

The Rise of Python since 2004

PEMAVOR Search Trends Tool: How is it done?

Subscribe to Newsletter

Recent Posts

FREE SERP Rank Checker: Monitor Your Website’s Visibility and SEO

iOS 18 critical updates: How they impact user data and analytics

Consent Management Platforms: Why do you need and how to choose

Consentless tracking: Boost data collection with privacy compliance

Understanding Chrome’s third-party cookie restrictions and their impact

All Insights