Python Script: Google Autosuggest Trends for Niche Keywords

You like Google Trends but you are disappointed that it does not work well for Long Tail Keywords? Then keep on reading and use this Python Script to build your own “Google Trends” service by using the google autosuggest.


I really like the official google trends service for analyzing new search behaviour. There are two things that stopped me from using it for more than “just playing around”:

  1. If your focus are finding new niche keywords google trends does not show you sufficient data
  2. There is no official API for making requests to google trends, if you use modules like pytrends you have to use proxy servers to avoid getting blocked

In short: This will not solve the job for detecting new niche keywords on scale. Let’s build a solution in Python that is doing the job!

Fetch and Store Autosuggest results over time and look at the changes

Let’s say you have 1000 Seed Keywords you send to the Autosuggest. This will give you probably around 200.000 longtail queries. Then do the same one week later and compare both datasets:

  • Which Queries are new compared to your last run? This is probably the case you are interested in. Google thinks those queries are getting more important – this is how you can build your custom Google Trends solution!
  • Which Queries are not visible anymore

The Python Script is very straight forward and most of the code I already shared here. The new part is about saving the data from past runs and make the comparison between the suggestions over time. I avoided to use file based databases like SQLite for simplicity – so all the data storage is done using CSV Files. Just import the file in Excel and explore niche keyword trends for your business.

Please follow these steps:

  1. Enter your seed keyword set that should be send to the autocomplete: keywords.csv
  2. Adjust the Script settings for your need:
    • LANGUAGE: default “en”
    • COUNTRY: default “us”
  3. Schedule the Script to run once a week. It is also fine to start it manually
  4. Use keyword_suggestions.csv for further analysis:
    • first_seen: this is the date where the query appeared for the first time in the autosuggest
    • last_seen: the date where the query was seen for the last time
    • is_new: if first_seen == last_seen we set this to “True” – Just filter on this value to get the new trending searches in the google autosuggest


Here is the Python Code:

# Pemavor.com Autocomplete Trends
# Author: Stefan Neefischer (stefan.neefischer@gmail.com)
import concurrent.futures
from datetime import date
from datetime import datetime
import pandas as pd
import itertools
import requests
import string
import json
import time

charList = " " + string.ascii_lowercase + string.digits

def makeGoogleRequest(query):
    # If you make requests too quickly, you may be blocked by google 
    time.sleep(WAIT_TIME)
    URL="http://suggestqueries.google.com/complete/search"
    PARAMS = {"client":"opera",
            "hl":LANGUAGE,
            "q":query,
            "gl":COUNTRY}
    response = requests.get(URL, params=PARAMS)
    if response.status_code == 200:
        try:
            suggestedSearches = json.loads(response.content.decode('utf-8'))[1]
        except:
            suggestedSearches = json.loads(response.content.decode('latin-1'))[1]
        return suggestedSearches
    else:
        return "ERR"

def getGoogleSuggests(keyword):
    # err_count1 = 0
    queryList = [keyword + " " + char for char in charList]
    suggestions = []
    for query in queryList:
        suggestion = makeGoogleRequest(query)
        if suggestion != 'ERR':
            suggestions.append(suggestion)

    # Remove empty suggestions
    suggestions = set(itertools.chain(*suggestions))
    if "" in suggestions:
        suggestions.remove("")

    return suggestions


def autocomplete(csv_fileName):
    dateTimeObj = datetime.now().date()
    #read your csv file that contain keywords that you want to send to google autocomplete
    df = pd.read_csv(csv_fileName)
    keywords = df.iloc[:,0].tolist()
    resultList = []

    with concurrent.futures.ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
        futuresGoogle = {executor.submit(getGoogleSuggests, keyword): keyword for keyword in keywords}

        for future in concurrent.futures.as_completed(futuresGoogle):
            key = futuresGoogle[future]
            for suggestion in future.result():
                resultList.append([key, suggestion])

    # Convert the results to a dataframe
    suggestion_new = pd.DataFrame(resultList, columns=['Keyword','Suggestion'])
    del resultList

    #if we have old results read them
    try:
        suggestion_df=pd.read_csv("keyword_suggestions.csv")
        
    except:
        suggestion_df=pd.DataFrame(columns=['first_seen','last_seen','Keyword','Suggestion'])
    
    suggestionCommon_list=[]
    suggestionNew_list=[]
    for keyword in suggestion_new["Keyword"].unique():
        new_df=suggestion_new[suggestion_new["Keyword"]==keyword]
        old_df=suggestion_df[suggestion_df["Keyword"]==keyword]
        newSuggestion=set(new_df["Suggestion"].to_list())
        oldSuggestion=set(old_df["Suggestion"].to_list())
        commonSuggestion=list(newSuggestion & oldSuggestion)
        new_Suggestion=list(newSuggestion - oldSuggestion)
         
        for suggest in commonSuggestion:
            suggestionCommon_list.append([dateTimeObj,keyword,suggest])
        for suggest in new_Suggestion:
            suggestionNew_list.append([dateTimeObj,dateTimeObj,keyword,suggest])
    
    #new keywords
    newSuggestion_df = pd.DataFrame(suggestionNew_list, columns=['first_seen','last_seen','Keyword','Suggestion'])
    #shared keywords with date update
    commonSuggestion_df = pd.DataFrame(suggestionCommon_list, columns=['last_seen','Keyword','Suggestion'])
    merge=pd.merge(suggestion_df, commonSuggestion_df, left_on=["Suggestion"], right_on=["Suggestion"], how='left')
    merge = merge.rename(columns={'last_seen_y': 'last_seen',"Keyword_x":"Keyword"})
    merge["last_seen"].fillna(merge["last_seen_x"], inplace=True)
    del merge["last_seen_x"]
    del merge["Keyword_y"]
    

    #merge old results with new results
    frames = [merge, newSuggestion_df]
    keywords_df =  pd.concat(frames, ignore_index=True, sort=False)
    # Save dataframe as a CSV file
    keywords_df['first_seen'] = pd.to_datetime(keywords_df['first_seen'])
    keywords_df = keywords_df.sort_values(by=['first_seen','Keyword'], ascending=[False,False])   
    keywords_df['first_seen']= pd.to_datetime(keywords_df['first_seen'])
    keywords_df['last_seen']= pd.to_datetime(keywords_df['last_seen'])
    keywords_df['is_new'] = (keywords_df['first_seen']== keywords_df['last_seen'])
    keywords_df=keywords_df[['first_seen','last_seen','Keyword','Suggestion','is_new']]
    keywords_df.to_csv('keyword_suggestions.csv', index=False)





# If you use more than 50 seed keywords you should slow down your requests - otherwise google is blocking the script
# If you have thousands of seed keywords use e.g. WAIT_TIME = 1 and MAX_WORKERS = 5
WAIT_TIME = 0.2
MAX_WORKERS = 20
# set the autocomplete language
LANGUAGE = "en"
# set the autocomplete country code - DE, US, TR, GR, etc..
COUNTRY="US"
# Keyword_seed csv file name. One column csv file.
#csv_fileName="keyword_seeds.csv"
CSV_FILE_NAME="keywords.csv"
autocomplete(CSV_FILE_NAME)
#The result will save in keyword_suggestions.csv csv file

Join the conversation on LinkedIn

More Similar Posts

Menu