How to pull Google Auction Insight Data in an automated way

    What is the Auction Insights Report?

    Auction Insights data is quite useful for understanding your competition in Google Ads, and it is available down to the keyword level. This data can explain why your CPCs increased over time, and it also gives you an idea of your competitors' budgets and how they change over time. However, this valuable Auction Insights data is not available in the Google Ads reporting API.

    Why should you automate the Auction Insights report export?

    It is very valuable data that you should check regularly. When you have multiple Google Ads accounts with large keyword sets, it is very time consuming to pull the data manually. And when you analyze Auction Insights at the keyword level, the file size grows quickly and can become difficult to handle in tools like Excel. If you can export the Auction Insights data in an automated way to, for example, Google BigQuery, things immediately become much more powerful: even on large data sets you can easily build Google Data Studio dashboards to get auction insights.

    Required Services: How to automatically save the Auction Insights data to BigQuery

    Google does not yet allow you to access Auction Insights data via Ads Scripts or the Ads API. For that reason we have to build a workaround to get that data in an automated way. Before we go into the details of the components used in our solution, here is what happens in simple terms:

    In short: everything you normally do in your browser to access Google Auction Insights data has to be done programmatically. As soon as the files are downloaded, we can use official APIs again to store the auction data in Cloud Storage and BigQuery.

    Google Cloud Functions

    It will be used to run the NodeJS scripts that fetch the report and write it to Google BigQuery.

    Google Cloud Storage

    Required to store the Auction Insights reports in .csv.gz format and to trigger the script in Google Cloud Functions. It makes sense to use Google Cloud Storage as the staging area for writing data to Google BigQuery, since Auction Insights data can get very large.

    Google BigQuery

    This is the database where we want to store the Auction Insights report data. BigQuery is cheap and powerful, and it can easily be used as a data source within Google Data Studio.

    Google Ads Scripts

    It will be used to automate the process on a weekly schedule and to monitor the results from the Google Ads UI.

    What does the full auction insight extractor solution look like?

    Necessary Scripts

    Google Cloud Functions

    Auction Insights Report Scraper from Google Ads UI [NodeJS Script]

    By running a headless browser with Puppeteer inside a Google Cloud Function, the script obtains the download link of the Auction Insights report and copies the remote file from Google's servers to Cloud Storage.

    You will need to log in to Google Ads before you can pull report data from the Ads UI. When you try to do this from Google Cloud Functions, you may run into problems: since the function is detected as an unknown device, Google may ask for extra security verification steps, which makes it impossible to log in from the cloud using headless Chrome alone. The Chrome “user data directory” comes to our rescue at this point. Let’s take a look at how you can prepare your user data directory to automate the login to Google Ads with headless Chrome. Follow the steps below in order:

    1. Close Google Chrome browser if it’s running.
    2. You will need to clear your existing user data folder to reduce its size. Warning: this will erase all history, saved and cached data from your Chrome browser. Consider backing up the directory before you proceed. For Windows:
      • Open directory %LOCALAPPDATA%\Google\Chrome\User Data
      • Backup everything inside this directory.
      • Delete everything inside this directory.
      • For more info about other OS visit : https://chromium.googlesource.com/chromium/src/+/master/docs/user_data_dir.md
    3. Open Google Chrome, navigate to https://ads.google.com and log in to your account as usual.
    4. Close Google Chrome after you successfully login.
    5. Go to %LOCALAPPDATA%\Google\Chrome\User Data again. Copy everything inside this folder and save it somewhere you will remember; you will need it later.
    6. Done! This is the profile data you need to log in to Google Ads with headless Chrome.
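
    Optionally, before moving on, you can verify locally that the copied profile really logs you in without any extra security prompts. The short sketch below is only a local check, not part of the final solution: it assumes you have Node.js and Puppeteer installed locally and that the copied profile files sit in a folder called tempFolder next to the test script (any Ads UI page will do for the check):

    //Local check only (not part of the Cloud Function): open Google Ads with the
    //copied profile and confirm that no sign-in form appears.
    const puppeteer = require('puppeteer');

    (async () => {
        const browser = await puppeteer.launch({
            headless: false,             //watch the browser so you can see the result
            userDataDir: './tempFolder'  //the profile data you copied above
        });
        const page = await browser.newPage();
        await page.goto('https://ads.google.com/aw/overview', { waitUntil: 'domcontentloaded' });
        //If the account overview loads without asking you to sign in, the profile works.
    })();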

    Now that we have the user data directory, we can proceed with the script.

    1. Create a new folder for your script files. For example: auctionScraper
    2. Create a new folder inside the folder you just created and name it “tempFolder”.
    3. Paste the files you backed up from your Chrome data directory (%LOCALAPPDATA%\Google\Chrome\User Data) into “tempFolder”.
    4. Create the following new files:
      • index.js
      • package.json

    JavaScript code for index.js (do not forget to edit the required fields: <YOUR-CLOUD-BUCKET-NAME>, the report URL (urlVisit), and <YOUR-CLOUD-PROJECT-ID>)

    const puppeteer = require('puppeteer-extra');
    const StealthPlugin = require('puppeteer-extra-plugin-stealth');
    puppeteer.use(StealthPlugin());
    const fs = require('fs');
    const { Storage } = require('@google-cloud/storage')
    const request = require('request');
    const bucketName = "<YOUR-CLOUD-BUCKET-NAME>"
    exports.auctionScraper = async(req, res) => {
        //Default Auction Insights report URL. Get it from your Ads UI > Reports > Auction Insights
        var urlVisit = "https://ads.google.com/aw/reporting/reporteditor/view?ocid=1022000&reportId=933615822&euid=563803870&__u=5444835630&uscid=112849000&__c=1508801000&authuser=0&subid=ALL-tr-et-g-aw-c-home-awhp_xin1_signin%21o2-ahpm-0000000082-0000000000"
        if (req.body) {
            if (JSON.parse(req.body).url) {
                //has custom report URL on payload
                urlVisit = JSON.parse(req.body).url
            }
        }
        //Initialize Puppeteer
        const browser = await puppeteer.launch({
            headless: true,
            ignoreHTTPSErrors: true,
            userDataDir: './tempFolder',
            args: [
                '--no-sandbox',
                '--disable-setuid-sandbox',
                '--disable-infobars',
                '--window-position=0,0',
                '--disable-features=site-per-process',
                '--ignore-certificate-errors',
                '--ignore-certificate-errors-spki-list',
                '--user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.0 Safari/537.36"'
            ]
        })
        //Open the page
        const page = await browser.newPage();
        //Set default timeouts
        page.setDefaultNavigationTimeout(600000)
        page.setDefaultTimeout(600000)
        await page.goto(urlVisit, { waitUntil: 'domcontentloaded' })
        await page.waitForFunction("document.querySelector('.filter-list-icon') && document.querySelector('.filter-list-icon').clientHeight != 0");
        await page.setRequestInterception(true);
        page.on('request', request => {
            if (request.url().indexOf("awn-report-download") >= 0)
                request.abort();
            else
                request.continue();
        });
        await page.waitForTimeout(300);
        const dateItem = await page.waitForSelector('.date-picker-container .dropdown-and-comparison')
        await dateItem.click()
        await page.waitForTimeout(300);
        //Click “Last 7 Days” > to change the date range preset, modify the selector below:
        const last7Days = await page.waitForSelector('.preset-list .group.preset-container .item:nth-child(5)')
        await last7Days.click();
        await page.keyboard.press('Enter');
        await page.waitForTimeout(300);
        try {
            const applyBtn = await page.waitForSelector('.apply-bar .apply', {
                timeout: 3000
            })
            await applyBtn.click()
        } catch (err) {}
        await page.waitForFunction("document.querySelector('.download-icon') && document.querySelector('.download-icon').clientHeight != 0");
        const downloadBtn = await page.waitForSelector('.download-icon')
        await downloadBtn.click()
        await page.waitForTimeout(300);
        await page.waitForFunction("document.querySelector('.download-dropdown .item:nth-child(5)') && document.querySelector('.download-dropdown .item:nth-child(5)').clientHeight != 0");
        const downloadDropdown = await page.waitForSelector('.download-dropdown .item:nth-child(5)')
        await downloadDropdown.click()
        const finalRequest = await page.waitForRequest(
            (request) => request.url().indexOf("awn-report-download") >= 0
        );
        var body = {
            url: finalRequest.url(),
            type: JSON.parse(req.body).type
        }
        const gcs = new Storage({
            projectId: '<YOUR-CLOUD-PROJECT-ID>'
        });
        var theURL = body.url
        var theTYPE = body.type
        const bucket = gcs.bucket(bucketName);
        var file = bucket.file("report_" + theTYPE + ".csv.gz");
        var x = await new Promise((resolve, reject) => {
            const requ = request(theURL);
            requ.pause();
            requ.on('response', resu => {
                const writeStream = file.createWriteStream({
                    metadata: {
                        contentType: resu.headers['content-type']
                    }
                });
                requ.pipe(writeStream)
                    .on('finish', () => {
                        resolve("ok");
                    })
                    .on('error', err => {
                        writeStream.end();
                        console.error(err);
                        reject();
                    });
                requ.resume();
            });
            requ.on('error', err => console.error(err));
        });
        if (x) {
            res.sendStatus(200)
        }
    }

    package.json

    {
        "name": "auctionscraper",
        "version": "1.0.0",
        "description": "",
        "main": "index.js",
        "engines": {
            "node": ">=16.0.0"
        },
        "scripts": {
            "test": "echo \"Error: no test specified\" && exit 1",
            "start": "functions-framework --target=auctionScraper --trace-warning"
        },
        "author": "",
        "license": "ISC",
        "dependencies": {
            "@google-cloud/functions-framework": "^2.1.0",
            "@google-cloud/storage": "^5.16.1",
            "puppeteer": "^13.3.2",
            "puppeteer-extra": "^3.2.3",
            "puppeteer-extra-plugin-stealth": "^2.9.0",
            "request": "^2.88.2"
        }
    }

    At the end, you should have these files & folders in your script folder: tempFolder, index.js, package.json

    Then you can compress those files to prepare your script for deployment on Google Cloud Functions. Your compressed file should have this structure:

    scriptFiles.zip >
          tempFolder
          index.js
          package.json

    It’s important to keep this structure intact, otherwise you might face errors while uploading the files to Google Cloud Functions. You can now navigate to https://console.cloud.google.com and create a new Cloud Function with the settings below:

    • Function Name: auctionScraper
    • Trigger Type : HTTP (Allow unauthenticated invocations)
    • Memory : 2GB
    • Timeout : 540 seconds
    • Runtime : NodeJS 16
    • Entry point: auctionScraper
    • Source Code : ZIP Upload (upload the zip file you already prepared.)

    Deploy your function. It will now fetch the download link for the last 7 days of Auction Insights data from the Google Ads UI (you can change the date range in the script) and download the report into the bucket you specified. The next step is writing the report's content to BigQuery.
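
    If you want to test the deployed function manually before setting up the automation, you can send it the same kind of payload the Ads script (shown later) will send. The snippet below is only a sketch: replace the two placeholder strings with your Cloud Function trigger URL and your Auction Insights report URL.

    //Manual test of the deployed auctionScraper function (sketch with placeholder URLs).
    const https = require('https');

    const payload = JSON.stringify({
        url: '<your Auction Insights report URL from the Ads UI>',
        type: 'keywords'
    });

    const req = https.request('<your Cloud Function trigger URL>', {
        method: 'POST',
        //text/plain keeps req.body a raw string, which the function parses with JSON.parse
        headers: { 'Content-Type': 'text/plain' }
    }, res => console.log('Status code:', res.statusCode));

    req.write(payload);
    req.end();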

    Importing Report file to BigQuery [NodeJS Script]

    The first step was downloading the report from Google Ads to the storage bucket. Now that this is in place, we can move on to the next step: importing the downloaded compressed CSV file (.csv.gz) into a Google BigQuery table.

    1. Create a new folder for your script files. For example: auctionScraper_automation
    2. Create the following new files inside this folder:
      • index.js
      • package.json

    JavaScript code for index.js (do not forget to edit the required fields <CLOUD-PROJECT-ID> and <DATA-SET-ID>)

    const projectId = "<CLOUD-PROJECT-ID>"
    const dataSet = "<DATA-SET-ID>"
    const tableSuffix = "_AuctionInsightReport"
    const { BigQuery } = require('@google-cloud/bigquery');
    const { Storage } = require('@google-cloud/storage');
    const storage = new Storage();
    const bigquery = new BigQuery({ projectId: projectId });
    exports.sendToBigQuery = (event, context) => {
        const gcsEvent = event;
        async function loadCSVFromGCS() {
            const metadata = {
                sourceFormat: 'CSV',
                skipLeadingRows: 3,
                compression: "GZIP",
                autodetect: true,
                writeDisposition: "WRITE_APPEND"
            };
            const [job] = await bigquery
                .dataset(dataSet)
                .table("keywords" + tableSuffix)
                .load(storage.bucket(gcsEvent.bucket).file(gcsEvent.name), metadata);
            const errors = job.status.errors;
            if (errors && errors.length > 0) {
                throw errors;
            }
        }
        //Only process files written by the scraper; return the promise so the
        //function does not terminate before the load job completes.
        if (gcsEvent.name.indexOf("report_") >= 0) { return loadCSVFromGCS(); }
    };

    The important point here is to touch the file as little as possible before sending it to BigQuery, since report sizes can be huge for large accounts and processing the .csv file line by line would eventually make the function time out. Therefore, we use BigQuery's schema auto-detect: the table schema is created automatically by detecting the data types of the columns.
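
    If you want to confirm that a load job actually landed rows in the table, a minimal sketch like the one below counts them with the @google-cloud/bigquery client. It assumes the same <CLOUD-PROJECT-ID> and <DATA-SET-ID> placeholders as above and the keywords_AuctionInsightReport table the function writes to:

    //Sanity check after a load: count the rows in the destination table.
    const { BigQuery } = require('@google-cloud/bigquery');

    async function countRows() {
        const bigquery = new BigQuery({ projectId: '<CLOUD-PROJECT-ID>' });
        const [rows] = await bigquery.query({
            query: 'SELECT COUNT(*) AS row_count FROM `<CLOUD-PROJECT-ID>.<DATA-SET-ID>.keywords_AuctionInsightReport`'
        });
        console.log('Rows loaded:', rows[0].row_count);
    }

    countRows().catch(console.error);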

    package.json file

    {
      "name": "auctionScraper_automation",
      "version": "0.0.1",
      "dependencies": {
        "@google-cloud/storage": "*",
        "@google-cloud/bigquery" : "*"
      }
    }

    At the end, you should have these files in your script folder: index.js, package.json

    Then you can compress those files to prepare your script for deployment on Google Cloud Functions. Your compressed file should have this structure:

    scriptFiles.zip >
          index.js
          package.json

    You can now navigate to https://console.cloud.google.com and create a new Cloud Function with the settings below:

    • Function Name: auctionScraper_automation (it must be different from the first function’s name)
    • Trigger Type : Cloud Storage
    • Event Type: Finalize/Create
    • Bucket: Select the bucket you specified in the previous script (where your reports are downloaded)
    • Memory : 2GB
    • Timeout : 540 seconds
    • Runtime : NodeJS 16
    • Entry point: sendToBigQuery
    • Source Code : ZIP Upload (upload the zip file you prepared already.)

    Deploy your function. Now, each time a new file is written to your bucket by the previous script, this function will be triggered and will send its content to the BigQuery table.

    Now that our setup can download the report from the Ads UI and send it to BigQuery, the next step is to run it automatically every week. Google Cloud Scheduler could also be used, but we preferred a Google Ads script for this solution.

    1. Navigate to https://ads.google.com/aw/bulk/scripts
    2. Create a new script with the following code and name it as you wish.

    Ads script code :

    var REPORT_SCRAPER_URL = "<your http trigger url for first google cloud function (auctionScraper)>"
    var MAX_RETRIES = 3
    var body = {
        "url": "url of auction insight report page",
        "type": "keywords"
    }
    function main() {
        callCloudFunc(0)
    }
    function callCloudFunc(attempt) {
        var options = {
            muteHttpExceptions: true,
            contentType: "text/plain",
            payload: JSON.stringify(body)
        }
        var response = UrlFetchApp.fetch(REPORT_SCRAPER_URL, options);
        if (response.getResponseCode() == 200) {
            Logger.log("Report scraper triggered successfully.");
        } else if (attempt < MAX_RETRIES) {
            //Retry a limited number of times instead of recursing forever on errors
            callCloudFunc(attempt + 1)
        } else {
            Logger.log("Report scraper failed after " + MAX_RETRIES + " retries: " + response.getContentText());
        }
    }
    3. Schedule the script to run weekly.

    This will fetch the Auction Insights report from the Google Ads UI each week, download it to Cloud Storage, and write its contents to BigQuery, so your data stays up to date on a weekly basis.

    Fetching older auction insight data in bulk

    The Ads script is ready now. Once it starts running, it will download the Auction Insights report every week and append the data to our BigQuery table. But what if you want older data in the table as well, for example the data for the last 2 years?

    Our Cloud Function auctionScraper would run into a timeout while handling such a request, because the data for a 2-year report will probably be huge.

    Waiting for this report to be prepared in the Google Ads UI is the longest part of the job; sometimes it can take more than 10 minutes. Unfortunately, there is not much we can do about that, because the report is prepared entirely by Google and we only get access to the download link once it is ready. So it seems unlikely that we can use the auctionScraper function for this task. But we can work around the problem with our other function, sendToBigQuery (the auctionScraper_automation script). As you’ll remember, this function has nothing to do with the scraper: it simply waits for a file to be uploaded to the specified bucket, then gets triggered and writes the uploaded file’s content to BigQuery.

    So what should our solution be for large reports?

    1. Navigate to Google Ads UI, open your Auction Insights Report.
    2. Filter and set the date range for the report as you wish (the last 2 years, for example).
    3. Click the download button and download the report in .csv.gz format.
    4. It will take some time for Google to prepare your report. Once it is ready, the download will start.
    5. Wait for your download to complete.
    6. Rename your report as report_keywords<any_number>.csv.gz, for example report_keywords1.csv.gz. (The exact number is not important; it only helps you keep track of which files have already been processed.)
    7. Navigate to your Cloud Console.
    8. Open the bucket you created to store reports at previous steps.
    9. Drag & drop or upload the renamed report to this bucket (or upload it programmatically; see the sketch after this list).
    10. It’s done! The sendToBigQuery function (auctionScraper_automation) will be triggered once the upload completes and will write the report’s data to BigQuery.
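
    If you prefer not to use the Cloud Console for the upload in step 9, a minimal sketch with the @google-cloud/storage client looks like this. It assumes you have local Google Cloud credentials available (for example application default credentials) and uses the same bucket placeholder as before:

    //Upload a manually downloaded report to the bucket programmatically.
    //The finalize event triggers the sendToBigQuery function just like a drag & drop upload.
    const { Storage } = require('@google-cloud/storage');

    async function uploadReport() {
        const storage = new Storage({ projectId: '<YOUR-CLOUD-PROJECT-ID>' });
        await storage.bucket('<YOUR-CLOUD-BUCKET-NAME>').upload('./report_keywords1.csv.gz', {
            destination: 'report_keywords1.csv.gz'
        });
        console.log('Upload finished.');
    }

    uploadReport().catch(console.error);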

    Trouble setting up your own solution?

    No developers on your marketing team and struggling to set up this solution? We can handle the setup and maintenance of the Auction Insights exporter for you. If you are interested, feel free to contact us. We also offer many other solutions for automating things in Google Ads.
