0115 964 8205

A key area of performance for Google Shopping campaigns (PPC), or even “free listings”, is the quality of the data being submitted to Google Merchant Centre. If there are missing attributes which are required to serve products, a product will become disapproved. If Google expects additional attribute values to be submitted, then you may encounter warnings such as “Limited performance due to missing value []”. The general rule of thumb is that any product submitted to Google Merchant Centre should have all of the required attributes submitted to be eligible to serve, and as many additional, respective attribute values to enhance performance.

Google Merchant Centre Product Issues

It is not always easy or possible to adjust feed plugins, extensions or API integrations with your e-commerce platform to easily submit certain attribute values to your Google Merchant Centre account. Fortunately, Google Merchant Centre has a feature that can help with such a scenario. The “supplemental feeds” feature is located within Google Merchant Centre’s left-hand navigation menu under Products > Feeds. As it states on the page, you are able to “use supplemental feeds as a secondary data source to provide additional attributes to your primary feeds”.

Google Merchant Centre Supplemental Feeds with Google Sheets

The supplemental feed feature functions in very much the same way as the main “primary feeds”, allowing Merchants to submit product data via a variety of methods. The feed type which we are focusing on for this post is the ability to create and fetch a Google Sheet document, which we would theoretically be submitting additional attributes for products. We will not be discussing how to create the document within this post, but how to scrape data from product pages to populate attribute values.

This feature is potentially being used by many seasoned PPC consultants, but it can be very time-consuming if you have a large volume of products to collate data into a Google Sheet, if you do not have access to or the ability to export the required data from an e-commerce platform.

We recently encountered such a scenario with a client where a proportion of products were disapproved due to not having 2 of the following 3 attributes: Brand, MPN and GTIN. The products do not have GTIN values as they are custom products. We were awaiting access to their site, and their development agency was slow to respond to our request to see if their feed generator could be configured to submit product SKUs as the MPN attribute.

Google Merchant Centre Supplemental Feed

The fastest solution was to create a supplemental feed within the Google Merchant Centre account, outlining each disapproved product ID and then submitting the SKU as the MPN attribute for each respective product. The time-consuming aspect would be that the product SKUs were on the respective product pages of the site, but it would be a time-consuming process to visit each page, copy the SKU value and paste it in our Google Sheets supplemental feed with its respective product ID.

A supplemental feed was created as a Google Sheet and configured within the account. Next, we exported all of the products which were disapproved due to the missing attribute. Within the exported data, we had all of the attribute values for the products, including URL and ID. This started our thought process of being able to utilise importXML within a Google Sheet to populate the MPN value from each product’s respective URL.

Scrape Product Page data to Google Sheets with ImportXML

Armed with a .csv export of the disapproved products from Google Merchant Centre, we refined the columns to leave us with the product ID and URL columns only in Excel.

Next, we created a new Google Sheet document, which we are going to utilise for scraping the URLs to retrieve the “SKU” attribute from the page, which we will import into an MPN column using an importXML function.

Google Sheets importXML

The function which you should enter into the cell to scrape the value is below.

=importxml((B2), "//dd[@data-testid='productSku']")

This is specific to the made.com product pages shown within our example, so let’s explain the syntax in parts so that you are able to adjust it to your own website’s structure.

The first part of the function: =importxml((B2), is stating that the data is going to be imported from the cell “B2” within our Google Sheet, which is the product pages URL.

The second half of the function is stating which HTML element to scrape the data from: “//dd[@data-testid=’productSku’]”) which in our example on the made.com website is a ‹dd› HTML tag with the data-testid value of productSku.

You can adapt the formula to scrape any value of a particular HTML tag from a page. In the example below we are scraping the colour value from this Deichman product page ,which is contained in a ‹span› HTML tag with the class colorname.

=importxml((B2), "//span[@class='colorname']")
Google Sheets importXML

Once you have scraped the product data required, you can then copy and paste the values from the scraping Google Sheet and paste them to the respective attribute columns within your supplemental feed Google Sheet, which will then update those specific attributes within the products in Google Merchant Centre.

As you would expect, you can apply this function and adapt the formula to scrape any data from a web page as you see fit. Simply identify the HTML element within Chrome Dev tools and then adjust the importXML function as required to scrape its value into the Google Sheet.

Google Chrome Dev Tools

ImportXML Performance Tips

You may find that the importXML function takes a long time when scraping a large volume of values at once. To reduce this, we suggest adding the function to the first cell in a column. Then drag the formula down and essentially batch the process. 10 to 20 at a time, from our experience, works extremely well.

It may also be the case that your server settings or firewall may not allow large volumes of requests at once and could potentially block the Google Sheet importing, so again, batching the process can help alleviate this. If you encounter “unable to fetch” related errors or “N/A” values and you are confident that the formula within the function is correct, it may be the case that your firewall is blocking the function.

Share

Author Biography

Mathew


A 14 year industry veteran that specialises in wide array of online marketing areas such as PPC, SEO, front end web development, WordPress and Magento development.

Accredited Google Partner & Bing Ads qualifications, BA (Hons) in Digital Marketing. One half of the Director duo at Kumo.