0115 964 8205

A key area of performance for Google Shopping campaigns, or even “free listings” is the quality of the data being submitted to Google Merchant Centre. If there are missing attributes which are required to serve products, a product will become disapproved. If Google expects additional attribute values to be submitted then you may encounter warnings such as “Limited performance due to missing value []”. The general rule of thumb is that any product submitted to Google Merchant Centre should have all of the required attributes submitted to be eligible to serve and as many additional, respective attribute values to enhance performance.

Google Merchant Centre Product Issues

It is not always easy or possible to adjust feed plugins, extensions or API integrations with your ecommerce platform to easily submit certain attribute values to your Google Merchant Centre account. Fortunately, Google Merchant Centre has a feature that can help with such a scenario. The “supplemental feeds” feature is located within Google Merchant Centre’s left hand navigation menu under Products > Feeds. As it states on the page you are able to “use supplemental feeds as a secondary data source to provide additional attributes to your primary feeds”.

Google Merchant Centre Supplemental Feeds with Google Sheets

The supplemental feed feature functions in very much the same way as the main “primary feeds”, allowing Merchants to submit product data via a variety of methods. The feed type which we are focussing on for this post is the ability to create and fetch a Google Sheet document which we would theoretically be submitting additional attributes for products. We will not be discussing how to create the document within this post, but how to scrape data from product pages to populate attribute values.

This feature is potentially being used by many seasoned PPC consultants, but it can be very time consuming if you have a large volume of products to collate data in to a Google Sheet if you do not have access or the ability to export required data from an ecommerce platform.

We recently encounter such a scenario with a client where a proportion of products were disapproved due to not having 2 of the following 3 attributes: Brand, MPN, GTIN. The products do not have GTIN values as they are custom products. We were awaiting access to their site and their development agency was slow to respond to our request to see if their feed generator could be configured to submit products SKU’s as the MPN attribute.

Google Merchant Centre Supplemental Feed

The fastest solution was to create a supplemental feed within the Google Merchant Centre account, outlining each disapproved product ID and then submitting the SKU as the MPN attribute for each respective product. The time consuming aspect would be that the product SKU’s were on the respective product pages of the site, but it would be a time consuming process to visit each page, copy the SKU value and past it in the our Google Sheets supplemental feed with its respective product ID.

A supplemental feed was created as a Google Sheet and configured within the account. Next we exported all of the products which were disapproved due to the missing attribute. Within the exported data we had all of the attribute values for the products including URL and ID. This started our thought process of being able to utilise importXML within a Google Sheet to populate the MPN value from each products respective URL.

Scrape Product Page data to Google Sheets with ImportXML

Armed with a .csv export of the disapproved products from Google Merchant Centre, we refined the columns to leave us with the product ID and URL columns only in Excel.

Next, we created a new Google Sheet document which we are going to utilise for scraping the URL’s to retrieve the “SKU” attribute from the page, which we will import in to an MPN column using an importXML function.

Google Sheets importXML

The function which you should enter in to the cell to scrape the value is below.

=importxml((B2), "//dd[@data-testid='productSku']")

This is specific to the made.com product pages shown within our example, so let’s explain the syntax in parts so that you are able to adjust it to your own web sites structure.

The first part of the function: =importxml((B2), is stating that the data is going to be imported from the cell “B2” within our Google Sheet, which is the product pages URL.

The second half of the function is stating which HTML element to scrape the data from: “//dd[@data-testid=’productSku’]”) which in our example on the made.com web site is a ‹dd› HTML tag with the data-testid value of productSku.

You can adapt the formula to scrape any value of a particular HTML tag from a page. In the example below we are scraping the colour value from this Deichman product page which is contained in a ‹span› HTML tag with the class colorname.

=importxml((B2), "//span[@class='colorname']")
Google Sheets importXML

Once you have scraped the product data required, you can then copy and paste the values from the scraping Google Sheet and past the to the respective attribute columns within your supplemental feed Google Sheet which will then update those specific attributes within the products in Google Merchant Centre.

As you would expect, you can apply this function and adapt the formula to scrape any data from a web page as you see fit. Simply identify the HTML element within Chrome dev tools and then adjust the importXML function as required to scrape its value in to the Google Sheet.

Google Chrome Dev Tools

ImportXML Performance Tips

You may find that the importXML function may take a long time when scraping a large volume of values at once. To reduce this, we suggest adding the function to the first cell in a column. Then drag the formula down and essentially batch the process. 10 to 20 at a time from our experience works extremely well.

It may also be the case that your server settings or firewall may not allow large volumes of requests at once and could potentially block the Google Sheet importing, so again batching the process can help alleviate this. If you encounter “unable to fetch” related errors or “N/A” values and you are confident that the formula within the function is correct, it may be the case that your firewall is blocking the function.

Share

Author Biography

Mathew


A 14 year industry veteran that specialises in wide array of online marketing areas such as PPC, SEO, front end web development, WordPress and Magento development.

Accredited Google Partner & Bing Ads qualifications, BA (Hons) in Digital Marketing. One half of the Director duo at Kumo.