Technical SEO for e-commerce: crawlability, indexing and structured data audit

Why technical SEO decides before content

Publishing excellent content in a store that Googlebot cannot crawl is like putting a sign in the window of a shop that is closed. Technical SEO is the set of conditions that allow Google to find, interpret and index each product, category and content page of your online store.

A thorough technical audit answers three questions: can the bot reach the page? Can it read the page? Is what it reads structured well enough to generate rich results? This article maps each layer of this process with tools and checklists applicable to any platform.

⚠️

Warning: Crawlability and indexing errors usually surface weeks after publishing, when organic traffic simply does not grow. Preventive diagnosis avoids months of wasted content work.

Layer 1: Crawlability — can the bot reach it?

Crawlability is the ability of Googlebot to navigate your site. The most common problems in e-commerce:

robots.txt blocking product or category pages: a classic error in migrated stores
Noindex on listing pages: common in themes with incorrect default settings
JavaScript-dependent rendering: Googlebot does execute JavaScript, but rendering is deferred and can fail, so SPAs and client-rendered frameworks may expose products, links or metadata late or not at all
Redirect chains: each additional hop consumes crawl budget
Broken internal links (404): waste authority and create dead ends for the bot
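The first problem in this list can be checked offline. The sketch below uses Python's standard `urllib.robotparser` to test a handful of URLs against a robots.txt file. The robots.txt content and the `shop.example` URLs are hypothetical. Note that this parser does simple prefix matching and does not implement Google's wildcard rules, so treat the result as a first pass rather than a verdict.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, as might survive a platform migration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /products/
"""

def blocked_urls(robots_txt: str, urls: list[str], agent: str = "Googlebot") -> list[str]:
    """Return the URLs that the given user agent is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]

# Hypothetical strategic URLs that should stay crawlable.
urls = [
    "https://shop.example/products/white-t-shirt/",
    "https://shop.example/tops/",
    "https://shop.example/cart/",
]

blocked = blocked_urls(ROBOTS_TXT, urls)
for url in blocked:
    print("blocked:", url)
```

Here the product URL is reported as blocked, which is the kind of leftover rule worth catching before it costs traffic.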

How to audit crawlability

Use Google Search Console → URL Inspection to verify pages individually. For a systemic view, tools like Screaming Frog, Sitebulb or Ahrefs Site Audit map the entire site and identify problematic patterns.

Also check the server access logs: they show which URLs Googlebot requested, how often, and which returned errors. Because they record what the bot actually did, they are more reliable than a simulated crawl by a third-party tool.
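As a rough illustration of the log check, the sketch below counts Googlebot requests per path and status code. It assumes the common "combined" log format and uses made-up sample lines, so the regular expression will need adjusting for your server. It also trusts the user-agent string, which can be spoofed; verifying the requesting IP by reverse DNS is the stricter check.

```python
import re
from collections import Counter

# Assumes the "combined" access log format; adapt the pattern to your server.
LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)

# Hypothetical sample lines for illustration only.
SAMPLE_LOG = """\
66.249.66.1 - - [10/Jan/2025:10:00:00 +0000] "GET /tops/white-t-shirt/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.1 - - [10/Jan/2025:10:00:05 +0000] "GET /old-category/ HTTP/1.1" 404 312 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
203.0.113.7 - - [10/Jan/2025:10:00:09 +0000] "GET /tops/ HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
"""

def googlebot_hits(log_text: str) -> Counter:
    """Count requests per (path, status) whose user agent mentions Googlebot."""
    hits = Counter()
    for line in log_text.splitlines():
        match = LINE.search(line)
        if match and "Googlebot" in match.group("agent"):
            hits[(match.group("path"), int(match.group("status")))] += 1
    return hits

hits = googlebot_hits(SAMPLE_LOG)
errors = {key: count for key, count in hits.items() if key[1] >= 400}
print("Googlebot requests:", sum(hits.values()))
print("Error responses:", errors)
```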

✅

Crawlability checklist: robots.txt without improper blocks · Updated and submitted XML Sitemap · No noindex on strategic pages · Direct redirects (no chain) · Crawl budget preserved (avoid infinite pagination without rel=canonical)
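For the "direct redirects" item, a minimal sketch: given a redirect map exported from the server or CMS (the `REDIRECTS` dictionary here is hypothetical), follow each source to its final destination and flag anything that takes more than one hop.

```python
# Hypothetical redirect map: source path -> target path.
REDIRECTS = {
    "/tshirts": "/t-shirts",
    "/t-shirts": "/tops/t-shirts/",
    "/sale-old": "/sale/",
}

def redirect_chain(start: str, redirects: dict[str, str], max_hops: int = 10) -> list[str]:
    """Follow redirects from start and return every URL visited, in order."""
    chain = [start]
    while chain[-1] in redirects and len(chain) <= max_hops:
        target = redirects[chain[-1]]
        looped = target in chain
        chain.append(target)
        if looped:  # stop on a redirect loop
            break
    return chain

chains = {src: redirect_chain(src, REDIRECTS) for src in REDIRECTS}

# A chain with more than one hop should be collapsed into a single redirect.
multi_hop = {src: chain for src, chain in chains.items() if len(chain) - 1 > 1}
for src, chain in multi_hop.items():
    print(" -> ".join(chain), f"(point {src} straight at {chain[-1]})")
```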

Layer 2: Indexing — is what was crawled in the index?

Crawling does not mean indexing. Google can visit a page and decide not to include it in the index for various reasons: duplicate content, perceived low quality, canonical pointing to another URL, or simply by algorithmic decision.

Canonicalization and duplicate products

In e-commerce, duplicates are endemic: the same t-shirt appears in /tops/white-t-shirt/, /sale/white-t-shirt/ and /summer-collection/white-t-shirt/. Without a correct rel=canonical, Google chooses which version to index — often not the one you want.
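To see which version each duplicate declares as canonical, a crawler export can be scanned for the `rel=canonical` link. The sketch below uses only the standard library. The `PAGES` dictionary stands in for fetched HTML and is hypothetical.

```python
from html.parser import HTMLParser

class CanonicalParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")

def canonical_of(html: str):
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical

# Hypothetical HTML for the three URLs of the same t-shirt.
PREFERRED = "https://shop.example/tops/white-t-shirt/"
PAGES = {
    "https://shop.example/tops/white-t-shirt/":
        f'<head><link rel="canonical" href="{PREFERRED}"></head>',
    "https://shop.example/sale/white-t-shirt/":
        f'<head><link rel="canonical" href="{PREFERRED}"></head>',
    "https://shop.example/summer-collection/white-t-shirt/":
        "<head><title>White t-shirt</title></head>",
}

canonicals = {url: canonical_of(html) for url, html in PAGES.items()}
missing = [url for url, canonical in canonicals.items() if canonical is None]
print("Pages without a canonical:", missing)
```

Any URL in `missing` leaves the choice of indexed version to Google.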

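URL parameters for filters, sorting and tracking also create silent duplicates. Google retired the URL Parameters tool in Search Console in 2022, so parameter handling now has to be done on the site itself: point parameterized URLs at the clean version with rel=canonical, link internally only to the clean URLs, and use robots.txt to keep the bot out of parameter combinations that have no search value.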
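One way to surface parameter duplicates is to normalize a list of crawled URLs and group them. In this sketch the `IGNORED_PARAMS` set is an assumption about which parameters never change the page content; each store has to decide that list for itself.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: on this hypothetical store these parameters never change the content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "gclid", "sort"}

def normalize(url: str) -> str:
    """Drop ignored parameters and sort the rest so equivalent URLs compare equal."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    )
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

# Hypothetical URLs taken from a crawl or from the access logs.
crawled = [
    "https://shop.example/tops/?sort=price_asc",
    "https://shop.example/tops/?utm_source=newsletter&utm_medium=email",
    "https://shop.example/tops/",
    "https://shop.example/tops/?color=white",
]

groups = defaultdict(list)
for url in crawled:
    groups[normalize(url)].append(url)

for clean, variants in groups.items():
    if len(variants) > 1:
        print(f"{len(variants)} variants should canonicalize to {clean}")
```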

Thin content and empty category pages

Category pages with only 4 or 5 products, no description and no editorial context are likely to be judged too thin to index. Add a paragraph of relevant text above the products, state the value proposition of the category and use heading tags (H1, H2) hierarchically.

Layer 3: Structured Data — data that generates rich results

Structured data is markup, most commonly JSON-LD (the format Google recommends), that tells Google what each element means, not just how it looks. For e-commerce, the most relevant types are:

🛍️

Product + Offer

Makes the result eligible to show price, availability and reviews. Google also reads this markup, alongside Merchant Center feeds, for shopping listings.

⭐

Review + AggregateRating

Displays rating stars directly in the organic result, which can increase CTR (up to 35% is the figure quoted here; actual gains vary by query and sector).

🗺️

BreadcrumbList

Shows the navigation hierarchy in the snippet and improves Google's understanding of site structure.

❓

FAQPage

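Can expand the result with a questions accordion that takes up more space on the page. Since August 2023, however, Google shows FAQ rich results mainly for well-known government and health sites, so most stores should not count on this enhancement.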
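As an illustration of the Product + Offer type listed above, the sketch below builds a JSON-LD block from catalog fields. The function name, the sample product and the `shop.example` URL are hypothetical, and the property set is a minimal subset of schema.org rather than everything Google can use.

```python
import json

def product_jsonld(name: str, sku: str, url: str, price: float,
                   currency: str, in_stock: bool) -> str:
    """Return a <script> tag with minimal Product + Offer markup."""
    availability = "InStock" if in_stock else "OutOfStock"
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "sku": sku,
        "url": url,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": f"https://schema.org/{availability}",
            "url": url,
        },
    }
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'

# Hypothetical catalog entry.
snippet = product_jsonld(
    name="White t-shirt",
    sku="TS-WHT-001",
    url="https://shop.example/tops/white-t-shirt/",
    price=19.9,
    currency="EUR",
    in_stock=True,
)
print(snippet)
```

Generating the markup from the same data that renders the page keeps price and availability in the snippet consistent with what the visitor sees.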

How to validate structured data

Use Google's Rich Results Test to check if the markup is correct and eligible for rich snippets. The Schema Markup Validator detects syntax errors. Both are free and should be part of the routine after any deployment.
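Before running the online validators, a quick local pre-check can catch the crudest mistakes in a build pipeline. The sketch below extracts JSON-LD blocks from a page, confirms they parse, and applies one rule for Product markup: a name plus at least one of offers, review or aggregateRating. That rule reflects Google's product snippet guidelines as I understand them; treat it as an assumption and confirm against the current documentation. The sample HTML is hypothetical.

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False

def precheck(html: str) -> list[str]:
    """Return a list of problems found in the page's JSON-LD blocks."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    problems = []
    for raw in extractor.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append(f"invalid JSON: {exc.msg}")
            continue
        if isinstance(data, dict) and data.get("@type") == "Product":
            if "name" not in data:
                problems.append("Product without name")
            # Assumed rule: at least one of these must accompany the name.
            if not any(key in data for key in ("offers", "review", "aggregateRating")):
                problems.append("Product needs offers, review or aggregateRating")
    return problems

# Hypothetical page: one incomplete Product, one block with a trailing comma.
PAGE = """<html><head>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Product", "name": "White t-shirt"}</script>
<script type="application/ld+json">{"@type": "BreadcrumbList",}</script>
</head></html>"""

problems = precheck(PAGE)
for problem in problems:
    print(problem)
```

A clean pre-check does not replace the Rich Results Test, which remains the reference for eligibility.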

💡

Note Google's requirements: review rich snippets require the reviews to come from real users, not from the company itself. Markup that misrepresents the page can lead to a manual action that removes rich results.

Core Web Vitals as a ranking factor

Core Web Vitals have been ranking signals since the 2021 page experience update. The current set is LCP, INP (which replaced FID in March 2024) and CLS. A store with LCP above 4 s, the point at which Google rates it as poor, can have all the correct content and perfect structured data and still lose to faster competitors.

Measure your Core Web Vitals with PageSpeed Insights and, more importantly, with CrUX field data, which Search Console surfaces in its Core Web Vitals report. Lab data is useful for debugging, but field data is what Google uses for ranking.
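Field data is assessed at the 75th percentile of page loads. The sketch below applies the thresholds Google publishes at the time of writing (LCP 2.5 s and 4 s, INP 200 ms and 500 ms, CLS 0.1 and 0.25) to a set of samples. The sample values are hypothetical, and the nearest-rank percentile used here is a simplification of how CrUX aggregates data.

```python
import math

# (upper bound for "good", upper bound for "needs improvement")
THRESHOLDS = {
    "LCP": (2.5, 4.0),    # seconds
    "INP": (200, 500),    # milliseconds
    "CLS": (0.1, 0.25),   # unitless
}

def p75(samples: list[float]) -> float:
    """75th percentile using the nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]

def rate(metric: str, value: float) -> str:
    good, needs_improvement = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= needs_improvement:
        return "needs improvement"
    return "poor"

# Hypothetical LCP samples in seconds, e.g. from real-user monitoring.
lcp_samples = [1.8, 2.1, 2.4, 3.9, 4.6, 2.2, 2.0, 5.1]

lcp_p75 = p75(lcp_samples)
print(f"LCP p75 = {lcp_p75}s: {rate('LCP', lcp_p75)}")
```

Most of these page loads are fast, yet the 75th percentile lands in "needs improvement", which is why averages and single lab runs can be misleading.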

Technical SEO audit checklist — e-commerce

Area	What to check	Tool
Crawlability	robots.txt, noindex, redirects	Search Console, Screaming Frog
Indexing	Page indexing report, canonicals, duplicates	Search Console
Speed	LCP, INP, CLS — field data	PageSpeed Insights, CrUX
Structured Data	Product, Offer, Review, Breadcrumb	Rich Results Test
Internal links	404s, orphan pages, click depth	Screaming Frog, Ahrefs
HTTPS	Valid certificate, mixed content	SSL Labs, DevTools

Need a complete technical SEO audit?

Detailed diagnosis of crawlability, indexing and structured data of your store — with report prioritized by impact and technical correction plan.
