Technical SEO
How to check robots.txt, sitemap, and canonical tags
The robots.txt file, the XML sitemap, and canonical tags tell search engines what they may crawl, which URLs exist, and which version of each URL should be indexed.
Problem
Small configuration mistakes in any of the three can block crawling or send contradictory signals, for example a sitemap that lists a URL robots.txt disallows.
Symptoms
- Important pages are missing from search.
- Search engines cannot find or fetch the sitemap.
- Canonical tags point to unexpected URLs.
How to diagnose
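- Fetch /robots.txt and confirm that it loads and does not disallow important paths.
- Find sitemap URLs from the Sitemap lines in robots.txt and from common paths such as /sitemap.xml.
- Inspect the canonical tag on the homepage and on a sample of other pages.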
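The diagnosis steps above can be sketched in Python using only the standard library. This is a minimal illustration, not how Search Lighthouse implements its checks: the helper names (fetch_text, parse_robots, sitemap_candidates, find_canonicals) and the fallback paths in COMMON_SITEMAP_PATHS are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser

# Assumed fallback locations; sites are free to use other paths.
COMMON_SITEMAP_PATHS = ["/sitemap.xml", "/sitemap_index.xml"]


def fetch_text(url, timeout=10):
    """Download a URL and return its body as text (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_robots(robots_txt):
    """Parse the text of a robots.txt file into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def sitemap_candidates(robots, site_root):
    """Sitemap URLs declared in robots.txt, followed by common-path guesses."""
    declared = robots.site_maps() or []  # site_maps() returns None if absent
    guesses = [urljoin(site_root, path) for path in COMMON_SITEMAP_PATHS]
    ordered = []
    for url in declared + guesses:
        if url not in ordered:  # keep first occurrence, drop duplicates
            ordered.append(url)
    return ordered


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html):
    """Return all canonical URLs found in an HTML document."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals
```

To check a live site, pass fetch_text(site_root + "/robots.txt") to parse_robots, then call robots.can_fetch("*", url) for each important URL. A page where find_canonicals returns zero or more than one URL deserves a closer look.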
How to fix
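- Allow important paths in robots.txt, and remove Disallow rules that cover pages you want indexed.
- Submit a clean sitemap that lists only live, crawlable URLs on your preferred host.
- Use one canonical URL per page and avoid self-contradictory metadata, such as a canonical tag combined with a noindex directive.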
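After applying the fixes, one way to confirm the sitemap is clean is to check every listed URL against robots.txt and against the host you want indexed. The sketch below assumes a standard sitemaps.org urlset document. The function names, the preferred_host parameter, and the problem labels are hypothetical, chosen for this example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Namespace used by sitemaps.org <urlset> and <sitemapindex> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


def audit_sitemap(sitemap_xml, robots_txt, preferred_host, user_agent="*"):
    """List (url, problem) pairs for sitemap entries that contradict
    robots.txt or the preferred host. An empty list means no conflicts
    were found by these two checks."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())

    problems = []
    for url in sitemap_urls(sitemap_xml):
        if urlsplit(url).netloc.lower() != preferred_host.lower():
            problems.append((url, "host differs from preferred host"))
        if not robots.can_fetch(user_agent, url):
            problems.append((url, "blocked by robots.txt"))
    return problems
```

The audit works on text you have already downloaded, so it can run against saved copies of robots.txt and the sitemap as well as live responses. It does not verify that each URL returns a successful response or that its canonical tag points to itself; those checks would need a request per page.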
How Search Lighthouse helps
Search Lighthouse runs these checks together so a report shows whether the issue is robots, sitemap, canonical, or page metadata.
Related guides
How to fix canonical and sitemap host mismatch
Host mismatch happens when your sitemap, canonical tags, and live URLs disagree about the preferred domain.
Why Google crawled but did not index your pages
Crawled but not indexed usually means discovery worked, but page quality, duplication, or signals did not justify indexing.
How to improve indexability for AI-built websites
AI-built websites often ship fast, but search engines still need stable templates, links, and unique page value.