On-Page SEO · Session 3, Guide 2

Keyword Clustering · Topic Modelling & Topical Authority

Keyword clustering is the process of grouping related keywords into topic clusters — each cluster becomes a content unit. This guide covers semantic grouping methods, pillar-and-spoke architecture, cannibalisation detection, how Google measures topical authority, and why clustering-first content planning consistently outperforms keyword-by-keyword targeting.

On-Page SEO · 2,700 words · Updated Apr 2026

What You Will Learn

  • The difference between keyword grouping by topic and grouping by intent
  • SERP-based clustering — the most reliable method for grouping keywords
  • How pillar pages and cluster pages relate to each other structurally
  • How Google measures topical authority and why breadth of coverage matters
  • How to identify keyword cannibalisation and resolve it
  • How to determine the right number of pages per cluster

What is Keyword Clustering

Keyword clustering groups semantically related keywords that should be addressed by the same page or by a connected set of pages. Instead of targeting one keyword per page in isolation, clustering identifies which keywords share the same search intent and user need — and therefore belong together in a single, comprehensive piece of content.

The practical benefit: a single well-clustered page can rank for dozens or hundreds of related keywords simultaneously. A guide on "email marketing automation" that thoroughly covers triggers, sequences, segmentation, and tools can rank for queries on each of these sub-topics without requiring a separate page for each one.

One cluster does not always mean one page

A cluster is a topic unit, not a page count. Small clusters become a single comprehensive page. Large clusters with distinct intent-differentiated sub-topics become a pillar page + supporting spoke pages. The decision depends on whether sub-topics have sufficient depth and distinct enough intent to warrant separate pages.

Semantic Grouping Methods

Several methods exist for grouping keywords. The most reliable is SERP-based clustering — the others are useful supplements.

SERP-based clustering (most reliable)

Two keywords belong in the same cluster if they return substantially the same top-ranking pages in Google Search. The logic: if Google serves the same content to answer both queries, they represent the same underlying informational need and should be targeted by the same page.

Method: for each pair of candidate keywords, compare the top 10 ranking URLs. As a working rule of thumb, treat the two keywords as part of the same cluster when 3 or more URLs overlap; a higher threshold produces tighter clusters. SERP data tools or keyword clustering software can run this overlap analysis at scale.
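The pairwise comparison can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `cluster_by_serp_overlap`, the `example_serps` data, and the choice to merge clusters transitively (if A overlaps B and B overlaps C, all three end up together) are assumptions made for the example.

```python
from itertools import combinations


def cluster_by_serp_overlap(serps, min_overlap=3):
    """Group keywords whose top-10 result lists share at least min_overlap URLs."""
    # Each keyword starts as its own cluster representative.
    parent = {kw: kw for kw in serps}

    def find(kw):
        # Follow parent pointers up to the cluster representative.
        while parent[kw] != kw:
            parent[kw] = parent[parent[kw]]
            kw = parent[kw]
        return kw

    for a, b in combinations(serps, 2):
        shared = set(serps[a][:10]) & set(serps[b][:10])
        if len(shared) >= min_overlap:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for kw in serps:
        clusters.setdefault(find(kw), []).append(kw)
    return sorted(sorted(group) for group in clusters.values())


# Hypothetical SERP data: keyword -> ranked list of result URLs.
example_serps = {
    "email automation": ["a.com/1", "b.com/1", "c.com/1", "d.com/1"],
    "automated email sequences": ["a.com/1", "b.com/1", "c.com/1", "e.com/1"],
    "email deliverability": ["x.com/1", "y.com/1", "a.com/1", "z.com/1"],
}
print(cluster_by_serp_overlap(example_serps))
```

The first two keywords share three URLs and are grouped; the third shares only one URL with either and stays in its own cluster. Transitive merging can chain loosely related keywords together on large lists, so a stricter variant might require every pair within a cluster to meet the threshold.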

Semantic similarity grouping

Keywords that share a primary entity or concept are semantically related even if SERP overlap is low. "What is PageRank" and "PageRank algorithm explained" are about the same concept despite potentially different SERPs — they likely belong on the same page. Semantic similarity grouping is useful for identifying sub-topics that should be covered within a page even if they do not share SERP overlap.
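As a rough sketch of entity-based grouping, the snippet below assigns each keyword to the first known entity it mentions. The function `group_by_entity` and the entity list are hypothetical, and plain substring matching is a deliberately crude stand-in: a production workflow would more likely use an entity recogniser or text embeddings.

```python
def group_by_entity(keywords, entities):
    """Assign each keyword to the first known entity it mentions (case-insensitive)."""
    groups = {entity: [] for entity in entities}
    unmatched = []
    for kw in keywords:
        lowered = kw.lower()
        for entity in entities:
            if entity.lower() in lowered:
                groups[entity].append(kw)
                break
        else:
            # No known entity appears in this keyword.
            unmatched.append(kw)
    return groups, unmatched


# Hypothetical inputs for illustration.
example_keywords = [
    "What is PageRank",
    "PageRank algorithm explained",
    "canonical tag examples",
    "site speed checklist",
]
example_entities = ["PageRank", "canonical tag"]

groups, unmatched = group_by_entity(example_keywords, example_entities)
print(groups)
print(unmatched)
```

Keywords that land in the same entity group are candidates for sections of one page, even when their SERPs differ.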

Modifier-based grouping

Head keyword + modifiers that share the same base concept often cluster together: "keyword research tool", "best keyword research tool", "free keyword research tool", "keyword research tool for beginners" — all variations of the same core query. Commercial modifiers ("best", "top", "review") may shift to commercial intent requiring a different page format despite the same base keyword.
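A small sketch of modifier extraction, using the keyword variations above. The function name `group_by_modifier` and the `COMMERCIAL_MODIFIERS` set are assumptions for the example; the set contains only the three modifiers named in this guide and would need extending for real keyword lists.

```python
# Modifiers this guide names as likely to signal commercial intent.
COMMERCIAL_MODIFIERS = {"best", "top", "review"}


def group_by_modifier(keywords, head):
    """Return (keyword, modifiers, is_commercial) for each keyword containing head."""
    rows = []
    for kw in keywords:
        lowered = kw.lower()
        if head not in lowered:
            continue  # not a variation of this head keyword
        # Whatever remains after removing the head phrase is treated as modifiers.
        modifiers = lowered.replace(head, " ").split()
        is_commercial = any(m in COMMERCIAL_MODIFIERS for m in modifiers)
        rows.append((kw, modifiers, is_commercial))
    return rows


example_keywords = [
    "keyword research tool",
    "best keyword research tool",
    "free keyword research tool",
    "keyword research tool for beginners",
]
for row in group_by_modifier(example_keywords, "keyword research tool"):
    print(row)
```

Rows flagged as commercial are the ones worth checking by hand, since they may need a comparison or review format rather than a section on the informational page.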

Pillar and Spoke Architecture

The pillar-and-spoke model organises related content into a hub page (pillar) covering a broad topic comprehensively, surrounded by cluster pages (spokes) addressing specific sub-topics in depth. Each spoke links back to the pillar; the pillar links to all spokes. This internal linking structure creates a clear topical cluster that communicates expertise and content breadth to Google.
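The two-way linking rule lends itself to an automated check. The sketch below assumes you already have an internal link map (for example, from a site crawl) as a dictionary of page URL to the set of URLs it links to. The function name `find_missing_links` and all URLs are hypothetical.

```python
def find_missing_links(pillar, spokes, links):
    """Return (source, target) pairs for pillar<->spoke links that are absent.

    links maps each page URL to the set of internal URLs it links to.
    """
    missing = []
    for spoke in spokes:
        if spoke not in links.get(pillar, set()):
            missing.append((pillar, spoke))  # pillar does not link to this spoke
        if pillar not in links.get(spoke, set()):
            missing.append((spoke, pillar))  # spoke does not link back to pillar
    return missing


# Hypothetical link map for illustration.
pillar = "/email-marketing/"
spokes = [
    "/email-marketing/automation/",
    "/email-marketing/deliverability/",
    "/email-marketing/compliance/",
]
links = {
    "/email-marketing/": {
        "/email-marketing/automation/",
        "/email-marketing/deliverability/",
    },
    "/email-marketing/automation/": {"/email-marketing/"},
    "/email-marketing/deliverability/": set(),
    "/email-marketing/compliance/": {"/email-marketing/"},
}
for source, target in find_missing_links(pillar, spokes, links):
    print(f"{source} should link to {target}")
```

In this example the deliverability spoke lacks a link back to the pillar, and the pillar lacks a link to the compliance spoke.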

Pillar page characteristics

  • Broad, high-level coverage of a topic with links to deeper sub-topic pages
  • Typically 3,000–6,000 words for competitive topics
  • Targets the highest-volume head keyword in the cluster
  • Acts as an internal link hub — references every spoke page at the relevant section
  • Designed to rank for the head term and provide a navigational overview for users new to the topic

Spoke page characteristics

  • Deep, specific coverage of one sub-topic within the cluster
  • Typically 1,500–3,000 words — focused, not broad
  • Targets a more specific long-tail or mid-tail keyword
  • Links back to the pillar page and to closely related spoke pages
  • Provides the depth that the pillar page cannot offer without becoming unwieldy

Digital Codex itself is built on pillar-spoke architecture

The /seo/ section is the pillar for SEO. /seo/technical/ is a mid-tier cluster pillar for Technical SEO. /seo/technical/lcp-optimisation/ is a spoke providing depth on one sub-topic. The URL hierarchy mirrors the topic hierarchy, which is the structure this guide recommends for topically authoritative sites.

Topical Authority

Topical authority refers to a site's demonstrated expertise in a particular subject domain. Google does not publish a topical authority score, but its ranking systems appear to reward not just individual page quality but consistent quality across all the pages a site publishes in a niche. In practice, a site that covers every aspect of a topic tends to rank more easily for new content in that topic than a site covering only parts of it.

How topical authority is built

  • Breadth: Cover all sub-topics within your niche. A site about "email marketing" that covers only campaign types but not deliverability, automation, compliance, and analytics is not demonstrating comprehensive expertise.
  • Depth: Each guide must genuinely answer the question it targets — not a surface-level overview that requires the user to search again for details.
  • Internal linking: Cross-references between related pages signal that your content is organised as a coherent knowledge base, not isolated articles.
  • Update cadence: Keeping guides current with accurate information (especially for fast-changing topics like algorithm updates, ad platform features) signals that a site maintains its content rather than publishing and abandoning it.

Google's documentation on its E-E-A-T framework (Experience, Expertise, Authoritativeness, Trustworthiness) and the Search Quality Evaluator Guidelines describe how human raters assess a site's expertise on a topic. Rater assessments do not set rankings directly, but they indicate the kind of content Google's systems are designed to reward.

Detecting and Fixing Keyword Cannibalisation

Keyword cannibalisation occurs when two or more pages on your site compete for the same keyword. Google must choose which page to rank — the result is often neither page ranking as well as a single consolidated page would.

How to detect cannibalisation

  • In Google Search Console → Performance, filter by a keyword and check how many URLs are listed in the Pages tab for that query. Multiple URLs appearing for the same query indicates potential cannibalisation.
  • Search Google for: site:yourdomain.com "target keyword" — if multiple pages appear prominently, they may be competing.
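  • Compare positions over time in Search Console: filter the Performance report by the query, open the Pages tab, and check each URL's average position across the date range. If two URLs alternate in position for the same query, cannibalisation is likely.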
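The first check above can be run in bulk once query and page data has been exported from Search Console. The sketch below assumes rows of (query, page, impressions); that layout, the function name `find_cannibalised_queries`, and the `min_impressions` cut-off are assumptions for illustration, not a Search Console format.

```python
from collections import defaultdict


def find_cannibalised_queries(rows, min_impressions=10):
    """Return queries served by more than one page, pages ordered by impressions.

    rows is an iterable of (query, page, impressions) tuples.
    """
    impressions_by_query = defaultdict(dict)
    for query, page, impressions in rows:
        pages = impressions_by_query[query]
        pages[page] = pages.get(page, 0) + impressions

    flagged = {}
    for query, pages in impressions_by_query.items():
        # Ignore pages that appear only incidentally for the query.
        significant = {p: n for p, n in pages.items() if n >= min_impressions}
        if len(significant) > 1:
            flagged[query] = sorted(significant, key=significant.get, reverse=True)
    return flagged


# Hypothetical export rows for illustration.
example_rows = [
    ("project management software", "/best-pm-software/", 1200),
    ("project management software", "/pm-software-guide/", 800),
    ("project management software", "/blog/old-post/", 3),
    ("gantt chart template", "/gantt-templates/", 500),
]
print(find_cannibalised_queries(example_rows))
```

A flagged query is only a candidate: two pages can legitimately appear for one query when they serve different intents, so each result still needs a manual look.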

How to resolve cannibalisation

  • Consolidate. Merge the weaker page's content into the stronger page; 301-redirect the weaker URL to the consolidated page. Best when both pages cover substantially the same topic.
  • Differentiate. Rewrite one page to target a genuinely different search intent within the broader topic. "Best project management software" (commercial intent) and "How project management software works" (informational intent) can coexist.
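  • Canonicalise. If one page must exist for non-SEO reasons (CMS structure, user navigation) but should not rank, add <link rel="canonical" href="https://yourdomain.com/preferred-page/"> to its <head>, with the href pointing to the preferred page. Google treats the canonical as a hint rather than a directive, so this works best when the two pages are near-duplicates.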

How to Size Your Clusters

A cluster's size — how many pages it warrants — depends on the depth of the topic and the differentiation of sub-topic intents. There is no universal formula, but these principles guide the decision:

  • If a sub-topic can be answered in 300–500 words within a broader guide, it does not need its own page.
  • If a sub-topic has its own distinct keyword with meaningful search volume and distinct search intent, it warrants a dedicated page.
  • If comprehensively covering all sub-topics on a single page would make the page exceed 6,000–8,000 words, splitting into pillar + spokes improves usability and allows more focused targeting.
  • Avoid creating pages for sub-topics with no search volume. Content should be written for humans rather than search engines, but search volume is the best available signal that humans are actually looking for that specific information.
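These principles can be turned into a rough planning helper. The sketch below is one possible reading of them, not a formula: the function `plan_cluster`, the field names, and the default thresholds (500 words to fold in, 6,000 words for the pillar) are assumptions taken from the lower bounds above. It also assumes that sub-topics with no search volume are folded into the pillar rather than dropped.

```python
def plan_cluster(subtopics, fold_in_max_words=500, pillar_max_words=6000):
    """Split sub-topics into spoke pages and pillar sections.

    Each sub-topic is a dict with name, est_words, search_volume, distinct_intent.
    """
    spokes, pillar_sections = [], []
    pillar_words = 0
    for topic in subtopics:
        # A dedicated page needs depth, demand, and its own intent.
        needs_own_page = (
            topic["est_words"] > fold_in_max_words
            and topic["search_volume"] > 0
            and topic["distinct_intent"]
        )
        if needs_own_page:
            spokes.append(topic["name"])
        else:
            pillar_sections.append(topic["name"])
            pillar_words += topic["est_words"]
    return {
        "spokes": spokes,
        "pillar_sections": pillar_sections,
        "pillar_words": pillar_words,
        "pillar_too_long": pillar_words > pillar_max_words,
    }


# Hypothetical estimates for illustration.
example_subtopics = [
    {"name": "triggers", "est_words": 400, "search_volume": 900, "distinct_intent": False},
    {"name": "segmentation", "est_words": 1800, "search_volume": 2400, "distinct_intent": True},
    {"name": "legacy tool notes", "est_words": 1200, "search_volume": 0, "distinct_intent": True},
]
print(plan_cluster(example_subtopics))
```

If `pillar_too_long` comes back true, the remaining pillar sections are worth a second pass to see which of them could stand alone.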

Authentic Sources

Official · Google Search Central: Creating Helpful Content

Google's official guidance on what makes content genuinely useful — the basis of topical authority.

Official · Google Search Central: Google Discover

How topical expertise and E-E-A-T affect content surfacing in Discover — related to topical authority.

Official · Google: Search Quality Evaluator Guidelines

Official rater guidelines documenting how human reviewers assess topical expertise and E-E-A-T.

Official · Google Search Console: Performance Report

Detecting cannibalisation using the Queries and Pages dimensions in Search Console.
