
What you'll unlock: Self-service succeeds when retrieval is right. AI Search + knowledge hygiene + action-first portal UX + honest measurement create deflection; GenAI without retrieval just generates confident noise faster.


Tool guide · Chapter 3 of 10

AI Search and Knowledge Intelligence

~135 min read

How ServiceNow finds, surfaces, and synthesises information — the intelligence layer beneath every self-service experience

Chapter context

Every self-service program eventually discovers the same truth: people don’t want to browse a knowledge base; they want to solve the problem. AI Search is what makes that possible at scale by matching human language to enterprise content, measuring failures, and turning those failures into continuous improvements.

This chapter is also the missing layer for GenAI programs: when retrieval is wrong, users call it hallucination. When retrieval is right, Now Assist becomes reliable because it is grounded in the right sources.

Is this chapter for you?

Are users creating tickets because search “can’t find anything”?

Start with Concepts 1 and 3: index scope, profiles, ranking, and query expansion fix most failures before GenAI is involved.

Is deflection a CFO-level KPI for you?

Read Concept 4 in full. Use conservative attribution (72h no-ticket) and publish assumptions to earn trust and funding.

Do you plan to use Now Assist in portal or Virtual Agent answers?

Read Concepts 1–2 first. Grounding depends on AI Search and knowledge lifecycle. Fix retrieval before tuning prompts.

Do you operate multiple portals (IT, HR, CSM) with different policies?

Concept 3 (profiles, blocks, personalisation) is mandatory. One global profile creates leakage or irrelevance.

If Now Assist is the voice, AI Search is the engine. This chapter explains how ServiceNow retrieves the right information with profiles, semantic similarity, and ranking, then turns search analytics into a knowledge flywheel that improves over time.

You will learn the architecture of AI Search, how knowledge creation and lifecycle become AI-assisted, how to configure result experiences for different personas, and how to design deflection workflows CFOs actually fund.

By the end you can run a credible search proof of concept on a Personal Developer Instance (PDI) with a golden query set, ship a portal funnel that preserves trust, and explain why most GenAI failures are retrieval failures in disguise.


Reference diagrams

AI Search stack

Search is a system: source scope → retrieval → ranking → UI blocks → analytics → knowledge flywheel.

Sources: KB, catalog, records, federation (Scope)

Retrieve: lexical + semantic candidates (Embeddings)

Rank: boosts, recency, persona (Profiles)

Render: blocks + actions (Portal)

Measure: zero results, clicks, tickets (Analytics)

Improve: flywheel backlog (Ops)

Deflection funnel

Deflection happens before the ticket exists: retrieve answer, guide action, escalate gracefully when needed.

User asks: portal / mobile / VA entry (Intent)

Search: AI Search retrieves sources (Retrieve)

Synthesize: optional GenAI answer grounded in sources (Assist)

Act: reset / request / status / escalation (Flow)

Fallback: ticket + context handoff (Trust)

Implementation paths

Search wins by design, not by hope: scope, quality, ranking, UX, and analytics ownership.

Concept 1

AI Search Architecture

Indexes, NLU, embeddings, ranking, federation, analytics, and PDI configuration

1.1

The search index

How ServiceNow indexes content and what makes AI Search different from legacy search

Key takeaway

AI Search is an index-driven retrieval system tuned for enterprise workflows: it indexes approved sources, respects ACLs, and supports semantic retrieval — not just keyword matching.

Why this matters

Search quality is the foundation beneath self-service, Virtual Agent, and Now Assist grounding. If retrieval is wrong, GenAI will be wrong faster.

Legacy search was largely optimised for keyword matching. AI Search adds semantic signals and richer ranking controls so “VPN drops” can still find “Remote access troubleshooting” when the vocabulary differs.

Indexes are source-scoped. The most common failure is indexing too much (noise + leakage risk) or too little (empty results).

Architecture rule: treat AI Search like a production service. Define sources, ranking, analytics, and a lifecycle for improvements — not a one-time configuration.

Workflow — do this next

01 List your top 5 search surfaces (portal, Agent Workspace, VA, mobile).
02 For each, define: sources allowed, users allowed, and what “good result” means.
03 Start with knowledge + catalog; expand to record search only after ACL review.

Real example

Why GenAI “hallucinated” the VPN policy

Now Assist drafted from an outdated article because AI Search indexed old KB spaces. Fixing the index scope and boosting current policy improved both search and GenAI output — the model didn’t change; retrieval did.

1.2

Natural Language Understanding in search

How NLU interprets intent rather than matching keywords

Key takeaway

NLU helps map messy user phrasing to intent (“reset MFA”) and constraints (“mobile device”), improving recall and routing — especially in portal search and Virtual Agent entry.

Why this matters

Users don’t know your internal taxonomy. NLU is the bridge between human language and enterprise content structure.

NLU is useful when you have synonyms and ambiguity. It complements embeddings by normalising common patterns and extracting entities (app, device type, location).

Avoid overfitting: NLU intent models require maintenance. If you don’t maintain them, user vocabulary drifts away from what the models were trained on and recall degrades.

Design choice: use NLU for high-volume intents (password, VPN, email, access). Do not attempt to model every long-tail question — let semantic retrieval handle the tail.

Workflow — do this next

01 Export top 50 portal queries. Cluster into 10 intents.
02 Add synonyms and abbreviations for each intent.
03 Validate improvement on a fixed test set weekly for 4 weeks.

Real example

“MFA broken” routed to correct KB

NLU recognised MFA reset intent and added query expansion. AI Search returned the correct article even when users typed “auth app not working”, and containment improved without adding new KB articles.

1.3

Semantic similarity

Vector embeddings and how they find conceptually related content

Key takeaway

Embeddings represent meaning, enabling AI Search to retrieve relevant content even when keywords differ — critical for deflection and GenAI grounding.

Why this matters

Semantic retrieval is what makes search resilient to phrasing variability and multilingual behaviour.

Embeddings map text into a vector space. When a user asks a question, AI Search retrieves content whose embeddings are closest — not just content sharing the same words.
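
As a rough illustration of "closest in vector space", the sketch below ranks three articles by cosine similarity to a query vector. The three-dimensional vectors are invented for readability; real embedding models produce far larger vectors, and nothing here reflects how AI Search computes or stores embeddings internally.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings (assumption: values chosen by hand for the example).
articles = {
    "BitLocker recovery key": [0.9, 0.1, 0.2],
    "Email signature setup": [0.1, 0.9, 0.1],
    "Remote access troubleshooting": [0.2, 0.2, 0.9],
}
# Stands in for the embedding of the query "laptop encryption key".
query_vector = [0.8, 0.2, 0.3]

def nearest(query, corpus, k=1):
    """Return the k article titles whose vectors are most similar to the query."""
    scored = sorted(
        corpus.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [title for title, _ in scored[:k]]

print(nearest(query_vector, articles))
```

The query shares no keywords with the winning title; it wins only because its vector points in nearly the same direction.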

Semantic similarity fails when content is too generic or when your KB is inconsistent. Fix with better article structure and boosting, not by disabling semantics.

Architect advice: keep a “golden set” of 50–100 queries with expected results. Use it as regression tests when you change sources or ranking.

Workflow — do this next

01 Create a golden query set with expected KB hits.
02 Identify generic articles that appear in too many queries; refine or demote.
03 Reindex after major KB changes; compare golden set results.

Real example

Concept match beat keyword match

Users searched “laptop encryption key” and got zero results with legacy search. AI Search semantic retrieval returned “BitLocker recovery key” correctly — vocabulary mismatch solved.

1.4

The ranking pipeline

Relevance scoring, recency, personalisation, and how to influence each

Key takeaway

Ranking blends multiple signals: textual relevance, semantic similarity, freshness, popularity, user context, and explicit boosts — you can tune the ranking without changing content.

Why this matters

Most search “bugs” are ranking problems: the right doc exists, but it’s buried.

Think in layers: retrieve candidates (semantic + lexical), then re-rank. Tuning happens mostly in re-ranking.

Three levers you control: boosting, recency weighting, and profile scoping.

Do not use recency to hide low-quality docs. Fix quality. Recency is a tie-breaker, not a cleanup strategy.
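
The layering described above (retrieve candidates, then re-rank with boosts and recency) can be sketched as follows. The boost table, recency weight, source names, and scoring formula are all assumptions made for illustration; they are not ServiceNow's ranking formula. Recency is deliberately capped at a small value so it behaves as a tie-breaker.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Candidate:
    title: str
    relevance: float  # blended lexical + semantic score from retrieval, 0..1
    source: str
    updated: date

# Hypothetical tuning values; in practice these are validated on a golden query set.
SOURCE_BOOST = {"policy_kb": 0.15}
RECENCY_WEIGHT = 0.05
TODAY = date(2025, 6, 1)

def rerank_score(c: Candidate) -> float:
    """Retrieval relevance plus an explicit source boost plus a small recency bonus."""
    age_years = (TODAY - c.updated).days / 365.0
    recency = RECENCY_WEIGHT / (1.0 + age_years)  # never exceeds RECENCY_WEIGHT
    return c.relevance + SOURCE_BOOST.get(c.source, 0.0) + recency

def rerank(candidates):
    """Order already-retrieved candidates by the re-ranking score, best first."""
    return sorted(candidates, key=rerank_score, reverse=True)

candidates = [
    Candidate("VPN policy (2019)", 0.80, "legacy_kb", date(2019, 3, 1)),
    Candidate("VPN policy (2025)", 0.78, "policy_kb", date(2025, 1, 15)),
]
for c in rerank(candidates):
    print(f"{rerank_score(c):.3f}  {c.title}")
```

Because the recency bonus is bounded, a fresh but weakly relevant document cannot leapfrog a strongly relevant one; only the explicit boost can move results that far.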

Workflow — do this next

01 Pick one portal. Define top 10 queries and desired top 3 results.
02 Tune boosts and recency. Re-test on golden set.
03 Roll out to 10% of users; monitor zero-results and click-through.

Real example

Recency prevented outdated VPN article

Two VPN policies existed; 2019 doc ranked higher due to backlinks. Adding recency weighting and boosting the 2025 policy corrected ranking — deflection improved immediately.

1.5

Federated search

Connecting external content sources and the configuration that governs source priority

Key takeaway

Federation lets AI Search surface results from external systems (docs, wikis, ticketing, repos) while keeping ServiceNow as the orchestration layer — with explicit source priority and access controls.

Why this matters

Most enterprises don’t store truth in one place. Federation reduces context-switching while preserving governance.

Federation requires two decisions: what to connect and who is allowed.

Priority matters: when internal KB and external wiki conflict, decide which wins by portal/persona — otherwise you train users to distrust search.

Architecture rule: federation is not a replacement for knowledge governance. It is an amplifier — it amplifies both good and bad content.
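
One way to make "decide which wins by portal/persona" explicit is a per-portal priority list that also acts as the allowlist. The portal names, source names, and result fields below are hypothetical; this is a sketch of the decision rule, not of any federation connector.

```python
# Hypothetical per-portal source priority; earlier in the list wins a conflict.
# A source that is absent from the list is not allowed on that portal.
SOURCE_PRIORITY = {
    "employee_portal": ["servicenow_kb", "sharepoint"],
    "engineering_portal": ["confluence", "servicenow_kb", "github"],
}

def resolve_conflicts(portal, results):
    """Keep one result per topic: the one from the highest-priority allowed source."""
    rank = {source: i for i, source in enumerate(SOURCE_PRIORITY[portal])}
    winners = {}
    for result in results:
        if result["source"] not in rank:  # not on this portal's allowlist
            continue
        current = winners.get(result["topic"])
        if current is None or rank[result["source"]] < rank[current["source"]]:
            winners[result["topic"]] = result
    return list(winners.values())

results = [
    {"topic": "vpn-runbook", "source": "servicenow_kb", "title": "KB: VPN troubleshooting"},
    {"topic": "vpn-runbook", "source": "confluence", "title": "Confluence: VPN runbook"},
]
print(resolve_conflicts("employee_portal", results)[0]["title"])
print(resolve_conflicts("engineering_portal", results)[0]["title"])
```

The same pair of duplicate documents resolves differently per portal, which is the behaviour the conflict test in the workflow below is meant to verify.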

Workflow — do this next

01 Inventory knowledge sources (ServiceNow KB, SharePoint, Confluence, GitHub).
02 Define per-portal priority order and allowlist.
03 Run a conflict test: 5 topics with duplicate docs; pick winners explicitly.

Real example

Confluence vs KB conflict resolved

Engineers preferred Confluence runbooks; service desk needed KB. Federation enabled both with portal-specific priority so each audience saw the right source first — adoption rose without forcing migration.

1.6

Search analytics

Query logs, zero-result queries, and the data that drives continuous improvement

Key takeaway

Analytics turn search into an operating system: top queries, zero results, click-through, abandonment, and deflection attribution drive weekly improvements.

Why this matters

Without analytics, teams argue with anecdotes. With analytics, you ship a knowledge flywheel.

High-yield metrics: zero-result rate, click-through on top result, and abandonment (search then ticket create).

Use analytics to drive knowledge backlog. Assign owners and SLAs like any product backlog.

Measure improvements against a stable baseline — avoid changing ranking and KB simultaneously without tracking which change helped.
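
The three high-yield metrics can be computed from a flat export of search events. The record layout below is an assumption for illustration, not the schema of any ServiceNow analytics table. Abandonment is counted when a ticket follows the search within 72 hours, mirroring the conservative attribution window this guide recommends for deflection.

```python
from datetime import datetime, timedelta

# Hypothetical search-event export; field names are illustrative only.
events = [
    {"query": "email signature", "results": 0, "clicked_rank": None,
     "searched_at": datetime(2025, 6, 2, 9, 0), "ticket_at": datetime(2025, 6, 2, 9, 10)},
    {"query": "reset mfa", "results": 5, "clicked_rank": 1,
     "searched_at": datetime(2025, 6, 2, 9, 5), "ticket_at": None},
    {"query": "vpn drops", "results": 3, "clicked_rank": 2,
     "searched_at": datetime(2025, 6, 2, 9, 7), "ticket_at": datetime(2025, 6, 7, 9, 7)},
    {"query": "new laptop", "results": 4, "clicked_rank": 1,
     "searched_at": datetime(2025, 6, 2, 9, 9), "ticket_at": None},
]

def search_metrics(events, window=timedelta(hours=72)):
    """Zero-result rate, top-1 click-through, and search-then-ticket abandonment."""
    total = len(events)
    if total == 0:
        raise ValueError("no events to measure")
    with_results = [e for e in events if e["results"] > 0]
    top1_clicks = sum(1 for e in with_results if e["clicked_rank"] == 1)
    abandoned = sum(
        1 for e in events
        if e["ticket_at"] is not None
        and timedelta(0) <= e["ticket_at"] - e["searched_at"] <= window
    )
    return {
        "zero_result_rate": (total - len(with_results)) / total,
        "top1_click_through": top1_clicks / len(with_results) if with_results else 0.0,
        "abandonment_rate": abandoned / total,
    }

print(search_metrics(events))
```

The "vpn drops" ticket arrives five days after the search, so it falls outside the window and is not attributed to that search.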

Workflow — do this next

01 Weekly: export top 20 zero-result queries and top 20 high-volume queries.
02 Create action items: new KB, synonyms, boosts, or federation updates.
03 Monthly: review deflection impact and adjust portal funnels.

Real example

Zero-result dashboard drove 14 new articles

Top zero-result queries were “email signature”, “VPN on iPhone”, and “new laptop request”. Writing 14 focused articles and boosting them cut the zero-result rate by half in 6 weeks.

1.7

AI Search configuration walkthrough

Enabling, configuring, and testing on PDI

Key takeaway

PDI success pattern: enable AI Search → define sources → configure search profile → tune ranking → validate with golden queries → monitor analytics.

Why this matters

Hands-on PDI steps are required to turn theory into an interview-ready, demoable capability.

Step 1: Enable AI Search features on your PDI (availability depends on instance/build).

Step 2: Configure search sources (start with one KB + catalog).

Step 3: Create a portal profile and test queries with different roles.

Step 4: Tune boosting and recency; rerun golden query set.

Step 5: Capture analytics — zero results, clicks — and iterate.

Workflow — do this next

01 Create 10 test KB articles with deliberate synonyms and structure.
02 Create golden set: 20 queries with expected top result.
03 Run tests as three users: employee, agent, admin; verify ACL scoping.

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

AI Search golden query set (starter)

Paste into spreadsheet and use as regression tests.

| Query | Persona | Expected top result | Pass? |
|------|---------|----------------------|-------|
| reset mfa | employee | MFA reset article | |
| vpn drops | employee | remote access troubleshooting | |
| request new laptop | employee | catalog item: laptop request | |
| outlook sync error | agent | KB: Outlook sync fix | |
| bitlocker key | employee | KB: recovery key | |

1.8

AI Search vs legacy search

Migration considerations and the A/B testing approach for rollout

Key takeaway

Migrate by portal and persona. A/B test AI Search vs legacy on the golden set and real traffic. Roll back safely if zero-result or wrong-result rate rises.

Why this matters

Search is a front door. Breaking it causes immediate ticket spikes and loss of trust.

Migration risks: different ranking, different source scope, and user expectation shifts. Manage by A/B testing before full cutover.

Define success criteria: improved top-1 click-through, reduced zero results, and stable or improved deflection. If deflection rises but wrong-result complaints spike, stop.

Operationally: keep legacy available as fallback during pilot. Make it easy for users to report bad results — that feedback is gold.

Workflow — do this next

01 Pilot on one portal (employee IT) with 10–20% traffic.
02 Track: zero results, click-through, ticket funnel, and feedback volume.
03 Expand to HR/CSM portals after stabilising IT portal ranking.

Real example

A/B prevented a bad cutover

Pilot showed AI Search improved click-through but surfaced a restricted article to a broad role due to mis-scoped source. Fixing the profile before 100% rollout avoided a compliance incident.

Concept 2

Knowledge Management with GenAI

How GenAI creates, evaluates, organises, translates, and continuously improves knowledge for self-service

2.1

AI-assisted article creation

How Now Assist drafts knowledge articles from incident resolutions and case notes

Key takeaway

GenAI can draft first-pass knowledge articles from resolved tickets — but humans must validate accuracy, scope, and policy before publishing.

Why this matters

Knowledge authoring cost is the #1 bottleneck in self-service programs. GenAI reduces the cost of drafts; governance preserves trust.

Source material matters: the best drafts come from clean resolution notes and well-structured work notes. Garbage tickets produce garbage knowledge.

Draft pipeline: resolved ticket → draft article → SME review → publish → monitor analytics. Do not publish auto-drafts straight to production portals.

Optimise for reusability: a knowledge article should answer a repeatable problem, not describe a one-off incident timeline.

Workflow — do this next

01 Pick one high-volume category (password, VPN, email).
02 Generate 5 draft articles from top resolved tickets.
03 SME validates with a checklist: correctness, safety, scope, and links.

Real example

VPN outage ticket → evergreen troubleshooting article

Draft originally described a specific outage window. SME edited into general troubleshooting steps with decision tree and escalation. Deflection rose because the article became evergreen.

2.2

Knowledge gap detection

How AI identifies missing knowledge based on failed search queries

Key takeaway

Search analytics (zero results, low click-through, ticket creation after search) reveal knowledge gaps; GenAI helps convert those gaps into article drafts and taxonomy updates.

Why this matters

The highest ROI knowledge work is writing what users actually search for — not what SMEs think they search for.

Gap signals: zero-result queries, high abandonment, and repeated escalations after search.

GenAI helps by proposing article outlines and titles for the top gaps — but the important part is the backlog and ownership model.

Treat knowledge gaps like product bugs with SLAs. If 'MFA reset' has zero results, that's an outage of self-service.

Workflow — do this next

01 Weekly: export top 20 zero-result queries.
02 Group them into 5–10 gap themes.
03 Create drafts for the top 3 themes; publish only after SME review.

Real example

“Email signature” gap fixed in 48 hours

Analytics showed 1,200 searches/month with zero results. GenAI drafted an article; SME verified steps; published in 2 days. Ticket volume for that category dropped within a week.

2.3

Article quality scoring

How the platform evaluates knowledge article completeness and accuracy

Key takeaway

Quality scoring blends structural completeness (title, steps, prerequisites, links), readability, and usage signals (helpful votes, deflection success) — and it can be used to prioritise remediation.

Why this matters

Publishing more articles doesn’t help if users can’t follow them. Quality scoring keeps the knowledge base trustworthy.

Structural signals: clear title, prerequisites, ordered steps, and expected outcomes. Usage signals: helpfulness, time on page, and repeat search after view.

Accuracy is hardest: the platform can flag likely issues (too old, contradictory, low success), but human validation is still required for high-risk procedures.

Use score thresholds: below X → requires review; above Y → eligible for automation and self-service flows.
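
A minimal sketch of threshold-based triage follows. The equal weighting of structure and helpfulness, the section names, and the concrete values standing in for X and Y are assumptions chosen for the example; the platform's own scoring is not described in this guide and is not reproduced here.

```python
# Hypothetical thresholds standing in for "X" and "Y" in the text.
REVIEW_BELOW = 0.5
AUTOMATE_ABOVE = 0.8
REQUIRED_SECTIONS = ("title", "prerequisites", "steps", "expected_outcome")

def quality_score(article):
    """Blend structural completeness with a usage signal (helpful-vote ratio)."""
    present = sum(1 for section in REQUIRED_SECTIONS if article.get(section))
    structure = present / len(REQUIRED_SECTIONS)
    votes = article["helpful_votes"] + article["unhelpful_votes"]
    # Unrated articles get a neutral helpfulness of 0.5 (assumption).
    helpfulness = article["helpful_votes"] / votes if votes else 0.5
    return 0.5 * structure + 0.5 * helpfulness

def triage(article):
    """Map a score onto the remediation or automation decision."""
    score = quality_score(article)
    if score < REVIEW_BELOW:
        return "requires review"
    if score > AUTOMATE_ABOVE:
        return "eligible for automation"
    return "no action"

complete = {"title": "Reset MFA", "prerequisites": "Phone", "steps": "1. ...",
            "expected_outcome": "New code works", "helpful_votes": 9, "unhelpful_votes": 1}
thin = {"title": "Request software", "steps": "Ask IT",
        "helpful_votes": 1, "unhelpful_votes": 9}
print(triage(complete), "|", triage(thin))
```

Accuracy is not part of this score; as noted above, high-risk procedures still need human validation regardless of where they land.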

Workflow — do this next

01 Define a knowledge quality rubric and map it to score ranges.
02 Create a remediation queue for low-score articles.
03 Review 10 low-score articles monthly; fix or retire.

Real example

Top-viewed article was low quality

“Request software” article had high views but low helpfulness. Score flagged it. Rewrite added screenshots and clarified approval flow. Helpful votes doubled; ticket creation after view dropped.

2.4

Automated lifecycle management

AI-driven review scheduling, expiry, and retirement

Key takeaway

Knowledge decays. Automated lifecycle management schedules reviews based on change frequency, policy sensitivity, and usage — retiring stale articles before they poison search and GenAI.

Why this matters

Outdated articles are a deflection killer and a compliance risk. Automation prevents knowledge rot at scale.

High-risk domains (security, HR policy, finance) need shorter review cycles. Low-risk how-to content can be reviewed less often.

AI can recommend retire/refresh based on age + low success + contradiction signals. Humans approve retirements for auditability.

Link lifecycle to product releases and tool changes — knowledge should update when the workflow changes.
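
Tier-based review scheduling reduces to a lookup plus date arithmetic. The intervals below (90, 180, and 365 days) are hypothetical defaults, not recommendations from the platform, and the article fields are illustrative.

```python
from datetime import date, timedelta

# Hypothetical review intervals per risk tier, in days.
REVIEW_INTERVAL_DAYS = {"high": 90, "medium": 180, "low": 365}

def next_review(last_reviewed: date, risk_tier: str) -> date:
    """Date by which the article must be reviewed again."""
    return last_reviewed + timedelta(days=REVIEW_INTERVAL_DAYS[risk_tier])

def overdue(articles, today: date):
    """Titles whose review date has passed, most overdue first."""
    due = [(next_review(a["last_reviewed"], a["risk_tier"]), a["title"]) for a in articles]
    return [title for due_date, title in sorted(due) if due_date < today]

articles = [
    {"title": "VPN configuration", "risk_tier": "high", "last_reviewed": date(2025, 1, 1)},
    {"title": "Printer setup", "risk_tier": "low", "last_reviewed": date(2025, 1, 1)},
]
print(overdue(articles, today=date(2025, 6, 1)))
```

Both articles were last reviewed on the same day, but only the high-risk one is overdue five months later; the output of `overdue` is what would feed auto-created review tasks.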

Workflow — do this next

01 Tag articles by risk tier (IT how-to vs HR policy).
02 Set review cadence by tier and usage.
03 Auto-create review tasks; track completion SLA.

Real example

Stale security article caused policy breach

Old article recommended unsafe VPN configuration. Lifecycle scoring flagged it due to age and negative feedback; retiring it prevented further misuse and reduced hallucinations in assist drafts.

2.5

Knowledge extraction from tickets

Turning resolved incidents into structured knowledge automatically

Key takeaway

Ticket-to-knowledge pipelines extract: symptom, environment, cause, fix, verification, and escalation criteria — producing structured drafts that are easier to validate and search.

Why this matters

Unstructured drafts are hard to review. Structure makes knowledge scalable and improves AI Search ranking.

A structured template prevents the common failure: articles that say “Reboot fixed it” without stating what rebooted, why, and what to do if it fails.

Use extraction for repetitive categories only — where the same fix applies. For one-off incidents, extraction creates noise.

Pipeline discipline: only extract from tickets with high-quality resolution notes and successful closure (low reopen rate).
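
The six template fields and the eligibility rule can be expressed as a small data structure plus two checks. The ticket field names and the thresholds (`max_reopens`, `min_note_chars`) are assumptions for the sketch; the extraction step itself, whether done by GenAI or by hand, is out of scope here.

```python
from dataclasses import dataclass, fields

@dataclass
class KnowledgeDraft:
    """Structured draft with the six fields named in the key takeaway."""
    symptom: str
    environment: str
    cause: str
    fix: str
    verification: str
    escalation_criteria: str

def missing_fields(draft: KnowledgeDraft):
    """Names of template fields the extraction left empty."""
    return [f.name for f in fields(draft) if not getattr(draft, f.name).strip()]

def eligible_for_extraction(ticket, max_reopens=0, min_note_chars=40):
    """Only closed, never-reopened tickets with substantive resolution notes qualify."""
    return (
        ticket["state"] == "closed"
        and ticket["reopen_count"] <= max_reopens
        and len(ticket["resolution_notes"].strip()) >= min_note_chars
    )

draft = KnowledgeDraft(
    symptom="Print jobs stay queued",
    environment="Windows 11, floor 3 printer",
    cause="Print spooler service hung",
    fix="Restart the spooler service",
    verification="",
    escalation_criteria="",
)
print(missing_fields(draft))
```

A draft with empty `verification` or `escalation_criteria` is exactly the "Reboot fixed it" failure: it states a fix but not how to confirm it or what to do when it fails.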

Workflow — do this next

01 Define extraction template fields (symptom, steps, validation).
02 Select eligible tickets: high volume + low reopen + clear resolution.
03 Generate drafts; SME approves or rejects.

Real example

Printer incidents → structured knowledge set

Top 3 printer errors generated 15 structured articles. Search recall improved and agents stopped copy-pasting long notes. Deflection rose for printers without changing Virtual Agent topics.

2.6

Knowledge base organisation

AI-assisted categorisation and tagging

Key takeaway

Consistent taxonomy (categories, products, CI classes, audience) improves search ranking and retrieval; AI-assisted tagging reduces manual load but must be audited.

Why this matters

Search systems love structure. Tagging is the cheapest high-leverage improvement you can make.

Tagging improves both lexical and semantic retrieval by adding clean signals that embeddings alone can’t infer reliably (audience, risk, region).

AI can suggest tags and categories from content. Human reviewers approve for high-risk domains to prevent misclassification.

Define a controlled vocabulary for abbreviations and product names — don’t let tags proliferate into chaos.

Workflow — do this next

01 Create a canonical tag list (products, services, regions).
02 Run AI-assisted tagging suggestions on 100 articles.
03 Audit a sample for accuracy; then scale.

Real example

Tag standardisation improved recall

“MFA” vs “2FA” tags were inconsistent. Normalising to one tag and adding synonyms improved top-3 recall across security articles without rewriting content.

2.7

Multilingual knowledge

Machine translation and quality considerations for non-English content

Key takeaway

Translation can accelerate global coverage, but quality and cultural fit require native review for policy-sensitive content. Semantic search helps, but only if translations are accurate.

Why this matters

Global portals fail when English-only knowledge is pushed into non-English regions without validation.

Machine translation works best for procedural IT steps; it is risky for HR and legal policy where wording matters.

Maintain locale-specific variations: the same “leave policy” differs by country. Translation is not localisation.

Use analytics per locale: zero-result and abandonment patterns differ by language. Treat each locale as a product.

Workflow — do this next

01 Pick one locale and 50 high-volume articles to translate.
02 Native reviewer signs off on top 20 policy-sensitive pieces.
03 Compare deflection and feedback before/after per locale.

Real example

Spanish portal parity required content investment

Deflection lagged until 120 core articles were translated and reviewed. Once coverage matched English, AI Search and portal assist reached near parity.

2.8

The knowledge flywheel

System design that makes knowledge better over time without manual curation drag

Key takeaway

Flywheel: Search analytics → gap backlog → GenAI drafts → SME review → publish → deflection measurement → retire stale content → repeat. This is the engine beneath sustainable self-service.

Why this matters

Enterprises that win at AI Search treat knowledge as a product with feedback loops — not a documentation graveyard.

The flywheel converts operational exhaust (queries, tickets, feedback) into better content and better retrieval. Without it, search quality decays and GenAI becomes untrustworthy.

The key is ownership: knowledge manager + portal owner + AI Search admin + SME guild. No owner means no flywheel.

Instrument everything: when a user searches, what did they click, did they create a ticket, did they succeed? Those signals drive iteration.

Workflow — do this next

01 Stand up a weekly Knowledge Flywheel meeting (30 min).
02 Review: top zero-result queries, low-quality articles, and areas where deflection dropped.
03 Ship: 5 improvements per week (articles, synonyms, boosts, retirement).

Ready-to-use artifacts


Knowledge flywheel operating rhythm

Weekly agenda you can paste into your ops calendar.

## Weekly (30 min)
- Review top 20 zero-result queries
- Review top 10 abandonment queries (search → ticket)
- Pick 5 actions: new article / retag / synonym / boost / retire

## Monthly (60 min)
- Quality score distribution
- Top deflection categories and failures
- Locale review (non-English)

## Quarterly
- Taxonomy refresh
- High-risk policy review (HR/security/finance)
- Federation source audit

Concept 3

Search Result Configuration

Profiles, boosting, blocks, synonyms, personalisation, evaluation, APIs, and performance tuning

3.1

Search profiles

Configuring what content each user type sees in search results

Key takeaway

Search profiles define which sources, ranking rules, and UI blocks appear per persona (employee portal, agent workspace, HR portal) — preventing “one search to rule them all” mistakes.

Why this matters

Different users require different results and different compliance constraints. Profiles are how you encode that reality.

Profiles segment by surface and persona. Each profile can point at different sources and ranking preferences.

Security implication: HR profile can include HR KB; employee IT profile should not. Profiles plus ACLs form the retrieval boundary.

Practical rule: create the minimum set of profiles that reflect real differences. Too many profiles become unmaintainable.
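
The statement that profiles plus ACLs form the retrieval boundary can be read as a logical AND: a document is returned only if the profile includes its source and the user passes the document's access check. The profile names, source names, and role model below are hypothetical and much simpler than real ServiceNow ACL evaluation.

```python
# Hypothetical profile definitions: which sources each search surface may query.
PROFILES = {
    "employee_it": {"it_kb", "catalog"},
    "hr_employee": {"hr_kb", "catalog"},
    "hr_manager": {"hr_kb", "hr_manager_kb", "catalog"},
}

def visible_results(profile, user_roles, documents):
    """Return titles that pass BOTH the profile's source scope and the document ACL."""
    sources = PROFILES[profile]
    return [
        d["title"] for d in documents
        if d["source"] in sources
        # An empty required_roles set means the document is open to everyone.
        and (not d["required_roles"] or d["required_roles"] & user_roles)
    ]

documents = [
    {"title": "Leave policy", "source": "hr_kb", "required_roles": set()},
    {"title": "Manager guidance on leave", "source": "hr_manager_kb",
     "required_roles": {"hr_manager"}},
    {"title": "VPN troubleshooting", "source": "it_kb", "required_roles": set()},
]
print(visible_results("hr_employee", {"employee"}, documents))
print(visible_results("hr_manager", {"hr_manager"}, documents))
```

Neither layer is sufficient alone: an employee routed to the manager profile by mistake is still stopped by the ACL, and an over-permissive ACL is still contained by the profile's source scope.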

Workflow — do this next

01 Define 3 profiles: employee IT, agent ITSM, HR employee.
02 Assign sources and ranking priorities per profile.
03 Test the same query across profiles to validate audience separation.

Real example

HR content leakage avoided by profile split

Without profile separation, “leave policy” query surfaced internal HR manager guidance to all employees. Splitting HR manager vs employee profiles fixed it while retaining findability for the right audience.

3.2

Boosting and weighting

How to promote specific content types, sources, or attributes

Key takeaway

Boosting is how you encode business intent into ranking: promote authoritative KB spaces, demote generic docs, and prioritise catalog items for request intents.

Why this matters

If search returns the “wrong right answer,” users lose trust. Boosting makes the right answer win consistently.

Boost dimensions include source type (KB vs record), KB base, article category, recency, and popularity signals.

Use boosting to enforce governance: official policy docs should outrank personal wiki pages on compliance topics.

Anti-pattern: boosting everything. If all sources are boosted, nothing is. Keep boosts scarce and justified.

Workflow — do this next

01 Pick 10 high-impact queries; define desired top result.
02 Apply minimal boosts to achieve target ordering.
03 Re-run on golden set to avoid collateral ranking damage.

Real example

Catalog intent boosted over KB

Users searching “new laptop” wanted the request item, not troubleshooting articles. Boosting catalog results for request intents improved funnel conversion and reduced abandoned searches.

3.3

Search blocks

Configurable result cards and tailoring them for different portals

Key takeaway

Search blocks are UI result components (KB card, catalog card, how-to snippet, recommended actions) — tuned per portal to drive deflection and correct next steps.

Why this matters

Even perfect retrieval fails if the UI doesn’t guide the user to act (read, request, reset, escalate).

Portal blocks should prioritise “do the thing”: reset password, request access, start chat — not just list documents.

Agent workspace blocks should prioritise speed: similar incidents, KB snippets, and recommended fix steps.

Design principle: show fewer, better blocks. Too many blocks create choice paralysis and reduce deflection.

Workflow — do this next

01 Choose a portal and define 3 blocks: KB, catalog, escalation.
02 A/B test block order for top 10 intents.
03 Measure funnel: search → click → success (no ticket).

Real example

Reset flow block increased self-service

Adding a “Reset MFA” action block above KB results increased successful self-service even when the KB article existed — because users preferred action to reading.

3.4

Query expansion

Synonyms, abbreviations, and vocabulary configuration that improves recall

Key takeaway

Query expansion increases recall by mapping user vocabulary to enterprise vocabulary: acronyms, product nicknames, and regional terms.

Why this matters

In large orgs, vocabulary drift is constant. Query expansion prevents search from decaying as language changes.

Maintain a synonym dictionary per domain and locale.

Expansion should be governed: careless synonyms can increase false positives and surface wrong policies.

Use analytics to propose new synonyms: repeated “no results” terms become expansion candidates.
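
A synonym dictionary keyed by domain and locale might look like the sketch below. The entries and the single-term substitution strategy are assumptions for illustration; AI Search has its own synonym configuration, which this does not attempt to model.

```python
# Hypothetical synonym dictionary keyed by (domain, locale); entries are illustrative.
SYNONYMS = {
    ("security", "en"): {
        "mfa": ["2fa", "authenticator app"],
        "sox": ["sarbanes-oxley"],
    },
}

def expand_query(query, domain, locale="en"):
    """Return the normalised query plus one variant per known synonym substitution."""
    terms = query.lower().split()
    variants = [" ".join(terms)]
    dictionary = SYNONYMS.get((domain, locale), {})
    for i, term in enumerate(terms):
        for synonym in dictionary.get(term, []):
            variant = " ".join(terms[:i] + [synonym] + terms[i + 1:])
            if variant not in variants:
                variants.append(variant)
    return variants

print(expand_query("Reset MFA", "security"))
```

Each added variant widens recall but is also a chance for a false positive, which is why the workflow below validates expansions against the golden set before they ship.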

Workflow — do this next

01 Extract top 100 queries; identify synonyms and acronyms.
02 Add expansions for top 20.
03 Validate on golden set; monitor for false positives.

Real example

Acronym expansion fixed finance searches

Users typed “SOX access” and got generic security docs. Adding an expansion that maps “SOX” to the approved access-control KB articles improved top-1 results immediately.

3.5

Personalised search

Using user context, history, and role to individualise results

Key takeaway

Personalisation uses role, location, device type, and past interactions to rank results that are more likely correct for this user — but must remain transparent and policy-safe.

Why this matters

Personalisation is how you avoid showing Mac VPN articles to Windows-only users — reducing frustration and tickets.

Signals: role (employee vs agent), device fleet (Windows/Mac), region, assigned apps, and history. Use the minimum signals needed.

Transparency matters: users should see why a result was recommended (e.g., “Windows instructions”). Hidden personalisation can look like inconsistency.

Governance: never personalise into restricted knowledge. ACL rules still dominate.

Workflow — do this next

01 Pick one safe signal (device type) and implement personalisation for it.
02 Measure reduction in “wrong OS” clicks.
03 Expand to role and location after validation.

Real example

OS-aware search cut repeat contacts

Mobile users kept following desktop steps. Personalising by device type surfaced mobile instructions first, reducing repeat contacts and increasing successful self-service.

3.6

Testing search quality

Evaluation methodology for measuring search improvement

Key takeaway

Search quality needs an evaluation stack: golden query set, top-k recall, click-through, deflection attribution, and qualitative audits — not “it feels better.”

Why this matters

Search tuning is iterative. Without tests, you regress silently and lose trust.

Offline eval: golden query set and expected results. Online eval: A/B testing on traffic and funnel metrics.

Use top-k metrics: top-1 accuracy is ideal; top-3 recall is acceptable when UI guides users. Zero-results must trend down.
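
Top-1 accuracy and top-3 recall over a golden set can be computed with a few lines once there is a function that returns ranked titles for a query. In the sketch below, `stub_search` is a hypothetical stand-in for whatever calls your search surface; the golden queries are taken from the starter set earlier in this chapter, and the stub's results are invented.

```python
def evaluate(golden_set, search_fn, k=3):
    """Score a search function against (query, expected top result) pairs."""
    if not golden_set:
        raise ValueError("golden set is empty")
    top1_hits = topk_hits = 0
    failures = []
    for query, expected in golden_set:
        results = search_fn(query)[:k]
        if results and results[0] == expected:
            top1_hits += 1
        if expected in results:
            topk_hits += 1
        else:
            failures.append(query)  # expected result missing from the top k
    n = len(golden_set)
    return {"top1_accuracy": top1_hits / n, "topk_recall": topk_hits / n, "failures": failures}

golden_set = [
    ("reset mfa", "MFA reset article"),
    ("vpn drops", "remote access troubleshooting"),
    ("bitlocker key", "KB: recovery key"),
]

# Hypothetical canned results standing in for a real search call.
_CANNED = {
    "reset mfa": ["MFA reset article", "Password policy"],
    "vpn drops": ["VPN client download", "remote access troubleshooting"],
    "bitlocker key": [],
}

def stub_search(query):
    return _CANNED.get(query, [])

print(evaluate(golden_set, stub_search))
```

Running the same evaluation before and after a ranking or source change turns the golden set into a regression test, and the `failures` list gives the queries to investigate first.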

Add qualitative audits: sample 20 queries weekly and judge result usefulness — analytics can miss “technically clicked but wrong.”

Workflow — do this next

01 Maintain a 100-query golden set across IT, HR, and CSM.
02 Run the evaluation monthly and again after any ranking or source change.
03 Publish a simple scorecard to stakeholders.

Real example

Scorecard prevented silent regression

After adding a new source, top-1 accuracy dropped for 12 common queries. Golden set test caught it before broad rollout; boosts were adjusted and accuracy recovered.

3.7

Search for developers

API surface for building custom search experiences

Key takeaway

Developers can build custom search UX on portals/workspaces using platform APIs — but must preserve the same profile scoping and ACL behaviour as standard search.

Why this matters

Custom portals often need tailored search. Doing it wrong creates leakage and inconsistent results.

Use platform search APIs where available instead of replicating ranking in custom code. Keep profiles and sources centralised.

Security rule: never bypass ACLs. Custom experiences must run as the user, not as admin.

UX rule: keep result types consistent with standard search so users recognise what they’re seeing.

Workflow — do this next

01 Define the custom portal requirements that standard search can’t meet.
02 Use official APIs with profile scoping; verify ACL behaviour.
03 Add analytics instrumentation to custom search events.

Real example

Custom mobile search with safe scoping

Field technicians used a custom mobile portal. The team built a simplified search UI using platform search APIs and a technician profile. Results remained ACL-safe and analytics fed the knowledge flywheel.

3.8

Performance tuning

Configuration knobs that affect search latency and throughput

Key takeaway

Latency depends on source scope, retrieval depth (top-k), federation calls, and re-ranking complexity. Tune by reducing scope and caching common queries before scaling infrastructure.

Why this matters

If search is slow, users abandon and create tickets. Performance is deflection.

Start by reducing scope: fewer sources, smaller indexes, and tighter profiles. Avoid federating to slow external systems for high-volume portals.

Tune retrieval: limit top-k and chunk sizes; demote generic content that explodes candidate sets.

Treat performance as an SLO: define p95 latency targets per portal, and monitor continuously.
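A p95 target is only useful if everyone computes it the same way. The sketch below uses the nearest-rank percentile definition on a list of latency samples and compares the result with a target. The function names and the target value are assumptions for illustration; how you collect the samples (portal logs, browser timing, APM tooling) is left to your environment.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def check_slo(latencies_ms: list[float], p95_target_ms: float) -> dict[str, object]:
    """Summarise p50/p95 and report whether p95 meets the target."""
    p50 = percentile(latencies_ms, 50)
    p95 = percentile(latencies_ms, 95)
    return {"p50_ms": p50, "p95_ms": p95, "within_slo": p95 <= p95_target_ms}


# Illustrative samples: mostly fast searches with one slow outlier.
samples_ms = [120.0, 130.0, 110.0, 140.0, 150.0, 125.0, 135.0, 145.0, 115.0, 950.0]
print(check_slo(samples_ms, p95_target_ms=500.0))
```

Running the same check with a federated source enabled and disabled shows how much of the tail latency that source contributes.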

Workflow — do this next

01 Measure p50/p95 latency for portal search.
02 Disable one external federated source and compare latency.
03 Reduce top-k retrieval and retest quality vs speed.

Real example

Federation caused p95 latency spike

Confluence federation added 800ms at p95, and abandoned portal searches increased. Fix: federate only for the agent workspace, not the employee portal. Latency returned to baseline and deflection recovered.

Concept 4

Self-service Deflection Workflows

The portal funnel, proactive answers, measurement, fallbacks, optimisation loops, and an enterprise redesign case study

4.1

The deflection economics

Cost model that justifies investment in AI search and self-service

Key takeaway

Deflection ROI is a simple equation: tickets avoided × cost per ticket + agent time saved − AI/search operating cost. Credible programs use category-level baselines, not global hype numbers.

Why this matters

CFOs fund deflection when you show real math, credible attribution, and a plan to improve over time.

Start with category segmentation: password resets and access requests behave differently from outage incidents. Each category has a different deflection ceiling.

Include hidden costs: knowledge maintenance, search tuning, and governance. AI doesn’t remove ops; it changes ops.

Use conservative ranges and publish assumptions. Overpromised deflection destroys credibility and kills future funding.

Workflow — do this next

01 Compute cost per ticket by category (fully loaded).
02 Pick 3 categories with high volume and clear self-service solutions.
03 Model deflection at 20%, 35%, 45% scenarios; choose target.
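The ROI equation from the key takeaway (tickets avoided × cost per ticket + agent time saved − operating cost) can be modelled per category with a ceiling on each. This is a rough sketch: the ticket volumes, costs, agent-time value, and operating cost below are invented placeholders, and capping each scenario at the category ceiling is one possible modelling choice, not a prescribed method. Only the 60% / 30% / 5% ceilings echo the example figures used in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    name: str
    monthly_tickets: int
    cost_per_ticket: float  # fully loaded cost, in your currency
    ceiling: float          # highest deflection rate considered realistic


def monthly_net_benefit(
    categories: list[Category],
    deflection_rate: float,
    agent_time_saved_value: float,
    operating_cost: float,
) -> float:
    """tickets avoided x cost per ticket + agent time saved - operating cost.

    The scenario rate is capped at each category's ceiling so that a
    45% scenario cannot claim 45% of outage tickets.
    """
    tickets_avoided_value = sum(
        c.monthly_tickets * min(deflection_rate, c.ceiling) * c.cost_per_ticket
        for c in categories
    )
    return tickets_avoided_value + agent_time_saved_value - operating_cost


# Placeholder inputs for illustration only.
portfolio = [
    Category("password reset", monthly_tickets=1000, cost_per_ticket=20.0, ceiling=0.60),
    Category("vpn", monthly_tickets=400, cost_per_ticket=25.0, ceiling=0.30),
    Category("outage", monthly_tickets=200, cost_per_ticket=50.0, ceiling=0.05),
]

for rate in (0.20, 0.35, 0.45):
    net = monthly_net_benefit(portfolio, rate, agent_time_saved_value=1000.0, operating_cost=3000.0)
    print(f"{rate:.0%} scenario: net monthly benefit {net:,.0f}")
```

Publishing the inputs alongside the output keeps the assumptions visible, which is the point of the conservative approach described above.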

Real example

Why 70% deflection promises fail

The 70% number often excludes complex categories and assumes perfect knowledge. Programs that model category ceilings (e.g., 60% password, 30% VPN, 5% outages) produce believable business cases and sustained investment.

4.2

Portal AI integration

How AI Search and Now Assist combine in Service Portal

Key takeaway

AI Search retrieves the right sources; Now Assist synthesises and formats the answer; the portal UX guides users to act (reset, request, escalate). All three are required.

Why this matters

Many portals deploy GenAI without retrieval or action CTAs — users chat, then still create tickets.

Pattern: user question → AI Search top candidates → GenAI answer grounded in sources → action cards (catalog, reset flow, contact).

Grounding policy: answers must link to KB or catalog items. If no source exists, the assistant should escalate instead of improvising.

Portal design matters: show the answer and the next action on one screen. Minimise clicks between search and resolution.

Workflow — do this next

01 For one portal, map: search box → results → action → success.
02 Add action cards for top 10 intents.
03 Track funnel metrics (search → click → success).
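The search → click → success funnel can be computed from session-level events. The sketch below assumes you can reduce each session to the set of stages it reached; the stage names and the `funnel` function are hypothetical, and what counts as "success" (a completed catalog request, a finished reset flow) is a definition you need to agree on first.

```python
from collections import Counter

# Ordered funnel stages; names are illustrative.
STAGES = ("search", "click", "success")


def funnel(sessions: dict[str, set[str]]) -> dict[str, float]:
    """Return click and success rates relative to search sessions.

    `sessions` maps a session id to the set of stages it reached.
    A stage only counts if every earlier stage was also reached.
    """
    counts: Counter[str] = Counter()
    for reached in sessions.values():
        for stage in STAGES:
            if stage not in reached:
                break
            counts[stage] += 1
    searches = counts["search"]
    if searches == 0:
        raise ValueError("no search sessions")
    return {
        "searches": float(searches),
        "click_rate": counts["click"] / searches,
        "success_rate": counts["success"] / searches,
    }


# Illustrative sessions only.
example_sessions = {
    "s1": {"search", "click", "success"},
    "s2": {"search", "click"},
    "s3": {"search"},
    "s4": {"search", "click", "success"},
}
print(funnel(example_sessions))
```

Comparing these rates before and after adding an action card shows whether the card moved users from reading to acting.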

Real example

Catalog action card beat article reading

Users searching “request software” preferred a one-click catalog card. Adding the card increased successful self-service even though the KB article was accurate.

4.3

The create-a-ticket funnel

Where AI intervenes in ticket creation and with what success rate

Key takeaway

The funnel has intervention points: before ticket creation (answer), during creation (suggest duplicates/KB), and after creation (routing and summarisation). Deflection requires before/during interventions.

Why this matters

Teams often measure AI only after a ticket exists — too late for deflection.

Before create: portal answers and action flows. During create: suggest KB and detect duplicates. After create: PI routing and Now Assist summaries improve agent productivity, not deflection.

Success rates differ: “password reset” has a high deflection ceiling; “network outage” does not. Tie funnel interventions to category.

Design for honesty: do not block ticket creation behind endless chat. Make escalation easy when needed.

Workflow — do this next

01 Instrument where tickets originate (portal, email, VA).
02 Add duplicate suggestion and KB prompts in the create form.
03 Measure: % of searches that end without a ticket, and % of tickets created after a search.

Real example

Duplicate detection reduced noise

During an outage, the portal suggested the existing major incident and the status page. Ticket creation fell, and user satisfaction rose because users got instant status instead of a place in a queue.

4.4

Proactive resolution

Surfacing answers before the user finishes typing their request

Key takeaway

Proactive suggestions (typeahead + intent prediction) can solve issues earlier in the funnel, increasing containment and reducing frustration — but must be precise to avoid noise.

Why this matters

The best deflection is invisible: users get the answer before they ask fully.

Proactive works when you have strong patterns and clean content: top intents, stable procedures, and clear action paths.

Noise kills it: irrelevant popups train users to ignore the portal. Use strict thresholds and limit to top intents.
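One way to express "strict thresholds and top intents only" is a small gate between intent prediction and the UI. This is a sketch under stated assumptions: the `(intent, confidence)` input shape, the 0.8 threshold, and the limit of two suggestions are all hypothetical values to tune against your own click and complaint data.

```python
def pick_suggestions(
    predictions: list[tuple[str, float]],
    allowed_intents: set[str],
    min_confidence: float = 0.8,  # assumed threshold, tune per portal
    max_shown: int = 2,           # assumed cap to keep the UI quiet
) -> list[str]:
    """Keep only allow-listed intents above the threshold, best first."""
    eligible = [
        (intent, confidence)
        for intent, confidence in predictions
        if intent in allowed_intents and confidence >= min_confidence
    ]
    eligible.sort(key=lambda pair: pair[1], reverse=True)
    return [intent for intent, _ in eligible[:max_shown]]


# Illustrative input: intent names and scores are invented.
top_intents = {"vpn_reset", "vpn_status", "password_reset"}
predicted = [
    ("vpn_reset", 0.92),
    ("printer_setup", 0.95),   # confident, but not an allow-listed intent
    ("password_reset", 0.60),  # allow-listed, but below the threshold
    ("vpn_status", 0.85),
]
print(pick_suggestions(predicted, top_intents))
```

Returning an empty list is a valid outcome: showing nothing is better than training users to ignore the suggestions.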

Pair with accessibility: proactive suggestions must be keyboard and screen-reader friendly.

Workflow — do this next

01 Enable proactive suggestions for top 5 intents only.
02 Measure: suggestion click rate and success rate.
03 Expand only if success stays high and complaints stay low.

Real example

Typeahead solved VPN queries

As users typed “vpn”, the portal suggested “Remote access status” and “Reset VPN profile.” Click-to-resolution improved and ticket creation dropped for that category.

4.5

Measuring deflection

Metrics, attribution challenges, and reporting CFOs accept

Key takeaway

Deflection measurement must tie to record truth: searches that did not create tickets, tickets created after failed search, and reopen/recall rates. CFOs trust conservative attribution with clear assumptions.

Why this matters

Bad measurement creates fake wins and later budget cuts.

Core metrics: search sessions, zero results, click-through, ticket create rate after search, and deflection by category.

Attribution pitfalls: users may search and then call; a user may create a ticket later; a successful session may still end in a ticket for compliance reasons.

Reporting pattern: show ranges, confidence, and negative signals (false deflection, wrong answers). Mature programs report risks alongside wins.

Workflow — do this next

01 Define deflection as: no ticket created within 72h after search on same intent.
02 Track false deflection: ticket created after reading wrong article.
03 Publish monthly deflection scorecard with assumptions.
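The 72-hour definition can be turned into a repeatable calculation. The sketch below assumes search sessions and tickets have already been exported with a user, an intent category, and a timestamp; the `SearchSession` and `Ticket` classes are hypothetical shapes for that export, not platform tables. It does not cover tickets raised by phone or email, which this definition treats as not attributable to search.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SearchSession:
    user: str
    intent: str
    at: datetime


@dataclass(frozen=True)
class Ticket:
    user: str
    intent: str
    created: datetime


WINDOW = timedelta(hours=72)


def is_deflected(session: SearchSession, tickets: list[Ticket], window: timedelta = WINDOW) -> bool:
    """Deflected means: no ticket by the same user, on the same intent,
    created within the window after the search."""
    return not any(
        t.user == session.user
        and t.intent == session.intent
        and session.at <= t.created <= session.at + window
        for t in tickets
    )


def deflection_rate_by_intent(sessions: list[SearchSession], tickets: list[Ticket]) -> dict[str, float]:
    """Share of search sessions per intent that count as deflected."""
    totals: dict[str, int] = {}
    deflected: dict[str, int] = {}
    for s in sessions:
        totals[s.intent] = totals.get(s.intent, 0) + 1
        if is_deflected(s, tickets):
            deflected[s.intent] = deflected.get(s.intent, 0) + 1
    return {intent: deflected.get(intent, 0) / n for intent, n in totals.items()}


# Illustrative records only.
t0 = datetime(2024, 1, 8, 9, 0)
demo_sessions = [
    SearchSession("alice", "vpn", t0),
    SearchSession("bob", "vpn", t0),
    SearchSession("carol", "mfa", t0),
]
demo_tickets = [
    Ticket("bob", "vpn", t0 + timedelta(hours=10)),     # inside the window
    Ticket("carol", "mfa", t0 + timedelta(hours=100)),  # outside the window
    Ticket("alice", "mfa", t0 + timedelta(hours=1)),    # different intent
]
print(deflection_rate_by_intent(demo_sessions, demo_tickets))
```

False deflection needs a separate signal, such as a ticket that references an article the user read, and is not derived from this calculation.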

Real example

CFO accepted deflection after assumptions published

The team published a conservative definition and reported both deflection and false deflection. The CFO funded further knowledge investment because reporting was honest and stable over time.

4.6

The fallback design

What happens when AI cannot answer and how to make handoff seamless

Key takeaway

A good fallback is fast, respectful, and context-preserving: offer ticket creation, chat, or call — while passing transcript, query, and retrieved sources to the agent.

Why this matters

Fallback quality determines whether users trust self-service or bypass it forever.

Fallback triggers: no sources, low confidence, policy-sensitive topic, or explicit user frustration.

Context handoff: include the user’s query, what they clicked, and the top retrieved articles — so agents don’t restart from zero.
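The triggers and the handoff context can be sketched as two small functions: one decides whether to fall back and why, the other assembles what the agent should see. Everything here is illustrative: the `SearchContext` fields, the 0.6 confidence threshold, and the payload keys are assumptions, and attaching the payload to a ticket or chat record depends on your own integration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SearchContext:
    query: str
    sources: list[str]           # ids of retrieved articles, best first
    confidence: float            # assumed 0..1 score for the answer
    clicked: list[str] = field(default_factory=list)
    sensitive_topic: bool = False
    user_frustrated: bool = False


def fallback_reason(ctx: SearchContext, min_confidence: float = 0.6) -> Optional[str]:
    """Return the first matching trigger, or None if self-service can continue."""
    if not ctx.sources:
        return "no_sources"
    if ctx.confidence < min_confidence:
        return "low_confidence"
    if ctx.sensitive_topic:
        return "policy_sensitive"
    if ctx.user_frustrated:
        return "user_frustration"
    return None


def handoff_payload(ctx: SearchContext, reason: str, max_sources: int = 3) -> dict[str, object]:
    """Context for the agent: the query, what was clicked, and top sources."""
    return {
        "reason": reason,
        "query": ctx.query,
        "clicked": list(ctx.clicked),
        "top_sources": ctx.sources[:max_sources],
    }


# Illustrative context; article ids are invented.
ctx = SearchContext(
    query="vpn keeps disconnecting",
    sources=["kb_remote_access", "kb_vpn_install", "kb_wifi", "kb_proxy"],
    confidence=0.4,
    clicked=["kb_vpn_install"],
)
reason = fallback_reason(ctx)
if reason is not None:
    print(handoff_payload(ctx, reason))
```

Keeping the trigger reason in the payload lets you report later which triggers drive most handoffs.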

Design rule: never trap users in a loop. Provide “create ticket now” option early.

Workflow — do this next

01 Define fallback paths per portal: ticket, chat, phone, scheduled callback.
02 Attach context to the created record automatically.
03 Train agents to acknowledge prior self-service attempts in first reply.

Real example

Handoff summary reduced repeat questions

Agents received a short summary of the user’s portal attempts and the KB articles retrieved. The first message acknowledged the steps already tried. CSAT improved even when deflection didn’t happen.

4.7

Optimising for deflection

Iterative process of improving containment rate over time

Key takeaway

Deflection improves through weekly iteration: fix top gaps, tune ranking, add synonyms, improve action blocks, and retire bad content — the flywheel in action.

Why this matters

Deflection is a product, not a project. Optimisation is the work.

Run the loop: analytics → identify top failures → ship 5 fixes → re-measure. Consistency beats occasional big rewrites.

Separate content fixes from ranking fixes. Content fixes improve truth; ranking fixes improve findability. You need both.

Build champions: portal owners, knowledge managers, and service desk leads. If operations doesn’t own it, it will die after launch.

Workflow — do this next

01 Weekly: ship 5 improvements (KB, synonyms, boosts, blocks).
02 Monthly: review top deflection categories and plateau causes.
03 Quarterly: portal redesign review and federation audit.

Real example

Containment rose 22% → 41% in 12 weeks

No model changes. Only flywheel work: KB rewrites, synonym expansions, boosting, action cards, and stale article retirement. This is how real programs win.

4.8

Real use case: enterprise portal redesign

Architecture, before-and-after metrics, and lessons

Key takeaway

Portal redesign success requires retrieval quality, action-first UX, and honest measurement. AI Search is the engine; Now Assist is the voice; Flow is the action layer.

Why this matters

This is the case study you use in interviews and steering committees.

Architecture: profile-scoped AI Search + KB flywheel + portal action blocks + fallback handoff summary + analytics dashboards.

Before metrics: high zero results, low click-through, high ticket creation after search. After metrics: improved top-1 click-through, reduced zero results, higher deflection in target categories.

Lessons: don’t skip knowledge hygiene; don’t launch without analytics; don’t claim universal deflection; invest in fallback UX.

Workflow — do this next

01 Phase 0: golden query set + analytics baseline.
02 Phase 1: AI Search sources + ranking tuning for IT portal.
03 Phase 2: action blocks + proactive suggestions for top intents.
04 Phase 3: expand to HR/CSM portals with separate profiles.

Real example

100k-employee enterprise — portal relaunch

After an 8-week search and knowledge cleanup, top-1 click-through rose 31% and the zero-result rate fell 45%. Deflection for the top 5 intents reached 38% with conservative attribution. Ticket volume dropped in those categories while CSAT rose.

Ready-to-use artifacts

Complete templates — paste directly into your AI tool or automation workflow.

Search quality scorecard (starter)

Use weekly/monthly to track improvements across portals.

| Metric | IT portal | HR portal | CSM portal |
|-------|----------|----------|----------|
| Zero-result rate | | | |
| Top-1 click-through | | | |
| Ticket create after search | | | |
| Deflection (72h) | | | |
| False deflection (reopen/complaints) | | | |
| p95 search latency | | | |

Deflection attribution definition

Paste into finance deck so numbers stay honest.

Deflection definition (conservative):
- A user search session is considered deflected if no ticket is created within 72 hours for the same intent category.

Report both:
- Deflection rate (by category)
- False deflection: tickets created after reading an incorrect or outdated article

Assumptions:
- Some tickets are created via phone/email (not attributable to search).
- Some topics require human handling by policy; exclude from deflection target.

AI Search golden query set (starter)

Paste into spreadsheet and use as regression tests.

| Query | Persona | Expected top result | Pass? |
|------|---------|----------------------|-------|
| reset mfa | employee | MFA reset article | |
| vpn drops | employee | remote access troubleshooting | |
| request new laptop | employee | catalog item: laptop request | |
| outlook sync error | agent | KB: Outlook sync fix | |
| bitlocker key | employee | KB: recovery key | |

Knowledge flywheel operating rhythm

Weekly agenda you can paste into your ops calendar.

## Weekly (30 min)
- Review top 20 zero-result queries
- Review top 10 abandonment queries (search → ticket)
- Pick 5 actions: new article / retag / synonym / boost / retire

## Monthly (60 min)
- Quality score distribution
- Top deflection categories and failures
- Locale review (non-English)

## Quarterly
- Taxonomy refresh
- High-risk policy review (HR/security/finance)
- Federation source audit

Enterprise portal redesign — search-first deflection

A 90k-employee organisation had a portal, but employees treated it as a form to create tickets. Search returned irrelevant results, knowledge was stale, and Now Assist responses were ungrounded. Ticket volume rose despite AI licensing.

Before

Legacy keyword search, no profiles, no analytics ownership, generic KB articles, and a portal UX that hid actions behind multiple clicks.

After

AI Search configured with persona profiles, golden query set tests, boosted authoritative KB, query expansion for acronyms, and action-first result blocks. A weekly knowledge flywheel improved coverage. Now Assist was reintroduced only after retrieval quality stabilised.

Zero-result rate −45% in 8 weeks (IT portal)
Top-1 click-through +31% for top intents
Deflection 0% → 38% for top 5 intents (conservative attribution)
Ticket volume down in deflected categories; CSAT up due to better fallback handoff

What goes wrong

Indexing everything without scoping — noise and leakage risk

Start with one portal profile and a small set of sources; expand only after ACL and ranking validation.

Treating search as set-and-forget — knowledge rot poisons retrieval

Run the knowledge flywheel: analytics → backlog → draft → review → publish → retire stale content.

Measuring deflection with vanity metrics (chat ended)

Tie deflection to record truth (no ticket within 72h) and report false deflection explicitly.

Federating to external sources without priority rules

Define per-portal source priority; resolve conflicts explicitly so users learn to trust results.

Vetted by Krishna Kumar, Curator, FactorBeam
