+title: AI Search CSS content selectors for precise content extraction
+description: Control which parts of crawled pages are indexed using CSS selectors.
+products:
+- ai-search
+date: 2026-04-08
+---
+
+[AI Search](/ai-search/) now supports [CSS content selectors](/ai-search/configuration/data-source/website/#content-selectors) for website data sources. You can now define which parts of a crawled page are extracted and indexed by specifying CSS selectors paired with URL glob patterns.
+
+Content selectors solve the problem of indexing only relevant content while ignoring navigation, sidebars, footers, and other boilerplate. When a page URL matches a glob pattern, only elements matching the corresponding CSS selector are extracted and converted to Markdown for indexing.
+
+Configure content selectors via the dashboard or API:
+Selectors are evaluated in order, and the first matching pattern wins. You can define up to 10 content selector entries per instance.
+
+For configuration details and examples, refer to the [content selectors documentation](/ai-search/configuration/data-source/website/#content-selectors).
 | `path` | string | Glob pattern to match against the full page URL. Uses the same glob syntax as [path filtering](/ai-search/configuration/path-filtering/): `*` matches within a segment, `**` crosses directories. Maximum 200 characters. |
-| `selector` | string | CSS selector to extract content from pages matching the path pattern. Supports standard CSS selectors including element, class, ID, and attribute selectors. Maximum 200 characters. |
+| `selector` | string | CSS selector to extract content from pages matching the path pattern. Supports standard CSS selectors including element, class, ID, and attribute selectors. Maximum 200 characters. |
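
The limits stated for these fields (200 characters each, and up to 10 entries per instance) can be checked locally before submitting a configuration. The following is a minimal Python sketch under stated assumptions: `ContentSelector` and `validate_entries` are hypothetical names for illustration, not part of any AI Search SDK, and the rejection of empty values is an assumption rather than documented behavior.

```python
from dataclasses import dataclass

# Limits taken from the text: 10 entries per instance, 200 characters per field.
MAX_ENTRIES = 10
MAX_FIELD_LENGTH = 200


@dataclass(frozen=True)
class ContentSelector:
    """Hypothetical local model of one content selector entry."""
    path: str      # glob pattern matched against the full page URL
    selector: str  # CSS selector applied to matching pages


def validate_entries(entries):
    """Return a list of problems; an empty list means the entries fit the limits."""
    problems = []
    if len(entries) > MAX_ENTRIES:
        problems.append(f"{len(entries)} entries exceeds the limit of {MAX_ENTRIES}")
    for index, entry in enumerate(entries):
        for field_name in ("path", "selector"):
            value = getattr(entry, field_name)
            if not value:
                # Assumption: empty values are treated as invalid in this sketch.
                problems.append(f"entry {index}: {field_name} is empty")
            elif len(value) > MAX_FIELD_LENGTH:
                problems.append(
                    f"entry {index}: {field_name} is {len(value)} characters, "
                    f"limit is {MAX_FIELD_LENGTH}"
                )
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every issue in a single pass.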

 ### Examples

 #### Extract main content from blog pages

 To index only the article body on blog pages and ignore navigation, sidebars, and footers:

-| Path | Selector |
-| --------------| -------------------- |
-| `**/blog/**` | `article .post-body` |
+| Path | Selector |
+| ------------ | -------------------- |
+| `**/blog/**` | `article .post-body` |

 #### Target documentation content

 To index the main content area of a documentation site:

-| Path | Selector |
-| --------------| -------------- |
-| `**/docs/**` | `main .content` |
+| Path | Selector|
+| ------------ |--------------- |
+| `**/docs/**` | `main .content` |

 #### Different selectors for different sections

 You can define multiple entries to apply different selectors to different parts of your site. The first matching path wins, so place more specific patterns first:

-| Path | Selector |
-| ----------------------| -------------------- |
-| `**/blog/releases/**` | `.release-notes` |
-| `**/blog/**` | `article .post-body` |
-| `**/docs/**` | `main .content` |
+| Path | Selector |
+| --------------------- | -------------------- |
+| `**/blog/releases/**` | `.release-notes` |
+| `**/blog/**` | `article .post-body` |
+| `**/docs/**` | `main .content` |

 In this example, a page at `https://example.com/blog/releases/v2` matches the first pattern and uses the `.release-notes` selector. A page at `https://example.com/blog/my-post` skips the first pattern and matches the second.
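
The first-match rule in this example can be sketched in Python. This is an illustration only, not how AI Search implements matching: `glob_to_regex` and `resolve_selector` are hypothetical helpers, and the glob translation assumes only the two wildcards described for `path` (`*` within a segment, `**` across segments). The real matcher may treat edge cases differently.

```python
import re


def glob_to_regex(pattern):
    """Translate a glob to a regex: `**` may cross `/`, `*` stays within one segment."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")      # crosses directory boundaries
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")   # stops at the next `/`
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def resolve_selector(url, entries):
    """Return the selector of the first entry whose path matches the full URL."""
    for path, selector in entries:
        if glob_to_regex(path).fullmatch(url):
            return selector
    # No entry matched. How unmatched pages are handled is not covered by this sketch.
    return None


# Same order as the table above: the more specific pattern comes first.
ENTRIES = [
    ("**/blog/releases/**", ".release-notes"),
    ("**/blog/**", "article .post-body"),
    ("**/docs/**", "main .content"),
]

release = resolve_selector("https://example.com/blog/releases/v2", ENTRIES)
post = resolve_selector("https://example.com/blog/my-post", ENTRIES)
```

Here `release` is `.release-notes` and `post` is `article .post-body`. If the first two entries were swapped, the release URL would match `**/blog/**` first and never reach the more specific pattern, which is why ordering matters.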

 :::caution
-If a CSS selector does not match any elements on a page, the page is indexed with empty content. Verify that your selectors match the expected elements before applying them to a broad set of pages.
+If a CSS selector does not match any elements on a page, the resulting Markdown is empty and AI Search marks the item as errored. Verify that your selectors match the expected elements before applying them to a broad set of pages.
 :::

 ### Interaction with other features

 - **Path filtering**: [Path filtering](/ai-search/configuration/path-filtering/) takes priority over content selectors. Pages excluded by path filters are never crawled, so content selectors do not apply to them.
 - **Browser Rendering**: Content selectors apply to the HTML that AI Search receives. For sites that render content with JavaScript, turn on [Browser Rendering](#rendering-mode) so that selectors can target the fully rendered DOM.
-- **Future crawls only**: Changes to content selectors apply to pages crawled after the change. To apply new selectors to already-indexed pages, trigger a new [sync job](/ai-search/configuration/indexing/).
+- **Automatic re-indexing**: Updating content selectors triggers a new [sync job](/ai-search/configuration/indexing/) immediately, so changes are applied to all indexed pages.