
Process Data with Analyzers

You can control how Atlas Search turns a string field's contents into searchable terms using analyzers. Analyzers are policies that combine a tokenizer, which extracts tokens from text, with filters that you define. Atlas Search applies your filters to the tokens to create indexable terms that correct for differences in punctuation, capitalization, filler words, and more.
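The tokenizer-plus-filters idea described above can be sketched in a few lines of plain Python. This is a toy model for intuition only, not Atlas Search's implementation: the function names, the regular expression, and the stop-word list are all assumptions made for this example.

```python
import re
from typing import Callable, Iterable, List

# Hypothetical stop-word ("filler word") list, for illustration only.
STOP_WORDS = {"a", "an", "the", "of", "and"}


def tokenize(text: str) -> List[str]:
    """Toy tokenizer: extract runs of ASCII letters and digits."""
    return re.findall(r"[A-Za-z0-9]+", text)


def lowercase(tokens: Iterable[str]) -> List[str]:
    """Toy filter: remove capitalization differences."""
    return [t.lower() for t in tokens]


def drop_stop_words(tokens: Iterable[str]) -> List[str]:
    """Toy filter: remove filler words."""
    return [t for t in tokens if t not in STOP_WORDS]


def analyze(
    text: str,
    tokenizer: Callable[[str], List[str]],
    filters: List[Callable[[Iterable[str]], List[str]]],
) -> List[str]:
    """Run the tokenizer, then apply each filter in order."""
    tokens = tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


terms = analyze("The Lord of the Rings!", tokenize, [lowercase, drop_stop_words])
print(terms)  # ['lord', 'rings']
```

The order of the filters matters in this sketch: lowercasing runs first so that "The" and "the" are both caught by the stop-word filter.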

You can specify analyzers in your index definition for Atlas Search to use when building an index or searching your database. You can also specify alternate (multi) analyzers to use when indexing individual fields, or define your own custom analyzers.

The following examples show the syntax of each analyzer option you can configure in your index definition: the index analyzer, the search analyzer, multi analyzers, and custom analyzers.

You can specify an index analyzer for Atlas Search to apply to string fields when building an index using the analyzer option in your Atlas Search index definition.

Atlas Search applies the top-level analyzer to all fields in the index definition unless you specify a different analyzer for a field within the mappings.fields definition for your field.

If you omit the analyzer option, Atlas Search defaults to using the Standard Analyzer.

{
  "analyzer": "<analyzer-for-index>",
  "mappings": {
    "fields": {
      "<string-field-name>": {
        "type": "string",
        "analyzer": "<analyzer-for-field>"
      }
    }
  }
}
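The precedence rules described above (field-level analyzer first, then the top-level analyzer, then the Standard Analyzer) can be modeled with a small helper. This is a sketch under assumptions: the field names `title` and `plot` and the function `resolve_index_analyzer` are hypothetical, and the helper only models precedence for fields listed in `mappings.fields`; it does not talk to Atlas.

```python
from typing import Any, Dict

# Name of the built-in Standard Analyzer, used when nothing else is set.
DEFAULT_ANALYZER = "lucene.standard"

# Example index definition with hypothetical field names.
index_definition: Dict[str, Any] = {
    "analyzer": "lucene.simple",
    "mappings": {
        "fields": {
            "title": {"type": "string", "analyzer": "lucene.whitespace"},
            "plot": {"type": "string"},
        }
    },
}


def resolve_index_analyzer(definition: Dict[str, Any], field: str) -> str:
    """Return the analyzer that applies to a mapped string field."""
    field_def = definition.get("mappings", {}).get("fields", {}).get(field, {})
    return (
        field_def.get("analyzer")        # 1. analyzer set on the field
        or definition.get("analyzer")    # 2. top-level index analyzer
        or DEFAULT_ANALYZER              # 3. Standard Analyzer
    )


print(resolve_index_analyzer(index_definition, "title"))  # lucene.whitespace
print(resolve_index_analyzer(index_definition, "plot"))   # lucene.simple
```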

You can specify a search analyzer for Atlas Search to apply to query text using the searchAnalyzer option in your Atlas Search index definition.

If you omit the searchAnalyzer option, Atlas Search defaults to using the analyzer that you specify for the analyzer option. If you omit both options, Atlas Search defaults to using the Standard Analyzer.

{
  "searchAnalyzer": "<analyzer-for-query>",
  "mappings": {
    "dynamic": <boolean>,
    "fields": { <field-definition> }
  }
}
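The fallback order for query text can be expressed the same way. The helper below is a hypothetical illustration that models only the top-level `searchAnalyzer` and `analyzer` options described above.

```python
from typing import Any, Dict


def resolve_search_analyzer(definition: Dict[str, Any]) -> str:
    """Return the analyzer applied to query text for an index definition."""
    return (
        definition.get("searchAnalyzer")   # 1. explicit search analyzer
        or definition.get("analyzer")      # 2. fall back to the index analyzer
        or "lucene.standard"               # 3. fall back to the Standard Analyzer
    )


both = {"analyzer": "lucene.simple", "searchAnalyzer": "lucene.whitespace"}
index_only = {"analyzer": "lucene.simple"}
neither = {"mappings": {"dynamic": True}}

print(resolve_search_analyzer(both))        # lucene.whitespace
print(resolve_search_analyzer(index_only))  # lucene.simple
print(resolve_search_analyzer(neither))     # lucene.standard
```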

You can specify an alternate analyzer for Atlas Search to apply to string fields when building an index using the multi option in your Atlas Search index definition.

To use the alternate analyzer in an Atlas Search query, you must specify the name of the alternate analyzer in the multi field of your query operator's query path.

To learn more, see Multi Analyzer.

{
  "mappings": {
    "fields": {
      "<string-field-name>": {
        "type": "string",
        "analyzer": "<default-analyzer-for-field>",
        "multi": {
          "<alternate-analyzer-name>": {
            "type": "string",
            "analyzer": "<alternate-analyzer-for-field>"
          }
        }
      }
    }
  }
}
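As an illustration of how the index side and the query side fit together, the following sketch builds a definition with one alternate analyzer and a `$search` stage whose path names it. The field name `title`, the alternate analyzer name `exactMatch`, and the helper `multi_path` are assumptions made for this example; the code only constructs dictionaries and does not run a query.

```python
from typing import Any, Dict

# Index definition: "title" is analyzed with lucene.standard by default and
# also with lucene.keyword under the hypothetical multi name "exactMatch".
index_definition: Dict[str, Any] = {
    "mappings": {
        "fields": {
            "title": {
                "type": "string",
                "analyzer": "lucene.standard",
                "multi": {
                    "exactMatch": {"type": "string", "analyzer": "lucene.keyword"}
                },
            }
        }
    }
}


def multi_path(definition: Dict[str, Any], field: str, multi_name: str) -> Dict[str, str]:
    """Build a query path that targets an alternate analyzer on a field."""
    multis = definition["mappings"]["fields"][field].get("multi", {})
    if multi_name not in multis:
        raise KeyError(f"{multi_name!r} is not defined under {field!r}.multi")
    return {"value": field, "multi": multi_name}


# Aggregation stage that searches "title" using the alternate analyzer.
search_stage = {
    "$search": {
        "text": {
            "query": "The Godfather",
            "path": multi_path(index_definition, "title", "exactMatch"),
        }
    }
}
print(search_stage)
```

Checking the multi name against the definition before building the stage is a design choice of this sketch, so that a typo surfaces locally rather than as a query that silently matches nothing.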

You can define one or more custom analyzers to transform, filter, and group sequences of characters using the analyzers option in your Atlas Search index.

To use a custom analyzer that you define, specify its name value in your index definition's analyzer, searchAnalyzer, or multi.analyzer option.

To learn more, see Custom Analyzers.

{
  "mappings": {
    "dynamic": <boolean>,
    "fields": { <field-definition> }
  },
  "analyzers": [
    {
      "name": "<custom-analyzer-name>",
      "tokenizer": {
        "type": "<tokenizer-type>"
      }
    }
  ]
}
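A filled-in version of the template might look like the sketch below. The analyzer name `htmlLowercase` and the helper functions are hypothetical. The `charFilters` and `tokenFilters` entries are optional fields that the template above omits; the `htmlStrip`, `standard`, and `lowercase` types are used here on the assumption that they match the Custom Analyzers reference.

```python
import json
from typing import Any, Dict, Set

# Index definition that defines a custom analyzer and references it by name.
index_definition: Dict[str, Any] = {
    "analyzer": "htmlLowercase",  # refers to the custom analyzer below
    "mappings": {"dynamic": True},
    "analyzers": [
        {
            "name": "htmlLowercase",                   # hypothetical name
            "charFilters": [{"type": "htmlStrip"}],    # strip HTML markup first
            "tokenizer": {"type": "standard"},         # then split into tokens
            "tokenFilters": [{"type": "lowercase"}],   # then lowercase each token
        }
    ],
}


def custom_analyzer_names(definition: Dict[str, Any]) -> Set[str]:
    """Collect the names of all custom analyzers in the definition."""
    return {analyzer["name"] for analyzer in definition.get("analyzers", [])}


def top_level_analyzer_is_defined(definition: Dict[str, Any]) -> bool:
    """Check that the top-level analyzer is built in or defined in 'analyzers'."""
    name = definition.get("analyzer")
    if name is None or name.startswith("lucene."):
        return True
    return name in custom_analyzer_names(definition)


print(top_level_analyzer_is_defined(index_definition))  # True
print(json.dumps(index_definition, indent=2))
```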


Atlas Search provides the following built-in analyzers:

Analyzer
Description

Standard

Uses the default analyzer for all Atlas Search indexes and queries.

Simple

Divides text into searchable terms wherever it finds a non-letter character.

Whitespace

Divides text into searchable terms wherever it finds a whitespace character.

Keyword

Indexes text fields as single terms.

Language

Provides a set of language-specific text analyzers.

If you don't specify an analyzer in your index definition, Atlas Search uses the Standard Analyzer by default.
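To make the differences between the built-in analyzers concrete, the following toy functions imitate three of them in plain Python. These are rough approximations for intuition, not the Lucene implementations: the function names are hypothetical, the lowercasing in `simple_like` reflects the behavior of Lucene's simple analyzer rather than anything stated in the table, and the standard and language analyzers are left out because their word-boundary and language rules are not easy to imitate in a few lines.

```python
import re
from typing import List


def simple_like(text: str) -> List[str]:
    """Split wherever a non-letter character appears, then lowercase."""
    return [term.lower() for term in re.findall(r"[^\W\d_]+", text)]


def whitespace_like(text: str) -> List[str]:
    """Split only on whitespace; keep case and punctuation as-is."""
    return text.split()


def keyword_like(text: str) -> List[str]:
    """Keep the whole field value as a single term."""
    return [text]


sample = "Lion's share-2024"
print(simple_like(sample))      # ['lion', 's', 'share']
print(whitespace_like(sample))  # ["Lion's", 'share-2024']
print(keyword_like(sample))     # ["Lion's share-2024"]
```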

Normalizers produce only a single token at the end of analysis. You can configure normalizers only in the field definition for the Atlas Search token type. Atlas Search provides the following normalizers:

Normalizer
Description

lowercase

Transforms text in string fields to lowercase and creates a single token for the whole string.

none

Doesn't perform any transformation, but still creates a single token.
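The two normalizers can be pictured with the short sketch below. The `normalize` function is a hypothetical stand-in that mirrors the descriptions above, and the field definition assumes the `normalizer` option is set on a field of type `token`.

```python
from typing import Dict, List

# Field definition for the Atlas Search token type with a normalizer.
field_definition: Dict[str, str] = {"type": "token", "normalizer": "lowercase"}


def normalize(text: str, normalizer: str) -> List[str]:
    """Toy normalizer: always returns exactly one token for the whole string."""
    if normalizer == "lowercase":
        return [text.lower()]
    if normalizer == "none":
        return [text]
    raise ValueError(f"unknown normalizer: {normalizer!r}")


print(normalize("New York", field_definition["normalizer"]))  # ['new york']
print(normalize("New York", "none"))                          # ['New York']
```

Unlike an analyzer, neither option splits "New York" into separate terms; the whole string stays one token.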

To learn more about analyzers, see Analyzing Analyzers to Build The Right Search Index For Your App in the MongoDB Developer Center.
