Process Data with Analyzers
You can control how Atlas Search turns a string field's contents into searchable terms using analyzers. Analyzers are policies that combine a tokenizer, which extracts tokens from text, with filters that you define. Atlas Search applies your filters to the tokens to create indexable terms that correct for differences in punctuation, capitalization, filler words, and more.
You can specify analyzers in your index definition for Atlas Search to use when building an index or searching your database. You can also specify alternate (multi) analyzers to use when indexing individual fields, or define your own custom analyzers.
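To make the tokenizer-plus-filters idea concrete, here is a minimal Python sketch of an analysis pipeline. It is a toy model, not Atlas Search's implementation: the function names, the regular expression, and the stop-word list are all hypothetical and chosen only for illustration.

```python
import re
from typing import Callable, Iterable, List

# Hypothetical stop-word ("filler word") list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "and"}


def word_tokenizer(text: str) -> List[str]:
    """Extract runs of word characters and apostrophes as tokens."""
    return re.findall(r"[\w']+", text)


def lowercase_filter(tokens: Iterable[str]) -> Iterable[str]:
    """Correct for differences in capitalization."""
    return (t.lower() for t in tokens)


def stop_word_filter(tokens: Iterable[str]) -> Iterable[str]:
    """Drop filler words that rarely help a search."""
    return (t for t in tokens if t not in STOP_WORDS)


def analyze(
    text: str,
    tokenizer: Callable[[str], List[str]],
    filters: List[Callable[[Iterable[str]], Iterable[str]]],
) -> List[str]:
    """Tokenize the text, then apply each filter in order."""
    tokens: Iterable[str] = tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return list(tokens)


terms = analyze(
    "The Return of the King!",
    word_tokenizer,
    [lowercase_filter, stop_word_filter],
)
print(terms)  # ['return', 'king']
```

The order of the filters matters in this sketch: lowercasing runs first so that "The" and "the" are both caught by the stop-word filter.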
Syntax
The following examples show the syntax of the analyzer options you can configure in your index definition:
You can specify an index analyzer for Atlas Search to apply to string fields when building an index using the analyzer option in your Atlas Search index definition.
Atlas Search applies the top-level analyzer to all fields in the index definition unless you specify a different analyzer for a field within the mappings.fields definition for your field.
If you omit the analyzer option, Atlas Search defaults to using the Standard Analyzer.
{
  "analyzer": "<analyzer-for-index>",
  "mappings": {
    "fields": {
      "<string-field-name>": {
        "type": "string",
        "analyzer": "<analyzer-for-field>"
      }
    }
  }
}
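As an illustration of the precedence described above, the following Python sketch fills in the placeholders with concrete values and resolves which analyzer would apply to a field. The field names `title` and `plot` and the helper `index_analyzer_for` are hypothetical. The helper assumes flat, single-type field definitions; it does not handle nested documents or fields defined with multiple types.

```python
# Name of the built-in Standard Analyzer, used when nothing else is specified.
DEFAULT_ANALYZER = "lucene.standard"

# Example index definition with hypothetical field names.
index_definition = {
    "analyzer": "lucene.english",  # top-level analyzer for the index
    "mappings": {
        "fields": {
            # Field-level analyzer overrides the top-level one.
            "title": {"type": "string", "analyzer": "lucene.keyword"},
            # No field-level analyzer: falls back to the top-level one.
            "plot": {"type": "string"},
        }
    },
}


def index_analyzer_for(definition: dict, field_name: str) -> str:
    """Resolve the index analyzer: field-level, then top-level, then default."""
    fields = definition.get("mappings", {}).get("fields", {})
    field = fields.get(field_name, {})
    return field.get("analyzer") or definition.get("analyzer") or DEFAULT_ANALYZER


print(index_analyzer_for(index_definition, "title"))  # lucene.keyword
print(index_analyzer_for(index_definition, "plot"))   # lucene.english
```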
You can specify a search analyzer for Atlas Search to apply to query text using the searchAnalyzer option in your Atlas Search index definition.
If you omit the searchAnalyzer option, Atlas Search defaults to using the analyzer that you specify for the analyzer option. If you omit both options, Atlas Search defaults to using the Standard Analyzer.
{
  "searchAnalyzer": "<analyzer-for-query>",
  "mappings": {
    "dynamic": <boolean>,
    "fields": { <field-definition> }
  }
}
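The fallback order for the search analyzer can be sketched in a few lines of Python. The helper name `search_analyzer_for` is hypothetical, and the sketch only considers the top-level options shown in this section.

```python
def search_analyzer_for(definition: dict) -> str:
    """Resolve the query-time analyzer from top-level index options.

    Order: searchAnalyzer, then analyzer, then the Standard Analyzer.
    """
    return (
        definition.get("searchAnalyzer")
        or definition.get("analyzer")
        or "lucene.standard"
    )


# Both options set: searchAnalyzer wins at query time.
both = {"analyzer": "lucene.english", "searchAnalyzer": "lucene.simple"}
# Only analyzer set: it is reused for query text.
index_only = {"analyzer": "lucene.english"}

print(search_analyzer_for(both))        # lucene.simple
print(search_analyzer_for(index_only))  # lucene.english
print(search_analyzer_for({}))          # lucene.standard
```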
You can specify an alternate analyzer for Atlas Search to apply to string fields when building an index using the multi option in your Atlas Search index definition.
To use the alternate analyzer in an Atlas Search query, you must specify the name of the alternate analyzer in the multi field of your query operator's query path.
To learn more, see Multi Analyzer.
{
  "mappings": {
    "fields": {
      "<string-field-name>": {
        "type": "string",
        "analyzer": "<default-analyzer-for-field>",
        "multi": {
          "<alternate-analyzer-name>": {
            "type": "string",
            "analyzer": "<alternate-analyzer-for-field>"
          }
        }
      }
    }
  }
}
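The sketch below shows how a query might target the alternate analyzer. It builds a `$search` stage as a plain Python dictionary, so it needs no database connection. The field name `title`, the alternate analyzer name `frenchAnalyzer`, the index name `default`, and the helper `text_search_stage` are assumptions made for illustration.

```python
# Index definition: "title" is indexed with the Standard Analyzer and,
# under the hypothetical name "frenchAnalyzer", with the French analyzer.
index_definition = {
    "mappings": {
        "fields": {
            "title": {
                "type": "string",
                "analyzer": "lucene.standard",
                "multi": {
                    "frenchAnalyzer": {
                        "type": "string",
                        "analyzer": "lucene.french",
                    }
                },
            }
        }
    }
}


def text_search_stage(query: str, field: str, multi: str = "", index: str = "default") -> dict:
    """Build a $search stage; name the alternate analyzer in the path if given."""
    path = {"value": field, "multi": multi} if multi else field
    return {"$search": {"index": index, "text": {"query": query, "path": path}}}


# Query the field through its alternate analyzer.
stage = text_search_stage("liberté", "title", multi="frenchAnalyzer")
print(stage["$search"]["text"]["path"])
```

With a driver such as PyMongo, a stage like this would typically be passed as the first element of an aggregation pipeline.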
You can define one or more custom analyzers to transform, filter, and group sequences of characters using the analyzers option in your Atlas Search index.
To use a custom analyzer that you define, specify its name value in your index definition's analyzer, searchAnalyzer, or multi.analyzer option.
To learn more, see Custom Analyzers.
{
  "mappings": {
    "dynamic": <boolean>,
    "fields": { <field-definition> }
  },
  "analyzers": [
    {
      "name": "<custom-analyzer-name>",
      "tokenizer": {
        "type": "<tokenizer-type>"
      }
    }
  ]
}
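As a sketch of how a custom analyzer is defined and then referenced by name, the following Python example builds an index definition and checks that every analyzer it references is either built in or defined in the analyzers array. The analyzer name `lowercaseWords` and both helper functions are hypothetical. The check assumes that built-in analyzer names start with `lucene.`.

```python
# Index definition that defines a custom analyzer and uses it by name.
index_definition = {
    "analyzer": "lowercaseWords",
    "mappings": {"dynamic": True},
    "analyzers": [
        {
            "name": "lowercaseWords",  # hypothetical custom analyzer name
            "tokenizer": {"type": "standard"},
            "tokenFilters": [{"type": "lowercase"}],
        }
    ],
}


def referenced_analyzers(node) -> set:
    """Collect every name used in an analyzer or searchAnalyzer option."""
    names = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key in ("analyzer", "searchAnalyzer") and isinstance(value, str):
                names.add(value)
            elif key != "analyzers":  # skip the definitions themselves
                names |= referenced_analyzers(value)
    elif isinstance(node, list):
        for item in node:
            names |= referenced_analyzers(item)
    return names


def undefined_custom_analyzers(definition: dict) -> list:
    """Return referenced names that are neither built in nor defined."""
    defined = {a["name"] for a in definition.get("analyzers", [])}
    return sorted(
        name
        for name in referenced_analyzers(definition)
        if not name.startswith("lucene.") and name not in defined
    )


print(undefined_custom_analyzers(index_definition))  # []
```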
Analyzers
Atlas Search provides the following built-in analyzers:
Analyzer | Description |
---|---|
Standard | Uses the default analyzer for all Atlas Search indexes and queries. |
Simple | Divides text into searchable terms wherever it finds a non-letter character. |
Whitespace | Divides text into searchable terms wherever it finds a whitespace character. |
Keyword | Indexes text fields as single terms. |
Language | Provides a set of language-specific text analyzers. |
If you don't specify an analyzer in your index definition, MongoDB uses the default standard analyzer.
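To see how the splitting rules in the table differ, the following Python sketch gives rough approximations of three of the built-in analyzers. These functions are illustrative stand-ins, not the actual analyzers. Lowercasing in the simple case is an assumption based on the behavior of Lucene's simple analyzer, and the sample sentence is invented.

```python
import re
from typing import List


def simple_like(text: str) -> List[str]:
    """Approximate Simple: split on non-letter characters, then lowercase."""
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]


def whitespace_like(text: str) -> List[str]:
    """Approximate Whitespace: split on whitespace only, keep case."""
    return text.split()


def keyword_like(text: str) -> List[str]:
    """Approximate Keyword: index the whole string as a single term."""
    return [text]


sample = "Lion's share: 42 items"
print(simple_like(sample))      # ['lion', 's', 'share', 'items']
print(whitespace_like(sample))  # ["Lion's", 'share:', '42', 'items']
print(keyword_like(sample))     # ["Lion's share: 42 items"]
```

In this approximation, the simple rule drops the digits and splits the possessive, the whitespace rule keeps punctuation attached to each term, and the keyword rule matches only on the complete string.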
Normalizers
Normalizers produce only a single token at the end of analysis. You can configure normalizers only in the field definition for the Atlas Search token type. Atlas Search provides the following normalizers:
Normalizer | Description |
---|---|
lowercase | Transforms text in string fields to lowercase and creates a single token for the whole string. |
none | Doesn't perform any transformation, but still creates a single token. |
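The following Python sketch illustrates the single-token behavior of the two normalizers, alongside a field definition that selects one. The field name `status` and the helper `normalize` are hypothetical, and the helper is a toy model of the behavior described in the table, not Atlas Search's implementation.

```python
from typing import List

# Hypothetical field definition: a token-type field with a normalizer.
index_definition = {
    "mappings": {
        "fields": {
            "status": {"type": "token", "normalizer": "lowercase"}
        }
    }
}


def normalize(value: str, normalizer: str = "none") -> List[str]:
    """Return exactly one token for the whole string."""
    if normalizer == "lowercase":
        return [value.lower()]
    if normalizer == "none":
        return [value]
    raise ValueError(f"unknown normalizer: {normalizer}")


print(normalize("New York", "lowercase"))  # ['new york']
print(normalize("New York", "none"))       # ['New York']
```

Unlike an analyzer, neither normalizer splits the string, so "New York" stays one token in both cases.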
Learn More
To learn more about analyzers, see Analyzing Analyzers to Build The Right Search Index For Your App in the MongoDB Developer Center.