Analysis

Analysis reference

Analysis and Analyzers in the Guide

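An analyzer consists of zero or more character filters, exactly one tokenizer, and zero or more token filters, applied in that order.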
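
As a rough illustration of how an analyzer processes text, the sketch below chains character filters, one tokenizer, and token filters in plain Python. The function `analyze` and the toy components (`strip_tags`, `split_words`, `lowercase`, `remove_stopwords`) are hypothetical stand-ins invented for this example; they are far simpler than the real components and are not part of any client API.

```python
import re
from typing import Callable, List

CharFilter = Callable[[str], str]
Tokenizer = Callable[[str], List[str]]
TokenFilter = Callable[[List[str]], List[str]]


def analyze(text: str,
            char_filters: List[CharFilter],
            tokenizer: Tokenizer,
            token_filters: List[TokenFilter]) -> List[str]:
    """Run text through char filters, then the tokenizer, then token filters."""
    for char_filter in char_filters:
        text = char_filter(text)          # operates on the raw string
    tokens = tokenizer(text)              # exactly one tokenizer splits the text
    for token_filter in token_filters:
        tokens = token_filter(tokens)     # each filter sees the previous output
    return tokens


# Toy components (hypothetical, for illustration only).
def strip_tags(text: str) -> str:
    return re.sub(r"<[^>]+>", " ", text)


def split_words(text: str) -> List[str]:
    return re.findall(r"\w+", text)


def lowercase(tokens: List[str]) -> List[str]:
    return [t.lower() for t in tokens]


def remove_stopwords(tokens: List[str]) -> List[str]:
    stopwords = {"the", "a"}
    return [t for t in tokens if t not in stopwords]


tokens = analyze("<p>The Quick fox</p>",
                 [strip_tags], split_words, [lowercase, remove_stopwords])
print(tokens)  # ['quick', 'fox']
```

Order matters in this sketch: `lowercase` runs before `remove_stopwords`, so "The" is lowercased first and then dropped.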

Analysis can be configured when an index is created, by passing a top-level analysis key in the API argument. Example:

analysis :
    analyzer :
        standard :
            type : standard
            stopwords : [stop1, stop2]
        myAnalyzer1 :
            type : standard
            stopwords : [stop1, stop2, stop3]
            max_token_length : 500
        # configure a custom analyzer which is
        # exactly like the default standard analyzer
        myAnalyzer2 :
            tokenizer : standard
            filter : [standard, lowercase, stop]
    tokenizer :
        myTokenizer1 :
            type : standard
            max_token_length : 900
        myTokenizer2 :
            type : keyword
            buffer_size : 512
    filter :
        myTokenFilter1 :
            type : stop
            stopwords : [stop1, stop2, stop3, stop4]
        myTokenFilter2 :
            type : length
            min : 0
            max : 2000
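
The same configuration can be written as a nested structure and serialized to JSON. The sketch below assumes the settings are sent as the body of an Elasticsearch REST create-index request, where `analysis` sits under `settings`; the client documented here may expect a different wrapper, so treat the `body` shape as an assumption.

```python
import json

# The analysis configuration from the YAML example, as a nested dict.
analysis = {
    "analyzer": {
        "standard": {"type": "standard", "stopwords": ["stop1", "stop2"]},
        "myAnalyzer1": {
            "type": "standard",
            "stopwords": ["stop1", "stop2", "stop3"],
            "max_token_length": 500,
        },
        "myAnalyzer2": {
            "tokenizer": "standard",
            "filter": ["standard", "lowercase", "stop"],
        },
    },
    "tokenizer": {
        "myTokenizer1": {"type": "standard", "max_token_length": 900},
        "myTokenizer2": {"type": "keyword", "buffer_size": 512},
    },
    "filter": {
        "myTokenFilter1": {
            "type": "stop",
            "stopwords": ["stop1", "stop2", "stop3", "stop4"],
        },
        "myTokenFilter2": {"type": "length", "min": 0, "max": 2000},
    },
}

# Assumption: REST-style create-index body with analysis nested under settings.
body = {"settings": {"analysis": analysis}}
request_json = json.dumps(body, indent=2)
print(request_json)
```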

Built-in analyzers

Built-in analyzers in the Guide

Custom analyzers

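You can define custom analyzers. A custom analyzer names the tokenizer, token filters, and character filters it uses, and each of those can itself be defined in the same analysis block.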

Custom Analyzers in the Guide

Example:

analysis :
    analyzer :
        myAnalyzer2 :
            type : custom
            tokenizer : myTokenizer1
            filter : [myTokenFilter1, myTokenFilter2]
            char_filter : [my_html]
            position_offset_gap : 256
    tokenizer :
        myTokenizer1 :
            type : standard
            max_token_length : 900
    filter :
        myTokenFilter1 :
            type : stop
            stopwords : [stop1, stop2, stop3, stop4]
        myTokenFilter2 :
            type : length
            min : 0
            max : 2000
    char_filter :
        my_html :
            type : html_strip
            escaped_tags : [xxx, yyy]
            read_ahead : 1024
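
Because a custom analyzer refers to its tokenizer, filters, and character filters by name, a misspelled name is an easy mistake to make. The sketch below is a hypothetical helper (`undefined_references` is not part of any client API) that lists names a custom analyzer uses but the analysis block does not define. It knows nothing about built-in components, so built-in names such as `lowercase` would be reported as well.

```python
from typing import Dict, List


def undefined_references(analysis: Dict[str, dict]) -> List[str]:
    """List component names used by custom analyzers but not defined locally."""
    missing = []
    for name, spec in analysis.get("analyzer", {}).items():
        if spec.get("type") != "custom":
            continue  # only custom analyzers reference other components here
        tokenizer = spec.get("tokenizer")
        if tokenizer is not None and tokenizer not in analysis.get("tokenizer", {}):
            missing.append(f"{name}: tokenizer '{tokenizer}'")
        for section in ("filter", "char_filter"):
            for ref in spec.get(section, []):
                if ref not in analysis.get(section, {}):
                    missing.append(f"{name}: {section} '{ref}'")
    return missing


# The custom analyzer example, as a nested dict.
analysis = {
    "analyzer": {
        "myAnalyzer2": {
            "type": "custom",
            "tokenizer": "myTokenizer1",
            "filter": ["myTokenFilter1", "myTokenFilter2"],
            "char_filter": ["my_html"],
            "position_offset_gap": 256,
        }
    },
    "tokenizer": {"myTokenizer1": {"type": "standard", "max_token_length": 900}},
    "filter": {
        "myTokenFilter1": {
            "type": "stop",
            "stopwords": ["stop1", "stop2", "stop3", "stop4"],
        },
        "myTokenFilter2": {"type": "length", "min": 0, "max": 2000},
    },
    "char_filter": {
        "my_html": {
            "type": "html_strip",
            "escaped_tags": ["xxx", "yyy"],
            "read_ahead": 1024,
        }
    },
}

problems = undefined_references(analysis)
print(problems)  # [] because every referenced name is defined
```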