Solr configuration files: schema.xml - 白红宇

Published: 2019-06-29

The following is a walkthrough of the schema.xml configuration file:

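1. <types></types>: as the name suggests, this element declares which data types are available. These include both the types that ship with Solr and any custom types you define.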

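2. The string type. As the comment above its definition in schema.xml explains, the string type is not analyzed (tokenized); its values are indexed and stored verbatim.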
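The definition that item 2 refers to did not survive in this post. As a rough sketch, this is how the string type looks in the stock example schema shipped with Solr releases of that era; the `sku` field is a hypothetical name added only to show how a field uses the type.

```xml
<!-- The StrField type is not analyzed, but indexed/stored verbatim. -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>

<!-- Hypothetical field using the type: matched only as an exact, whole value -->
<field name="sku" type="string" indexed="true" stored="true"/>
```

Because the value is kept as a single token, a string field suits identifiers, exact-match filters, and facets rather than free-text search.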

3. Numeric types. These are the default numeric types; for faster range queries (not sorting), use the tint/tfloat/tlong/tdouble types instead.

<!--

Default numeric field types. For faster range queries, consider the tint/tfloat/tlong/tdouble types.

-->
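The type definitions themselves are missing from this post. A minimal sketch, assuming the stock example schema of Trie-based Solr releases, shows that the default and the "t" variants use the same class and differ only in `precisionStep`:

```xml
<!-- Default numeric type: precisionStep="0" indexes a single term per value -->
<fieldType name="int" class="solr.TrieIntField" precisionStep="0" omitNorms="true" positionIncrementGap="0"/>

<!-- "t" variant: extra lower-precision terms are indexed to speed up range queries -->
<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" omitNorms="true" positionIncrementGap="0"/>
```

The trade-off is a somewhat larger index in exchange for faster range queries. Later Solr releases replace the Trie types with Point-based types, so treat this as specific to the versions the post describes.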

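4. Date type: for faster range queries, use tdate. (Note that the comment is about range queries rather than sorting. Reading it, I realize my own date field does not use tdate, so I need to change that...)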

Note: For faster range queries, consider the tdate type
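As with the numeric types, the date definitions are not shown in this post. The sketch below assumes the stock example schema of the same era; the `pubdate` field is a hypothetical name.

```xml
<!-- Default date type -->
<fieldType name="date" class="solr.TrieDateField" omitNorms="true" precisionStep="0" positionIncrementGap="0"/>

<!-- Date type tuned for faster range queries -->
<fieldType name="tdate" class="solr.TrieDateField" omitNorms="true" precisionStep="6" positionIncrementGap="0"/>

<!-- Hypothetical field that would benefit from tdate -->
<field name="pubdate" type="tdate" indexed="true" stored="true"/>
```

A range query such as `pubdate:[NOW-7DAYS TO NOW]` is the kind of request that tdate is meant to speed up.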

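5. The field type dedicated to analyzed (tokenized) text. Its definition specifies which tokenizer and filters to use, and it can be customized by hand.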

<!-- The opening tags and tokenizer lines were lost from this listing; the fieldType attributes below follow the stock Solr example schema. -->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- tokenizer definition missing from the source listing -->
    <!-- in this example, we will only use synonyms at query time -->
    <!-- Case insensitive stop word removal.
         add enablePositionIncrements=true in both the index and query
         analyzers to leave a 'gap' for more accurate phrase queries.
    -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
    />
  </analyzer>
  <analyzer type="query">
    <!-- tokenizer definition missing from the source listing -->
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
    />
  </analyzer>
</fieldType>

The remaining types are rarely used. They are likewise distinguished by the analyzers they configure, in the same way as item 5.
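To make item 5 concrete, here is a minimal sketch of a complete analyzed text type. The type name `text_example` is hypothetical, and the choice of tokenizer and filters is an assumption based on common Solr factories, not the author's actual configuration. It applies synonyms only at query time, as the comment in the listing describes.

```xml
<fieldType name="text_example" class="solr.TextField" positionIncrementGap="100">
  <!-- Index time: split on whitespace, drop stop words, lowercase -->
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <!-- Query time: same chain, plus synonym expansion -->
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Keeping the index and query chains nearly identical means a query term is normalized the same way as the indexed term it should match. For Chinese text, the whitespace tokenizer would need to be swapped for a tokenizer that can segment Chinese.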

6. Index field definitions.

<fields>
<!-- Valid attributes for fields:
     name: mandatory - the name for the field
     type: mandatory - the name of a previously defined type from the
       <types> section
     indexed: true if this field should be indexed (searchable or sortable)
     stored: true if this field should be retrievable
     multiValued: true if this field may contain multiple values per document
     omitNorms: (expert) set to true to omit the norms associated with
       this field (this disables length normalization and index-time
       boosting for the field, and saves some memory). Only full-text
       fields or fields that need an index-time boost need norms.
     termVectors: [false] set to true to store the term vector for a
       given field.
       When using MoreLikeThis, fields used for similarity should be
       stored for best performance.
     termPositions: Store position information with the term vector.
       This will increase storage costs.
     termOffsets: Store offset information with the term vector. This
       will increase storage costs.
     default: a value that should be used if no value is specified
       when adding a document.
-->
</fields>

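id: the unique identifier of each document in the index.

termVectors="true" is mainly used for related-content search (MoreLikeThis).

multiValued="true" is typically used when several fields are combined into a single field.

The combined field that queries run against is therefore usually defined as multiValued.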
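The field definitions discussed above are not reproduced in this post. The following is a sketch only: `title`, `body`, and `text` are hypothetical field names, and the `string` and `text` types are assumed to be defined as in items 2 and 5.

```xml
<fields>
  <!-- Unique identifier of each document -->
  <field name="id" type="string" indexed="true" stored="true" required="true"/>

  <!-- termVectors="true" so the field can serve related-content (MoreLikeThis) search -->
  <field name="title" type="text" indexed="true" stored="true" termVectors="true"/>
  <field name="body" type="text" indexed="true" stored="true"/>

  <!-- Catch-all search field: receives values from several fields, hence multiValued -->
  <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
</fields>

<uniqueKey>id</uniqueKey>

<!-- Copy several source fields into the single catch-all field -->
<copyField source="title" dest="text"/>
<copyField source="body" dest="text"/>
```

The catch-all field has to be multiValued because each copyField directive adds another value to it for the same document.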

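7. <dynamicField name="*_i" type="int" indexed="true" stored="true"/> declares a dynamic field: any field whose name matches the pattern *_i is treated as an int field without being declared individually. I have not used it yet.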
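As an illustration of how the dynamic field in item 7 would be used, here is a hypothetical document in Solr's XML update format. The field name `views_i` and both values are made up for the example.

```xml
<add>
  <doc>
    <field name="id">doc-1</field>
    <!-- "views_i" is not declared in schema.xml; it matches the *_i pattern
         and is therefore indexed and stored as an int -->
    <field name="views_i">42</field>
  </doc>
</add>
```

This lets an application add new integer attributes without editing schema.xml for each one, as long as their names follow the suffix convention.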

Reposted from: https://blog.51cto.com/yjflinchong/1164962
