Lucene concepts

Document

A Document describes one unit of content to be indexed, such as an HTML page, an email, or a text file. A Document is made up of a series of Fields. You can think of a database record as a Document and its columns as Field objects.

Field

A Field describes one property of a Document. For example, an email's title and content can be described by two Fields.
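To make the Document/Field relationship concrete, here is a minimal stdlib-only sketch. ToyDocument and ToyField are hypothetical names invented for this illustration; they are not Lucene's Document and Field classes, which carry more options (such as whether a field is stored or tokenized).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy types for illustration only; NOT Lucene's API.
final class ToyField {
    final String name;
    final String value;

    ToyField(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

final class ToyDocument {
    private final List<ToyField> fields = new ArrayList<>();

    // Adds a named field and returns this document so calls can be chained.
    ToyDocument add(String name, String value) {
        fields.add(new ToyField(name, value));
        return this;
    }

    // Returns the value of the first field with the given name, or null if absent.
    String get(String name) {
        for (ToyField f : fields) {
            if (f.name.equals(name)) {
                return f.value;
            }
        }
        return null;
    }

    int fieldCount() {
        return fields.size();
    }
}
```

An email would then be built as new ToyDocument().add("title", ...).add("content", ...), one field per property.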

Analyzer

Before a Document is indexed, its content must first be tokenized (split into terms); this is the Analyzer's job. Analyzer is an abstract class with many implementations, and you should choose the one that fits the language of the text. After analysis, the resulting tokens are passed to the IndexWriter to build the index.
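The analysis step can be pictured with a small stdlib-only sketch. ToyAnalyzer is a hypothetical class, not a Lucene Analyzer; it assumes the simplest possible rule (lowercase, then split on anything that is not a letter or digit). That rule works passably for English but not for languages written without spaces, which is one reason the choice of Analyzer depends on the language.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical toy analyzer for illustration only; NOT Lucene's Analyzer API.
final class ToyAnalyzer {
    // Lowercases the text and splits it on runs of non-letter, non-digit characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) { // split can yield an empty leading element
                tokens.add(part);
            }
        }
        return tokens;
    }
}
```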

IndexWriter

IndexWriter is the core class Lucene uses to build an index. Its job is to add each Document to the index.
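As a rough picture of what "adding a Document to the index" means, the sketch below builds a tiny inverted index (term to document ids) in memory. ToyIndexWriter and its methods are hypothetical and stdlib-only; Lucene's real IndexWriter also handles analyzers, segments, and on-disk storage, none of which is modeled here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical toy index writer for illustration only; NOT Lucene's IndexWriter.
final class ToyIndexWriter {
    // Inverted index: each term maps to the ids of the documents containing it.
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();
    private int nextDocId = 0;

    // Tokenizes the text with a naive rule and records each term under a new document id.
    int addDocument(String text) {
        int docId = nextDocId++;
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (term.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Returns the ids of documents containing the term (empty if the term is unknown).
    SortedSet<Integer> docsFor(String term) {
        return postings.getOrDefault(term, Collections.emptySortedSet());
    }
}
```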

Directory

This class represents the location where a Lucene index is stored. It is an abstract class. Two common implementations are FSDirectory, which stores the index in the file system, and RAMDirectory, which keeps the index in memory.

Query

Query is an abstract class with many implementations, such as TermQuery, BooleanQuery, and PrefixQuery. Its purpose is to represent the user's query string in a form that Lucene can understand.
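The "one abstract class, many implementations" design can be sketched as follows. All Toy* names are hypothetical and stdlib-only; they only mimic the shape of TermQuery, PrefixQuery, and BooleanQuery. In particular, ToyAndQuery assumes every clause is required, whereas Lucene's BooleanQuery also supports optional and prohibited clauses.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;

// Hypothetical toy query types for illustration only; NOT Lucene's Query API.
abstract class ToyQuery {
    // Returns true if a document with these terms satisfies the query.
    abstract boolean matches(Set<String> docTerms);
}

final class ToyTermQuery extends ToyQuery {
    private final String term;

    ToyTermQuery(String term) {
        this.term = term;
    }

    @Override
    boolean matches(Set<String> docTerms) {
        return docTerms.contains(term);
    }
}

final class ToyPrefixQuery extends ToyQuery {
    private final String prefix;

    ToyPrefixQuery(String prefix) {
        this.prefix = prefix;
    }

    @Override
    boolean matches(Set<String> docTerms) {
        for (String t : docTerms) {
            if (t.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}

// Simplified stand-in for a boolean query: every clause must match.
final class ToyAndQuery extends ToyQuery {
    private final List<ToyQuery> clauses;

    ToyAndQuery(ToyQuery... clauses) {
        this.clauses = Arrays.asList(clauses);
    }

    @Override
    boolean matches(Set<String> docTerms) {
        for (ToyQuery q : clauses) {
            if (!q.matches(docTerms)) {
                return false;
            }
        }
        return true;
    }
}
```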

IndexSearcher

IndexSearcher is used to search a built index. It opens the index in read-only mode, so multiple IndexSearcher instances can operate on the same index at the same time.
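The read-only point can be illustrated with a stdlib-only sketch: because the searcher never modifies the index, several searchers can safely share one. ToyIndexSearcher and the map-based index are hypothetical stand-ins, not Lucene's IndexSearcher or its index format.

```java
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical toy searcher for illustration only; NOT Lucene's IndexSearcher.
final class ToyIndexSearcher {
    // Read-only view of a term -> document ids index built elsewhere.
    private final Map<String, List<Integer>> postings;

    ToyIndexSearcher(Map<String, List<Integer>> postings) {
        this.postings = Collections.unmodifiableMap(postings);
    }

    // Looks up a single term and returns an unmodifiable list of matching document ids.
    List<Integer> search(String term) {
        List<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        if (ids == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(ids);
    }
}
```

Two ToyIndexSearcher instances created over the same map return the same results, and neither can change the index through the lists it hands back.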

Hits

Hits holds the results of a search. (Hits was removed in Lucene 3.0; later versions return TopDocs instead.)