BurningBright

lucene concept

Posted on 2017-01-28 | Edited on 2018-12-16 | In search

Document

A Document describes a single item to be indexed: it can be an HTML page, an email, or a text file. A Document is made up of a series of Fields. You can think of a database record as a Document and its columns as Field objects.

Field

A Field describes one property of a Document. For example, an email's title and content can be described by two Fields.

Analyzer

Before a Document is indexed, its content must first be split into tokens; the Analyzer does this job. Analyzer is an abstract class with many implementations, and you should choose the one that suits the language of the text. After analysis, the tokens are handed to IndexWriter to build the index.

IndexWriter

IndexWriter is the core class Lucene uses to build an index; its job is to add every Document to the index.

Directory

This class represents the location where Lucene stores the index. It is an abstract class; two well-known implementations are FSDirectory, which keeps the index in the file system, and RAMDirectory, which keeps the index in memory. (RAMDirectory is deprecated in recent Lucene releases in favor of ByteBuffersDirectory.)

Query

Query is an abstract class with many implementations, such as TermQuery, BooleanQuery, and PrefixQuery. Its task is to wrap the user's query string in a Query object that Lucene can recognize.

IndexSearcher

IndexSearcher is used to search a built index. It opens the index in read-only mode, so many IndexSearcher instances can operate on a single index at the same time.

Hits

Hits holds the search results. (The Hits class was removed in Lucene 3.0; later versions return results as TopDocs.)

Apache Lucene Core

Posted on 2017-01-27 | Edited on 2020-09-17 | In search

Apache Lucene™ is a high-performance, full-featured text search engine library
written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.
Apache Lucene is an open source project available for free download.

Features

Lucene offers powerful features through a simple API:

Scalable, High-Performance Indexing

  • over 150GB/hour on modern hardware
  • small RAM requirements — only 1MB heap
  • incremental indexing as fast as batch indexing
  • index size roughly 20-30% the size of text indexed

Powerful, Accurate and Efficient Search Algorithms

  • ranked searching — best results returned first
  • many powerful query types: phrase queries, wildcard queries, proximity queries, range queries and more
  • fielded searching (e.g. title, author, contents)
  • sorting by any field
  • multiple-index searching with merged results
  • allows simultaneous update and searching
  • flexible faceting, highlighting, joins and result grouping
  • fast, memory-efficient and typo-tolerant suggesters
  • pluggable ranking models, including the Vector Space Model and Okapi BM25
  • configurable storage engine (codecs)

Cross-Platform Solution

  • Available as Open Source software under the Apache License which lets you use Lucene in both commercial and Open Source programs
  • 100%-pure Java
  • Implementations in other programming languages available that are index-compatible

The Apache Software Foundation

The Apache Software Foundation provides support for the Apache community of open-source software projects. The Apache projects are defined by collaborative consensus based processes, an open, pragmatic software license and a desire to create high quality software that leads the way in its field. Apache Lucene, Apache Solr, Apache PyLucene, Apache Open Relevance Project and their respective logos are trademarks of The Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their respective owners.

collection and array

Posted on 2017-01-26 | Edited on 2020-09-17 | In java

Collection to array

  1. Object[]

    Object[] listArray = list.toArray();
  2. specific array

    String[] listArray = list.toArray(new String[0]); // assumes list is a List<String>

Note: this cannot be used to create an array of a generic (parameterized) type.

Array to collection

List<String> list = Arrays.asList(array); // array is a String[]

Note: this does not work for arrays of primitive types; the elements must be objects.
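The conversions above can be sketched in one small class. This is a minimal illustration, not part of the original post: the class name CollectionArrayDemo and its helper methods are hypothetical, and the element type String is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
final class CollectionArrayDemo {

    // Collection -> typed array. The zero-length argument only supplies the runtime type.
    static String[] toStringArray(List<String> list) {
        return list.toArray(new String[0]);
    }

    // Array -> List. Arrays.asList returns a fixed-size view of the array,
    // so copy it into an ArrayList when the result must support add/remove.
    static List<String> toMutableList(String[] array) {
        return new ArrayList<>(Arrays.asList(array));
    }

    // Primitive arrays have to be boxed element by element.
    static List<Integer> boxInts(int[] values) {
        List<Integer> boxed = new ArrayList<>(values.length);
        for (int v : values) {
            boxed.add(v);
        }
        return boxed;
    }

    public static void main(String[] args) {
        String[] names = toStringArray(Arrays.asList("a", "b"));
        System.out.println(names.length);
        System.out.println(toMutableList(names));
        System.out.println(boxInts(new int[] {1, 2, 3}));
        // Passing an int[] to Arrays.asList yields a one-element List<int[]>.
        System.out.println(Arrays.asList(new int[] {1, 2, 3}).size());
    }
}
```

The last line of main shows why primitive arrays need separate handling: the whole int[] becomes a single list element instead of three boxed values.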

gzip stream

Posted on 2017-01-26 | Edited on 2018-12-16 | In java

Use GZIPOutputStream or ZipOutputStream to compress the output.
Whether the destination is a file or a socket is not the point. Usually we use the generic InputStream and OutputStream types to represent the data source and the compressed output.

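import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

// Buffer size is an assumption; the original snippet does not define BUFFER.
private static final int BUFFER = 1024;

public static void compress(InputStream is, OutputStream os) throws IOException {
    // Everything written to gos is compressed and forwarded to os.
    GZIPOutputStream gos = new GZIPOutputStream(os);

    int count;
    byte[] data = new byte[BUFFER];
    while ((count = is.read(data, 0, BUFFER)) != -1) {
        gos.write(data, 0, count);
    }

    gos.finish(); // writes the remaining compressed data and the GZIP trailer
    gos.flush();
    gos.close();  // also closes os
}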
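To check that the compressed stream is readable again, a round trip through GZIPInputStream can be sketched as follows. This is only an illustration under assumptions: the class GzipRoundTrip, its method names, and the 1024-byte buffer are hypothetical, and in-memory byte arrays stand in for the file or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; the names are illustrative only.
final class GzipRoundTrip {
    private static final int BUFFER = 1024; // assumed buffer size

    // Copies everything from in to out using a fixed-size buffer.
    static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] data = new byte[BUFFER];
        int count;
        while ((count = in.read(data, 0, BUFFER)) != -1) {
            out.write(data, 0, count);
        }
    }

    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // try-with-resources closes the stream, which also writes the GZIP trailer.
        try (GZIPOutputStream gos = new GZIPOutputStream(sink)) {
            copy(new ByteArrayInputStream(raw), gos);
        }
        return sink.toByteArray();
    }

    static byte[] decompress(byte[] gz) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(gz))) {
            copy(gis, sink);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello hello hello gzip".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

Using try-with-resources here means the trailer is written even if the copy loop throws, which the explicit finish/flush/close sequence does not guarantee.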

url encoding

Posted on 2017-01-24 | Edited on 2020-09-17 | In tool
  • RFC 3986, section 2.2, reserved characters (January 2005)

    ! * ' ( ) ; : @ & = + $ , / ? # [ ]
  • RFC 3986, section 2.3, unreserved characters (January 2005)

    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    a b c d e f g h i j k l m n o p q r s t u v w x y z
    0 1 2 3 4 5 6 7 8 9 - _ . ~
  • RFC 2396, URI Generic Syntax, reserved characters (August 1998)

    ; / ? : @ & = + $ ,
  • RFC 2396, URI Generic Syntax, unreserved characters (August 1998)

    alphanum or mark
    mark = - _ . ! ~ * ' ( )

Java follows the older convention.
For compatibility with browsers, java.net.URLEncoder leaves unencoded a set of characters close to the RFC 3986 unreserved set, except that '~' is not included and '*' is added.

/*
* Unreserved characters can be escaped without changing the
* semantics of the URI, but this should not be done unless the
* URI is being used in a context that does not allow the
* unescaped character to appear.
*/
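The difference between Java's set and the RFC 3986 unreserved set can be observed directly with java.net.URLEncoder. The sketch below is an illustration only: the class UrlEncodeDemo and the helper encodeRfc3986 are hypothetical names, and the post-processing in that helper is a common workaround rather than anything from the original post.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class; the names are illustrative only.
final class UrlEncodeDemo {

    // Plain URLEncoder output (form encoding): space becomes '+',
    // '*' is left as-is, and '~' is percent-encoded.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Adjusts URLEncoder output toward the RFC 3986 unreserved set:
    // spaces as %20, '*' encoded, '~' left unencoded.
    static String encodeRfc3986(String s) {
        return encode(s)
                .replace("+", "%20")
                .replace("*", "%2A")
                .replace("%7E", "~");
    }

    public static void main(String[] args) {
        System.out.println(encode("a~b*c d"));        // a%7Eb*c+d
        System.out.println(encodeRfc3986("a~b*c d")); // a~b%2Ac%20d
    }
}
```

Replacing "+" is safe here because URLEncoder already encodes a literal plus sign as %2B, so any "+" left in its output can only stand for a space.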

regular expression

/^((ht|f)tps?):\/\/[\w\-]+(\.[\w\-]+)+([\w\-\.,@?^=%&:\/~\+#]*[\w\-\@?^=%&\/~\+#])?$/
  • must start with http, https, ftp, or ftps
  • cannot contain double-byte characters or any character outside the allowed set
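The same pattern can be tried out in Java. This is a sketch under assumptions: the class UrlPatternDemo and the method isValid are hypothetical names, and the expression is a direct transcription of the one above with the JavaScript delimiters and slash escapes removed.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class UrlPatternDemo {

    // Same expression as above, written as a Java string literal.
    private static final Pattern URL = Pattern.compile(
            "^((ht|f)tps?)://[\\w\\-]+(\\.[\\w\\-]+)+"
            + "([\\w\\-.,@?^=%&:/~+#]*[\\w\\-@?^=%&/~+#])?$");

    static boolean isValid(String candidate) {
        return URL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("http://example.org/blog"));  // true
        System.out.println(isValid("ftps://files.example.com")); // true
        System.out.println(isValid("mailto:user@example.org"));  // false: scheme not allowed
        System.out.println(isValid("http://localhost"));         // false: host needs at least one dot
        System.out.println(isValid("http://例え.jp"));            // false: \w is ASCII-only by default
    }
}
```

Note that the pattern requires at least one dot in the host part, so single-label hosts such as localhost are rejected.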

mybatis foreach error

Posted on 2017-01-24 | Edited on 2018-12-16 | In java

Parameter ‘__frch_item_0’ not found. Available parameters are [list]

Mybatis parameter in list

  1. Check whether parameterType is java.util.List. If it is, the collection attribute of foreach must be "list". When a List instance or an array is passed to MyBatis as the parameter object, MyBatis automatically wraps it in a Map keyed by name: a List instance is stored under the key "list", and an array under the key "array".
  2. Does the list passed to foreach contain any values?
  3. Is a property name inside foreach misspelled?
  4. The field is set to auto-increment in MyBatis but not in MySQL.
  5. A property referenced on item is not correct.

Note: using a Map instead of a bean saves work, but if a column in the query result is null, the corresponding key is missing from the Map.

hexo install

Posted on 2017-01-24 | Edited on 2019-01-09 | In blog

Install & theme

use cnpm
npm install -g cnpm --registry=https://registry.npm.taobao.org

sudo cnpm install hexo-cli -g
hexo init blog
cd blog

cnpm install
hexo server

# optional, may be skipped
git clone https://github.com/tufu9441/maupassant-hexo.git themes/maupassant
npm install hexo-renderer-jade --save
npm install hexo-renderer-sass --save

npm install hexo-tag-katex --save

Optimization

  • Go to themes\landscape\layout\_partial, open head.ejs, and delete line 31, which references fonts.googleapis.com.

  • Download jquery-2.0.3.min.js and put it into themes\landscape\source\js. Then go to themes\landscape\layout\_partial, open after-footer.ejs, and replace the script path on line 17 with /js/jquery-2.0.3.min.js.

mybatis foreach

Posted on 2017-01-24 | Edited on 2018-12-16 | In java
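| Option | Description |
| --- | --- |
| item | The current object in the loop body. Dotted property paths are supported, such as item.age or item.info.details. In a List or array it is the element; in a Map it is the value. Required. |
| collection | The object to iterate over. When it is the parameter itself, a List<?> is keyed as "list" by default, an array as "array", and a Map as "map". You can also set the key with @Param("keyName"); once keyName is set, list, array, and map no longer apply. The collection can also be a field of the parameter object. For example, if User has a property List ids and the parameter is a User, then collection="ids". If User has a property Ids ids, where Ids is an object with a property List id, and the parameter is a User, then collection="ids.id". These are only examples; the value of collection depends on which element you want to loop over. Required. |
| separator | The separator between elements. For example, in in(), separator="," automatically places "," between elements, which avoids SQL errors from hand-typed commas such as in(1,2,). Optional. |
| open | The opening symbol of the foreach output, usually "(" paired with close=")". Commonly used with in() and values(). Optional. |
| close | The closing symbol of the foreach output, usually ")" paired with open="(". Commonly used with in() and values(). Optional. |
| index | In a List or array, index is the position of the element; in a Map, index is the key of the element. Optional. |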

select count(*) from users WHERE id in ( ? , ? )

<select id="countByUserList" resultType="_int" parameterType="list">
  select count(*) from users
  <where>
    id in
    <foreach item="item" collection="list" separator="," open="(" close=")" index="">
      #{item.id, jdbcType=NUMERIC}
    </foreach>
  </where>
</select>

insert into deliver select ?,? from dual union all select ?,? from dual

<insert id="addList">
  INSERT INTO DELIVER
  (
    <include refid="selectAllColumnsSql"/>
  )

  <foreach collection="deliverList" item="item" separator="UNION ALL">
    SELECT
    #{item.id, jdbcType=NUMERIC},
    #{item.name, jdbcType=VARCHAR}
    FROM DUAL
  </foreach>
</insert>

insert into string_string (key, value) values (?, ?) , (?, ?)

<insert id="ins_string_string">
  insert into string_string (key, value) values
  <foreach item="item" index="key" collection="map"
           open="" separator="," close="">(#{key}, #{item})</foreach>
</insert>

select count(*) from key_cols where col_a = ? AND col_b = ?

<select id="sel_key_cols" resultType="int">
  select count(*) from key_cols where
  <foreach item="item" index="key" collection="map"
           open="" separator="AND" close="">${key} = #{item}</foreach>
</select>

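Note: pay attention to the difference between $ and #. A $ parameter is written directly into the SQL text, while a # parameter is replaced with ? and its value is bound when the statement executes. Because ${key} is inserted as-is, it is open to SQL injection if the keys come from untrusted input.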

skip_render

Posted on 2017-01-24 | Edited on 2018-12-16 | In blog

Skip rendering for a source directory

Set skip_render in _config.yml to exclude files from rendering, for example:

skip_render: Demo/*.html

For more complicated cases, use glob-style patterns (loosely based on regular expressions):

  1. All files directly under a single directory:

    skip_render: demo/*
  2. Files of one type under a single directory:

    skip_render: demo/*.html
  3. All files under a single directory, including its subdirectories:

    skip_render: demo/**
  4. Several patterns combined:

    skip_render:
    - 'demo/*.html'
    - 'demo/**'

Hello Hexo

Posted on 2017-01-22 | Edited on 2018-12-16 | In blog

hello hexo

Command

$ hexo init [folder]
$ hexo new [layout] <title>
$ hexo generate
| Option | Description |
| --- | --- |
| -d, --deploy | Deploy after generation finishes |
| -w, --watch | Watch file changes |

config

  Website in subdirectory

If your website is in a subdirectory (such as http://example.org/blog) set url to http://example.org/blog and set root to /blog/
Ctrl+Alt+N

  • hexo install
  • hexo server
  • hexo deploy
  1. hexo install
  2. hexo server
  3. hexo deploy
public static void main(String[] args)

Leon

© 2017 – 2022 Leon
Powered by Hexo v3.9.0
|
Theme – NexT.Muse v7.1.2