2008-07-15 17:25:10

MySQL Tritonn study


Tritonn is a patched version of MySQL that provides better full-text search by integrating the Senna full-text search engine.

MySQL has supported FULLTEXT indexes since version 3.23.23. With a FULLTEXT index, MySQL can run full-text searches on VARCHAR and TEXT columns. However, MySQL's built-in full-text search implementation has the following problems:

* Insufficient Japanese/Chinese/Korean support
* Slow phrase search
* Slow update

With Tritonn, you get multilingual (M17N) full-text search, faster phrase search, and faster updates WITHOUT modifying your application.




Features

* Supports MySQL version 5.0 and 5.1
* Supports MATCH ... AGAINST queries in BOOLEAN MODE and in the default mode (natural language query mode, NLQ MODE)
* In BOOLEAN MODE, you can use all of the operators: +, -, <, >, (, ), ~, *, "
* Supports the Japanese encodings EUC and SJIS
* Supports Unicode with UTF8
* Supports normalization. With UTF8, NFKC normalization is supported
* Supports similar document search
* Supports near words search
* Supports the MyISAM storage engine.
* Supports a snippet (KWIC) function implemented as MySQL user-defined functions.
* Supports a Japanese word index (with MeCab), an N-gram (bi-gram) index, and a space-delimited index.
* The 2ind patch lets MySQL use a FULLTEXT index and a normal B-tree index at the same time.
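
To make the NFKC normalization item above more concrete, here is a minimal sketch of what NFKC does to text. It uses Python's standard `unicodedata` module purely as an illustration; it is not Tritonn's or Senna's code, and the helper name `nfkc` is made up for this example.

```python
import unicodedata


def nfkc(text):
    """Return the NFKC-normalized form of text (illustrative helper, not a Tritonn API)."""
    return unicodedata.normalize("NFKC", text)


# Full-width Latin letters are folded to their ASCII equivalents.
print(nfkc("ＭｙＳＱＬ"))

# Half-width katakana are folded to ordinary full-width katakana.
print(nfkc("ﾍﾟﾝ"))
```

With this kind of normalization applied to both the indexed text and the query, variants such as "ＭｙＳＱＬ" and "MySQL" can be treated as the same term.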

Example

The following example shows MySQL's built-in full-text search on English text, which works as expected.

[test] > SET NAMES utf8;
Query OK, 0 rows affected (0.00 sec)

[test] > CREATE TABLE t1 (c1 TEXT, FULLTEXT INDEX idx (c1)) ENGINE = MyISAM DEFAULT CHARSET utf8;
Query OK, 0 rows affected (0.00 sec)

[test] > INSERT INTO t1 VALUES ("I have a pen."), ("May I Help You?"), ("Have a nice day.");
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0

[test] > SELECT * FROM t1 WHERE MATCH(c1) AGAINST("nice");
+------------------+
| c1               |
+------------------+
| Have a nice day. |
+------------------+
1 row in set (0.00 sec)

The next example runs the same kind of search with MySQL's built-in full-text search on Japanese text, and the result is empty.

[test] > drop table t1;
Query OK, 0 rows affected (0.00 sec)

[test] > CREATE TABLE t1 (c1 TEXT, FULLTEXT INDEX idx (c1)) ENGINE = MyISAM DEFAULT CHARSET utf8;
Query OK, 0 rows affected (0.00 sec)

[test] > INSERT INTO t1 VALUES("私はペンを持っています。"), ("いらっしゃいませ~"), ("良い一日を。");
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0

[test] > SELECT * FROM t1 WHERE MATCH(c1) AGAINST("良い");
Empty set (0.00 sec)

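This is because MySQL's built-in full-text implementation splits text into 'keywords' at spaces. Japanese text does not separate words with spaces, so the whole sentence is indexed as a single keyword and a search for one word inside it finds nothing.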
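
The difference between space-delimited and bi-gram tokenization can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the actual MySQL or Senna tokenizer; the function names `split_on_spaces` and `bigrams` are made up for this example, and details such as stopwords and minimum word length are ignored.

```python
def split_on_spaces(text):
    """Split text into keywords at whitespace (simplified space-delimited index)."""
    return text.split()


def bigrams(text):
    """Split text into overlapping two-character tokens (simplified bi-gram index)."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


english = "Have a nice day."
japanese = "良い一日を。"

# English: the query word is one of the space-delimited keywords.
print("nice" in split_on_spaces(english))

# Japanese: the sentence contains no spaces, so it becomes a single keyword
# and the query word is not found among the keywords.
print("良い" in split_on_spaces(japanese))

# With bi-grams, the query word appears as one of the indexed tokens.
print("良い" in bigrams(japanese))
```

A bi-gram index does not need to know where words begin and end, which is why it suits languages written without spaces; a MeCab-based word index is the alternative that relies on morphological analysis instead.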

Finally, the following example shows Tritonn's full-text search on the same Japanese text, which works as expected.

[test] > SELECT * FROM t1 WHERE MATCH(c1) AGAINST("良い");
+------------------+
| c1               |
+------------------+
| 良い一日を。 |
+------------------+
1 row in set (0.00 sec)




References

http://sourceforge.net/projects/tritonn


http://labs.cybozu.co.jp/blog/kazuho/archives/2008/02/triton-embed-primary-key.php

http://qwik.jp/tritonn/perftest.html

Author: 大D QQ