Saturday, October 13, 2007

Using the Sphinx free-text search engine on Windows XP

This entry deals with using (testing) the open source full-text search engine Sphinx on a win32 box. The official Sphinx documentation, as well as other links and tutorials for Sphinx, are Linux oriented, so I believe a step-by-step newbie sort of guide would be helpful. First, go to the Sphinx website and download the zip file containing the binaries. If you feel you are up to compiling Sphinx on win32 boxes, then you should not be reading newbie guides anyway! The assumption is that you already have a working MySQL installation on your Windows box.

http://www.sphinxsearch.com/downloads.html



Download the zip file and extract it into a folder of your choice. The first step in our process is to create a sphinx.conf file, which will be used by all the programs. For simplicity's sake, let's say we have a table that has only three columns: id, title and description. We want to index the text in the title and description fields. Our test table is named "test" and it sits in a database called testdb.
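If you do not have such a table yet, here is a rough sketch of what it could look like. The column types and the two sample rows are my own assumptions, not something Sphinx dictates; the only firm requirement I know of is that id is a unique positive integer. The script does not touch MySQL at all: it dry-runs the statements against an in-memory SQLite database as a syntax check and then prints them, so you can paste them into the mysql client after selecting testdb.

```python
import sqlite3

# Hypothetical schema and sample rows for the "test" table.
# The column types are assumptions chosen to be valid in both MySQL and SQLite.
STATEMENTS = [
    "CREATE TABLE test ("
    "id INTEGER NOT NULL PRIMARY KEY, "
    "title VARCHAR(255), "
    "description TEXT)",
    "INSERT INTO test (id, title, description) VALUES "
    "(1, 'Sphinx on Windows', 'Indexing a MySQL table with the indexer program')",
    "INSERT INTO test (id, title, description) VALUES "
    "(2, 'Search daemon', 'Starting searchd and running a test query')",
]

# The query the indexer will run, as written in sphinx.conf.
INDEXER_QUERY = "SELECT id, title, description FROM test"


def dry_run():
    """Run the statements in a throwaway SQLite database and return the rows
    the indexer query would see. This only checks the SQL, not MySQL itself."""
    conn = sqlite3.connect(":memory:")
    try:
        for statement in STATEMENTS:
            conn.execute(statement)
        return conn.execute(INDEXER_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    rows = dry_run()
    for statement in STATEMENTS:
        print(statement + ";")
    print("-- the indexer query would return", len(rows), "rows")
```

Any table works as long as the first column selected by the indexer query is the document id.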

Below is what our config file looks like:


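source test
{
    type = mysql

    # MySQL connection settings
    sql_host = localhost
    sql_user = gloo
    sql_pass = gloo
    sql_db = testdb
    sql_port = 3306

    # indexer query: the first column must be the unique document id
    sql_query = SELECT id, title, description FROM test

    # document info query, used by the search program to display matching rows
    sql_query_info = SELECT * FROM test WHERE id=$id
}

index test
{
    source = test

    # index files are created with this path prefix
    path = ./test-index

    morphology = stem_en
    min_word_len = 3
    min_prefix_len = 0
    min_infix_len = 3
}

searchd
{
    # port the search daemon listens on
    port = 3312
}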



You have to know a bit about the Sphinx config file; see the documentation on the Sphinx web site for all the options. Now we are ready to index our test table. To do so, go to the folder where you extracted the Sphinx binaries. You should see three programs: indexer, searchd and search.



Go to the command line and type the command "indexer". Make sure that your MySQL service is running before you try to use the indexer. Run without arguments, the indexer lists all possible options; to build the index, run it with the path to your configuration file. Again, read the Sphinx documentation for full information. When the indexer runs, it creates an index of your data. You can verify this by looking inside your folder (or wherever you set the path to the index files): some new files should have been created.
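As a rough sketch of that step, the small script below builds the indexer command line and runs it. It assumes the config file is called sphinx.conf and sits in the current folder, and that the index is named test as in the config above; the helper names are made up for this post. The --config and --all options are the ones the indexer prints in its usage text. If indexer cannot be found, the script only prints a hint instead of failing.

```python
import shutil
import subprocess


def indexer_command(config_path="sphinx.conf", index=None):
    """Build the indexer command line.

    With index=None every index in the config file is rebuilt (--all);
    otherwise only the named index is rebuilt.
    """
    cmd = ["indexer", "--config", config_path]
    cmd.append(index if index else "--all")
    return cmd


def run_indexer(config_path="sphinx.conf", index=None):
    """Run the indexer and return its exit code, or None if it is not found."""
    cmd = indexer_command(config_path, index)
    exe = shutil.which(cmd[0])
    if exe is None:
        print("indexer not found; run this from the Sphinx folder or add it to PATH")
        return None
    return subprocess.run([exe] + cmd[1:]).returncode


if __name__ == "__main__":
    # Equivalent to typing: indexer --config sphinx.conf test
    print(" ".join(indexer_command(index="test")))
    run_indexer(index="test")
```

Typing the printed command directly at the prompt does exactly the same thing; the script is only convenient if you want to re-index on a schedule.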



So now we have used the indexer and created the indexes from our data. But how do we use these indexes? What we do here is start the Sphinx search daemon, searchd, that other programs and APIs can connect to. It is also possible to build Sphinx as a MySQL plug-in, but let's go with the searchd daemon for now. Start the searchd daemon.
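Starting the daemon can be sketched the same way. The script below assumes sphinx.conf is in the current folder and that searchd listens on port 3312 as set in the config above; the helper names are hypothetical. After launching searchd it simply polls the port until something accepts a connection, which is a crude but handy check that the daemon really came up.

```python
import shutil
import socket
import subprocess
import time


def searchd_command(config_path="sphinx.conf"):
    """Build the searchd command line."""
    return ["searchd", "--config", config_path]


def port_is_open(host="localhost", port=3312, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def start_searchd(config_path="sphinx.conf", port=3312, wait_seconds=10):
    """Launch searchd and wait until its port answers.

    Returns the Popen object, or None if searchd is not found.
    """
    cmd = searchd_command(config_path)
    exe = shutil.which(cmd[0])
    if exe is None:
        print("searchd not found; run this from the Sphinx folder or add it to PATH")
        return None
    proc = subprocess.Popen([exe] + cmd[1:])
    deadline = time.time() + wait_seconds
    while time.time() < deadline:
        if port_is_open(port=port):
            print("searchd is listening on port", port)
            return proc
        time.sleep(0.5)
    print("searchd did not open port", port, "within", wait_seconds, "seconds")
    return proc


if __name__ == "__main__":
    start_searchd()
```

If the port never opens, the usual suspects are a wrong path in the config file or an index that has not been built yet.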


The searchd daemon is now listening for connections. To test, use the search client program. Later we will see how to use the PHP API to do searches.
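A test query with the search program can be sketched like this. The search word is only an example, the helper names are made up, and sphinx.conf is again assumed to be in the current folder; --config and --index are options listed in the program's usage text. As far as I know, search reads the index files directly, so it is a quick way to check the index itself.

```python
import shutil
import subprocess


def search_command(words, config_path="sphinx.conf", index="test"):
    """Build the command line for the search test program."""
    return ["search", "--config", config_path, "--index", index] + list(words)


def run_search(words, config_path="sphinx.conf", index="test"):
    """Run a test query and return the exit code, or None if search is not found."""
    cmd = search_command(words, config_path, index)
    exe = shutil.which(cmd[0])
    if exe is None:
        print("search not found; run this from the Sphinx folder or add it to PATH")
        return None
    return subprocess.run([exe] + cmd[1:]).returncode


if __name__ == "__main__":
    # Equivalent to typing: search --config sphinx.conf --index test windows
    run_search(["windows"])
```

For each match, search also runs the sql_query_info query from the config file to show the matching row.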
You will find the following links useful




© Life of a third world developer