How to avoid subword?

Posted by devkbsc 
How to avoid subword?
April 05, 2016 10:41AM
Hello,
My search results also include matches on subwords.

Example:
My search keyword is IST. The results also include depISTage and ISTogram.

How can I avoid displaying these items in my search results?
Tec
Re: How to avoid subword?
April 05, 2016 03:34PM
Well, to my knowledge your post above is not quite correct. If the keyword 'ist' is stored in the db, you will not get search results for 'depistage', because a search for 'ist' only presents keywords starting with 'ist...' in the result listing. Consequently, words like 'istogram' will be found.
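
To make the difference concrete, here is a minimal Python sketch of the three matching behaviours under discussion. It is not Sphider code; the keyword list and function names are hypothetical and only illustrate why a prefix match returns 'istogram' but not 'depistage'.

```python
# Hypothetical keyword list for illustration; not taken from a real index.
KEYWORDS = ["ist", "istogram", "depistage", "list"]


def exact_match(term, keywords):
    """Return only keywords identical to the search term."""
    return [k for k in keywords if k == term]


def prefix_match(term, keywords):
    """Return keywords that start with the search term."""
    return [k for k in keywords if k.startswith(term)]


def substring_match(term, keywords):
    """Return keywords that contain the search term anywhere."""
    return [k for k in keywords if term in k]


if __name__ == "__main__":
    print("exact:    ", exact_match("ist", KEYWORDS))
    print("prefix:   ", prefix_match("ist", KEYWORDS))
    print("substring:", substring_match("ist", KEYWORDS))
```

Only the substring variant would return 'depistage' for the term 'ist', which is the behaviour reported in the question.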

In order to find out whether 'depistage' is also part of your stored keywords, please use a tool like phpMyAdmin or something similar and have a look at the table your_table_prefix_keywords in your database.

As additional info: Sphider-plus offers the option to search 'strictly'. In that mode, no word other than your search term 'ist' will appear in the result listing.

Tec
rap
Re: How to avoid subword?
April 05, 2016 11:31PM
My experience with Sphider shows that only EXACT matches are returned. Provided ist is a stored word, a search for ist would not return depistage. It IS possible that a word like depistage could appear in the result description, but then ist (surrounded by white space) would need to be present somewhere on the returned page.

Example: You have a page which lists all blacksmiths in a province. One of the individuals listed is John Black. You do a search on the word black. On the results page, you may see a page description like this: "Here is an alphabetical list of all blacksmiths in the province:". The search is not actually matching the word blacksmiths. The word Black actually DOES occur elsewhere on the page, even though it may not occur in the description.
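
The blacksmith example can be sketched in Python as follows. This is an illustration only, not Sphider's matching code: it uses a regular-expression word boundary, which is slightly looser than "surrounded by white space" because punctuation also counts as a boundary. The page body and the second name in it are made up for the example.

```python
import re


def contains_whole_word(term, text):
    """Return True if term occurs in text as a standalone word (case-insensitive)."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


# Description shown on the results page (quoted from the example above).
description = "Here is an alphabetical list of all blacksmiths in the province:"

# Hypothetical full page text: the description plus the list of names.
page_body = description + " John Black, Mary Jones"

if __name__ == "__main__":
    # The description alone does not contain the standalone word.
    print(contains_whole_word("black", description))
    # The full page does, because of the name "John Black".
    print(contains_whole_word("black", page_body))
```

The page is returned because of the whole-word hit in the body, while the description that happens to be displayed only contains the longer word.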

This is true both for the original Sphider and my mod. The logic of Sphider remains unchanged. The changes concern language support, security, and the removal of deprecated code. The only "new" logic in the mod isn't really new (or original). It is the incorporation of the wildcard mod written by Tec and presented in this forum (with code changes reflecting the rest of the updated database access and security). (As an aside, Tec is given credit for the wildcard function inside the code.) After much testing, the wildcard function IS reliable and partial matches occur ONLY if a wildcard is used.

I have to wonder, did you also change the default keyword length prior to indexing? The DEFAULT keyword length is 5. Unless this has been changed, a search for ist would return ZERO results due to insufficient word length.
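
As a rough sketch of what a minimum keyword length does at indexing time, assuming the value of 5 quoted above (the constant and function names are hypothetical, not Sphider settings):

```python
# Minimum length as quoted in the post above; the name is a made-up placeholder.
MIN_WORD_LENGTH = 5


def indexable_words(text, min_length=MIN_WORD_LENGTH):
    """Return the lowercased words that are long enough to be stored as keywords."""
    return [w for w in text.lower().split() if len(w) >= min_length]


if __name__ == "__main__":
    sample = "ist istogram depistage"
    # With the quoted default, 'ist' is too short to be indexed.
    print(indexable_words(sample))
    # With a lower limit, 'ist' is kept as a keyword.
    print(indexable_words(sample, min_length=3))
```

If 'ist' never makes it into the keyword table, an exact search for it has nothing to match.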
Re: How to avoid subword?
April 06, 2016 10:31AM
Thank you very much for your replies. I will try to be clearer in my next post.

I have found a solution for this problem. It's in the "Spider settings". I need to uncheck the box

"Use word stemming (e.g. find sites containing "runs" and "running" when searching for "run"). Should be enabled before indexing"