Foreign Languages Not Indexing

Posted by Toomster 
Foreign Languages Not Indexing
July 16, 2015 07:45PM
We have been having trouble with foreign languages not being indexed.
Character recognition is a problem, and languages like Japanese and Chinese simply don't get indexed.

Do you have any suggestions?
Tec
Re: Foreign Languages Not Indexing
July 16, 2015 10:20PM
The original Sphider is based on the ISO-8859-1 charset, which is adequate for English. To index (and search) international languages, however, you need UTF-8 support, which Sphider-plus provides. Besides UTF-8 support, several language-specific features are implemented. Some examples:
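To see why the charset matters, here is a minimal Python sketch (not Sphider or Sphider-plus code; the helper name fits_charset is made up for illustration). It checks whether a string can be represented in a given charset at all:

```python
def fits_charset(text: str, charset: str) -> bool:
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# ISO-8859-1 covers Western European letters but no CJK characters.
english_ok = fits_charset("search engine", "iso-8859-1")
japanese_latin1 = fits_charset("日本語", "iso-8859-1")
japanese_utf8 = fits_charset("日本語", "utf-8")
```

Text that cannot be encoded in the index charset cannot be stored faithfully, which is one plausible reason such pages fail to index.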
For Chinese, the search engine can segment phrases into their base words, so that every word becomes searchable. Dictionaries with 106,800 radicals are supplied.
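As a rough idea of what dictionary-based segmentation means, here is a Python sketch of forward maximum matching. This is a common textbook technique and only an assumption about the general approach; it is not taken from the Sphider-plus source, and the three-word dictionary is invented for the example:

```python
def segment(text: str, dictionary: set[str], max_len: int = 4) -> list[str]:
    """Split text into words by always taking the longest dictionary match."""
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is accepted as a fallback so the loop advances.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical mini dictionary; a real one would hold many thousands of entries.
demo_dictionary = {"中文", "搜索", "引擎"}
demo_words = segment("中文搜索引擎", demo_dictionary)
```

Each resulting word can then be stored as a separate keyword, which is what makes a phrase without spaces searchable.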
For Japanese, segmentation is implemented for 5,724 kanji (new, old, and half-width) as well as for the hiragana, katakana, and jinmeiyo character sets.
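One simple way to picture the handling of several Japanese writing systems is to split text into runs of the same script. The Python sketch below does this with Unicode code point ranges. It is an illustration under that assumption only, not the method Sphider-plus uses, and the function names are made up:

```python
from itertools import groupby


def script_of(ch: str) -> str:
    """Classify one character by its Unicode block."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x309F:
        return "hiragana"
    # Full-width katakana block plus the half-width katakana letters.
    if 0x30A0 <= cp <= 0x30FF or 0xFF66 <= cp <= 0xFF9F:
        return "katakana"
    if 0x4E00 <= cp <= 0x9FFF:  # CJK Unified Ideographs
        return "kanji"
    return "other"


def split_by_script(text: str) -> list[tuple[str, str]]:
    """Group consecutive characters that belong to the same script."""
    return [(script, "".join(chars)) for script, chars in groupby(text, key=script_of)]


demo_runs = split_by_script("東京タワーへ")
```

Script boundaries are only a crude hint for word boundaries, so a real segmenter would combine them with dictionary lookups.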
Transliteration of Latin characters into their Greek equivalents is enabled, so queries in either notation are answered even if only one notation was indexed. The search engine also accepts queries containing Greek vowels without accents.
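Accepting Greek queries without accents can be pictured as folding accents away on both the indexed word and the query before comparing them. Here is a minimal Python sketch using the standard unicodedata module (an illustration only, not Sphider-plus code; strip_accents is a made-up name):

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks such as the Greek tonos from text."""
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


indexed_word = "Αθήνα"   # as it appears on the page
query_word = "Αθηνα"     # typed without the accent
matches = strip_accents(indexed_word) == strip_accents(query_word)
```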

Tec
Re: Foreign Languages Not Indexing
July 16, 2015 10:29PM
Thanks.
Super. I assume it supports German, Spanish, and French as well?
Does it offer the same result page customization options as Sphider?
Does it protect against injection and the bad guys?
Tec
Re: Foreign Languages Not Indexing
July 17, 2015 01:41PM
<<< I assume supports German, Spanish, French as well? >>>
Yes it does.

<<< Same page result customization options as Sphider?>>>
No, it is a little more complicated, but similar. Responsive design is supported for the search form, the result listing, and the addurl form.

<<< Protects against injection? and the bad guys? >>>
Yes, it is nearly paranoid. For some details, please see
http://www.sphider-plus.eu/index.php?f=30#14
and also
http://www.sphider-plus.eu/index.php?f=14#14_19

Additionally, from the change log:

New feature:
Admin backend protected against XSRF (Cross-Site Request Forgery) attacks.
Independent from IDS.
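
For readers unfamiliar with XSRF protection, the usual idea is a secret per-session token that every admin form must send back. The Python sketch below shows that general pattern only. It is an assumption about the technique, not the Sphider-plus implementation, and the session dictionary and function names are hypothetical:

```python
import hmac
import secrets


def issue_token(session: dict) -> str:
    """Create a random token, remember it in the session, and return it for the form."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_valid(session: dict, submitted: str) -> bool:
    """Accept a request only if it carries the token stored in the session."""
    expected = session.get("csrf_token")
    if not expected:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), submitted.encode())


admin_session = {}
form_token = issue_token(admin_session)
```

A forged request from another site cannot know the token, so it fails the check.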

New feature:
Admin backend protected against session fixation attacks (SFA).
Independent from IDS.
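
A common defence against session fixation is to issue a fresh session ID at login, so an ID planted by an attacker beforehand becomes worthless. Here is a small Python sketch of that idea. The in-memory SessionStore class is invented for illustration and says nothing about how Sphider-plus does it:

```python
import secrets


class SessionStore:
    """Hypothetical in-memory session store keyed by session ID."""

    def __init__(self) -> None:
        self._sessions: dict[str, dict] = {}

    def create(self) -> str:
        sid = secrets.token_hex(16)
        self._sessions[sid] = {}
        return sid

    def login(self, old_sid: str, user: str) -> str:
        """Move the session data to a brand-new ID and drop the old one."""
        data = self._sessions.pop(old_sid, {})
        data["user"] = user
        new_sid = secrets.token_hex(16)
        self._sessions[new_sid] = data
        return new_sid

    def get(self, sid: str):
        return self._sessions.get(sid)


store = SessionStore()
anonymous_sid = store.create()
admin_sid = store.login(anonymous_sid, "admin")
```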

New feature:
Limit the duration of a session (time-out) in admin backend.
To be defined in 'Settings' menu as seconds of inactivity.
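
The time-out check itself is simple arithmetic: compare the time since the last activity with the configured limit. A minimal Python sketch, with a made-up function name and an example limit of 600 seconds that is not a Sphider-plus default:

```python
import time


def is_expired(last_activity: float, timeout_seconds: float, now=None) -> bool:
    """Return True if more than timeout_seconds passed since last_activity."""
    current = time.time() if now is None else now
    return current - last_activity > timeout_seconds


# Example: the last click was at t=0 and the limit is 600 seconds of inactivity.
still_active = not is_expired(0, 600, now=600)
timed_out = is_expired(0, 600, now=601)
```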

New feature:
Prevent search form from being flooded by too many queries per unit of time.
To be activated in admin settings.
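
Flood protection of this kind is often done with a sliding time window per client. The Python sketch below shows that general technique. It is an assumption for illustration, not the Sphider-plus code, and the class name and limits are hypothetical:

```python
from collections import defaultdict, deque


class QueryLimiter:
    """Allow at most max_queries per client within window_seconds."""

    def __init__(self, max_queries: int, window_seconds: float) -> None:
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client -> timestamps of recent queries

    def allow(self, client: str, now: float) -> bool:
        hits = self._hits[client]
        # Forget queries that have fallen out of the time window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_queries:
            return False
        hits.append(now)
        return True


limiter = QueryLimiter(max_queries=2, window_seconds=10)
results = [limiter.allow("1.2.3.4", t) for t in (0, 1, 2, 10)]
```

In this example the third query is rejected because two queries already fall inside the 10-second window, and the fourth is accepted once the oldest one has expired.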


Because of the bad guys, not all information is published on the Internet.

Tec