index PDF files

Posted by risby 
April 08, 2012 07:37PM
I am indexing the PDF files on my homepage with SPIDER, and it works fantastically. But I have one small problem. In a PDF document, a word may be split across two lines, which is done with a special code in the PDF format. If you search in Adobe Reader, you can find the word without problems. But pdftotext replaces the code with a space character, so the search does not find the word. Is it possible to change pdftotext in a way that solves the problem?
regards risby
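
One possible workaround, sketched here rather than taken from pdftotext or SPIDER, is to leave pdftotext alone and clean its text output before it is indexed. The sketch below assumes the extracted text still carries a hyphen or a soft hyphen (U+00AD) at the end of the line where the word was split. If the break has already been turned into a plain space, this will not help. The function name rejoin_split_words is hypothetical.

```python
import re

SOFT_HYPHEN = "\u00ad"

# A hyphen or soft hyphen directly after a word character, followed by a
# line break, with the next line starting (after optional indentation)
# with another word character.
_LINE_BREAK_HYPHEN = re.compile(r"(\w)[-\u00ad][ \t]*\r?\n[ \t]*(\w)")


def rejoin_split_words(text: str) -> str:
    """Join words that were hyphenated across a line break.

    Assumption: the split is still visible in the text as '-' or U+00AD
    at the end of a line. Hyphens inside a line are left untouched.
    """
    joined = _LINE_BREAK_HYPHEN.sub(r"\1\2", text)
    # Any remaining soft hyphens are invisible break hints, so drop them.
    return joined.replace(SOFT_HYPHEN, "")


if __name__ == "__main__":
    sample = "You can search the docu-\nment for a well-known word."
    print(rejoin_split_words(sample))
```

Note the trade-off: a real compound that happens to break at the line end (for example "well-" on one line and "known" on the next) loses its hyphen as well, so a search for the hyphenated form could miss it.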