Indexing problem - 'Not text or html'

Posted by quest 
Indexing problem - 'Not text or html'
March 17, 2014 09:13PM
Sphider is installed and the database created OK. When I attempt to index the site, however, I get the error message:
Not text or html
Links found: 0. New links: 0
My 'home' page is named 'default.htm' as opposed to 'index.html'. Might this be an issue? Any other suggestions would be gratefully appreciated.
Tec
Re: Indexing problem - 'Not text or html'
March 17, 2014 10:20PM
No, a filename like 'default.html' will not prevent Sphider from indexing this page.
<<< Any other suggestions would be gratefully appreciated. >>>
I would need the URL to look into your issue.

Tec
Re: Indexing problem - 'Not text or html'
March 17, 2014 10:23PM
The site is
http://www.printabledirect.com
Thanks
Tec
Re: Indexing problem - 'Not text or html'
March 17, 2014 11:06PM
Oops!?!
I had no problem indexing that site. Your Sphider installation might be corrupted, or there are some limitations (granted permissions?) on the server that hosts your Sphider installation. Are you running Sphider on a 'Shared Hosting' server?

Here is an extract from my log file:

Spidering http://www.printabledirect.com/
1. Retrieving: http://www.printabledirect.com/ at 23:35:04.
Size of page: 24.98kb. Starting indexing at 23:35:15.
Indexed
Links found: 66. New links: 66
2. Retrieving: http://www.printabledirect.com/2013-year-planner.php at 23:35:38.
Size of page: 5.59kb. Starting indexing at 23:35:40.
Indexed
Links found: 1. New links: 1
3. Retrieving: http://www.printabledirect.com/2014-year-planner.php at 23:35:43.
Size of page: 5.61kb. Starting indexing at 23:35:45.
Indexed
Links found: 1. New links: 0
4. Retrieving: http://www.printabledirect.com/2015-year-planner.php at 23:35:45.
Size of page: 5.61kb. Starting indexing at 23:35:47.
Indexed
Links found: 1. New links: 0
5. Retrieving: http://www.printabledirect.com/about.html at 23:35:48.
Size of page: 1.95kb. Starting indexing at 23:35:49.
Indexed
. . . . .
. . . .
. . .
Links found: 1. New links: 0
278. Retrieving: http://www.printabledirect.com/printable-2012-calendar.asp at 23:59:47.
Size of page: 11.03kb. Starting indexing at 23:59:50.
Indexed
Links found: 1. New links: 0
279. Retrieving: http://www.printabledirect.com/printable-to-do-list-print.php at 23:59:54.
Size of page: 2.04kb. Starting indexing at 23:59:57.
Indexed
Links found: 0. New links: 0
280. Retrieving: http://www.printabledirect.com/printable-us-time-zone-maps.htm at 23:59:58.
Size of page: 0.30kb. Starting indexing at 00:00:00.
Indexed
Links found: 0. New links: 0

Completed at 00:00:01.

Currently in database: 1 sites, 275 links, 0 categories and 6463 keywords.

Tec
Re: Indexing problem - 'Not text or html'
March 17, 2014 11:17PM
Yes, the installation is on a shared server although I haven't had any error messages re permissions. I might download and start again?
Thanks
Tec
Re: Indexing problem - 'Not text or html'
March 18, 2014 10:52AM
Yes, you can download and reinstall, but I am afraid that the settings of your 'Shared Hosting' server are too restrictive.

Try the following:
Create a file named php.ini with the following content:
allow_url_fopen = On
safe_mode = Off

and place this file in the …/admin/ folder. It may also help to place a copy of this file in the root folder of your Sphider installation.

Tec
Re: Indexing problem - 'Not text or html'
April 10, 2014 10:02PM
Can you give me some pointers as to where I should look next please?

I have the same error message: "Not text or html"

The Sphider installation demonstrably works, as I can readily index http://www.byercycles.co.uk (my favourite bicycle shop)

BUT

when I try to index either of my two sites, which are on the same host as the Sphider install, I get the 'Not text or html' error.

http://www.royalobservatorygreenwich.org (redirected by .htaccess entry)
http://www.thegreenwichmeridian.org (index.php has header redirect to "real" home page)


Sphider installed at http://www.royalobservatorygreenwich.org/sphider/search.php

Thanks

J

PS: I checked with phpinfo() and it reports:

allow_url_fopen = on;
safe_mode = Off;
Tec
Re: Indexing problem - 'Not text or html'
April 11, 2014 03:40PM
That does not work for you because the original Sphider does not follow redirects, so the indexer is never forwarded to the content of the 'real' home page.
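
To see what a crawler that does not follow redirects receives from a start URL, a quick check along these lines may help. This is only a rough sketch in Python, not Sphider's own code: the names `describe` and `probe` are made up for illustration, and treating only `text/html` and `text/plain` as indexable is an assumption about how such an indexer might decide, not something taken from the Sphider source.

```python
import http.client
from urllib.parse import urlsplit

# HTTP status codes that tell the client to look at a different URL.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def describe(status, headers):
    """Summarise one response as a crawler that never follows redirects sees it."""
    # Header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    if status in REDIRECT_CODES:
        return "redirect to " + lowered.get("location", "(no Location header)")
    # Drop any "; charset=..." suffix from the Content-Type value.
    ctype = lowered.get("content-type", "").split(";")[0].strip().lower()
    # Assumption for illustration: only these two types count as indexable.
    if status == 200 and ctype in ("text/html", "text/plain"):
        return "indexable: " + ctype
    return "not indexable: status %d, type %r" % (status, ctype)


def probe(url, timeout=10):
    """Send a single HEAD request (http.client never follows redirects)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    target = (parts.path or "/") + ("?" + parts.query if parts.query else "")
    try:
        conn.request("HEAD", target)
        resp = conn.getresponse()
        return describe(resp.status, dict(resp.getheaders()))
    finally:
        conn.close()


if __name__ == "__main__":
    import sys
    # Pass one or more start URLs on the command line.
    for url in sys.argv[1:]:
        print(url, "->", probe(url))
```

If the start URL comes back as a redirect, one thing to try is entering the redirect target itself (the 'real' home page) as the site URL in the Sphider admin, so that the indexer starts on a page that answers directly.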

Tec