Sitemap Generator

Posted by arty56 
Sitemap Generator
April 03, 2008 11:55AM
Has anyone here thought of using the data collected by Sphider to create a sitemap.xml, a sitemap.txt and of course a sitemap.php for humans to read?

If not, I am writing a script to do it and should be finished soon.

OK here is the finished product and script. Enjoy.

I got this idea from seeing a software program that generated a sitemap.xml file for google, a sitemap.txt file for others and of course a sitemap generated for humans to read.

The only downside was that it only parsed from a set URL, and then you had to upload the three files manually.

Don't get me wrong it is a brilliant piece of software... BUT I wanted something I didn't have to upload.
I wanted the files to be automatically generated and stored on my server where I need these files to belong.

When I saw this I knew straight away that the Sphider search engine would be the ideal parsing engine to utilise in generating my sitemap requirements.

Sure Sphider can parse from a sitemap.xml but I got excited knowing that I could also use Sphider to create my sitemap.xml file.

Then I got to thinking about how to integrate the creation of the sitemap.php, [for humans to read. :_) ] as well as the sitemap.txt [ for other search engines ].

So the next thing I did was to look at the data being stored by Sphider and see what data I could use to create the .xml, .txt and .php files.

As I code in php, the natural thing for me to create was a php script that in turn generated the other files I wanted.

So a star was born so to speak. lol

I first wrote the php script to generate the sitemap.php and then wrote another script, create_sitemap_xml.php, to create the sitemap.xml file and then added some extra code so the same script created the sitemap.txt as well.

I then integrated the create_sitemap_xml.php into the sitemap.php so I was only left with one script.

The brilliant part of this is that whenever you re-index your site you just need to click on sitemap.php and all three files get updated automatically [sitemap.php, sitemap.txt and sitemap.xml].

The data used to generate the three sitemap files is pulled from the links table in the Sphider database.
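
For anyone who wants to see the general idea without downloading the zip, here is a minimal sketch of turning a list of URLs into the .xml and .txt output. It is not the code from the download. The function names build_sitemap_xml and build_sitemap_txt are made up for illustration, and $urls is assumed to be an array of URL strings already read from the links table.

```php
<?php
// Sketch only: $urls stands in for URL strings read from Sphider's links table.
// Both function names are hypothetical and are not part of Sphider.

function build_sitemap_xml(array $urls): string
{
    $lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ];
    foreach ($urls as $url) {
        // Escape &, <, > and quotes so the URL is valid inside an XML element.
        $loc = htmlspecialchars($url, ENT_QUOTES | ENT_XML1, 'UTF-8');
        $lines[] = '  <url><loc>' . $loc . '</loc></url>';
    }
    $lines[] = '</urlset>';
    return implode("\n", $lines) . "\n";
}

function build_sitemap_txt(array $urls): string
{
    // The plain-text sitemap format is simply one URL per line.
    return implode("\n", $urls) . "\n";
}
```

The two strings can then be written to sitemap.xml and sitemap.txt with whatever file-writing code you already use.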

http://www.snazzypromotions.com/sitemap.php Site Map - php file
http://www.snazzypromotions.com/sitemap.txt Site Map - txt file
http://www.snazzypromotions.com/sitemap.xml Site Map - xml file

http://www.snazzypromotions.com/blog/sitemap.zip Download the files and readme here.

Hope you enjoy.

Arty

PS

Tec, feel free to steal the code and add it to Sphider and Sphider-plus if you think it's worth it.




Edited 1 time(s). Last edit at 04/03/2008 12:51PM by arty56.
Tec
Re: Sitemap Generator
April 03, 2008 12:40PM
Creating a sitemap is already part of Sphider-plus for index, re-index and 'Erase & Re-index'. So I don't have to steal some code.

Tec



Edited 1 time(s). Last edit at 04/03/2008 02:50PM by Tec.
Re: Sitemap Generator
April 03, 2008 12:52PM
oh kewl, lol.

I haven't had a look at sphider plus yet.

Good to know.

Hope it helps someone perhaps for the standard sphider.
Re: Sitemap Generator
April 13, 2008 10:16AM
There's no create_sitemap_xml.php in the download.
Tec
Re: Sitemap Generator
April 18, 2008 09:50AM
Hello Willy,

A 'create_sitemap_xml.php' file is not required. Creation of the sitemap is part of the script .../admin/spider.php

Tec
Re: Sitemap Generator
April 27, 2008 02:56PM
I'm having trouble getting the script to write the sitemap.xml when it is run from a cron job.

The code used in the spider.php file is:
(note that I tried using $filename=$_SERVER['DOCUMENT_ROOT'] . "/sitemap.xml"; but that gives the unable to open file error when run from cron. Using the format below stops the unable to open file error, but gives unable to write error instead)

$filename = "http://www.[domain]/sitemap.xml";
if (!$handle = fopen($filename, "w")) {
print "$filename failed to open...";
die ('Unable to open Sitemap file');
}
if (!fwrite($handle, "$version\n$urlset\n$copyright\n$update\n$all_links</urlset>\n")) {
print "Unable to write to $filename";
die ('');
}
fclose($handle);


This works fine when you run the spider.php direct from the admin.php and the sitemap is created/updated fine. However, when it is run from the cron job, the spider runs OK but I get the "Unable to write to ..[sitemap].."

The sitemap file has permissions set to 777. It is in the root directory (obviously) so unsure about setting permissions on root??

Any thoughts???
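
One possible explanation, offered as a guess rather than a confirmed diagnosis: when PHP runs from cron there is no web server, so $_SERVER['DOCUMENT_ROOT'] is usually empty, and PHP's http:// stream wrapper cannot be opened for writing at all. A sketch of a workaround is to build a real filesystem path from the script's own location. The helper names sitemap_path and write_sitemap below are hypothetical, and the assumption that the web root sits one directory above the script's directory may not match your install.

```php
<?php
// Hypothetical helpers, not part of Sphider.

// Build a filesystem path for the sitemap without $_SERVER['DOCUMENT_ROOT'].
// $scriptDir is the directory of the running script (for example __DIR__);
// $levelsUp is how many directories above it the web root is assumed to be.
function sitemap_path(string $scriptDir, int $levelsUp = 1, string $name = 'sitemap.xml'): string
{
    $root = $levelsUp > 0 ? dirname($scriptDir, $levelsUp) : $scriptDir;
    return $root . DIRECTORY_SEPARATOR . $name;
}

// Write the sitemap contents; returns true on success, false on failure.
function write_sitemap(string $path, string $contents): bool
{
    // file_put_contents returns the number of bytes written, or false on failure.
    return file_put_contents($path, $contents, LOCK_EX) !== false;
}
```

From a script in the admin folder, sitemap_path(__DIR__) would then point at sitemap.xml one level up, whether the script is started from the browser or from cron. The user that cron runs as still needs write permission on that file or directory.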
Re: Sitemap Generator
May 01, 2008 11:43AM
@arty56 thanks for sharing this, got it working, but what's the use of the sitemap.txt? And what is the create_xml_sitemap.php mentioned in the readme.txt for? I don't have that file, but the script works.
Re: Sitemap Generator
June 05, 2008 02:03PM
OK, I have this in place, but what do I need to put in my /admin/spider.php file to get this to generate the files?

Thanks.
Re: Sitemap Generator
June 12, 2008 02:50PM
Can anyone help here?
Tec
Re: Sitemap Generator
June 12, 2008 07:11PM
Don't understand your question. If you use Sphider-plus
https://sourceforge.net/project/showfiles.php?group_id=214642
new sitemap files are automatically created in folder .../admin/sitemaps/ during each index / re-index.
You don't need to put anything into your .../admin/spider.php file.

Tec
Re: Sitemap Generator
June 23, 2008 10:42AM
I don't have Sphider-plus. I thought this was a mod for the original Sphider, not Sphider-plus. I am having the same issue as marty1. If I run this as a cron job, it fails, but if I go to the spider.php file directly through my browser, it works.