[MOD] Ultimate cronjob for sphider.

Posted by coolgal 
[MOD] Ultimate cronjob for sphider.
October 29, 2007 02:15AM
Hi;

Even though I asked for an exchange mod here [www.sphider.eu], it seems no one could help me out.

Anyway, this is the cron job I have made for Sphider.

Functions:
- You can set how many sites it will crawl in one run.
- It calculates the number of days since each site was last indexed, so you can set how many days must pass before a site is recrawled.
- It logs the indexing done by the cron job.
- And some other small things.

Instructions:
- Create a file named spider_cron.php inside the admin folder with the following content:

<?php
set_time_limit (0);
$include_dir = "../include";
$log_format = "html";
$log_dir = "log";

require_once ("$include_dir/commonfuncs.php");
$all = 0;
extract (getHttpVars());
$settings_dir = "../settings";
ignore_user_abort(1);
require_once ("$settings_dir/conf.php");
include "$settings_dir/database.php";

include "messages.php";
include "spiderfuncs.php";
error_reporting (E_ALL ^ E_NOTICE ^ E_WARNING);

$reindex = 1;
$keep_log = 1;
$delay_time = 0;
$all = 1;
$command_line = 0;
$delay = 1; // Second
$number = 0;
$date_to_search = "";

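if (isset($_GET['n']) && $_GET['n']) {
// Expected format: n=<number of sites>,<cutoff date as YYYY-MM-DD>
$vars = explode (",", $_GET['n']);

$number = (int) $vars[0];
// The date goes straight into an SQL query, so accept only YYYY-MM-DD
if (isset($vars[1]) && preg_match('/^\d{4}-\d{2}-\d{2}$/', $vars[1])) {
$date_to_search = $vars[1];
}
}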


if ($keep_log) {

if ($log_format=="html") {
$log_file = $log_dir."/".Date("ymdHi").".html";

} else {
$log_file = $log_dir."/".Date("ymdHi").".log";
}
}

if (!$log_handle = fopen($log_file, 'w')) {
die ("Logging option is set, but cannot open file for logging.");
}
}

index_all($date_to_search, $number);

$tmp_urls = Array();


function microtime_float(){
list($usec, $sec) = explode(" ", microtime());
return ((float)$usec + (float)$sec);
}


function index_url($url, $level, $site_id, $md5sum, $domain, $indexdate, $sessid, $can_leave_domain, $reindex) {
global $entities, $min_delay;
global $command_line;
global $min_words_per_page;
global $supdomain;
global $mysql_table_prefix, $user_agent, $tmp_urls, $delay_time, $domain_arr;
$needsReindex = 1;
$deletable = 0;

$url_status = url_status($url);
$thislevel = $level - 1;

if (strstr($url_status['state'], "Relocation")) {
// str_replace is enough to strip spaces; eregi_replace is deprecated
$url = str_replace(" ", "", url_purify($url_status['path'], $url, $can_leave_domain));

if ($url <> '') {
$result = mysql_query("select link from ".$mysql_table_prefix."temp where link='$url' && id = '$sessid'");
echo mysql_error();
$rows = mysql_numrows($result);
if ($rows == 0) {
mysql_query ("insert into ".$mysql_table_prefix."temp (link, level, id) values ('$url', '$level', '$sessid')");
echo mysql_error();
}
}

$url_status['state'] = "redirected";
}

/*
if ($indexdate <> '' && $url_status['date'] <> '') {
if ($indexdate > $url_status['date']) {
$url_status['state'] = "Date checked. Page contents not changed";
$needsReindex = 0;
}
}*/
ini_set("user_agent", $user_agent);
if ($url_status['state'] == 'ok') {
$OKtoIndex = 1;
$file_read_error = 0;

if (time() - $delay_time < $min_delay) {
sleep ($min_delay- (time() - $delay_time));
}
$delay_time = time();
if (!fst_lt_snd(phpversion(), "4.3.0")) {
$file = file_get_contents($url);
if ($file === FALSE) {
$file_read_error = 1;
}
} else {
$file = "";
$fl = @fopen($url, "r");
if ($fl) {
while ($buffer = @fgets($fl, 4096)) {
$file .= $buffer;
}
// close the handle only if fopen() succeeded
fclose ($fl);
} else {
$file_read_error = 1;
}
}
if ($file_read_error) {
$contents = getFileContents($url);
$file = $contents['file'];
}


$pageSize = number_format(strlen($file)/1024, 2, ".", "");
printPageSizeReport($pageSize);

if ($url_status['content'] != 'text') {
$file = extract_text($file, $url_status['content']);
}

printStandardReport('starting', $command_line);


$newmd5sum = md5($file);


if ($md5sum == $newmd5sum) {
printStandardReport('md5notChanged',$command_line);
$OKtoIndex = 0;
} else if (isDuplicateMD5($newmd5sum)) {
$OKtoIndex = 0;
printStandardReport('duplicate',$command_line);
}

if (($md5sum != $newmd5sum || $reindex ==1) && $OKtoIndex == 1) {
$urlparts = parse_url($url);
$newdomain = $urlparts['host'];
$type = 0;

/* if ($newdomain <> $domain)
$domainChanged = 1;

if ($domaincb==1) {
$start = strlen($newdomain) - strlen($supdomain);
if (substr($newdomain, $start) == $supdomain) {
$domainChanged = 0;
}
}*/

// remove link to css file
//get all links from file
$data = clean_file($file, $url, $url_status['content']);

if ($data['noindex'] == 1) {
$OKtoIndex = 0;
$deletable = 1;
printStandardReport('metaNoindex',$command_line);
}


$wordarray = unique_array(explode(" ", $data['content']));

if ($data['nofollow'] != 1) {
$links = get_links($file, $url, $can_leave_domain, $data['base']);
$links = distinct_array($links);
$all_links = count($links);
$numoflinks = 0;
//if there are any, add to the temp table, but only if there isnt such url already
if (is_array($links)) {
reset ($links);

while ($thislink = each($links)) {
if ($tmp_urls[$thislink[1]] != 1) {
$tmp_urls[$thislink[1]] = 1;
$numoflinks++;
mysql_query ("insert into ".$mysql_table_prefix."temp (link, level, id) values ('$thislink[1]', '$level', '$sessid')");
echo mysql_error();
}
}
}
} else {
printStandardReport('noFollow',$command_line);
}

if ($OKtoIndex == 1) {

$title = $data['title'];
$host = $data['host'];
$path = $data['path'];
$fulltxt = $data['fulltext'];
$desc = substr($data['description'], 0,254);
$url_parts = parse_url($url);
$domain_for_db = $url_parts['host'];

if (isset($domain_arr[$domain_for_db])) {
$dom_id = $domain_arr[$domain_for_db];
} else {
mysql_query("insert into ".$mysql_table_prefix."domains (domain) values ('$domain_for_db')");
$dom_id = mysql_insert_id();
$domain_arr[$domain_for_db] = $dom_id;
}

$wordarray = calc_weights ($wordarray, $title, $host, $path, $data['keywords']);

//if there are words to index, add the link to the database, get its id, and add the word + their relation
if (is_array($wordarray) && count($wordarray) > $min_words_per_page) {
if ($md5sum == '') {
mysql_query ("insert into ".$mysql_table_prefix."links (site_id, url, title, description, fulltxt, indexdate, size, md5sum, level) values ('$site_id', '$url', '$title', '$desc', '$fulltxt', curdate(), '$pageSize', '$newmd5sum', $thislevel)");
echo mysql_error();
$result = mysql_query("select link_id from ".$mysql_table_prefix."links where url='$url'");
echo mysql_error();
$row = mysql_fetch_row($result);
$link_id = $row[0];

save_keywords($wordarray, $link_id, $dom_id);

printStandardReport('indexed', $command_line);
}else if (($md5sum <> '') && ($md5sum <> $newmd5sum)) { //if page has changed, start updating

$result = mysql_query("select link_id from ".$mysql_table_prefix."links where url='$url'");
echo mysql_error();
$row = mysql_fetch_row($result);
$link_id = $row[0];
for ($i=0;$i<=15; $i++) {
$char = dechex($i);
mysql_query ("delete from ".$mysql_table_prefix."link_keyword$char where link_id=$link_id");
echo mysql_error();
}
save_keywords($wordarray, $link_id, $dom_id);
$query = "update ".$mysql_table_prefix."links set title='$title', description ='$desc', fulltxt = '$fulltxt', indexdate=now(), size = '$pageSize', md5sum='$newmd5sum', level=$thislevel where link_id=$link_id";
mysql_query($query);
echo mysql_error();
printStandardReport('re-indexed', $command_line);
}
}else {
printStandardReport('minWords', $command_line);

}
}
}
} else {
$deletable = 1;
printUrlStatus($url_status['state'], $command_line);

}
if ($reindex ==1 && $deletable == 1) {
check_for_removal($url);
} else if ($reindex == 1) {

}
if (!isset($all_links)) {
$all_links = 0;
}
if (!isset($numoflinks)) {
$numoflinks = 0;
}
printLinksReport($numoflinks, $all_links, $command_line);
}


function index_site($url, $reindex, $maxlevel, $soption, $url_inc, $url_not_inc, $can_leave_domain) {
global $mysql_table_prefix, $command_line, $mainurl, $tmp_urls, $domain_arr, $all_keywords;
$result = mysql_query("select keyword_ID, keyword from ".$mysql_table_prefix."keywords");
echo mysql_error();
while($row=mysql_fetch_array($result)) {
$all_keywords[addslashes($row[1])] = $row[0];
}

$compurl = parse_url($url);
if ($compurl['path'] == '')
$url = $url . "/";

$t = microtime();
$a = getenv("REMOTE_ADDR");
$sessid = md5 ($t.$a);


$urlparts = parse_url($url);

$domain = $urlparts['host'];
if (isset($urlparts['port'])) {
$port = (int)$urlparts['port'];
}else {
$port = 80;
}


$errno = 0;
$errmsg = "";
$fp = fsockopen($domain, $port, $errno, $errmsg);
if (!$fp) {
printConnectErrorReport($errmsg);
} else {
// close the socket only if the connection was opened
fclose ($fp);
}


$result = mysql_query("select site_id from ".$mysql_table_prefix."sites where url='$url'");
echo mysql_error();
$row = mysql_fetch_row($result);
$site_id = $row[0];

if ($site_id != "" && $reindex == 1) {
mysql_query ("insert into ".$mysql_table_prefix."temp (link, level, id) values ('$url', 0, '$sessid')");
echo mysql_error();
$result = mysql_query("select url, level from ".$mysql_table_prefix."links where site_id = $site_id");
while ($row = mysql_fetch_array($result)) {
$site_link = $row['url'];
$link_level = $row['level'];
if ($site_link != $url) {
mysql_query ("insert into ".$mysql_table_prefix."temp (link, level, id) values ('$site_link', $link_level, '$sessid')");
}
}

$qry = "update ".$mysql_table_prefix."sites set indexdate=now(), spider_depth = $maxlevel, required = '$url_inc'," .
"disallowed = '$url_not_inc', can_leave_domain=$can_leave_domain where site_id=$site_id";
mysql_query ($qry);
echo mysql_error();
} else if ($site_id == '') {
mysql_query ("insert into ".$mysql_table_prefix."sites (url, indexdate, spider_depth, required, disallowed, can_leave_domain) " .
"values ('$url', now(), $maxlevel, '$url_inc', '$url_not_inc', $can_leave_domain)");
echo mysql_error();
$result = mysql_query("select site_ID from ".$mysql_table_prefix."sites where url='$url'");
$row = mysql_fetch_row($result);
$site_id = $row[0];
} else {
mysql_query ("update ".$mysql_table_prefix."sites set indexdate=now(), spider_depth = $maxlevel, required = '$url_inc'," .
"disallowed = '$url_not_inc', can_leave_domain=$can_leave_domain where site_id=$site_id");
echo mysql_error();
}


$result = mysql_query("select site_id, temp_id, level, count, num from ".$mysql_table_prefix."pending where site_id='$site_id'");
echo mysql_error();
$row = mysql_fetch_row($result);
$pending = $row[0];
$level = 0;
$domain_arr = get_domains();
if ($pending == '') {
mysql_query ("insert into ".$mysql_table_prefix."temp (link, level, id) values ('$url', 0, '$sessid')");
echo mysql_error();
} else if ($pending != '') {
printStandardReport('continueSuspended',$command_line);
mysql_query("select temp_id, level, count from ".$mysql_table_prefix."pending where site_id='$site_id'");
echo mysql_error();
$sessid = $row[1];
$level = $row[2];
$pend_count = $row[3] + 1;
$num = $row[4];
$pending = 1;
$tmp_urls = get_temp_urls($sessid);
}

if ($reindex != 1) {
mysql_query ("insert into ".$mysql_table_prefix."pending (site_id, temp_id, level, count) values ('$site_id', '$sessid', '0', '0')");
echo mysql_error();
}


$time = time();


$omit = check_robot_txt($url);

printHeader ($omit, $url, $command_line);


$mainurl = $url;
$num = 0;

while (($level <= $maxlevel && $soption == 'level') || ($soption == 'full')) {
if ($pending == 1) {
$count = $pend_count;
$pending = 0;
} else
$count = 0;

$links = array();

$result = mysql_query("select distinct link from ".$mysql_table_prefix."temp where level=$level && id='$sessid' order by link");
echo mysql_error();
$rows = mysql_num_rows($result);

if ($rows == 0) {
break;
}

$i = 0;

while ($row = mysql_fetch_array($result)) {
$links[] = $row['link'];
}

reset ($links);


while ($count < count($links)) {
$num++;
$thislink = $links[$count];
$urlparts = parse_url($thislink);
reset ($omit);
$forbidden = 0;
foreach ($omit as $omiturl) {
$omiturl = trim($omiturl);

$omiturl_parts = parse_url($omiturl);
if ($omiturl_parts['scheme'] == '') {
$check_omit = $urlparts['host'] . $omiturl;
} else {
$check_omit = $omiturl;
}

if (strpos($thislink, $check_omit)) {
printRobotsReport($num, $thislink, $command_line);
check_for_removal($thislink);
$forbidden = 1;
break;
}
}

if (!check_include($thislink, $url_inc, $url_not_inc )) {
printUrlStringReport($num, $thislink, $command_line);
check_for_removal($thislink);
$forbidden = 1;
}

if ($forbidden == 0) {
printRetrieving($num, $thislink, $command_line);
$query = "select md5sum, indexdate from ".$mysql_table_prefix."links where url='$thislink'";
$result = mysql_query($query);
echo mysql_error();
$rows = mysql_num_rows($result);
if ($rows == 0) {
index_url($thislink, $level+1, $site_id, '', $domain, '', $sessid, $can_leave_domain, $reindex);

mysql_query("update ".$mysql_table_prefix."pending set level = $level, count=$count, num=$num where site_id=$site_id");
echo mysql_error();
}else if ($rows <> 0 && $reindex == 1) {
$row = mysql_fetch_array($result);
$md5sum = $row['md5sum'];
$indexdate = $row['indexdate'];
index_url($thislink, $level+1, $site_id, $md5sum, $domain, $indexdate, $sessid, $can_leave_domain, $reindex);
mysql_query("update ".$mysql_table_prefix."pending set level = $level, count=$count, num=$num where site_id=$site_id");
echo mysql_error();
}else {
printStandardReport('inDatabase',$command_line);
}

}
$count++;
}
$level++;
}

mysql_query ("delete from ".$mysql_table_prefix."temp where id = '$sessid'");
echo mysql_error();
mysql_query ("delete from ".$mysql_table_prefix."pending where site_id = '$site_id'");
echo mysql_error();
printStandardReport('completed',$command_line);


}

function index_all($date_to_search, $number) {

global $mysql_table_prefix, $delay;

$sql = "SELECT * from ".$mysql_table_prefix."sites WHERE indexdate < '{$date_to_search}' ORDER BY indexdate ASC ";
$res = mysql_query($sql);

echo mysql_error();

$i = 0;



while ($number > $i && ($row = mysql_fetch_assoc($res)) ) { //

$i++;




$url = $row['url'];
$depth = $row['spider_depth'];
$include = $row['required'];
$not_include = $row['disallowed'];
$can_leave_domain = $row['can_leave_domain'];

if ($can_leave_domain == '') {
$can_leave_domain = 0;
}

if ($depth == -1) {
$soption = 'full';
} else {
$soption = 'level';
}

index_site($url, 1, $depth, $soption, $include, $not_include, $can_leave_domain);
sleep($delay);
}
}

function get_temp_urls ($sessid) {
global $mysql_table_prefix;
$result = mysql_query("select link from ".$mysql_table_prefix."temp where id='$sessid'");
echo mysql_error();
$tmp_urls = Array();
while ($row=mysql_fetch_row($result)) {
$tmp_urls[$row[0]] = 1;
}
return $tmp_urls;

}

function get_domains () {
global $mysql_table_prefix;
$result = mysql_query("select domain_id, domain from ".$mysql_table_prefix."domains");
echo mysql_error();
$domains = Array();
while ($row=mysql_fetch_row($result)) {
$domains[$row[1]] = $row[0];
}
return $domains;

}

function commandline_help() {
print "Usage: php spider.php <options>\n\n";
print "Options:\n";
print " -all\t\t Reindex everything in the database\n";
print " -u <url>\t Set url to index\n";
print " -f\t\t Set indexing depth to full (unlimited depth)\n";
print " -d <num>\t Set indexing depth to <num>\n";
print " -l\t\t Allow spider to leave the initial domain\n";
print " -r\t\t Set spider to reindex a site\n";
print " -m <string>\t Set the string(s) that an url must include (use \\n as a delimiter between multiple strings)\n";
print " -n <string>\t Set the string(s) that an url must not include (use \\n as a delimiter between multiple strings)\n";
}

//printStandardReport('quit',$command_line);
if ($email_log) {
$indexed = ($all==1) ? 'ALL' : $url;
$log_report = "";
if ($log_handle) {
$log_report = "Log saved into $log_file";
}
mail($admin_email, "Sphider indexing report", "Sphider has finished indexing $indexed at ".date("y-m-d H:i:s").". ".$log_report);
}
if ( $log_handle) {
fclose($log_handle);
}

?>

- Create a file named cron.php with the following content:

<?php

set_time_limit (0);
$include_dir = "../include";
require_once ("$include_dir/commonfuncs.php");
ignore_user_abort(1);
extract (getHttpVars());
$settings_dir = "../settings";
require_once ("$settings_dir/conf.php");
include "$settings_dir/database.php";

$date_to_old = 3;
$separator = 1;

$number = 10;

$curr_date = date("Y-m-d");
$date_to_search = dateadd("d", $date_to_old, $curr_date,"Y-m-d", "-");

if ($number == 0){
$sql = "SELECT * from ".$mysql_table_prefix."sites WHERE indexdate < '{$date_to_search}'";
$res= mysql_query($sql);

$number = mysql_num_rows($res);

$number = intval($number / $separator);

}
if ($number == 0) $number = 1 ;

exec("wget http://yourdomain.com/spider_cron.php?n=$number,$date_to_search ");


function dateadd($type, $num, $date, $format = "Y-m-d", $zn = "+")
{
$add_d = 0;
$add_m = 0;
$add_y = 0;
$add_h = 0;
$add_i = 0;

switch ($type)
{
case "d":
$add_d = $num;
break;
case "m":
$add_m = $num;
break;
case "y":
$add_y = $num;
break;
case "h":
$add_h = $num;
break;
case "i":
$add_i = $num;
break;
}

$tmp_date = strtotime($date);

$y = date("Y", $tmp_date);
$m = date("m", $tmp_date);
$d = date("d", $tmp_date);
$h = date("H", $tmp_date);
$i = date("i", $tmp_date);

if ($zn == "+"){
$new_date = date($format, mktime($h + $add_h, $i + $add_i, 0, $m + $add_m, $d + $add_d, $y + $add_y));
} else {
$new_date = date($format, mktime($h - $add_h, $i - $add_i, 0, $m - $add_m, $d - $add_d, $y - $add_y));
}

return $new_date;
}
?>

***REMEMBER to change [yourdomain.com] to your own domain. You can change the settings if you like.

- Run cron.php to start the cron job. If you want to test it, simply go to your website and open cron.php in a browser; if it creates a log file, congratulations, it works.
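
The cutoff date that cron.php passes to spider_cron.php (today minus `$date_to_old` days) can also be computed with PHP's built-in date classes instead of the custom `dateadd()` function. This is only a sketch, assuming PHP 7.1 or newer; `cutoff_date` is a made-up helper name, not something that ships with Sphider.

```php
<?php
// Hypothetical helper (not part of Sphider): returns the date $daysOld days
// before $today as Y-m-d, ready to compare against the sites.indexdate column.
function cutoff_date(string $today, int $daysOld): string
{
    $date = new DateTimeImmutable($today, new DateTimeZone('UTC'));
    // modify() handles month and year boundaries for us
    return $date->modify("-{$daysOld} days")->format('Y-m-d');
}

// With $date_to_old = 3, as in cron.php above:
echo cutoff_date('2007-10-29', 3), "\n"; // prints 2007-10-26
```

Sites whose indexdate is older than the returned date are the ones the cron job would pick up.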

Also, if any of you can help me out at [www.sphider.eu], I would appreciate it.

Thank you



Edited 1 time(s). Last edit at 10/29/2007 02:16AM by coolgal.
Re: [MOD] Ultimate cronjob for sphider.
October 30, 2007 12:39AM
It won't work for me. Can you help me, please?

If I put the cron.php file in the index directory, it gives me an empty page,

and when I put it in the admin directory it gives me this error:

Warning: require_once(./include/commonfuncs.php) [function.require-once]: failed to open stream: No such file or directory in C:\xampp\htdocs\search\admin\cron.php on line 5

Fatal error: require_once() [function.require]: Failed opening required './include/commonfuncs.php' (include_path='.;C:\xampp\php\pear\') in C:\xampp\htdocs\search\admin\cron.php on line 5

thanks
Re: [MOD] Ultimate cronjob for sphider.
October 30, 2007 04:02PM
I haven't had those errors before. Please take a look at your paths. Both of these files need to be put in the admin directory.
Re: [MOD] Ultimate cronjob for sphider.
October 31, 2007 11:54PM
It gives me a white page. When I point the browser to cron.php there is nothing.

I did exactly what you told me.

Can you help me, please?

Thank you
Re: [MOD] Ultimate cronjob for sphider.
November 01, 2007 02:58AM
Well, of course you get a blank page when you run cron.php in a browser. Check whether it creates the log file; if it does, then it works fine. You just need to go to your host control panel and set up a cron job pointing to cron.php.
Re: [MOD] Ultimate cronjob for sphider.
November 12, 2007 12:51PM
How can I check if this works? I created both files and placed them in the admin directory!
Re: [MOD] Ultimate cronjob for sphider.
November 13, 2007 01:51AM
You can create a folder named "log" inside the admin folder and chmod it to 777; then, when you execute the file, you should see the log files there.
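
To see why no log shows up, it can help to check the log folder from PHP itself. The sketch below mirrors how spider_cron.php builds its log file name (`$log_dir`, then `date("ymdHi")`, then the extension); the two function names are made up for illustration and it assumes PHP 7.1 or newer.

```php
<?php
// Hypothetical helper: builds the log path the same way spider_cron.php does
// (log dir + date("ymdHi") + ".html" or ".log").
function cron_log_path(string $logDir, string $format, int $timestamp): string
{
    $ext = ($format === 'html') ? 'html' : 'log';
    return rtrim($logDir, '/') . '/' . date('ymdHi', $timestamp) . '.' . $ext;
}

// Hypothetical helper: returns a description of what is wrong with the
// log directory, or null if PHP should be able to create files in it.
function log_dir_problem(string $logDir): ?string
{
    if (!is_dir($logDir)) {
        return "Directory '$logDir' does not exist";
    }
    if (!is_writable($logDir)) {
        return "Directory '$logDir' is not writable by PHP";
    }
    return null;
}
```

Dropped into a small test script in the admin folder, `log_dir_problem('log')` should return null once the folder exists and is writable, and `cron_log_path('log', 'html', time())` shows the file name to look for.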
Re: [MOD] Ultimate cronjob for sphider.
November 14, 2007 08:02AM
Wait, so can someone explain to me what actually happens when the cron job is initiated? Does it do a full indexing or only to a certain depth? Does it put a lot of stress on the server if you have many links in your database to re-index? Does it slow down my site if it's running while someone visits my site? Please, can someone just explain the basics of what happens, what I should be concerned about, and what I should expect when it's running?

Thanks, Tyler
Re: [MOD] Ultimate cronjob for sphider.
November 14, 2007 05:47PM
I opened cron.php in my browser and it gave me a blank page. Then I looked in the log directory and it was empty! I have logging enabled in the Admin CP, and the directory is chmodded to 777!
Re: [MOD] Ultimate cronjob for sphider.
November 19, 2007 07:21PM
I've been thinking of this since I downloaded sphider (recently though). Promise to test it!
Thanks
Re: [MOD] Ultimate cronjob for sphider.
November 30, 2007 12:41AM
Thanks, this seems great!
However, I get this error from my host.
I think they won't allow me to run this.
Is there a way around this?

Warning: exec() has been disabled for security reasons in /home/texasbri/public_html/search/admin/cron.php on line 31
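
One possible workaround when exec() is disabled is to request spider_cron.php over HTTP from PHP itself instead of shelling out to wget. This is an untested sketch, not part of the mod: both function names are made up, it assumes PHP 7 or newer, and it only works if the host leaves `allow_url_fopen` enabled.

```php
<?php
// Hypothetical helper: builds the same URL that cron.php hands to wget.
// rawurlencode() turns the comma into %2C; PHP decodes it again in $_GET.
function spider_cron_url(string $baseUrl, int $number, string $cutoff): string
{
    return rtrim($baseUrl, '/') . '/spider_cron.php?n=' . rawurlencode("$number,$cutoff");
}

// Hypothetical helper: requests the URL without exec().
// Returns false if the request fails or times out.
function trigger_spider(string $url, int $timeoutSeconds): bool
{
    $context = stream_context_create(['http' => ['timeout' => $timeoutSeconds]]);
    return @file_get_contents($url, false, $context) !== false;
}
```

In cron.php the `exec("wget ...")` line would then become something like `trigger_spider(spider_cron_url('http://yourdomain.com', $number, $date_to_search), 30);`. The request still goes through the web server, so it does not reduce the server load.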
Re: [MOD] Ultimate cronjob for sphider.
December 27, 2007 03:18PM
How is it possible to change the number of sites to index and the number of days since the last index?

Brian Jorgensen
Kejoinet.dk
Denmark
Re: [MOD] Ultimate cronjob for sphider.
December 29, 2007 05:54AM
Unfortunately, you will get a large amount of server load from this function on larger databases. I would suggest this only to anyone with a smaller database.
Re: [MOD] Ultimate cronjob for sphider.
January 16, 2008 07:26AM
Hi, the cron job you have created consumes lots of memory and makes the server very busy, because you have used wget.exe to call spider_cron.php, which starts the request through the web server. Is there any option to run it with the PHP command-line program, like php-win.exe? That can handle it more efficiently, because it doesn't affect the server's performance and it runs in the background, and your website stays smooth because the web server uses php-cgi.exe as the PHP interpreter to render the PHP scripts. So please try to use php-win.exe on Windows-based systems.

I tried this cron job yesterday, but it became unresponsive and made the server busy.

So is there any other way to run the cron jobs without a web server URL? You have used exec("wget http://yourdomain.com/spider_cron.php?n=$number,$date_to_search "); here wget calls the page from the domain via its web server, which makes the server busy, because the cron job takes a lot of time to index pages, and until it finishes the server is unresponsive.

So please try some other method, using the PHP interpreter directly in command-line mode as a background task, like the Windows task service. I have set up many cron jobs with this Windows-based task service; they run fine and do not affect the web server's performance.
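
A rough sketch of how spider_cron.php could accept its `n` value from the command line, so it can be started with the PHP interpreter directly instead of through wget. `parse_n_param` is a made-up name and the invocation shown in the comment is an assumption, not something the posted mod supports as written.

```php
<?php
// Hypothetical helper: splits the "number,date" value that spider_cron.php
// reads from $_GET['n']. Anything that is not YYYY-MM-DD is dropped, because
// the date ends up inside an SQL query.
function parse_n_param(string $raw): array
{
    $parts = explode(',', $raw, 2);
    $number = (int) $parts[0];
    $date = isset($parts[1]) ? trim($parts[1]) : '';
    if (!preg_match('/^\d{4}-\d{2}-\d{2}$/', $date)) {
        $date = '';
    }
    return [$number, $date];
}

// Assumed invocation: php spider_cron.php 10,2007-10-26
// On the command line there is no query string, so copy argv into $_GET.
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    $_GET['n'] = $argv[1];
}
```

With these lines near the top of spider_cron.php, the existing `$_GET['n']` handling keeps working for browser and wget calls, and a scheduled task can call the script without going through the web server.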

thanks in advance.

Birender Singh Budhwar
Administrator
BSBDM.REDIRECTME.NET
Re: [MOD] Ultimate cronjob for sphider.
March 31, 2009 06:10PM
This would have been a major breakthrough for automatic site indexing, but I get a blank page and my log folder is just empty. I chmodded it to 777 but it still doesn't work.
Re: [MOD] Ultimate cronjob for sphider.
March 31, 2009 08:16PM
YEAAA


THANK YOUUUUUUU MANNN


IT REALLY WORKS

Current CPU Usage from my server

31281 "username" 0 1.0 1.1 /usr/bin/php /home/username/public_html/admin/spider_cron.php


THANK YOUUUUUUUU

PS: your host must give you cron job privileges; then you can start the cron job via cron.php.
Re: [MOD] Ultimate cronjob for sphider.
May 20, 2009 04:08AM
thanks ;)



Edited 2 time(s). Last edit at 05/20/2009 04:17AM by LittleEngineer.
Re: [MOD] Ultimate cronjob for sphider.
May 27, 2009 01:36AM
Would it be possible to get a copy of this as a .txt somewhere so that we don't have to fix the smilies?

thank you
Re: [MOD] Ultimate cronjob for sphider.
July 25, 2009 08:24PM
What is the purpose of the winking smiley right in the middle of the code? What is supposed to be there?
Re: [MOD] Ultimate cronjob for sphider.
July 25, 2009 08:38PM
A ; followed by a )

;)
Re: [MOD] Ultimate cronjob for sphider.
July 30, 2009 05:25PM
What is wrong with just running the spider.php file, like the following (set to run every 2 minutes):

*/2 * * * * /usr/path/to/cronlog "/path/to/sphider/admin/spider.php -all > /dev/null"

The reason I ask is that I am trying to get this to work but it won't and I am getting no help in the main forum.

Cheers,
Dan
Re: [MOD] Ultimate cronjob for sphider.
August 02, 2009 09:22PM
Well, I'm actually getting this problem:

Parse error: syntax error, unexpected ';' in /home/user/public_html/folder/search/admin/cron.php on line 5

any help will be appreciated
Re: [MOD] Ultimate cronjob for sphider.
August 13, 2009 04:16PM
yeah I'm getting the same, and when the 'offending' syntax is removed, the same error appears for line 9. Relisys, did you find a solution?

cheers,

Dave
Re: [MOD] Ultimate cronjob for sphider.
August 28, 2009 05:31PM
Hello. Is there a solution for my problem?


error in spider_cron.php

Warning: set_time_limit() has been disabled for security reasons in /usr/home/*.com/web/admin/spider_cron.php on line 2



error in cron.php

Warning: set_time_limit() has been disabled for security reasons in /usr/home/*.com/web/admin/cron.php on line 3
Warning: exec() has been disabled for security reasons in /usr/home/*.com/web/admin/cron.php on line 31



Edited 3 time(s). Last edit at 08/28/2009 05:42PM by sergiodik.
Re: [MOD] Ultimate cronjob for sphider.
September 30, 2009 11:21PM
Don't bother with this script!
Re: [MOD] Ultimate cronjob for sphider.
October 07, 2009 09:47PM
Thanks for the great mod.
I did have an issue on Vista: wget does not exist with LAMP.
I also found a small error: if you add a new site, its indexdate gets the value NULL by default, so your cron job will not auto-index it.
Here is the SQL for cronjob.php

if ($number == 0){
//$sql = "SELECT * from ".$mysql_table_prefix."sites WHERE indexdate < '{$date_to_search}'";
$sql = "SELECT * from ".$mysql_table_prefix."sites WHERE (indexdate is null) or (indexdate < '{$date_to_search}')";
$res= mysql_query($sql);

$number = mysql_num_rows($res);

$number = intval($number / $separator);

}

Here is the sql for spider_cronjob.php
function index_all($date_to_search, $number) {

global $mysql_table_prefix, $delay;
$sql = "SELECT * from ".$mysql_table_prefix."sites WHERE indexdate < '{$date_to_search}' ORDER BY indexdate ASC ";
$res = mysql_query($sql);


Here is the last small code change in cronjob.php, to work out how many sites need to be updated:
$date_to_old = 3;
$separator = 1;

//$number = 10;
$number = 0;

After this change the script is working very nicely for me.
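
The NULL-aware query above could be wrapped in a small helper so the same condition is used wherever the stale sites are selected. This is just a sketch, assuming PHP 7 or newer; `stale_sites_sql` is a made-up name, and the table and column names are taken from the queries in this thread.

```php
<?php
// Hypothetical helper: builds the query for sites that were never indexed
// (indexdate IS NULL) or were last indexed before the cutoff date.
function stale_sites_sql(string $tablePrefix, string $cutoff): string
{
    // The cutoff is placed inside the SQL string, so accept only YYYY-MM-DD.
    if (!preg_match('/^\d{4}-\d{2}-\d{2}$/', $cutoff)) {
        throw new InvalidArgumentException('cutoff must be YYYY-MM-DD');
    }
    return "SELECT * FROM {$tablePrefix}sites"
         . " WHERE indexdate IS NULL OR indexdate < '{$cutoff}'"
         . " ORDER BY indexdate ASC";
}
```

In MySQL, NULL values sort first with ORDER BY ... ASC, so never-indexed sites would be crawled before the oldest indexed ones.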
Re: [MOD] Ultimate cronjob for sphider.
October 12, 2009 10:21PM
I see that you received plenty of help... NOT!
;) ;) ;) ;)
Re: [MOD] Ultimate cronjob for sphider.
October 12, 2009 10:54PM
I see Willy doesn't use Sphider anymore! Wise choice!