Indexing rss feed web sites.
November 14, 2015 04:46AM
Hi all.

Wondering if someone can give me some guidance? I am trying to set up the search engine to crawl through different sites and find and index only the RSS feed pages of those sites.

I have been trying different approaches, such as using “/rss” and “.xml” as keywords, along with just a regular crawl. No matter what I do, I cannot seem to get the search engine to pick up and index these RSS feed pages.

Wondering if anyone has a suggestion I can try to get these RSS feed pages indexed. Preferably it would index only the RSS pages, yet still continue to crawl through the regular links looking for more RSS pages.

Thanks for any help you can offer.
Re: Indexing rss feed web sites.
November 14, 2015 09:32AM
Sphider-plus does it

Re: Indexing rss feed web sites.
November 14, 2015 08:27PM
Thanks for that piece of feedback. I will check into that option also.
Re: Indexing rss feed web sites.
November 15, 2015 09:37AM
Okay, if you would like more detail:

Sphider-plus is able to index:
- RDF, RSD, RSS and Atom feeds.
Feed indexing has to be activated in the admin settings. In addition, the following options are available for feed indexing:
- Follow CDATA directives (select to obey CDATA tags in feeds).
- Index 'Dublin Core' and other individually marked tags in RDF feeds (select to index individual tags in RDF feeds, like <dc:title> and <sy:pubDate>).
- Follow the 'preferred (true/false)' directive in RSD feeds (select to obey the 'preferred' directive in RSD feeds).
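To give a rough idea of how a crawler can tell these four feed families apart, here is a minimal Python sketch that looks only at the root element of a fetched document. This is not Sphider-plus code (Sphider-plus is a PHP application); the function name feed_kind is made up for illustration, and the check is deliberately simplistic.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the Atom and RDF specifications.
ATOM_NS = "http://www.w3.org/2005/Atom"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def feed_kind(xml_text):
    """Hypothetical helper: return 'rss', 'atom', 'rdf', 'rsd', or None."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        # Not well-formed XML, so not treated as a feed here.
        return None

    # ElementTree reports namespaced tags as '{namespace}localname'.
    if root.tag.startswith("{"):
        namespace, local = root.tag[1:].split("}", 1)
    else:
        namespace, local = "", root.tag

    if local == "rss":
        return "rss"
    if local == "feed" and namespace == ATOM_NS:
        return "atom"
    if local == "RDF" and namespace == RDF_NS:
        return "rdf"
    if local == "rsd":
        return "rsd"
    return None
```

A crawler could call feed_kind() on every fetched document and index only those for which it returns a value, while still following links found in ordinary pages.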

In case you would like to read more about feeds, please see the following references:

RDF feeds
RDF Vocabulary Description Language 1.0: RDF Schema (10 February 2004).
RDF/XML Syntax Specification (Revised: 10 February 2004).

RSD feeds
RSD: Really Simple Discoverability 1.0.

RSS feeds
RSS 0.91, 0.92 and 2.0 Specification.

RSS feed Specification
Website of the RSS working group.

Atom feeds
The Atom Syndication Format.

And yes, besides indexing feeds, Sphider-plus will follow all links and detect further pages containing text, feeds and media content.
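
For anyone curious about the general idea of "index only feeds, but keep following links", here is a small Python sketch, again not taken from Sphider-plus. It assumes pages advertise their feeds with <link> tags carrying a feed MIME type (the usual autodiscovery convention). The names LinkSorter and sort_links are hypothetical, and real sites may expose feeds in other ways.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types commonly used in <link> tags that advertise feeds.
FEED_TYPES = {
    "application/rss+xml",
    "application/atom+xml",
    "application/rdf+xml",
    "application/rsd+xml",
}


class LinkSorter(HTMLParser):
    """Hypothetical helper: split a page's links into feeds and pages."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []   # candidates to index
        self.pages = []   # candidates to crawl further

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        href = attributes.get("href")
        if not href:
            return
        url = urljoin(self.base_url, href)  # resolve relative links
        link_type = (attributes.get("type") or "").lower()
        if tag == "link" and link_type in FEED_TYPES:
            self.feeds.append(url)
        elif tag == "a":
            self.pages.append(url)


def sort_links(html, base_url):
    """Return (feed_urls, page_urls) found in one HTML document."""
    parser = LinkSorter(base_url)
    parser.feed(html)
    parser.close()
    return parser.feeds, parser.pages
```

The feed URLs would go to the indexer, and the page URLs would go back into the crawl queue so that more feeds can be discovered.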