  #1  
December 6th, 2006, 02:55 PM
gaikokujinkyofusho@gmail.com
Removing duplicate entries/stories from an RSS feed?

Hi, I have been enjoying subscribing to RSS feeds
(http://kinja.com/user/thedigestibleaggie) for a while and have built up
a fairly nice list of feeds, but I have run into an annoying
(though not critical) problem: duplicate stories. Apparently some of the
sites I subscribe to overlap, so I get the same story more than once.
Does anyone know of a filter (software or online service) that can
remove duplicate stories? Any help or suggestions
would really be appreciated!

Cheers

-Gaiko

  #2  
December 6th, 2006, 06:30 PM
Paul Lutus
Re: Removing duplicate entries/stories from an RSS feed?

gaikokujinkyofusho@gmail.com wrote:
Quote:
Hi, I have been enjoying subscribing to RSS feeds
(http://kinja.com/user/thedigestibleaggie) for a while and have built up
a fairly nice list of feeds, but I have run into an annoying
(though not critical) problem: duplicate stories. Apparently some of the
sites I subscribe to overlap, so I get the same story more than once.
Does anyone know of a filter (software or online service) that can
remove duplicate stories? Any help or suggestions
would really be appreciated!

Write a script in a language that supports associative arrays (as do Java,
Perl, Ruby, Python, and even JavaScript). For each RSS feed item, generate a
key from elements of the item, then store the item in the associative array
under that key. Because the array holds only one entry per key, items that
generate the same key collapse into a single entry.
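
As a rough illustration of the associative-array idea, here is a minimal
Python sketch. It is only one way to do it: the choice of title plus link as
the key, and the helper names parse_items, item_key, and dedupe, are
assumptions made for this example and do not come from the post. It also
assumes plain RSS 2.0 input without namespaces.

```python
import xml.etree.ElementTree as ET


def parse_items(rss_text):
    """Extract title and link from every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_text)
    return [
        {
            "title": node.findtext("title", default=""),
            "link": node.findtext("link", default=""),
        }
        for node in root.iter("item")
    ]


def item_key(item):
    """Build the key from title and link (an assumed choice of elements)."""
    # Lowercase and collapse whitespace so trivial formatting differences
    # do not produce different keys.
    title = " ".join(item["title"].lower().split())
    return (title, item["link"].strip())


def dedupe(items):
    """Keep the first item seen for each key; a dict is the associative array."""
    seen = {}
    for item in items:
        seen.setdefault(item_key(item), item)
    return list(seen.values())
```

To merge several feeds, concatenate the lists returned by parse_items and
pass the result to dedupe. Items whose keys differ in any way are all kept.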

Unfortunately, it is rare for two RSS feed items to be truly identical.
Often, they tell the same story with small differences in wording (to avoid
accusations of plagiarism) and of course the URL is normally different.

Without some complex coding to detect items that are almost the same, the
above method will remove only genuinely identical items from different RSS
feeds.
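
For a sense of what detecting "almost the same" items might involve, here is
a small sketch that compares titles with difflib.SequenceMatcher from the
Python standard library. This is a swapped-in technique chosen for the
example, not something proposed in the post, and the function names and the
0.8 threshold are arbitrary assumptions that would need tuning on real feeds.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a ratio in [0, 1]; case and extra whitespace are ignored."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def drop_near_duplicates(titles, threshold=0.8):
    """Keep a title only if it is not too similar to one already kept."""
    kept = []
    for title in titles:
        # threshold is an assumed cutoff; higher values remove fewer items.
        if all(similarity(title, other) < threshold for other in kept):
            kept.append(title)
    return kept
```

Each title is compared against every title already kept, so the cost grows
quadratically with the number of items. Stories that are reworded heavily
will still slip through a purely textual comparison like this one.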

--
Paul Lutus
http://www.arachnoid.com
 
