Update to Date-based Entry Ignoring

TL;DR: FeedMail will now ignore new items that are more than 7 days older than a previously seen item. This is expected to affect almost no "true" new posts.

In theory, checking a feed for new entries is a simple process.

  1. Download the feed.
  2. Check the ID of each entry to see if you have seen it before.
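As a rough illustration, the two steps above might look like the sketch below. This is not FeedMail's actual code: the `find_new_entries` name, the dict-shaped entries with an `"id"` key, and the in-memory `seen_ids` set are all assumptions made for the example, and the download step is assumed to have already produced the list of entries.

```python
def find_new_entries(entries, seen_ids):
    """Return the entries whose ID has not been seen before.

    `entries` is a list of dicts with an "id" key (hypothetical shape).
    `seen_ids` is a set of IDs from earlier checks; it is updated in place.
    """
    new_entries = []
    for entry in entries:
        if entry["id"] not in seen_ids:
            new_entries.append(entry)
            seen_ids.add(entry["id"])
    return new_entries


# Usage: "a" was seen on a previous check, "b" is new.
seen = {"a"}
fresh = find_new_entries([{"id": "a"}, {"id": "b"}], seen)
```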

However, the real world is much messier. It is recommended that feed entry IDs be URLs (to ensure global uniqueness), but as a result many feeds simply use the URL at which the article is available. These URLs sometimes change, and poorly designed feed generators update the ID of existing entries to the new URL.

From a protocol point of view these are completely new entries, but to a user they are duplicates. To reduce the effect of this common issue on our users, FeedMail has some simple mitigations for posts that have recorded published dates.

  1. If the entry is more than a year old, always ignore it.
  2. If the entry is older than the 10th newest post in the feed, ignore it.
  3. If the entry is more than 7 days older than the newest post in the feed, ignore it.
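The three rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not FeedMail's implementation: the `should_ignore` function and its parameters are hypothetical, "a year" is approximated as 365 days, and `feed_dates` is assumed to hold the published dates of the posts the candidate is compared against (the post does not say exactly which set that is). Entries without a published date are never ignored here, since the mitigations only apply to posts with recorded dates.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=365)  # rule 1: "older than a year" (assumed 365 days)
MAX_RANK = 10                  # rule 2: compare against the 10th newest post
MAX_LAG = timedelta(days=7)    # rule 3: window behind the newest post


def should_ignore(published, feed_dates, now):
    """Return True if a not-yet-seen entry should be dropped as a likely duplicate."""
    if published is None:
        return False  # no recorded date, so the date-based rules do not apply

    # Rule 1: more than a year old.
    if now - published > MAX_AGE:
        return True

    newest_first = sorted(feed_dates, reverse=True)

    # Rule 2: older than the 10th newest post (only if the feed has that many).
    if len(newest_first) >= MAX_RANK and published < newest_first[MAX_RANK - 1]:
        return True

    # Rule 3: more than 7 days older than the newest post.
    if newest_first and newest_first[0] - published > MAX_LAG:
        return True

    return False


# Usage: an infrequent feed whose newest post is one day old.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
feed_dates = [now - timedelta(days=d) for d in (1, 20, 40)]
slightly_out_of_order = should_ignore(now - timedelta(days=3), feed_dates, now)
old_with_new_id = should_ignore(now - timedelta(days=20), feed_dates, now)
```

In this example the entry published 3 days ago is kept because it falls within the 7-day window behind the newest post, while the 20-day-old entry is ignored by rule 3 even though the feed has fewer than 10 posts and rule 2 never applies.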

Rule 3 is a new rule that better filters out old entries for infrequent feeds. In many cases this will ensure that you only receive one or two duplicate entries if a feed accidentally changes its post IDs.

One might think that you could simply ignore any article that isn't newer than the newest article. In practice, however, many feeds post items with published timestamps slightly out of order. Based on our analysis, a 7-day window will almost never ignore new articles while still filtering out many duplicates with new IDs.
