Atom feed

Damian Cugley

Let’s add an Atom resource to allow feed readers to read a feed of blog posts.

A what

Feed readers (also called RSS readers) like NetNewsWire are apps for keeping up with blogs by listing new posts on all your subscribed blogs. (They use the term subscription, but it is not a paid subscription.) To make this work, the blog website publishes a file in a format like RSS, Atom, or JSON Feed. When you subscribe to a blog, your reader app downloads this periodically to see if new posts have been added.

The feed document corresponds to the blog index page, and it has entries corresponding to blog posts.

Mismiy version

An Atom feed is an XML document. Rather than generate XML with Mustache, we added a simple XML writer that generates tidy XML code. This slightly contradicts our aim of writing the minimum code, but using an XML writer avoids wasting time working out why a Mustache template is not generating valid XML.
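To give a feel for how small such a writer can be, here is a minimal sketch in Python. It is not Mismiy's actual code: the class name `XmlWriter`, its method names, and the feed URL in `example_feed` are all invented for illustration. The point is that escaping and closing tags are handled in one place, so the output is always well-formed.

```python
from xml.sax.saxutils import escape, quoteattr


class XmlWriter:
    """Tiny indenting XML writer (hypothetical; not Mismiy's real class)."""

    def __init__(self):
        self.lines = ['<?xml version="1.0" encoding="UTF-8"?>']
        self.stack = []  # names of the elements that are currently open

    def _attrs(self, attrs):
        # quoteattr adds the surrounding quotes and escapes the value.
        return "".join(f" {name}={quoteattr(value)}" for name, value in attrs.items())

    def open(self, tag, **attrs):
        self.lines.append("  " * len(self.stack) + f"<{tag}{self._attrs(attrs)}>")
        self.stack.append(tag)

    def close(self):
        tag = self.stack.pop()
        self.lines.append("  " * len(self.stack) + f"</{tag}>")

    def leaf(self, tag, text=None, **attrs):
        indent = "  " * len(self.stack)
        if text is None:
            self.lines.append(f"{indent}<{tag}{self._attrs(attrs)}/>")
        else:
            self.lines.append(f"{indent}<{tag}{self._attrs(attrs)}>{escape(text)}</{tag}>")

    def result(self):
        if self.stack:
            raise ValueError(f"unclosed elements: {self.stack}")
        return "\n".join(self.lines) + "\n"


def example_feed():
    # The title and URL are made-up sample values.
    writer = XmlWriter()
    writer.open("feed", xmlns="http://www.w3.org/2005/Atom")
    writer.leaf("title", "Cats & dogs")
    writer.leaf("link", rel="self", href="https://blog.example/feed.atom")
    writer.close()
    return writer.result()
```

Calling `example_feed()` returns a string in which the ampersand in the title has been escaped, which is exactly the kind of detail that is easy to get wrong in a text template.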

On top of the metadata we already have, Atom entries for posts need authors, unique identifiers, and updated datetimes. Feeds need unique identifiers, and a link to their own URL.

Author objects can be added to the schema for post metadata. Mismiy supports two formats: the first is just a string, the name of the writer:

title: Some blog entry
author: Alice de Winter

The other includes extra fields that Atom allows:

title: Some blog entry
author:
    name: Alice de Winter
    uri: https://dewinter.example/alice/
    email: alice@dewinter.example
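
Supporting both forms is cheap if the metadata is normalized as soon as it is loaded. The following is a sketch of one way to do that, not necessarily how Mismiy does it; the function name `normalize_author` is an assumption. It keeps only `name`, `uri`, and `email`, the three fields Atom defines for a person.

```python
def normalize_author(author):
    """Return a dict with a 'name' key for either form of author metadata."""
    if isinstance(author, str):
        # Short form: the string is just the writer's name.
        return {"name": author}
    if isinstance(author, dict) and "name" in author:
        # Long form: keep only the fields Atom allows for a person.
        return {key: author[key] for key in ("name", "uri", "email") if key in author}
    raise ValueError(f"author needs at least a name: {author!r}")
```

With this in place the code that writes the feed only ever has to deal with the dictionary form.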

Another mandatory field is the updated timestamp of each entry. It can optionally be specified in the metadata at the top of a post, but it is only needed when new information has been added to the post since it was published; otherwise Mismiy can just use the published datetime.
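
The fallback rule is short enough to sketch in Python. The metadata keys `updated` and `published` and both function names are assumptions made for this example, as is the choice to treat datetimes without a time zone as UTC. Atom expects timestamps in RFC 3339 format, which `datetime.isoformat` produces for time-zone-aware values.

```python
from datetime import datetime, timezone


def entry_updated(meta):
    """Return the explicit updated datetime, or else the published one."""
    return meta.get("updated") or meta["published"]


def atom_timestamp(when):
    """Format a datetime the way Atom wants it (RFC 3339)."""
    if when.tzinfo is None:
        # Assumption for this sketch: naive datetimes are in UTC.
        when = when.replace(tzinfo=timezone.utc)
    return when.isoformat()
```

A post whose metadata only has a published datetime then gets the same value for both of its Atom timestamps.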

Digression on identifiers

Atom requires identifiers for entries and the feed as a whole. We want Mismiy to be able to derive them from information authors already supply, such as the file names of the posts, so that authors are not required to do extra work to satisfy Atom’s requirement.

The unique identifiers for entries are strings in the form of URIs. They have to be unique and durable. The three main types of URI that seem applicable are:

These are respectively

How do we determine the ID for a given post?

How do we determine the ID for a given feed?

The second option means the template for creating a blog need not include an identifier, reducing the risk that people will accidentally create clashing feeds.

Other feed formats are available

Any RSS reader in current use will already support Atom, so there is no added value in supporting other formats as well. We chose Atom because it has the clearest and most thoroughly worked-out specification.

Disappointingly, the Feed Validation Service still warns against using a namespace prefix in the XML document, linking to a survey from 2007 showing many feed readers fail to read XML with namespaces properly. We are going to ignore that and assume that anyone maintaining a feed reader in the last decade will have made use of an adequate XML parser.

On Linux and macOS we can check the feed file is valid using a RelaxNG schema and the xmllint command or the Python package lxml.
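
As a rough sketch of the lxml route, the function below loads a RelaxNG schema and reports any problems it finds in a feed file. The function name `feed_errors` is invented for this example, and it assumes you have a RelaxNG schema for Atom saved in XML syntax and have installed the third-party `lxml` package.

```python
def feed_errors(feed_path, schema_path):
    """Return a list of validation messages; an empty list means valid."""
    from lxml import etree  # third-party package: pip install lxml

    schema = etree.RelaxNG(etree.parse(schema_path))
    document = etree.parse(feed_path)
    if schema.validate(document):
        return []
    # Each log entry records the line number and a description of the problem.
    return [f"{error.line}: {error.message}" for error in schema.error_log]
```

The equivalent check on the command line is `xmllint --noout --relaxng atom.rng feed.xml`, where both file names are placeholders for your own schema and feed.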
