Feed me too!

Saturday · 10 Nov 07 · 08:04 PM IST | Posted by Karthik | Category: Tech

Hey, Happy Diwali people!

It was a long weekend at work (yesterday was a holiday), and I was quite happy to have one!  Contrary to what you'd expect, though, I spent yesterday and today doing a bit of coding on my site, and have built an RSS feed for it, which I will talk about in this extremely lengthy post.  Don't worry though, our regularly scheduled programming (movies, drawings, etc.) will continue soon...

I've wanted to add an RSS feed to my site for a very long time.  But it's only in the last couple of months that I've been seriously planning for it and thinking of ways to implement it, and I finally completed it today.  Look to the left of the screen: below the menu there is an icon with "RSS" next to it.  If you are viewing this site in a browser that is capable of recognising RSS feeds, you should also see the feed icon in the address bar itself.

But hold on a moment, let me take some time to clearly explain what this is about, and how I implemented it.  You'll have to forgive me if some of this is very basic stuff (and some of you must forgive me if this is needlessly complex).  Also, the way I've coded it is probably not the best way to do it.  But it was something interesting to do, and I put quite some time into it, so I thought, why not write an article on it?

What is an RSS Feed and why should I care?

If you're like me and you regularly follow a number of websites/blogs, you probably keep visiting them to see if the authors have updated the content there.  But if the list of these websites is large, you'd probably like it better if you could get a "summarised" view of all the updates across the entire set, like the one below:

Google Reader Screenshot

That's a screenshot from Google Reader, which is an aggregator.  An aggregator (or "feed reader") is a website or a software program that provides such summarised information pulled from websites.  How does it know what information to pull from a website?  That's where an RSS feed comes in.  You see, a few years ago, a bunch of guys came up with a method for a site to provide a summary of updated content, in the form of a simple XML file.  This specification is called RSS (Really Simple Syndication).  The way it works is, a website generates a summary of updates as an XML file conforming to this specification, and then aggregators simply peek at this file and generate the summary you see above.  Sites that provide a summary like this can have their content syndicated by other websites this way, so it becomes another channel to get the content out there.

Building the RSS Feed for Karthik82.com

So, now we've established that we need an XML file formatted in a particular way and made available to anyone who asks for it (an aggregator/feed reader).  What format is actually required?  Look no further than the RSS 2.0 Specification page.  As you can make out, the file contains the following:

  • An opening header, identifying it as an XML document, followed by the rss root element that wraps everything else,
  • a channel tag, which contains information about the website that is providing the feed, and
  • a set of item elements inside the channel, which are the entries that make up the feed.  Each item contains a title, a description (the actual content) and a link (which points to the actual content on the site).
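
To make that structure concrete, here is a minimal sketch that builds such a document.  It is written in Python purely for illustration (the code behind this site is PHP and is not reproduced in this post), and the function name build_feed, the sample entry and the example.com URLs are all made up for the example.

```python
import xml.etree.ElementTree as ET


def build_feed(site_title, site_link, site_description, items):
    """Return an RSS 2.0 document as a string.

    `items` is a list of dicts with "title", "link" and "description" keys
    (a hypothetical shape chosen for this sketch).
    """
    rss = ET.Element("rss", version="2.0")

    # The channel describes the site that provides the feed.
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_link
    ET.SubElement(channel, "description").text = site_description

    # Each entry becomes an item nested inside the channel.
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
        ET.SubElement(item, "description").text = entry["description"]

    # ElementTree escapes markup in text nodes; the XML header is added by hand.
    body = ET.tostring(rss, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


feed_xml = build_feed(
    "Example site",
    "http://example.com/",
    "Recent activity on the site",
    [{
        "title": "Feed me too!",
        "link": "http://example.com/news/1",
        "description": "An <b>example</b> entry",
    }],
)
print(feed_xml)
```

Note that the markup inside the description gets escaped here (as &lt;b&gt; and so on), which is one valid way of carrying HTML in a feed.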

So now, all we need to do is make sure that whenever an aggregator wants it, an XML file formatted in the above way, with the relevant data, is available.  There are a few ways in which this can be done:

  • Generate the XML file manually (that is, type it out in any text editor) each time an update is made to the site: add the required data to the XML file and then upload it.  This would work, but it's not really a good way of doing things.  Plus, not all the content from my site can be syndicated this way; I wanted comments to show up in the feed as well.
  • Have a PHP script on the server that triggers the generation of the XML file, whenever it is called.  This one is a step better than the previous method.  Since it's a PHP script generating the XML file, it can have direct access to the database and pull out whatever content needs to be present in the feed.  The problem is, the XML will be updated only when the script is executed, so it'll not necessarily be in sync with the content on the site.
  • The third method is the one I finally went with: don't have an XML file on the server at all!  Dynamically "feed" the XML content to the aggregator (through PHP, again) whenever it asks for it.  The WordPress blogging platform handles RSS this way.
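
The third approach can be sketched as a small web handler that builds the XML on every request and never writes a file.  This is an illustrative Python (WSGI) version, not the PHP script that runs on this site; load_recent_items is a hypothetical stand-in for the database queries, and the titles and example.com URLs are placeholders.

```python
from xml.sax.saxutils import escape


def load_recent_items():
    """Hypothetical stand-in for the database queries."""
    return [
        {
            "title": "Feed me too!",
            "link": "http://example.com/news/1",
            "description": "Hello & welcome",
        },
    ]


def render_feed(items):
    """Render the items as an RSS 2.0 document, escaping all text."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<rss version="2.0">',
        "<channel>",
        "<title>Example site</title>",
        "<link>http://example.com/</link>",
        "<description>Recent activity</description>",
    ]
    for item in items:
        lines.append(
            "<item><title>{}</title><link>{}</link>"
            "<description>{}</description></item>".format(
                escape(item["title"]),
                escape(item["link"]),
                escape(item["description"]),
            )
        )
    lines += ["</channel>", "</rss>"]
    return "\n".join(lines)


def feed_app(environ, start_response):
    """WSGI application: the XML is generated at request time, not stored."""
    body = render_feed(load_recent_items()).encode("utf-8")
    headers = [
        ("Content-Type", "text/xml; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the standard library's wsgiref.simple_server.make_server("", 8000, feed_app).serve_forever() will serve the feed on port 8000.  Because the data is read on each request, the feed is always in sync with the site, at the cost of running the queries every time an aggregator polls.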

Now that I'd decided how to generate the feed, the next step was to decide how to pull the relevant data from my database tables and make it available in the feed.  I wanted to have a chronological view of activity on the site, which means that not only blog posts should figure in the feed; drawings, movie reviews and, most importantly, comments needed to be there as well.  I thought of a few ways to accomplish this:

  • Pick up the top 15 (or whatever) rows from the blog posts' database table, step through them one row at a time, and generate the item information.  This doesn't achieve the objective of representing all activity on the site (as it only concerns itself with the front-page blog posts), but it's the simplest thing to do.
  • The ideal thing to do would be this: create a doubly linked list, where each node contains an item.  Step through all the rows in all the relevant database tables on my site (comments, news posts, drawings, movie reviews, level reviews), and put them all in the list.  Then, sort the whole list chronologically.  One of the fields in each item is a date anyway, which can be stored as a Unix timestamp and used as the sort key.  Then, step through this list, pick up the top 15 or 20 (or whatever number) items, and dash them off as the XML.  This would present (as I call it) a true chronological view of all activity on the site.  But it is a bit complicated to do, and, most importantly, aggregators are going to be polling the feed regularly, so I was worried that loading every row into memory on each request would cause problems.  Alternatively, I could do this on the database side, by creating a view that would contain the sorted data from the relevant tables.  But I don't know whether that would make matters any better, and besides, I'm far worse at coding SQL than I am at PHP, so I don't think I'd be able to pull that off.  Bottom line: this is a bit too much.
  • What I finally ended up doing was a compromise.  I query the tables one by one: I pick up the top 10 rows from the news table, 5 rows from the drawings table, 5 from the comments, 5 from the level reviews, and 5 from the movie reviews.  So that's thirty rows in all.  From these, I pick up the relevant fields and put them all into an array.  (I actually started coding a doubly linked list, but Varun advised me to use an array instead.  I'm glad I followed his advice, since I realised that PHP's arrays are very powerful and not at all like the arrays I was used to in C.)  This is actually an array of objects: I defined a class called feedItem which contains the fields I need in the RSS feed.  I then perform a Bubble Sort on this array.  Of all the sorting methods I was exposed to in Engineering, I always found Bubble Sort to be quite an intuitive one, which is why I went with it.  Perhaps it's not the best method (then again, I don't think the performance will be too bad, because the array has only thirty elements and many of them would already be in order), but to be frank I didn't think too much about optimisation.  My code is an implementation of the first algorithm given on the Wikipedia page (though I remember following the second one during Engineering).  Once sorted, I simply output the XML (I send a Content-Type: text/xml header with PHP and then follow that with the XML content), and voilà, the feed is ready!
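
The compromise described in the last point can be sketched roughly as follows.  Again this is illustrative Python rather than the site's PHP: the FeedItem class only mirrors the idea of the feedItem class mentioned above, its field names are assumptions, the sample rows are invented, and the bubble sort is a basic textbook variant that may differ in detail from the one actually used.

```python
from dataclasses import dataclass


@dataclass
class FeedItem:
    """One entry in the feed (field names are assumed for this sketch)."""
    title: str
    link: str
    description: str
    timestamp: int  # Unix timestamp, used as the sort key


def bubble_sort_newest_first(items):
    """Sort the list in place, newest first, and return it.

    Each pass pushes the oldest remaining item to the end, so the next
    pass can stop one position earlier.  A pass without swaps means the
    list is already in order, which is cheap for nearly sorted input.
    """
    n = len(items)
    swapped = True
    while swapped:
        swapped = False
        for i in range(1, n):
            if items[i - 1].timestamp < items[i].timestamp:
                items[i - 1], items[i] = items[i], items[i - 1]
                swapped = True
        n -= 1
    return items


# Invented sample rows; each per-table list is already newest first,
# as it would be after an ORDER BY ... LIMIT query on that table.
news = [
    FeedItem("News B", "http://example.com/news/2", "...", 400),
    FeedItem("News A", "http://example.com/news/1", "...", 100),
]
comments = [FeedItem("Comment", "http://example.com/news/1#c1", "...", 300)]
drawings = [FeedItem("Drawing", "http://example.com/drawings/1", "...", 200)]

# Put everything into one list, then sort it as a whole.
combined = bubble_sort_newest_first(news + comments + drawings)
```

Python's built-in sorted(combined, key=lambda item: item.timestamp, reverse=True) gives the same order; the hand-written sort is shown only because that is the approach the post describes.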

Whew!  So now, you can have something like this —

Google Reader Screenshot — with the RSS Feed from Karthik82.com!

Yay!  That's Google Reader with a subscription to my site's RSS feed!  The URL to the feed on this site, by the way, is http://karthik82.com/rss, so you can add that to your aggregator.  That's not all: there is one more cool thing I did with this.  I added this RSS feed to my orkut profile as well.  When you visit my profile, you'll see a link called Karthik82.com in the left side menu, and on clicking it, you'll see something similar to this:

Karthik's orkut profile — with the RSS Feed from Karthik82.com!

That's a summary of the updates on my site, accessible within orkut itself.  You can add the feed link to any site or service that supports RSS syndication.

A couple of people to credit: Krish Ashok, for the discussion we had the other day, which more or less confirmed that the direction I was heading in was the right one; Kartik Agaram, for being a strong advocate of building an RSS feed; and Tobias Münch, for going ahead and building an RSS feed for his site, thereby inspiring me to do one myself (the headline, by the way, is a reference to the post on Toby's site where he announced that he had created an RSS feed for it).  I also picked up some info from three articles; the last one is about CDATA sections in XML, which I used for the description field of every item, since the blog post content that goes into the XML has markup within it.

Now, I don't know how many people will actually use the RSS feed from my site, but hey, I learnt something while implementing this, and I can lay claim to building a completely customised RSS feed for my site.  Feel free to give your comments below.
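
On the CDATA point: wrapping the description in a CDATA section lets the HTML of a post go into the feed without escaping every angle bracket.  The sketch below is illustrative Python, not the site's PHP, and the helper names cdata and description_element are made up.  The one thing to watch is that the content itself must not contain the sequence "]]>", so the helper splits that sequence across two sections.

```python
import xml.etree.ElementTree as ET


def cdata(text):
    """Wrap text in a CDATA section.

    Any "]]>" inside the text is split across two sections so that it
    cannot end the section early.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def description_element(html):
    """Build a description element whose content is raw HTML in CDATA."""
    return "<description>" + cdata(html) + "</description>"


post_html = "<p>Happy Diwali & <b>welcome</b>!</p>"
element_xml = description_element(post_html)

# An XML parser hands back the original markup unchanged.
round_trip = ET.fromstring(element_xml).text
```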

19 comment(s) for this post.

Comments for this News Post

#1
10 Nov 07 · 11:12 PM
Comment by user Varun
Good good... gonna read this when I add RSS to varunabhiram.com ;)
#2
10 Nov 07 · 11:32 PM
Comment by user Karthik
I'll be glad if someone finds this useful :)
#3
11 Nov 07 · 12:53 AM
Comment by user Toby
Just checked the post very quickly, but I'll have an in-depth look at it later of course, since a dynamic feed is exactly what I need to add RSS to Kotogoto Daily. I bet your post will be very useful to me, Karthik! Oh and btw, thanks for the reference in the post title. ;-)
#4
11 Nov 07 · 03:23 AM
Nice feature you added to your site! :) Very useful! I am going to use it! ;)
#5
11 Nov 07 · 09:26 PM
Comment by user Karthik
Thanks Toby and Ismaele.

Btw, I typed up that entire post, tags and all, and added it to the database, and once it appeared on this page, I took a chance and passed it thru the XHTML validation service... not one error! Was very happy about that :)

Hey I might have to change my profile pic here too, what with both Toby and Varun changing theirs...
#6
12 Nov 07 · 04:37 AM
Nice!!

Surely php has its own sorting routines? You shouldn't have to write up your own.

You don't even need external sorting. Just maintain a timestamp field in the database for each item. Then generating the most recent items is a simple sql query with the ORDER BY and LIMIT clauses.
#7
12 Nov 07 · 03:50 PM
Comment by user Karthik
Kartik - Yes, PHP does have functions that can sort arrays. I did it myself though, just so I could show off a bit :)

What you said about ORDER BY and LIMIT, I am using already. The problem here is, that there are five different tables (and a couple of those have foreign keys in other tables too) from where I am getting the data.

When I pick up data from each table, it is sorted by itself, but not necessarily in chronological order when considered with ALL the data. That's why I am doing a bit of sorting myself after querying the tables.

Perhaps I will change the sorting method to a more optimal one later.
#8
13 Nov 07 · 12:27 AM
Good job dude. I glanced through this post, Will read in detail later. I'm sure it will have a lot of details knowing you since college days :).

Keep it up!
#9
13 Nov 07 · 01:55 AM
Ah, that makes sense. Lots of sites have separate feeds for articles and comments, but putting them together is nicer and makes for a unique kind of experience.

You don't even have to optimize it if it's not a problem ;)
scrapbook.akkartik.name/post/16834044
Hey! I can't post links?!
#10
13 Nov 07 · 01:15 PM
Comment by user Karthik
Kartik - no, you can't post links for now :(

For some odd reason we (Varun & me) were getting these 403 errors whenever the POSTDATA passed from this form contained URLs. So Varun wrote a script to alert people not to use URLs here.

Need to figure out a workaround though... But at the moment, I think it's simpler not to have URLs at all :) That way, it's simpler to just run the text through some PHP functions (htmlentities, nl2br, etc.) before putting it into the database.

Got reminded of this comic at xkcd: xkcd.com/327 :)

I would be able to use links here though, by editing the comment in the database directly ;)

Srikanth - Thanks man :)
#11
7 Nov 14 · 04:37 PM
Everything is very open with a clear explanation of the challenges. It was really informative. Your website is extremely helpful. Thanks for sharing!
#12
31 Jan 15 · 12:32 AM
Hmm it looks like your site ate my first comment (it was extremely long) so I guess I'll just sum it up
what I had written and say, I'm thoroughly enjoying your blog.
I as well am an aspiring blog writer but I'm still new to the whole
thing. Do you have any points for inexperienced blog writers?

I'd genuinely appreciate it.
#13
1 Feb 15 · 11:20 AM
Hi there! Do you know if they make any plugins to assist
with Search Engine Optimization? I'm trying to get my blog to rank for some
targeted keywords but I'm not seeing very good success. If you know of any please share.
Thanks!
#14
4 Feb 15 · 06:21 PM
Attractive portion of content. I simply stumbled
upon your weblog and in accession capital to claim that I
get in fact loved account your weblog posts. Anyway I'll be subscribing to your augment and even I fulfillment you access persistently rapidly.
#15
6 Feb 15 · 09:54 AM
Hi there, after reading this awesome paragraph i am as well glad to share my knowledge here with
colleagues.
#16
8 Feb 15 · 09:52 PM
Wonderful article! We are linking to this particularly
great post on our site. Keep up the great writing.
#17
9 Feb 15 · 07:50 AM
Someone essentially lend a hand to make seriously posts
I'd state. This is the very first time I frequented your web
page and up to now? I amazed with the analysis you made to create this actual post extraordinary.
Excellent task!
#18
15 Feb 15 · 07:43 AM
The abuser could have frequent runny noses or nosebleeds.
#19
15 Feb 15 · 08:30 PM
Hello! This is kind of off topic but I need some help from an established blog. Is it very difficult to set up your own blog? I'm not very technical but I can figure things out pretty quick. I'm thinking about making my own but I'm not sure where to start. Do you have any ideas or suggestions?
Thank you
The Author
Karthik

Karthik Abhiram

27-year old Taurean (birthday 15-May-82), Assistant Manager - HR at Tata Consultancy Services Ltd in Hyderabad, India.  Previously, did Post Graduate Diploma in Management from T A Pai Management Institute (2003-05) and before that, Computer Science Engineering from Sree Nidhi Institute of Science and Technology (1999-2003).

Email: karthik82 -AT- gmail -DOT- com
Disclaimer: The views expressed on this site are purely my own.

Warning: This site occasionally contains profanity.

Creative Commons License