Plumbing Project: Yahoo! Pipes for Aggregating Comments across a Syndicated Course Blog

So last week I mentioned that after figuring out how to display comment counts on syndicated posts, the next hurdle was to figure out how to display the most recent comments across a whole collection of syndicated blogs/posts.

The dilemma lies in the fact that while there is an RSS feed for each post’s comments, that feed needs to be essentially “exploded” and then all of the exploded feeds need to be aggregated together in order to get a view of comments across all of the syndicated blogs. I wasn’t sure how we were going to do that within WordPress/FeedWordPress. As you can imagine, as the content for ds106 grows, there are more and more posts being aggregated. EACH of those posts has an associated comment feed, and each of those feeds could have any number of comments. Potentially, we could be talking about thousands of comments, in the end. That’s a big feed to both construct and parse.

On Tuesday it occurred to me that maybe I could tackle this in Yahoo! Pipes, so I took a stab at it and I’ve been sort of successful.

Here’s the pipe’s process:

Step One: Fetch the feed out of the ds106 site. We’re already off to a tough start. Right now Jim is syndicating the last 600 posts on the blog. That’s so that as students joined the class this spring and subscribed to the feed, they could get a backlog of everything that had come before. Keep in mind that there are almost 900 syndicated posts on the site, and, um, we’re in the third week of class. So, I’m already not starting with everything on my plate. Theoretically, if you go leave a comment on the very first post that was syndicated into the site, I can’t capture any information about it.

But, we’ll forge ahead.

Step Two: I’m going to pull out one element of the feed: the wfw:commentRss element, which I’ve already determined is pretty universal (esp. on WordPress blogs) and contains the URL of the post’s comment feed.
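Outside of Pipes, the same extraction might look something like the Python below. This is only a sketch: the function name comment_feed_urls and the sample feed are made up for illustration, and it assumes a plain RSS 2.0 feed that declares the Well-Formed Web Comment API namespace the way WordPress feeds do.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Well-Formed Web Comment API extension.
WFW_NS = "http://wellformedweb.org/CommentAPI/"


def comment_feed_urls(site_feed_xml):
    """Return the comment-feed URL of every item that advertises one."""
    root = ET.fromstring(site_feed_xml)
    urls = []
    for item in root.iter("item"):
        # XML names are case-sensitive, so try both spellings that
        # show up in the wild.
        for tag in ("commentRss", "commentRSS"):
            node = item.find(f"{{{WFW_NS}}}{tag}")
            if node is not None and node.text:
                urls.append(node.text.strip())
                break
    return urls


# Hypothetical two-item feed, just to exercise the function.
SAMPLE_FEED = """<rss version="2.0" xmlns:wfw="http://wellformedweb.org/CommentAPI/">
  <channel>
    <title>Example course site</title>
    <item>
      <title>First post</title>
      <wfw:commentRss>http://example.com/first-post/feed/</wfw:commentRss>
    </item>
    <item>
      <title>Post without a comment feed</title>
    </item>
  </channel>
</rss>"""

print(comment_feed_urls(SAMPLE_FEED))
```

Items that don't carry the element are simply skipped, so a feed that mixes WordPress blogs with other platforms shouldn't break the list.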

Step Three: Now I’m going to run through a loop that basically fetches the data contained in those comment feeds that I just pulled out of the site feed. For some reason, I had to use this function on Yahoo! Pipes as opposed to the Fetch Feed function again. I don’t know why. I’m not that smart. Whatever it finds in those comment feeds is then emitted into a new feed.
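For comparison, here is roughly what that loop does, written as a Python sketch. It is not how Pipes works internally. The names fetch_xml and collect_comments are invented for this example, and it assumes each comment feed is ordinary RSS with title, link, and pubDate on every item.

```python
import urllib.request
import xml.etree.ElementTree as ET


def fetch_xml(url):
    """Download a feed and return its raw bytes (needs network access)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def collect_comments(feed_urls, fetch=fetch_xml):
    """Fetch every comment feed and flatten all their items into one list."""
    comments = []
    for url in feed_urls:
        try:
            root = ET.fromstring(fetch(url))
        except (OSError, ET.ParseError):
            # One dead or malformed feed should not sink the whole run.
            continue
        for item in root.iter("item"):
            comments.append({
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                "pubDate": item.findtext("pubDate", default=""),
            })
    return comments
```

The fetch argument is there so the loop can be tried without touching the network: pass in any function that takes a URL and returns feed XML, and the rest behaves the same.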

Steps Four & Five: Now I have to do some kind of klugey stuff. I know I need to filter out items that are empty — I have empty items because not every comment feed has contents (because no one has commented on those posts yet). But for some reason, the Filter function in Yahoo! Pipes doesn’t work when I put in a regular expression to try and match, say, the item title to a null value. However, I can use the Regex function to replace a null item title with a string — in this case “XXX.” And then, I can filter out all of the titles that are equal to “XXX.” It’s stupid, but it works.
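The two-step workaround is easy to mirror in Python, which also shows what a direct filter would look like if the tool allowed it. This is a sketch only: the function names and sample items are made up, and it assumes each item is a dict with an optional "title" key.

```python
SENTINEL = "XXX"  # the same placeholder string the pipe uses


def mark_empty_titles(items):
    """Step four: swap a missing or blank title for the sentinel."""
    return [
        {**item, "title": (item.get("title") or "").strip() or SENTINEL}
        for item in items
    ]


def drop_marked(items):
    """Step five: keep only items whose title is not the sentinel."""
    return [item for item in items if item["title"] != SENTINEL]


def drop_empty_titles(items):
    """The same filtering in one pass, with no sentinel."""
    return [item for item in items if (item.get("title") or "").strip()]


# Hypothetical items: one real comment and three kinds of "empty".
items = [
    {"title": "Comment on Post A by Sam"},
    {"title": ""},
    {"title": None},
    {},
]

print(drop_marked(mark_empty_titles(items)))
print(drop_empty_titles(items))
```

One thing to watch with the sentinel approach: a real comment whose title happens to be exactly "XXX" would get thrown out too. That's unlikely here, but it's the price of the trick.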

Step Six: Now I truncate the feed to the last 10 items, because that’s all I want to display on the course site in the sidebar (and I’m trying to limit the amount of parsing WordPress needs to do).
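Truncating only gives the most recent comments if the items are already in date order. The sketch below sorts first and then cuts to 10. The explicit sort is an assumption added for the example, not something the pipe above necessarily does, and the function names are made up. It assumes each item carries an RSS-style pubDate string.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Items with a missing or unreadable date are treated as very old.
FALLBACK = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_pub_date(value):
    """Parse an RSS pubDate string into a timezone-aware datetime."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return FALLBACK
    if parsed.tzinfo is None:
        # A "-0000" offset parses as naive; treat it as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def latest_comments(items, limit=10):
    """Sort newest first, then keep only the first `limit` items."""
    ordered = sorted(
        items,
        key=lambda item: parse_pub_date(item.get("pubDate")),
        reverse=True,
    )
    return ordered[:limit]
```

Keeping the limit as a parameter means the same function could feed a sidebar widget with 10 items or a full page with more.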

Step Seven: I output the pipe.

If I’m lucky, all the pieces fit together and I get a widget on the ds106 sidebar of the last 10 or so comments.

But sometimes, I just get a feed error. Probably because the whole thing is pretty intensive. So, now I need to figure out if there is a better way. I’m thinking I need some way to cache this data as it’s compiled. I mean, it’s not going to be changing a lot and there’s no reason to repeat the parsing of every feed every time. But, I admit, I’m not sure what steps to try next.
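One possible shape for that caching idea, if the aggregation were done in a script rather than in Pipes, is a small time-based cache wrapped around whatever function fetches a feed. Everything here is hypothetical: the CachedFetcher class, the 15-minute default, and the in-memory dictionary are assumptions for the sketch, not a feature of Pipes or FeedWordPress.

```python
import time


class CachedFetcher:
    """Wrap a fetch function so each URL is re-fetched at most once per TTL."""

    def __init__(self, fetch, ttl_seconds=900, clock=time.monotonic):
        self._fetch = fetch        # any callable: url -> feed body
        self._ttl = ttl_seconds    # how long a stored copy stays fresh
        self._clock = clock        # injectable so it can be tested
        self._store = {}           # url -> (time fetched, body)

    def __call__(self, url):
        now = self._clock()
        hit = self._store.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]          # still fresh, skip the network
        body = self._fetch(url)
        self._store[url] = (now, body)
        return body
```

With something like this, a comment feed that was fetched a minute ago is served from memory, and only feeds older than the time limit get requested again. The cache lives in memory, so it disappears when the script stops. Keeping it between runs would need a file or database, which this sketch leaves out.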

Ultimately, I’d like to get this to a Yahoo! Pipe that we could clone anytime we’re running a course blog so that we can include recent comments all the time.

As always, ideas, thoughts, and reactions are welcome!!

20 thoughts on “Plumbing Project: Yahoo! Pipes for Aggregating Comments across a Syndicated Course Blog”

  1. I was hoping you’d write this post – fantastic work. I was afraid your aggregated comment feed might mean adding each RSS comment feed individually, but that says more about my own pathetic concepts than anything.

    I can’t think of higher praise than to say that the hit off of your post is reminiscent of Tony Hirst. All hail Martha Burtis!

  2. Brian–Yeah, I thought this was going to involve significant manual adding of feeds to some tool when I first started. It really seemed insurmountable. It’s pretty amazing, though, the stuff that is hidden in RSS feeds that we don’t realize is there. I actually discovered while I was working on this that a lot of feeds include a property for the number of comments on a post. It’s another RSS extension that originated with Slashdot but was picked up by a lot of other systems.

  5. Hey Martha, great blog post, thanks for sharing! I am really interested in looking at better syndication for distributed courses. For my PhD, I am thinking of trying to build (either from scratch, or duct-taping together various existing tools, which would probably be preferable), some “souped up RSS reader”, which can import learners’ feeds (including comments!) from blogs, forums, wikis etc, and know which elements were authored by which person, and then display the conversation in the course back to the learners in various ways (including perhaps exporting to various visual tools)… I’ve been looking for neat ways to do things like grabbing all the comments, and your post was really useful to me! I’ll let you know if I ever manage to build something functional.

