Sunday, November 9, 2008

I built a blog aggregator - waywework.it

I've been spending some time recently putting together a blog aggregator site for some of the folks I work with. It's now up and running at http://waywework.it. I hope this will be an interesting public place for our community to share its writing and, as one of my colleagues said, "this keeps my Google Reader much neater".

Today I'd like to talk about the code running this site, which is posted and available on GitHub at http://github.com/alexrothenberg/waywework.

I started out thinking I would use an existing aggregator site and just apply my own skin. But when I did a quick search on GitHub I found that most of the hard work already existed in atom and rss gems and plugins, and I wanted to take advantage of the just-released Rails 2.2, so I decided to build my own. This turned out to be not too much work. Here is how I put it together.

First I created my project with some scaffolding for feeds, each of which would have_many posts:

class Feed < ActiveRecord::Base
  has_many :posts, :dependent => :delete_all
end

class Post < ActiveRecord::Base
  belongs_to :feed
end


I soon found the atom gem and the RSS parser built into Ruby. Using them was a piece of cake: all I had to do was create a method in my Feed model to call each one.

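class Feed < ActiveRecord::Base
  # Parse the XML as Atom and save each entry as a post.
  # Returns true if any entries were found.
  def get_posts_from_atom(atom_xml)
    feed = Atom::Feed.new(atom_xml)
    feed.entries.each do |entry|
      # The 'alternate' link points at the post's own page
      link = entry.links.detect { |l| l.rel == 'alternate' }
      create_post(:contents => entry.content.value, :url => link.href, :title => entry.title,
                  :published => entry.published.to_s(:db), :updated => entry.updated.to_s(:db))
    end
    !feed.entries.blank?
  end

  # Parse the XML as RSS (without validation) and save each item as a post.
  # Returns true if any items were found.
  def get_posts_from_rss(rss_xml)
    rss = RSS::Parser.parse(rss_xml, false)
    rss.items.each do |entry|
      # RSS items carry a single date, so it is used for both published and updated
      create_post(:contents => entry.description, :url => entry.link, :title => entry.title,
                  :published => entry.date.to_formatted_s(:db), :updated => entry.date.to_formatted_s(:db))
    end
    !rss.items.blank?
  end
end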
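To try the RSS half outside of Rails, here is a minimal standalone sketch. It assumes the `rss` library is available (on recent Ruby versions it ships as a bundled gem rather than part of the core standard library). The sample feed, the `SAMPLE_RSS` constant and the `posts_from_rss` helper are made up for illustration and are not part of the waywework code; the helper returns plain hashes instead of saving records.

```ruby
require 'rss'

# A made-up RSS 2.0 document, for illustration only.
SAMPLE_RSS = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <rss version="2.0">
    <channel>
      <title>Example Blog</title>
      <link>http://example.com/</link>
      <description>A made-up feed</description>
      <item>
        <title>First post</title>
        <link>http://example.com/first</link>
        <description>Hello world</description>
        <pubDate>Sun, 09 Nov 2008 12:00:00 GMT</pubDate>
      </item>
    </channel>
  </rss>
XML

# Hypothetical helper: turn RSS XML into an array of post hashes.
# The second argument to RSS::Parser.parse turns validation off,
# as in the Feed model above.
def posts_from_rss(rss_xml)
  rss = RSS::Parser.parse(rss_xml, false)
  return [] if rss.nil?

  rss.items.map do |item|
    {
      :title     => item.title,
      :url       => item.link,
      :contents  => item.description,
      :published => item.date # parsed from pubDate
    }
  end
end

posts_from_rss(SAMPLE_RSS).each do |post|
  puts "#{post[:title]} (#{post[:url]})"
end
```

Returning hashes keeps the parsing step separate from the database, which makes it easy to poke at a feed from irb before wiring it into a model.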


Of course I had to create the glue to tie it all together: a rake task to be called on a schedule

namespace :feeds do
  desc "Load the feeds"
  task :populate => :environment do
    feeds = Feed.all
    feeds.each do |feed|
      feed.get_latest
    end
  end
end


and the logic to load the feed, parse it and update the posts.

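require 'open-uri' # adds the read method to URI objects

class Feed < ActiveRecord::Base
  # Fetch the feed, try to parse it as Atom, and fall back to RSS
  def get_latest
    puts "getting feed for #{name}"
    xml = get_feed
    got_atom_posts = get_posts_from_atom xml
    get_posts_from_rss xml unless got_atom_posts
  end

  # Download the raw feed XML
  def get_feed
    uri = URI.parse(feed_url)
    uri.read
  end

  # Update the post with this URL if it exists, otherwise create it
  def create_post(params)
    params.merge!(:feed_id => id)
    existing_post = Post.find_by_url(params[:url])
    if existing_post
      existing_post.update_attributes(params)
    else
      Post.create(params)
    end
  end
end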
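The create_post method above is a find-or-update keyed on the post URL, which is what lets the rake task run repeatedly without duplicating posts. The same idea can be sketched without a database. The `PostStore` class below is a hypothetical in-memory stand-in for the posts table, not something from the waywework code.

```ruby
# Hypothetical in-memory stand-in for the posts table, keyed by URL.
class PostStore
  def initialize
    @posts = {}
  end

  # Update the post with this URL if one exists, otherwise create it.
  # Returns :updated or :created so callers can see which path ran.
  def upsert(params)
    url = params.fetch(:url)
    if @posts.key?(url)
      @posts[url].merge!(params)
      :updated
    else
      @posts[url] = params.dup
      :created
    end
  end

  def find_by_url(url)
    @posts[url]
  end

  def count
    @posts.size
  end
end

store = PostStore.new
store.upsert(:url => 'http://example.com/a', :title => 'Draft title')
store.upsert(:url => 'http://example.com/a', :title => 'Final title')
puts store.count                                   # one post, not two
puts store.find_by_url('http://example.com/a')[:title]
```

Loading the same feed twice leaves one post per URL, with the most recent title and contents winning.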


The next step was to publish an atom feed of my site. Again there was a plugin, atom_feed_helper, waiting to help me. I installed the plugin and created a view builder

atom_feed(:url => atom_feed_url) do |feed|
  feed.title("WayWeWork")
  feed.updated(@posts.first.published)

  for post in @posts
    feed.entry(post, :url => post.url, :published => post.published, :updated => post.updated) do |entry|
      entry.title("#{post.feed.author}: #{post.title}")
      entry.content(post.contents, :type => 'html')
    end
  end
end

This was all so easy that I had hardly done anything other than glue these plugins together. Then I finished up with a few bells and whistles.

I added "who's talking" and "archive by date" sections to my homepage, which I populate from my controller like this

class PostsController < ApplicationController
  # The enclosing action was left out of the original snippet; index is assumed here
  def index
    @active_feeds = Feed.by_author
    @activity_by_date = Post.activity_by_date
  end
end
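The post does not show how Feed.by_author or Post.activity_by_date are implemented. As a rough illustration of what an "archive by date" summary could compute, here is a plain Ruby sketch that counts posts per month. The `SketchPost` struct and the `activity_by_month` helper are hypothetical names, and the real methods presumably query the database instead.

```ruby
# Hypothetical stand-in for a Post record.
SketchPost = Struct.new(:title, :published)

# Count posts per calendar month, newest month first.
# Returns an array of ["YYYY-MM", count] pairs.
def activity_by_month(posts)
  counts = Hash.new(0)
  posts.each { |post| counts[post.published.strftime('%Y-%m')] += 1 }
  counts.sort.reverse
end

posts = [
  SketchPost.new('Rails 2.2 notes', Time.utc(2008, 11, 9)),
  SketchPost.new('Aggregator idea', Time.utc(2008, 11, 2)),
  SketchPost.new('Older post',      Time.utc(2008, 10, 15))
]

activity_by_month(posts).each { |month, count| puts "#{month}: #{count}" }
```

For the three sample posts this prints two for 2008-11 and one for 2008-10, which is the kind of list a sidebar archive needs.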


I added security to restrict who can administer feeds

class FeedsController < ApplicationController
  before_filter :authenticate

  protected

  def authenticate
    authenticate_or_request_with_http_basic do |user_name, password|
      # Read the credentials file once instead of once per field
      config = YAML.load_file(File.join(RAILS_ROOT, 'config', 'password.yml'))
      user_name == config['username'] && password == config['password']
    end
  end
end
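Rails hides the details of HTTP basic authentication behind authenticate_or_request_with_http_basic. To make the check concrete, here is a minimal sketch of the same comparison in plain Ruby: decode the Authorization header and compare it with credentials read from YAML. The sample YAML contents and the helper names (`load_credentials`, `decode_basic_auth`, `authorized?`) are assumptions for illustration, not the Rails implementation or the site's real configuration.

```ruby
require 'yaml'

# Hypothetical contents of config/password.yml, for illustration only.
SAMPLE_PASSWORD_YML = <<~YML
  username: admin
  password: s3cret
YML

# Read the expected username and password from YAML text.
def load_credentials(yaml_text)
  config = YAML.load(yaml_text)
  [config['username'], config['password']]
end

# Decode an "Authorization: Basic <base64>" header value into [user, password].
# Returns nil if the header is missing or uses another scheme.
def decode_basic_auth(header)
  scheme, encoded = header.to_s.split(' ', 2)
  return nil unless scheme && scheme.casecmp('basic').zero? && encoded

  encoded.unpack1('m').split(':', 2)
end

# True only when the supplied credentials match the configured ones.
def authorized?(header, yaml_text)
  supplied = decode_basic_auth(header)
  !supplied.nil? && supplied == load_credentials(yaml_text)
end

good_header = 'Basic ' + ['admin:s3cret'].pack('m0')
bad_header  = 'Basic ' + ['admin:wrong'].pack('m0')

puts authorized?(good_header, SAMPLE_PASSWORD_YML)
puts authorized?(bad_header, SAMPLE_PASSWORD_YML)
```

Keeping the credentials in a YAML file outside the code, as the controller above does, means the password never has to be checked into the public repository.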


For the UI, I am somewhat graphically challenged so I got some help. GitHub was very cool for this, as I could add lessallan as a collaborator and he could check in his changes so they just appeared!

Finally, a little work with Capistrano (mostly just creating a Capfile) and I could deploy!

Overall I spent a few days and now have a site that does exactly what I want. Most of the code I wrote is specific to my site, and the general-purpose plumbing was downloaded. I'm very pleased with the availability of plugins and gems and how easy it was to collaborate using GitHub!

Now I just hope others find the site interesting to use!

1 comment:

prasoon said...

Thanks Alex for building a simple and powerful app. It's very useful and goes a long way in aggregating our bloggers.