Cache Yahoo! Web Service Calls using Ruby
One way of dramatically speeding up applications that are built using the Yahoo! Web Services APIs is to make heavy use of caching. With caching, calls that have been previously made in a specified time frame can be answered using cached data rather than making an API call over the network.
Caching Intro
The level of caching you use should take into account the kind of data you are retrieving. You might be building an application that redisplays your Flickr photo sets on your own site. These sets change rarely, and you don't mind if there is a delay of up to 12 hours before a new set appears on your site. Contrast this with redisplaying your most recent links from del.icio.us, where you might want them to show up on your own site straight away or within 5 or 10 minutes.
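As a rough illustration of the two cases above, the freshness windows can be written down as a number of seconds per data source. The constant name, the hash keys, and the fresh? helper are hypothetical and are not part of any Yahoo! API; this is only a sketch of the idea.

```ruby
# Hypothetical freshness windows, in seconds, taken from the examples above.
MAX_AGE = {
  :flickr_photosets => 12 * 60 * 60, # a delay of up to 12 hours is acceptable
  :delicious_links  => 5 * 60        # links should show up within about 5 minutes
}

# Returns true when data fetched at fetched_at is younger than max_age seconds.
def fresh?(fetched_at, max_age, now = Time.now)
  now - fetched_at < max_age
end
```

The same comparison of elapsed time against a maximum age is what both fetchers in this HOWTO perform before deciding whether to go back to the network.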
This HOWTO describes two methods of caching at the HTTP retrieval layer.
Caching in memory
Caching in memory is memoization with the addition of checking for freshness.
require 'net/http'
require 'uri'

class MemFetcher
  def initialize
    # the cache maps each URL to a [fetch time, response body] pair
    @cache = {}
  end

  def fetch(url, max_age=0)
    # if the API URL exists as a key in the cache and the entry is
    # younger than max_age seconds, return the cached body
    if @cache.has_key? url
      return @cache[url][1] if Time.now-@cache[url][0]<max_age
    end
    # if the URL is not in the cache or the data is not fresh, fetch
    # again, store the result, and return the body (not the pair)
    body = Net::HTTP.get_response(URI.parse(url)).body
    @cache[url] = [Time.now, body]
    body
  end
end
Create an instance of the MemFetcher class:
irb(main):001:0> require 'cache'
=> true
irb(main):002:0> fetcher = MemFetcher.new
=> #<MemFetcher:0x4d6ec8 @cache={}>
Now retrieve a URL, specifying that it should not be retrieved again if it has been cached in the last 60 seconds:
irb(main):003:0> fetcher.fetch('http://search.yahooapis.com/WebSearchService/V1/webSearch?appid=YahooDemo&query=madonna&results=10', 60)
If you try this in an interactive prompt there should be a short delay before the data is returned. Run the command again and the data will be returned instantly; it is already in the cache.
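To watch the hit and miss behaviour without making a network call, the same freshness logic can be sketched with the HTTP request replaced by a block. BlockFetcher is a hypothetical class written for this illustration; it is not part of the HOWTO's cache file.

```ruby
# Hypothetical variant of the in-memory cache: the retrieval step is a
# block, so cache hits and misses can be counted without using the network.
class BlockFetcher
  def initialize(&retriever)
    @cache = {}
    @retriever = retriever
  end

  def fetch(key, max_age = 0)
    entry = @cache[key]
    # serve from the cache only when the entry is younger than max_age seconds
    return entry[1] if entry && Time.now - entry[0] < max_age
    value = @retriever.call(key)
    @cache[key] = [Time.now, value]
    value
  end
end

calls = 0
fetcher = BlockFetcher.new { |key| calls += 1; "response for #{key}" }
fetcher.fetch('example', 60)  # miss: the block runs
fetcher.fetch('example', 60)  # hit: served from the cache
fetcher.fetch('example', 0)   # a max_age of 0 always refetches
puts calls                    # prints 2
```

Note that the default max_age of 0 means nothing is ever served from the cache, so callers need to pass an explicit age to benefit from caching.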
Caching to disk
MemFetcher is only useful for long-running Ruby programs, as the cache itself is stored in memory. Here is an alternative implementation that saves cached data to disk; this can be used by multiple Ruby processes.
require 'net/http'
require 'uri'
require 'digest/md5'

class DiskFetcher
  def initialize(cache_dir='/tmp')
    # this is the dir where we store our cache
    @cache_dir = cache_dir
  end

  def fetch(url, max_age=0)
    # the cache file is named after the MD5 hexdigest of the URL
    file = Digest::MD5.hexdigest(url)
    file_path = File.join(@cache_dir, file)
    # if the file exists in the dir and the data is fresh, we just
    # read the data from the file and return it
    if File.exist? file_path
      return File.read(file_path) if Time.now-File.mtime(file_path)<max_age
    end
    # if the file does not exist (or if the data is not fresh), we
    # make an HTTP request, save the body to the file, and return it
    body = Net::HTTP.get_response(URI.parse(url)).body
    File.open(file_path, "w") do |data|
      data << body
    end
    body
  end
end
Usage is similar to the in-memory cache:
irb(main):001:0> require 'cache'
=> true
irb(main):002:0> fetcher = DiskFetcher.new
=> #<DiskFetcher:0x4d0424>
irb(main):003:0> fetcher.fetch('http://search.yahooapis.com/WebSearchService/V1/webSearch?appid=YahooDemo&query=madonna&results=10', 60)
If no argument is provided to the DiskFetcher constructor, cache files are stored in /tmp, the usual temp directory on Unix-based systems.
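The default of /tmp is fixed in the constructor, so it only suits Unix-like systems. One possible alternative, sketched here as an assumption rather than as part of the original class, is to ask Ruby's standard tmpdir library for the platform's temp directory. The cache_path helper is hypothetical.

```ruby
require 'tmpdir'
require 'digest/md5'

# Hypothetical helper: builds the cache file path for a URL, defaulting to
# the platform's temp directory instead of a hardcoded /tmp.
def cache_path(url, cache_dir = Dir.tmpdir)
  File.join(cache_dir, Digest::MD5.hexdigest(url))
end
```

A call such as DiskFetcher.new(Dir.tmpdir) achieves the same effect with the class as written, since the constructor accepts any directory.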
Either class can now be used in place of direct calls to Net::HTTP methods. This provides a simple mechanism for caching API calls, speeding up your application and reducing the overall number of calls you have to make.

