Wednesday, September 21, 2011

 

Image sizing and scraping with JQuery and Rails


I know, I know, long time no type. Not my fault, I swear! First I went travelling, then I had a buncha work to do, then I went to San Francisco TechCrunch Disrupt, all of which occupied my time. Well, I suppose, on reflection, all those things were my fault, but, um ... look, I'm back now, OK?

Back with a li'l discussion of image sizing and scraping with Rails and JQuery, for a project I can't really talk about yet. Suffice to say that sometimes I want to calculate the size of various remote images - quite a lot of them, in fact, making performance important - and sometimes I want to scrape all the images from a set of web pages.

I thought the first task was going to be tricky. And maybe it is. But fortunately, someone has solved it for me, via the totally awesome FastImage Ruby gem, for which praise should be heaped upon one sdsykes. It works exactly like advertised, which is to say, like this:


  def self.picture_info_for(url)
    return nil if url.blank?
    begin
      # [width, height], or nil if FastImage can't work out the size
      FastImage.size(url)
    rescue StandardError => e
      logger.info "Error getting info for picture at #{url}: #{e.message}"
      [0, 0]  # this makes sense for my app, but maybe not yours
    end
  end
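Since I'm sizing quite a lot of images, the lookups are an obvious candidate for running in parallel. Here's a rough sketch of how that might look with plain Ruby threads. To be clear, `sizes_for` and `max_threads` are names I've made up for illustration, and the block you pass in is what does the actual lookup, so nothing here depends on FastImage itself:

```ruby
require 'thread'

# Sketch only: size many images at once with a small pool of threads.
# The caller's block does the real lookup and should return [width, height].
def sizes_for(urls, max_threads = 8, &sizer)
  queue = Queue.new
  urls.each { |url| queue << url }
  results = {}
  lock = Mutex.new

  workers = Array.new([max_threads, urls.size].min) do
    Thread.new do
      loop do
        url = begin
          queue.pop(true)  # non-blocking pop; raises ThreadError once the queue is empty
        rescue ThreadError
          break
        end
        size = begin
          sizer.call(url) || [0, 0]  # treat "couldn't tell" as 0x0, as above
        rescue StandardError
          [0, 0]
        end
        lock.synchronize { results[url] = size }
      end
    end
  end

  workers.each(&:join)
  results  # hash of url => [width, height]
end
```

In my case the call would be something like `sizes_for(urls) { |url| FastImage.size(url) }`. Threads suit this job because nearly all the time is spent waiting on the network, not crunching numbers.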


So, hurrah! This meant the scraping bit was actually trickier. Sure, I could have done it all in Rails, but it's user-facing, and I didn't want the user to have to wait for a bunch of potentially sequential HTTP requests to complete without seeing any results. I could have done it all on the client side, but parsing HTML with JavaScript, even with JQuery, sounded painful and fraught with difficulties, compared to using the dead-easy Hpricot gem. So I came up with a compromise I quite like:

1. On the client side: (written using HAML, which I mostly adore)
- @image_urls.each_with_index do |url,idx|
  = link_to url, url
  %div{:id => 'page_'+idx.to_s}

%script
  $(function() {
  - @image_urls.each_with_index do |url,idx|
    $.ajax({
    url: '/stories/scrape_images?url=#{CGI::escape(url)}',
    success: function(msg){ $('#page_#{idx}').html(msg); },
    error: function(xhr, status){ $('#page_#{idx}').html('Could not load images: ' + status); }
    });
  });

2. On the server, to first create and then respond to that client page:
  def popup_scraped_images
    @image_urls = []
    @seed = Seed.find(params[:seed_id])
    @seed.active_signals.each do |signal|
      next if signal.main_url.blank?
      @image_urls << signal.main_url
    end
  end

  def scrape_images
    url = params[:url]
    # host = scheme plus domain, i.e. everything before the first single slash
    slash = url =~ /[^\/]\/[^\/]/
    host = slash.nil? ? "" : url[0,slash+1]
    html = ''

    page = HTTParty.get(url, :timeout => 5)
    Hpricot(page.body).search("//img").each do |element|
      img_src = element.attributes["src"]
      next if img_src.blank?  # some img tags have no src at all
      img_src = host+img_src if img_src.match(/^\//)
      html += '<img src="'+img_src+'" />' if img_src.match(/^http/)
    end
    render :text => html  # send the img tags back to the AJAX success callback
  end
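One thing the host-prefixing above doesn't cover is image paths that are relative to the page (like `pic.jpg`) or protocol-relative (like `//cdn.example.com/x.png`). A possible alternative, sketched here with Ruby's standard `URI` library, is to let `URI.join` do the resolving. The helper name `absolute_image_url` is my own invention for this example, not something from the app:

```ruby
require 'uri'

# Hypothetical helper: turn an <img> src into an absolute http(s) URL,
# or return nil if that isn't possible.
def absolute_image_url(page_url, img_src)
  return nil if img_src.nil? || img_src.strip.empty?
  absolute = URI.join(page_url, img_src.strip)
  # Skip data:, javascript: and other non-web schemes
  absolute.scheme =~ /\Ahttps?\z/ ? absolute.to_s : nil
rescue URI::Error
  nil  # malformed page URL or src
end

absolute_image_url("http://example.com/a/b.html", "/img/x.png")
# => "http://example.com/img/x.png"
absolute_image_url("http://example.com/a/b.html", "pic.jpg")
# => "http://example.com/a/pic.jpg"
```

Inside the loop in `scrape_images`, that would replace the two `match` checks with a single call and a `next` when it returns nil.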


Hopefully how they all interact is self-explanatory. Et voila - semi-asynchronous Rails/JQuery image scraping, handled on the server side for easy caching if need be later on.
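As for the caching I might want later on, the idea would be to remember the scraped HTML per page URL for a while, so repeat requests skip the fetch-and-parse step. Here's a minimal sketch in plain Ruby. `ScrapeCache` and its ten-minute default are made up for illustration, and in a real Rails app I'd more likely reach for `Rails.cache.fetch` with `:expires_in`:

```ruby
# Sketch only: a tiny in-memory cache with expiry.
class ScrapeCache
  def initialize(ttl_seconds = 600)
    @ttl = ttl_seconds
    @store = {}
    @lock = Mutex.new
  end

  # Returns the cached value for key if it is still fresh;
  # otherwise runs the block, remembers its result, and returns it.
  # `now` is a parameter purely so expiry is easy to try out.
  def fetch(key, now = Time.now)
    @lock.synchronize do
      entry = @store[key]
      return entry[:value] if entry && now - entry[:stored_at] < @ttl
    end
    value = yield
    @lock.synchronize { @store[key] = { :value => value, :stored_at => now } }
    value
  end
end
```

The body of `scrape_images` would then be wrapped in something like `cache.fetch(url) { ... }`, with the block returning the `html` string. Note this lives in one process's memory, so it's no use across multiple app servers.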



