i've noticed..

2010, Aug 12

Multithreading in Ruby Like a Champ

Multithreading is a great tool when you know how to use it. For a recent project, I had to download hundreds of web pages. To speed things up, I ran each page download in its own thread. The only problem was that opening hundreds of connections at the same time caused server errors, and I would get blank pages back. My solution, which you’ll find below, was to allow only 10 threads to run at once. This worked great, so I’m sharing it with you.

Here ya go

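# Contains download_link(link, save_dir, save_name) to download and
# save webpages locally (irrelevant).
# Note: on Ruby 1.9.2 or later the current directory is not on the load
# path, so use require_relative 'download_page' there instead.

require 'download_page'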

# keep the threads in here
threads = []

(1967..2010).each do |year|

  # excluded irrelevant variable definitions:
  # link,save_dir,save_name ...

  # Only open 10 threads at a time. Thread.list also counts the main
  # thread, so at most 9 downloads are in flight at once.
  if Thread.list.count % 10 != 0
    download_thread = Thread.new do
      download_link(link, save_dir, save_name)
    end
    threads << download_thread
  else
    # Wait for open threads to finish executing
    # before starting a new one
    threads.each do |thread|
      thread.join
    end

    # Start downloading again
    download_thread = Thread.new do
      download_link(link, save_dir, save_name)
    end
    threads << download_thread
  end
end

# Wait for threads to finish executing before exiting the program
threads.each do |thread|
  thread.join
end
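
If you want to try the "only so many threads at once" idea without my download code, here is a minimal sketch you can run on its own. It is a variation on the loop above, not the exact same logic: it splits the years into batches with each_slice and joins each batch before starting the next. Everything named here (max_threads, fake_download, the counters) is made up for the demo, and fake_download just sleeps instead of fetching a page.

```ruby
# Hypothetical demo: batch the work so no more than max_threads run at once.
max_threads = 10

lock      = Mutex.new
running   = 0   # fake downloads in flight right now
peak      = 0   # highest value `running` ever reached
completed = []  # years whose fake download finished

# Stand-in for download_link (assumption: the real one does network I/O).
fake_download = lambda do |year|
  lock.synchronize do
    running += 1
    peak = running if running > peak
  end
  sleep 0.01 # pretend to wait on the network
  lock.synchronize do
    running -= 1
    completed << year
  end
end

# Start one thread per year in the batch, then wait for the whole batch
# to finish before moving on to the next one.
(1967..2010).each_slice(max_threads) do |batch|
  batch.map { |year| Thread.new { fake_download.call(year) } }.each(&:join)
end

puts "finished #{completed.size} downloads, peak concurrency #{peak}"
```

The trade-off with batching is that one slow page holds up its whole batch. It is simple, though, and it keeps the server from being hit with hundreds of connections at once.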