Olly Legg

Progress Bars for Ruby's CSV

23rd Oct 2012

I've worked on a fair number of projects where I've had to import data from a CSV file. With large datasets, especially when inserting into a database with standard ActiveRecord, this can take a while. If you're not running the import often it's usually not worth optimising, but it does get annoying not knowing how long it's going to take when you do.

With this small module you can easily add a progress bar to the built-in CSV library. It uses the progress_bar gem to do all the hard work (not to be confused with the other Ruby/ProgressBar library).

require 'csv'
require 'progress_bar'

class CSV
  module ProgressBar
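    # Builds the bar, using the underlying IO's size in bytes as the maximum.
    def progress_bar
      ::ProgressBar.new(@io.size, :bar, :percentage, :elapsed, :eta)
    end

    def each
      # Without a block, return an enumerator instead of failing on yield.
      return to_enum(__method__) unless block_given?

      progress_bar = self.progress_bar

      super do |row|
        yield row
        # Track the byte position in the IO rather than the row count,
        # then redraw the bar without advancing it any further.
        progress_bar.count = pos
        progress_bar.increment!(0)
      end
    end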
  end

  class WithProgressBar < CSV
    include ProgressBar
  end

  def self.with_progress_bar
    WithProgressBar
  end
end
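
The idea the module relies on, comparing the IO's current position with its total size, can be sketched without the gem. This is only an illustration: each_row_with_progress is a hypothetical helper, not part of the module or of CSV, and it reports a fraction instead of drawing a bar. Depending on the version of the csv library, the parser may read ahead in chunks, so the position can run ahead of the row being yielded and the fraction is best treated as approximate.

```ruby
require 'csv'
require 'stringio'

# Hypothetical helper (not part of the module above): yields each row along
# with the fraction of the input consumed so far, between 0.0 and 1.0.
def each_row_with_progress(io)
  total = io.size
  CSV.new(io).each do |row|
    # io.pos is a byte offset, so it is compared against the byte size.
    fraction = total.zero? ? 1.0 : io.pos.to_f / total
    yield row, fraction
  end
end

io = StringIO.new("name,qty\nwidget,2\ngadget,5\n")
rows = []
progress = []
each_row_with_progress(io) do |row, fraction|
  rows << row
  progress << fraction
end
```

Because the measure is bytes rather than rows, there's no need to count the lines in the file before starting.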

It can be used in a few different ways. The least intrusive option, if you're integrating it into an existing code base, is to extend a CSV instance with the CSV::ProgressBar module.

data = File.read('data.csv')
csv = CSV.new(data)
csv.extend(CSV::ProgressBar)
csv.each do |row|
  # expensive operation
end

You can also use the subclass, CSV::WithProgressBar, which includes the ProgressBar module for you. This syntax doesn't feel as idiomatic, but it does make it possible to use CSV's class-level convenience methods, such as foreach.

CSV::WithProgressBar.foreach('data.csv') do |row|
  # expensive operation
end
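
The subclass approach works because CSV's class-level methods build their instance from whatever class they were called on, so an overridden each gets picked up. Here's a minimal sketch of that behaviour using only the standard library. CountingCSV and its rows_seen counter are hypothetical names for illustration; it counts rows where CSV::WithProgressBar would draw a bar.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical subclass for illustration only: it overrides #each the same
# way the ProgressBar module does, but just counts the rows it passes on.
class CountingCSV < CSV
  class << self
    attr_accessor :rows_seen
  end
  self.rows_seen = 0

  def each
    super do |row|
      self.class.rows_seen += 1
      yield row
    end
  end
end

# Write a small throwaway file to read back.
file = Tempfile.new(['data', '.csv'])
file.write("a,b\n1,2\n3,4\n")
file.close

rows = []
# foreach is inherited from CSV, but it instantiates CountingCSV here,
# so the overridden #each is the one that runs.
CountingCSV.foreach(file.path) { |row| rows << row }
file.unlink
```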

I also added the CSV.with_progress_bar class method. It's just a little bit of syntactic sugar, but I find it reads nicer than using the subclass.

CSV.with_progress_bar.foreach('data.csv') do |row|
  # expensive operation
end

I've added the code to a Gist. Use it, fork it, modify it.