RCR 338: Replace csv.rb in 1.9+ with FasterCSV code

Submitted by JEG2 (Fri Jun 02 13:38:41 UTC 2006)

Abstract

FasterCSV is over eight times faster than CSV at parsing some real-world CSV data, and Ruby Talk has seen multiple complaints about CSV's speed. On top of that, it has many features that make working with CSV files much easier, like header support. It's time to trade up.

Problem

FasterCSV was designed and tested in a thread on Ruby Core complaining about the speed of the standard CSV library. Similar threads have appeared before on multiple mailing lists. The process resulted in significantly faster code that has become quite popular.

In addition to raw speed, FasterCSV has many oft-requested features, especially header parsing. This makes working with CSV data as easy as it should be and allows code to be future-proofed against things like shifting column order.
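
As a rough sketch of what header parsing buys us, the example below reads two versions of the same data with the columns in a different order. It assumes the proposal has been adopted, so that require "csv" loads the FasterCSV code base; with the standalone gem the constant would be FasterCSV instead. The sample data and the names_from helper are hypothetical and exist only for illustration.

```ruby
require "csv"

# Hypothetical sample data: same records, columns in a different order.
ORIGINAL_ORDER = "name,age\nAlice,30\nBob,25\n"
SHIFTED_ORDER  = "age,name\n30,Alice\n25,Bob\n"

# Hypothetical helper: looks fields up by header name rather than by
# position, so a change in column order does not affect the result.
def names_from(data)
  CSV.parse(data, :headers => true).map { |row| row["name"] }
end

p names_from(ORIGINAL_ORDER)
p names_from(SHIFTED_ORDER)
```

Both calls return the same list of names, because each row is addressed by its header rather than by a numeric index.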

Even with the added speed and many new features, FasterCSV is similar in size to the current CSV code base. That means it is significantly easier to maintain.

The FasterCSV code is fully documented and comes with thousands of unit tests.

Proposal

I propose to replace the CSV code base with the FasterCSV code base.

This will break most code using the old CSV library, because of differences in how the two libraries treat parameters:

  # CSV argument style
  CSV.open("my_file.csv", "w", "\t", "\r\n") do |csv|
    # ...
  end

  # FasterCSV argument style
  FasterCSV.open( "my_file.csv", "w", :col_sep => "\t",
                                      :row_sep => "\r\n" ) do |csv|
    # ...
  end

The difference is needed since FasterCSV supports many more options.
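
To illustrate why named options scale better than positional arguments, here is a small sketch that writes tab-separated output to a string instead of a file. It assumes the FasterCSV code base is available as CSV (that is, after this proposal is adopted); the tab_separated helper and the sample rows are hypothetical.

```ruby
require "csv"

# Hypothetical helper: renders rows as tab-separated text with CRLF line
# endings. Each setting is named, so adding or omitting one does not shift
# the meaning of the others, as it would with positional arguments.
def tab_separated(rows)
  CSV.generate(:col_sep => "\t", :row_sep => "\r\n") do |csv|
    rows.each { |row| csv << row }
  end
end

p tab_separated([["name", "age"], ["Alice", 30]])
```

Further settings, such as :headers, would go into the same hash without changing the call's shape.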

Analysis

This should significantly reduce the speed complaints, if it doesn't end them altogether. As an added bonus, the library gains many new features. We as developers get an easier-to-maintain code base with excellent test coverage. Good changes all around.

Implementation

I propose we replace the csv.rb code with the code in fastercsv.rb. We can then rename all occurrences of FasterCSV to CSV, completely replacing the old library.

Comments

I have been using FasterCSV in Ruport, as well as for handling some massive data (10+ MB CSV files) for projects at work.

This library works great and has nice additional features. I fully support it.

Current voting

Strongly opposed 0
Opposed 0
Neutral 0
In favor 14
Strongly advocate 4