RCR 338: Replace csv.rb in 1.9+ with FasterCSV code
Submitted by JEG2 (Fri Jun 02 13:38:41 UTC 2006)
Abstract
FasterCSV code is over eight times faster than CSV at parsing some real-world CSV data, and Ruby Talk has seen multiple complaints about CSV's speed. On top of that, it has many features that make working with CSV files much easier, like header support. It's time to trade up.
Problem
FasterCSV was designed and tested in the course of a Ruby Core thread complaining about the speed of the standard CSV library. Similar threads have appeared before on multiple mailing lists. The process resulted in significantly faster code that has become quite popular.
In addition to raw speed, FasterCSV has many oft requested features, especially header parsing. This makes working with CSV data as easy as it should be and allows code to be future-proofed against things like shifting column order.
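As a minimal sketch of what header parsing buys, the example below reads two hypothetical exports of the same data whose columns appear in a different order. It is written against the `CSV` constant as it would read after the proposed rename; with the gem, the constant is `FasterCSV`. The sample data and the `emails` helper are made up for illustration.

```ruby
require "csv"  # assumption: a csv.rb that exposes the FasterCSV-style interface

# Two hypothetical exports of the same data, with the columns swapped.
old_export = "name,email\nJames,james@example.com\n"
new_export = "email,name\njames@example.com,James\n"

# With :headers => true each row is addressed by column name, not by position.
def emails(data)
  CSV.parse(data, :headers => true).map { |row| row["email"] }
end

p emails(old_export)  # => ["james@example.com"]
p emails(new_export)  # => ["james@example.com"]
```

Because the lookup goes through the header name, the calling code does not change when the column order does.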
Even with the added speed and many new features, FasterCSV is similar in size to the current CSV code base. That means it is significantly easier to maintain.
The FasterCSV code is fully documented and comes with thousands of unit tests.
Proposal
I propose to replace the CSV code base with the FasterCSV code base.
This will break most code using the old CSV library, because of differences in how the two libraries treat parameters:
# CSV argument style
CSV.open("my_file.csv", "w", "\t", "\r\n") do |csv|
  # ...
end

# FasterCSV argument style
FasterCSV.open( "my_file.csv", "w", :col_sep => "\t",
                                    :row_sep => "\r\n" ) do |csv|
  # ...
end
The difference is needed since FasterCSV supports many more options.
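To illustrate the named-option style in a form that runs without touching the file system, here is a small sketch that writes and re-reads tab-separated data in memory. It again assumes the FasterCSV-style interface under the proposed `CSV` name, and the rows are hypothetical sample data.

```ruby
require "csv"  # assumption: a csv.rb that exposes the FasterCSV-style interface

# Only the settings that differ from the defaults are named.
tsv = CSV.generate(:col_sep => "\t", :row_sep => "\r\n") do |csv|
  csv << ["name", "language"]
  csv << ["James", "Ruby"]
end

# Reading back with the same options recovers the original rows.
rows = CSV.parse(tsv, :col_sep => "\t", :row_sep => "\r\n")

p tsv   # => "name\tlanguage\r\nJames\tRuby\r\n"
p rows  # => [["name", "language"], ["James", "Ruby"]]
```

Named options keep each call readable as more settings are added, which positional arguments cannot do without callers having to remember their order.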
Analysis
This should significantly reduce the speed complaints, if it doesn't end them altogether. As an added bonus, the library gains many new features. We as developers get an easier-to-maintain code base with excellent test coverage. Good changes all around.
Implementation
I propose we replace the csv.rb code with the code in fastercsv.rb. We can then rename all occurrences of FasterCSV to CSV, completely replacing the old library.
I have been using FasterCSV in Ruport, as well as for handling some massive data (10+ MB CSV files) for projects at work.
This library works great and has nice additional features. I fully support it.
Strongly opposed | 0
Opposed | 0
Neutral | 0
In favor | 14
Strongly advocate | 4
RCRchive copyright © David Alan Black, 2003-2005.