Fix: Allow \r in unquoted fields when row separator doesn't contain \r #346
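The behaviour named in the title can be probed with a short script. This is only a sketch: `parse_or_error` is a hypothetical helper written for illustration and is not part of the csv gem. The result for the unquoted case depends on whether the installed csv version includes this change, so the script reports either outcome rather than assuming one.

```ruby
require "csv"

# Hypothetical helper for illustration only (not part of the csv gem):
# returns the parsed rows, or the CSV::MalformedCSVError if parsing fails.
def parse_or_error(data, **options)
  CSV.parse(data, **options)
rescue CSV::MalformedCSVError => error
  error
end

# Unquoted "\r" with "\n" as the row separator. Versions of csv that include
# this change are expected to return the row; other versions raise
# CSV::MalformedCSVError, which the helper returns instead.
unquoted = parse_or_error("field1,field\rwith\rcr,field3\n", row_sep: "\n")

# Quoted "\r" with the same row separator. Quoted fields may contain "\r",
# so this parses independently of the change.
quoted = parse_or_error("field1,\"field\rwith\rcr\",field3\n", row_sep: "\n")

puts unquoted.inspect
puts quoted.inspect
```

Quoting the field stays the portable way to carry a bare `\r` when the csv version in use is not known.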
@@ -138,28 +138,71 @@ def test_non_regex_edge_cases
end
end

def test_malformed_csv_cr_first_line
error = assert_raise(CSV::MalformedCSVError) do
CSV.parse_line("1,2\r,3", row_sep: "\n")
def test_unquoted_cr_with_lf_row_separator
data = "field1,field\rwith\rcr,field3\nrow2,data,here\n"
expected = [
["field1", "field\rwith\rcr", "field3"],
["row2", "data", "here"]
]
assert_equal(expected, CSV.parse(data, row_sep: "\n"))
end

def test_unquoted_cr_with_custom_row_separator
data = "field1,field\rwith\rcr,field3|row2,data,here|"
expected = [
["field1", "field\rwith\rcr", "field3"],
["row2", "data", "here"]
]
assert_equal(expected, CSV.parse(data, row_sep: "|"))
end

def test_unquoted_cr_with_crlf_row_separator
data = "field1\r,field2,field3\r\nrow2,data,here\r\n"
assert_raise(CSV::MalformedCSVError) do
CSV.parse(data, row_sep: "\r\n")
end
assert_equal("Unquoted fields do not allow new line <\"\\r\"> in line 1.",
error.message)
end

def test_malformed_csv_cr_middle_line
csv = <<-CSV
line,1,abc
line,2,"def\nghi"
def test_unquoted_cr_rejected_when_included_in_row_separator
Review comment: We can use the same naming rule for the parse error case.
Suggested change
|
||||||||
data = "field1,field\r2,field3\r\nrow2,data,here\r\n"
assert_raise(CSV::MalformedCSVError) do
Review comment: Could you also check the error message like the other tests do?
Suggested change
|
||||||||
CSV.parse(data, row_sep: "\r\n")
end
end
Review comment: Are these tests of the same concept?

line,4,some\rjunk
line,5,jkl
CSV
def test_liberal_parsing_with_unquoted_cr_and_custom_row_separator
data = "field1,field\rwith\rcr,field3|row2,data,here|"
expected = [
["field1", "field\rwith\rcr", "field3"],
["row2", "data", "here"]
]
assert_equal(expected, CSV.parse(data, row_sep: "|", liberal_parsing: true))
end
Review comment: Can we move this to

error = assert_raise(CSV::MalformedCSVError) do
CSV.parse(csv)
end
assert_equal("Unquoted fields do not allow new line <\"\\r\"> in line 4.",
error.message)
def test_quoted_cr_with_custom_row_separator
data = "field1,\"field\rwith\rcr\",field3|row2,data,here|"
expected = [
["field1", "field\rwith\rcr", "field3"],
["row2", "data", "here"]
]
assert_equal(expected, CSV.parse(data, row_sep: "|"))
end

def test_unquoted_cr_in_middle_line
csv = "line,1,abc\nline,2,\"def\nghi\"\nline,4,some\rjunk\nline,5,jkl\n"
result = CSV.parse(csv)
expected = [
["line", "1", "abc"],
["line", "2", "def\nghi"],
["line", "4", "some\rjunk"],
["line", "5", "jkl"]
]
assert_equal(expected, result)
end

def test_empty_rows_with_cr
result = CSV.parse("\n" + "\r")
assert_equal([[], ["\r"]], result)
end

def test_malformed_csv_unclosed_quote
Review comment: It seems that we can remove this comment. I feel that it's useful for the commit message (the PR description in this repository) because it describes why we made this change, but it may not be useful for readers of the new code. (Nobody will try using "\r\n" here.)