Problems sorting a CSV file with sort in Linux [CLOSED]

0 votes

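I am trying to sort a CSV file with sort, using the command below. The problem is that some of the fields contain commas inside double quotes, and sort treats those commas as field separators too.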


sort -t, -k1,1 -k3,3 -k2,2 SomeFile.csv > OutputFile.csv

A line could look something like this:
This is the first field,"This is, well, the second field",The third field could look like this

That line has three fields:
1: This is the first field
2: "This is, well, the second field"
3: The third field could look like this

But sort considers it to have five fields:
1: This is the first field
2: "This is
3: well
4: the second field
5: The third field could look like this
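
The difference between the two counts can be reproduced with a short Python sketch. It is only an illustration: it assumes Python 3 and its standard csv module, and the variable names are made up for this example. Splitting on every comma is effectively what sort -t, does, while a CSV parser keeps quoted commas inside the field.

```python
import csv
import io

line = 'This is the first field,"This is, well, the second field",The third field could look like this'

# Naive split on every comma, which is effectively how sort -t, sees the line.
naive_fields = line.split(",")

# Quote-aware parse: commas inside double quotes stay within the field.
parsed_fields = next(csv.reader(io.StringIO(line)))

print(len(naive_fields))   # 5
print(len(parsed_fields))  # 3
```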

How would you solve this?

posted Aug 8, 2013 by Jagan Mishra

1 Answer

+1 vote
Best answer

Try csvtool. You could use it to replace the comma field separators with TABs, pipe the result through sort with TAB as the field delimiter, and then pipe it through csvtool again to convert the field separators back to commas. This assumes no field contains a TAB itself. E.g.:

csvtool -t ',' -u TAB cat input.csv | sort -t $'\t' -k 2,2 | csvtool -t TAB -u ',' cat - > output.csv
answer Aug 8, 2013 by Naveena Garg
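
If csvtool is not available, the same idea (parse with a quote-aware reader, sort, write back out) can be sketched with Python's standard csv module. This is a minimal sketch under some assumptions: the function name sort_csv_text and the sample data are hypothetical, every row is assumed to have at least three fields, and rows are compared by plain Python string order rather than by the locale rules that sort uses. The default key order (first, third, second column) mirrors the -k1,1 -k3,3 -k2,2 from the question.

```python
import csv
import io

def sort_csv_text(text, key_columns=(0, 2, 1)):
    """Sort CSV rows by the given zero-based columns, honouring quoted fields."""
    rows = list(csv.reader(io.StringIO(text)))
    # Build a tuple key so ties on the first column fall through to the next one.
    rows.sort(key=lambda row: tuple(row[i] for i in key_columns))
    out = io.StringIO()
    # The writer re-quotes any field that contains a comma.
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()

# Hypothetical sample data with commas inside quoted fields.
sample = (
    'b,"x, y",2\n'
    'a,"z, w",9\n'
    'a,"q, r",1\n'
)
print(sort_csv_text(sample), end="")
```

For a real file, the text could be read from SomeFile.csv and the result written to OutputFile.csv; opening both with newline="" is what the csv module documentation recommends.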
