There have been times when I’ve had to extract a particular column from a
tab-separated or comma-separated file. A simple way to do this, as long as no
field contains the delimiter itself, is the shell command cut. Let’s say I have
a file named input.txt that looks like this:
{% include_code cut/input.txt lang:apacheconf %}
If I want to extract just the User Id column, I could type in the following:
cut -d ',' -f 3 input.txt
Here the -d option specifies the delimiter and the -f option specifies the
field(s) to be extracted.
The command above would generate the following output:
User Id
aijaz
js
guptas
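The same approach works for tab-separated files, and because tab is the default delimiter for cut, the -d option can be left out. Here is a minimal sketch; the sample data is made up for illustration and is not taken from input.txt:

```shell
# Made-up tab-separated sample; the names and values are placeholders.
tsv_result=$(printf 'name\tuid\tquota\nalice\t501\t2200\n' | cut -f 2)

# Prints the second tab-separated field of each line.
echo "$tsv_result"
```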
If I want to include line numbers, I can use the nl shell filter; its -ba
option numbers every line, including blank ones:
$ cut -d ',' -f 3 input.txt | nl -ba
1 User Id
2 aijaz
3 js
4 guptas
$
If I want the User Id and Used columns, I could do:
$ cut -d ',' -f 3,5 input.txt
User Id,Used
aijaz,2200
js,3300
guptas,1500
$
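Note that cut always writes the selected fields in the order they appear in the
input, so -f 3,5,1 produces the same output as -f 1,3,5. To change the order in
which fields appear, use a tool such as awk instead.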
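To see how the order of the field list is handled, here is a small sketch comparing cut with awk on the same line. The sample line is made up for illustration (it is not from input.txt), and awk is just one possible tool for reordering:

```shell
# Made-up sample line; the field values are placeholders.
line='one,two,three,four,five'

# cut ignores the order of the -f list and emits fields in input order.
cut_result=$(printf '%s\n' "$line" | cut -d ',' -f 3,5,1)
echo "$cut_result"

# awk prints fields in whatever order they are listed.
awk_result=$(printf '%s\n' "$line" | awk -F ',' -v OFS=',' '{ print $3, $5, $1 }')
echo "$awk_result"
```

The first command prints fields 1, 3, and 5 in that order; the second prints fields 3, 5, and 1.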
Look at the man page for cut for more options, including how to extract
specific bytes from each line.
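As a rough sketch of two of those options, the -b option selects bytes by position and the -f option accepts ranges as well as single fields. The sample line below is made up for illustration:

```shell
# Made-up sample line; not taken from input.txt.
sample='2024-01-15,login,ok'

# -b selects bytes by position; 1-4 is the first four bytes of each line.
bytes_result=$(printf '%s\n' "$sample" | cut -b 1-4)
echo "$bytes_result"

# -f accepts ranges; 2- means field 2 through the last field.
range_result=$(printf '%s\n' "$sample" | cut -d ',' -f 2-)
echo "$range_result"
```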