XSV Cookbook
XSV is an extremely fast CLI utility for processing CSV/TSV/PSV files. It is extremely helpful for automating data extraction, format conversions, filtering, etc.
It comes with an extensive man page, and the website also has great documentation. Still, I am noting down some of the common ways I use it below.
First, download it from the tool’s site, or run brew install xsv
Inspect file
#field type, statistics
#count records
#flatten file and show record by record
#nicely readable flat output for statistics
#or show as nicely aligned table
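The inspection steps above could look roughly like this. This is a minimal sketch: employees.csv and its columns are hypothetical sample data, not a file from the original post.

```shell
# Hypothetical sample data for illustration.
cat > employees.csv <<'EOF'
name,office,department
Alice,Pune,Sales
Bob,Mumbai,Engineering
Carol,Pune,Sales
EOF

xsv stats employees.csv                 # field types and summary statistics
xsv count employees.csv                 # number of records, header excluded
xsv flatten employees.csv               # one field per line, record by record
xsv stats employees.csv | xsv flatten   # statistics in a readable flat form
xsv stats employees.csv | xsv table     # statistics as an aligned table
```

Piping into xsv table or xsv flatten works for the output of any xsv command, not just stats.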
# create an index file for fast processing. Very useful on large files, not needed for smaller ones.
# When you update the file, you need to recreate the index file too.
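A sketch of the indexing workflow, again assuming a hypothetical employees.csv. The index is written next to the data file with an .idx suffix.

```shell
# Hypothetical sample data for illustration.
cat > employees.csv <<'EOF'
name,office,department
Alice,Pune,Sales
Bob,Mumbai,Engineering
EOF

xsv index employees.csv    # writes employees.csv.idx next to the data file
xsv count employees.csv    # count can now read the answer from the index

# After changing the data file, rebuild the index so it is not stale.
echo 'Carol,Pune,Sales' >> employees.csv
xsv index employees.csv
```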
File format conversion
The best part is that you don’t have to worry about header rows, escaping characters, etc.
#csv to psv (pipe separated)
#tab separated to csv
#tsv to psv
Note that "^I" indicates a TAB character. You can type it in a terminal by pressing CTRL-V followed by CTRL-I.
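The three conversions can be sketched with xsv fmt, where -d sets the input delimiter and -t the output delimiter. The file names are hypothetical, and instead of typing a literal TAB this sketch stores one in a shell variable, which is an assumption about convenience rather than the original method.

```shell
# Hypothetical sample data: one CSV file and one TSV file.
printf 'name,city\nAlice,Pune\n' > people.csv
printf 'name\tcity\nBob\tMumbai\n' > people.tsv

TAB="$(printf '\t')"   # a real TAB character, same as typing CTRL-V CTRL-I

xsv fmt -t '|' people.csv > people.psv                 # csv to psv
xsv fmt -d "$TAB" people.tsv > from_tsv.csv            # tsv to csv
xsv fmt -d "$TAB" -t '|' people.tsv > from_tsv.psv     # tsv to psv
```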
Split & Merge files
# split file into multiple files by a column - aka partition
# creates one file per value of the office field in directory t/
#split the file every 1000 records into directory t/
#merge rows from multiple files without duplicating header record
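A sketch of splitting and merging, assuming a hypothetical employees.csv with an office column. The chunks/ directory is used here only to keep the two kinds of output apart; the notes above write both into t/.

```shell
# Hypothetical sample data for illustration.
cat > employees.csv <<'EOF'
name,office,department
Alice,Pune,Sales
Bob,Mumbai,Engineering
Carol,Pune,Sales
EOF

mkdir -p t chunks

# Partition: one file per distinct office value, e.g. t/Pune.csv and t/Mumbai.csv.
xsv partition office t/ employees.csv

# Split: a new file every 1000 records, named after the starting record index.
xsv split -s 1000 chunks/ employees.csv

# Merge rows from several files; the header is written only once.
xsv cat rows t/Mumbai.csv t/Pune.csv > merged.csv
```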
Select Data
# randomly select 4 records
# select two columns by name
# select first and third column
# select first to second column
# select first and third column in another order
# and sort output in reverse order of department
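The sampling and column selection steps might look like this. The file and its column layout (name, office, department) are hypothetical, chosen so that the third column is department.

```shell
# Hypothetical sample data for illustration.
cat > employees.csv <<'EOF'
name,office,department
Alice,Pune,Sales
Bob,Mumbai,Engineering
Carol,Pune,Sales
Dev,Delhi,Toys Galore
Eve,Mumbai,Finance
EOF

xsv sample 4 employees.csv                 # 4 randomly chosen records
xsv select name,department employees.csv   # two columns by name
xsv select 1,3 employees.csv               # first and third column
xsv select 1-2 employees.csv               # first to second column
xsv select 3,1 employees.csv               # third column first, then the first

# Same selection, sorted in reverse order of department.
xsv select 3,1 employees.csv | xsv sort -R -s department
```

Column indexes in select are 1-based, and ranges such as 1-2 include both ends.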
# extract 2nd row to 4th row
# extract 3 rows starting from second row
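Row extraction can be sketched with xsv slice. Note that slice counts records from 0 and treats the end index as exclusive, so "2nd to 4th row" becomes start 1, end 4. The sample file is hypothetical.

```shell
# Hypothetical sample data for illustration.
cat > employees.csv <<'EOF'
name,office,department
Alice,Pune,Sales
Bob,Mumbai,Engineering
Carol,Pune,Sales
Dev,Delhi,Toys Galore
Eve,Mumbai,Finance
EOF

xsv slice -s 1 -e 4 employees.csv   # 2nd to 4th record: Bob, Carol, Dev
xsv slice -s 1 -l 3 employees.csv   # 3 records starting from the 2nd: same rows
```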
# filter by regex "galore" ignoring case
# filter by regex NOT "galore" ignoring case
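The two regex filters map onto xsv search, with -i for case-insensitive matching and -v to invert the match. The sample data is hypothetical and includes one department containing "Galore".

```shell
# Hypothetical sample data for illustration.
cat > employees.csv <<'EOF'
name,office,department
Alice,Pune,Sales
Bob,Mumbai,Engineering
Dev,Delhi,Toys Galore
EOF

xsv search -i galore employees.csv      # records matching "galore", any case
xsv search -i -v galore employees.csv   # records NOT matching "galore"
```

By default the regex is tested against every field; adding -s with a column name restricts the match to that column.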
Joins
# inner join on cities.city = employees.office
# left and right joins
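A sketch of the joins, using the cities and employees names from the comments above. The file contents are hypothetical. xsv join takes the join column and file for the first input, then the same pair for the second.

```shell
# Hypothetical sample data for illustration.
cat > cities.csv <<'EOF'
city,state
Pune,Maharashtra
Delhi,Delhi
EOF
cat > employees.csv <<'EOF'
name,office
Alice,Pune
Bob,Mumbai
Carol,Pune
EOF

# Inner join on cities.city = employees.office: only Pune matches.
xsv join city cities.csv office employees.csv

# Left join keeps every city; right join keeps every employee.
xsv join --left city cities.csv office employees.csv
xsv join --right city cities.csv office employees.csv
```

Piping any of these into xsv table makes the joined output easier to read.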