Overview
Python support for linear TSV files
Free software: MIT license
What is Linear TSV
In contrast to Excel’s TSV dialect, linear TSV is line-based.
“But hey,” I hear you say, “isn’t TSV always line-based?” Well, the issue arises when a cell contains a tab or newline character. In Excel’s TSV format, that cell is surrounded by quotes and the entry continues on the next line. Now you have:
entries spanning several lines
quotes that need to be ignored (")
quotes that are escaped by doubling them ("")
Since entries can span several lines, many naïve file manipulations aren’t possible:
Taking the first 50 entries of a dataset: head -n 50 customers.tsv
Filtering entries: grep "Zürich" customers.tsv
Sorting the entries alphabetically: sort customers.tsv
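The multi-line problem described above can be reproduced with Python's standard `csv` module. This is a minimal sketch that uses the built-in `excel-tab` dialect as a stand-in for Excel's TSV behavior; the sample rows are made up for illustration.

```python
import csv
import io

# Write three records with the Excel-style TSV dialect.
buffer = io.StringIO()
writer = csv.writer(buffer, dialect="excel-tab")
writer.writerow(["name", "note"])
writer.writerow(["Anna", "first line\nsecond line"])  # cell contains a newline
writer.writerow(["Ben", 'says "hi"'])                 # cell contains quotes

text = buffer.getvalue()

# Line-based tools such as head, grep and sort see physical lines ...
physical_lines = text.splitlines()

# ... while a dialect-aware parser sees logical records.
rows = list(csv.reader(io.StringIO(text, newline=""), dialect="excel-tab"))

print(len(physical_lines), "physical lines for", len(rows), "records")
```

The record for Anna is split across two physical lines, so a tool that works line by line would treat it as two separate entries.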
All of this can be prevented if you simply:
escape tabs: \t
escape newlines: \n
escape carriage returns: \r
escape backslashes: \\
Lastly, linear TSV can also encode None as \N.
That’s [linear TSV](http://dataprotocols.org/linear-tsv/) in a nutshell.
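The escaping rules listed above can be sketched in a few lines of Python. The helper names below (`escape_cell`, `unescape_cell`, `encode_row`, `decode_row`) are hypothetical and only illustrate the format; they are not the API of tsv2dict.

```python
from typing import List, Optional

# Escape table for the four special characters named above.
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}
# Reverse mapping, keyed by the character that follows the backslash.
_UNESCAPES = {"\\": "\\", "t": "\t", "n": "\n", "r": "\r"}


def escape_cell(value: Optional[str]) -> str:
    """Encode one cell; None becomes the two characters backslash + N."""
    if value is None:
        return "\\N"
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def unescape_cell(cell: str) -> Optional[str]:
    """Decode one cell; unknown escape sequences are kept unchanged."""
    if cell == "\\N":
        return None
    out = []
    chars = iter(cell)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            out.append(_UNESCAPES.get(nxt, "\\" + nxt))
        else:
            out.append(ch)
    return "".join(out)


def encode_row(values: List[Optional[str]]) -> str:
    """Join escaped cells with real tabs; the result is one physical line."""
    return "\t".join(escape_cell(v) for v in values)


def decode_row(line: str) -> List[Optional[str]]:
    """Split one physical line on real tabs and unescape each cell."""
    return [unescape_cell(c) for c in line.rstrip("\n").split("\t")]


row = ["Zürich", "a\tb\nc", None, "back\\slash"]
line = encode_row(row)
print(line)
print(decode_row(line) == row)
```

Because every tab and newline inside a cell is escaped, the only real tabs left in a line are cell separators and the only real newline is the record terminator, which is what keeps `head`, `grep` and `sort` safe to use.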
Installation
pip install tsv2dict
You can also install the in-development version with:
pip install https://github.com/nkurmann/tsv2dict/archive/master.zip
Documentation
Development
To run all the tests, run:
tox
To combine the coverage data from all the tox environments, run:
| Platform | Command |
|---|---|
| Windows | `set PYTEST_ADDOPTS=--cov-append`, then `tox` |
| Other | `PYTEST_ADDOPTS=--cov-append tox` |