Module csv | Tarantool

Module csv

The csv module handles records formatted according to Comma-Separated-Values (CSV) rules.

The default formatting rules are:

  • Lua escape sequences such as \n or \10 are legal within strings but not within files,
  • Commas designate end-of-field,
  • Line feeds, or line feeds plus carriage returns, designate end-of-record,
  • Leading or trailing spaces are ignored,
  • Quote marks may enclose fields or parts of fields,
  • When enclosed by quote marks, commas and line feeds and spaces are treated as ordinary characters, and a pair of quote marks "" is treated as a single quote mark.
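
The quoting rules above can be sketched with a short script. This is only an illustration, meant to be run inside Tarantool; the input string is made up for the example.

```lua
-- A sketch to run inside Tarantool, where the csv module is available.
local csv = require('csv')

-- Two records separated by a line feed. The second field of the first
-- record is enclosed by quote marks, so its comma is data rather than a
-- delimiter, and each doubled quote mark stands for one quote mark.
local records = csv.load('a,"b,""c""",d\ne,f,g')

for i, record in ipairs(records) do
    -- Print the record number, the field count, and the fields.
    print(i, #record, table.concat(record, ' | '))
end
```

Each element of the returned table is one record, and each record is itself a table of field strings.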

The possible options which can be passed to csv functions are:

  • delimiter = string (default: comma) – single-byte character to designate end-of-field
  • quote_char = string (default: quote mark) – single-byte character to designate encloser of string
  • chunk_size = number (default: 4096) – number of characters to read at once (usually for file-IO efficiency)
  • skip_head_lines = number (default: 0) – number of lines to skip at the start (usually for a header)
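
As a rough sketch of how these options combine, the following script parses semicolon-delimited input with single-quote enclosers and one header line. The input data and the variable names are assumptions made for the example.

```lua
local csv = require('csv')

-- Hypothetical input: one header line, then semicolon-delimited records
-- whose fields may be enclosed by single quotes.
local input = "name;note\nalice;'likes; semicolons'\nbob;plain"

local records = csv.load(input, {
    delimiter = ';',      -- single-byte end-of-field character
    quote_char = "'",     -- single-byte encloser
    skip_head_lines = 1,  -- do not return the header line
})

for i, record in ipairs(records) do
    print(i, table.concat(record, ' | '))
end
```

Only the options that differ from the defaults need to be listed in the table.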

Below is a list of all csv functions.

Name            Use
csv.load()      Load a CSV file
csv.dump()      Transform input into a CSV-formatted string
csv.iterate()   Iterate over CSV records

csv.load(readable[, {options}])

Get CSV-formatted input from readable and return a table as output. Usually readable is either a string or a file opened for reading. Usually options is not specified.

Parameters:
  • readable (object) – a string, or any object which has a read() method, formatted according to the CSV rules
  • options (table) – see above
Return: loaded_value
Rtype: table

Example:

The readable string has 3 fields. Field 2 contains a comma and a trailing space, so it is enclosed by quote marks:

tarantool> csv = require('csv')
---
...
tarantool> csv.load('a,"b,c ",d')
---
- - - a
    - 'b,c '
    - d
...

Readable string contains 2-byte character = Cyrillic Letter Palochka: (This displays a palochka if and only if character set = UTF-8.)

tarantool> csv.load('a\\211\\128b')
---
- - - a\211\128b
...

Semicolon instead of comma for the delimiter:

tarantool> csv.load('a,b;c,d', {delimiter = ';'})
---
- - - a,b
    - c,d
...

The readable file ./file.csv contains two CSV records. The fio module is explained in the fio section. The contents of the source CSV file, followed by the example:

tarantool> -- input in file.csv is:
tarantool> -- a,"b,c ",d
tarantool> -- a\\211\\128b
tarantool> fio = require('fio')
---
...
tarantool> f = fio.open('./file.csv', {'O_RDONLY'})
---
...
tarantool> csv.load(f, {chunk_size = 4096})
---
- - - a
    - 'b,c '
    - d
  - - a\\211\\128b
...
tarantool> f:close()
---
- true
...
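
Since readable may be any object with a read() method, a custom reader can stand in for a file. The following is a minimal sketch under that assumption; string_reader is a hypothetical helper, not part of the csv module, and it returns an empty string once its data is exhausted.

```lua
local csv = require('csv')

-- Hypothetical in-memory reader: hands out `data` in pieces of at most
-- `size` bytes and returns an empty string once everything is consumed.
local function string_reader(data)
    local pos = 1
    return {
        read = function(self, size)
            local chunk = data:sub(pos, pos + size - 1)
            pos = pos + #chunk
            return chunk
        end,
    }
end

-- A deliberately small chunk_size makes the parser call read() several times.
local records = csv.load(string_reader('a,b,c\nd,e,f'), {chunk_size = 4})

for i, record in ipairs(records) do
    print(i, table.concat(record, ' | '))
end
```

The same pattern could wrap other sources of data, provided that read() returns a string on every call.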
csv.dump(csv-table[, options, writable])

Get table input from csv-table and return a CSV-formatted string as output. If writable is specified, put the output in writable instead. Usually options is not specified. Usually writable, if specified, is a file opened for writing. csv.dump() is the reverse of csv.load().

Parameters:
  • csv-table (table) – a table which can be formatted according to the CSV rules
  • options (table) – optional; see above
  • writable (object) – any object which has a write() method
Return: dumped_value
Rtype: string, which is written to writable if specified

Example:

The CSV-table has 3 fields. Field 2 contains a comma, so the result has quote marks:

tarantool> csv = require('csv')
---
...
tarantool> csv.dump({'a','b,c ','d'})
---
- 'a,"b,c ",d

'
...

Round trip, from string to table and back to string:

tarantool> csv_table = csv.load('a,b,c')
---
...
tarantool> csv.dump(csv_table)
---
- 'a,b,c

'
...
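
The writable argument can be any object with a write() method. As a hedged sketch, the script below passes a small collecting object instead of a file; sink and its lines field are hypothetical names introduced only for this example.

```lua
local csv = require('csv')

-- Hypothetical writable: collects everything passed to write().
local sink = {lines = {}}
function sink:write(str)
    table.insert(self.lines, str)
end

-- Two records. The empty options table keeps the default delimiter and
-- quote character; non-string fields are written in their string form.
csv.dump({{'a', 'b,c ', 'd'}, {1, 2, 3}}, {}, sink)

-- Join whatever was written into a single string.
local output = table.concat(sink.lines)
print(output)
```

A file opened for writing with fio could be passed in the same position.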
csv.iterate(input[, {options}])

Form a Lua iterator function for going through CSV input one record at a time. Use of an iterator is strongly recommended if the amount of data is large (ten or more megabytes).

Parameters:
  • input (object) – a string, or any object which has a read() method, formatted according to the CSV rules
  • options (table) – see above
Return: Lua iterator function
Rtype: iterator function

Example:

csv.iterate() is the low-level function beneath csv.load() and csv.dump(). To illustrate that, here is a function which is the same as the csv.load() function, as seen in the Tarantool source code.

tarantool> load = function(readable, opts)
         >   opts = opts or {}
         >   local result = {}
         >   for i, tup in csv.iterate(readable, opts) do
         >     result[i] = tup
         >   end
         >   return result
         > end
---
...
tarantool> load('a,b,c')
---
- - - a
    - b
    - c
...
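
For large inputs, the iterator can process each record as it is read rather than building one big table. The following is a sketch only: it creates a small temporary file so that it is self-contained, and the file contents and variable names are assumptions.

```lua
local csv = require('csv')
local fio = require('fio')

-- Hypothetical sample file, created here only so the sketch is self-contained.
local path = os.tmpname()
local out = fio.open(path, {'O_WRONLY', 'O_CREAT', 'O_TRUNC'}, tonumber('644', 8))
out:write('id,name\n1,alice\n2,bob\n')
out:close()

-- Read the file back record by record, skipping the header line.
local f = fio.open(path, {'O_RDONLY'})
local count = 0
local names = {}
for _, record in csv.iterate(f, {skip_head_lines = 1}) do
    count = count + 1
    names[count] = record[2]  -- second field of the current record
end
f:close()
fio.unlink(path)

print(count, table.concat(names, ', '))
```

Only the current record and the current chunk need to be held at a time, which is the reason for preferring the iterator on large files.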