CSVY: YAML frontmatter for the CSV file format


This project is maintained by the csvy team.

Welcome to CSVY.

This page describes the specification of YAML frontmatter for the CSV file format. The main goals of the format are extreme simplicity and readability.

For human data curators, the technological gap between the stages of data publishing (no data, CSV, CSV plus metadata, semi-structured data) is too large. A simple file format for adding metadata to existing datasets is needed. JSON is very cryptic for humans, but YAML can do the job.

For backward compatibility, you can always add a data.yml metadata file alongside your data.csv. Once proper implementations exist, moving to a single data.csvy file will not be a problem.

Many initiatives plan to use JSON + CSV, but most are not meant to be published and read by humans.
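As a rough illustration of that migration step, the sketch below joins the contents of a sidecar metadata file and a CSV file into one CSVY document. The function name `merge_to_csvy` is hypothetical and not part of any CSVY library. It assumes the sidecar may or may not already carry the `---` / `...` delimiters.

```python
def merge_to_csvy(yaml_text: str, csv_text: str) -> str:
    """Return a CSVY document: YAML metadata between '---' lines, then the CSV.

    Hypothetical helper for illustration; not part of any CSVY library.
    """
    lines = yaml_text.strip("\n").split("\n")
    # Drop delimiters the sidecar may already contain so they are not doubled.
    if lines and lines[0].strip() == "---":
        lines = lines[1:]
    if lines and lines[-1].strip() in ("---", "..."):
        lines = lines[:-1]
    header = "\n".join(["---", *lines, "---"])
    return header + "\n" + csv_text
```

With the text of data.yml and data.csv read into strings, the return value can be written out as data.csvy.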

YAML Header

A YAML metadata block is a valid YAML object, delimited by a line of three hyphens --- at the top and a line of three hyphens --- or three dots ... at the bottom.
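The delimiter rule above can be sketched as a small splitter that separates the YAML header from the CSV body. This is a minimal sketch using only the standard library. The name `split_csvy` is an assumption for illustration, and the header is returned as raw text, not as a parsed object.

```python
def split_csvy(text: str):
    """Return (yaml_header, csv_body); the header is '' when there is no front matter.

    Hypothetical helper for illustration; the header is returned unparsed.
    """
    lines = text.splitlines(keepends=True)
    # The header must open with a line of three hyphens on the first line.
    if not lines or lines[0].rstrip() != "---":
        return "", text
    for i, line in enumerate(lines[1:], start=1):
        # The header closes with three hyphens or three dots.
        if line.rstrip() in ("---", "..."):
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    raise ValueError("YAML header is not closed by '---' or '...'")
```

The header text can then be handed to any YAML parser, for example `yaml.safe_load` from PyYAML if it is installed, and the body to any CSV reader.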

Defining table columns

Use the JSON Table Schema:

name: my-dataset
fields:
  - name: var1
    title: variable 1
    type: string
    description: explaining var1
    constraints:
      - required: true
  - name: var2
    title: variable 2
    type: integer
  - name: var3
    title: variable 3
    type: number
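To show how the column definitions above might be used, the sketch below converts CSV cells to Python values according to each column's declared type. Everything here is illustrative: `FIELDS` is a hand-written Python mirror of the definitions (in practice it would come from the parsed header), the `CASTS` mapping from type names to Python types is an assumption, and the sample rows are made up.

```python
import csv
import io

# Hand-written mirror of the column definitions; normally parsed from the YAML header.
FIELDS = [
    {"name": "var1", "type": "string"},
    {"name": "var2", "type": "integer"},
    {"name": "var3", "type": "number"},
]

# Assumed mapping from schema type names to Python types (illustrative only).
CASTS = {"string": str, "integer": int, "number": float}


def read_typed_rows(csv_text, fields):
    """Parse CSV text and cast each cell using the type declared for its column."""
    casts = {f["name"]: CASTS.get(f.get("type", "string"), str) for f in fields}
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Columns without a declared type are left as strings.
        rows.append({name: casts.get(name, str)(value) for name, value in row.items()})
    return rows


# Made-up sample data matching the three columns.
SAMPLE = "var1,var2,var3\nA,1,2.5\nB,3,4.3\n"
typed = read_typed_rows(SAMPLE, FIELDS)
```

Unknown type names fall back to `str` here; a stricter reader could raise an error instead.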

Libraries supporting CSVY

Backwards Compatibility

The table below summarizes parser support for skipping multiple lines at the top of a file (which would contain the YAML header) and for comment lines (lines starting with #). Based on CSV Parser Notes by @hubgit.

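| Language  | Parser          | Skip lines | Comment lines | Comments             |
|-----------|-----------------|------------|---------------|----------------------|
| Excel Mac |                 | yes        | no            |                      |
| Python    | pandas.read_csv | yes        | yes           |                      |
| R         | read.table      | yes        | yes           |                      |
| Ruby      | csv.read        | no         | yes           | skip lines via regex |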
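For parsers that can skip leading lines, the number of lines to skip is the length of the YAML header including both delimiter lines. The sketch below computes that count and then reads the remaining rows with Python's standard `csv` module, dropping lines that start with #. The function names are hypothetical.

```python
import csv


def header_line_count(lines):
    """Number of leading lines occupied by the YAML header, delimiters included."""
    if not lines or lines[0].rstrip() != "---":
        return 0
    for i, line in enumerate(lines[1:], start=1):
        if line.rstrip() in ("---", "..."):
            return i + 1
    raise ValueError("YAML header is not closed by '---' or '...'")


def read_rows(lines):
    """Skip the YAML header and comment lines, then parse what remains as CSV."""
    data = lines[header_line_count(lines):]
    data = [line for line in data if not line.startswith("#")]
    return list(csv.reader(data))
```

The same count can be passed as `skiprows` to `pandas.read_csv` or as `skip` to R's `read.table`.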

Authors and Contributors

Support or Contact

Use GitHub Issues.