CSVY: yaml frontmatter for csv file format

This project is maintained by the csvy team

Welcome to CSVY.

This page describes the specification of YAML frontmatter for the CSV file format. The main goals of the format are extreme simplicity and readability.

For human data curators, the technological gap between the successive stages of no data, CSV, metadata plus CSV, and semi-structured data is too large. A simple file format for adding metadata to existing datasets is needed. JSON is hard for humans to read, but YAML can do the job because it is easy to read for both humans and software.

Based on Tabular Data Resource

There are important initiatives, such as Tabular Data Resource, which uses JSON plus CSV, but most are meant to be published and read by machines.

CSVY is a simple container for a Tabular Data Resource. The metadata and schema are translated from JSON to YAML and placed in the YAML frontmatter part of the file. The data part follows the frontmatter and is stored as CSV, as described by the CSV Dialect Description Format.

YAML Header delimiter

A YAML metadata block is a valid YAML object, delimited by a line of three hyphens --- at the top and a line of three hyphens --- or three dots ... at the bottom.
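The delimiter rule above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name split_csvy and the sample text are made up for the example, and it assumes the opening delimiter is the very first line of the file.

```python
def split_csvy(text):
    """Split CSVY text into (yaml_text, csv_text).

    Hypothetical helper: the opening delimiter is a line of three hyphens,
    the closing one is a line of three hyphens or three dots.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].rstrip() != "---":
        # No front matter: treat the whole input as CSV data.
        return "", text
    for i, line in enumerate(lines[1:], start=1):
        if line.rstrip() in ("---", "..."):
            # Everything between the delimiters is YAML, the rest is CSV.
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    raise ValueError("YAML front matter is not closed")


# Made-up sample, for illustration only.
sample = "---\nname: my-dataset\n---\nvar1,var2\nA,1\n"
yaml_text, csv_text = split_csvy(sample)
```

The YAML part can then be handed to any YAML parser and the CSV part to any CSV reader.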

Defining the Table Schema

Use the Table Schema to describe the fields. Note that the CSVY format is designed to store only one dataset per file.

---
profile: tabular-data-resource
name: my-dataset
path: https://raw.githubusercontent.com/csvy/csvy.github.io/master/examples/example.csvy
title: Example file of csvy
description: Show a csvy sample file.
format: csvy
mediatype: text/vnd.yaml
encoding: utf-8
schema:
  fields:
  - name: var1
    type: string
  - name: var2
    type: integer
  - name: var3
    type: number
dialect:
  csvddfVersion: 1.0
  delimiter: ","
  doubleQuote: false
  lineTerminator: "\r\n"
  quoteChar: "\""
  skipInitialSpace: true
  header: true
sources:
- title: The csvy specifications
  path: http://csvy.org/
  email: ''
licenses:
- name: CC-BY-4.0
  title: Creative Commons Attribution 4.0
  path: https://creativecommons.org/licenses/by/4.0/
---
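As a rough sketch of how a reader might use the field types declared above, the following Python casts each CSV column according to its declared type. It assumes the metadata has already been parsed by a YAML library into plain Python lists and dicts. The CASTS mapping, the function name read_typed_rows, and the data rows are illustrative assumptions, and only the three types used in the example are covered.

```python
import csv
import io

# Assumed mapping from Table Schema type names to Python casts;
# only the three types used in the example above are covered.
CASTS = {"string": str, "integer": int, "number": float}


def read_typed_rows(csv_text, fields):
    """Yield one dict per data row, with values cast according to `fields`."""
    # skipinitialspace mirrors the skipInitialSpace: true dialect setting.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        yield {f["name"]: CASTS[f["type"]](row[f["name"]]) for f in fields}


# The field list as a YAML parser would return it for the example above.
fields = [
    {"name": "var1", "type": "string"},
    {"name": "var2", "type": "integer"},
    {"name": "var3", "type": "number"},
]
# Made-up data rows, for illustration only.
rows = list(read_typed_rows("var1,var2,var3\nA,1,2.0\nB,3,4.3\n", fields))
```

A fuller reader would also honor the remaining dialect settings, such as delimiter and quoteChar, instead of relying on the csv module defaults.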

Libraries supporting CSVY

Backwards Compatibility

For backward compatibility, you can always keep a data.yml metadata file alongside your data.csv. Once proper implementations exist, the next step of merging the two into a single-file container, data.csvy, will not be a problem at all.

The table below lists parser support for skipping multiple lines in the header (which would contain the YAML) and for comment lines (lines starting with #). It is based on CSV Parser Notes by @hubgit.
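Moving from the two-file layout to a single container could look like the following sketch. The function name merge_to_csvy is hypothetical, and the code assumes that data.yml holds a plain YAML mapping without document markers of its own.

```python
def merge_to_csvy(yaml_path, csv_path, csvy_path):
    """Write a CSVY file from a YAML metadata file and a CSV data file."""
    # newline="" keeps the line endings of both inputs untouched.
    with open(yaml_path, encoding="utf-8", newline="") as f:
        yaml_text = f.read()
    with open(csv_path, encoding="utf-8", newline="") as f:
        csv_text = f.read()
    # Make sure the closing delimiter starts on its own line.
    if yaml_text and not yaml_text.endswith("\n"):
        yaml_text += "\n"
    with open(csvy_path, "w", encoding="utf-8", newline="") as f:
        f.write("---\n" + yaml_text + "---\n" + csv_text)
```

Calling merge_to_csvy("data.yml", "data.csv", "data.csvy") would produce the single-file form while leaving the original pair in place.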

Language  Parser           Skip lines  Comment lines  Comments
Excel     Mac              yes         no
Python    pandas.read_csv  yes         yes
R         read.table       yes         yes
Ruby      csv.read         no          yes            skip lines via regex
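For parsers that can skip lines, the number of lines to skip has to be known first. The sketch below counts the frontmatter lines using only the Python standard library. The function name front_matter_line_count and the sample text are assumptions made for the example.

```python
def front_matter_line_count(text):
    """Return how many leading lines belong to the YAML front matter,
    both delimiter lines included, or 0 if there is no front matter."""
    lines = text.splitlines()
    if not lines or lines[0].rstrip() != "---":
        return 0
    for i, line in enumerate(lines[1:], start=1):
        if line.rstrip() in ("---", "..."):
            return i + 1
    raise ValueError("YAML front matter is not closed")


# Made-up sample, for illustration only.
sample = "---\nname: my-dataset\nformat: csvy\n---\nvar1,var2\nA,1\n"
skip = front_matter_line_count(sample)
```

The resulting count could then be passed to a parser's skip option, for instance the skiprows argument of pandas.read_csv or the skip argument of R's read.table.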

Authors and Contributors

Support or Contact

Use GitHub Issues.