This page describes the specification of YAML frontmatter for the CSV file format. The main goals of the format are extreme simplicity and readability.
For human data curators, the technological gap between the successive stages of no data, CSV, metadata + CSV, and semi-structured data is too large. A simple file format for adding metadata to existing datasets is needed. JSON is very cryptic for humans, but YAML can do the job.
There are important initiatives, such as Tabular Data Packages (JSON + CSV), but most are meant to be published and read by machines.
CSVY is a simple container for a Tabular Data Package: the metadata and schema are translated from JSON to YAML and placed in the YAML frontmatter at the top of the file. The data part follows the frontmatter and is stored as CSV, described using the CSV Dialect Description Format. Multiple data resources can be included in one file, separated by the YAML header delimiter.
A YAML metadata block is a valid YAML object, delimited by a line of three hyphens (---) at the top and a line of three hyphens (---) or three dots (...) at the bottom.
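The delimiter rule above can be sketched in a few lines of Python. This is only an illustration using the standard library, not a reference implementation: the function name `split_csvy` is hypothetical, the YAML header is returned as raw text (a YAML library such as PyYAML could parse it afterwards), and multiple data resources in one file are not handled.

```python
import csv
import io


def split_csvy(text):
    """Split CSVY text into (raw YAML header, list of CSV rows).

    Hypothetical helper: handles a single header block and a single data part.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        # No frontmatter: treat the whole input as plain CSV.
        return "", list(csv.reader(io.StringIO(text, newline="")))
    for i in range(1, len(lines)):
        # The block ends at a line of three hyphens or three dots.
        if lines[i].strip() in ("---", "..."):
            header = "".join(lines[1:i])
            body = "".join(lines[i + 1:])
            return header, list(csv.reader(io.StringIO(body, newline="")))
    raise ValueError("unterminated YAML frontmatter")


sample = """---
name: my-dataset
---
var1,var2,var3
A,1,2.0
B,3,4.3
"""

header, rows = split_csvy(sample)
print(header)  # raw YAML text between the delimiters
print(rows)    # CSV rows as lists of strings
```

Returning the header unparsed keeps the sketch free of third-party dependencies; a real reader would pass it to a YAML parser.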
    ---
    name: my-dataset
    resources:
    - order: 1
      schema:
        fields:
        - name: var1
          type: string
        - name: var2
          type: integer
        - name: var3
          type: number
      dialect:
        csvddfVersion: 1.0
        delimiter: ","
        doubleQuote: false
        lineTerminator: "\r\n"
        quoteChar: "\""
        skipInitialSpace: true
        header: true
    ---
    var1,var2,var3
    A,1,2.0
    B,3,4.3
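As an illustration of how the dialect keys in the example above might be applied when reading the data part, here is a minimal Python sketch. It assumes the dialect has already been parsed from YAML into a plain dictionary. The names `CSVDDF_TO_PYTHON` and `read_with_dialect` are hypothetical, and treating a missing `header` key as true is an assumption of this sketch.

```python
import csv
import io

# Dialect keys from the header, mapped to the keyword names used by
# Python's csv module. Keys not listed here (e.g. csvddfVersion) are ignored.
CSVDDF_TO_PYTHON = {
    "delimiter": "delimiter",
    "doubleQuote": "doublequote",
    "quoteChar": "quotechar",
    "skipInitialSpace": "skipinitialspace",
    "lineTerminator": "lineterminator",
}


def read_with_dialect(csv_text, dialect):
    """Return (header row or None, data rows) using the given dialect dict."""
    fmt = {py: dialect[key] for key, py in CSVDDF_TO_PYTHON.items() if key in dialect}
    # Note: csv.reader accepts lineterminator but always recognises
    # '\r' and '\n' as line endings when reading.
    rows = list(csv.reader(io.StringIO(csv_text, newline=""), **fmt))
    # Assumption: a missing "header" key means the first row is a header.
    if dialect.get("header", True) and rows:
        return rows[0], rows[1:]
    return None, rows


dialect = {
    "csvddfVersion": 1.0,
    "delimiter": ",",
    "doubleQuote": False,
    "lineTerminator": "\r\n",
    "quoteChar": "\"",
    "skipInitialSpace": True,
    "header": True,
}

names, data = read_with_dialect("var1,var2,var3\r\nA, 1, 2.0\r\nB, 3, 4.3\r\n", dialect)
print(names)
print(data)
```

Values come back as strings; converting them according to the `schema` field types (string, integer, number) would be a separate step.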
For backward compatibility, you can always add a data.yml metadata file alongside your data.csv. Once proper implementations exist, moving to a single-file container, data.csvy, will not be a problem at all.
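Moving from the two-file layout to a single container is mostly string concatenation. The sketch below is one possible way to do it in Python, assuming the metadata file holds a single YAML document; the function name `build_csvy` is hypothetical.

```python
def build_csvy(metadata_yaml, csv_data):
    """Wrap YAML metadata in '---' delimiters and append the CSV data."""
    meta = metadata_yaml.strip("\n")
    # Drop a leading document-start marker so the output has exactly one.
    if meta.startswith("---\n"):
        meta = meta[4:]
    return "---\n" + meta + "\n---\n" + csv_data


# Contents that would normally be read from data.yml and data.csv.
yml_text = "name: my-dataset\n"
csv_text = "var1,var2,var3\nA,1,2.0\nB,3,4.3\n"

csvy_text = build_csvy(yml_text, csv_text)
print(csvy_text)
```

In practice the two inputs would be read from data.yml and data.csv, and the result written to data.csvy.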
| Language | Parser | Skip lines | Comment lines | Comments |
|---|---|---|---|---|
| Ruby | CSV.read | no | yes | skip lines via regex |
Use GitHub Issues.