Class Summary
Class | Description
CSVParser | Parses CSV files according to the specified format.
CSVRecord | A CSV record parsed from a CSV file.
Package org.apache.commons.csv Description
Apache Commons CSV Format Support.
CSV files are widely used as interfaces to legacy systems or for manual data imports.
CSV stands for "Comma Separated Values" (or sometimes "Character Separated
Values"). The CSV data format is defined in
RFC 4180
but many dialects exist.
Common to all dialects is the basic structure: the CSV data format
is record-oriented, and each record starts on a new textual line. A
record is built from a list of values. Keep in mind that not all records
must have an equal number of values:
csv := records*
record := values*
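The grammar above can be sketched in plain Java. This is only an illustration of the record/value structure, not the Commons CSV implementation: the class name RaggedCsvSketch and its split method are hypothetical, and quoting, comments and trimming are left out on purpose.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration class, not part of org.apache.commons.csv.
final class RaggedCsvSketch {

    // csv := records*   (records are separated by "\r\n", "\r" or "\n")
    // record := values* (values are separated by the delimiter)
    // Quoting, comments and trimming are deliberately ignored here.
    static List<List<String>> split(String text, char delimiter) {
        String[] lines = text.split("\r\n|\r|\n", -1);
        int count = lines.length;
        // A terminator at the very end closes the last record; it does not open a new one.
        if (lines[count - 1].isEmpty()) {
            count--;
        }
        String delimiterRegex = Pattern.quote(String.valueOf(delimiter));
        List<List<String>> records = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // Limit -1 keeps trailing empty values such as the last one in "a,b,".
            records.add(new ArrayList<>(Arrays.asList(lines[i].split(delimiterRegex, -1))));
        }
        return records;
    }
}
```

Calling RaggedCsvSketch.split("a,b,c\nd\r\ne,f", ',') yields three records holding 3, 1 and 2 values, which shows that records need not have the same length.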
The following list contains the CSV aspects the Commons CSV parser supports:
- Separators (for lines)
- The record separators are hardcoded and cannot be changed. They must be '\r', '\n' or '\r\n'.
- Delimiter (for values)
- The delimiter for values is freely configurable (default ',').
- Comments
- Some CSV dialects support a simple comment syntax. A comment is a record
that must start with a designated character (the commentStarter). A record
of this kind is treated as a comment and is removed from the input (default: none).
- Encapsulator
- Two encapsulator characters (default '"') are used to enclose -> complex values.
- Simple values
- A simple value consists of all characters (except the delimiter) up to
(but not including) the next delimiter or record terminator. Optionally,
all whitespace surrounding a simple value can be ignored (default: true).
- Complex values
- Complex values are encapsulated within a pair of the defined encapsulator characters.
The encapsulator itself must be escaped or doubled when used inside complex values.
Complex values preserve all kinds of formatting (including newlines -> multiline values).
- Empty line skipping
- Optionally, empty lines in CSV files can be skipped.
Otherwise, an empty line returns a record with a single empty value.
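The aspects listed above can be combined in a small stand-alone tokenizer. The sketch below is not the Commons CSV parser: the class DialectSketch, its parse method and all parameter names are hypothetical, it assumes well-formed input, it supports only the doubled-encapsulator form of escaping, and it silently drops any characters between a closing encapsulator and the next delimiter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of org.apache.commons.csv.
final class DialectSketch {

    static boolean isEol(char c) {
        return c == '\r' || c == '\n';
    }

    // Consumes one record separator: "\r\n", "\r" or "\n".
    static int skipEol(String text, int i) {
        if (i < text.length() && text.charAt(i) == '\r') i++;
        if (i < text.length() && text.charAt(i) == '\n') i++;
        return i;
    }

    // commentStart == null disables comment handling.
    static List<List<String>> parse(String text, char delimiter, char encapsulator,
                                    Character commentStart, boolean trim,
                                    boolean skipEmptyLines) {
        List<List<String>> records = new ArrayList<>();
        int n = text.length();
        int i = 0;
        while (i < n) {
            char first = text.charAt(i);
            // Comments: a record starting with the comment character is removed.
            if (commentStart != null && first == commentStart) {
                while (i < n && !isEol(text.charAt(i))) i++;
                i = skipEol(text, i);
                continue;
            }
            // Empty line skipping; otherwise the loop below yields a single empty value.
            if (skipEmptyLines && isEol(first)) {
                i = skipEol(text, i);
                continue;
            }
            List<String> record = new ArrayList<>();
            while (true) {
                StringBuilder value = new StringBuilder();
                if (trim) {
                    // Skip blanks in front of the value, but never eat the delimiter itself.
                    while (i < n && text.charAt(i) != delimiter
                            && (text.charAt(i) == ' ' || text.charAt(i) == '\t')) i++;
                }
                if (i < n && text.charAt(i) == encapsulator) {
                    // Complex value: everything up to the closing encapsulator is kept verbatim.
                    i++;
                    while (i < n) {
                        char c = text.charAt(i++);
                        if (c != encapsulator) {
                            value.append(c);
                        } else if (i < n && text.charAt(i) == encapsulator) {
                            value.append(encapsulator); // doubled encapsulator = literal one
                            i++;
                        } else {
                            break; // closing encapsulator
                        }
                    }
                    // Ignore anything between the closing encapsulator and the next boundary.
                    while (i < n && text.charAt(i) != delimiter && !isEol(text.charAt(i))) i++;
                    record.add(value.toString());
                } else {
                    // Simple value: read up to the next delimiter or record terminator.
                    while (i < n && text.charAt(i) != delimiter && !isEol(text.charAt(i))) {
                        value.append(text.charAt(i++));
                    }
                    record.add(trim ? value.toString().trim() : value.toString());
                }
                if (i < n && text.charAt(i) == delimiter) {
                    i++; // another value follows in the same record
                    continue;
                }
                break;
            }
            i = skipEol(text, i);
            records.add(record);
        }
        return records;
    }
}
```

For the input "# note\r\n a , \"say \"\"hi\"\"\nbye\" ,c\n\nlast" with delimiter ',', encapsulator '"', comment character '#', trimming on and empty-line skipping on, the sketch returns two records: one with the values a, a two-line value reading say "hi" and then bye, and c; and one with the single value last. With empty-line skipping off, a record containing one empty value appears between them.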
In addition to individually defined dialects, two predefined dialects (strict-csv and excel-csv)
can be set directly.
Example usage:
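import java.io.Reader;
import java.io.StringReader;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVRecord;

// Parse the input and print every field in double quotes; parse(Reader) can throw IOException.
Reader in = new StringReader("a,b,c");
for (CSVRecord record : CSVFormat.DEFAULT.parse(in)) {
    for (String field : record) {
        System.out.print("\"" + field + "\", ");
    }
    System.out.println();
}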