public final class CSVParser extends java.lang.Object implements java.lang.Iterable<CSVRecord>, java.io.Closeable
Parses CSV input according to the specified CSVFormat.
The parser works record-wise. Once a record has been parsed from the input stream, it is not possible to go back.
There are several static factory methods that can be used to create instances for various types of resources:
parse(java.io.File, Charset, CSVFormat)
parse(String, CSVFormat)
parse(java.net.URL, java.nio.charset.Charset, CSVFormat)
Alternatively, parsers can be created by passing a Reader directly to one of the constructors.
For those who like fluent APIs, parsers can be created using CSVFormat#parse(java.io.Reader) as a shortcut:
    for (CSVRecord record : CSVFormat.EXCEL.parse(in)) { ... }
To parse CSV input from a file, you write:
    File csvData = new File("/path/to/csv");
    // The charset is required by parse(File, Charset, CSVFormat); UTF-8 is only an example.
    CSVParser parser = CSVParser.parse(csvData, StandardCharsets.UTF_8, CSVFormat.RFC4180);
    for (CSVRecord csvRecord : parser) { ... }
This reads and parses the contents of the file using the RFC 4180 format.
To parse CSV input in a format like Excel, you write:
    CSVParser parser = CSVParser.parse(csvData, StandardCharsets.UTF_8, CSVFormat.EXCEL);
    for (CSVRecord csvRecord : parser) { ... }
If the predefined formats don't match the format at hand, custom formats can be defined. More information about
customising CSVFormats is available in the CSVFormat Javadoc.
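If parsing record-wise is not desired, the contents of the input can be read completely into memory:
    Reader in = new StringReader("a;b\nc;d");
    CSVParser parser = new CSVParser(in, CSVFormat.EXCEL);
    List<CSVRecord> list = parser.getRecords();
Two constraints have to be kept in mind:
Parsing into memory starts at the current parse position in the stream. Records that have already been parsed are not part of the returned list.
Because the remaining input is read completely into memory, large inputs consume a correspondingly large amount of memory.
Note: Internal parser state is completely covered by the format and the reader state.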
Constructor and Description |
---|
CSVParser(java.io.Reader reader, CSVFormat format) Customized CSV parser using the given CSVFormat. |
CSVParser(java.io.Reader reader, CSVFormat format, long characterOffset, long recordNumber) Customized CSV parser using the given CSVFormat. |
Modifier and Type | Method and Description |
---|---|
void | close() Closes resources. |
long | getCurrentLineNumber() Returns the current line number in the input stream. |
java.util.Map<java.lang.String,java.lang.Integer> | getHeaderMap() Returns a copy of the header map that iterates in column order. |
long | getRecordNumber() Returns the current record number in the input stream. |
java.util.List<CSVRecord> | getRecords() Parses the CSV input according to the given format and returns the content as a list of CSVRecords. |
boolean | isClosed() Gets whether this parser is closed. |
java.util.Iterator<CSVRecord> | iterator() Returns an iterator on the records. |
static CSVParser | parse(java.io.File file, java.nio.charset.Charset charset, CSVFormat format) Creates a parser for the given File. |
static CSVParser | parse(java.lang.String string, CSVFormat format) Creates a parser for the given String. |
static CSVParser | parse(java.net.URL url, java.nio.charset.Charset charset, CSVFormat format) Creates a parser for the given URL. |
public CSVParser(java.io.Reader reader, CSVFormat format) throws java.io.IOException
Customized CSV parser using the given CSVFormat.
If you do not read all records from the given reader, you should call close() on the parser, unless you close the reader.
Parameters:
reader - a Reader containing CSV-formatted input. Must not be null.
format - the CSVFormat used for CSV parsing. Must not be null.
Throws:
java.lang.IllegalArgumentException - If the parameters of the format are inconsistent or if either reader or format is null.
java.io.IOException - If there is a problem reading the header or skipping the first record

public CSVParser(java.io.Reader reader, CSVFormat format, long characterOffset, long recordNumber) throws java.io.IOException
Customized CSV parser using the given CSVFormat.
If you do not read all records from the given reader, you should call close() on the parser, unless you close the reader.
Parameters:
reader - a Reader containing CSV-formatted input. Must not be null.
format - the CSVFormat used for CSV parsing. Must not be null.
characterOffset - Lexer offset when the parser does not start parsing at the beginning of the source.
recordNumber - The next record number to assign
Throws:
java.lang.IllegalArgumentException - If the parameters of the format are inconsistent or if either reader or format is null.
java.io.IOException - If there is a problem reading the header or skipping the first record

public static CSVParser parse(java.io.File file, java.nio.charset.Charset charset, CSVFormat format) throws java.io.IOException
Creates a parser for the given File.
Note: This method internally creates a FileReader using FileReader.FileReader(java.io.File), which in turn relies on the default encoding of the JVM that is executing the code. If this is insufficient, create a URL to the file and use parse(URL, Charset, CSVFormat).
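Because the constructors accept any Reader, another way to avoid depending on the JVM default encoding is to build the Reader with an explicit charset before handing it to the parser. The following is a minimal JDK-only sketch of that idea, not part of the library: the class name `ExplicitCharsetSketch` and its helper methods are hypothetical, and the sample text is made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, for illustration only.
final class ExplicitCharsetSketch {

    /** Opens a Reader that decodes the file with the given charset instead of the JVM default. */
    static Reader openReader(Path file, Charset charset) throws IOException {
        return new BufferedReader(new InputStreamReader(Files.newInputStream(file), charset));
    }

    /** Reads the whole file through openReader; used here only to make the decoding visible. */
    static String readAll(Path file, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = openReader(file, charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    /** Writes the text as ISO-8859-1 to a temporary file and reads it back with readCharset. */
    static String roundTrip(String text, Charset readCharset) {
        try {
            Path file = Files.createTempFile("charset-sketch", ".csv");
            try {
                Files.write(file, text.getBytes(StandardCharsets.ISO_8859_1));
                return readAll(file, readCharset);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Non-ASCII characters are written as Unicode escapes to keep this source file ASCII-only.
        String text = "name,city\nJos\u00e9,M\u00e1laga\n";
        // Decoding with the charset the file was written in restores the text.
        System.out.println(text.equals(roundTrip(text, StandardCharsets.ISO_8859_1)));
        // Decoding with a different charset silently corrupts the accented characters.
        System.out.println(text.equals(roundTrip(text, StandardCharsets.UTF_8)));
    }
}
```

A Reader obtained this way could then be passed to CSVParser(java.io.Reader, CSVFormat); in that case the caller decides the encoding and remains responsible for closing the parser or the reader.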
Parameters:
file - a CSV file. Must not be null.
charset - A charset
format - the CSVFormat used for CSV parsing. Must not be null.
Throws:
java.lang.IllegalArgumentException - If the parameters of the format are inconsistent or if either file or format is null.
java.io.IOException - If an I/O error occurs

public static CSVParser parse(java.lang.String string, CSVFormat format) throws java.io.IOException
Creates a parser for the given String.
Parameters:
string - a CSV string. Must not be null.
format - the CSVFormat used for CSV parsing. Must not be null.
Throws:
java.lang.IllegalArgumentException - If the parameters of the format are inconsistent or if either string or format is null.
java.io.IOException - If an I/O error occurs

public static CSVParser parse(java.net.URL url, java.nio.charset.Charset charset, CSVFormat format) throws java.io.IOException
Creates a parser for the given URL.
If you do not read all records from the given url, you should call close() on the parser, unless you close the url.
Parameters:
url - a URL. Must not be null.
charset - the charset for the resource. Must not be null.
format - the CSVFormat used for CSV parsing. Must not be null.
Throws:
java.lang.IllegalArgumentException - If the parameters of the format are inconsistent or if url, charset or format is null.
java.io.IOException - If an I/O error occurs

public void close() throws java.io.IOException
Closes resources.
Specified by:
close in interface java.io.Closeable
close in interface java.lang.AutoCloseable
Throws:
java.io.IOException - If an I/O error occurs

public long getCurrentLineNumber()
Returns the current line number in the input stream.
ATTENTION: If your CSV input has multi-line values, the returned number does not correspond to the record number.
public java.util.Map<java.lang.String,java.lang.Integer> getHeaderMap()
The map keys are column names. The map values are 0-based indices.
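To illustrate the shape of such a map (column names as keys, 0-based indices as values, iteration in column order), here is a minimal JDK-only sketch. It is not the library's implementation: the class `HeaderMapSketch`, the method `buildHeaderMap`, and the use of `LinkedHashMap` are assumptions chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
final class HeaderMapSketch {

    /** Maps each column name to its 0-based position, preserving column order on iteration. */
    static Map<String, Integer> buildHeaderMap(String... columnNames) {
        Map<String, Integer> map = new LinkedHashMap<>(); // keeps insertion (column) order
        for (int i = 0; i < columnNames.length; i++) {
            map.put(columnNames[i], i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> headerMap = buildHeaderMap("id", "name", "email");
        String[] row = {"7", "Ada", "ada@example.org"};

        // Look up a value by column name via its index.
        System.out.println(row[headerMap.get("name")]); // prints "Ada"
        // Keys iterate in column order.
        System.out.println(headerMap.keySet());         // prints "[id, name, email]"
    }
}
```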
public long getRecordNumber()
ATTENTION: If your CSV input has multi-line values, the returned number does not correspond to the line number.
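The difference between line numbers and record numbers can be made concrete with a small counting sketch. It uses only the JDK and is not the parser's actual logic: the class `LineVsRecordSketch` is hypothetical, and it assumes comma-separated input, double-quote quoting, and LF line endings only.

```java
// Hypothetical helper class, for illustration only.
final class LineVsRecordSketch {

    /** Returns {physicalLines, records} for CSV text with LF line endings and double-quote quoting. */
    static long[] countLinesAndRecords(String csv) {
        long lines = 0;
        long records = 0;
        boolean inQuotes = false;      // true while inside a quoted value
        boolean recordHasData = false; // true once the current record contains any character

        for (int i = 0; i < csv.length(); i++) {
            char c = csv.charAt(i);
            if (c == '"') {
                // An escaped quote ("") toggles twice, leaving the state unchanged.
                inQuotes = !inQuotes;
                recordHasData = true;
            } else if (c == '\n') {
                lines++; // every line feed ends a physical line
                if (inQuotes) {
                    recordHasData = true; // the line break is part of the value
                } else if (recordHasData) {
                    records++;            // an unquoted line feed ends the record
                    recordHasData = false;
                }
            } else {
                recordHasData = true;
            }
        }
        if (recordHasData) { // last record without a trailing line feed
            lines++;
            records++;
        }
        return new long[] {lines, records};
    }

    public static void main(String[] args) {
        // The second record contains a quoted value that spans two physical lines.
        String csv = "id,comment\n1,\"first line\nsecond line\"\n2,plain\n";
        long[] counts = countLinesAndRecords(csv);
        System.out.println("lines=" + counts[0] + ", records=" + counts[1]); // prints "lines=4, records=3"
    }
}
```

With a multi-line value in the input, the physical line count runs ahead of the record count, which is why the two getters should not be used interchangeably.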
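public java.util.List<CSVRecord> getRecords() throws java.io.IOException
Parses the CSV input according to the given format and returns the content as a list of CSVRecords.
The returned content starts at the current parse-position in the stream.
Returns:
list of CSVRecords, may be empty
Throws:
java.io.IOException - on parse error or input read-failure

public boolean isClosed()
Gets whether this parser is closed.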

public java.util.Iterator<CSVRecord> iterator()
Returns an iterator on the records.
IOExceptions occurring during the iteration are wrapped in a RuntimeException. If the parser is closed, a call to next() will throw a NoSuchElementException.
Specified by:
iterator in interface java.lang.Iterable<CSVRecord>
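
The iterator contract described above (I/O failures surfaced as unchecked exceptions, NoSuchElementException after closing) can be sketched with a plain line-based source. This is a JDK-only illustration of the pattern, not the parser's code: the class `LineSourceSketch` is hypothetical, it iterates over raw lines rather than CSV records, and the choice of `UncheckedIOException` as the wrapping RuntimeException is an assumption.

```java
import java.io.BufferedReader;
import java.io.Closeable;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical class, for illustration only; it yields raw lines, not CSV records.
final class LineSourceSketch implements Iterable<String>, Closeable {
    private final BufferedReader reader;
    private boolean closed;

    LineSourceSketch(Reader in) {
        this.reader = new BufferedReader(in);
    }

    @Override
    public void close() throws IOException {
        closed = true;
        reader.close();
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private String lookAhead; // next line, fetched lazily

            private String fetch() {
                try {
                    return reader.readLine();
                } catch (IOException e) {
                    // Iterator methods cannot throw checked exceptions, so wrap the failure.
                    throw new UncheckedIOException(e);
                }
            }

            @Override
            public boolean hasNext() {
                if (closed) {
                    return false;
                }
                if (lookAhead == null) {
                    lookAhead = fetch();
                }
                return lookAhead != null;
            }

            @Override
            public String next() {
                if (closed) {
                    throw new NoSuchElementException("source has been closed");
                }
                if (!hasNext()) {
                    throw new NoSuchElementException("no more lines");
                }
                String result = lookAhead;
                lookAhead = null;
                return result;
            }
        };
    }

    /** Reads one line, closes the source, then shows what a further next() call does. */
    static String demo() {
        LineSourceSketch source = new LineSourceSketch(new StringReader("a,b\nc,d\n"));
        Iterator<String> it = source.iterator();
        String first = it.next();
        try {
            source.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try {
            it.next();
            return first + " / no exception";
        } catch (NoSuchElementException e) {
            return first + " / NoSuchElementException";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "a,b / NoSuchElementException"
    }
}
```

Callers that iterate with a for-each loop therefore need to be prepared for a RuntimeException rather than a checked IOException, and should not keep using an iterator after the underlying source has been closed.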