Readers and Writers in Super CSV are configured using the CsvPreference class. This class is immutable and is assembled using the Builder pattern.
The preferences available are:
There are four 'ready to use' configurations for typical scenarios.
Constant | Quote char | Delimiter char | End of line symbols |
---|---|---|---|
STANDARD_PREFERENCE | " | , | \r\n |
EXCEL_PREFERENCE | " | , | \n |
EXCEL_NORTH_EUROPE_PREFERENCE | " | ; | \n |
TAB_PREFERENCE | " | \t | \n |
All of these configurations use the default values of:
Preference | Default value |
---|---|
surroundingSpacesNeedQuotes | false |
ignoreEmptyLines | true |
maxLinesPerRow | 0 (disabled) |
encoder | DefaultCsvEncoder |
quoteMode | NormalQuoteMode |
skipComments | false (no CommentMatcher used) |
If none of the predefined preferences suit your purposes, you can easily create your own (you're not just limited to CSV files!). For example, the following code snippet creates preferences suitable for reading/writing pipe-delimited files.
private static final CsvPreference PIPE_DELIMITED = new CsvPreference.Builder('"', '|', "\n").build();
In accordance with RFC 4180, the default behaviour of Super CSV is to treat all spaces as important, including spaces surrounding the text in a cell.
This means that when reading, a cell whose contents have leading or trailing spaces is read with those spaces preserved. When writing, the same String is written with its surrounding spaces and without surrounding quotes (quotes aren't required, because the spaces are already treated as significant).
There are some scenarios where this restriction must be relaxed, in particular when the CSV file you're working with assumes that surrounding spaces are only preserved when they are enclosed in quotes, and are otherwise ignored. For this reason, Super CSV allows you to enable the surroundingSpacesNeedQuotes preference.
With surroundingSpacesNeedQuotes enabled, a cell whose contents are surrounded by spaces is read with those spaces trimmed, unless the contents are enclosed in quotes, e.g. " surrounded by spaces ", in which case the spaces are preserved. When writing, any String with leading or trailing spaces is automatically given surrounding quotes so that the spaces are preserved.
You can enable this behaviour by calling surroundingSpacesNeedQuotes(true) on the Builder. You can do this with your own custom preference, or customize an existing preference as shown below.
private static final CsvPreference STANDARD_SURROUNDING_SPACES_NEED_QUOTES = new CsvPreference.Builder(CsvPreference.STANDARD_PREFERENCE).surroundingSpacesNeedQuotes(true).build();
Prior to Super CSV 2.0.0, this behaviour wasn't configurable and surrounding spaces were always trimmed.
By default, all empty lines (which aren't quoted) are ignored when reading CSV. This is useful if your CSV file has leading or trailing empty lines. If you wish to disable this behaviour and allow empty lines to be read, you can use the ignoreEmptyLines preference:
private static final CsvPreference ALLOW_EMPTY_LINES = new CsvPreference.Builder(CsvPreference.STANDARD_PREFERENCE).ignoreEmptyLines(false).build();
If your CSV file isn't quoted properly (it's missing a trailing quote, for example), Super CSV may keep reading lines in an attempt to locate the end of the row. For large files this can cause an OutOfMemoryError. By setting the maxLinesPerRow preference, Super CSV will fail fast, giving you a chance to locate the row in error and fix it:
private static final CsvPreference MAX_TEN_LINES_PER_ROW = new CsvPreference.Builder(CsvPreference.STANDARD_PREFERENCE).maxLinesPerRow(10).build();
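To make the fail-fast idea concrete, here is a minimal, library-free sketch of how a limit on lines per row can stop a runaway quoted column. It is a simplified model and not Super CSV's implementation: the class `MaxLinesPerRowSketch`, the method `readRows` and the use of `IllegalStateException` are all assumptions made for illustration, and the quote character is fixed to `"`.

```java
import java.util.ArrayList;
import java.util.List;

public class MaxLinesPerRowSketch {

    /**
     * Groups physical lines into logical rows. A row continues onto the next
     * line while a quote is still open. A limit of 0 means "no limit".
     * (Hypothetical helper, not part of Super CSV.)
     */
    public static List<String> readRows(String csv, int maxLinesPerRow) {
        List<String> rows = new ArrayList<>();
        StringBuilder row = new StringBuilder();
        boolean inQuotes = false;
        int linesInRow = 0;
        int lineNumber = 0;

        for (String line : csv.split("\r?\n")) {
            lineNumber++;
            linesInRow++;
            if (linesInRow > 1) {
                row.append('\n'); // keep the embedded newline of a multi-line row
            }
            row.append(line);

            // Every quote toggles the state; an escaped quote ("") toggles twice.
            for (int i = 0; i < line.length(); i++) {
                if (line.charAt(i) == '"') {
                    inQuotes = !inQuotes;
                }
            }

            if (!inQuotes) {
                rows.add(row.toString());
                row.setLength(0);
                linesInRow = 0;
            } else if (maxLinesPerRow > 0 && linesInRow >= maxLinesPerRow) {
                // Fail fast instead of buffering the rest of the file.
                throw new IllegalStateException("row starting on line "
                        + (lineNumber - linesInRow + 1)
                        + " still has an open quote after " + linesInRow + " line(s)");
            }
        }
        if (inQuotes) {
            throw new IllegalStateException("input ended inside a quoted column");
        }
        return rows;
    }
}
```

In this model a limit of 1 rejects any row that spans more than one line, while a larger limit still allows legitimate multi-line values up to that size.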
By default, Super CSV only adds surrounding quotes when writing a value that contains a delimiter, quote or newline (or when you've enabled surroundingSpacesNeedQuotes and the value has surrounding spaces).
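The default rule above can be sketched in a few lines of plain Java. This is a simplified model for illustration only, not the library's encoder: the class `QuoteRuleSketch` and its methods are hypothetical, and escaping an embedded quote by doubling it follows RFC 4180 and is an assumption here, since the text above doesn't spell it out.

```java
public class QuoteRuleSketch {

    /** Returns true if the value needs surrounding quotes under the default rule. */
    public static boolean needsQuotes(String value, char quoteChar, char delimiterChar,
            boolean surroundingSpacesNeedQuotes) {
        if (value.isEmpty()) {
            return false; // nothing to protect
        }
        boolean hasSpecialChar = value.indexOf(delimiterChar) >= 0
                || value.indexOf(quoteChar) >= 0
                || value.indexOf('\n') >= 0
                || value.indexOf('\r') >= 0;
        boolean hasSurroundingSpace = value.charAt(0) == ' '
                || value.charAt(value.length() - 1) == ' ';
        return hasSpecialChar || (surroundingSpacesNeedQuotes && hasSurroundingSpace);
    }

    /** Encodes a single value, quoting it only when the rule above requires it. */
    public static String encode(String value, char quoteChar, char delimiterChar,
            boolean surroundingSpacesNeedQuotes) {
        if (!needsQuotes(value, quoteChar, delimiterChar, surroundingSpacesNeedQuotes)) {
            return value;
        }
        String quote = String.valueOf(quoteChar);
        // Assumption: embedded quotes are escaped by doubling them (RFC 4180 style).
        return quote + value.replace(quote, quote + quote) + quote;
    }

    public static void main(String[] args) {
        System.out.println(encode("plain", '"', ',', false));
        System.out.println(encode("a,b", '"', ',', false));
        System.out.println(encode(" padded ", '"', ',', false));
        System.out.println(encode(" padded ", '"', ',', true));
    }
}
```

Note how the value with surrounding spaces is only quoted when the `surroundingSpacesNeedQuotes` flag is set, which mirrors the two writing behaviours described earlier.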
Super CSV provides two alternative quoting modes:
You can also write your own QuoteMode, but please note that this is a means to enable quotes when they're not normally required (you won't be able to disable quotes because then your CSV will not be readable if it contains embedded special characters). Just pass your desired mode to the useQuoteMode() method when building your preferences:
private static final CsvPreference ALWAYS_QUOTE = new CsvPreference.Builder(CsvPreference.STANDARD_PREFERENCE).useQuoteMode(new AlwaysQuoteMode()).build();
Super CSV provides a powerful CsvEncoder, but if you'd like complete control over how your CSV is encoded, then you can supply your own to the useEncoder() method when building your preferences:
private static final CsvPreference CUSTOM_ENCODER = new CsvPreference.Builder(CsvPreference.STANDARD_PREFERENCE).useEncoder(new MyAwesomeEncoder()).build();
If you'd like to encode particular columns, but leave other columns unchanged then you can use Super CSV's SelectiveCsvEncoder. This might be useful if you're really concerned with performance and you know that certain columns will never contain special characters. Just be aware that if a column does contain special characters and you don't encode it, you could end up with invalid CSV.
Although comments aren't part of RFC 4180, some CSV files use them, so it's useful to be able to skip these lines (or even to skip lines because they contain invalid data). You can use one of the predefined comment matchers:
Or if you like you can write your own by implementing the CommentMatcher interface.
Just pass your desired comment matcher to the skipComments() method when building your preferences:
private static final CsvPreference STANDARD_SKIP_COMMENTS = new CsvPreference.Builder(CsvPreference.STANDARD_PREFERENCE).skipComments(new CommentStartsWith("#")).build();
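If you do write your own matcher, it might look something like the sketch below. To keep the example compilable without the library on the classpath, it declares a local stand-in interface with the same single-method shape (`boolean isComment(String line)`); in real code you would implement Super CSV's `CommentMatcher` interface instead. The names `CommentSkippingSketch`, `AnyPrefixMatcher` and `skipComments` are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class CommentSkippingSketch {

    /** Local stand-in for the library's CommentMatcher interface (assumption). */
    public interface CommentMatcher {
        boolean isComment(String line);
    }

    /** Hypothetical matcher: a line is a comment if it starts with any given prefix. */
    public static final class AnyPrefixMatcher implements CommentMatcher {
        private final String[] prefixes;

        public AnyPrefixMatcher(String... prefixes) {
            this.prefixes = prefixes.clone();
        }

        @Override
        public boolean isComment(String line) {
            for (String prefix : prefixes) {
                if (line.startsWith(prefix)) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Models what a reader does with the matcher: comment lines are dropped. */
    public static List<String> skipComments(List<String> lines, CommentMatcher matcher) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (!matcher.isComment(line)) {
                kept.add(line);
            }
        }
        return kept;
    }
}
```

Because the matcher sees the whole line, the same approach can be used to skip lines for reasons other than comments, such as lines known to contain invalid data.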