Text handling

There are two options for text handling in Noda Time. For some elements of formatting, you can follow the "normal" approach from the .NET Base Class Library (BCL) - in particular, most of the core Noda Time types implement IFormattable. However, no parsing support is provided in this way. (It used to be, but the whole approach is so convoluted that documenting it accurately proved too great an overhead.)

The preferred approach is to use the "pattern" classes such as LocalDatePattern and so forth. This leads to clearer, more robust code, and performs better. The BCL-style formatting support exists mainly to work well with composite format strings, where you may wish to mix several values of different types in a single formatting call.

All the types responsible for text in Noda Time are in the NodaTime.Text namespace.

The pattern-based API

A pattern is an object capable of parsing from text to a specific type, and formatting a value to text. Parsing and formatting don't take any other options: the pattern knows everything about how to map between the value and text. In particular, internationalization is handled by having the pattern hold a CultureInfo.

With the BCL approach, the format information has to be specified on every call; with the pattern approach, it is fixed for any particular pattern. Convenience methods are provided to create new pattern instances based on existing ones but with different internationalization information or other options.

Each core Noda Time type has its own pattern type such as OffsetPattern. All these patterns implement the IPattern<T> interface, which has simple Format and Parse methods taking just the value and text respectively. The result of Parse is a ParseResult<T> which encapsulates both success and failure results.
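As a minimal sketch of the pattern API described above, the following formats and parses a LocalDate with a single reusable pattern. It assumes Noda Time 2.0 or later (where uuuu denotes the absolute year); the class and member names are purely illustrative.

```csharp
using System;
using System.Globalization;
using NodaTime;
using NodaTime.Text;

public static class PatternApiSketch
{
    // A pattern carries its pattern text and culture, so one instance
    // can be reused for every Format and Parse call.
    public static readonly LocalDatePattern IsoLike =
        LocalDatePattern.CreateWithInvariantCulture("uuuu-MM-dd");

    public static void Main()
    {
        string text = IsoLike.Format(new LocalDate(2009, 10, 11));
        Console.WriteLine(text); // 2009-10-11

        // Bad input is reported through the ParseResult rather than by throwing.
        ParseResult<LocalDate> good = IsoLike.Parse("2009-10-11");
        ParseResult<LocalDate> bad = IsoLike.Parse("not a date");
        Console.WriteLine(good.Success); // True
        Console.WriteLine(bad.Success);  // False

        if (good.Success)
        {
            LocalDate value = good.Value; // safe: the parse succeeded
            Console.WriteLine(value.Year); // 2009
        }
        // bad.Exception describes the failure; accessing bad.Value would throw it.

        // A convenience method derives a new pattern with a different culture;
        // the original pattern is left unchanged.
        LocalDatePattern french = IsoLike.WithCulture(new CultureInfo("fr-FR"));
        Console.WriteLine(french.PatternText); // uuuu-MM-dd
    }
}
```

Checking Success before reading Value keeps the failure path explicit; GetValueOrThrow() is available when an exception is the desired outcome.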

The BCL-based API

Most of the core Noda Time types (LocalDateTime, Instant etc) provide methods with the following signatures:

  • ToString(): Formats the value using the default pattern for the current thread's format provider.
  • ToString(string, IFormatProvider): Formats the value with the given pattern and format provider. The pattern text for this call is exactly the same as when creating a pattern object with the preferred API.
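The following sketch shows the BCL-style calls, including a composite format string that mixes two values. It assumes Noda Time 2.0 or later for the uuuu specifier; the Describe helper is hypothetical and exists only for illustration.

```csharp
using System;
using System.Globalization;
using NodaTime;

public static class BclFormattingSketch
{
    // Composite formatting: each placeholder passes its pattern text and the
    // supplied format provider to the value's IFormattable implementation.
    public static string Describe(LocalDate date, LocalTime time) =>
        string.Format(CultureInfo.InvariantCulture,
            "{0:uuuu-MM-dd} at {1:HH:mm}", date, time);

    public static void Main()
    {
        var date = new LocalDate(2009, 10, 11);
        var time = new LocalTime(7, 30);

        // Explicit pattern text and format provider.
        Console.WriteLine(date.ToString("uuuu-MM-dd", CultureInfo.InvariantCulture)); // 2009-10-11

        // Several values of different types in one formatting call.
        Console.WriteLine(Describe(date, time)); // 2009-10-11 at 07:30

        // Default pattern for the current thread's culture; output varies by machine.
        Console.WriteLine(date.ToString());
    }
}
```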

Pattern text

Each type has its own separate pattern text documentation. The available patterns are as consistent as possible within reason, but documenting each separately avoids confusion with some field specifiers being available for some types but not others.

Standard and custom patterns

Standard patterns are those denoted with a single character to represent a common pattern within the culture being used. For example, the standard pattern d for a LocalDate is in month/day/year format in a US culture, but day/month/year format in a UK culture. They're usually a shorthand for a possibly-culture-specific custom format, but not always. (Some standard patterns in Noda Time can't be represented directly in custom patterns.)

Custom patterns give more direct control over how a value is formatted or parsed. They may still be culture-sensitive like standard patterns, but in a lower-level way - the / format specifier within a LocalDate pattern is used to indicate the culture-sensitive date separator, which is / in a US culture but . in German culture, for example.

When a single character is specified for a pattern, it is always treated as a standard pattern. If no standard pattern for that character exists, an exception is thrown. To create a custom pattern which would normally only contain a single character, use % to effectively "escape" the character. So a LocalTime custom pattern which formats the 24-hour hour-of-day without padding would be represented as %H.
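To make the distinction concrete, here is a rough sketch contrasting a standard pattern, a culture-sensitive custom pattern, and the % escape. It assumes Noda Time 2.0 or later, that the named cultures are available on the machine, and that H is not a standard pattern character for LocalTime; the helper names are hypothetical.

```csharp
using System;
using System.Globalization;
using NodaTime;
using NodaTime.Text;

public static class StandardVersusCustomSketch
{
    // "%H" forces the single character to be read as a custom pattern.
    public static string FormatUnpaddedHour(LocalTime time) =>
        LocalTimePattern.CreateWithInvariantCulture("%H").Format(time);

    // Reports whether the pattern text is accepted for LocalTime.
    public static bool IsValidLocalTimePattern(string patternText)
    {
        try
        {
            LocalTimePattern.CreateWithInvariantCulture(patternText);
            return true;
        }
        catch (InvalidPatternException)
        {
            return false;
        }
    }

    public static void Main()
    {
        var date = new LocalDate(2009, 10, 11);

        // Standard pattern "d": the field order comes from each culture's
        // short date format, so the exact output depends on culture data.
        Console.WriteLine(LocalDatePattern.Create("d", new CultureInfo("en-US")).Format(date));
        Console.WriteLine(LocalDatePattern.Create("d", new CultureInfo("en-GB")).Format(date));

        // Custom pattern: the field order is fixed, but "/" is still the
        // culture-specific date separator.
        var custom = LocalDatePattern.CreateWithInvariantCulture("dd/MM/uuuu");
        Console.WriteLine(custom.Format(date)); // 11/10/2009
        Console.WriteLine(custom.WithCulture(new CultureInfo("de-DE")).Format(date));

        Console.WriteLine(FormatUnpaddedHour(new LocalTime(5, 0))); // 5

        // A bare "H" is looked up as a standard pattern and rejected.
        Console.WriteLine(IsValidLocalTimePattern("H"));  // False
        Console.WriteLine(IsValidLocalTimePattern("%H")); // True
    }
}
```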

Custom patterns

All custom patterns support the following characters:

Character | Meaning                                                                     | Example
%         | Escape to force a single-character custom pattern to be treated as such.    | %H => 5
'         | Open and close a text literal, which can include double quotes.             | HH'H'mm'M' => 07H30M
"         | Open and close a text literal, which can include single quotes.             | HH"'"mm => 07'30
\         | Escapes the following character.                                            | HH\'mm => 07'30

Additionally:

  • Where valid, : always refers to the culture-specific time separator (a colon in the invariant culture)
  • Where valid, / always refers to the culture-specific date separator (a forward slash in the invariant culture)
  • Where valid, < and > are used for embedding one pattern within another. For consistency, these characters must always be quoted when they are intended to be used as text literals.

Any ASCII letters (a-z, A-Z) which are intended to be used as text literals (when parsing, they must be matched exactly; when formatting they are reproduced exactly) must be quoted or escaped. Even if they do not have a specific meaning for the given pattern type, their presence within the pattern would be a potential cause for confusion and error. Additionally, by effectively reserving all ASCII letters, Noda Time has more room for future expansion without compatibility concerns. The one exception to this rule is 'T', which is explicitly allowed within date/time-based patterns (LocalDateTime etc) as a common separator between the two parts. It is not permitted (without quoting or escaping) in other patterns such as for LocalDate or LocalTime.

Any non-letter characters within a custom format which don't have a specific meaning are treated as text literals. You may wish to escape or quote such characters anyway, for the sake of consistency.
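The quoting and escaping rules above can be sketched as follows, reusing the examples from the table. This assumes Noda Time 2.0 or later for the uuuu specifier; the method names are illustrative only.

```csharp
using System;
using NodaTime;
using NodaTime.Text;

public static class LiteralQuotingSketch
{
    // Letters used as literals are wrapped in single quotes.
    public static string QuotedLetters(LocalTime time) =>
        LocalTimePattern.CreateWithInvariantCulture("HH'H'mm'M'").Format(time);

    // A backslash escapes the next character; the verbatim string keeps
    // the backslash intact in C# source.
    public static string EscapedQuote(LocalTime time) =>
        LocalTimePattern.CreateWithInvariantCulture(@"HH\'mm").Format(time);

    // 'T' is the one letter allowed unquoted, and only in date/time patterns.
    public static string DateAndTime(LocalDateTime value) =>
        LocalDateTimePattern.CreateWithInvariantCulture("uuuu-MM-ddTHH:mm").Format(value);

    public static void Main()
    {
        var time = new LocalTime(7, 30);
        Console.WriteLine(QuotedLetters(time)); // 07H30M
        Console.WriteLine(EscapedQuote(time));  // 07'30
        Console.WriteLine(DateAndTime(new LocalDateTime(2009, 10, 11, 7, 30))); // 2009-10-11T07:30
    }
}
```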

Related fields

In general, a field may only occur once in a pattern in any form. For example, a pattern of "dd MM '('MMM')' yyyy" is invalid as it specifies the month twice, even though it specifies it in different forms. This restriction may be relaxed in the future, but it would always be invalid to have a value with inconsistencies.

In some cases, fields may be related without being the same. The most obvious example here is day-of-week and the other date fields. When parsing, the day-of-week field is only used for validation: in itself, it doesn't provide enough information to specify a date. (The week-year/week-of-week-year/day-of-week scheme is not currently supported in text handling.) If the day-of-week is present but does not concur with the other values, parsing will fail.

In other cases, there can be multiple fields specifying the same information - such as "year-of-era" and "absolute-year". In these cases either field is actually enough to determine the information, but when parsing the field values are validated for consistency.
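As an illustration of day-of-week validation, the sketch below parses the same date with a matching and a non-matching day name. It assumes Noda Time 2.0 or later and the invariant culture's abbreviated day names; 11 October 2009 was a Sunday. The class and method names are hypothetical.

```csharp
using System;
using NodaTime;
using NodaTime.Text;

public static class RelatedFieldsSketch
{
    // "ddd" is the abbreviated day-of-week; it is only used to validate
    // the date given by the year, month and day fields.
    private static readonly LocalDatePattern Pattern =
        LocalDatePattern.CreateWithInvariantCulture("ddd uuuu-MM-dd");

    public static bool Parses(string text) => Pattern.Parse(text).Success;

    public static void Main()
    {
        Console.WriteLine(Parses("Sun 2009-10-11")); // True: the day name agrees with the date
        Console.WriteLine(Parses("Mon 2009-10-11")); // False: the day name contradicts the date
    }
}
```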

Template values

Many patterns allow a template value to be specified - for date/time values this is typically midnight on January 1st 2000. This value is used to provide values for fields which aren't specified elsewhere. For example, if you create a LocalDateTimePattern with a custom pattern of "dd HH:mm:ss" then that doesn't specify the year or month - those will be picked from the template value. Template values can be specified for both standard and custom patterns, although standard patterns will rarely use them.

The century in the template value is also used when a pattern specifies a two-digit year ("yy"), although such patterns are generally discouraged anyway.
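A small sketch of how a template value fills in missing fields, using the "dd HH:mm:ss" pattern mentioned above. It assumes Noda Time 2.0 or later; the helper names and the 2023 template are arbitrary choices for illustration.

```csharp
using System;
using NodaTime;
using NodaTime.Text;

public static class TemplateValueSketch
{
    private static readonly LocalDateTimePattern DayAndTime =
        LocalDateTimePattern.CreateWithInvariantCulture("dd HH:mm:ss");

    // Year and month come from the default template (midnight, 1 January 2000).
    public static LocalDateTime ParseWithDefaultTemplate(string text) =>
        DayAndTime.Parse(text).Value;

    // Year and month come from the supplied template instead.
    public static LocalDateTime ParseWithTemplate(string text, LocalDateTime template) =>
        DayAndTime.WithTemplateValue(template).Parse(text).Value;

    public static void Main()
    {
        var iso = LocalDateTimePattern.GeneralIso;

        Console.WriteLine(iso.Format(ParseWithDefaultTemplate("15 10:30:00")));
        // 2000-01-15T10:30:00

        var template = new LocalDateTime(2023, 6, 1, 0, 0);
        Console.WriteLine(iso.Format(ParseWithTemplate("15 10:30:00", template)));
        // 2023-06-15T10:30:00
    }
}
```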

Advice on choosing text patterns

Often you don't have much choice about how to parse or format text: if you're interoperating with another system which provides or expects the data in a particular format, you just have to go with their decision. However, often you do have a choice. A few points of guidance:

  • You need to decide whether this text is primarily going to be read by humans or parsed by computers. For humans, you probably want to use their culture - for computers, you should almost always use the invariant culture.
  • Custom patterns are rarely appropriate for arbitrary cultures. They are generally useful for either the invariant culture or for specific cultures that you have knowledge of. (If you're writing an app which is only used in one country, for example, you have a lot more freedom than if you'll be dealing with cultures you don't have experience of, where the standard patterns are generally a better bet.)
  • If you're logging timestamps, think very carefully before you decide to log them in any time zone other than UTC. It's the one time zone that everyone else can work with, and you never need to worry about daylight saving time.
  • When designing a custom pattern:
    • Consider sortability. A pattern such as uuuu-MM-dd is naturally sortable in the text form (assuming you never need years outside the range 0-9999), whereas neither dd-MM-uuuu nor MM-dd-uuuu is sortable.
    • Avoid two-digit years. Aside from anything else, the meaning of "2009-10-11" is a lot more obvious than "09-10-11".
    • Think about what precision you need to go down to.
    • Think about whether a fixed-width pattern would be useful or whether you want to save space by removing insignificant sub-second zeroes.
    • Try to use a pattern which is ISO-friendly where possible; it'll make it easier to interoperate with other systems in the future.
    • Quote all non-field values other than spaces.
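The sortability point can be checked with a quick experiment: format a few dates, sort the resulting strings ordinally, and compare that order with the chronological order. This is only a sketch, assuming Noda Time 2.0 or later; the helper name and sample dates are made up for illustration.

```csharp
using System;
using System.Linq;
using NodaTime;
using NodaTime.Text;

public static class SortabilitySketch
{
    // True if sorting the formatted text gives the same order as sorting the dates.
    public static bool TextOrderMatchesDateOrder(string patternText, params LocalDate[] dates)
    {
        var pattern = LocalDatePattern.CreateWithInvariantCulture(patternText);
        var byValue = dates.OrderBy(d => d).Select(d => pattern.Format(d));
        var byText = dates.Select(d => pattern.Format(d))
                          .OrderBy(s => s, StringComparer.Ordinal);
        return byValue.SequenceEqual(byText);
    }

    public static void Main()
    {
        var dates = new[]
        {
            new LocalDate(2009, 10, 11),
            new LocalDate(2010, 1, 5),
            new LocalDate(2008, 12, 31),
        };

        Console.WriteLine(TextOrderMatchesDateOrder("uuuu-MM-dd", dates)); // True
        Console.WriteLine(TextOrderMatchesDateOrder("dd-MM-uuuu", dates)); // False
    }
}
```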

There are some patterns that are unambiguous for humans, but which Noda Time won't parse in all cases. For example, a LocalTime pattern of Hmm is always unambiguous for valid values, as the text will always be three characters (for a one-digit hour) or four characters (for a two-digit hour). However, the Noda Time parsing strategy doesn't look ahead enough to handle this, so a value of "120" is parsed as an "hour" value of 12, leaving only one digit for the minutes, which is not enough. While it would be possible to support this, it would add significant complexity for the sake of formats which are quite hard for a human to read anyway.

To avoid this, either use a format with a fixed length for each field (e.g. HHmm) or a format with a separator between fields (e.g. H:mm). Noda Time will then handle all values appropriately.
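The behaviour described above can be observed with a short sketch that tries each pattern in turn. The CanParse helper is hypothetical; only the pattern texts and the "120" input come from the discussion above.

```csharp
using System;
using NodaTime;
using NodaTime.Text;

public static class LookaheadSketch
{
    // Reports whether the text parses as a LocalTime with the given pattern text.
    public static bool CanParse(string patternText, string text) =>
        LocalTimePattern.CreateWithInvariantCulture(patternText).Parse(text).Success;

    public static void Main()
    {
        // "12" is consumed as the hour, leaving a single digit for the minutes.
        Console.WriteLine(CanParse("Hmm", "120"));   // False

        // Fixed-length fields, or a separator between fields, avoid the problem.
        Console.WriteLine(CanParse("HHmm", "0120")); // True
        Console.WriteLine(CanParse("H:mm", "1:20")); // True
    }
}
```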