In this post I’ll explain how I extended my runtime date-formatting functionality into the date-parsing realm. The result is a date-parsing library that literally creates itself at runtime.
I have a demo of the date-parsing library online for your enjoyment.
How it works
The technique is similar to my date-formatting library:
- accept some input (a date string)
- accept a format specifier
- use the format specifier to create a function capable of interpreting date strings in the given format
This allows parsing dates very efficiently and flexibly. In fact, the function that gets built will parse dates with as much detail as possible, down to the second, defaulting to a less precise date when there’s less information.
The date-parsing code is a bit more complex than the formatting code. The parsing code has to build a regular expression which will successfully match a well-formed input, as specified by the format string. It inserts groups into the regular expression wherever it sees some data it can use to deduce the value of the date, and keeps track of the groups so it can use the captured values as parameters to the Date constructor. For example, if it sees the character Y in the format string, it knows that value can be captured in the regular expression and used as the year parameter to the Date constructor. It matches, but doesn’t use, other data to ensure it is validly formatted. For example, the day of the week isn’t helpful when parsing a date. The demo will make this clear.
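Here is a minimal sketch of that idea, not the library's actual code. The names createParser and FORMAT_CODES, the particular set of format characters, and the fallback values for missing fields are all assumptions made for illustration. It also builds the parser as a closure rather than generating function source at runtime, which is a simpler stand-in for the technique described above.

```javascript
// Hypothetical format codes for this sketch. A code with a field captures a
// value for the Date constructor; a code without one is matched but ignored.
const FORMAT_CODES = {
  Y: { pattern: "(\\d{4})", field: "year" },
  m: { pattern: "(\\d{2})", field: "month" },
  d: { pattern: "(\\d{2})", field: "day" },
  H: { pattern: "(\\d{2})", field: "hour" },
  i: { pattern: "(\\d{2})", field: "minute" },
  s: { pattern: "(\\d{2})", field: "second" },
  D: { pattern: "(?:Sun|Mon|Tue|Wed|Thu|Fri|Sat)", field: null },
};

function createParser(format) {
  let source = "^";
  const fields = []; // fields[i] names the value captured by group i + 1
  for (const ch of format) {
    const code = FORMAT_CODES[ch];
    if (code) {
      source += code.pattern;
      if (code.field) fields.push(code.field);
    } else {
      // Any other character is a literal; escape regex metacharacters.
      source += ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    }
  }
  const regex = new RegExp(source + "$");

  return function parse(input) {
    const match = regex.exec(input);
    if (!match) return null; // input is not in the expected format
    // Assumed fallbacks, used when the format carries less information.
    const parts = { year: 1970, month: 1, day: 1, hour: 0, minute: 0, second: 0 };
    fields.forEach((field, i) => {
      parts[field] = Number(match[i + 1]);
    });
    // The Date constructor counts months from zero.
    return new Date(parts.year, parts.month - 1, parts.day,
                    parts.hour, parts.minute, parts.second);
  };
}

const parseDay = createParser("D Y-m-d");
console.log(parseDay("Tue 2006-03-14")); // a Date for 14 March 2006, local time
console.log(parseDay("2006-03-14"));     // null: the format requires a day name
```

Note that the day name is only checked for shape here; the sketch does not verify that it agrees with the rest of the date.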
In many cases, depending on the format string, it should be possible to use the date-parsing code together with the date-formatting code for round-trip processing. Take a date, format it with some format string, then read it back in with the same format string, and you should get the same date. Of course, you need to preserve whatever level of detail you want to get back—you won’t get everything back if you throw it away during the formatting step. You’ll see that in the demo too.
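To make the round trip concrete, here is a small self-contained sketch. The functions formatDate and parseDate and the CODES table are hypothetical stand-ins, not the API of either library, and the 1970-01-01 00:00:00 fallback for missing fields is an assumption.

```javascript
const pad = (n, width) => String(n).padStart(width, "0");

// Hypothetical codes: how to write each field, how to match it, where it goes.
const CODES = {
  Y: { out: (d) => pad(d.getFullYear(), 4), re: "(\\d{4})", key: "year" },
  m: { out: (d) => pad(d.getMonth() + 1, 2), re: "(\\d{2})", key: "month" },
  d: { out: (d) => pad(d.getDate(), 2), re: "(\\d{2})", key: "day" },
  H: { out: (d) => pad(d.getHours(), 2), re: "(\\d{2})", key: "hour" },
  i: { out: (d) => pad(d.getMinutes(), 2), re: "(\\d{2})", key: "minute" },
  s: { out: (d) => pad(d.getSeconds(), 2), re: "(\\d{2})", key: "second" },
};

function formatDate(date, format) {
  return [...format].map((ch) => (CODES[ch] ? CODES[ch].out(date) : ch)).join("");
}

function parseDate(input, format) {
  const keys = []; // keys[i] names the field captured by group i + 1
  const source = [...format].map((ch) => {
    if (!CODES[ch]) return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    keys.push(CODES[ch].key);
    return CODES[ch].re;
  }).join("");
  const match = new RegExp("^" + source + "$").exec(input);
  if (!match) return null;
  // Assumed fallbacks for fields the format does not carry.
  const p = { year: 1970, month: 1, day: 1, hour: 0, minute: 0, second: 0 };
  keys.forEach((k, i) => { p[k] = Number(match[i + 1]); });
  return new Date(p.year, p.month - 1, p.day, p.hour, p.minute, p.second);
}

const original = new Date(2006, 2, 14, 15, 9, 26);

// Same format in both directions, with full detail: the date survives.
const full = parseDate(formatDate(original, "Y-m-d H:i:s"), "Y-m-d H:i:s");
console.log(full.getTime() === original.getTime()); // true

// Formatting with less detail throws the time of day away for good.
const dateOnly = parseDate(formatDate(original, "Y-m-d"), "Y-m-d");
console.log(dateOnly.getTime() === original.getTime()); // false
```

The second case shows the caveat above: the day comes back intact, but the hours, minutes, and seconds fall back to the defaults.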
What it’s not
This is not a natural-language date parser (PHP’s strtotime can understand input like “two weeks ago next Sunday”). It’s also not internationalized. It only works for my little slice of the universe: the English language, though international date-formatting standards (ISO 8601, highly recommended) make that a moot point anyway.