I'm looking to define the Rebol date format in EBNF notation. As far as possible, I'd like the grammar to accept only valid dates, or at least those that are valid in Rebol at the moment:
Date ::= DateDate ('/' Time DateZone?)?
DateDate ::=
DateDay31 ('-' DateMonth31 '-' | '/' DateMonth31 '/') DateYear
| DateDay30 ('-' DateMonth30 '-' | '/' DateMonth30 '/') DateYear
| DateDay28 ('-' DateMonthFebruary '-' | '/' DateMonthFebruary '/') DateYear
| "29" ('-' DateMonthFebruary '-' | '/' DateMonthFebruary '/') DateYearLeap
| DateYear ('-' DateMonth31 '-' | '/' DateMonth31 '/') DateDay31
| DateYear ('-' DateMonth30 '-' | '/' DateMonth30 '/') DateDay30
| DateYearLeap ('-' DateMonthFebruary '-' | '/' DateMonthFebruary '/') DateDay29
| DateYear ('-' DateMonthFebruary '-' | '/' DateMonthFebruary '/') DateDay28
/*
Currently years cannot be negative and have a maximum value of 16383
So the following two values are shortcuts.
*/
DateYear ::= Digit (Digit (Digit (Digit Digit?)?)?)?
DateYearLeap ::=
'1' Digit Digit DateYearLeapEnd
| Digit Digit DateYearLeapEnd
| Digit DateYearLeapEnd
| DateYearLeapEnd
| [048]
DateYearLeapEnd ::= [02468] [048] | [13579] [26]
DateMonth ::= DateMonth31 | DateMonth30 | DateMonthFebruary
DateDay31 ::= '3' [01] | DateDay29
DateDay30 ::= '30' | DateDay29
DateDay29 ::= [12] Digit | '0'? [1-9]
DateDay28 ::= '2' [0-8] | '1' Digit | '0'? [1-9]
DateMonth31 ::=
DateMonthJanuary |
DateMonthMarch |
DateMonthMay |
DateMonthJuly |
DateMonthAugust |
DateMonthOctober |
DateMonthDecember
DateMonth30 ::=
DateMonthApril |
DateMonthJune |
DateMonthSeptember |
DateMonthNovember
/* Currently only English month names are valid */
DateMonthJanuary ::= 'Jan' ('u' ('a' ('r' 'y'?)?)?)? | '0'? '1'
DateMonthFebruary ::= 'Feb' ('r' ('u' ('a' ('r' 'y'?)?)?)?)? | '0'? '2'
DateMonthMarch ::= 'Mar' ('c' 'h'?)? | '0'? '3'
DateMonthApril ::= 'Apr' ('i' 'l'?)? | '0'? '4'
DateMonthMay ::= 'May' | '0'? '5'
DateMonthJune ::= 'Jun' 'e'? | '0'? '6'
DateMonthJuly ::= 'Jul' 'y'? | '0'? '7'
DateMonthAugust ::= 'Aug' ('u' ('s' 't'?)?)? | '0'? '8'
DateMonthSeptember ::= 'Sep' ('t' ('e' ('m' ('b' ('e' 'r'?)?)?)?)?)? | '0'? '9'
DateMonthOctober ::= 'Oct' ('o' ('b' ('e' 'r'?)?)?)? | '10'
DateMonthNovember ::= 'Nov' ('e' ('m' ('b' ('e' 'r'?)?)?)?)? | '11'
DateMonthDecember ::= 'Dec' ('e' ('m' ('b' ('e' 'r'?)?)?)?)? | '12'
/* Zone Hours are currently -15 - 15, the following is a shortcut: */
DateZone ::= Sign Digit Digit? ':' ([03] '0' | [14] '5')
Time ::= TimeHour ':' TimeMinute (':' TimeSecond)?
/* Need to constrain to valid hours */
TimeHour ::= Sign Digit* | Sign? Digit+
/* Need to constrain to 0-59 */
TimeMinute ::= Sign? Digit Digit?
/* Need to constrain to 0-59.999999 */
TimeSecond ::= Sign? ((Digit Digit?)? '.' Digit+ | Digit Digit? '.'?)
Digit ::= [0-9]
Sign ::= [+-]
Notes:
Partial matches are bad: this rule should match a whole string, and a partial match is a failure. For example, the rule
'2' | '22'
would only partially match
22
and would thus fail.
I've taken shortcuts on some values (see the source comments); I expect to flesh these out.
Although Rebol will interpret
12-010-2014
as a date (12-Oct-2014), I don't see any reason to support this.
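The partial-match note above can be illustrated with a small sketch. It assumes first-match-wins alternation (the first alternative that matches is taken and the others are never retried), which is how the note describes the behaviour. The helper names `first_match` and `matches_whole` are hypothetical and are not part of ebnf.r or Rebol.

```python
def first_match(alternatives, text):
    """Return the length consumed by the first alternative that matches
    at the start of text, or None if no alternative matches."""
    for literal in alternatives:
        if text.startswith(literal):
            return len(literal)
    return None


def matches_whole(alternatives, text):
    """True only when the chosen alternative consumes the entire input."""
    return first_match(alternatives, text) == len(text)


# '2' | '22' commits to '2' and leaves one character unread.
print(matches_whole(["2", "22"], "22"))  # False
# Putting the longer literal first lets the whole input be consumed.
print(matches_whole(["22", "2"], "22"))  # True
```

This is why the day and month rules in the grammar list longer alternatives before shorter ones.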
To test this code in Rebol, save as a file and load as a parse rule to use against samples:
Rebol []

; Load the EBNF converter
do http://reb4.me/x/ebnf.r

; Load the grammar file as parse rules and pick out the top-level 'date rule
date-grammar: get in context load-ebnf %date.ebnf 'date

; For each sample, print "*" if the grammar matches the whole string,
; then the sample itself, then whether Rebol loads it as a date
foreach test [
    "28-Feb-2016"
    "29-Feb-2000"
    "29-Feb-2011"
    "29-Feb-2010"
    "29-Feb-2016"
    "1-April-2015/12:00"
    "1-4-2015/12:00+5:00"
    "01-Apr-2015"
    "31-Apr-2015"
    "29-Feb-1900"
    "00-Feb-20"
    "15-Apr-16"
    "2015-04-01/12:15:10"
    "2015-04-01/12:15:10."
    "2015-04-01/12:15:."
    "2015-04-01/12:15:10.1234"
    "2015-04-01/12:15:10.1234-05:00"
    "2015-04-01/12:15:10.1234-0:00"
][
    print [either parse test date-grammar ["*"][" "] mold test date? try [load test]]
]
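As an independent cross-check of which day-month-year samples are real calendar dates, here is a minimal Python sketch. It covers only the four-digit-year samples without a time part, and it assumes the proleptic Gregorian calendar used by Python's `datetime.date`; whether Rebol agrees in every case (for example on 1900) is an assumption to verify against the Rebol output. `MONTHS` and `is_calendar_date` are hypothetical names.

```python
from datetime import date

# Only the month abbreviations used by the samples below.
MONTHS = {"Feb": 2, "Apr": 4}


def is_calendar_date(text):
    """True if a day-month-year string names a real Gregorian date."""
    day, month, year = text.split("-")
    try:
        date(int(year), MONTHS[month], int(day))
    except ValueError:
        # Raised by date() when the day does not exist in that month.
        return False
    return True


SAMPLES = [
    "28-Feb-2016", "29-Feb-2000", "29-Feb-2011", "29-Feb-2010",
    "29-Feb-2016", "01-Apr-2015", "31-Apr-2015", "29-Feb-1900",
]

for sample in SAMPLES:
    print("*" if is_calendar_date(sample) else " ", sample)
```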
1 Answer
Is Janua
really a valid month? If so, then you're fine, but I have a feeling that this would be more appropriate.
DateMonthJanuary ::= 'January' | 'Jan' | '0'? '1'
Which brings me to my next problem. It's case sensitive, right? So, 22-JAN-2015
wouldn't match. I believe it should, but I'm not familiar with Rebol. Please correct me if I'm wrong.
I've seen this handled by defining case insensitive tokens for each letter.
A ::= 'a' | 'A'
And then define your rules from those tokens like this.
DateMonthJanuary ::= J A N U A R Y | J A N | '0'? '1'
My syntax may be a little off, as I'm accustomed to ANTLR's flavor of EBNF, but that should illustrate the idea.
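Writing these rules out by hand for twelve months is tedious, so they could be generated. The sketch below emits the per-letter tokens and a month rule in the style shown above, keeping the question's nested-optional form for anything after the first three letters. The function names are hypothetical, and whether a given EBNF tool accepts single-letter rule names such as `J` is an assumption.

```python
import string


def letter_tokens():
    """One case-insensitive token rule per letter, e.g. A ::= 'a' | 'A'."""
    return [f"{c.upper()} ::= '{c}' | '{c.upper()}'" for c in string.ascii_lowercase]


def month_rule(name, number, min_len=3):
    """Build a month rule from letter tokens; letters after the first
    min_len are optional, nested so that any prefix is accepted."""
    letters = [c.upper() for c in name]
    optional_tail = ""
    # Work backwards so the last letter ends up innermost.
    for letter in reversed(letters[min_len:]):
        optional_tail = f"({letter} {optional_tail})?" if optional_tail else f"{letter}?"
    parts = letters[:min_len] + ([optional_tail] if optional_tail else [])
    # Months 1-9 allow an optional leading zero, as in the question.
    numeric = f"'0'? '{number}'" if number < 10 else f"'{number}'"
    return f"DateMonth{name} ::= {' '.join(parts)} | {numeric}"


print(letter_tokens()[0])
print(month_rule("January", 1))
print(month_rule("May", 5))
print(month_rule("October", 10))
```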
DateYearLeap ::=
I'm really not sure that I would handle that in your grammar. The logic is convoluted (not yours, just leap years in general), and I'm pretty sure this will match a few years that aren't leap years. It would be much better to validate this in your parser, if that's possible.
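To make the concern concrete, here is a rough Python check. It transcribes the question's DateYearLeap and DateYearLeapEnd rules into a regular expression and compares them with the Gregorian leap-year rule (divisible by 4, except century years not divisible by 400). The transcription is my own reading of the grammar, and it assumes Rebol follows the Gregorian rule; the names are hypothetical.

```python
import re

# DateYearLeapEnd ::= [02468] [048] | [13579] [26]
LEAP_END = r"(?:[02468][048]|[13579][26])"
# DateYearLeap: an optional prefix ('1' Digit Digit, Digit Digit, or Digit)
# followed by DateYearLeapEnd, or a single digit from [048].
GRAMMAR_LEAP = re.compile(rf"(?:(?:1[0-9][0-9]|[0-9][0-9]|[0-9])?{LEAP_END}|[048])")


def grammar_says_leap(year_text):
    """True if the whole string matches the transcribed DateYearLeap rule."""
    return GRAMMAR_LEAP.fullmatch(year_text) is not None


def is_leap_year(year):
    """Gregorian rule, usable as a post-lex validation step."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Years up to the stated maximum that the grammar accepts but the
# Gregorian rule rejects.
over_matched = [
    year for year in range(0, 16384)
    if grammar_says_leap(str(year)) and not is_leap_year(year)
]

print(grammar_says_leap("1900"), is_leap_year(1900))
print(over_matched[:5])
```

Under these assumptions the mismatches are the century years that are not divisible by 400, such as 1900.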
/* Need to constrain to valid hours */
TimeHour ::= Sign Digit* | Sign? Digit+
/* Need to constrain to 0-59 */
TimeMinute ::= Sign? Digit Digit?
/* Need to constrain to 0-59.999999 */
TimeSecond ::= Sign? ((Digit Digit?)? '.' Digit+ | Digit Digit? '.'?)
I think you should be able to define tokens for these. It may take a bit of doing to get the precedence right, but start with something like this.
TimeMinute ::= Sign? [0-5]? Digit
/* Optional sign, optional tens digit 0-5, units digit 0-9 */
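A candidate constraint can be tried out against sample values before it goes into the grammar. The sketch below expresses a 0-59 minute rule and a 0-59.999... second rule as Python regular expressions. The optional sign and the trailing-dot forms are carried over from the question's rules, and treating them as valid is an assumption; hours are left out because the valid range is not stated. `is_minute` and `is_second` are hypothetical names.

```python
import re

# Optional sign, optional tens digit 0-5, units digit 0-9.
MINUTE = re.compile(r"[+-]?[0-5]?[0-9]")
# Either an optional 0-59 whole part with a fraction, or a 0-59 whole
# part with an optional trailing dot (mirrors the question's TimeSecond).
SECOND = re.compile(r"[+-]?(?:(?:[0-5]?[0-9])?\.[0-9]+|[0-5]?[0-9]\.?)")


def is_minute(text):
    """True if the whole string is a minute value in 0-59."""
    return MINUTE.fullmatch(text) is not None


def is_second(text):
    """True if the whole string is a second value below 60."""
    return SECOND.fullmatch(text) is not None


for sample in ["7", "07", "59", "60"]:
    print(sample, is_minute(sample))
for sample in ["10", "10.", ".5", "59.999999", ".", "60"]:
    print(sample, is_second(sample))
```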
Yep, Rebol accepts 'Janua' as a month! One goal in laying out the format is to identify potential tweaks, efficiencies and rationalizations. Being able to propose 'Jan' 'uary'? doesn't cost the language anything while it tightens up the [currently non-existent] spec. A biggie is suggesting the subtle DateDate ([/T] Time DateZone?)? change to the main date rule and adding | 'Z' to the zone rule, allowing Rebol to load dates conforming to RFC 3339. – rgchris, Apr 24, 2015 at 15:22
Really?! Oh wow! I never would have thought that was right. I stand corrected. – RubberDuck, Apr 24, 2015 at 15:24
Rebol is indeed case-insensitive when it comes to months, so I fear you're right about having to spell out the names in both cases: J A N (U A R Y)? it'll have to be. – rgchris, Apr 24, 2015 at 15:25
On leap years: I would still like to see how much it is possible to express lexically, even if it does get messier... – rgchris, Apr 24, 2015 at 15:30
I think you can get close on the leap years, but ultimately I think you'll need to do some extra validation post-lex. – RubberDuck, Apr 24, 2015 at 15:31
ebnf.r makes it real code, not pseudocode.