4
\$\begingroup\$

I'm looking to define the Rebol date format in EBNF notation. I'd like as best as possible to only define valid dates—at least those that are valid in Rebol at the moment:

Date ::= DateDate ('/' Time DateZone?)?
DateDate ::=
 DateDay31 ('-' DateMonth31 '-' | '/' DateMonth31 '/') DateYear
 | DateDay30 ('-' DateMonth30 '-' | '/' DateMonth30 '/') DateYear
 | DateDay28 ('-' DateMonthFebruary '-' | '/' DateMonthFebruary '/') DateYear
 | '29' ('-' DateMonthFebruary '-' | '/' DateMonthFebruary '/') DateYearLeap
 | DateYear ('-' DateMonth31 '-' | '/' DateMonth31 '/') DateDay31
 | DateYear ('-' DateMonth30 '-' | '/' DateMonth30 '/') DateDay30
 | DateYearLeap ('-' DateMonthFebruary '-' | '/' DateMonthFebruary '/') DateDay29
 | DateYear ('-' DateMonthFebruary '-' | '/' DateMonthFebruary '/') DateDay28
/* 
 Currently years cannot be negative and have a maximum value of 16383
 So the following two values are shortcuts.
*/
DateYear ::= Digit (Digit (Digit (Digit Digit?)?)?)?
DateYearLeap ::= 
 '1' Digit Digit DateYearLeapEnd
 | Digit Digit DateYearLeapEnd
 | Digit DateYearLeapEnd
 | DateYearLeapEnd
 | [048]
DateYearLeapEnd ::= [02468] [048] | [13579] [26]
DateMonth ::= DateMonth31 | DateMonth30 | DateMonthFebruary
DateDay31 ::= '3' [01] | DateDay29
DateDay30 ::= '30' | DateDay29
DateDay29 ::= [12] Digit | '0'? [1-9]
DateDay28 ::= '2' [0-8] | '1' Digit | '0'? [1-9]
DateMonth31 ::=
 DateMonthJanuary |
 DateMonthMarch |
 DateMonthMay |
 DateMonthJuly |
 DateMonthAugust |
 DateMonthOctober |
 DateMonthDecember
DateMonth30 ::=
 DateMonthApril |
 DateMonthJune |
 DateMonthSeptember |
 DateMonthNovember
/* Currently only English month names are valid */
DateMonthJanuary ::= 'Jan' ('u' ('a' ('r' 'y'?)?)?)? | '0'? '1'
DateMonthFebruary ::= 'Feb' ('r' ('u' ('a' ('r' 'y'?)?)?)?)? | '0'? '2'
DateMonthMarch ::= 'Mar' ('c' 'h'?)? | '0'? '3'
DateMonthApril ::= 'Apr' ('i' 'l'?)? | '0'? '4'
DateMonthMay ::= 'May' | '0'? '5'
DateMonthJune ::= 'Jun' 'e'? | '0'? '6' 
DateMonthJuly ::= 'Jul' 'y'? | '0'? '7'
DateMonthAugust ::= 'Aug' ('u' ('s' 't'?)?)? | '0'? '8'
DateMonthSeptember ::= 'Sep' ('t' ('e' ('m' ('b' ('e' 'r'?)?)?)?)?)? | '0'? '9'
DateMonthOctober ::= 'Oct' ('o' ('b' ('e' 'r'?)?)?)? | '10'
DateMonthNovember ::= 'Nov' ('e' ('m' ('b' ('e' 'r'?)?)?)?)? | '11'
DateMonthDecember ::= 'Dec' ('e' ('m' ('b' ('e' 'r'?)?)?)?)? | '12'
/* Zone Hours are currently -15 - 15, the following is a shortcut: */
DateZone ::= Sign Digit Digit? ':' ([03] '0' | [14] '5')
Time ::= TimeHour ':' TimeMinute (':' TimeSecond)?
/* Need to constrain to valid hours */
TimeHour ::= Sign Digit* | Sign? Digit+
/* Need to constrain to 0-59 */
TimeMinute ::= Sign? Digit Digit?
/* Need to constrain to 0-59.999999 */
TimeSecond ::= Sign? ((Digit Digit?)? '.' Digit+ | Digit Digit? '.'?)
Digit ::= [0-9]
Sign ::= [+-]

Notes:

  • Partial matches count as failures: this rule must match the whole string. For example, the following would match only the first character of 22 and would thus fail.
    '2' | '22'

  • I've taken shortcuts on some values (see source comments); I expect to flesh these out.

  • Although Rebol will interpret 12-010-2014 as a date (12-Oct-2014), I don't see any reason to support this.
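
The first note can be illustrated with a minimal Python sketch. It assumes ordered-choice semantics, where an alternation commits to the first alternative that matches and is not retried if the rest of the input is left unconsumed; whether every EBNF interpreter behaves this way is an assumption. The helper names (`literal`, `choice`, `parses`) are hypothetical and exist only for this illustration.

```python
def literal(text):
    """Return a matcher that consumes `text` at `pos`, or yields None."""
    def match(source, pos):
        return pos + len(text) if source.startswith(text, pos) else None
    return match


def choice(*rules):
    """Ordered choice: commit to the first alternative that succeeds."""
    def match(source, pos):
        for rule in rules:
            end = rule(source, pos)
            if end is not None:
                return end  # later alternatives are never tried
        return None
    return match


def parses(rule, source):
    """A parse succeeds only if the rule consumes the whole string."""
    return rule(source, 0) == len(source)


short_first = choice(literal("2"), literal("22"))  # '2' | '22'
long_first = choice(literal("22"), literal("2"))   # '22' | '2'

print(parses(short_first, "22"))  # False: '2' matches, one character is left over
print(parses(long_first, "22"))   # True
print(parses(long_first, "2"))    # True
```

Under that assumption, longer alternatives need to come before their own prefixes, which is why the day rules above list the two-digit forms first.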

To test this code in Rebol, save as a file and load as a parse rule to use against samples:

Rebol []
do http://reb4.me/x/ebnf.r
date-grammar: get in context load-ebnf %date.ebnf 'date
foreach test [
 "28-Feb-2016"
 "29-Feb-2000"
 "29-Feb-2011"
 "29-Feb-2010"
 "29-Feb-2016"
 "1-April-2015/12:00"
 "1-4-2015/12:00+5:00"
 "01-Apr-2015"
 "31-Apr-2015"
 "29-Feb-1900"
 "00-Feb-20"
 "15-Apr-16"
 "2015-04-01/12:15:10"
 "2015-04-01/12:15:10."
 "2015-04-01/12:15:."
 "2015-04-01/12:15:10.1234"
 "2015-04-01/12:15:10.1234-05:00"
 "2015-04-01/12:15:10.1234-0:00"
][
 print [either parse test date-grammar ["*"][" "] mold test date? try [load test]]
]
RubberDuck
asked Apr 22, 2015 at 23:11
\$\endgroup\$
5
  • \$\begingroup\$ "ebnf" is not a real programming language. \$\endgroup\$ Commented Apr 22, 2015 at 23:22
  • 2
    \$\begingroup\$ @200_success don't some tools generate parsers/lexers based on EBNF grammars? \$\endgroup\$ Commented Apr 23, 2015 at 9:27
  • 1
    \$\begingroup\$ @RubberDuck A specific language, such as yacc, would be on-topic. Otherwise, it's pseudocode. \$\endgroup\$ Commented Apr 23, 2015 at 9:35
  • \$\begingroup\$ Would be much obliged if it would be possible to release this question and reinstate the ebnf tag. As demonstrated, this is working code and I'd prefer it reviewed not specifically as Rebol code (as per related question) as it is also targeted toward EBNF interpreters not based in the Rebol language. \$\endgroup\$ Commented Apr 23, 2015 at 17:48
  • 1
    \$\begingroup\$ Reopened, as your ebnf.r makes it real code, not pseudocode. \$\endgroup\$ Commented Apr 23, 2015 at 21:23

1 Answer

4
\$\begingroup\$

Is Janua really a valid month? If so, then you're fine, but I have a feeling that this would be more appropriate.

DateMonthJanuary ::= 'January' | 'Jan' | '0'? '1'

Which brings me to my next problem. It's case sensitive, right? So 22-JAN-2015 wouldn't match. I believe it should, but I'm not familiar with Rebol. Please correct me if I'm wrong.

I've seen this handled by defining case insensitive tokens for each letter.

A ::= 'a' | 'A'

And then define your rules from those tokens like this.

DateMonthJanuary ::= J A N U A R Y | J A N | '0'? '1'

My syntax may be a little off, I'm accustomed to ANTLR's flavor of EBNF, but that should illustrate the idea.
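
Writing every month out letter by letter is tedious, so the rules could be generated. The following Python sketch is only an illustration: the function names are hypothetical, it keeps the question's nested-optional style so that any prefix of three or more letters is accepted, and it assumes the target EBNF tool allows single-letter rule names such as `J` separated by spaces.

```python
def letter_rule(letter):
    """Emit a case-insensitive token rule for one letter."""
    return f"{letter.upper()} ::= '{letter.lower()}' | '{letter.upper()}'"


def optional_tail(letters):
    """Nest the trailing letters so that every prefix of them is accepted."""
    expr = ""
    for letter in reversed(letters):
        expr = f"({letter} {expr})?" if expr else f"{letter}?"
    return expr


def month_rule(name, number, stem=3):
    """Emit a month rule: a mandatory stem, an optional tail, or the number."""
    letters = name.upper()
    parts = list(letters[:stem])
    tail = optional_tail(letters[stem:])
    if tail:
        parts.append(tail)
    # Single-digit months may carry a leading zero; 10-12 may not.
    numeric = f"'0'? '{number}'" if number < 10 else f"'{number}'"
    return f"DateMonth{name} ::= {' '.join(parts)} | {numeric}"


print(letter_rule("a"))
print(month_rule("January", 1))
print(month_rule("May", 5))
print(month_rule("October", 10))
```

For January this prints `DateMonthJanuary ::= J A N (U (A (R Y?)?)?)? | '0'? '1'`, which mirrors the original rule with letter tokens in place of literals.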

DateYearLeap ::= 

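I'm really not sure that I would handle that in your grammar. The logic is convoluted (not yours, just leap years in general), and I'm pretty sure this will match a few years that aren't leap years: century years such as 1900 end in 00, which DateYearLeapEnd accepts, yet only the century years divisible by 400 are leap years. Much better for this one to be validated by your parser if that's possible.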
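
One way to check this is to transcribe DateYearLeap into a regular expression and compare it against a calendar library. The Python sketch below does that for four-digit years only, to sidestep the question of how shorter years are interpreted. The regex is my own transcription of the rule as posted, so treat it as an assumption rather than as what ebnf.r generates.

```python
import calendar
import re

# DateYearLeapEnd ::= [02468] [048] | [13579] [26]
LEAP_END = r"(?:[02468][048]|[13579][26])"

# DateYearLeap, alternative by alternative, in the order posted.
DATE_YEAR_LEAP = re.compile(
    rf"(?:1\d\d{LEAP_END}|\d\d{LEAP_END}|\d{LEAP_END}|{LEAP_END}|[048])"
)


def grammar_says_leap(year_text):
    """True if the whole string matches the transcribed DateYearLeap rule."""
    return DATE_YEAR_LEAP.fullmatch(year_text) is not None


years = range(1000, 10000)
# Years the rule accepts although they are not leap years.
false_positives = [
    y for y in years if grammar_says_leap(str(y)) and not calendar.isleap(y)
]
# Leap years the rule rejects.
false_negatives = [
    y for y in years if calendar.isleap(y) and not grammar_says_leap(str(y))
]

print(false_positives[:5])  # [1000, 1100, 1300, 1400, 1500]
print(false_negatives)      # []
```

In this range every false positive is a century year not divisible by 400, so the rule is close; it would need a separate branch for years ending in 00 to be exact.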

/* Need to constrain to valid hours */
TimeHour ::= Sign Digit* | Sign? Digit+
/* Need to constrain to 0-59 */
TimeMinute ::= Sign? Digit Digit?
/* Need to constrain to 0-59.999999 */
TimeSecond ::= Sign? ((Digit Digit?)? '.' Digit+ | Digit Digit? '.'?)

I think you should be able to define tokens for these. It may take a bit of doing to get the precedence right, but start with something like this.

TimeMinute ::= Sign? [0-5]? Digit
/* Optional sign, optional tens digit 0-5, then a units digit 0-9 */
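
A candidate constraint is easy to check exhaustively before it goes into the grammar. The Python sketch below assumes the tens digit stops at 5 (so that 60-69 are rejected) and that the tens digit stays optional, as in the original rules. The `SECOND` pattern is my own extension of the same idea to TimeSecond, it does not cap the number of fractional digits, and both patterns keep the optional sign from the question.

```python
import re

# TimeMinute ::= Sign? [0-5]? Digit
MINUTE = re.compile(r"[+-]?[0-5]?[0-9]")

# TimeSecond ::= Sign? (([0-5]? Digit)? '.' Digit+ | [0-5]? Digit '.'?)
SECOND = re.compile(r"[+-]?(?:(?:[0-5]?[0-9])?\.[0-9]+|[0-5]?[0-9]\.?)")


def accepts(pattern, text):
    """True only if the pattern matches the whole string."""
    return pattern.fullmatch(text) is not None


# Every one- and two-digit spelling of 0-99 is accepted iff it is below 60.
for n in range(100):
    for text in {str(n), f"{n:02d}"}:
        assert accepts(MINUTE, text) == (n < 60), text

for sample in ["10", "10.", ".", ".5", "10.1234", "60", "60.5"]:
    print(sample, accepts(SECOND, sample))
```

The lone `.` is rejected and `10.` is accepted, which is the same behaviour the unconstrained TimeSecond rule has for those two inputs.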
answered Apr 23, 2015 at 23:11
\$\endgroup\$
6
  • \$\begingroup\$ Yep, Rebol accepts 'Janua' as a month! One goal in laying out the format is to identify potential tweaks, efficiencies and rationalizations. Proposing 'Jan' 'uary'? costs the language nothing while tightening up the [currently non-existent] spec. A bigger one is suggesting the subtle change to DateDate ([/T] Time DateZone?)? in the main date rule and adding | 'Z' to the zone rule, which would allow Rebol to load dates conforming to RFC 3339. \$\endgroup\$ Commented Apr 24, 2015 at 15:22
  • \$\begingroup\$ Really?! Oh wow! I never would have thought that was right. I stand corrected. \$\endgroup\$ Commented Apr 24, 2015 at 15:24
  • \$\begingroup\$ Rebol is indeed case-insensitive when it comes to months, so I fear you're right in having to spell out the names with both cases: J A N (U A R Y)? it'll have to be. \$\endgroup\$ Commented Apr 24, 2015 at 15:25
  • \$\begingroup\$ On leap years—I still would like to see how much is possible to express lexically, even if it does get even messier... \$\endgroup\$ Commented Apr 24, 2015 at 15:30
  • \$\begingroup\$ I think you can get close on the leap years, but ultimately, I think you'll need to do some extra validation post-lex. \$\endgroup\$ Commented Apr 24, 2015 at 15:31
