YAML scanner has problems with unusual mapping keys
Trans reported:
Found an issue with the YAML syntax highlighting in CodeRay. Try it on the following example.
When it hits mri <1.9 it starts reverse highlighting almost every other line.
    - package: system_timer
      engine : mri <1.9
The problem is that the scanner doesn't allow whitespace before the colon. That's an easy fix; however, the set of characters allowed in keys is also limited and does not conform to the spec.
See http://yaml.org/spec/1.2/spec.html#id2788859 for further reading.
A patch for the 1.0 trunk:
    Index: /Users/murphy/ruby/coderay/lib/coderay/scanners/yaml.rb
    ===================================================================
    --- /Users/murphy/ruby/coderay/lib/coderay/scanners/yaml.rb (revision 580)
    +++ /Users/murphy/ruby/coderay/lib/coderay/scanners/yaml.rb (working copy)
    @@ -75,20 +75,18 @@
             when match = scan(/[,{}\[\]]/)
               encoder.text_token match, :operator
               next
    -        when state == :initial && match = scan(/[\w.() ]*\S(?=: |:$)/)
    +        when state == :initial && match = scan(/[\w.() ]*\S(?= *:(?: |$))/)
               encoder.text_token match, :key
               key_indent = column(pos - match.size - 1)
    -          # encoder.text_token key_indent.inspect, :debug
               state = :colon
               next
    -        when match = scan(/(?:"[^"\n]*"|'[^'\n]*')(?=: |:$)/)
    +        when match = scan(/(?:"[^"\n]*"|'[^'\n]*')(?= *:(?: |$))/)
               encoder.begin_group :key
               encoder.text_token match[0,1], :delimiter
               encoder.text_token match[1..-2], :content
               encoder.text_token match[-1,1], :delimiter
               encoder.end_group :key
               key_indent = column(pos - match.size - 1)
    -          # encoder.text_token key_indent.inspect, :debug
               state = :colon
               next
             when match = scan(/(![\w\/]+)(:([\w:]+))?/)
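To see what the lookahead change in the patch buys, here is a minimal sketch that runs the old and new key patterns against the reported line. The two regexes are copied from the patch; the `scan_key` helper and the use of a bare `StringScanner` are assumptions for illustration only, and the input is the key line with its indentation already consumed, which is not how the real scanner is driven.

```ruby
require 'strscan'

# Key patterns copied from the patch: before and after the change.
OLD_KEY = /[\w.() ]*\S(?=: |:$)/
NEW_KEY = /[\w.() ]*\S(?= *:(?: |$))/

# Hypothetical helper: returns the key matched at the start of +line+, or nil.
def scan_key(pattern, line)
  StringScanner.new(line).scan(pattern)
end

line = 'engine : mri <1.9'
p scan_key(OLD_KEY, line)                     # => nil (space before the colon is rejected)
p scan_key(NEW_KEY, line)                     # => "engine"
p scan_key(NEW_KEY, 'package: system_timer')  # => "package"
```

The new lookahead only tolerates spaces before the colon; the character class for the key itself is unchanged, so the spec-conformance concern raised above still stands.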
From Redmine: http://odd-eyed-code.org/issues/231
YAML scanner doesn't recognize false and true
It should also recognize values such as -.Inf. A more readable part of the spec describes the Core Schema, which Ruby seems to use.
Needs more investigation.
    Index: lib/coderay/scanners/yaml.rb
    ===================================================================
    --- lib/coderay/scanners/yaml.rb (revision 742)
    +++ lib/coderay/scanners/yaml.rb (working copy)
    @@ -11,6 +11,11 @@

         KINDS_NOT_LOC = :all

    +    CONSTANTS = %w[ true True TRUE false False FALSE null Null NULL ] # :nodoc:
    +
    +    IDENT_KIND = WordList.new(nil).
    +      add(CONSTANTS, :pre_constant)
    +
       protected

         def scan_tokens encoder, options
    @@ -59,7 +64,11 @@
               encoder.text_token matched, :content if scan(/(?:\n+ {#{string_indent + 1}}.*)+/)
               encoder.end_group :string
               next
    -        when match = scan(/(?![!"*&]).+?(?=$|\s+#)/)
    +        when match = scan(/(?![!*&\[\],{}]|- ).+?(?=$|\s+#)/)
    +          if kind = IDENT_KIND[match]
    +            encoder.text_token match, kind
    +            next
    +          end
               encoder.begin_group :string
               encoder.text_token match, :content
               string_indent = key_indent || column(pos - match.size - 1)
    @@ -116,6 +125,9 @@
             when match = scan(/:\w+/)
               encoder.text_token match, :symbol
               next
    +        when match = scan(/~|\.(?:inf|Inf|INF|nan|NaN|NAN)/)
    +          encoder.text_token match, :pre_constant
    +          next
             when match = scan(/[^:\s]+(:(?! |$)[^:\s]*)* .*/)
               encoder.text_token match, :error
               next
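As a rough illustration of which plain scalars the Core Schema treats as constants, the following sketch classifies a scalar outside of CodeRay. It uses plain arrays and regexes as a stand-in for the patch's `WordList`; the names `BOOLEANS`, `NULLS`, `INF_PATTERN`, `NAN_PATTERN` and `scalar_kind` are hypothetical, and the optional sign on infinity is my reading of the Core Schema, not something the patch implements.

```ruby
# Plain-Ruby stand-in for the patch's WordList lookup (hypothetical names).
BOOLEANS    = %w[true True TRUE false False FALSE].freeze
NULLS       = %w[null Null NULL ~].freeze
INF_PATTERN = /\A[-+]?\.(?:inf|Inf|INF)\z/  # infinity may carry a sign
NAN_PATTERN = /\A\.(?:nan|NaN|NAN)\z/       # NaN takes no sign

# Classifies a plain scalar; anything unrecognized is treated as a string.
def scalar_kind(text)
  return :pre_constant if BOOLEANS.include?(text) || NULLS.include?(text)
  return :pre_constant if text.match?(INF_PATTERN) || text.match?(NAN_PATTERN)
  :string
end

p scalar_kind('-.Inf')   # => :pre_constant
p scalar_kind('false')   # => :pre_constant
p scalar_kind('falsey')  # => :string
```

Anchoring with `\A` and `\z` matters here: a scalar like `falsey` or `.information` should stay a string rather than being highlighted because of its prefix.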
From Redmine: http://odd-eyed-code.org/issues/234
YAML scanner doesn't recognize - -
    required_ruby_version: !ruby/object:Gem::Requirement
      requirements:
      - - ">="
        - !ruby/object:Gem::Version
          version: 1.8.2
      version:
The third line is currently interpreted as operator(- )string(- ">="), which is wrong. It should be operator(- )operator(- )string(">=").
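One way to get the expected token sequence is to consume sequence indicators in a loop before looking at the value. This is only a sketch under that assumption: `tokenize_sequence_line` is a hypothetical helper built on `StringScanner`, not CodeRay code, and it does not handle a bare `-` at the end of a line.

```ruby
require 'strscan'

# Hypothetical helper: splits one sequence line into [kind, text] pairs,
# emitting one :operator token per leading "- " indicator.
def tokenize_sequence_line(line)
  scanner = StringScanner.new(line)
  tokens = []
  scanner.skip(/ +/)                  # indentation is not tokenized in this sketch
  while (dash = scanner.scan(/- /))   # repeat so that "- - " yields two operators
    tokens << [:operator, dash]
  end
  rest = scanner.rest
  tokens << [:string, rest] unless rest.empty?
  tokens
end

p tokenize_sequence_line('  - - ">="')
# => [[:operator, "- "], [:operator, "- "], [:string, "\">=\""]]
```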
From Redmine: http://odd-eyed-code.org/issues/237
The yaml.multiline example is broken
The YAML scanner doesn't seem to tokenize multiline strings correctly.
From Redmine: http://odd-eyed-code.org/issues/238
YAML scanner doesn't recognize arrays
...like [1, 2, 3].
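For reference, a one-line flow sequence could be tokenized roughly as follows. This is a sketch, not the scanner's implementation: `tokenize_flow_sequence` is a hypothetical helper, the `:content` kind for items is chosen for illustration, and quoted strings, nested mappings and multi-line sequences are not handled.

```ruby
require 'strscan'

# Hypothetical helper: tokenizes a one-line flow sequence of plain scalars.
def tokenize_flow_sequence(text)
  scanner = StringScanner.new(text)
  tokens = []
  until scanner.eos?
    if scanner.skip(/\s+/)
      next                                          # whitespace between items
    elsif (op = scanner.scan(/[\[\],]/))
      tokens << [:operator, op]                     # brackets and commas
    elsif (item = scanner.scan(/[^\[\],\s]+/))
      tokens << [:content, item]                    # a plain scalar item
    end
  end
  tokens
end

p tokenize_flow_sequence('[1, 2, 3]')
```

For `[1, 2, 3]` this yields operator tokens for `[`, `,`, `,` and `]`, with the three numbers as separate items in between.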
From Redmine: http://odd-eyed-code.org/issues/239