Module talk:DecodeEncode

From Wikipedia, the free encyclopedia
Module:DecodeEncode is permanently protected from editing because it is a heavily used or highly visible module. Substantial changes should first be proposed and discussed here on this page. If the proposal is uncontroversial or has been discussed and is supported by consensus, editors may use {{edit template-protected}} to notify an administrator or template editor to make the requested edit.

Bug report: bad decoding of U+03B5 ε (epsilon)


About U+03B5 ε GREEK SMALL LETTER EPSILON (&epsi;, &epsilon;)

  • Issue: after one of the two named HTML entities for ε (&epsi;, &epsilon;) is resolved by mw.text.decode(), the resulting plain character is not found by mw.ustring.gsub(). The other entity shows no issue.
Report limitations: the original report and bug reproduction are at enwiki Module talk:DecodeEncode, where en:module:DecodeEncode and en:module:String are used live. At Phabricator, pseudocode may be used and some "results" may be hardcoded. In running text the escape &amp; is used; inside function calls it is not. Lua patterns are not used ("no %").
  • To reproduce:
1. Create research string:
X&epsi;1X&epsilon;2X (shows live and unedited as: Xε1Xε2X)
2. Render the string by decode() (as inner function)
3. Then, on the rendered result, use gsub() to replace the plain character ε with E (as outer function):
mw.ustring.gsub( s=(mw.text.decode( s=X&epsi;1X&epsilon;2X, decodeNamedEntities=true ) ), pattern=ε, repl=E ) [is pseudo-code, see note. 21:10, 7 February 2023 (UTC)]
4. Result3 (s&r pattern use ε from Xε1X):
XE1XE2X
5. Result4 (s&r pattern use ε from Xε2X):
XE1XE2X
  • Expected: XE1XE2X (only one character ε exists)
{{#invoke:String|replace|source={{#invoke:DecodeEncode|decode|s=X&epsi;1X&epsilon;2X}}|pattern=ε|replace=E|plain=true}}
→ XE1XE2X
-DePiep (talk) 21:10, 7 February 2023 (UTC)
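
Because both entities render as the same glyph, a byte-level dump of the decoded string may help show whether the two inputs really end up as the same code point. The following is only a minimal plain-Lua sketch: `hex_bytes` is a hypothetical helper, not part of Module:DecodeEncode or Module:String, and inside Scribunto it would be applied to the string returned by mw.text.decode().

```lua
-- Hypothetical diagnostic helper: return the bytes of s as
-- space-separated uppercase hex pairs.
local function hex_bytes(s)
    local out = {}
    for i = 1, #s do
        out[#out + 1] = string.format("%02X", s:byte(i))
    end
    return table.concat(out, " ")
end

-- U+03B5 GREEK SMALL LETTER EPSILON is encoded in UTF-8 as CE B5.
local epsilon = "\206\181"
print(hex_bytes(epsilon))                -- CE B5
print(hex_bytes("X" .. epsilon .. "1"))  -- 58 CE B5 31
```

If the decoded output of one entity showed any byte sequence other than CE B5, that would explain why a search for the plain ε does not match it. This sketch does not establish that this is the actual cause.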

Workaround A, ad hoc


Workaround A, ad hoc: add an innermost function that first replaces, in the research string, the failing entity with the working one:

A1: {{#invoke:String|replace|source={{#invoke:DecodeEncode|decode|s={{#invoke:String|replace|source=Xε1Xε2X|pattern=ε|replace=ε|plain=true}}}}|pattern=ε|replace=E|plain=true}}
XE1XE2X

Workaround B, in module (THIN SPACE example)


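Workaround B: early in :en:module:DecodeEncode, replace the failing entity with the working one.

About THIN SPACE: it looks like character U+2009 THIN SPACE (&thinsp;, &ThinSpace;) has a similar issue: one of its entities decodes properly, the other does not.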

Currently in code:

function p._decode(s, subset_only)
local ret = nil;
s=mw.ustring.gsub(s,' ',' ')-- Workaround for bug:   gets properly decoded in decode, but   doesn't.
ret = mw.text.decode(s, not subset_only)
return ret
end

In en:module:DecodeEncode/sandbox, I have coded a similar handling of EPSILON:

module:DecodeEncode, module:DecodeEncode/sandbox diff
function p._decode(s, subset_only)
local ret = nil;
-- U+2009 THIN SPACE: workaround for bug: HTML entity   is decoded incorrect. Entity   gets decoded properly
s=mw.ustring.gsub(s,' ',' ')
-- U+03B5 ε GREEK SMALL LETTER EPSILON: workaround for bug (phab:T328840): HTML entity ε is decoded incorrect for gsub(). Entity ε gets decoded properly
s=mw.ustring.gsub(s,'ε','ε')
ret = mw.text.decode(s, not subset_only)
return ret
end
  • /sandbox tests:
B. {{#invoke:String|replace|source={{#invoke:DecodeEncode/sandbox|decode|s=X&epsi;1X&epsilon;2X}}|pattern=ε|replace=E|plain=true}}
B1. ResultB1 (s&r pattern use ε from Xε1X): XE1XE2X
B2. ResultB2 (s&r pattern use ε from Xε2X): XE1XE2X
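
The idea behind workaround B (rewrite one entity spelling to another before decoding) can be sketched in plain Lua without any mw library. Everything named here is hypothetical: `normalize_entities` and the `aliases` table are illustrations, not the module's code. The direction of the rewrite (&epsi; to &epsilon;) is an assumption taken from the wording of the edit request on this page; swap the pair if the live module does the reverse.

```lua
-- Hypothetical helper: rewrite entity names listed in `aliases`
-- before the string is handed to a decoder.
-- `aliases` maps the name to avoid to the name to use instead.
local function normalize_entities(s, aliases)
    -- Match "&name;" with an alphanumeric name.
    return (s:gsub("&(%w+);", function(name)
        local target = aliases[name]
        if target then
            return "&" .. target .. ";"
        end
        -- Returning nothing leaves the original match unchanged.
    end))
end

-- ASSUMPTION: direction as worded in the edit request on this page.
local aliases = { epsi = "epsilon" }
print(normalize_entities("X&epsi;1X&epsilon;2X", aliases))
-- X&epsilon;1X&epsilon;2X
```

A table-driven form like this would let further entities (such as the THIN SPACE pair) be added without another gsub line, at the cost of using a Lua pattern, which the module itself avoids.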

I propose to edit the module along these lines.

Workaround C (mw, Lua)


Changes in mw, Lua: I have no idea.

testcases EPSILON

  • Original failure, now solved (no longer showing):
(hardcoded explanation here): in the cell marked Red XN, the result showed as "XE1Xε2X". That is: wikitext input "ε" was not recognised & replaced. -DePiep (talk) 07:49, 19 February 2023 (UTC)
EPSILON (&epsi;, &epsilon;): error & fix proposal (16 Feb 2023)
Columns 5 and 6: replace(decode(..)) with E, pattern=hardcoded ⟨ε⟩ from plain; each cell shows the result for (s=&entity;) / (s=checkstring).
1 id        | 2 entity code        | 3 plain   | 4 mod:..decode(&entity;) | 5 mod:..decode     | 6 mod:..decode/sandbox
checkstring | X&epsi;1X&epsilon;2X | >Xε1Xε2X< | >Xε1Xε2X<                |                    |
EPSI        | &epsi;               | >ε<       | >ε<                      | E / XE1XE2X        | E / XE1XE2X
EPSILON     | &epsilon;            | >ε<       | >ε<                      | E / XE1XE2X Red XN | E / XE1XE2X
This is a similar fix to the one for U+2009 THIN SPACE (&thinsp;, &ThinSpace;), though the underlying bug may be different for THIN SPACE.
  • Phabricator T328840 did not gain traction. A fix there would be at mw level, not in this module.
-DePiep (talk) 06:22, 16 February 2023 (UTC)

Template-protected edit request on 16 February 2023

This edit request has been answered. Set the |answered= parameter to no to reactivate your request.
Issue: bad decoding of HTML entity &epsi; Red XN
re U+03B5 ε GREEK SMALL LETTER EPSILON (&epsi;, &epsilon;)
Change: fix by replacing with entity &epsilon; Green tickY before applying decode(). See § Workaround B for code diff & backgrounds; minor comment change
Discussion: (1) reported at T328840, no responses (mw-level); (2) bug report here not challenged
Testcases: See § testcases EPSILON.
DePiep (talk) 06:49, 16 February 2023 (UTC)
Done * Pppery * it has begun... 03:11, 19 February 2023 (UTC)

NBSP behaviour


Leaving this note here.

About NBSP, U+00A0 NO-BREAK SPACE (&nbsp;, &NonBreakingSpace;). With input &nbsp; I am experiencing problems reminiscent of § epsilon (T328840, now resolved).

When nested like (replace|s=(decode|s=AB&nbsp;YZ)|replace=AB_YZ), the call returns breaking code (it breaks when used in or with HTML/CSS code such as span, sup, class).

No time to build the reproduction/test, so I have to leave it for now. Not reported on phab. DePiep (talk) 07:27, 20 February 2023 (UTC)
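
For whoever picks this up: a minimal plain-Lua sketch of the result the nested call would presumably be expected to produce once decode() has turned &nbsp; into a literal U+00A0. `replace_nbsp` is a hypothetical helper and does not reproduce the reported breakage; it only pins down the expected outcome for a future test.

```lua
local NBSP = "\194\160"  -- UTF-8 bytes of U+00A0 NO-BREAK SPACE

-- Hypothetical helper: replace every NBSP in s with repl.
-- Plain find (4th argument true) avoids Lua pattern magic.
local function replace_nbsp(s, repl)
    local out, pos = {}, 1
    while true do
        local i, j = s:find(NBSP, pos, true)
        if not i then break end
        out[#out + 1] = s:sub(pos, i - 1)
        out[#out + 1] = repl
        pos = j + 1
    end
    out[#out + 1] = s:sub(pos)
    return table.concat(out)
end

print(replace_nbsp("AB" .. NBSP .. "YZ", "_"))  -- AB_YZ
```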

Template-protected edit request on 21 March 2023

This edit request has been answered. Set the |answered= parameter to no to reactivate your request.

Please replace all code in Module:DecodeEncode with that of module:DecodeEncode/sandbox (compare).

Change: apply require('strict'), and declare functions local explicitly. DePiep (talk) 14:34, 21 March 2023 (UTC)

Invitation is out. -DePiep (talk) 14:49, 21 March 2023 (UTC)
Upd: Gonnym has made large improvements, so the sandboxdiff is large. I do not see strict-related changes. DePiep (talk) 21:31, 21 March 2023 (UTC)
The changes are good and no globals remain. The two mw.ustring could be string. Johnuniq (talk) 06:40, 22 March 2023 (UTC)
thx. As said, please someone with trust perform ER because me editing/commenting in between does not help. DePiep (talk) 08:18, 22 March 2023 (UTC)
Done - Martin (MSGJ · talk) 18:35, 22 March 2023 (UTC)
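
On the remark that the two mw.ustring calls could be string: the presumable reasoning is that in UTF-8 the bytes of ASCII characters never occur inside a multibyte sequence, so a byte-wise search for a pure-ASCII literal cannot match in the middle of a character. A minimal plain-Lua sketch (the entity pair is only an example, not a statement about which spelling the module rewrites):

```lua
local epsilon = "\206\181"  -- U+03B5 as UTF-8 (two bytes)
local s = "X&epsi;1X" .. epsilon .. "2X"

-- Byte-oriented string.gsub with a plain ASCII search string.
local result, count = string.gsub(s, "&epsi;", "&epsilon;")

print(count)                                        -- 1
print(result == "X&epsilon;1X" .. epsilon .. "2X")  -- true
```

The multibyte ε passes through untouched, which is the same outcome a Unicode-aware gsub would give for this search string.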
