Module:DecodeEncode/doc: Difference between revisions
+Cat
(29 intermediate revisions by 7 users not shown)
Line 1:
Line 1:
{{Module rating |general}}
{{Module rating |pre-alpha<!-- Values: pre-alpha • alpha • beta • release • protected -- If a rating not needed/relevant, delete this template call -->}}
<!-- Please place categories where indicated at the bottom of this page and interwikis at Wikidata (see [[Wikipedia:Wikidata]]) -->
<!-- Please place categories where indicated at the bottom of this page and interwikis at Wikidata (see [[Wikipedia:Wikidata]]) -->
{{High-use}}
Implements Lua functions [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.decode|mw.text.decode]], [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.encode|mw.text.encode]] in a module.
Implements Lua functions [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.decode|mw.text.decode]], [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.encode|mw.text.encode]] in a module.
:<code><nowiki>{{#invoke:decode|s=Source&nbsp;text}}</nowiki></code> → <code><nowiki>Source text</nowiki></code>
:<code><nowiki>{{#invoke:(追記) decodeEncode| (追記ここまで)decode|s=Source&nbsp;text(追記) &copy; (追記ここまで)}}</nowiki></code> → <code><nowiki>Source text(追記) © (追記ここまで)</nowiki></code>
See [[List of XML and HTML character entity references]].
See [[List of XML and HTML character entity references]].
== Decode (&copy; → ©) ==
== Decode ((追記) {{mono|1= (追記ここまで)&copy;(追記) }} (追記ここまで) → ©) (追記) <span class="anchor" id="Decode"></span> (追記ここまで)==
{{hatnote|See {{slink||Known issues}} for possible THIN SPACE, epsilon issues}}
:Decodes [[List of XML and HTML character entity references|Named Entities]] ''from'' entity name ''into'' a regular (unicode) character:
:Decodes [[List of XML and HTML character entity references|Named Entities]] ''from'' entity name ''into'' a regular (unicode) character:
:<code>&copy;</code> → <code>©</code>
:<code>&copy;</code> → <code>©</code>
:<code>&gt;</code> → <code>></code>
:<code>&gt;</code> → <code>></code>
All (削除) welldefined (削除ここまで) named entities are decoded ([https://html.spec.whatwg.org/multipage/named-characters.html#named-character-references HTML Named character references], formally: as defined in the [https://www.php.net/get_html_translation_table PHP table]).
All (追記) well-defined (追記ここまで) named entities are decoded ([https://html.spec.whatwg.org/multipage/named-characters.html#named-character-references HTML Named character references], formally: as defined in the [https://www.php.net/get_html_translation_table PHP table]).
:A regular, rendered sentence:
:A regular, rendered sentence:
::"At 100(削除) ° (削除ここまで)F, &(削除) amp; (削除ここまで) with a (削除) " (削除ここまで)burning(削除) " (削除ここまで) sun above, we (削除) walked (削除ここまで)"
::"At 100(追記) ° (追記ここまで)F, & with a (追記) " (追記ここまで)burning(追記) " (追記ここまで) sun above, we (追記) , we ⁄walked⁄. (追記ここまで)"
:In code:
:In code:
::"<code>At 100&nbsp;&deg;F, &(削除) amp; (削除ここまで)amp; with a &quot;burning&quot; sun above, we walked</code>"-- (削除) in code (削除ここまで)
::"<code>At 100&nbsp;&deg;F, & with a &quot;burning&quot; sun above, we (追記) &frasl; (追記ここまで)walked(追記) &frasl;. (追記ここまで)</code>"(追記) (追記ここまで)-- (追記) wikitext (追記ここまで)
:Processing:
:Processing:
:<code><nowiki>{{#invoke:decodeEncode|decode|s=At 100 °F, & with a "burning" sun above, we walked}}</nowiki></code> →
:<code><nowiki>{{#invoke:decodeEncode|decode|s=At 100 °F, & with a "burning" sun above, we (追記) ⁄ (追記ここまで)walked(追記) ⁄. (追記ここまで)}}</nowiki></code> →
::<code>{{#invoke:decodeEncode|decode|s=At 100 °F, & with a "burning" sun above, we walked}}</code> -- In code: no named entities
::<code>{{#invoke:decodeEncode|decode|s=At 100 °F, & with a "burning" sun above, we (追記) ⁄ (追記ここまで)walked(追記) ⁄. (追記ここまで)}}</code> -- In code:(追記) straight characters, (追記ここまで) no named entities(追記) . (追記ここまで)
:Renders, again:
::"At 100 °F, & with a (追記) " (追記ここまで)burning(追記) " (追記ここまで) sun above, we (追記) ⁄walked⁄. (追記ここまで)"
===Decode a reduced set only===
===Decode a reduced set only===
Line 30:
Line 36:
:Also, this module ignores the "omitted" logic: {{para|subset_only}} should be set explicitly to 'true' to be effective.
:Also, this module ignores the "omitted" logic: {{para|subset_only}} should be set explicitly to 'true' to be effective.
== Encode (© → &copy;) ==
== Encode (© → (追記) {{mono|1= (追記ここまで)&copy;(追記) }} (追記ここまで)) (追記) <span class="anchor" id="Encode"></span> (追記ここまで)==
:Function <code>encode</code> encodes some entity-named characters into that name (for example: <code>&</code> → <code>&amp;</code>).
:Function <code>encode</code> encodes some entity-named characters into that name (for example: <code>&</code> → <code>&amp;</code>).
Line 40:
Line 46:
Encode:
Encode:
:<code><nowiki>{{#invoke:decodeEncode|encode|s=At >100 °F, & with a "burning" sun above, we walked. ©(削除) |charset=&<>{{!}}°"'&©}}" (削除ここまで)|charset=&<>{{!}}°"'&©}}</nowiki></code>(削除) → (削除ここまで)
:<code><nowiki>{{#invoke:decodeEncode|encode|s=At >100 °F, & with a "burning" sun above, we walked. ©|charset=&<>{{!}}°"'&©}}</nowiki></code>
:<code>{{#invoke:decodeEncode|encode|s=At >100 °F, & with a "burning" sun above, we walked. ©|charset=&<>{{!}}°"'&©}}"|charset=&<>{{!}}°"'&©}}</code>
:→
::(削除) → (削除ここまで)"(削除) <code> (削除ここまで)At 100 °F, &(削除) amp;amp; (削除ここまで) with a (削除) &quot; (削除ここまで)burning(削除) &quot; (削除ここまで) sun above, we (削除) walked</code> (削除ここまで)"(削除) -- in code (削除ここまで)
:<code><nowiki>At &gt;100 &#176;F, &amp; with a &quot;burning&quot; sun above, we walked. &#169;</nowiki></code><!-- used Special:ExpandTemplate -->
:Renders as:
:"At >100 °F, & with a "burning" sun above, we walked. ©"
===character set to encode===
Per Lua documentation, only a small set of characters is processed. The (追記) character set (追記ここまで) can be set ((追記) expanded) by using (追記ここまで){{para|charset}}.
:(削除) Use escape character is '<code>\</code>' (backslash; not "%" then). (削除ここまで)Example: {{para|charset|<nowiki><>" \'&</nowiki>}} (the default), {{para|charset|<nowiki><>(削除) \|\ (削除ここまで)°(削除) \ (削除ここまで)"(削除) \ (削除ここまで)'(削除) \ (削除ここまで)&©</nowiki>}}; characters not in the default will be replaced by their decimal entity: <code>©</code> → <code>&#169;</code> <small>(hexadecimal number, not decimal nor named &copy;)</small>
:Example: {{para|charset|<nowiki><>" \'&</nowiki>}} (the default), {{para|charset|<nowiki><>°"'&©(追記) {{!}} (追記ここまで)</nowiki>}}; characters not in the default will be replaced by their decimal entity: <code>©</code> → <code>&#169;</code> <small>(decimal number, not hexadecimal nor named &copy;)</small>
==Template==
{{As of|Dec 2020}}, there are no templates implementing this module.
==Known issues <span class="anchor" id="Template"></span>==
* 13 Sep 2021: NOTE: The encode function with user-supplied charset is now used productively in {{tl|R/superscript}} and {{tl|R/ref}}. Before implementing breaking changes here, these templates need to be adjusted accordingly!
* 26 Sep 2021: {{unichar|2009|THIN SPACE|html=}}
:Note: Possible bug: Decoding <code>&ThinSpace;</code> works, but <code>&thinsp;</code> doesn't.
:Resolved in code.
* 4 Feb 2023: {{unichar|03B5|GREEK SMALL LETTER EPSILON|html=}}
{{tracked|T328840}}
:See {{slink|Module_talk:DecodeEncode|Bug_report:_bad_decoding_of_U+03B5_ε_(epsilon)}}
:Resolved in code.
==See also==
==See also==
* [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.decode|mw.text.decode]]
* [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.decode|mw.text.decode]]
* [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.encode|mw.text.encode]]
* [[:mw:Extension:Scribunto/Lua_reference_manual#mw.text.encode|mw.text.encode]]
* [[:Module:Urldecode]]
* [[:Module:Urldecode]]
{{Navbox wikitext-handling templates}}
<includeonly>{{sandbox other||
<includeonly>{{sandbox other||
<!-- Categories below this line, please; interwikis at Wikidata -->
<!-- Categories below this line, please; interwikis at Wikidata -->
[[Category:Wikitext processing templates]]
[[Category:Wikitext processing templates]]
[[Category:(削除) String (削除ここまで) (削除) manipulation (削除ここまで) (削除) modules (削除ここまで) (削除) (no template) (削除ここまで)]]
[[Category:(追記) Modules (追記ここまで) (追記) that (追記ここまで) (追記) manipulate (追記ここまで) (追記) strings (追記ここまで)]]
[[Category:Template metamodules]]
}}</includeonly>
}}</includeonly(追記) ><noinclude (追記ここまで)>
[[Category:Module documentation pages]]
</noinclude>
Latest revision as of 08:11, 22 April 2025
This module is rated as ready for general use. It has reached a mature state, is considered relatively stable and bug-free, and may be used wherever appropriate. It can be mentioned on help pages and other Wikipedia resources as an option for new users. To minimise server load and avoid disruptive output, improvements should be developed through sandbox testing rather than repeated trial-and-error editing.
Warning: This Lua module is used on approximately 140,000 pages.
To avoid major disruption and server load, any changes should be tested in the module's /sandbox or /testcases subpages, or in your own module sandbox. The tested changes can be added to this page in a single edit. Consider discussing changes on the talk page before implementing them.
Implements Lua functions mw.text.decode, mw.text.encode in a module.
{{#invoke:decodeEncode|decode|s=Source&nbsp;text&copy;}} → Source text©
See List of XML and HTML character entity references.
Decode (&copy; → ©)
See § Known issues for possible THIN SPACE, epsilon issues
- Decodes named entities from the entity name into the regular (Unicode) character:
&copy; → ©
&gt; → >
All well-defined named entities are decoded (HTML Named character references, formally: as defined in the PHP table).
- A regular, rendered sentence:
- "At 100 °F, & with a "burning" sun above, we ⁄walked⁄."
- In code:
- "At 100&nbsp;&deg;F, &amp; with a &quot;burning&quot; sun above, we &frasl;walked&frasl;." -- wikitext
- Processing:
{{#invoke:decodeEncode|decode|s=At 100&nbsp;&deg;F, &amp; with a &quot;burning&quot; sun above, we &frasl;walked&frasl;.}}
→ At 100 °F, & with a "burning" sun above, we ⁄walked⁄. -- In code: straight characters, no named entities.
- Renders, again:
- "At 100 °F, & with a "burning" sun above, we ⁄walked⁄."
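As a rough illustration outside MediaWiki, the decoding shown above can be reproduced with Python's standard html.unescape, which also follows the HTML named character references. This is only an analogy to mw.text.decode, not the module's own code; the name decode_entities is made up for this sketch.

```python
import html

# The example sentence as written in wikitext, with named entities.
SOURCE = ('At 100&nbsp;&deg;F, &amp; with a &quot;burning&quot; '
          'sun above, we &frasl;walked&frasl;.')


def decode_entities(text):
    """Hypothetical helper: turn named entities into plain characters."""
    return html.unescape(text)


decoded = decode_entities(SOURCE)
print(decoded)
```

The result contains the straight characters only: a no-break space (U+00A0) after "100", the degree sign, a plain ampersand, plain quotation marks, and two fraction slashes (U+2044).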
Decode a reduced set only
By setting |subset_only=true, only these five entity names are decoded: '&lt;', '&gt;', '&amp;', '&quot;', '&nbsp;' (that is, into '<', '>', '&', '"', ' ').
- Note: This differs from the corresponding Lua parameter. (It only concerns you if you also work directly with the Lua mw.text.decode function.) The Lua documentation defines the parameter |decodeNamedEntities= with this effect: when omitted or false, only the reduced set of entities is recognized and decoded. The sense of 'false' is inverted in |subset_only=: |decodeNamedEntities=false corresponds to |subset_only=true.
- Also, this module ignores the "omitted" logic: |subset_only= must be set explicitly to 'true' to be effective.
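The flag logic described above can be sketched in Python. This is an illustrative model under stated assumptions, not the module's implementation: the function decode and the table REDUCED are hypothetical names, html.unescape stands in for full named-entity decoding, and numeric character references are left alone in the reduced branch for brevity.

```python
import html
import re

# The five entity names of the reduced set, with their characters.
REDUCED = {"lt": "<", "gt": ">", "amp": "&", "quot": '"', "nbsp": "\u00a0"}


def decode(text, subset_only=None):
    """Hypothetical stand-in for the module's decode function."""
    # Only the explicit string 'true' selects the reduced set.
    # An omitted parameter, or any other value, decodes all named entities.
    if subset_only == "true":
        # A single pass, so '&amp;lt;' becomes '&lt;' and is not decoded twice.
        return re.sub(r"&(lt|gt|amp|quot|nbsp);",
                      lambda m: REDUCED[m.group(1)], text)
    return html.unescape(text)


print(decode("&lt;b&gt; &copy;", subset_only="true"))
print(decode("&lt;b&gt; &copy;"))
```

With subset_only="true" the &copy; entity is left untouched; without it, every named entity is decoded.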
Encode (© → &copy;)
- Function encode encodes some entity-named characters into that name (for example: & → &amp;).
Regular sentence:
- "At >100 °F, & with a "burning" sun above, we walked. ©"
In code:
- "At >100 °F, & with a "burning" sun above, we walked. ©"
Encode:
{{#invoke:decodeEncode|encode|s=At >100 °F, & with a "burning" sun above, we walked. ©|charset=&<>{{!}}°"'&©}}
- → At &gt;100 &#176;F, &amp; with a &quot;burning&quot; sun above, we walked. &#169;
- Renders as:
- "At >100 °F, & with a "burning" sun above, we walked. ©"
character set to encode
Per Lua documentation, only a small set of characters is processed. The character set can be set (expanded) by using |charset=.
- Example: |charset=<>" \'& (the default), |charset=<>°"'&©{{!}}; characters not in the default will be replaced by their decimal entity: © → &#169; (decimal number, not hexadecimal nor named &copy;)
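A minimal Python sketch of this behaviour, for illustration only: it assumes that the charset is a plain list of characters (the real parameter is a Lua pattern set, which this sketch does not parse), that the default set is the six characters <, >, &, ", ' and the no-break space, and that only five of them have named replacements. The names encode, NAMED and DEFAULT_CHARSET are hypothetical.

```python
# Characters that are written as named entities when they are encoded.
NAMED = {"<": "&lt;", ">": "&gt;", "&": "&amp;", '"': "&quot;",
         "\u00a0": "&nbsp;"}

# Assumed default: < > & " ' and the no-break space (U+00A0).
DEFAULT_CHARSET = "<>&\"'\u00a0"


def encode(text, charset=DEFAULT_CHARSET):
    """Hypothetical stand-in for the module's encode function."""
    def replace(ch):
        if ch not in charset:
            return ch  # not selected for encoding: keep as is
        # Named entity where one is defined, decimal entity otherwise.
        return NAMED.get(ch, "&#%d;" % ord(ch))
    return "".join(replace(ch) for ch in text)


sentence = 'At >100 °F, & with a "burning" sun above, we walked. ©'
print(encode(sentence, charset='&<>|°"\'©'))
```

With the expanded charset, ° and © come out as the decimal entities &#176; and &#169;; with the default charset they are left unchanged.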
Known issues
- 13 Sep 2021: NOTE: The encode function with user-supplied charset is now used in production in {{R/superscript}} and {{R/ref}}. Before implementing breaking changes here, these templates need to be adjusted accordingly!
- 26 Sep 2021: U+2009 THIN SPACE ( ,  )
- Note: Possible bug: Decoding &ThinSpace; works, but &thinsp; doesn't.
- Resolved in code.
- 4 Feb 2023: U+03B5 ε GREEK SMALL LETTER EPSILON (ε, ε)
- See Module talk:DecodeEncode § Bug report: bad decoding of U+03B5 ε (epsilon)
- Resolved in code.