Bug #25927 get_html_translation_table calls the ' &#39; instead of &#039;
Submitted: 2003-10-20 17:53 UTC Modified: 2010-10-12 04:52 UTC
Votes:3
Avg. Score:4.7 ± 0.5
Reproduced:2 of 2 (100.0%)
Same Version:2 (100.0%)
Same OS:1 (50.0%)
From: acm at tweakers dot net Assigned: cataphract (profile)
Status: Closed Package: Unknown/Other Function
PHP Version: 4.3.3 OS: Linux
Private report: No CVE-ID: None
[2003-10-20 17:53 UTC] acm at tweakers dot net
Description:
------------
When you call get_html_translation_table with the ENT_QUOTES parameter, it returns &#39; for '.
The code for ' should, of course, be &#039;.
This was not broken in 4.3.1, so it was newly introduced in either 4.3.2 or 4.3.3.
One wonders how this could occur, since both htmlspecialchars/htmlentities and html_entity_decode work correctly.
Reproduce code:
---------------
<? print_r(get_html_translation_table(HTML_SPECIALCHARS,ENT_QUOTES));
?>
Expected result:
----------------
Array
(
 [&] => &amp;
 ["] => &quot;
 ['] => &#039;
 [<] => &lt;
 [>] => &gt;
)
Actual result:
--------------
Array
(
 [&] => &amp;
 ["] => &quot;
 ['] => &#39;
 [<] => &lt;
 [>] => &gt;
)

History

[2003-10-20 18:35 UTC] kees at tweakers dot net
We've fixed it by commenting out line 421 of ext/standard/html.c:
420- { '\'', "&#039;", 6, ENT_HTML_QUOTE_SINGLE },
421:/* { '\'', "&#39;", 5, ENT_HTML_QUOTE_SINGLE }, */
[2003-10-20 18:55 UTC] iliaa@php.net
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php
Both &#39; and &#039; are valid.
[2003-10-20 19:04 UTC] acm at tweakers dot net
Well, it's cute that both are valid, but that's not the point...
get_html_translation_table is supposed to return "how php's functions translate it", not "any way which is valid".
And in that way, it _fails_ to do so.
Since the function html_entity_decode is only available as of php-4.3.0, anyone who has a similar function (based on the php-example on the documentation page!) finds it broken because of this.
In that sense it is, imho, a bug.
Quoting your own documentation:
"get_html_translation_table -- Returns the translation table used by htmlspecialchars() and htmlentities()"
[2003-10-20 21:51 UTC] moriyoshi@php.net
Not quite. When you have to write your own html_entity_decode(), you should cope with any form of numeric entity, including the hexadecimal style. It's not as simple as the snippet in the manual page.
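To illustrate the point about numeric entities, here is a minimal sketch of a decoder that accepts both decimal and hexadecimal references. It is not from the thread or from the PHP source. The function name decode_numeric_entities is hypothetical, and the sketch assumes PHP 7.2 or later with the mbstring extension (for mb_chr) and UTF-8 output.

```php
<?php
// Hypothetical helper, not part of PHP: decodes decimal and hexadecimal
// numeric character references such as &#39;, &#039; and &#x27;.
function decode_numeric_entities(string $text): string
{
    return preg_replace_callback(
        // Group 1 captures hexadecimal digits, group 2 captures decimal digits.
        '/&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));/',
        function (array $m): string {
            $codepoint = ($m[1] !== '') ? hexdec($m[1]) : (int) $m[2];
            // mb_chr() returns false for invalid code points;
            // in that case the reference is left untouched.
            $char = mb_chr($codepoint, 'UTF-8');
            return $char === false ? $m[0] : $char;
        },
        $text
    );
}

echo decode_numeric_entities('&#39; &#039; &#x27;'), "\n";
```

Named entities such as &amp; are deliberately left alone here; a full decoder would handle those as well.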
[2003-10-21 05:14 UTC] acm at tweakers dot net
Well, maybe so.
But I was referring to a function that tries to undo the changes of htmlspecialchars/htmlentities.
If htmlspecialchars changes ' to &#039; and you want to depend on get_html_translation_table to undo all changes, you expect it to return ' = &#039; instead of ' = &#39;, since that's the change htmlspecialchars/htmlentities made as well.
It didn't change it to &#39;
If you really wanted to create a perfect entity-decoder, you'd indeed have to cope with all those &*; entities, including all the &#[0-9]{2,3};-like entities.
But for the simple "undo the htmlspecialchars"-like function that is not necessary.
And again, get_html_translation_table returns "how the htmlspecialchars/entities functions do it", not "all possible translations" or "just a valid version, maybe not what our own functions do", doesn't it? :)
To explain what I mean:
if you do 
echo html_entity_decode(htmlspecialchars("'", ENT_QUOTES));
you get ' back.
If you do:
function my_entity_decoder($string)
{
// Flip the table so that each entity maps back to its original character.
$trans = array_flip(get_html_translation_table(HTML_SPECIALCHARS, ENT_QUOTES));
return strtr($string, $trans);
}
echo my_entity_decoder(htmlspecialchars("'", ENT_QUOTES));
Where you trust the get_html_translation_table-function to return enough information to output ' again...
But if none of this matters to you guys, why do the two differ at all?
Why does htmlspecialchars change it to &#039; while get_html_translation_table claims it changes it to &#39;?
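The mismatch described above can be checked mechanically. The following is a minimal sketch, not code from the thread: it compares every entry of the translation table against what htmlspecialchars actually emits for the same character. The variable names are illustrative, and the third (charset) argument assumes PHP 5.3.4 or later. On an affected 4.3.x build the same idea would need the two-argument form, and the apostrophe would show up as a mismatch.

```php
<?php
// ENT_QUOTES on its own selects the HTML 4.01 entity set, matching the thread.
$flags = ENT_QUOTES;
$table = get_html_translation_table(HTML_SPECIALCHARS, $flags, 'UTF-8');

// Collect every character whose table entry differs from the real encoder output.
$mismatches = [];
foreach ($table as $char => $entity) {
    $actual = htmlspecialchars((string) $char, $flags, 'UTF-8');
    if ($actual !== $entity) {
        $mismatches[$char] = ['table' => $entity, 'encoder' => $actual];
    }
}

print_r($mismatches);
```

An empty result means the table and the encoder agree, which is the behaviour the reporter expected.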
[2003-11-20 14:00 UTC] mike-php at emerge2 dot com
Does the same in Windows PHP 4.3.4.
[2010-10-11 07:15 UTC] cataphract@php.net
-Status: Bogus +Status: Re-Opened -Assigned To: +Assigned To: cataphract
[2010-10-12 04:51 UTC] cataphract@php.net
Automatic comment from SVN on behalf of cataphract
Revision: http://svn.php.net/viewvc/?view=revision&revision=304340
Log: - Added a 3rd parameter to get_html_translation_table. It now takes a charset
 hint, like htmlentities et al.
- Fixed bug #49407 (get_html_translation_table doesn't handle UTF-8).
- Fixed bug #25927 (get_html_translation_table calls the ' &#39; instead of
 &#039;).
- Fixed tests for get_html_translation_table and unified the Windows and
 non-Windows versions of the tests.
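
As a rough illustration of the charset hint mentioned in the log above (a sketch assuming PHP 5.3.4 or later, not taken from the commit itself), the third argument changes how the table's keys are encoded. The variable names are illustrative.

```php
<?php
// Same entity set, two different charset hints.
$utf8   = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'UTF-8');
$latin1 = get_html_translation_table(HTML_ENTITIES, ENT_QUOTES, 'ISO-8859-1');

// U+00E9 (e with acute accent) written as raw bytes, so the example does not
// depend on the encoding of this source file.
$eAcuteUtf8   = "\xC3\xA9"; // two bytes in UTF-8
$eAcuteLatin1 = "\xE9";     // one byte in ISO-8859-1

echo $utf8[$eAcuteUtf8], "\n";
echo $latin1[$eAcuteLatin1], "\n";
echo $utf8["'"], "\n";
```

Both lookups refer to the same character; only the byte sequence used as the array key differs.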
[2010-10-12 04:52 UTC] cataphract@php.net
-Status: Re-Opened +Status: Closed
[2010-10-12 04:52 UTC] cataphract@php.net
Fixed for PHP 5.3 and trunk.
PHP Copyright © 2001-2025 The PHP Group
All rights reserved. Last updated: Mon Dec 15 02:00:01 2025 UTC
