By: squeegee in PHP Tutorials on 2011-07-31
I think this is a reasonable port of Perl's Encoding::FixLatin by Grant McLean, which converts a string with mixed encodings (ASCII, ISO-8859-1, CP1252, and UTF-8) to UTF-8.
<?php
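// Build the table that maps each high byte (0x80-0xFF) to its UTF-8 form.
function init_byte_map(){
    global $byte_map;
    // Default: read the byte as ISO-8859-1. mb_convert_encoding() replaces
    // utf8_encode(), which is deprecated as of PHP 8.2.
    for($x=128;$x<256;++$x){
        $byte_map[chr($x)]=mb_convert_encoding(chr($x),'UTF-8','ISO-8859-1');
    }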
    // CP1252 assigns printable characters to most of 0x80-0x9F; these override the Latin-1 defaults.
    $cp1252_map=array(
        "\x80" => "\xE2\x82\xAC", // EURO SIGN
        "\x82" => "\xE2\x80\x9A", // SINGLE LOW-9 QUOTATION MARK
        "\x83" => "\xC6\x92",     // LATIN SMALL LETTER F WITH HOOK
        "\x84" => "\xE2\x80\x9E", // DOUBLE LOW-9 QUOTATION MARK
        "\x85" => "\xE2\x80\xA6", // HORIZONTAL ELLIPSIS
        "\x86" => "\xE2\x80\xA0", // DAGGER
        "\x87" => "\xE2\x80\xA1", // DOUBLE DAGGER
        "\x88" => "\xCB\x86",     // MODIFIER LETTER CIRCUMFLEX ACCENT
        "\x89" => "\xE2\x80\xB0", // PER MILLE SIGN
        "\x8A" => "\xC5\xA0",     // LATIN CAPITAL LETTER S WITH CARON
        "\x8B" => "\xE2\x80\xB9", // SINGLE LEFT-POINTING ANGLE QUOTATION MARK
        "\x8C" => "\xC5\x92",     // LATIN CAPITAL LIGATURE OE
        "\x8E" => "\xC5\xBD",     // LATIN CAPITAL LETTER Z WITH CARON
        "\x91" => "\xE2\x80\x98", // LEFT SINGLE QUOTATION MARK
        "\x92" => "\xE2\x80\x99", // RIGHT SINGLE QUOTATION MARK
        "\x93" => "\xE2\x80\x9C", // LEFT DOUBLE QUOTATION MARK
        "\x94" => "\xE2\x80\x9D", // RIGHT DOUBLE QUOTATION MARK
        "\x95" => "\xE2\x80\xA2", // BULLET
        "\x96" => "\xE2\x80\x93", // EN DASH
        "\x97" => "\xE2\x80\x94", // EM DASH
        "\x98" => "\xCB\x9C",     // SMALL TILDE
        "\x99" => "\xE2\x84\xA2", // TRADE MARK SIGN
        "\x9A" => "\xC5\xA1",     // LATIN SMALL LETTER S WITH CARON
        "\x9B" => "\xE2\x80\xBA", // SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
        "\x9C" => "\xC5\x93",     // LATIN SMALL LIGATURE OE
        "\x9E" => "\xC5\xBE",     // LATIN SMALL LETTER Z WITH CARON
        "\x9F" => "\xC5\xB8"      // LATIN CAPITAL LETTER Y WITH DIAERESIS
    );
    foreach($cp1252_map as $k=>$v){
        $byte_map[$k]=$v;
    }
}
function fix_latin($instr){
    if(mb_check_encoding($instr,'UTF-8'))return $instr; // no need for the rest if it's all valid UTF-8 already
    global $nibble_good_chars,$byte_map;
    $outstr='';
    $char='';
    $rest='';
    while((strlen($instr))>0){
        if(1==preg_match($nibble_good_chars,$instr,$match)){
            // Leading ASCII run or one UTF-8 sequence: copy it through unchanged.
            $char=$match[1];
            $rest=$match[2];
            $outstr.=$char;
        }elseif(1==preg_match('@^(.)(.*)$@s',$instr,$match)){
            // Stray high byte: translate it through the CP1252/Latin-1 table.
            $char=$match[1];
            $rest=$match[2];
            $outstr.=$byte_map[$char];
        }
        $instr=$rest;
    }
    return $outstr;
}
// Globals used by fix_latin(); these must run at file scope before the first call.
$byte_map=array();
init_byte_map();
$ascii_char='[\x00-\x7F]';
$cont_byte='[\x80-\xBF]';
$utf8_2='[\xC0-\xDF]'.$cont_byte;
$utf8_3='[\xE0-\xEF]'.$cont_byte.'{2}';
$utf8_4='[\xF0-\xF7]'.$cont_byte.'{3}';
$utf8_5='[\xF8-\xFB]'.$cont_byte.'{4}';
$nibble_good_chars = "@^($ascii_char+|$utf8_2|$utf8_3|$utf8_4|$utf8_5)(.*)$@s";
?>
Then just call fix_latin wherever you need it.
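To make the behaviour concrete, here is a minimal, self-contained sketch of the same idea applied to a string that mixes Latin-1, CP1252 and UTF-8 bytes. It is not the code from the post: the name fix_latin_sketch() is hypothetical, it does the work in one preg_replace_callback() pass instead of a loop, and it assumes mbstring's 'Windows-1252' converter is an acceptable stand-in for the hand-written byte table (bytes that CP1252 leaves undefined may come out differently).

```php
<?php
// Hypothetical single-pass variant of the technique above; not the original function.
function fix_latin_sketch(string $in): string
{
    $cont = '[\x80-\xBF]';
    // Sequences shaped like UTF-8 are tried first; any other byte falls through to '.'.
    $pattern = '@[\x00-\x7F]+'
             . '|[\xC0-\xDF]' . $cont
             . '|[\xE0-\xEF]' . $cont . '{2}'
             . '|[\xF0-\xF7]' . $cont . '{3}'
             . '|.@s';
    $out = preg_replace_callback($pattern, function (array $m): string {
        $chunk = $m[0];
        // ASCII runs and multi-byte sequences are kept exactly as they are.
        if (strlen($chunk) > 1 || ord($chunk) < 0x80) {
            return $chunk;
        }
        // A lone high byte: assume it is CP1252 and transcode it (assumption, see above).
        return mb_convert_encoding($chunk, 'UTF-8', 'Windows-1252');
    }, $in);
    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $out ?? $in;
}

// Latin-1 e-acute, CP1252 curly quotes, and an already-correct UTF-8 i-diaeresis.
$mixed = "caf\xE9 \x93ok\x94 na\xC3\xAFve";
$fixed = fix_latin_sketch($mixed);
var_dump(mb_check_encoding($mixed, 'UTF-8')); // the raw input is not valid UTF-8
var_dump(mb_check_encoding($fixed, 'UTF-8')); // the repaired string is
```

The stray bytes are converted while the existing UTF-8 sequence passes through untouched, which is the property that a plain whole-string conversion from ISO-8859-1 would not give you.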
© 2023 Java-samples.com