Character encoding conversion with the PHP iconv() function

Time:2020-6-29

In PHP, the iconv library converts text between character sets and is an indispensable basic library in PHP programming. However, iconv sometimes fails on part of the data for no obvious reason. For example, converting the character ‘-’ to GB2312 produces an error.

Let’s take a look at the usage of this function.

The simplest use is converting GB2312 to UTF-8:


$text=iconv("GB2312","UTF-8",$text);

When using $text=iconv("UTF-8","GB2312",$text), the conversion breaks off if it encounters certain special characters, such as “-” or the “.” in English names. The text after these characters is not converted.

To solve this problem, the following code can be used:


$text=iconv("UTF-8","GBK",$text);

You read that right, it really is that simple: write GBK instead of GB2312 and the conversion works.

Another way is to append //IGNORE to the second parameter so that conversion errors are ignored:


iconv("UTF-8","GB2312//IGNORE",$data);

The two methods have not been compared in detail here, but the first (GBK instead of GB2312) is preferable: GBK is a superset of GB2312, so the characters are converted rather than discarded.
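The difference between the two methods can be seen with a small test. This is only a sketch: the sample string is an assumption chosen for illustration (a three-character Chinese name whose middle character, U+9555, exists in GBK but not in GB2312), and what the strict and //IGNORE calls return depends on the PHP version and the iconv implementation underneath.

```php
<?php
// The sample text is written with \u{...} escapes so the example does not
// depend on the encoding this file is saved in. U+9555 is in GBK only.
$name = "\u{6731}\u{9555}\u{57FA}";

// @ silences the notice iconv raises on a character it cannot convert.
$strict = @iconv("UTF-8", "GB2312", $name);         // false, or a truncated string
$ignore = @iconv("UTF-8", "GB2312//IGNORE", $name); // character dropped, or false on some builds
$gbk    = iconv("UTF-8", "GBK", $name);             // all three characters kept

// Converting back to UTF-8 shows whether anything was lost on the way.
$roundTrip = iconv("GBK", "UTF-8", $gbk);

var_dump($strict, $ignore);
var_dump($roundTrip === $name);
```

Only the GBK conversion is expected to survive the round trip unchanged; with //IGNORE the result, if any, is missing the middle character.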

Description of iconv() in PHP manual:

iconv

(PHP 4 >= 4.0.5, PHP 5)
iconv – Convert string to requested character encoding
Description
string iconv ( string in_charset, string out_charset, string str )
Performs a character set conversion on the string str from in_charset to out_charset. Returns the converted string or FALSE on failure.
If you append the string //TRANSLIT to out_charset transliteration is activated. This means that when a character can’t be represented in the target charset, it can be approximated through one or several similarly looking characters. If you append the string //IGNORE, characters that cannot be represented in the target charset are silently discarded. Otherwise, str is cut from the first illegal character.

When using this function for string encoding conversion, it should be noted that if UTF-8 is converted to GB2312, the string may be truncated. In this case, the following methods can be used to solve the problem:


$str=iconv('utf-8',"gb2312//TRANSLIT",file_get_contents($filepath));

Appending //TRANSLIT to the second parameter means that if a character from the source encoding has no match in the target encoding, a similar character is substituted. You can also use //IGNORE here to skip characters that cannot be converted.

//IGNORE means that conversion errors are ignored. Without it, everything after the offending character is lost.
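The suffixes can be combined into a simple fallback chain. The helper below is a hypothetical sketch, not part of PHP: convert_with_fallback() is a made-up name, and the code assumes a PHP version in which a failed iconv() call returns false (on setups that truncate instead, as in the manual excerpt above, the strict attempt would not be detected as a failure).

```php
<?php
/**
 * Hypothetical helper: try a strict conversion first, then //TRANSLIT,
 * then //IGNORE, and report which variant produced a result.
 */
function convert_with_fallback(string $text, string $from, string $to): ?array
{
    foreach (["", "//TRANSLIT", "//IGNORE"] as $suffix) {
        // @ silences the notice raised when a character cannot be converted.
        $converted = @iconv($from, $to . $suffix, $text);
        if ($converted !== false) {
            return ["suffix" => $suffix, "text" => $converted];
        }
    }
    return null; // no variant succeeded
}

// "PHP" followed by two Chinese characters that GB2312 can represent.
$result = convert_with_fallback("PHP \u{4E2D}\u{6587}", "UTF-8", "GB2312");
var_dump($result !== null ? $result["suffix"] : null);
```

Returning the suffix along with the text lets the caller log when a lossy variant had to be used.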

In older PHP versions, such as the PHP 4 setup described below, iconv is neither a built-in function nor a module installed by default. It has to be installed before use.

On Windows 2000 + PHP, if you can modify the php.ini file, enable the line extension=php_iconv.dll. You also need to copy iconv.dll from your original PHP installation into winnt/system32 (if your DLL path points to this directory). On Linux, with a static build, add the --with-iconv option when running configure. Afterwards, phpinfo() shows an iconv section. (Linux7.3+Apache4.06+php4.3.2)
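Besides reading the phpinfo() output, availability can be checked from code. A minimal sketch, where the variable name $hasIconv is only illustrative:

```php
<?php
// Report whether the iconv extension is usable in this PHP build.
$hasIconv = extension_loaded("iconv") && function_exists("iconv");

echo $hasIconv
    ? "iconv is available\n"
    : "iconv is missing; enable or install the extension\n";
```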

Introduction to the mb_convert_encoding and iconv functions

mb_convert_encoding is a function used for transcoding. For a long time I did not understand the concept of program encoding, but now it is starting to make sense. English text does not run into encoding problems; only data such as Chinese does. For example, when you write a program in Zend Studio or EditPlus, the file may be saved in GBK. If the data is then stored in a database whose encoding is utf8, you need to convert it first, otherwise it will be garbled in the database.

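A GBK to UTF-8 conversion:

<?php
// Tell the browser that the output is UTF-8.
header("Content-Type: text/html; charset=utf-8");
// Arguments: the string, the target encoding, the source encoding.
echo mb_convert_encoding("you are my friend", "UTF-8", "GBK");
?>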

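Another example, GB2312 to Big5:

<?php
// Tell the browser that the output is Big5.
header("Content-Type: text/html; charset=big5");
// Arguments: the string, the target encoding, the source encoding.
echo mb_convert_encoding("you are my friend", "Big5", "GB2312");
?>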

However, to use the function above, the mbstring extension must be enabled first.

string mb_convert_encoding ( string str, string to_encoding [, mixed from_encoding] )
You need to enable the mbstring extension first: in php.ini, remove the ; in front of ;extension=php_mbstring.dll. mb_convert_encoding can be given several candidate input encodings and will identify the right one from the content, but it runs much slower than iconv.

string iconv ( string in_charset, string out_charset, string str )
Note: besides naming the target encoding, the second parameter accepts two suffixes, //TRANSLIT and //IGNORE. //TRANSLIT replaces characters that cannot be converted directly with one or more similar characters, and //IGNORE skips characters that cannot be converted. The default behavior is to truncate at the first illegal character.

In general, use iconv. Use the mb_convert_encoding function only when the original encoding cannot be determined, or when the text does not display correctly after conversion with iconv.


$content = iconv("GBK", "UTF-8", $content);
$content = mb_convert_encoding($content, "UTF-8", "
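For the case where the original encoding is unknown, one possible approach is sketched below. It does not rely on automatic detection: it validates the input as UTF-8 with mb_check_encoding() and otherwise assumes GBK. The helper name to_utf8() and the assumption that the data is one of exactly these two encodings are illustrative, not taken from the article.

```php
<?php
/**
 * Hypothetical helper: return $content as UTF-8 when its source encoding
 * is unknown but assumed to be either UTF-8 or GBK.
 */
function to_utf8(string $content): string
{
    // Input that is already valid UTF-8 is returned untouched.
    if (mb_check_encoding($content, "UTF-8")) {
        return $content;
    }
    // Otherwise treat the bytes as GBK (an assumption about the data).
    return mb_convert_encoding($content, "UTF-8", "GBK");
}

$utf8 = "\u{4E2D}\u{6587}";                         // two Chinese characters
$gbk  = mb_convert_encoding($utf8, "GBK", "UTF-8"); // the same text as GBK bytes

var_dump(to_utf8($utf8) === $utf8);
var_dump(to_utf8($gbk) === $utf8);
```

Checking UTF-8 first is deliberate: GBK bytes rarely form valid UTF-8, while the reverse check is much less reliable.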

summary

The above is the whole content of this article. I hope it has some reference value for your study or work.