Learning internationalized number formatting in PHP


You may already know that in Western countries large numbers are conventionally written in groups of three digits separated by commas. For example, 12345678 in the standard format is written 12,345,678. Chinese, however, has no such separator in everyday writing, and some regions separate the groups with spaces instead. The code below makes these differences visible at a glance. We touched on this in an earlier article, "Learn the internationalization function in PHP to view currency and date information"; today we study it in detail. As for why numbers and currencies need formatting at all, we will explain that step by step in the article.

Standard number format

First, let’s look at the standard number format we introduced at the beginning.

$localeArr = ['en_US', 'zh_CN', 'ja_JP', 'de_DE', 'fr_FR', 'ar-IQ', 'ru_RU'];

foreach ($localeArr as $locale) {
    $fmt = new NumberFormatter($locale, NumberFormatter::DECIMAL);
    echo $locale . ':', $fmt->format(1234567.891234567890000), PHP_EOL;
}
// en_US:1,234,567.891
// zh_CN:1,234,567.891
// ja_JP:1,234,567.891
// de_DE:1.234.567,891
// fr_FR:1 234 567,891
// ar-IQ:١٬٢٣٤٬٥٦٧٫٨٩١
// ru_RU:1 234 567,891

We first define a list of locale codes, then loop over them and create a NumberFormatter object for each one. The second constructor parameter is the formatting style; here we pass NumberFormatter::DECIMAL for plain numbers. The format() method then formats the given number and we print the result. As the output shows, Germany uses a period as the grouping separator and a comma as the decimal point, while France and Russia use a space for grouping and a comma as the decimal point. The en_US, zh_CN, and ja_JP locales all follow the familiar English-style layout, and ar-IQ produces the same structure written in Arabic-Indic digits.

The standard number format is very useful in financial and banking projects. Remittance forms usually ask for the amount in both ordinary digits and Chinese capital numerals, and the finance departments of enterprises and foreign-related companies also need numbers in this standard format for their records. Since we are talking about finance, let's look at currency formatting next.

Currency format

foreach ($localeArr as $locale) {
    $fmt = new NumberFormatter($locale, NumberFormatter::CURRENCY);
    echo $locale . ':', $fmt->format(1234567.891234567890000), PHP_EOL;
    echo $locale . ':', $fmt->formatCurrency(1234567.891234567890000, 'RUR'), PHP_EOL;
}
// en_US:$1,234,567.89
// en_US:RUR 1,234,567.89
// zh_CN:¥1,234,567.89
// zh_CN:RUR 1,234,567.89
// ja_JP:¥1,234,568
// ja_JP:RUR 1,234,567.89
// de_DE:1.234.567,89 €
// de_DE:1.234.567,89 RUR
// fr_FR:1 234 567,89 €
// fr_FR:1 234 567,89 RUR
// ar-IQ:١٬٢٣٤٬٥٦٨ د.ع.‏
// ar-IQ:١٬٢٣٤٬٥٦٧٫٨٩ RUR
// ru_RU:1 234 567,89 ₽
// ru_RU:1 234 567,89 р.

This code produces two kinds of output. The first comes from creating the NumberFormatter with NumberFormatter::CURRENCY as the second parameter, which selects the currency style. In effect, the locale's currency symbol is added before or after the number in standard format. For example, the ¥ used in China and Japan is placed in front of the amount, while in the German and French locales the euro sign follows it. Note also that the ja_JP and ar-IQ results are rounded to whole units, because those currencies are formatted without decimal places.

The second comes from formatCurrency(), which takes an explicit currency code. If the currency is not native to the locale, the currency code itself is printed. In the test code we pass RUR, the old Russian ruble, and other locales simply print RUR. With the Russian locale, the output uses the old ruble's own symbol (the current ruble uses ₽, the old ruble used р.).
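As a small sketch of the difference between the two calls, the following formats one amount in a single locale with several ISO 4217 codes. The amount and the list of codes are assumptions chosen for the demo, and the exact symbols printed for foreign currencies depend on the ICU version bundled with your PHP build, so only the en_US default is annotated.

```php
<?php
// Requires the intl extension.
$amount = 1234.5;                       // illustrative amount
$codes  = ['USD', 'EUR', 'JPY', 'CNY']; // ISO 4217 codes picked for the demo

$fmt = new NumberFormatter('en_US', NumberFormatter::CURRENCY);

// format() uses the locale's own default currency (USD for en_US).
$default = $fmt->format($amount);
echo 'default: ', $default, PHP_EOL; // default: $1,234.50

// formatCurrency() keeps the locale's layout but swaps in the given currency.
$results = [];
foreach ($codes as $code) {
    $results[$code] = $fmt->formatCurrency($amount, $code);
    echo $code, ': ', $results[$code], PHP_EOL;
}
```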

More locale formatting styles

Impressed already? Those two styles were only the appetizers. The really interesting ones are coming up now.

$fmt = new NumberFormatter('zh_CN', NumberFormatter::PERCENT);
echo $fmt->format(1234567.891234567890000), PHP_EOL; // 123 456 789 %

$fmt = new NumberFormatter('zh_CN', NumberFormatter::SCIENTIFIC);
echo $fmt->format(1234567.891234567890000), PHP_EOL; // 1,2345678912345679E6

$fmt = new NumberFormatter('zh_CN', NumberFormatter::SPELLOUT);
echo $fmt->format(1234567.891234567890000), PHP_EOL; // 一百二十三万四千五百六十七点八九一二三四五六七九

$fmt = new NumberFormatter('zh_CN', NumberFormatter::SPELLOUT);
echo $fmt->format(1234502.891234567890000), PHP_EOL; // 一百二十三万四千五百〇二点八九一二三四五六七九

$fmt = new NumberFormatter('zh_CN', NumberFormatter::ORDINAL);
echo $fmt->format(1234567.891234567890000), PHP_EOL; // 第1,234,568

$fmt = new NumberFormatter('zh_CN', NumberFormatter::DURATION);
echo $fmt->format(1234567.891234567890000), PHP_EOL; // 1,234,568

PERCENT needs little explanation: the value is multiplied by 100 and a percent sign is appended. Note that the result above is not in the comma-grouped standard format; the digit groups are separated by spaces. SCIENTIFIC is the familiar scientific notation: the result in the test code is a mantissa starting with 1 followed by the exponent E6, that is, times 10 to the power 6.

SPELLOUT is more powerful: it writes the number out in words according to the rules of the locale's language. Yes, with zh_CN it is converted directly into Chinese numerals. If you need Chinese capital (financial) numerals, you can get them with a simple character replacement. This is definitely the major discovery of this article. I was once asked in a job interview how to convert numbers into Chinese, because many financial systems need this feature: whether for accounting or invoice processing, the system has to output both capital and ordinary Chinese numerals automatically. Back then it took me quite a while to write the algorithm. If you write it yourself, besides getting the units right, the handling of zero is a very important point. Interested readers can try it on their own. But the next time someone asks this in an interview, I will simply pull out NumberFormatter::SPELLOUT.

ORDINAL expresses position in a sequence. In Chinese, this simply means putting the character 第 in front of the number. DURATION formats according to duration rules. Both round the number to an integer and drop the fractional part.
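The character replacement mentioned above could look roughly like the following. This is only a sketch: toFinancialNumerals() is a hypothetical helper, not part of the intl extension, and it only swaps numerals and unit characters. Real invoices usually also need currency units such as 元, 角, and 分, which are not handled here.

```php
<?php
/**
 * Replace everyday Chinese numerals with their capital (financial) forms.
 * Hypothetical helper for illustration; 万 and 亿 have no separate capital form.
 */
function toFinancialNumerals(string $spelled): string
{
    return strtr($spelled, [
        '〇' => '零', '一' => '壹', '二' => '贰', '三' => '叁', '四' => '肆',
        '五' => '伍', '六' => '陆', '七' => '柒', '八' => '捌', '九' => '玖',
        '十' => '拾', '百' => '佰', '千' => '仟',
    ]);
}

// Pure string mapping, no intl needed:
echo toFinancialNumerals('一百二十三万四千五百六十七'), PHP_EOL;
// 壹佰贰拾叁万肆仟伍佰陆拾柒

// Combined with SPELLOUT when the intl extension is available:
if (extension_loaded('intl')) {
    $fmt = new NumberFormatter('zh_CN', NumberFormatter::SPELLOUT);
    echo toFinancialNumerals($fmt->format(1234567)), PHP_EOL;
}
```

Using strtr() with an array replaces every key in a single pass, so a character that has already been converted is never converted a second time.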

Formatting rule settings

Even with so many built-in styles, business requirements are always a little unusual. Can we define our own format rules? Since we are asking, of course the answer is yes.

var_dump($fmt->getPattern()); // string(8) "#,##0.##"
$fmt->setPattern("#0.# kg");
var_dump($fmt->getPattern()); // string(6) "0.# kg"
echo $fmt->format(1234567.891234567890000), PHP_EOL; // 1234567.9 kg

See? We use setPattern() to define a rule that ends in kg, meaning we want a format that represents weight. The rule keeps at most one decimal place and uses no grouping separator. From then on, format() formats numbers according to the pattern we specified.
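For comparison, here is a sketch of two other patterns in the same syntax. The en_US locale and the sample values are assumptions chosen so the result is easy to read: one pattern keeps grouping and always shows two decimals, the other adds a separate sub-pattern for negative numbers after a semicolon.

```php
<?php
// Requires the intl extension.
$fmt = new NumberFormatter('en_US', NumberFormatter::DECIMAL);

// '#,##0.00' : group by thousands and always show exactly two decimals.
// ' kg'      : literal suffix appended to every formatted number.
$fmt->setPattern('#,##0.00 kg');
$weight = $fmt->format(1234.5);
echo $weight, PHP_EOL; // 1,234.50 kg

// The part after ';' is used for negative numbers (accounting style).
$fmt->setPattern('#,##0.00;(#,##0.00)');
$loss = $fmt->format(-1234.5);
echo $loss, PHP_EOL; // (1,234.50)
```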

Working with attributes

Of course, in addition to directly setting the rule format, we can also specify some attribute values to change the current format effect.

$fmt = new NumberFormatter( 'zh_CN', NumberFormatter::DECIMAL );
echo "Digits: ".$fmt->getAttribute(NumberFormatter::MAX_FRACTION_DIGITS), PHP_EOL; // Digits: 3
echo $fmt->format(1234567.891234567890000), PHP_EOL; // 1,234,567.891

$fmt->setAttribute(NumberFormatter::MAX_FRACTION_DIGITS, 2);
echo "Digits: ".$fmt->getAttribute(NumberFormatter::MAX_FRACTION_DIGITS), PHP_EOL; // Digits: 2
echo $fmt->format(1234567.891234567890000), PHP_EOL; // 1,234,567.89

In this code we use setAttribute() to set NumberFormatter::MAX_FRACTION_DIGITS, which changes the maximum number of decimal places kept. This is not the only attribute, of course; many others can be modified, and you can look them up in the official manual.
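A few of those other attributes, sketched under the assumption of an en_US formatter and sample values picked for the demo: MIN_FRACTION_DIGITS pads with zeros, GROUPING_USED switches the grouping separator off, and ROUNDING_MODE controls how surplus digits are cut.

```php
<?php
// Requires the intl extension.
$fmt = new NumberFormatter('en_US', NumberFormatter::DECIMAL);

// Always show exactly two decimal places.
$fmt->setAttribute(NumberFormatter::MIN_FRACTION_DIGITS, 2);
$fmt->setAttribute(NumberFormatter::MAX_FRACTION_DIGITS, 2);
$padded = $fmt->format(1234.5);
echo $padded, PHP_EOL; // 1,234.50

// Turn the grouping separator off.
$fmt->setAttribute(NumberFormatter::GROUPING_USED, 0);
$ungrouped = $fmt->format(1234.5);
echo $ungrouped, PHP_EOL; // 1234.50

// Truncate instead of rounding to the nearest value.
$fmt->setAttribute(NumberFormatter::ROUNDING_MODE, NumberFormatter::ROUND_DOWN);
$truncated = $fmt->format(1.239);
echo $truncated, PHP_EOL; // 1.23
```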

Separator settings

Similarly, we can directly change the symbols used in formatting, such as the grouping separator and the decimal point, with the setSymbol() method.

var_dump($fmt->getSymbol(NumberFormatter::GROUPING_SEPARATOR_SYMBOL)); // string(1) ","
$fmt->setSymbol(NumberFormatter::GROUPING_SEPARATOR_SYMBOL, "*");
var_dump($fmt->getSymbol(NumberFormatter::GROUPING_SEPARATOR_SYMBOL)); // string(1) "*"
echo $fmt->format(1234567.891234567890000), PHP_EOL; // 1*234*567.891

Text property settings associated with locale formatting

We can also directly set some of the text associated with locale formatting. In the code below, for example, setTextAttribute() changes the negative sign. The same method can modify the padding character, the currency code, and other values. You can experiment with it while consulting the official documentation.

var_dump($fmt->getTextAttribute(NumberFormatter::NEGATIVE_PREFIX)); // string(1) "-"
echo $fmt->format(-1234567.891234567890000), PHP_EOL;
$fmt->setTextAttribute(NumberFormatter::NEGATIVE_PREFIX, "minus sign");
var_dump($fmt->getTextAttribute(NumberFormatter::NEGATIVE_PREFIX)); // string(10) "minus sign"
echo $fmt->format(-1234567.891234567890000), PHP_EOL; // minus sign1,234,567.891

Getting locale information

These two calls simply retrieve the formatter's locale information, which we have also covered in earlier articles. Locale::VALID_LOCALE returns the valid locale and Locale::ACTUAL_LOCALE returns the actual locale.

var_dump($fmt->getLocale(Locale::VALID_LOCALE)); // string(10) "zh_Hans_CN"
var_dump($fmt->getLocale(Locale::ACTUAL_LOCALE)); // string(10) "zh_Hans_CN"

Parsing strings into numbers and currency

Formatting turns numbers into strings, because separators are added to the output. So can we convert a formatted number string back into a numeric type?

$fmt = new NumberFormatter( 'zh_CN', NumberFormatter::DECIMAL );
$num = "1,234,567.891";
echo $fmt->parse($num)."\n"; // 1234567.891
echo $fmt->parse($num, NumberFormatter::TYPE_INT32)."\n"; // 1234567

$fmt = new NumberFormatter( 'zh_CN', NumberFormatter::CURRENCY );
echo $fmt->parseCurrency('¥1,234,567.89', $currency), PHP_EOL; // 1234567.89
var_dump($currency); // string(3) "CNY"

There are two methods. The first is parse(), which converts a number string in standard format back into a number of the specified type: NumberFormatter::TYPE_INT32, TYPE_INT64, TYPE_DOUBLE, TYPE_CURRENCY, and so on. The second is parseCurrency(). As the name suggests, it converts a currency string back into a number and, importantly, its second parameter is passed by reference and receives the ISO code of the currency. The CNY returned in the test code, for example, stands for the renminbi we use.
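Parsing follows the locale's own conventions, just as formatting does. The sketch below assumes a de_DE formatter and made-up input strings; it reads a German-style number and shows that parse() returns false when the input cannot be read as a number.

```php
<?php
// Requires the intl extension.
$fmt = new NumberFormatter('de_DE', NumberFormatter::DECIMAL);

// In de_DE the period groups digits and the comma is the decimal point.
$value = $fmt->parse('1.234.567,891');
var_dump($value); // float(1234567.891)

// Ask for a 32-bit integer instead of the default double.
$int = $fmt->parse('1.234.567,891', NumberFormatter::TYPE_INT32);
var_dump($int); // int(1234567)

// Input that is not a number yields false, so compare with ===.
$bad = $fmt->parse('abc');
var_dump($bad); // bool(false)
```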

Error messages

Finally, let's look at how to get error information from NumberFormatter.

echo $fmt->parseCurrency('1,234,567.89', $currency), PHP_EOL;
var_dump($fmt->getErrorCode()); // int(9)
var_dump(intl_is_failure($fmt->getErrorCode())); // bool(true)
var_dump($fmt->getErrorMessage()); // string(36) "Number parsing failed: U_PARSE_ERROR"

Here we pass a non-standard currency string to parseCurrency(). parseCurrency() requires input that includes a currency symbol, so an error occurs. We can call getErrorCode() to get the error code and getErrorMessage() to get the error message. The intl_is_failure() function tells us, based on the error code, whether an intl error has occurred.
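In application code these checks are often wrapped once so that callers cannot forget them. The following is one possible sketch; parseMoney() is a hypothetical helper name, and the en_US locale and input strings are assumptions for the demo.

```php
<?php
/**
 * Parse a currency string or throw. Hypothetical helper for illustration.
 *
 * @return array{0: float, 1: string} the amount and the ISO 4217 code
 */
function parseMoney(NumberFormatter $fmt, string $text): array
{
    $currency = '';
    $amount = $fmt->parseCurrency($text, $currency);
    if ($amount === false) {
        throw new InvalidArgumentException(sprintf(
            'Cannot parse "%s": %s (code %d)',
            $text,
            $fmt->getErrorMessage(),
            $fmt->getErrorCode()
        ));
    }
    return [$amount, $currency];
}

$fmt = new NumberFormatter('en_US', NumberFormatter::CURRENCY);

[$amount, $currency] = parseMoney($fmt, '$1,234.56');
echo $amount, ' ', $currency, PHP_EOL; // 1234.56 USD

try {
    // No currency symbol, so this is expected to fail like the example above.
    parseMoney($fmt, '1,234.56');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```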


This was another eye-opening study session. I honestly had no idea before that numbers could be converted to Chinese numerals this way, and parsing currency strings can also be put to use in scraping programs, for example when collecting and analyzing prices from e-commerce pages. All in all, it was a rewarding session. In addition, NumberFormatter also offers a procedural API, such as numfmt_create(). Remember that these functions start with numfmt_, and do not confuse them with number_format().
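To make that last distinction concrete, here is a short sketch that puts the procedural intl functions next to the built-in number_format(). The en_US locale and the sample value are assumptions for the demo.

```php
<?php
// Procedural intl API: same behavior as the NumberFormatter object, locale-aware.
$fmt  = numfmt_create('en_US', NumberFormatter::DECIMAL);
$intl = numfmt_format($fmt, 1234567.891);
echo $intl, PHP_EOL; // 1,234,567.891

// number_format() is a core PHP function and knows nothing about locales:
// you pass the number of decimals and, optionally, the separators yourself.
$plain = number_format(1234567.891, 2);
echo $plain, PHP_EOL; // 1,234,567.89

$german = number_format(1234567.891, 2, ',', '.');
echo $german, PHP_EOL; // 1.234.567,89
```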

Test code:

https://github.com/zhangyue0503/dev-blog/blob/master/php/202011/source/4. Learn PHP international digital format processing.php
