Learning the PHP rar extension for reading and decompressing archives

Time:2021-11-29

Continuing our study of compression and decompression extensions: RAR and ZIP are the two dominant archive formats, and RAR archives are especially common on Windows. The PHP extension we look at today works with RAR archives. Note that the rar extension can only read and decompress RAR archives; it cannot create them.

The rar package published on PECL is outdated and cannot be used with PHP 7. To install the extension in a PHP 7 environment, we need to compile it from its source code on GitHub.

https://github.com/cataphract/php-rar

After cloning the repository with git clone, build and install it like any other PHP extension.
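Once the extension is built and enabled in php.ini, a quick check from PHP itself can confirm that it was picked up. The following is a minimal sketch; rarExtensionStatus() is a hypothetical helper name, and only the standard extension_loaded() and phpversion() functions are used.

```php
<?php
// Report whether the rar extension is available in the running PHP build.
// rarExtensionStatus() is a hypothetical helper, not part of the extension.
function rarExtensionStatus(): string
{
    if (!extension_loaded('rar')) {
        return 'rar extension not loaded';
    }
    // phpversion() returns the extension's version string, or false if unknown.
    $version = phpversion('rar');
    return 'rar extension loaded' . ($version !== false ? " (version $version)" : '');
}

echo rarExtensionStatus(), PHP_EOL;
```

If the message says the extension is not loaded, the usual suspects are a missing extension=rar line in php.ini or a build made against a different PHP version.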

Getting the archive handle: RarArchive

$arch = RarArchive::open("test.rar");

$archNo = rar_open("test.rar");

echo $arch, PHP_EOL; // RAR Archive "/data/www/blog/test.rar"
echo $archNo, PHP_EOL; // RAR Archive "/data/www/blog/test.rar"

$arch->close();
rar_close($archNo);

echo $arch, PHP_EOL; // RAR Archive "/data/www/blog/test.rar" (closed)
echo $archNo, PHP_EOL; // RAR Archive "/data/www/blog/test.rar" (closed)

The rar extension can be used in two styles. One is object-oriented, using the RarArchive class to operate on the archive. The other is procedural, calling the rar_open() function to get a handle to a RAR file. The handle implements the __toString() method, so we can print it directly and see which file the current handle refers to.

After we close the handle, it can still be printed, but "(closed)" is appended to the output. At that point the handle can no longer be used for any other operation.

$arch = RarArchive::open("test.rar");
$archNo = rar_open("test.rar");

echo $arch->getComment(), PHP_EOL;
echo $arch->isBroken(), PHP_EOL;
echo $arch->isSolid(), PHP_EOL;

echo rar_comment_get($archNo), PHP_EOL;
echo rar_broken_is($archNo), PHP_EOL;
echo rar_solid_is($archNo), PHP_EOL;

echo $arch->setAllowBroken(true), PHP_EOL;
echo rar_allow_broken_set($archNo, true), PHP_EOL;

Some methods of the RarArchive object give us information about the current archive. For example, getComment() returns the archive comment, isBroken() reports whether the archive is damaged, and isSolid() reports whether it is a solid archive, that is, one in which the files are compressed together as a single continuous stream. The setAllowBroken() method lets us operate on damaged archives. The code above shows both the object-oriented and the procedural form of each call.

Working with each file or directory in the archive: RarEntry

Once we have the archive handle, the next step is to get at the contents of the archive. The handle gives access to a RarEntry object for every file and directory inside the archive.

$gameEntry = $arch->getEntry('ldxlcs/ldxlcs/game.htm');
echo $gameEntry->getName(), PHP_EOL; // ldxlcs/ldxlcs/game.htm
echo $gameEntry->getUnpackedSize(), PHP_EOL; // 56063

// Procedural form: pass the handle returned by rar_open()
$gameEntryNo = rar_entry_get($archNo, "ldxlcs/ldxlcs/game.htm");
echo $gameEntryNo->getName(), PHP_EOL; // ldxlcs/ldxlcs/game.htm
echo $gameEntryNo->getUnpackedSize(), PHP_EOL; // 56063

// Read the entry through a stream and output the entire file, chunk by chunk
$fp = $gameEntryNo->getStream();
while (!feof($fp)) {
    $buff = fread($fp, 8192);
    if ($buff === false) {
        // fread error: stop reading
        break;
    }
    echo $buff;
}
fclose($fp);
echo PHP_EOL;

echo 'Entry extract: ', $gameEntry->extract("./"), PHP_EOL;

The getEntry() method of the handle returns a single file or directory from the archive, so the path of the entry we want must be given exactly. It returns a RarEntry object, and the rest of the code above works with that object.

The getName() method of the RarEntry object returns the file name including its path, which is the full path inside the archive. The getUnpackedSize() method returns the uncompressed size of the file, and the getStream() method returns a stream for the file. With that stream we can read the file contents and print them directly.
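Because the value returned by getStream() is an ordinary PHP stream, it can also be copied to disk with the standard stream functions instead of being echoed. The sketch below assumes a hypothetical helper named saveStreamToFile(); it accepts any readable stream, so it can be tried without an archive at hand.

```php
<?php
/**
 * Copy a readable stream into a file and return the number of bytes written.
 * saveStreamToFile() is a hypothetical helper, not part of the rar extension.
 */
function saveStreamToFile($source, string $destPath): int
{
    $dest = fopen($destPath, 'wb');
    if ($dest === false) {
        throw new RuntimeException("Cannot open $destPath for writing");
    }
    // stream_copy_to_stream() reads from $source until EOF.
    $bytes = stream_copy_to_stream($source, $dest);
    fclose($dest);
    if ($bytes === false) {
        throw new RuntimeException("Copy to $destPath failed");
    }
    return $bytes;
}
```

With an entry in hand, the call would look like saveStreamToFile($gameEntry->getStream(), './game.htm'), followed by fclose() on the source stream.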

Most importantly, the extract() method extracts a file directly into a given directory. The rar extension does not provide a method that decompresses a whole archive in one call, so to decompress everything we have to loop over all the entries in the archive and extract them one by one.
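The loop described here could look roughly like the sketch below. extractAllEntries() is a hypothetical helper name; it relies only on RarArchive::open(), getEntries(), RarEntry::extract() and close(), and assumes the default mode in which failures are signalled by a false return value rather than by exceptions.

```php
<?php
/**
 * Extract every entry of a RAR archive into $targetDir.
 * Returns the names of the entries that could not be extracted.
 * extractAllEntries() is a hypothetical helper, not part of the extension.
 */
function extractAllEntries(string $rarPath, string $targetDir): array
{
    if (!extension_loaded('rar')) {
        throw new RuntimeException('The rar extension is not loaded.');
    }
    $arch = RarArchive::open($rarPath);
    if ($arch === false) {
        throw new RuntimeException("Cannot open archive: $rarPath");
    }
    $entries = $arch->getEntries();
    if ($entries === false) {
        $arch->close();
        throw new RuntimeException("Cannot list entries: $rarPath");
    }
    $failed = [];
    foreach ($entries as $entry) {
        // extract() returns false when a single entry fails; keep going.
        if (!$entry->extract($targetDir)) {
            $failed[] = $entry->getName();
        }
    }
    $arch->close();
    return $failed;
}
```

A call such as extractAllEntries('test.rar', './out') would then return an empty array when every entry was extracted. Collecting failures instead of stopping at the first one is a design choice made for this sketch; stopping early may suit other cases better.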

Finally, let’s take a look at how to traverse all the contents of the compressed package.

$entries = $arch->getEntries();

foreach ($entries as $en) {
    echo $en, PHP_EOL;
    echo $en->getName(), PHP_EOL;
    echo $en->getUnpackedSize(), PHP_EOL;
    echo $en->getAttr(), PHP_EOL;
    echo $en->getCrc(), PHP_EOL;
    echo $en->getFileTime(), PHP_EOL;
    echo $en->getHostOs(), PHP_EOL;
    echo $en->getMethod(), PHP_EOL;
    echo $en->getPackedSize(), PHP_EOL;
    echo $en->getVersion(), PHP_EOL;
    echo $en->isDirectory(), PHP_EOL;
    echo $en->isEncrypted(), PHP_EOL;

}

//Contents of all files in the compressed package
// RarEntry for file "ldxlcs/ldxlcs/game.htm" (3c19abf6)
// ldxlcs/ldxlcs/game.htm
// 56063
// 32
// 3c19abf6
// 2017-09-10 13:25:04
// 2
// 51
// 7049
// 200
// ……

$entriesNo = rar_list($archNo);
foreach ($entriesNo as $en) {
    echo $en->getName(), PHP_EOL;
}

Here we call the getEntries() method of the RarArchive object directly. It returns an array of RarEntry objects covering everything in the RAR archive. The code also prints the results of several other RarEntry methods. Their names give a rough idea of what they do: each returns a piece of information about the file, and you can try them out yourself.
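As one small illustration of combining these values, the packed and unpacked sizes can be turned into a compression percentage. This is only a sketch; packedPercent() is a hypothetical helper and the numbers are the ones from the sample output above.

```php
<?php
/**
 * Packed size as a percentage of the unpacked size (hypothetical helper).
 * Returns null for empty entries such as directories, to avoid dividing by zero.
 */
function packedPercent(int $packedSize, int $unpackedSize): ?float
{
    if ($unpackedSize <= 0) {
        return null;
    }
    return round($packedSize / $unpackedSize * 100, 2);
}

// Values taken from the sample output for game.htm.
echo packedPercent(7049, 56063), PHP_EOL; // 12.57
```

Inside the foreach loop the call would be packedPercent($en->getPackedSize(), $en->getUnpackedSize()).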

Exception handling

Finally, if you open a file that does not exist or request an entry that is not in the archive, the rar extension by default reports the problem as a PHP error. Since it offers a complete object-oriented API, it also provides an object-oriented exception handling mechanism.

// If usingExceptions is not enabled, all errors go through the PHP error mechanism. Once enabled, they are thrown as exceptions
RarException::setUsingExceptions(true);
var_dump(RarException::isUsingExceptions()); // bool(true)
try {
    $arch = RarArchive::open("test1.rar");
    $arch->getEntry('ttt.txt');
} catch (RarException $e) {
    var_dump($e);
    // object(RarException)#35 (7) {
    //     ["message":protected]=>
    //     string(91) "unRAR internal error: Failed to open /data/www/blog/test1.rar: ERAR_EOPEN (file open error)"
    //     ["string":"Exception":private]=>
    //     string(0) ""
    //     ["code":protected]=>
    //     int(15)
    //     ["file":protected]=>
    //     string(22) "/data/www/blog/rar.php"
    //     ["line":protected]=>
    //     int(93)
    //     ["trace":"Exception":private]=>
    //     array(1) {
    //       [0]=>
    //       array(6) {
    //         ["file"]=>
    //         string(22) "/data/www/blog/rar.php"
    //         ["line"]=>
    //         int(93)
    //         ["function"]=>
    //         string(4) "open"
    //         ["class"]=>
    //         string(10) "RarArchive"
    //         ["type"]=>
    //         string(2) "::"
    //         ["args"]=>
    //         array(1) {
    //           [0]=>
    //           string(9) "test1.rar"
    //         }
    //       }
    //     }
    //     ["previous":"Exception":private]=>
    //     NULL
    //   }
}

Calling RarException::setUsingExceptions(true) turns on the exception mechanism of the rar extension. From then on, opening a file that does not exist or requesting a path that is not in the archive throws an exception carrying the error information.
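Because setUsingExceptions() is a global switch, code that enables it only for one operation may want to restore the previous value afterwards. The following is a minimal sketch of that idea; tryGetEntryName() is a hypothetical helper, and logging through error_log() is an assumption made for the example.

```php
<?php
/**
 * Return the name of an entry, or null if the archive or entry is unavailable.
 * tryGetEntryName() is a hypothetical helper, not part of the extension.
 */
function tryGetEntryName(string $rarPath, string $entryPath): ?string
{
    if (!extension_loaded('rar')) {
        return null;
    }
    // Remember the current mode so it can be restored in the finally block.
    $previous = RarException::isUsingExceptions();
    RarException::setUsingExceptions(true);
    try {
        $arch = RarArchive::open($rarPath);
        $name = $arch->getEntry($entryPath)->getName();
        $arch->close();
        return $name;
    } catch (RarException $e) {
        error_log('RAR error ' . $e->getCode() . ': ' . $e->getMessage());
        return null;
    } finally {
        RarException::setUsingExceptions($previous);
    }
}
```

For example, tryGetEntryName('test1.rar', 'ttt.txt') returns null when test1.rar does not exist, and the rest of the application keeps whatever error mode it had before the call.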

Summary

Is this extension pleasant to use? It offers both an object-oriented API and a procedural API based on functions. That is not purely a benefit, though: supporting both the old style and the new one makes the extension's internal implementation considerably more complex. In our own code we should try not to maintain two styles side by side, and when refactoring we can migrate to the newer style step by step.

There is not much more to say about working with RAR archives. If we need to generate archives in a production environment, in most cases we simply produce ZIP files for users, since most client software can decompress both RAR and ZIP. If RAR output is specifically required, it is worth discussing with the product manager or the customer. Sometimes a technical difficulty can be solved by being flexible on the business side. The most important thing is communication.

Test code:

https://github.com/zhangyue0503/dev-blog/blob/master/php/202007/source/PHP%E7%9A%84rar%E8%A7%A3%E5%8E%8B%E8%AF%BB%E5%8F%96%E6%89%A9%E5%B1%95%E5%8C%85%E5%AD%A6%E4%B9%A0.php

Reference documents:
https://www.php.net/manual/zh/book.rar.php
