How to encrypt your Python code: a talk from PyCon China 2018

Time:2020-2-26

This article was published on the prodesire blog.

Preface

Last November, at the Hangzhou stop of PyCon China 2018, we gave a talk on Python source code encryption, explaining how to encrypt and decrypt Python code by modifying the Python interpreter. Because of the author's procrastination, the talk was not written up at the time. The procrastination has finally been defeated, and this article is the result.

This article first reviews the ideas, methods, advantages and disadvantages of existing source protection schemes, and then shows how to customize the Python interpreter to encrypt and decrypt source code more effectively.

Existing encryption schemes

Because Python is dynamic and open source, it is hard to protect Python code well. Some voices in the community hold that this limitation is simply a fact, and that commercial protection should come from legal means rather than from encrypting the source; others want some means of encryption anyway. As a result, people have come up with a variety of encryption and obfuscation schemes to protect source code.

Common source protection methods are:

  • Publishing .pyc files
  • Code obfuscation
  • Using py2exe
  • Using Cython

Let's briefly go through each of them.

Publishing .pyc files

Idea

As is well known, the Python interpreter first compiles a .py file into a .pyc file and then interprets the contents of the .pyc file. The interpreter can also execute a .pyc file directly. Since a .pyc file is binary, the source cannot be read from it directly. If we ship .pyc files instead of .py files to the customer environment, doesn't that protect the Python code?

Method

Compiling .py files into .pyc files is easy, and there is no need to run all the code once just to collect the generated .pyc files.

In fact, the Python standard library provides a module named compileall that does the compilation for us.

Run the following command to traverse all .py files under the <src> directory and compile them into .pyc files:

$ python -m compileall <src>

Then delete all .py files under the <src> directory, and the result can be packaged and published:

$ find <src> -name '*.py' -type f -print -exec rm {} \;
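The same two steps can also be scripted with the standard library. The following is a minimal sketch, assuming Python 3 (the commands above come from the Python 2 era); the function name publish_pyc is made up for illustration. One difference worth knowing: Python 3 writes compiled files into __pycache__ by default, so the sketch asks compileall for the legacy layout to keep each .pyc next to its source.

```python
import compileall
import pathlib


def publish_pyc(src_dir):
    """Compile every .py file under src_dir to .pyc, then delete the sources."""
    # legacy=True writes foo.pyc next to foo.py instead of into __pycache__,
    # so the compiled modules stay importable once the sources are gone.
    ok = compileall.compile_dir(src_dir, legacy=True, quiet=1)
    if not ok:
        raise RuntimeError("some files failed to compile; sources were kept")
    removed = []
    for py_file in sorted(pathlib.Path(src_dir).rglob("*.py")):
        py_file.unlink()
        removed.append(py_file.name)
    return removed
```

Calling publish_pyc on a build directory returns the names of the source files it removed, which is convenient for logging in a packaging script.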

Advantages

  • Simple and convenient, and it raises the bar for recovering the source
  • Good platform compatibility: wherever the .py file runs, the .pyc file runs too

Disadvantages

  • Poor interpreter compatibility: a .pyc file only runs on a specific version of the interpreter
  • Ready-made decompilers exist, so the cost of cracking is low

python-uncompyle6 is one such decompiler, and it works remarkably well.

Run the following command to decompile a .pyc file back into a .py file:

$ uncompyle6 *compiled-python-file-pyc-or-pyo*
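To see why decompilers have such an easy job, it helps to look at how little stands between a .pyc file and its code object. The sketch below is an illustration only, assuming CPython 3.7 or later, where the file starts with a 16-byte header; the helper name load_pyc_code is hypothetical.

```python
import marshal

# Assumption: CPython 3.7+ layout (magic number, flags, mtime or hash, size).
PYC_HEADER_SIZE = 16


def load_pyc_code(pyc_path):
    """Return the top-level code object stored in a .pyc file."""
    with open(pyc_path, "rb") as f:
        f.read(PYC_HEADER_SIZE)   # skip the header; it carries no program logic
        return marshal.load(f)    # the rest is a marshalled code object
```

The returned code object exposes names and constants through attributes such as co_names and co_consts, and it can be passed to dis.dis or to exec, which is the information a decompiler starts from.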

Code obfuscation

If the code is obfuscated to the point where even its author struggles to read it, can that also protect the source?

Idea

Since the goal is obfuscation, that is, making the code progressively harder to understand through a series of transformations, we can do the following:

  • Remove comments and docstrings. Without these explanations, some key logic is much harder to follow.
  • Change the indentation. Well-chosen indentation makes code pleasant to read; when the indentation is erratic, reading becomes frustrating.
  • Add spaces between tokens. This has a similar effect to changing the indentation.
  • Rename functions, classes and variables. Naming directly affects readability, and meaningless names are a big obstacle to comprehension.
  • Insert dead code on blank lines. This is a decoy: irrelevant code disrupts the reader's rhythm.
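The first item in the list can be sketched with the standard ast module. This is only an illustration of the idea, not how the tools discussed in this section work; it assumes Python 3.9 or later (for ast.unparse), and the names DocstringStripper and strip_docs are made up. Comments disappear automatically because the parser never stores them in the tree.

```python
import ast


class DocstringStripper(ast.NodeTransformer):
    """Drop the leading docstring of modules, classes and functions."""

    def _strip(self, node):
        self.generic_visit(node)  # handle nested classes and functions first
        body = node.body
        if (body and isinstance(body[0], ast.Expr)
                and isinstance(body[0].value, ast.Constant)
                and isinstance(body[0].value.value, str)):
            # Keep the block syntactically valid if the docstring was all it had.
            node.body = body[1:] or [ast.Pass()]
        return node

    visit_Module = visit_ClassDef = _strip
    visit_FunctionDef = visit_AsyncFunctionDef = _strip


def strip_docs(source):
    """Return source with comments and docstrings removed."""
    tree = DocstringStripper().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

Renaming identifiers safely is considerably harder than this, because names can be referenced from other modules or through getattr, which is one reason obfuscators tend to leave public names alone.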

Method

Method 1: obfuscate with Oxyry

http://pyob.oxyry.com/ is a website for obfuscating Python code online, which makes obfuscation easy.

Suppose we have the following Python code, involving classes, functions, parameters and so on:

# coding: utf-8

class A(object):
    """
    Description
    """

    def __init__(self, x, y, default=None):
        self.z = x + y
        self.default = default
    
    def name(self):
        return 'No Name'


def always():
    return True


num = 1
a = A(num, 999, 100)
a.name()
always()

After obfuscation with Oxyry, we get the following code:

class A (object ):#line:4
    ""#line:7
    def __init__ (O0O0O0OO00OO000O0 ,OO0O0OOOO0000O0OO ,OO0OO00O00OO00OOO ,OO000OOO0O000OOO0 =None ):#line:9
        O0O0O0OO00OO000O0 .z =OO0O0OOOO0000O0OO +OO0OO00O00OO00OOO #line:10
        O0O0O0OO00OO000O0 .default =OO000OOO0O000OOO0 #line:11
    def name (O000O0O0O00O0O0OO ):#line:13
        return 'No Name'#line:14
def always ():#line:17
    return True #line:18
num =1 #line:21
a =A (num ,999 ,100 )#line:22
a .name ()#line:23
always ()

After obfuscation, the code mainly differs in comments, parameter names and spacing, which makes reading only slightly harder.

Method 2: obfuscate with the pyobfuscate library

pyobfuscate is a very old Python obfuscation library, but it has aged well.

For the same Python code as above, the result of obfuscation with pyobfuscate is as follows:

# coding: utf-8
if 64 - 64: i11iIiiIii
if 65 - 65: O0 / iIii1I11I1II1 % OoooooooOO - i1IIi
class o0OO00 ( object ) :
 if 78 - 78: i11i . oOooOoO0Oo0O
 if 10 - 10: IIiI1I11i11
 if 54 - 54: i11iIi1 - oOo0O0Ooo
 if 2 - 2: o0 * i1 * ii1IiI1i % OOooOOo / I11i / Ii1I
 def __init__ ( self , x , y , default = None ) :
  self . z = x + y
  self . default = default
  if 48 - 48: iII111i % IiII + I1Ii111 / ooOoO0o * Ii1I
 def name ( self ) :
  return 'No Name'
  if 46 - 46: ooOoO0o * I11i - OoooooooOO
  if 30 - 30: o0 - O0 % o0 - OoooooooOO * O0 * OoooooooOO
def Oo0o ( ) :
 return True
 if 60 - 60: i1 + I1Ii111 - I11i / i1IIi
 if 40 - 40: oOooOoO0Oo0O / O0 % ooOoO0o + O0 * i1IIi
I1Ii11I1Ii1i = 1
Ooo = o0OO00 ( I1Ii11I1Ii1i , 999 , 100 )
Ooo . name ( )
Oo0o ( ) # dd678faae9ac167bc83abf78e5cb2f3f0688d3a3

The result of method 2 looks better than that of method 1. Besides renaming classes and functions and adding spaces, the most obvious difference is the inserted irrelevant code, which makes the result harder to read.

Advantages

  • Simple and convenient, and it raises the bar for recovering the source
  • Good compatibility: as long as the source logic is compatible, the obfuscated code is too

Disadvantages

  • Only a single file can be obfuscated; multiple source files that reference each other cannot
  • The code structure is unchanged and the bytecode is still available, so cracking is not difficult

Using py2exe

Idea

py2exe is a tool that converts Python scripts into executables on the Windows platform. It works by compiling the source code into .pyc files and packaging them, together with the necessary dependencies, into an executable.

If what we finally ship is the binary packaged by py2exe, doesn't that protect the source code?

Method

Packaging with py2exe takes only a few steps.

  1. Write the entry file. In this example it is named hello.py:
print 'Hello World'
  2. Write setup.py:
from distutils.core import setup
import py2exe

setup(console=['hello.py'])
  3. Generate the executable:
python setup.py py2exe

The generated executable is located at dist\hello.exe.

Advantages

  • The program can be packaged directly into an exe, which is easy to distribute and run
  • The bar for cracking is higher than with .pyc files

Disadvantages

  • Poor compatibility: it only runs on Windows
  • The layout of the generated executable is open and well known, so the .pyc files holding the code can be located and then decompiled back into source

Using Cython

Idea

Although Cython's main purpose is to improve performance, its working principle, compiling .py/.pyx files into .c files and then compiling the .c files into .so (Unix) or .pyd (Windows) files, brings another benefit: the result is hard to crack.

Method

Developing with Cython is not complicated.

  1. Write the file hello.pyx or hello.py:
def hello():
    print('hello')
  2. Write setup.py:
from distutils.core import setup
from Cython.Build import cythonize

setup(name='Hello World app',
      ext_modules=cythonize('hello.pyx'))
  3. Compile to .c, and then to .so or .pyd:
python setup.py build_ext --inplace

Run python -c "from hello import hello;hello()" to call the hello() function directly from the compiled module.

Advantages

  • The generated binary .so or .pyd files are hard to crack
  • It also brings a performance improvement

Disadvantages

  • Compatibility is slightly worse: the code may need to be recompiled for different operating system versions
  • Although most Python code is supported, when some code turns out to be unsupported, the cost of working around it is high

Custom Python interpreter

All of the approaches above work on the source code, and each falls short in some way. If we start from the interpreter instead, can we protect the code better?

A commercial Python program usually ships with its own Python interpreter when released to the customer environment, so modifying the interpreter to solve the source protection problem is a viable option.

Suppose we have an algorithm that encrypts the original Python code. The encrypted code ships with the program and can be seen by anyone, but is hard to crack. On the other side, a custom Python interpreter decrypts the encrypted code and then interprets and executes it. Because the interpreter itself is a binary file, the key material needed for decryption cannot easily be extracted from it, and so the source code is protected.

To realize this idea, we first need to master basic encryption and decryption algorithms, then explore how Python executes code to find where to decrypt, and finally disable bytecode to prevent decompilation via .pyc files.

Encryption and decryption algorithm

Symmetric key encryption algorithm

Symmetric-key encryption, also known as symmetric encryption, private-key encryption or shared-key encryption, is a class of encryption algorithms in cryptography. These algorithms use the same key for encryption and decryption, or two keys that can be trivially derived from each other.

Symmetric encryption algorithms are public, computationally cheap, fast and efficient.

Common symmetric encryption algorithms include DES, 3DES, AES, Blowfish, IDEA, RC5 and RC6.

In symmetric-key encryption, plaintext is encrypted into ciphertext with a key, and the ciphertext can be decrypted back into plaintext with the same key.

With the OpenSSL tool, we can easily pick a symmetric encryption algorithm to encrypt and decrypt data. Let's take AES as an example.

AES encryption

#Specify password for symmetric encryption
$ openssl enc -aes-128-cbc -in test.py -out entest.py -pass pass:123456

#Specify file for symmetric encryption
$ openssl enc -aes-128-cbc -in test.py -out entest.py -pass file:passwd.txt

#Specify environment variable for symmetric encryption
$ openssl enc -aes-128-cbc -in test.py -out entest.py -pass env:passwd

AES decryption

#Specify password for symmetric decryption
$ openssl enc -aes-128-cbc -d -in entest.py -out test.py -pass pass:123456

#Specify file for symmetric decryption
$ openssl enc -aes-128-cbc -d -in entest.py -out test.py -pass file:passwd.txt

#Specifying environment variables for symmetric decryption
$ openssl enc -aes-128-cbc -d -in entest.py -out test.py -pass env:passwd

Asymmetric key encryption algorithm

Public-key cryptography, also known as asymmetric cryptography, is a class of cryptographic algorithms that requires a pair of keys: a private key and a public key. The two keys are mathematically related. Information encrypted with a user's public key can only be decrypted with that user's private key.

Asymmetric algorithms are more complex. Their security depends on both the algorithm and the key, and because of this complexity, encryption and decryption are slower than with symmetric algorithms.

Common asymmetric encryption algorithms include RSA, ElGamal, the knapsack algorithm, Rabin, D-H and ECC.

In asymmetric-key encryption, plaintext is encrypted into ciphertext with the public key, and the ciphertext is decrypted back into plaintext with the private key that corresponds to that public key.

With the OpenSSL tool, we can just as easily use an asymmetric encryption algorithm. Let's take RSA as an example.

Generate private key, public key

#Generate a 2048-bit private key, itself protected with AES-128
$ openssl genrsa -aes128 -out private.pem 2048

#Generate the public key from the private key
$ openssl rsa -in private.pem -outform PEM -pubout -out public.pem

RSA encryption

#Encrypt with the public key
$ openssl rsautl -encrypt -in passwd.txt -inkey public.pem -pubin -out enpasswd.txt

RSA decryption

#Decrypt with the private key
$ openssl rsautl -decrypt -in enpasswd.txt -inkey private.pem -out passwd.txt

Source protection based on encryption algorithm

Symmetric encryption is suitable for encrypting source files, while asymmetric encryption is suitable for encrypting keys. Combining the two lets us encrypt and decrypt the source code.

Encryption in the build environment

The source code in a released installation package should already be encrypted, so encryption has to happen in the build phase. The encryption process is as follows:

  1. Randomly generate a key. This key is in fact the password used for symmetric encryption.
  2. Use this key to encrypt the source code symmetrically, producing the encrypted code.
  3. Use the public key (see the asymmetric key encryption section above for how to generate it) to encrypt the key asymmetrically, producing the encrypted key.

Both the encrypted code and the encrypted key are placed in the installation package. Users can see them but cannot decipher them. So how does the Python interpreter execute the encrypted code?

Decryption by the Python interpreter

If the private key corresponding to the public key is built into our Python interpreter, the interpreter is able to decrypt. Because the interpreter itself is a binary, the built-in private key is not directly visible. The decryption process is as follows:

  1. When the Python interpreter executes encrypted code, it is passed a parameter that indicates the encrypted key. Through this parameter, the interpreter obtains the encrypted key.
  2. The Python interpreter uses the built-in private key to decrypt the encrypted key asymmetrically, recovering the original key.
  3. The Python interpreter uses the original key to decrypt the encrypted code symmetrically, recovering the original code.
  4. The Python interpreter executes the original code.
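The build-side and interpreter-side steps can be traced end to end in a few lines of Python. This is a toy sketch of the flow only, and deliberately insecure: the keystream function stands in for AES, and textbook RSA over two small known primes stands in for a real RSA key pair, so that the example needs nothing beyond the standard library (Python 3.8 or later). All names (P, Q, stream_xor, build_package, interpreter_decrypt) are made up for illustration; a real build would use the OpenSSL commands shown earlier.

```python
import hashlib
import secrets

# Toy RSA key pair built from two known Mersenne primes (insecure, demo only).
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537                      # public key
D = pow(E, -1, (P - 1) * (Q - 1))        # private key, "built into the interpreter"


def stream_xor(key, data):
    """Stand-in for AES: XOR data with a SHA-256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def build_package(source):
    """Build side: returns (encrypted code, encrypted key)."""
    key = secrets.token_bytes(16)                      # 1. random symmetric key
    enc_code = stream_xor(key, source)                 # 2. encrypt the source
    enc_key = pow(int.from_bytes(key, "big"), E, N)    # 3. encrypt the key
    return enc_code, enc_key


def interpreter_decrypt(enc_code, enc_key):
    """Interpreter side: recover the key, then the original source."""
    key = pow(enc_key, D, N).to_bytes(16, "big")       # private-key decryption
    return stream_xor(key, enc_code)                   # symmetric decryption
```

The point of the split is visible in the signatures: build_package needs only the public key, so the build machine never holds the secret that the shipped interpreter uses in interpreter_decrypt.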

So source code can be protected by changing the build step and customizing the execution flow of the Python interpreter. Changing the build step is easy, but how do we customize the interpreter? We need a solid understanding of how the interpreter executes scripts and modules so that we can take control at specific entry points.

Execution and decryption of scripts and modules

Several ways to execute Python code

To find every entry point through which the Python interpreter executes Python code, we first need to enumerate the ways in which the interpreter can execute code.

Run a script directly

python test.py

Run a statement directly

python -c "print 'hello'"

Run a module directly

python -m test

Import or reload a module

python
>>> import test  # import the module
>>> reload(test)  # reload the module

Running a statement directly needs no extra handling, because the code is given on the command line rather than read from an encrypted file.
Running a module directly and importing or reloading a module start differently but converge on the same loading process, so we will look at them together.
We therefore consider two cases, running scripts and loading modules, and explore the flow and the decryption approach for each.

Decrypt when running script

The process of running a script
When running a script, the Python interpreter follows this call path:

       main            WinMain
[Modules/python.c] [PC/WinMain.c]
             \         /
              \       /
               \     /
                \   /
                 \ /
               Py_Main
           [Modules/main.c]

The entry function used by the Python interpreter to run a script varies with the operating system. On Linux/Unix it is the main function in Modules/python.c; on Windows it is the WinMain function in PC/WinMain.c. Both functions eventually call the Py_Main function in Modules/main.c.

Let's look at the relevant logic in the Py_Main function:

[Modules/main.c]
--------------------------------------

int
Py_Main(int argc, char **argv)
{
    if (command) {
        // Handle python -c <command>
    } else if (module) {
        // Handle python -m <module>
    }
    else {
        // Handle python <file>
        ...
        fp = fopen(filename, "r");
        ...
    }
}

We set aside the handling of <command> and <module> for now. In the logic that handles a file (running a script directly), we can see that the interpreter opens the file and obtains a file pointer. So if we replace fopen here with a custom decrypt_open function, which opens an encrypted file, decrypts it, and returns a file pointer to the decrypted content, wouldn't that decrypt the script?

Custom decrypt_open
Let's add a new file, Modules/crypt.c, to hold our custom encryption and decryption functions.

The decrypt_open function is implemented as follows:

[Modules/crypt.c]
--------------------------------------

/* Open a file in decryption mode */
FILE *
decrypt_open(const char *filename, const char *mode)
{
    int plainlen = -1;
    char *plaintext = NULL;
    FILE *fp = NULL;

    if (aes_passwd == NULL)
        /* No key available: open the file as usual */
        fp = fopen(filename, mode);
    else {
        plainlen = aes_decrypt(filename, aes_passwd, &plaintext);
        /* If decryption fails, fall back to a file pointer for the original file */
        if (plainlen < 0)
            fp = fopen(filename, mode);
        /* Otherwise, wrap the decrypted buffer in an in-memory stream */
        else
            fp = fmemopen(plaintext, plainlen, "r");
    }
    return fp;
}

Here aes_passwd is a global variable that holds the key for the symmetric encryption algorithm. For now we assume the key has already been obtained; how it is obtained is explained later. aes_decrypt is a custom function that performs symmetric decryption with the AES algorithm. For reasons of space, its implementation is not shown here.

The logic of decrypt_open is as follows:

  • Check whether the symmetric key has been obtained. If not, open the file directly and return the file pointer.
  • If so, try to decrypt the file with the symmetric algorithm:

    • If decryption fails, the file may be an unencrypted script. Open the file directly and return the file pointer.
    • If decryption succeeds, create an in-memory file object from the decrypted content and return its file pointer.

With these functions in place, the interpreter can decrypt and execute encrypted code when a script is run directly.
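For readers more at home in Python than in C, the same three-way decision can be restated as follows. This is a hedged restatement of the control flow, not the interpreter patch itself: the decrypt argument is a hypothetical callable playing the role of aes_decrypt and is assumed to return the plaintext bytes, or None when the file cannot be decrypted, and io.BytesIO plays the role of fmemopen.

```python
import io


def decrypt_open(filename, key=None, decrypt=None):
    """Return a readable binary stream for filename, decrypting when possible."""
    if key is None or decrypt is None:
        # No key available: behave exactly like a normal open.
        return open(filename, "rb")
    with open(filename, "rb") as f:
        blob = f.read()
    plain = decrypt(blob, key)   # hypothetical: bytes on success, None on failure
    if plain is None:
        # Probably an unencrypted script: fall back to the file itself.
        return open(filename, "rb")
    # Decryption worked: hand back an in-memory stream over the plaintext.
    return io.BytesIO(plain)
```

Because all three branches return an object with the same read interface, the caller does not need to know whether the file was encrypted, which is what lets the C version be a drop-in replacement for fopen.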

Decrypt on module load

The process of loading modules
The logic for loading modules lives mainly in Python/import.c. The flow is as follows:

                                             Py_Main
                                         [Modules/main.c]
                                                |
    builtin___import__                      RunModule
            |                                   |
PyImport_ImportModuleLevel <----┐     PyImport_ImportModule
            |                   |               |
    import_module_level         └------- PyImport_Import
            |
         load_next                         builtin_reload
            |                                   |
      import_submodule                PyImport_ReloadModule
            |                                   |
        find_module <---------------------------┘

  • When a module is loaded via python -m <module>, the call starts from the Py_Main function
  • When a module is loaded via import <module>, the call starts from the builtin___import__ function
  • When a module is loaded via reload(<module>), the call starts from the builtin_reload function

Whichever way is used, the find_module function is eventually called. Let's see what is going on inside it:

[Python/import.c]
--------------------------------------

static struct filedescr *
find_module(char *fullname, char *subname, PyObject *path, char *buf,
            size_t buflen, FILE **p_fp, PyObject **p_loader)
{
    ...
    fp = fopen(buf, filemode);
    ...
}

In find_module we find the logic that opens the file. If we replace it with the decrypt_open function implemented in the previous section, wouldn't that decrypt a module as it is loaded?

That is the general idea, but one detail needs attention: buf is not necessarily a .py file, it may also be a .pyc file. We only want to change the handling of .py files, so the code can be written as follows:

[Python/import.c]
--------------------------------------

static struct filedescr *
find_module(char *fullname, char *subname, PyObject *path, char *buf,
            size_t buflen, FILE **p_fp, PyObject **p_loader)
{
    ...
    if (fdp->type == PY_SOURCE) {
        fp = decrypt_open(buf, filemode);
    }
    else {
        fp = fopen(buf, filemode);
    }
    ...
}

With these changes, modules are decrypted when they are loaded.
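A rough pure-Python analogue of this hook can be built on importlib in Python 3. It is not the approach taken in this article, and it offers no protection by itself because the loader and key sit in plain sight; it is shown only to make the shape of the change concrete. The XOR cipher, the MAGIC marker, KEY and all function names are assumptions made for the sketch.

```python
import importlib.machinery
import importlib.util

MAGIC = b"ENC1"      # hypothetical marker placed in front of encrypted files
KEY = b"demo-key"    # toy key; a real key would come from the key file


def xor(data, key):
    """Toy stand-in for AES decryption (XOR is its own inverse)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


class DecryptingLoader(importlib.machinery.SourceFileLoader):
    def get_data(self, path):
        data = super().get_data(path)
        # Only source files are decrypted, mirroring the PY_SOURCE check.
        if path.endswith(".py") and data.startswith(MAGIC):
            return xor(data[len(MAGIC):], KEY)
        return data

    def set_data(self, path, data, *, _mode=0o666):
        # Never write a bytecode cache for decrypted source.
        pass


def load_encrypted(name, path):
    """Load the module stored at path, decrypting it if it is encrypted."""
    loader = DecryptingLoader(name, path)
    spec = importlib.util.spec_from_file_location(name, path, loader=loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module
```

As in the C change, unencrypted files pass through untouched, so the same loader serves both kinds of module.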

Support specified key file

One problem is still open: so far we have assumed that the interpreter has already obtained the key and stored it in the global variable aes_passwd. How is the key actually obtained?

We need the Python interpreter to support a new command-line option that specifies the encrypted key file; the interpreter then decrypts that file with the asymmetric algorithm to obtain aes_passwd.

Suppose this option is -k <filename>. We can then run python -k enpasswd.txt to tell the interpreter the path of the encrypted key file. The implementation is as follows:

[Modules/main.c]
--------------------------------------

/* Command line options; note that k: is new */
#define BASE_OPTS "3bBc:dEhiJk:m:OQ:RsStuUvVW:xX?"
...
/* Long usage message, split into parts < 512 bytes */
static char *usage_1 = "\
...
-k key : decrypt source file by using key file\n\
...
";
...
int
Py_Main(int argc, char **argv)
{
    ...
    char *keyfilename = NULL;
    ...
    while ((c = _PyOS_GetOpt(argc, argv, PROGRAM_OPTS)) != EOF) {
        ...
        case 'k':
            keyfilename = (char *)malloc(strlen(_PyOS_optarg) + 1);
            if (keyfilename == NULL)
                Py_FatalError(
                   "not enough memory to copy -k argument");
            strcpy(keyfilename, _PyOS_optarg);
            keyfilename[strlen(_PyOS_optarg)] = '\0';
            break;
        ...
    }
    ...
    if (keyfilename != NULL) {
        int passwdlen;
        char *passwd = NULL;
        passwdlen = rsa_decrypt(keyfilename, &passwd);
        set_aes_passwd(passwd);
        if (passwdlen < 0) {
            fprintf(stderr, "%s: parsing key file '%s' error\n", argv[0], keyfilename);
            free(keyfilename);
            return 2;
        } else {
            free(keyfilename);
        }
    }
    ...
}

The logic is as follows:

  • In k:, the k means the -k option is supported, and the : means the option takes an argument, here the path of the encrypted key file
  • When the interpreter processes the -k option, it takes the file path that follows and stores it in keyfilename
  • The custom rsa_decrypt function (its implementation is omitted for reasons of space) decrypts the encrypted key file asymmetrically to recover the original key
  • The key is then written to aes_passwd

Thus, by explicitly specifying the encrypted key file, the interpreter obtains the original key, uses it to decrypt the encrypted code, and then executes the original code. But there is a hidden risk: .pyc files are generated during execution, and the .py files decompiled from them are unencrypted. In other words, a malicious user can bypass the protection this way. So we need to disable bytecode.

Disable bytecode

Do not generate .pyc files

The first thing to do is to stop generating .pyc files, so that a malicious user cannot simply decompile a .pyc file back to source.

As we know, the -B option tells the Python interpreter not to generate .pyc files. Since the custom interpreter must never generate .pyc files, we simply remove this option and always behave as if it were set:

[Modules/main.c]
--------------------------------------

/*Command line option, note that B is removed*/
#define BASE_OPTS "3bc:dEhiJm:OQ:RsStuUvVW:xX?"
...
/* Long usage message, split into parts < 512 bytes */
static char *usage_1 = "\
...
//-B     : don't write .py[co] files on import; also PYTHONDONTWRITEBYTECODE=x\n\
...
";
...
int
Py_Main(int argc, char **argv)
{
    ...
    // Never write .py[co] files
    Py_DontWriteBytecodeFlag++;
    ...
}

In addition, the Python interpreter also decides whether to generate .pyc files from the PYTHONDONTWRITEBYTECODE environment variable, so that handling needs to be removed as well:

[Python/pythonrun.c]
--------------------------------------

void
Py_InitializeEx(int install_sigs)
{
    ...
    if ((p = Py_GETENV("PYTHONDEBUG")) && *p != '\0')
        Py_DebugFlag = add_flag(Py_DebugFlag, p);
    if ((p = Py_GETENV("PYTHONVERBOSE")) && *p != '\0')
        Py_VerboseFlag = add_flag(Py_VerboseFlag, p);
    if ((p = Py_GETENV("PYTHONOPTIMIZE")) && *p != '\0')
        Py_OptimizeFlag = add_flag(Py_OptimizeFlag, p);
    // Remove the handling of PYTHONDONTWRITEBYTECODE
    // if ((p = Py_GETENV("PYTHONDONTWRITEBYTECODE")) && *p != '\0')
    //     Py_DontWriteBytecodeFlag = add_flag(Py_DontWriteBytecodeFlag, p);
    ...
}
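At the Python level, the flag that the -B option and PYTHONDONTWRITEBYTECODE control is exposed as sys.dont_write_bytecode. The sketch below, written for Python 3 with a made-up helper name, shows its effect on a stock interpreter. The difference from the patched interpreter is that here any script can flip the flag back, which is why the changes above hard-wire it in C.

```python
import importlib
import sys


def import_without_pyc(module_name, directory):
    """Import module_name from directory without writing a bytecode cache."""
    previous = sys.dont_write_bytecode
    sys.dont_write_bytecode = True      # the switch that -B flips
    sys.path.insert(0, directory)
    importlib.invalidate_caches()       # make sure the new path entry is scanned
    try:
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(directory)
        sys.dont_write_bytecode = previous
```

After the call, the directory contains no __pycache__ folder for the imported module, whereas a plain import on a default interpreter would have created one.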

Disable access to the co_code of code objects

Not generating .pyc files is not enough. A malicious user can still read the co_code attribute of a code object to get the bytecode, and then recover the source by decompiling it. We therefore also need to block access to the bytecode of code objects:

[Objects/codeobject.c]
--------------------------------------

static PyMemberDef code_memberlist[] = {
    {"co_argcount",     T_INT,          OFF(co_argcount),       READONLY},
    {"co_nlocals",      T_INT,          OFF(co_nlocals),        READONLY},
    {"co_stacksize",T_INT,              OFF(co_stacksize),      READONLY},
    {"co_flags",        T_INT,          OFF(co_flags),          READONLY},
    // {"co_code",         T_OBJECT,       OFF(co_code),           READONLY},
    {"co_consts",       T_OBJECT,       OFF(co_consts),         READONLY},
    {"co_names",        T_OBJECT,       OFF(co_names),          READONLY},
    {"co_varnames",     T_OBJECT,       OFF(co_varnames),       READONLY},
    {"co_freevars",     T_OBJECT,       OFF(co_freevars),       READONLY},
    {"co_cellvars",     T_OBJECT,       OFF(co_cellvars),       READONLY},
    {"co_filename",     T_OBJECT,       OFF(co_filename),       READONLY},
    {"co_name",         T_OBJECT,       OFF(co_name),           READONLY},
    {"co_firstlineno", T_INT,           OFF(co_firstlineno),    READONLY},
    {"co_lnotab",       T_OBJECT,       OFF(co_lnotab),         READONLY},
    {NULL}      /* Sentinel */
};
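To make the risk concrete, here is what an unmodified interpreter allows. The example is written for Python 3, where the function attribute is spelled __code__ rather than the Python 2 func_code used in this article; the function secret and the variable names are made up for illustration.

```python
import dis
import types


def secret():
    return "licensed-feature"


code = secret.__code__                  # Python 2 spelling: secret.func_code
raw = code.co_code                      # the raw bytecode, as a bytes object
listing = dis.Bytecode(code).dis()      # human-readable disassembly
clone = types.FunctionType(code, {})    # a working copy built from the code object
```

With the co_code entry removed from code_memberlist, the attribute lookup in the third assignment fails on the custom interpreter, and tools that depend on it, such as dis, fail with it.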

At this point, a custom Python interpreter is complete.

Demonstration

Run script

With the -k option, the Python interpreter can run both encrypted and unencrypted Python files.

Loading module

Both encrypted and unencrypted modules can be loaded with -m <module>, and also with import <module>.

Disable bytecode

By disabling bytecode, we achieve the following:

  • No .pyc files are generated
  • The func_code of a function can still be accessed
  • The co_code of a code object cannot be accessed, for example f.func_code.co_code for a function f
  • Bytecode cannot be obtained with the dis module

Exception stack information

Although the code is encrypted, the stack trace shown when an exception occurs is not affected.

Debugging

Encrypted code can still be debugged, but the code content printed by the debugger is the encrypted text, which is exactly what we want.

Further thoughts

  1. How can we prevent the co_code of a code object from being read through memory operations?
  2. How can we make it harder to extract the private key by reverse engineering?
  3. When we ourselves need to see the source code while debugging, how can we make that possible?
