How to reverse engineer (decompile) a Flutter application

Time:2021-11-30

At present, most apps that use Flutter adopt the add-to-app approach. The Flutter-related content in an app mainly consists of the FlutterEngine, the app product, and resource files. For experimentation, we can pick any Flutter-based app from an app store (APKs can be downloaded from the major app markets, and IPAs can be downloaded by installing Apple Configurator 2 on a Mac). The Flutter-related directories in the APK and IPA are as follows:

The iOS package file is an IPA. After downloading it, rename its suffix to .zip and decompress it. The application folder is under Payload, and inside it the FlutterEngine, app product, and resource files are located as follows:

xxx.app
└── Frameworks
    ├── App.framework
    │   ├── App (Dart app product)
    │   ├── Info.plist
    │   ├── SC_Info
    │   ├── _CodeSignature
    │   └── flutter_assets
    │       ├── flutter_assets
    │       ├── AssetManifest.json
    │       ├── FontManifest.json
    │       ├── LICENSE
    │       ├── fonts
    │       ├── images
    │       ├── mtf_module_info
    │       └── packages
    └── Flutter.framework
        ├── Flutter(FlutterEngine)
        ├── Info.plist
        ├── SC_Info
        ├── _CodeSignature
        └── icudtl.dat

The Android package file is an APK. After downloading it, rename its suffix to .zip and unzip it. The FlutterEngine, app product, and resource files are located as follows:

xxx.apk
├── assets
│   └── flutter_assets
│       └── flutter_assets
│       ├── AssetManifest.json
│       ├── FontManifest.json
│       ├── LICENSE
│       ├── fonts
│       ├── images
│       ├── mtf_module_info
│       └── packages
└── lib
    └── armeabi
        ├── libapp.so (Dart app product)
        └── libflutter.so (FlutterEngine)
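
The unpacking step can also be scripted. Below is a minimal sketch, assuming the archive layouts shown in the two trees above. The function name `find_flutter_artifacts` and the suffix list are illustrative and not part of any existing tool. Because both IPA and APK files are zip archives, Python's `zipfile` module can read them without renaming.

```python
import zipfile

# Path suffixes of the Flutter-related binaries shown in the trees above.
# This list is an assumption for illustration; real apps may use other ABI
# directories (for example arm64-v8a) or a different layout.
FLUTTER_SUFFIXES = (
    "App.framework/App",          # iOS Dart app product
    "Flutter.framework/Flutter",  # iOS FlutterEngine
    "libapp.so",                  # Android Dart app product
    "libflutter.so",              # Android FlutterEngine
)


def find_flutter_artifacts(archive_path):
    """Return the archive entries that look like Flutter artifacts."""
    with zipfile.ZipFile(archive_path) as archive:
        return [
            name for name in archive.namelist()
            if name.endswith(FLUTTER_SUFFIXES) or "flutter_assets/" in name
        ]
```

Calling `find_flutter_artifacts("xxx.apk")` returns the matching entry names, which can then be passed to `ZipFile.extract` to pull out only the files of interest.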

Every app's FlutterEngine is either the official build or a lightly modified version of it, so the differences are small, and we will not reverse this part for now. Resource files are mostly images, fonts, and similar assets that can be viewed without any reversing. What we care about is the business logic and framework code written in Dart, which lives in the app product: App.framework/App on iOS or armeabi/libapp.so on Android, both of which are dynamic libraries. Let's first take a brief look at what they contain.

# The common binutils tools can be installed with, e.g.: brew update && brew install binutils
~/Downloads > objdump -t App
App: file format mach-o-arm64

SYMBOL TABLE:
0000000001697e60 g       0f SECT   02 0000 [.const] _kDartIsolateSnapshotData
000000000000b000 g       0f SECT   01 0000 [.text] _kDartIsolateSnapshotInstructions
0000000001690440 g       0f SECT   02 0000 [.const] _kDartVmSnapshotData
0000000000006000 g       0f SECT   01 0000 [.text] _kDartVmSnapshotInstructions

~/Downloads > greadelf -s libapp.so
Symbol table '.dynsym' contains 5 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT        UND
     1: 00001000 12992 FUNC    GLOBAL DEFAULT        1 _kDartVmSnapshot[...]
     2: 00005000 0x127df60 FUNC    GLOBAL DEFAULT    2 _kDartIsolateSna[...]
     3: 01283000 22720 OBJECT  GLOBAL DEFAULT        3 _kDartVmSnapshotData
     4: 01289000 0x9fc858 OBJECT  GLOBAL DEFAULT     4 _kDartIsolateSna[...]

You can see that on both Android and iOS, the Dart app product contains four program segments (from https://github.com/flutter/flutter/wiki/Flutter-engine-operation-in-AOT-Mode):

  • _kDartVmSnapshotData: represents the initial state of the Dart heap shared between isolates. It helps Dart isolates start faster, but does not contain any isolate-specific information.

  • _kDartVmSnapshotInstructions: contains AOT instructions for common routines shared among all Dart isolates in the VM. This snapshot is usually very small and consists mostly of stubs.

  • _kDartIsolateSnapshotData: represents the initial state of the Dart heap and contains isolate-specific information.

  • _kDartIsolateSnapshotInstructions: contains the AOT code executed by the Dart isolate.
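
As a quick check that a file is a Dart app product, one can look for these four names in it. The following is a minimal sketch, not a symbol-table parser: it only does a raw byte search for the names, so it shows that the strings are present, not where the segments start. The helper name `find_snapshot_symbols` is hypothetical.

```python
# The four snapshot symbol names, searched without the leading underscore so
# the match does not depend on how each binary format prefixes symbol names.
SNAPSHOT_SYMBOLS = (
    "kDartVmSnapshotData",
    "kDartVmSnapshotInstructions",
    "kDartIsolateSnapshotData",
    "kDartIsolateSnapshotInstructions",
)


def find_snapshot_symbols(binary_path):
    """Map each snapshot symbol name to whether the name occurs in the file."""
    with open(binary_path, "rb") as f:
        blob = f.read()
    return {name: name.encode("ascii") in blob for name in SNAPSHOT_SYMBOLS}
```

If all four values are `True` for a `libapp.so` or `App` binary, it is very likely an AOT snapshot container. For addresses and sizes, use `objdump` or `readelf` as shown above.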

After reading the above, you may still be confused. Why are there four parts? What are data and instructions, VM and isolate? Why are they called snapshots? For these questions, the blog post at https://mrale.ph/dartvm/ is recommended. The two pairs of concepts, data/instructions and VM/isolate, combine to give exactly the four segments above: VM data, VM instructions, isolate data, and isolate instructions.

Let's talk about data and instructions first. Flutter compiles and runs apps in either JIT or AOT mode, and only AOT mode can be used in production, so the Dart VM that Flutter ships includes the ability to execute AOT products. To stay compatible with JIT mode, the Dart VM uses so-called snapshots: the basic structure produced by the JIT runtime is the same as the one produced by AOT compilation. Class information, global variables, and function instructions are serialized and stored directly on disk, and this serialized form is called a snapshot.

1.png

Because the snapshot's serialization format is designed specifically for read speed, loading from a snapshot also greatly speeds up code loading (creating the required class information, global data, and so on, much like the Objective-C runtime loading metaclasses and class information at startup). Originally, snapshots did not contain machine code (the execution logic inside function bodies). Later, as AOT mode was developed, machine code was added to the snapshot as well. These later additions are the instructions mentioned earlier.

2.png

It should be added that instructions are executable machine instructions, which must be placed in the text section of the binary and marked as executable (otherwise iOS cannot load and execute them). Class information and global variables can be loaded as ordinary data. (ByteDance's optimization that reduced package size by 50% is also based on this. If you are interested, see the article: https://juejin.im/post/6844904014170030087)

Next come DartVmSnapshot and DartIsolateSnapshot. This involves how the Dart VM runs business code. The VM is the carrier of Dart code, and the logic running in the VM runs inside an abstract entity called an isolate. You can think of an isolate as something like a thread with a run loop in Objective-C (the exact relationship between them is a tricky interview question that we will not expand on here). In brief, an isolate maintains heap and stack variables, function call stack frames, and helper threads for GC, JIT, and other auxiliary tasks. The isolate's variables are what gets serialized to disk, namely the isolate snapshot. In addition, the global objects predefined by Dart, such as null, true, and false, are managed by the VM isolate. These also need to be serialized, namely the VM snapshot.

3.png

By now we have a general understanding of the structure of a Flutter app product. How do we read it? We can look at the FullSnapshotReader:: functions in clustered_snapshot.cc to see how a snapshot is deserialized.


void Deserializer::ReadIsolateSnapshot(ObjectStore* object_store) {
  Array& refs = Array::Handle();
  Prepare();
  {
    NoSafepointScope no_safepoint;
    HeapLocker hl(thread(), heap_->old_space());
    // N.B.: Skipping index 0 because ref 0 is illegal.
    const Array& base_objects = Object::vm_isolate_snapshot_object_table();
    for (intptr_t i = 1; i < base_objects.Length(); i++) {
      AddBaseObject(base_objects.At(i));
    }
    Deserialize();
    // Read roots.
    RawObject** from = object_store->from();
    RawObject** to = object_store->to_snapshot(kind_);
    for (RawObject** p = from; p <= to; p++) {
      *p = ReadRef();
    }
#if defined(DEBUG)
    int32_t section_marker = Read<int32_t>();
    ASSERT(section_marker == kSectionMarker);
#endif

    refs = refs_;
    refs_ = NULL;
  }
  thread()->isolate()->class_table()->CopySizesFromClassObjects();
  heap_->old_space()->EvaluateSnapshotLoad();

#if defined(DEBUG)
  Isolate* isolate = thread()->isolate();
  isolate->ValidateClassTable();
  isolate->heap()->Verify();
#endif
  for (intptr_t i = 0; i < num_clusters_; i++) {
    clusters_[i]->PostLoad(refs, kind_, zone_);
  }
  // Setup native resolver for bootstrap impl.
  Bootstrap::SetupNativeResolver();
}

This part is still laborious to understand. An analysis written by another expert may be very enlightening: https://blog.tst.sh/reverse-engineering-flutter-apps-part-1/

Let's see how a RawObject is read:

4.png

Each object begins with a uint32_t containing the following tags:

5.png
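
Decoding such a tag word is plain bit slicing. The sketch below is only an illustration: the bit positions are an assumption based on raw_object.h from Dart VM versions of that period (a canonical flag at bit 5, an 8-bit size tag starting at bit 8, and a 16-bit class id starting at bit 16). The layout has changed between VM versions, so verify it against the source of the VM you are analyzing. The names `bit_field` and `decode_tags` are hypothetical.

```python
# ASSUMED tag layout; check raw_object.h for the target Dart VM version.
CANONICAL_BIT = 5
SIZE_TAG_POS, SIZE_TAG_BITS = 8, 8
CLASS_ID_POS, CLASS_ID_BITS = 16, 16


def bit_field(word, pos, width):
    """Extract `width` bits of `word`, starting at bit `pos`."""
    return (word >> pos) & ((1 << width) - 1)


def decode_tags(tags):
    """Split a 32-bit tag word into the fields assumed above."""
    return {
        "canonical": bool(bit_field(tags, CANONICAL_BIT, 1)),
        "size_tag": bit_field(tags, SIZE_TAG_POS, SIZE_TAG_BITS),
        "class_id": bit_field(tags, CLASS_ID_POS, CLASS_ID_BITS),
    }
```

The class id is the most useful field when reading a snapshot, because it tells the reader which cluster (class of object) the following data belongs to.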

In principle, we could write our own reader for this analysis, but a reader written in Python already exists online (it only supports ELF files, so it can only analyze Android package products): https://github.com/hdw09/darter. Based on the API this tool provides, we can write a tool that exports all class definitions of an app.

from darter.file import parse_elf_snapshot, parse_appjit_snapshot
from darter.asm.base import populate_native_references
import re
from collections import defaultdict
import os
import shutil


def get_funciont(fun_index, s, span=False):
    spanStr = ''
    if span:
        spanStr = '    '
    fun_str = '\n' + spanStr + '// function index:' + '{0}'.format(fun_index) + '\n'
    returnTypeStr = ''
    if '_class' in s.refs[fun_index].x['result_type'].x.keys():
        returnTypeStr = s.refs[fun_index].x['result_type'].x['_class'].x['name'].x['value']
    elif 'name' in s.refs[fun_index].x['result_type'].x.keys():
        returnTypeStr = str(s.refs[fun_index].x['result_type'])
    else:
        returnTypeStr = s.refs[fun_index].x['result_type'].x['value']
    fun_str = fun_str+spanStr + returnTypeStr
    fun_str = fun_str + ' ' + s.refs[fun_index].x['name'].x['value']+'('
    parameterCount = 0
    if type(s.refs[fun_index].x['parameter_types'].x['value']) != type(''):
        for parameterName in s.refs[fun_index].x['parameter_names'].x['value']:
            parType = ''
            if '_class' in s.refs[fun_index].x['parameter_types'].x['value'][parameterCount].x.keys():
                parType = s.refs[fun_index].x['parameter_types'].x['value'][parameterCount].x['_class'].x['name'].x['value']
            else:
                parType = s.refs[fun_index].x['parameter_types'].x['value'][parameterCount].x['value']
            fun_str = fun_str + parType + ' '
            fun_str = fun_str + parameterName.x['value'] + ', '
            parameterCount = parameterCount + 1
    fun_str = fun_str + ') \n'+spanStr+'{ \n'
    for nrefsItem in s.refs[fun_index].x['code'].x['nrefs']:
        fun_str = fun_str + spanStr + '    {0}'.format(nrefsItem) + '\n'

    fun_str = fun_str + spanStr+'}'
    return fun_str


def get_classDis(clas_index, s):
    class_str = '\n// class index:' + '{0}'.format(clas_index) + ' follow up with s.refs[xxxx].x\n'
    superName = ''
    if '_class' in s.refs[clas_index].x['super_type'].x.keys():
        superName = s.refs[clas_index].x['super_type'].x['_class'].x['name'].x['value']
    else:
        superName = s.refs[clas_index].x['super_type'].x['value']
    class_str = class_str + \
        'class {0} : {1} {2}\n'.format(
            s.refs[clas_index].x['name'].x['value'], superName, '{')
    if type(s.refs[clas_index].x['functions'].x['value']) != type(''):
        for fun in s.refs[clas_index].x['functions'].x['value']:
            class_str = class_str+'\n'+get_funciont(fun.ref, s, True)
    return class_str+'\n\n}'


def get_lob_class(lib, s):
    all_class = ''
    for item in lib.src:
        if 'name' in item[0].x.keys():
            all_class = all_class + get_classDis(item[0].ref, s) + '\n'
    if 'class index' in all_class:
        return all_class
    else:
        return "didn't get any information"


def show_lob_class(lib, s):
    print(get_lob_class(lib, s))


def writeStringInPackageFile(packageFile, content):
    packageFile = packageFile.replace('dart:', 'package:dart/')
    filename = packageFile.replace('package:', 'out/')
    filePath = filename[0:filename.rfind('/')]
    content = '// {0} \n'.format(packageFile)+content
    if os.path.exists(filePath) == False:
        os.makedirs(filePath)
    file = open(filename, 'w')
    file.write(content)
    file.close()


def getFiles(elfFile, filter):
    s = parse_elf_snapshot(elfFile)
    populate_native_references(s)
    allLibrary = sorted(s.getrefs('Library'),
                        key=lambda x: x.x['url'].x['value'])
    for tempLibrary in allLibrary:
        name = tempLibrary.x['url'].x['value']
        if filter in name:
            print(name + ' start generating...')
            writeStringInPackageFile(
                name, get_lob_class(s.strings[name].src[1][0], s))
            print(name + ' generated successfully ✅')


# Start execution
getFiles('samples/arm-app.so', '')
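
The way writeStringInPackageFile turns a library URL into an output path can be checked in isolation. The sketch below mirrors its two replace calls. The helper name `package_url_to_path` and the `out_dir` parameter are hypothetical, added only for illustration.

```python
import os


def package_url_to_path(package_url, out_dir="out"):
    """Mirror the URL-to-path mapping used by writeStringInPackageFile."""
    # 'dart:core' is first filed under a synthetic 'dart' package.
    normalized = package_url.replace("dart:", "package:dart/")
    # 'package:foo/bar.dart' then becomes '<out_dir>/foo/bar.dart'.
    filename = normalized.replace("package:", out_dir + "/")
    return os.path.dirname(filename), filename


directory, filename = package_url_to_path("package:flutter/src/widgets/framework.dart")
print(directory)
print(filename)
```

Each library therefore ends up in its own file under out/, in a directory structure that follows the package URL.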

The script above ultimately exports the reconstructed class definitions of all specified files. The export result for one class from another company's app is as follows:

6.png

The indexes of class objects and functions are annotated in the output. You can keep tracing them in the console with s.refs[xxxxx].x.
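
When tracing by hand, the long `.x[...]` chains used in the script quickly become tedious and raise KeyError when a field is missing. Here is a small sketch of a helper that follows such a chain safely. It assumes only what the script above relies on, namely that each ref exposes a dict-like `.x` attribute. `FakeRef` is a stand-in for testing without darter, and `dig` is a hypothetical name.

```python
class FakeRef:
    """Stand-in for a darter ref: it only mimics the `.x` dict used above."""

    def __init__(self, **fields):
        self.x = fields


def dig(ref, *keys, default=None):
    """Follow ref.x[key] for each key; return default if any link is missing."""
    current = ref
    for key in keys:
        fields = getattr(current, "x", None)
        if not isinstance(fields, dict) or key not in fields:
            return default
        current = fields[key]
    return current


# Equivalent of s.refs[i].x['super_type'].x['_class'].x['name'].x['value']
cls = FakeRef(
    name=FakeRef(value="MyWidget"),
    super_type=FakeRef(_class=FakeRef(name=FakeRef(value="StatelessWidget"))),
)
print(dig(cls, "name", "value"))
print(dig(cls, "super_type", "_class", "name", "value"))
print(dig(cls, "functions", "value", default="<none>"))
```

With a real snapshot, the same call would look like `dig(s.refs[xxxxx], 'name', 'value')`, assuming the ref objects behave as they do in the script above.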