When fintech meets cloud native, how does Ant Financial do security architecture?


Ant Financial has spent the past 15 years reshaping payments and changing daily life, serving more than 1.2 billion people worldwide, and all of this has been underpinned by technology. At the 2019 Hangzhou Yunqi Conference, Ant Financial will share its 15 years of accumulated technical experience and its future-oriented financial technology innovation with participants. We have compiled some of the best talks and will publish them on the “Financial-Level Distributed Architecture” public account; this article is one of them.

He Zhengyu, founder of gVisor and researcher at Ant Financial

Under the cloud native trend, security is a major obstacle to the financial industry adopting cloud native technology, yet the cloud native community pays far too little attention to it. When Ant Financial puts cloud native into production, solving security problems is the top priority. Through exploration and practice, we have built a full-link, financial-grade cloud native security architecture that runs from the underlying hardware to the software and from the system layer to the application layer.

Trust is the most important thing in the financial industry. We believe the trust that security brings is an intangible product that underpins all financial business.

As the Internet era has developed, the financial industry and its institutions have changed a great deal: there are more access channels such as apps and mini programs, business changes faster, and there are more third-party suppliers. No matter how things change, one thing stays the same in the financial industry: Zero Fault, that is, zero tolerance for errors, which means extremely high requirements for stability and security.

I also want to clear up a misconception about the financial industry. People say that financial institutions carry many legacy systems and that much of their technology is more than a decade old, and they conclude that financial institutions are technologically backward. In fact, the financial industry has always been very high-tech. A movie that came out a while ago, “The Hummingbird Project,” is based on a real story about a group of high-frequency traders who built a fiber-optic line thousands of miles long from Kansas City to the New York Stock Exchange to cut the transmission time between the two, fighting for the last millisecond. So the financial industry does not only run plain, conservative technology; it also pursues the most cutting-edge technology. Our mission is to further arm the financial industry with technology and inject more vitality into fintech.

Cloud native architecture represents a new kind of productivity. The financial industry certainly needs cloud native for the cost savings and agile development it brings, but with one qualifier in front of it: secure cloud native architecture. This means more than the relatively simple security schemes of the past; it means a trusted, end-to-end, full-link security solution. It includes clear code ownership; trusted boot; a closed loop for image production and release; an account system with clear application ownership and access rights; and a fine-grained isolation scheme that can be deployed independently, building security policy and enforcement into the infrastructure while staying transparent to software development and testing.

Here we focus on several cloud native security technologies that Ant Financial is putting into production: Service Mesh for cloud native network security, secure containers, and confidential computing.

Cloud native network security: SOFAMesh

At present, the second key technology in cloud native besides containers is Service Mesh. In Ant Financial's practice, it has proven very helpful for financial security. It can do at least three things:

  • Policy-based, efficient traffic control, which helps operations adapt quickly to rapid business changes;
  • Full-link encryption, which protects end-to-end data security;
  • Traffic interception and analysis, so that traffic is blocked when abnormal traffic or containers are detected.

In addition, all of this is transparent to the business and adds no burden to business development. We can also perform real-time semantic analysis of traffic and similar tasks, which goes beyond what a traditional firewall can do.
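The three capabilities above can be illustrated with a small sketch of the decision a sidecar proxy makes for each request. This is only an illustration under stated assumptions: `TrafficPolicy`, `Request`, and `decide` are hypothetical names invented here, not part of SOFAMesh, SOFAMosn, or Istio.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TrafficPolicy:
    """Hypothetical per-service policy that a sidecar proxy might enforce."""
    allowed_callers: set[str]
    require_tls: bool = True
    blocked_sources: set[str] = field(default_factory=set)


@dataclass
class Request:
    source: str       # identity of the calling workload
    destination: str  # identity of the target service
    encrypted: bool   # whether this hop is encrypted


def decide(policies: dict[str, TrafficPolicy], req: Request) -> str:
    """Return "allow" or "deny" for one request."""
    policy = policies.get(req.destination)
    if policy is None:
        return "deny"   # default deny: no policy for this destination
    if req.source in policy.blocked_sources:
        return "deny"   # source was flagged as abnormal, so block its traffic
    if policy.require_tls and not req.encrypted:
        return "deny"   # full-link encryption is mandatory
    if req.source not in policy.allowed_callers:
        return "deny"   # caller is not authorized for this service
    return "allow"


policies = {"payment": TrafficPolicy(allowed_callers={"checkout"})}
print(decide(policies, Request("checkout", "payment", encrypted=True)))   # allow
print(decide(policies, Request("checkout", "payment", encrypted=False)))  # deny

# Flagging a workload takes effect without touching business code.
policies["payment"].blocked_sources.add("checkout")
print(decide(policies, Request("checkout", "payment", encrypted=True)))   # deny
```

Because the check lives in the proxy rather than in the application, the policy can change at runtime while the business code stays unchanged, which is the transparency described above.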


While exploring Service Mesh, Ant Financial launched SOFAMesh, which is written in Go and has been open sourced. We hope to work with the community to make the concept and technology of Service Mesh more widespread.

SOFAMesh is a large-scale production implementation of Service Mesh based on Istio. It inherits Istio's powerful and rich features and improves on them to meet the performance requirements of large-scale deployment and the practical realities of production adoption. The improvements include replacing Envoy with SOFAMosn, written in Go, which greatly lowers the difficulty of developing the mesh itself, plus some original work: merging Mixer into the data plane to remove a performance bottleneck, enhancing Pilot to provide a more flexible service discovery mechanism, and adding support for SOFARPC, Dubbo, and other protocols.

For more details, see SOFAMesh's GitHub home page: https://github.com/sofastack/…

Ant Financial took the lead in deploying SOFAMesh at scale in production. More than 100,000 containers have been meshed, and the deployment smoothly supported the 618 sales promotion. It has brought us multi-protocol support, UDPA, smooth upgrades, and security benefits, with only a slight impact on performance: CPU overhead per hop rose by 5%, and RT increased by less than 0.2 ms. For some services, moving parts of the business link down into the mesh during the migration even lowered RT by 7%.

Secure containers: Kata Containers

Traditional container architecture

The move from virtual machines to traditional containers actually came at the expense of isolation. As the figure above shows, applications running in containers share the same CPU, memory, network, and storage; they only look separate from the outside. This leads to security problems: there is no real isolation between containers, so a security problem in one container is likely to affect other containers or even compromise the entire system. Ant Financial's work in this area is secure containers, specifically Kata Containers.

Secure container architecture

Kata Containers is a top-level open infrastructure project of the OpenStack Foundation, led by Ant Financial and Intel. With secure containers, each Pod runs in its own sandbox and does not share a kernel with other Pods, which provides strong security. Here are some recent developments in Kata Containers that significantly improve performance, the issue people care about most:

  • With shimv2, the number of auxiliary processes per Pod dropped from 2N+2 to 1;
  • virtio-fs was introduced, improving file system performance by about 70% to 90%;
  • Firecracker was introduced, cutting VMM memory overhead from 60MB to about 15MB;
  • The agent was reimplemented in Rust, shrinking its footprint from 11MB to about 1MB.
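To get a feel for what these numbers mean at scale, here is a rough back-of-the-envelope calculation. It assumes that N in "2N+2" is the number of containers in the Pod, and that the VMM and agent savings simply add up per Pod; both are assumptions made for illustration, and the function names are hypothetical.

```python
# Figures quoted in the list above.
VMM_MB_BEFORE, VMM_MB_AFTER = 60, 15      # VMM memory overhead
AGENT_MB_BEFORE, AGENT_MB_AFTER = 11, 1   # agent footprint
HELPERS_AFTER = 1                         # auxiliary processes with shimv2


def helper_processes_before(containers: int) -> int:
    """Auxiliary processes per Pod before shimv2: 2N+2.

    Assumption: N is the number of containers in the Pod.
    """
    return 2 * containers + 2


def memory_saved_mb(pods: int) -> int:
    """Memory saved across `pods` Pods, assuming both savings apply to each Pod."""
    per_pod = (VMM_MB_BEFORE - VMM_MB_AFTER) + (AGENT_MB_BEFORE - AGENT_MB_AFTER)
    return pods * per_pod


# A Pod with 3 containers: 8 auxiliary processes before, 1 after.
print(helper_processes_before(3) - HELPERS_AFTER)  # 7
# 100 Pods on one host: (45 + 10) MB saved per Pod.
print(memory_saved_mb(100))                        # 5500
```

Under these assumptions, a host running 100 sandboxed Pods would save roughly 5.5GB of memory, which shows why per-Pod overhead matters for dense deployments.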

Together with the community, we will continue to build Kata Containers and make secure containers the standard for cloud native.

Secure containers can effectively protect the host, but the financial business itself still needs stronger isolation. Ant Financial therefore introduced confidential computing and, based on real scenarios, developed SOFAEnclave, a solution for deploying it at scale.

Confidential computing middleware: SOFAEnclave

Confidential computing is defined as a solution based on a Trusted Execution Environment (TEE), also called an enclave, such as Intel SGX or ARM TrustZone. It isolates user data while the data is being accessed in memory, so that the data is not exposed to other applications, the operating system, or other tenants of the cloud server.

Enclave architecture

An enclave provides two-way protection at runtime. For example, when your financial business runs inside an enclave, the operating system cannot see the enclave's memory, and integrity checks ensure that the code accessing the enclave has not been replaced.
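The integrity-check half of this protection can be sketched as comparing a measurement (a hash) of the code against an expected value before it is allowed to run. This is a conceptual illustration only: in a real TEE the measurement and the check are performed by hardware and its attestation protocol, not by application code, and the function names below are hypothetical.

```python
import hashlib
import hmac


def measure(code: bytes) -> str:
    """Compute a measurement of the code (here simply a SHA-256 digest)."""
    return hashlib.sha256(code).hexdigest()


def verify_before_launch(code: bytes, expected_measurement: str) -> bool:
    """Accept the code only if its measurement matches the expected one."""
    # compare_digest runs in constant time, avoiding timing side channels.
    return hmac.compare_digest(measure(code), expected_measurement)


trusted_code = b"def transfer(src, dst, amount): ..."
expected = measure(trusted_code)  # recorded when the code is built and signed

print(verify_before_launch(trusted_code, expected))                 # True
print(verify_before_launch(trusted_code + b" # patched", expected)) # False
```

Any change to the code, even a single byte, changes the measurement, so replaced code is rejected before it can touch protected data.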

However, enclaves currently have problems that keep them from being used in real production environments. In summary, these problems are:

First, applications need to be rewritten: there is no kernel or base library inside the trusted execution environment, so an application cannot run directly in an enclave. Second, applications need to be partitioned: the business program has to be split into an enclave part and a non-enclave part. Third, there is no cluster support: unlike in client scenarios, how an enclave application fails over is another factor that prevents large-scale use in the data center.

Ant Financial's answer to these problems is the SOFAEnclave confidential computing middleware.

SOFAEnclave architecture

SOFAEnclave consists of three components: Occlum LibOS, SOFAst, and KubeTEE. Occlum is a memory-safe, multitasking enclave kernel developed by Ant Financial, Intel, and Tsinghua University. It provides operating system kernel functionality as a library (a LibOS) that is linked into the enclave, and adds capabilities to the enclave in this way. We have also found a novel way to run multiple processes inside an enclave, which makes enclaves suitable for large applications.

For more technical details on SOFAEnclave, see this article: “SOFAEnclave: Ant Financial's next generation of trusted programming environments, enabling confidential computing to keep the financial business running for 102 years.”

GitHub home page of Occlum, the open source module of SOFAEnclave: https://github.com/occlum/occlum

When we weave these security components together with the cloud native framework into one panorama, the result is the secure cloud native architecture for financial services that we are building: based on Alibaba Cloud and Kubernetes, it ensures end-to-end security for financial services.


Some of these components were open sourced after being validated in production at Ant Financial and are now developed together with partners and the community; others were developed in the community from the start. Unlike traditional technology development in the financial industry, we advocate an open architecture and believe that open, open-source governance is indispensable to it. We will continue to participate in and support community-based open development, and work with the community to build the next generation of financial-grade cloud native technologies.

Extension: Ant Financial's contributions to the cloud native space



SOFAMosn (Modular Observable Smart Network) is a Service Mesh data plane proxy written in Go. It aims to provide services with distributed, modular, observable, and intelligent proxy capabilities. SOFAMosn integrates with SOFAMesh through the xDS API, and it can also be used as a standalone layer 4 and layer 7 load balancer. In the future, SOFAMosn will support more cloud native scenarios and the core forwarding functions of nginx.

During this year's 618 promotion, Ant Financial completed verification of SOFAMosn on its core systems. In the upcoming Double 11 this year, Alibaba and Ant Financial will roll out Service Mesh on their core systems at large scale.
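The difference between layer 4 and layer 7 load balancing mentioned for SOFAMosn can be shown with a minimal sketch. A layer 4 balancer picks a backend without looking at the request, while a layer 7 balancer first inspects application data such as the URL path. The class names and addresses below are hypothetical and do not reflect SOFAMosn's actual implementation or configuration.

```python
from __future__ import annotations

from itertools import cycle


class RoundRobinBalancer:
    """Layer 4 style: choose the next backend without inspecting the request."""

    def __init__(self, backends: list[str]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._next = cycle(backends)

    def pick(self) -> str:
        return next(self._next)


class PathRouter:
    """Layer 7 style: choose a backend pool by URL path, then balance within it."""

    def __init__(self, routes: dict[str, RoundRobinBalancer]) -> None:
        # Try the longest prefix first so "/pay/refund" wins over "/pay".
        self._routes = sorted(routes.items(), key=lambda kv: len(kv[0]), reverse=True)

    def route(self, path: str) -> str:
        for prefix, balancer in self._routes:
            if path.startswith(prefix):
                return balancer.pick()
        raise LookupError(f"no route for {path}")


l4 = RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080"])
print([l4.pick() for _ in range(3)])
# ['10.0.0.1:8080', '10.0.0.2:8080', '10.0.0.1:8080']

l7 = PathRouter({
    "/pay": RoundRobinBalancer(["pay-1", "pay-2"]),
    "/pay/refund": RoundRobinBalancer(["refund-1"]),
})
print(l7.route("/pay/refund/42"))  # refund-1
print(l7.route("/pay/order"))      # pay-1
```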



ElasticDL is Ant Financial's next-generation, cloud native, open source AI learning platform. Its architecture is built natively on Kubernetes, with strong fault tolerance and elastic scheduling capabilities. ElasticDL also supports the new-generation TensorFlow 2.0 framework and hopes to lead AI developers into a new generation of machine learning.

In the future, ElasticDL will support more AI models, becoming more powerful and better integrated with cloud native systems and Kubernetes.
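One common way to obtain the fault tolerance and elastic scheduling described for ElasticDL is to split the training data into shards and hand them out from a central queue, so that shards held by a failed worker can be reassigned. The sketch below illustrates that general idea under this assumption; `TaskQueue` and its methods are hypothetical and are not ElasticDL's actual API.

```python
from __future__ import annotations

from collections import deque


class TaskQueue:
    """Hands out data shards to workers and re-queues shards of failed workers."""

    def __init__(self, num_shards: int) -> None:
        self._todo: deque[int] = deque(range(num_shards))
        self._doing: dict[int, str] = {}  # shard -> worker currently holding it
        self.done: set[int] = set()

    def get(self, worker: str) -> int | None:
        """Give the next shard to a worker, or None if nothing is waiting."""
        if not self._todo:
            return None
        shard = self._todo.popleft()
        self._doing[shard] = worker
        return shard

    def report_done(self, shard: int) -> None:
        del self._doing[shard]
        self.done.add(shard)

    def worker_failed(self, worker: str) -> list[int]:
        """Put every shard held by a failed worker back into the queue."""
        lost = [s for s, w in self._doing.items() if w == worker]
        for shard in lost:
            del self._doing[shard]
            self._todo.append(shard)
        return lost


queue = TaskQueue(num_shards=3)
first = queue.get("worker-1")        # shard 0
second = queue.get("worker-2")       # shard 1
queue.worker_failed("worker-1")      # shard 0 returns to the queue
queue.report_done(second)

# worker-2 keeps pulling shards until the queue is empty.
while (shard := queue.get("worker-2")) is not None:
    queue.report_done(shard)

print(sorted(queue.done))  # [0, 1, 2]
```

Because workers only pull work from the queue, workers can join or leave at any time without restarting the job, which is what makes the scheduling elastic.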