Why Focus on Streaming Media? An Inside Look at PPIO Technology

Time:2019-8-24

At 8 a.m. on a weekday, Lisa takes out her phone on the subway and opens TikTok to kill half an hour of commuting time. At 12 p.m., after lunch, she uses her lunch break to watch funny videos on YouTube. At 8 p.m., after a busy day, she lies down on the sofa, turns on the TV, and searches Netflix and Hulu for the latest movies to fill her evening. Does this sound familiar? We spend a lot of our online time on audio and video applications every day, often without being aware of it.

Why PPIO Focuses on Audio and Video

According to the Global Internet Phenomena Report of October 2018, video applications account for about 58% of total Internet traffic. As the report points out, video's share of global application traffic has grown to an unprecedented level.

PPIO is a decentralized storage and distribution platform for developers that aims to make data storage cheaper, faster and more private. The official website is pp.io.

In the design of PPIO, audio and video is the top priority. PPIO should not only support the mainstream audio and video transmission protocols smoothly, but also deliver good quality of service (QoS). To better understand how PPIO distributes streaming audio and video data, we first review the commercial architecture of PPIO.

PPIO will provide three sets of APIs in succession:

  1. IaaS-layer APIs for renting storage space and bandwidth.
  2. PaaS-layer APIs: POSS, PCDN and PRoute.
  3. Application Service layer APIs for on demand, live broadcasting and more.

Developers can build on any of these layers to complete their own app or DApp.

If PPIO is compared with AWS cloud computing services, the layers correspond as follows:

In the PPIO framework, the streaming audio and video APIs sit in the Application Services layer, because their scenarios are very close to the application, whereas PCDN belongs to the PaaS layer. The rest of this article focuses on PPIO's design for streaming video, especially the parts related to streaming video data distribution.

CDN and PCDN

CDN stands for Content Delivery Network. It is one of the core pieces of infrastructure of the modern Internet. A CDN stores its data at a central origin, but that center is not close to every user. The CDN architecture therefore deploys CDN nodes in many metropolitan area networks (MANs) at the edge; these are the nodes closest to users in the whole architecture. They cache data, so when users request it, the nodes can serve it directly and quickly, which ensures a good user experience. The following diagram shows the architecture of the CDN node.

CDN technology has been developing for many years. Many companies are engaged in the CDN business, and large-scale commercial applications have emerged. By the end of 2018, the global CDN market was worth a total of 20 billion US dollars.

In designing PCDN, PPIO does not propose a brand-new data distribution scheme. Instead, it adds P2P transmission as a supplement to the existing CDN distribution scheme, so that the data distribution service stays compatible with the existing scheme while becoming cheaper and faster. The following is a design diagram of PCDN compatible with existing CDN schemes.

Fragmentation and Media

Fragmentation is the basis of P2P transmission, and the fragmentation rule is very important for a P2P system. So what is fragmentation? Fragmentation means segmenting and numbering a file or stream according to a uniform rule. Each resulting unit is called a Piece, which is the unit of P2P transmission. Two Pieces with the same number are considered to be the same Piece. In traditional P2P protocols, fragmentation is done in a centralized way; for example, in BitTorrent the seed user (the first user) fragments the file, and every later peer in the network follows the first user's fragmentation.

PCDN fragmentation is different. PCDN uses P2SP technology, where S refers to Server. That is, the original source of P2SP data is not a peer node but a server with standardized output. It may speak HTTP or other protocols such as HTTP/2 or QUIC+HTTP/3. Such servers are standardized under the CDN system and have no ability to slice data. So in PPIO's PCDN design, the slicing is done in a distributed way by the Peers (the ordinary nodes of the P2P network). This requires all nodes to slice the same resource according to the same slicing rules, so that the slices of Peer 1 (node 1) and Peer 2 (node 2) are consistent.

PPIO fragmentation depends on the file structure or the streaming media protocol. This article first introduces how PPIO is compatible with two mainstream streaming media transmission modes and the specific scheme for each. One is the segmented streaming mode, including HLS (HTTP Live Streaming) and DASH; the other is the HTTP continuous streaming mode, such as HTTP+FLV. In addition, PPIO will support two video data distribution scenarios: live and on demand.

First, let's talk about the fragmentation of ordinary files. Note that the ordinary files here are not streaming video files and do not have the characteristics of streaming media.

We begin with some terminology.

Segment: A file, a VOD stream, or a large section of a live stream; its length is not necessarily fixed. A file can be a single Segment; a live stream consists of multiple Segments; a VOD stream may, depending on the situation, be one Segment or be composed of multiple Segments.

Piece: The smallest unit of P2SP scheduling, represented by one bit of a bitmap in the P2P protocol.

SubPiece: The final transmission unit of the P2P protocol. It is kept smaller than the MTU so that it fits in a single UDP packet (generally 1350 bytes). UDP is widely used at the bottom layer of PPIO, and if a UDP message is larger than the MTU, the packet loss rate increases greatly.

TS: Transport Stream, the original segment of a segmented stream. We take HLS as the example, so it is written as TS. For a DASH stream, it corresponds to fMP4.

TSP: Each Piece of a TS after fragmentation.

VS: Video Segment, used for HTTP continuous streams. For on demand, a VS is a fixed-size slice; for live broadcast, a VS is cut at I-frame boundaries subject to a minimum length.

VSP: Each Piece of a Video Segment after slicing.

# 1. Fragmentation of Ordinary Files

As shown in the figure above,

The file is partitioned into FSs (file Segments) of equal length, so the last FS may be smaller.

Each FS is partitioned into FSPs (the Pieces of an FS) of equal length, so the last FSP may be smaller.

An FS corresponds to a bitmap, and a bit in the bitmap corresponds to an FSP.

If the file is an ordinary video file, dragging (seeking) is supported transparently, and so is playing while downloading. If the user drags, the player specifies a byte range, the FS index is calculated from the range and the fixed FS size, and the required FSPs in the relevant FS are requested. After the relevant FSPs are downloaded, they are merged into a stream and passed to the player.
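The range-to-piece calculation described above can be sketched as follows. This is a minimal illustration rather than PPIO's actual implementation: the sizes `FS_SIZE` and `FSP_SIZE` and all function names are assumptions chosen for the example.

```python
# Assumed sizes for illustration only; the article does not state PPIO's values.
FS_SIZE = 4 * 1024 * 1024   # length of one FS (file Segment)
FSP_SIZE = 16 * 1024        # length of one FSP (Piece of an FS)


def fs_count(file_size):
    """Number of FSs in a file; the last FS may be shorter than FS_SIZE."""
    return -(-file_size // FS_SIZE)  # ceiling division


def fsp_count(file_size, fs_index):
    """Number of FSPs in the given FS, i.e. the number of bits in its bitmap."""
    fs_len = min(FS_SIZE, file_size - fs_index * FS_SIZE)
    return -(-fs_len // FSP_SIZE)


def pieces_for_range(file_size, start, end):
    """Map the byte range [start, end) to the (fs_index, fsp_index) pairs it touches."""
    end = min(end, file_size)
    pieces = []
    if start >= end:
        return pieces
    for fs in range(start // FS_SIZE, (end - 1) // FS_SIZE + 1):
        fs_start = fs * FS_SIZE
        lo = max(start, fs_start) - fs_start            # offsets inside this FS
        hi = min(end, fs_start + FS_SIZE) - fs_start
        for fsp in range(lo // FSP_SIZE, (hi - 1) // FSP_SIZE + 1):
            pieces.append((fs, fsp))
    return pieces
```

Because every Peer derives the indexes from the same fixed sizes, two Peers that compute `pieces_for_range` for the same file and range arrive at the same Piece numbers without coordinating with each other.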

# 2. On Demand

This section covers how the PPIO architecture fragments the two streaming media transmission modes for on demand.

# 2.1 HTTP Segmented Stream On Demand

Here we take HLS as an example; other segmented streams such as DASH are similar.

First, get the m3u8 from the video server and read the TS file list inside it. Generally speaking, the TSs of an HLS VOD stream are not necessarily equal in length. Each TS is partitioned into TSPs of the same length, but the last TSP may be smaller. A TS corresponds to a bitmap, and a bit in the bitmap corresponds to a TSP.

Dragging is also supported: when the user drags, the player specifies the TS index, and the relevant content is downloaded according to that index. After the relevant TS is downloaded, it is passed to the player for playback.
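As a rough sketch of the HLS case, the snippet below reads the TS list from a media playlist and creates one bitmap per TS. The piece size `TSP_SIZE`, the function names, and the idea that the TS sizes are known in advance (for example from an HTTP Content-Length header) are assumptions made for this example, not details given by PPIO.

```python
TSP_SIZE = 16 * 1024  # assumed TSP length; the article does not give PPIO's value


def ts_list_from_m3u8(playlist_text):
    """Return the segment URIs of a media playlist.

    In an m3u8 media playlist, every non-empty line that does not start
    with '#' is a segment URI; lines starting with '#' are tags or comments.
    """
    uris = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            uris.append(line)
    return uris


def new_tsp_bitmap(ts_size):
    """One bit per TSP of a TS; False means the TSP is not downloaded yet."""
    return [False] * (-(-ts_size // TSP_SIZE))  # ceiling division


def bitmaps_for_stream(ts_sizes):
    """Build a bitmap for each TS, keyed by TS index. TSs may differ in length."""
    return {index: new_tsp_bitmap(size) for index, size in enumerate(ts_sizes)}
```

When the player seeks to a TS index, only the bitmap for that index needs to be consulted to decide which TSPs are still missing.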

# 2.2 HTTP Continuous Stream on Demand

Take HTTP + FLV as an example.

The Tags in the figure are the original data units of the streaming media. The VSs of an FLV VOD stream are equal in length, though the last VS may be smaller; the VSPs of each VS are also equal in length, though the last VSP may be smaller. Each VS corresponds to a bitmap, and one bit in the bitmap corresponds to a VSP. FLV on demand is sliced the same way as a file download.

This also supports playing while downloading and dragging: when the user drags, the player specifies the range, the VS index is calculated from the range and the fixed Segment size, and the relevant VS is requested. After downloading, the stream is merged and passed to the player.

# 3. Live Broadcasting

From the slicing point of view, live broadcasting is more complicated than on demand. A live stream has no fixed beginning or end, so whenever a user starts watching, the downloaded data comes from the middle of the stream. All users' data must be fragmented according to the same rules, and the fragments must also be synchronized across users. In addition, live broadcasts generally offer a playback function as well.

This section covers how the PPIO architecture fragments the two streaming media transmission modes for live broadcasting.

# 3.1 HTTP Segmented Streaming Live Broadcasting

We take HLS as an example; DASH and other segmented streams are similar.

The slicing is similar to HLS on demand. Suppose the current live m3u8 file lists TS1, TS2, TS3, TS4 and TS5. Depending on the reference delay setting, playback starts from one of these TSs. Starting from TS1 gives the longest delay, so there are more opportunities to get data from the P2P network and P2P utilization is highest. Starting from TS5 gives the shortest delay, so there are fewer opportunities to get data from the P2P network and P2P utilization is lowest. Starting from TS3 is a compromise.

# 3.2 HTTP Continuous Streaming Live Broadcasting

A live HTTP continuous stream is a single stream with no fixed beginning or end. Here we take HTTP + FLV as an example.

The VSs of an FLV live stream are not necessarily equal in length. Each VS starts at a key frame boundary and is at least a minimum time unit long. The slicing algorithm ensures that every frame in each VS is complete and that each VS contains a key frame.
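The key-frame rule above can be illustrated with a small sketch. The frame model (a timestamp in milliseconds plus a key-frame flag), the function name, and the parameter `min_duration_ms` are assumptions for this example. The sketch does not model how Peers that join at different moments agree on the same cut points, which the article says is also required.

```python
def slice_live_stream(frames, min_duration_ms):
    """Group frames into VSs.

    frames: iterable of (timestamp_ms, is_key_frame) tuples in stream order.
    A new VS is started at a key frame once the current VS has lasted at
    least min_duration_ms, so every VS begins with a key frame and contains
    only whole frames.
    """
    segments = []
    current = []
    for timestamp, is_key in frames:
        if not current:
            if not is_key:
                continue  # drop leading frames until the first key frame
            current.append((timestamp, is_key))
            continue
        if is_key and timestamp - current[0][0] >= min_duration_ms:
            segments.append(current)  # close the VS just before this key frame
            current = []
        current.append((timestamp, is_key))
    if current:
        segments.append(current)  # the newest VS, possibly still growing
    return segments
```

With a key frame every second and a minimum of two seconds, for instance, each closed VS spans two key-frame intervals, which is why VS lengths vary with the encoder's key-frame spacing.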

Suppose the current live stream consists of VS1, VS2, VS3, VS4 and VS5. Depending on the reference delay setting, playing from VS1 gives the longest delay, the most opportunities to get data from P2P, and the highest P2P utilization; playing from VS5 gives the shortest delay, the fewest opportunities to get data from P2P, and the lowest P2P utilization; playing from VS3 is a compromise.
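The trade-off between delay and P2P utilization comes down to choosing a starting Segment. A minimal sketch, assuming the reference delay is given in seconds and each Segment's duration is known (the function and parameter names are hypothetical):

```python
def choose_start_index(durations, reference_delay):
    """Pick the Segment to start playing from.

    durations: Segment durations in seconds, oldest first (VS1 ... VSn).
    reference_delay: desired distance behind the live edge, in seconds.
    Returns the 0-based index of the newest Segment whose start lies at
    least reference_delay seconds behind the live edge, or 0 if the window
    is shorter than the requested delay.
    """
    behind = 0.0
    for index in range(len(durations) - 1, -1, -1):
        behind += durations[index]
        if behind >= reference_delay:
            return index
    return 0
```

With five 2-second Segments, a 10-second delay starts at index 0 (VS1), a 6-second delay at index 2 (VS3), and a 2-second delay at index 4 (VS5). The larger the delay, the more Segments other Peers are likely to hold already.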

In addition to supporting HTTP segmented streams and HTTP continuous streams, PPIO plans to gradually support other media formats and protocols.

Fragmentation only establishes the units and order of P2SP download; an efficient transmission architecture matters just as much. Having introduced distributed slicing, we now discuss how to use the P2SP network for efficient data transmission.

P2SP transmission combines P2P download with download from the Server. The problem that data transmission has to solve is choosing the right download source at the right time.
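The article does not spell out PPIO's scheduling policy. Purely as an illustration of what "the right source at the right time" could mean, the sketch below uses a common P2SP heuristic: Pieces that are needed soon go to the Server, and the rest go to Peers when any Peer has them. The rule, the threshold `urgent_window`, and all names are assumptions, not PPIO's algorithm.

```python
def choose_source(seconds_until_playback, peers_with_piece, urgent_window=3.0):
    """Decide where to request a Piece from (hypothetical policy).

    seconds_until_playback: time left before the player needs the Piece.
    peers_with_piece: number of known Peers whose bitmap has this Piece.
    urgent_window: assumed threshold below which a Piece counts as urgent.
    """
    if seconds_until_playback <= urgent_window:
        return "server"  # urgent: use the reliable, low-latency source
    if peers_with_piece == 0:
        return "server"  # nobody in the P2P network has it yet
    return "p2p"         # not urgent and available from Peers
```

A policy of this shape protects playback smoothness first and saves Server bandwidth only when there is time to spare.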

This is the full-node architecture of PPIO's PCDN. The roles in it are introduced below.

# 1. CDN node:

A CDN node is the node nearest to the user in the whole CDN architecture; users can get data directly from it. CDN nodes have been developed for many years and now support a variety of transport protocols, including HTTP, HTTP/2 and QUIC+HTTP/3.

# 2. Mapping node:

A CDN identifies each resource by its own unique ID, and P2P has its own unique resource ID as well. The two IDs differ, so they must be mapped to each other for cooperative P2SP transmission. This is the job of Mapping nodes: they map between the two kinds of ID and provide a query function.

The Mapping node is a commercial node. Developers can build Mapping nodes for their own application scenarios, because only the developers themselves know best whether the unique resource ID in the CDN and the unique P2P resource ID in PPIO are consistent.

Developers who do not build their own Mapping nodes can connect to a common Mapping node, which maintains the correspondence between the URL in the CDN and the RID in PPIO.

Is the Mapping node necessary? No. A Mapping node only provides a correspondence; if developers can compute that correspondence offline with a simple algorithm, no Mapping node is needed.
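Both options can be sketched briefly. The article does not say how PPIO derives a RID, so the hashing rule in `rid_from_url` (SHA-1 over host and path, ignoring scheme and query string) is a made-up example of a "simple algorithm", and `MappingNode` is a hypothetical in-memory stand-in for a real Mapping service.

```python
import hashlib
from urllib.parse import urlsplit


def rid_from_url(url):
    """Hypothetical offline mapping: derive a RID from the URL itself.

    Scheme and query string are ignored so that the same resource fetched
    with different access tokens or over http/https gets the same RID.
    """
    parts = urlsplit(url)
    key = parts.netloc.lower() + parts.path
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


class MappingNode:
    """Hypothetical lookup table from CDN URL to PPIO RID."""

    def __init__(self):
        self._url_to_rid = {}

    def register(self, url, rid):
        self._url_to_rid[url] = rid

    def lookup(self, url):
        """Return the RID for a URL, or None if the URL is unknown."""
        return self._url_to_rid.get(url)
```

An application whose URLs already identify resources uniquely could use something like `rid_from_url` on every client and skip the Mapping node; one whose URLs vary unpredictably per request would need the lookup table.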

# 3. Peer node:

A Peer is a node in the P2P network. In the PPIO network, a Peer may be a storage node, a user, or both (that is, a node that both uploads and downloads data). In PPIO's supply-and-demand and blockchain design, storage nodes and users are usually described as different roles. In the P2P network, however, most of their functions and code are identical, so they are all called Peer nodes, and they are equals in the transmission protocol.

# 4. Tracker node

The role of the Tracker node in PPIO is close to its role in BitTorrent. It mainly manages the relationship between RIDs (resource IDs, each identifying a file or stream) and P2P nodes: for each RID, the Tracker records the Peer nodes that own the resource. When a Peer wants a resource, it first queries the Tracker for an initial batch of Peers, and then downloads data from those Peers. It can then query that first batch for more Peers using a “flooding” mechanism, so that its local Peer list grows larger and larger until almost all Peers are found.
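The two discovery steps, an initial Tracker query followed by flooding, can be sketched as below. The class, the method names, and the `neighbors_of` callback (standing in for a network request that asks a Peer which other Peers it knows) are hypothetical; PPIO's real protocol messages are not described in the article.

```python
from collections import deque


class Tracker:
    """Hypothetical in-memory Tracker: maps each RID to the Peers that own it."""

    def __init__(self):
        self._peers_by_rid = {}

    def announce(self, rid, peer):
        """Record that a Peer owns the resource."""
        self._peers_by_rid.setdefault(rid, set()).add(peer)

    def query(self, rid, limit=2):
        """Return an initial batch of at most `limit` Peers for the resource."""
        return sorted(self._peers_by_rid.get(rid, set()))[:limit]


def discover_peers(initial_peers, neighbors_of):
    """Flooding: repeatedly ask every newly found Peer for the Peers it knows."""
    known = set(initial_peers)
    queue = deque(initial_peers)
    while queue:
        peer = queue.popleft()
        for other in neighbors_of(peer):
            if other not in known:
                known.add(other)
                queue.append(other)
    return known
```

The Tracker only has to return a small first batch; the rest of the Peer list is built up from the Peers themselves.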

At this point you may be wondering: isn't the Tracker the “center” of the network? If this “center” has a problem, won't the whole network have a problem? So does the Tracker have to exist? Of course not. The Tracker only discovers the initial nodes, and PPIO has another mechanism for discovering the initial Peer nodes: DHT, a distributed hash table. PPIO uses the KAD algorithm to implement DHT. However, finding the initial nodes through DHT is less fast and efficient than going through the Tracker.
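For readers unfamiliar with KAD (Kademlia), its core idea is that node IDs and resource keys share one ID space, and the "distance" between two IDs is their bitwise XOR. A lookup repeatedly moves toward the nodes whose IDs are closest to the key. The sketch below shows only that distance metric with small integer IDs; the function names are illustrative, and a real DHT adds routing tables and iterative network queries on top.

```python
def xor_distance(id_a, id_b):
    """Kademlia distance between two IDs: their bitwise XOR, read as an integer."""
    return id_a ^ id_b


def closest_nodes(target_id, node_ids, k=3):
    """Return the k known node IDs closest to target_id under the XOR metric."""
    return sorted(node_ids, key=lambda node: xor_distance(node, target_id))[:k]
```

Each lookup step requires a network round trip to nodes that are progressively closer to the key, which is one reason DHT discovery is slower than a single Tracker query.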

When developing applications on PPIO, developers can choose Tracker mode or DHT mode according to their own requirements. Tracker is the better choice if the application pursues efficiency and quality of service; if it pursues complete decentralization, it can only use DHT.

# 5. SuperPeer node

The PPIO distribution network also has a special kind of Peer node, which we call SuperPeer. SuperPeers are selected automatically by our algorithm according to a number of screening conditions, including network quality, storage, long-term online time, collateral, and history of defaults. When a Peer meets the requirements in all respects, it is automatically upgraded to SuperPeer.

As a high-quality node, a SuperPeer is given priority by the algorithm, and its returns are higher.

# 6. Push node

Push nodes are used for preheating scheduling. In short, they push content that has been manually judged likely to become hot to a large number of Peers in advance. Although PPIO can discover hot content naturally through the Overlay network, natural heating tends to be slow, whereas scheduling through Push nodes heats content ahead of time. By the time users need to download the file, a large number of Peers in the network have already stored it and can upload its data, which greatly improves performance and user experience.

Of course, PPIO's streaming media design cannot be fully explained in a few words. This article mainly covers PPIO's PCDN architecture; more will follow in the next article. Follow PPIO's official account so you don't miss the latest content!

For more information about PPIO, go to the official website: pp.io