Road network data is essential to many urban applications, such as vehicle navigation and route optimization. Traditionally, road data is collected by dedicated survey vehicles, which costs a great deal of manpower and equipment. With the spread of GPS devices, cities now produce massive amounts of trajectory data, and this data can be used to generate the road network instead. The problem has been studied widely over the past ten years, but the precision of many methods is still low, especially around overpasses and underpasses, parallel roads, and similar layouts. Since trajectory data is unevenly distributed across a city, can we further improve the precision of road network inference in the areas that vehicles pass through frequently?

This article introduces the paper “RoadRunner: Improving the Precision of Road Network Inference from GPS Trajectories”, published jointly by the Massachusetts Institute of Technology (MIT) and Hamad Bin Khalifa University (HBKU) in Qatar at ACM SIGSPATIAL 2018. Its goal is to improve the precision of road network inference without losing coverage (recall). The paper splits road network inference into two stages: first, the proposed RoadRunner algorithm infers a high-precision map in areas of high trajectory density; then the result is combined with a traditional trajectory-based map inference method to meet the recall requirement. The core idea of RoadRunner is to use the connectivity of each trajectory to judge whether intersecting trajectories are travelling on the same road or on two parallel roads.

### 1. Problem background

Inferring the road network from trajectories is a very challenging problem. The left two columns of Figure 1 show how two kinds of traditional algorithms, one based on kernel density estimation (KDE) and one based on K-means clustering, perform in three cities (Los Angeles, Boston and Chicago). The generated maps have three problems:

1) An overpass gets connected to the road beneath it;

2) Adjacent roads that do not actually intersect get connected;

3) Detailed topologies, such as highway interchanges, are difficult to identify.

This paper presents *RoadRunner*, a method that **builds the road network incrementally by following the flow of trajectories**. In each iteration, RoadRunner generates a road segment by considering only the subset of trajectories that traversed the same preceding path, which it selects with a path filter operator. This filtering is essential for removing interference from adjacent road segments, and it makes the method robust to both GPS noise and road topology. Although RoadRunner achieves high precision, the filtering causes roads to be missed in areas with few trajectories. To further improve the recall of road network inference, the paper proposes a merging step that integrates RoadRunner's output with that of traditional methods. The right column of Figure 1 shows the result of the proposed method.

### 2. RoadRunner

Figure 2 shows the RoadRunner algorithm. Its input is an initial road network, which can come from an existing map or from a road network inferred by another method. All vertices of the initial road network are first added to a queue Q; the vertices in the queue are called active vertices (lines 2-3). In each iteration, RoadRunner takes an active vertex v from the queue and extracts the outflow directions of the trajectories at v through the trace operation (lines 5-6). For each outflow direction θ, it adds a short segment of fixed length d that starts from v in the direction of θ, which extends the current road network (lines 7-11). Then, through the merge operation, it tries to merge the other endpoint u of this segment into the existing road network. If the merge fails, u is added to Q for a later iteration (lines 12-14). When Q is empty, the algorithm stops and returns the current road network.

▲ Figure 2. RoadRunner algorithm framework ▲
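The loop described above can be sketched in a few lines of Python. This is a minimal sketch and not the authors' implementation: the function name `grow_road_network`, the adjacency-dict graph, and the `trace` and `merge` callables are assumptions made for illustration, and `max_iters` is an added safety cap that the description does not mention.

```python
import math
from collections import deque


def grow_road_network(graph, trace, merge, step, max_iters=10_000):
    """Extend `graph` (dict: vertex -> set of neighbours) from its active vertices.

    `trace(graph, v)` is assumed to return outflow directions in radians.
    `merge(graph, u)` is assumed to return True when u was merged into the network.
    """
    queue = deque(graph)  # every initial vertex starts out active
    for _ in range(max_iters):
        if not queue:
            break
        v = queue.popleft()
        for theta in trace(graph, v):
            # Add a short segment of fixed length `step` in direction theta.
            u = (v[0] + step * math.cos(theta), v[1] + step * math.sin(theta))
            graph.setdefault(u, set()).add(v)
            graph[v].add(u)
            # A vertex that could not be merged stays active.
            if not merge(graph, u):
                queue.append(u)
    return graph


# Toy stand-ins (hypothetical): keep heading east until x reaches 3, never merge.
def toy_trace(graph, v):
    return [0.0] if v[0] < 3 else []


def toy_merge(graph, u):
    return False


network = grow_road_network({(0.0, 0.0): set()}, toy_trace, toy_merge, step=1.0)
print(sorted(network))
```

With these toy callables the network grows eastward one unit at a time and stops once no outflow direction is reported, which mirrors the queue emptying in the description.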

Note that, to make tracing and merging work well, RoadRunner keeps only the subset of trajectories relevant to the current road segment when it extends the road network in each iteration. Figure 3 shows a satellite image of one location and the distribution of the corresponding trajectory data. Suppose we now want to extend the road network from the blue vertex in Figure 3. The three highlighted roads are very close together and point in almost the same direction, so if we considered all the trajectory data we might connect the red road to the green road, or merge the red road with the blue road. By excluding trajectories that do not follow the road segment currently being extended, however, we get a much cleaner subset (trajectories that cover only the red road). We call this trajectory filtering operation the path filter. It works as follows: *given a sequence of circles whose centers lie along the road (the radius represents the road width), the path filter operator keeps only the trajectories that pass through these circles in order.* For an active vertex, we compute a path of length k in the current road network that ends at the active vertex, and then generate a sequence of circles along that path (the radius of each circle can be estimated dynamically from the trajectory data) to form the filter condition.

▲ Figure 3. Motivation for introducing path filter operator ▲
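The path filter can be illustrated with a small sketch. It assumes a trajectory is a list of (x, y) GPS samples and a circle is a (cx, cy, radius) tuple; the names `passes_in_order` and `path_filter` are hypothetical. As a simplification it tests only the sample points, not the line segments between them.

```python
import math


def passes_in_order(track, circles):
    """Return True if the track's samples visit every circle in sequence."""
    idx = 0  # index of the next circle that still has to be visited
    for x, y in track:
        # One sample may satisfy several consecutive, overlapping circles.
        while idx < len(circles):
            cx, cy, radius = circles[idx]
            if math.hypot(x - cx, y - cy) > radius:
                break
            idx += 1
    return idx == len(circles)


def path_filter(tracks, circles):
    """Keep only the tracks that pass through the circles in order."""
    return [t for t in tracks if passes_in_order(t, circles)]


# Three circles along a road heading east.
circles = [(0, 0, 1), (2, 0, 1), (4, 0, 1)]
forward = [(0, 0), (2, 0), (4, 0)]    # follows the road
backward = [(4, 0), (2, 0), (0, 0)]   # same road, opposite direction
parallel = [(0, 3), (2, 3), (4, 3)]   # nearby parallel road

kept = path_filter([forward, backward, parallel], circles)
print(kept)
```

Only the forward track survives: the parallel track never enters the circles, and the backward track visits them in the wrong order.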

Next, let’s briefly look at how the trace and merge operations are implemented.

#### 1. Tracing

The purpose of the trace operation is to *extract the main outflow directions of the trajectories at vertex v.* As shown in Figure 4, we want to extract the directions of the trajectories at the blue vertex. First, we apply the path filter operator to obtain the subset of trajectories that followed the path leading up to the blue vertex (shown in green). These trajectories clearly split into three groups at the intersection. We draw a circle centered on the blue vertex with radius D (the pink circle) and generate 72 evenly spaced angles at the vertex, which divide the circle into equal parts. For each angle, we take the point where it meets the circle as the center of a small circle with radius R (the yellow circle). The path filter is then applied again to obtain the trajectories T′ that followed the preceding path and also pass through the small circle. Finally, the number of trajectories for this angle is recorded, adjusted by a constant m that is used for noise filtering. We compute this count for every angle and store the results in a 72-dimensional vector, smooth the vector with a Gaussian kernel, and then detect its local peaks. Figure 4 shows the distribution of the smoothed counts; the algorithm detects local peaks in three directions.

▲ Figure 4. Example of trace operation ▲
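The trace steps (count per angle, smooth, find peaks) can be sketched as follows. Several details are assumptions, not taken from the paper: the input tracks are taken to be already path-filtered, the noise constant m is applied as `max(hits - m, 0)`, and the kernel width (`sigma`, `half_width`) is chosen arbitrarily. All function names are hypothetical.

```python
import math

N_ANGLES = 72  # one bin every 5 degrees


def angle_counts(v, tracks, big_d, small_r, m):
    """Count tracks passing through a small circle placed at each angle."""
    counts = []
    for k in range(N_ANGLES):
        theta = 2 * math.pi * k / N_ANGLES
        cx = v[0] + big_d * math.cos(theta)
        cy = v[1] + big_d * math.sin(theta)
        hits = sum(
            1 for t in tracks
            if any(math.hypot(x - cx, y - cy) <= small_r for x, y in t)
        )
        counts.append(max(hits - m, 0))  # assumed form of the noise filter
    return counts


def smooth_circular(values, sigma=1.5, half_width=4):
    """Gaussian smoothing that wraps around, since angle 71 neighbours angle 0."""
    n = len(values)
    offsets = range(-half_width, half_width + 1)
    weights = [math.exp(-(o * o) / (2 * sigma * sigma)) for o in offsets]
    total = sum(weights)
    return [
        sum(w * values[(i + o) % n] for w, o in zip(weights, offsets)) / total
        for i in range(n)
    ]


def local_peaks(values):
    """Indices that are positive and not smaller than both circular neighbours."""
    n = len(values)
    return [
        i for i in range(n)
        if values[i] > 0
        and values[i] > values[i - 1]
        and values[i] >= values[(i + 1) % n]
    ]


# Toy intersection: tracks leave the vertex heading east, north and west.
east = [(0, 0), (5, 0), (10, 0)]
north = [(0, 0), (0, 5), (0, 10)]
west = [(0, 0), (-5, 0), (-10, 0)]
tracks = [east] * 5 + [north] * 4 + [west] * 3

counts = angle_counts((0, 0), tracks, big_d=10, small_r=1, m=1)
peaks = local_peaks(smooth_circular(counts))
print([k * 360 // N_ANGLES for k in peaks])
```

On this toy input the three detected peaks correspond to the east, north and west branches, the same kind of three-way split that Figure 4 illustrates.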

#### 2. Merging

Every newly generated road segment has to be merged with the existing road network, and this is not easy: the merge must keep overpasses and underpasses, parallel roads, and multi-level roads separate. To overcome these challenges, the paper merges two segments only when the trajectories passing through them have matching future distributions, that is, when they go on to the same places afterwards. Figure 5 shows the future distributions of the trajectories passing through the blue and green paths. In example (a) the two distributions clearly disagree, while in example (b) they are almost identical.

▲ Figure 5. Example of merge operation ▲

Figure 6 shows how the merge operation is implemented. For a vertex v to be merged, we compute the future distribution of the trajectories that follow the path through v, and then look at each vertex u near v. If the future distribution of the trajectories that follow the path through u is consistent with that of v, we add the segment (u, v) and return true; if the distribution at v differs from those of all surrounding vertices, we return false.

▲ Figure 6. Merge operation ▲
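The merge test can be sketched under some loudly stated assumptions. The article does not say how a future distribution is represented or how consistency is measured, so this sketch uses a normalized histogram over grid cells and a histogram-intersection score with an arbitrary threshold. The names `future_distribution`, `overlap` and `try_merge` are hypothetical.

```python
import math
from collections import Counter


def future_distribution(futures, cell=10.0):
    """Normalized histogram of the grid cells visited by the future track points."""
    hist = Counter()
    for track in futures:
        for x, y in track:
            hist[(math.floor(x / cell), math.floor(y / cell))] += 1
    total = sum(hist.values())
    return {c: n / total for c, n in hist.items()}


def overlap(p, q):
    """Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint."""
    return sum(min(weight, q.get(c, 0.0)) for c, weight in p.items())


def try_merge(graph, v, nearby, future_of, threshold=0.7):
    """Add edge (u, v) for the first nearby u whose future matches that of v."""
    for u in nearby:
        if overlap(future_of[v], future_of[u]) >= threshold:
            graph.setdefault(u, set()).add(v)
            graph.setdefault(v, set()).add(u)
            return True
    return False


v, u_same, u_other = (0.0, 0.0), (0.0, 2.0), (0.0, -8.0)
future_of = {
    v: future_distribution([[(10, 0), (20, 0), (30, 0)]] * 2),
    u_same: future_distribution([[(12, 1), (22, 1), (32, 1)]]),      # same road
    u_other: future_distribution([[(10, 20), (20, 40), (30, 60)]]),  # diverges
}

graph = {v: set()}
rejected = try_merge(graph, v, [u_other], future_of)
accepted = try_merge(graph, v, [u_other, u_same], future_of)
print(rejected, accepted, graph[v])
```

The vertex whose trajectories diverge is rejected even though it is close by, which is the behaviour the future-distribution check is meant to provide for parallel and stacked roads.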

### 3. Two-stage road network inference

Although RoadRunner achieves high precision, the path filter operator screens out so many trajectories that many road segments in rarely visited areas are lost. Therefore, in a second stage, RoadRunner's output is combined with the output of another map inference algorithm to improve recall.

Let G1 be the road network inferred by RoadRunner, and let G2 be the output of another map inference algorithm that can capture rarely travelled road segments. First, we delete from G2 the segments that lie within distance r_merge of a segment in G1, because RoadRunner has already inferred these roads; this gives G2′. We then put G1 and G2′ together to obtain G. However, the segments added from G2′ are not yet connected to the rest of the network. To connect them, for each vertex v of degree 1 in G, we connect v to a nearby road segment (u, w) when two conditions hold: 1) the distance from v to (u, w) is less than r_merge; 2) the number of trajectories passing through the path v → p → u or v → p → w exceeds a certain threshold, where p is the projection of v onto the segment (u, w).
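The second stage can be sketched as two small functions. This is an illustrative simplification and not the paper's code: a G2 segment is treated as "within `r_merge` of G1" when both of its endpoints are, and the trajectory count for condition 2 is delegated to a hypothetical `support(v, p, u, w)` callback. All names here are assumptions.

```python
import math
from collections import Counter


def project(p, a, b):
    """Closest point to p on the segment a-b."""
    ax, ay = a
    dx, dy = b[0] - ax, b[1] - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a
    t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def near_g1(p, g1_edges, r_merge):
    return any(distance(p, project(p, a, b)) <= r_merge for a, b in g1_edges)


def prune_g2(g1_edges, g2_edges, r_merge):
    """Drop G2 edges that G1 already covers (both endpoints close to G1)."""
    return [
        (a, b) for a, b in g2_edges
        if not (near_g1(a, g1_edges, r_merge) and near_g1(b, g1_edges, r_merge))
    ]


def connect_dangling(edges, r_merge, support, min_support):
    """Attach each degree-1 vertex to a nearby edge that trajectories support."""
    degree = Counter(v for edge in edges for v in edge)
    result = list(edges)
    for v in [x for x, deg in degree.items() if deg == 1]:
        for u, w in list(result):
            if v in (u, w):
                continue
            p = project(v, u, w)
            if distance(v, p) < r_merge and support(v, p, u, w) > min_support:
                if p not in (u, w):  # split (u, w) at the projection point
                    result.remove((u, w))
                    result += [(u, p), (p, w)]
                result.append((v, p))
                break
    return result


g1 = [((0, 0), (100, 0))]                 # main road found by stage one
g2 = [((0, 2), (100, 2)),                 # the same road, seen by another method
      ((50, 5), (50, 60))]                # side street missing from G1

g2_pruned = prune_g2(g1, g2, r_merge=10)
merged = connect_dangling(g1 + g2_pruned, r_merge=10,
                          support=lambda v, p, u, w: 5, min_support=3)
print(merged)
```

In the toy run the duplicate of the main road is discarded, the side street is kept, and its dangling end is joined to the main road at the projection point, which splits the main road into two segments.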

### 4. Experimental results

The paper evaluates the proposed method in four cities (Los Angeles, Boston, Chicago and New York). For each city, a 4 km × 4 km area with about 60,000 trajectories is selected, and OpenStreetMap serves as the ground-truth road network.

Figure 7 plots error rate against recall for the different methods under different parameter settings (the closer a curve is to the upper left corner, the better the performance; results are averaged over the data sets). The experiments show that the error rate of RoadRunner + KDE (RR-2 + BE-2) is 33.6% lower than that of KDE alone (BE-2), and the error rate of RoadRunner + K-means (RR-2 + Kharita-20) is 60.7% lower than that of K-means alone (Kharita-20).

▲ Figure 7. Experimental results ▲

### 5. Summary

The paper proposes a two-stage road network inference framework that improves precision without sacrificing recall. The core module of the framework is RoadRunner, which uses the connectivity of trajectory data to generate a precise road network. On complex road layouts, it performs better than existing methods.
