# Understanding topological ordering

Time：2021-5-2

## Preface

Topological sort, also known as topological ordering, has a slightly confusing name, because it is not a sorting algorithm in the usual sense. It applies only to a certain class of graphs, and its goal is to find a linear order in which the vertices can be executed.

The algorithm sounds like a big deal, and it is very popular in interviews these days. For example, when I interviewed at my current company, one whole design round was based on topological sorting.

But it is actually an easy algorithm to understand. Follow my train of thought and you will never forget it again.

## Directed acyclic graph

Just now, we mentioned that topological sorting is only for a specific class of graphs, so which class of graphs is it for?

A: directed acyclic graph (DAG). Namely:

1. The edges of the graph must be directed;
2. There is no cycle in the graph.

So what does direction mean?

For example, WeChat friendships are directional. After you add someone as a friend, they may delete you without your knowing it, and then the friendship is one-way.

What is a cycle? A cycle is tied to direction: if you start from one vertex, follow the edges, and can return to where you started, that is a cycle.

So in the figure below, the left graph has no cycle and the right graph has one. If a graph contains a cycle, as the right graph does, then to execute 1 you have to execute 3 first, to execute 3 you have to execute 2 first, and to execute 2 you have to execute 1 first. This is a deadlock: there is no valid place to start, so there is no topological order.

### Conclusion:

• If the graph is not a DAG, it has no topological order;
• If the graph is a DAG, it has at least one topological order;
• Conversely, if a graph has a topological order, it must be a DAG.

So being a DAG is a necessary and sufficient condition for having a topological order.

## Topological sorting

So what does a `topological order` of such a graph actually mean?

We use the curriculum example from Baidu Encyclopedia to illustrate.

| Course number | Course name | Prerequisite courses |
|---|---|---|
| C1 | Advanced mathematics | None |
| C2 | Fundamentals of programming | None |
| C3 | Discrete mathematics | C1, C2 |
| C4 | Data structures | C3, C5 |
| C5 | Algorithmic language | C2 |
| C6 | Compiler technology | C4, C5 |
| C7 | Operating systems | C4, C9 |
| C8 | General physics | C1 |
| C9 | Computer principles | C8 |

There are nine courses, and some of them have prerequisites: you must finish the prerequisite courses in the right column before you can take the more advanced course.

In this example, topological sorting means finding a feasible order in which I can take all the courses.

So how do we find it?

First of all, we can describe the situation with a `graph`. The two elements of a graph are `vertices and edges`, so here:

• Vertex: each course
• Edge: the course at the start of the edge is a prerequisite of the course at the end

It looks like this: a graph of this kind is called an AOV (activity on vertex) network, in which

• Vertex: represents an activity;
• Edge: represents the order between activities

**Therefore, an AOV network must be a DAG, that is, a directed acyclic graph; otherwise some activities could never be carried out. All the activities can then be arranged into a feasible linear sequence, which is a `topological order`.**

So the `practical significance` of this sequence is: if we follow it, then whenever an activity starts we can be sure that all the activities it depends on have already been completed, so the whole project can proceed smoothly.
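As a rough sketch of how such a graph might be written down in code, the course table can be stored as a list of edges and turned into an adjacency list. The class name `CourseGraph` and the method `build` are made up for this illustration; each pair `{prerequisite, course}` is one directed edge.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CourseGraph {
    // Hypothetical encoding: each pair {prerequisite, course} is one directed edge.
    static final String[][] EDGES = {
        {"C1", "C3"}, {"C2", "C3"}, {"C3", "C4"}, {"C5", "C4"}, {"C2", "C5"},
        {"C4", "C6"}, {"C5", "C6"}, {"C4", "C7"}, {"C9", "C7"},
        {"C1", "C8"}, {"C8", "C9"}
    };

    // Builds a map from each course to the courses that directly depend on it.
    static Map<String, List<String>> build(String[][] edges) {
        Map<String, List<String>> graph = new LinkedHashMap<>();
        for (String[] e : edges) {
            graph.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
            // Make sure courses without outgoing edges also appear as vertices.
            graph.computeIfAbsent(e[1], k -> new ArrayList<>());
        }
        return graph;
    }
}
```

Calling `CourseGraph.build(CourseGraph.EDGES)` gives one entry per course, for example `C1` mapped to `[C3, C8]`.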

Back to our example:

1. At a glance, we need to take C1 and C2 first, because these two courses have no prerequisites. We take them in the freshman year;
2. In the sophomore year we can take C3, C5 and C8 in the second row, because the prerequisites of these three courses are only C1 and C2;
3. In the junior year we can take C4 and C9 in the third row;
4. That leaves C6 and C7 for the final year.

In this way we finish all the courses and obtain one `topological order` of the graph.

Note that the topological order is sometimes not unique. In this example, taking C1 and then C2, or taking C2 and then C1, both lead to correct topological orders of the graph, but they are two different orders.
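One way to convince yourself that several orders are valid is a small checker. The sketch below is only an illustration: `OrderChecker` and `isTopologicalOrder` are hypothetical names, edges are assumed to be `{prerequisite, course}` pairs, and the order is assumed to list every course exactly once.

```java
import java.util.HashMap;
import java.util.Map;

class OrderChecker {
    // Returns true if, for every edge {from, to}, 'from' appears before 'to'.
    static boolean isTopologicalOrder(String[] order, String[][] edges) {
        Map<String, Integer> position = new HashMap<>();
        for (int i = 0; i < order.length; i++) {
            position.put(order[i], i);
        }
        for (String[] e : edges) {
            Integer from = position.get(e[0]);
            Integer to = position.get(e[1]);
            // A missing vertex or a prerequisite placed too late breaks the order.
            if (from == null || to == null || from >= to) {
                return false;
            }
        }
        return true;
    }
}
```

With the course edges, both `C1, C2, C3, C5, C8, C4, C9, C6, C7` and the same sequence with C1 and C2 swapped pass the check, while any order that starts with C3 fails.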

So in an interview, ask the interviewer whether they want any one solution or a list of all solutions.
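If the interviewer does want every solution, one possible approach is backtracking: at each position, try every vertex whose remaining in-degree is 0. The sketch below is an assumption-laden illustration, not a reference solution. `AllOrders` is a made-up name, vertices are numbered `0..n-1`, and each pair `{course, prerequisite}` stands for an edge from the prerequisite to the course.

```java
import java.util.ArrayList;
import java.util.List;

class AllOrders {
    // Lists every topological order; the result is empty if the graph has a cycle.
    static List<List<Integer>> allOrders(int n, int[][] prerequisites) {
        int[] indegree = new int[n];
        for (int[] p : prerequisites) {
            indegree[p[0]]++;
        }
        List<List<Integer>> result = new ArrayList<>();
        backtrack(n, prerequisites, indegree, new boolean[n], new ArrayList<>(), result);
        return result;
    }

    private static void backtrack(int n, int[][] prerequisites, int[] indegree,
                                  boolean[] used, List<Integer> current,
                                  List<List<Integer>> result) {
        if (current.size() == n) {
            result.add(new ArrayList<>(current));
            return;
        }
        for (int v = 0; v < n; v++) {
            if (used[v] || indegree[v] != 0) {
                continue;
            }
            // Choose v: mark it and remove its outgoing edges.
            used[v] = true;
            current.add(v);
            for (int[] p : prerequisites) {
                if (p[1] == v) indegree[p[0]]--;
            }
            backtrack(n, prerequisites, indegree, used, current, result);
            // Undo the choice so that another vertex can take this position.
            for (int[] p : prerequisites) {
                if (p[1] == v) indegree[p[0]]++;
            }
            current.remove(current.size() - 1);
            used[v] = false;
        }
    }
}
```

The number of orders can grow factorially with the number of vertices, so this is only practical for small graphs.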

Let's summarize.

An `edge` in this graph is a `dependency`: if you want to take the next course, you have to take the previous one first.

It is the same as in a game: to get an item, you have to do task A first, then complete task B, and only then can you reach the destination.

## Algorithm explanation

In the figure above, the topological order is easy to see. However, as a project grows larger, its dependencies become more and more complicated, so we need a systematic way to solve the problem.

So let's think back to how we found the topological order. Why did we look at C1 and C2 first?

Because they don't depend on anything else, which is to say their `in-degree is 0`.

In-degree: the in-degree of a vertex is the number of edges that point to the vertex;
Out-degree: the out-degree of a vertex is the number of edges that point from the vertex to other vertices.

So we need to record the in-degree of each vertex, because a vertex can be executed only when its `in-degree = 0`.

In the example just now, the in-degree of C1 and C2 is 0 at the beginning, so we can execute these two first.

So the first step of the algorithm is to get the in-degree of each vertex.

### Step0: preprocess to get the in-degree of each vertex

We can store this information in a HashMap, or use an `array`, which is neater.

For the convenience of presentation, I use a table:

| | C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 | C9 |
|---|---|---|---|---|---|---|---|---|---|
| In-degree | 0 | 0 | 2 | 2 | 1 | 2 | 2 | 1 | 1 |
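A minimal sketch of this preprocessing step, assuming the courses C1 to C9 are mapped to the indices 0 to 8 and each edge is written as `{from, to}` (the class name `InDegree` and the constant `COURSE_EDGES` are only for illustration):

```java
class InDegree {
    // Course edges with C1..C9 mapped to 0..8, e.g. {0, 2} means C1 -> C3.
    static final int[][] COURSE_EDGES = {
        {0, 2}, {1, 2}, {2, 3}, {4, 3}, {1, 4}, {3, 5},
        {4, 5}, {3, 6}, {8, 6}, {0, 7}, {7, 8}
    };

    // Counts the incoming edges of every vertex 0..n-1 in one pass over the edges.
    static int[] compute(int n, int[][] edges) {
        int[] indegree = new int[n];
        for (int[] e : edges) {
            indegree[e[1]]++;
        }
        return indegree;
    }
}
```

`InDegree.compute(9, InDegree.COURSE_EDGES)` returns `[0, 0, 2, 2, 1, 2, 2, 1, 1]`, the same values as in the table.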

### Step1

Once we have the in-degrees, we can execute the vertices with in-degree 0, that is, C1 and C2.

So let's put the vertices that are ready to be executed into a `container of pending vertices`, and then take vertices out of the container one by one.

Which `data structure` to choose for this `container` depends on which `operations` we need, and on which data structure serves them well.

First, we put `[C1, C2]` into the `container`.

Then think about what we need to do with it!

What we do most often is `put vertices in` and `take vertices out` to execute them, so we want a data structure with efficient `offer` and `poll` operations, and a `queue` is enough.

(Other containers work too. All the vertices in the container have the same status: they are executable, and that does not depend on the order in which they came in. But why make life hard? A plain first-in, first-out queue is enough.)
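To illustrate that the choice of container does not affect correctness, here is a sketch of the same idea with a stack in place of the queue. It is not part of the original walk-through: `StackContainer` is a made-up name, vertices are assumed to be numbered `0..n-1`, and each edge is `{from, to}`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class StackContainer {
    // Ready vertices are kept in a stack instead of a queue.
    // Returns an empty list if the graph contains a cycle.
    static List<Integer> order(int n, int[][] edges) {
        int[] indegree = new int[n];
        List<List<Integer>> next = new ArrayList<>();
        for (int v = 0; v < n; v++) {
            next.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            next.get(e[0]).add(e[1]);
            indegree[e[1]]++;
        }
        Deque<Integer> stack = new ArrayDeque<>();
        for (int v = 0; v < n; v++) {
            if (indegree[v] == 0) stack.push(v);
        }
        List<Integer> result = new ArrayList<>();
        while (!stack.isEmpty()) {
            int curr = stack.pop();
            result.add(curr);
            for (int to : next.get(curr)) {
                // A vertex becomes ready once its last prerequisite is executed.
                if (--indegree[to] == 0) stack.push(to);
            }
        }
        return result.size() == n ? result : new ArrayList<>();
    }
}
```

The order it produces generally differs from the queue version, but every prerequisite still comes before the courses that need it.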

Then we take vertices out and execute them.

When we take C1 out and execute it, what does that mean?

A: It means that the edges pointing from C1 to other vertices disappear, that is, the out-degree of C1 becomes 0.

As shown in the figure below, these two edges disappear. Then we update the `in-degree` of `the vertices that C1 points to`, that is, `C3 and C8`. The updated array is as follows:

| | C3 | C4 | C5 | C6 | C7 | C8 | C9 |
|---|---|---|---|---|---|---|---|
| In-degree | 1 | 2 | 1 | 2 | 2 | **0** | 1 |

So we see a very important step here: the in-degree of C8 has become 0!

This means that C8 no longer has any dependencies and can be put into our queue to wait for execution.

At this point our `queue` is `[C2, C8]`.
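A single step of this kind could be sketched as follows. The names `OneStep` and `executeNext` are hypothetical, the maps are assumed to be filled in beforehand, and the queue is assumed to be non-empty when the method is called.

```java
import java.util.List;
import java.util.Map;
import java.util.Queue;

class OneStep {
    // Executes the vertex at the head of the queue: its outgoing edges disappear,
    // so each successor loses one in-degree and joins the queue when it reaches 0.
    static String executeNext(Queue<String> queue, Map<String, Integer> indegree,
                              Map<String, List<String>> successors) {
        String curr = queue.poll();
        for (String to : successors.getOrDefault(curr, List.of())) {
            int remaining = indegree.merge(to, -1, Integer::sum);
            if (remaining == 0) {
                queue.offer(to);
            }
        }
        return curr;
    }
}
```

Starting from the queue `[C1, C2]` with C3 at in-degree 2 and C8 at in-degree 1, one call executes C1, leaves C3 at 1, brings C8 to 0, and the queue becomes `[C2, C8]`.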

### Step2

Next we execute C2, so the `in-degree` of `C3 and C5`, the vertices that `C2 points to`, decreases by 1.

The updated table:

| | C3 | C4 | C5 | C6 | C7 | C9 |
|---|---|---|---|---|---|---|
| In-degree | **0** | 2 | **0** | 2 | 2 | 1 |

That is, C3 and C5 no longer have any constraints and can be put into the queue to wait for execution.

The `queue` becomes `[C8, C3, C5]`.

### Step3

So the next step is to execute C8, and the in-degree of C9, which C8 points to, decreases by 1.
The updated table:

| | C4 | C6 | C7 | C9 |
|---|---|---|---|---|
| In-degree | 2 | 2 | 2 | **0** |

Then C9 has no prerequisites left and can be put into the queue to wait for execution.

The `queue` becomes `[C3, C5, C9]`.

### Step4

Next we execute C3, and the in-degree of C4, which C3 points to, decreases by 1.
The updated table:

| | C4 | C6 | C7 |
|---|---|---|---|
| In-degree | **1** | 2 | 2 |

However, the in-degree of C4 has not dropped to 0, so no vertex joins the queue in this step.

The `queue` now becomes `[C5, C9]`.

### Step5

Then we execute C5, and the in-degree of C4 and C6, which C5 points to, decreases by 1.
The updated table:

| | C4 | C6 | C7 |
|---|---|---|---|
| In-degree | **0** | **1** | 2 |

Now all the dependencies of C4 are gone, so we can put C4 into the queue.

`queue` = `[C9, C4]`

### Step6

Then we execute C9, and the in-degree of C7, which C9 points to, decreases by 1.

| | C6 | C7 |
|---|---|---|
| In-degree | 1 | **1** |

Here the in-degree of C7 is not 0 yet, so it cannot join the queue.

At this point `queue` = `[C4]`.

### Step7

Then we execute C4, so the in-degree of C6 and C7, which C4 points to, decreases by 1.
The updated table:

| | C6 | C7 |
|---|---|---|
| In-degree | **0** | **0** |

The in-degree of both C6 and C7 has become 0!! Put them into the queue and keep executing until the queue is empty.
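The whole walk-through can be replayed with a short program. This is only a sketch under a few assumptions: the class name `CourseWalkthrough` is invented, the edges `{prerequisite, course}` are read off the course table, and the edges are listed in an order that reproduces the queue states shown in the steps.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class CourseWalkthrough {
    // Edges {prerequisite, course} taken from the course table.
    static final String[][] EDGES = {
        {"C1", "C3"}, {"C1", "C8"}, {"C2", "C3"}, {"C2", "C5"}, {"C3", "C4"},
        {"C5", "C4"}, {"C5", "C6"}, {"C4", "C6"}, {"C4", "C7"}, {"C8", "C9"},
        {"C9", "C7"}
    };

    static List<String> order() {
        Map<String, Integer> indegree = new LinkedHashMap<>();
        Map<String, List<String>> successors = new LinkedHashMap<>();
        for (int i = 1; i <= 9; i++) {
            indegree.put("C" + i, 0);
            successors.put("C" + i, new ArrayList<>());
        }
        // Step0: count in-degrees and record who depends on whom.
        for (String[] e : EDGES) {
            successors.get(e[0]).add(e[1]);
            indegree.merge(e[1], 1, Integer::sum);
        }
        // Step1: every vertex with in-degree 0 is ready.
        Queue<String> queue = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : indegree.entrySet()) {
            if (entry.getValue() == 0) queue.offer(entry.getKey());
        }
        // Remaining steps: execute, decrement successors, enqueue new zeros.
        List<String> result = new ArrayList<>();
        while (!queue.isEmpty()) {
            String curr = queue.poll();
            result.add(curr);
            for (String to : successors.get(curr)) {
                if (indegree.merge(to, -1, Integer::sum) == 0) queue.offer(to);
            }
        }
        return result;
    }
}
```

`CourseWalkthrough.order()` returns `[C1, C2, C8, C3, C5, C9, C4, C6, C7]`, which is the execution order of the steps above.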

### Summary

OK, let's sort out this algorithm.

**Data structure**

Map: < key = vertex, value = in-degree >

But in actual code we use an int array: the index of the array represents the graph node, and the value stored there represents its in-degree. This is neater than a map.

Then an ordinary queue stores the nodes that are ready to be executed.

**Process**

We put the vertices with in-degree 0 into the queue. Each time we execute a vertex from the queue, the `in-degree` of every vertex that depends on it decreases by 1; if the in-degree of a vertex becomes 0, that vertex is put into the queue. We repeat this until the queue is empty.

**Details**

Here are some implementation details:

When we check whether some vertex's in-degree has newly become 0, we don't need to go through the entire map or array. We only need to check the vertices whose in-degree has just been changed.

The other is that if the problem does not guarantee that the graph is a DAG, there may be no feasible solution. How do we tell? A very simple method is to compare the number of vertices in the final result with the number of vertices in the graph, or to keep a counter. If they are not equal, there is no valid solution. So this algorithm can also be used to determine whether a graph is a directed acyclic graph.
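The counter idea might look like the following sketch. `DagCheck` and `isDag` are illustrative names, vertices are assumed to be numbered `0..n-1`, and each edge is `{from, to}`. For brevity the edge list is rescanned for every executed vertex.

```java
import java.util.ArrayDeque;
import java.util.Queue;

class DagCheck {
    // Returns true if the directed graph with vertices 0..n-1 has no cycle.
    static boolean isDag(int n, int[][] edges) {
        int[] indegree = new int[n];
        for (int[] e : edges) {
            indegree[e[1]]++;
        }
        Queue<Integer> queue = new ArrayDeque<>();
        for (int v = 0; v < n; v++) {
            if (indegree[v] == 0) queue.offer(v);
        }
        int executed = 0; // counter of vertices that have left the queue
        while (!queue.isEmpty()) {
            int curr = queue.poll();
            executed++;
            for (int[] e : edges) {
                if (e[0] == curr && --indegree[e[1]] == 0) queue.offer(e[1]);
            }
        }
        // Vertices on a cycle never reach in-degree 0, so they are never counted.
        return executed == n;
    }
}
```

For a three-vertex ring such as `{{0, 1}, {1, 2}, {2, 0}}` no vertex ever enters the queue, the counter stays at 0, and the method returns `false`.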

Many problems give the graph as an `edge list`, which is also a common way to represent a graph. Each entry of this `list` is one `edge` of the graph. Read the problem carefully here to see who depends on whom. In fact, graph problems usually do not hand you the graph directly; they describe a scenario, and you need to turn it back into a graph.

**Time complexity**

Note ⚠: the time complexity analysis of a graph algorithm must use two parameters. In interviews, many candidates open their mouths and say O(n).

For a graph with V vertices and E edges:

The first step is the preprocessing that builds the map or array. It goes through all the edges, so it is O(E);

The second step is the enqueue and dequeue operations, which are O(V): if the graph is a DAG, every vertex enters and leaves the queue exactly once;

The third step is to eliminate the outgoing edges of each vertex when it is executed. In total this happens E times, provided the outgoing edges of a vertex can be looked up directly, for example through an adjacency list;

Total: O(V + E)

**Space complexity**

An array stores the in-degree of every vertex, and the queue holds at most all the vertices, so it is O(V).

**Code**

LeetCode has two problems about this course schedule. One is problem 207, which asks whether you can finish all the courses, that is, whether a topological order exists. The other is problem 210, which asks you to return any one topological order, or an empty array if the courses cannot all be finished.

Here we solve 210, which is more complete and comes up more often.

The input given here is the `edge list` we just mentioned.

Example 1.

Input: 2, [[1,0]]
Output: [0,1]
Explanation: there are two courses, and the prerequisite of course 1 is course 0. So the correct order is [0, 1].

Example 2.

Input: 4, [[1,0],[2,0],[3,1],[3,2]]
Output: [0,1,2,3] or [0,2,1,3]
Explanation: course 0 has no prerequisite, courses 1 and 2 both require course 0, and course 3 requires both course 1 and course 2.

Example 3.

Input: 2, [[1,0],[0,1]]
Output: []
Explanation: the two courses are prerequisites of each other, so neither can ever be taken.

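```java
import java.util.ArrayDeque;
import java.util.Queue;

class Solution {
    public int[] findOrder(int numCourses, int[][] prerequisites) {
        int[] res = new int[numCourses];
        int[] indegree = new int[numCourses];

        // Get the in-degree of each course.
        // Each pair is [course, prerequisite], i.e. an edge prerequisite -> course.
        for (int[] pre : prerequisites) {
            indegree[pre[0]]++;
        }

        // Put the courses with in-degree 0 into the queue.
        Queue<Integer> queue = new ArrayDeque<>();
        for (int i = 0; i < numCourses; i++) {
            if (indegree[i] == 0) {
                queue.offer(i);
            }
        }

        // Execute the courses one by one.
        int i = 0;
        while (!queue.isEmpty()) {
            int curr = queue.poll();
            res[i++] = curr;

            // Remove the edges whose prerequisite is curr.
            // Scanning the whole edge list costs O(E) per course; an adjacency
            // list is needed to reach the O(V + E) total.
            for (int[] pre : prerequisites) {
                if (pre[1] == curr) {
                    indegree[pre[0]]--;
                    if (indegree[pre[0]] == 0) {
                        queue.offer(pre[0]);
                    }
                }
            }
        }

        // If some course was never executed, the graph has a cycle.
        return i == numCourses ? res : new int[]{};
    }
}
```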
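The solution above walks through the whole `prerequisites` list every time a course is executed. One common variation, sketched here under the assumption that some extra memory is acceptable, first groups the edges into an adjacency list so that each edge is touched only once during the main loop. The class name `AdjacencyListSolution` and the variable `next` are made up for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class AdjacencyListSolution {
    public int[] findOrder(int numCourses, int[][] prerequisites) {
        int[] indegree = new int[numCourses];
        // next.get(c) holds the courses that list c as a prerequisite.
        List<List<Integer>> next = new ArrayList<>();
        for (int c = 0; c < numCourses; c++) {
            next.add(new ArrayList<>());
        }
        // Each pair is [course, prerequisite]: an edge prerequisite -> course.
        for (int[] pre : prerequisites) {
            next.get(pre[1]).add(pre[0]);
            indegree[pre[0]]++;
        }

        Queue<Integer> queue = new ArrayDeque<>();
        for (int c = 0; c < numCourses; c++) {
            if (indegree[c] == 0) queue.offer(c);
        }

        int[] res = new int[numCourses];
        int count = 0;
        while (!queue.isEmpty()) {
            int curr = queue.poll();
            res[count++] = curr;
            // Only the outgoing edges of curr are visited here.
            for (int to : next.get(curr)) {
                if (--indegree[to] == 0) queue.offer(to);
            }
        }
        return count == numCourses ? res : new int[0];
    }
}
```

The adjacency list takes O(V + E) extra space, which is the price paid for bringing the running time down to O(V + E).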

In addition, topological sorting can also be implemented with `DFS (depth-first search)`. For reasons of space we will not go into it here; you can refer to GeeksforGeeks.
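For readers who want a starting point, here is one possible DFS-based sketch for the same `[course, prerequisite]` input. It is not taken from the reference mentioned above: the class name, the three-state marking, and the `slot` helper are choices made for this illustration. A vertex is written into the result only after everything that depends on it has been written, and the result is filled from the back.

```java
import java.util.ArrayList;
import java.util.List;

class DfsTopologicalSort {
    // Vertex states: 0 = unvisited, 1 = on the current DFS path, 2 = finished.
    static int[] findOrder(int numCourses, int[][] prerequisites) {
        List<List<Integer>> next = new ArrayList<>();
        for (int c = 0; c < numCourses; c++) {
            next.add(new ArrayList<>());
        }
        for (int[] pre : prerequisites) {
            next.get(pre[1]).add(pre[0]);
        }

        int[] state = new int[numCourses];
        int[] res = new int[numCourses];
        int[] slot = {numCourses - 1}; // next free position, filled from the back
        for (int c = 0; c < numCourses; c++) {
            if (state[c] == 0 && !visit(c, next, state, res, slot)) {
                return new int[0]; // a cycle was found
            }
        }
        return res;
    }

    // Returns false if a cycle is reachable from v.
    private static boolean visit(int v, List<List<Integer>> next, int[] state,
                                 int[] res, int[] slot) {
        state[v] = 1;
        for (int to : next.get(v)) {
            if (state[to] == 1) return false; // back edge: to is on the current path
            if (state[to] == 0 && !visit(to, next, state, res, slot)) return false;
        }
        state[v] = 2;
        res[slot[0]--] = v; // v is placed before every course that depends on it
        return true;
    }
}
```

For Example 2 this version returns `[0, 2, 1, 3]`, the other valid answer. Because it is recursive, very deep graphs may need an iterative rewrite.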

## Practical application

We have already mentioned one use case, the course selection system, which is also the most frequently tested one.

The most important application of topological sorting is the `critical path problem`, which corresponds to the AOE (activity on edge) network.

AOE network: vertices represent events, edges represent activities, and the weight on an edge represents the time the activity takes.
AOV network: vertices represent activities, and edges represent dependencies between activities.

In an AOE network, the longest path from the start to the end is called the critical path, and the activities on it are called critical activities. AOE networks are generally used to analyze a large project: the minimum time needed to complete it, and how much slack time each activity has.
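As a hedged sketch of how the critical path length could be computed, the code below processes events in topological order and keeps the earliest time each event can occur. The names `CriticalPath`, `length` and `earliest` are invented for this example, events are numbered `0..n-1`, each edge is `{from, to, duration}`, and durations are assumed to be non-negative.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class CriticalPath {
    // Returns the length of the longest path, or -1 if the graph has a cycle.
    static int length(int n, int[][] edges) {
        int[] indegree = new int[n];
        List<List<int[]>> next = new ArrayList<>();
        for (int v = 0; v < n; v++) {
            next.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            next.get(e[0]).add(e);
            indegree[e[1]]++;
        }
        Queue<Integer> queue = new ArrayDeque<>();
        for (int v = 0; v < n; v++) {
            if (indegree[v] == 0) queue.offer(v);
        }

        int[] earliest = new int[n]; // earliest time each event can occur
        int processed = 0;
        int longest = 0;
        while (!queue.isEmpty()) {
            int curr = queue.poll();
            processed++;
            longest = Math.max(longest, earliest[curr]);
            for (int[] e : next.get(curr)) {
                // An event has to wait for its slowest incoming activity.
                earliest[e[1]] = Math.max(earliest[e[1]], earliest[curr] + e[2]);
                if (--indegree[e[1]] == 0) queue.offer(e[1]);
            }
        }
        return processed == n ? longest : -1;
    }
}
```

For the edges `{0, 1, 3}`, `{0, 2, 2}`, `{1, 3, 4}`, `{2, 3, 1}` the result is 7, the length of the path 0 → 1 → 3.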

You can refer to this video at 14 minutes 46 seconds; the example there is very good.

In fact, topological sorting applies to any graph that has dependencies between tasks.

For example, when a POM pulls in jar packages as dependencies, have you ever wondered how it imports jar packages that you did not declare yourself? You never imported the AOP jar package, yet it shows up automatically. That is because some package you imported depends on the AOP jar package, and Maven imports it for you automatically.

Other practical applications are summarized here:

1. Preprocessing in speech recognition systems;
2. Managing dependencies between target files, just like the jar package import mentioned above;
3. Processing network structures in deep learning.