Concurrent implementation of the Go web crawler


Title: Exercise: Web Crawler

I referred directly to the reference solution. However, that code uses a chan bool to signal when each child goroutine has finished, while my code uses a sync.WaitGroup to let the main goroutine wait for the child goroutines to complete.
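For comparison, the chan bool pattern used by the reference solution can be sketched roughly like this (a minimal sketch, not the reference code itself; crawlAll and process are illustrative names): each child goroutine sends on the channel when it finishes, and the parent receives once per child before returning.

```go
package main

import "fmt"

// crawlAll launches one goroutine per URL and waits for all of them
// using a chan bool. process stands in for the real per-URL work.
func crawlAll(urls []string, process func(string)) {
	done := make(chan bool)
	for _, u := range urls {
		go func(url string) {
			process(url)
			done <- true // signal this goroutine's completion
		}(u)
	}
	// Receive one value per goroutine; returns only after all have sent.
	for range urls {
		<-done
	}
}

func main() {
	crawlAll([]string{"a", "b", "c"}, func(u string) {
		fmt.Println("visited", u) // order is nondeterministic
	})
}
```

A WaitGroup expresses the same "wait for N goroutines" idea without hand-counting channel receives, which is why it reads more clearly when the number of children varies.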

For complete code, please refer to

The code added to the original program is as follows:

var fetched = struct {
	sync.Mutex // protects m, which is written from multiple goroutines
	m map[string]error
}{m: map[string]error{}}

// Crawl uses fetcher to recursively crawl
// pages starting with url, to a maximum of depth.
func Crawl(url string, depth int, fetcher Fetcher) {
	fetched.Lock()
	if _, ok := fetched.m[url]; ok || depth <= 0 {
		fetched.Unlock()
		return
	}
	fetched.Unlock()

	body, urls, err := fetcher.Fetch(url)

	fetched.Lock()
	fetched.m[url] = err
	fetched.Unlock()

	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("Found: %s %q\n", url, body)

	var wg sync.WaitGroup
	for _, u := range urls {
		wg.Add(1) // must run before the goroutine starts, not inside it
		go func(url string) {
			defer wg.Done()
			Crawl(url, depth-1, fetcher)
		}(u)
	}
	wg.Wait() // block until every child goroutine has finished
}
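Since the complete program is not reproduced above, here is a self-contained version that can be run directly. The Fetcher interface and the fakeFetcher/fakeResult types are modeled on the exercise's skeleton; the URLs and page data below are made up for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Fetcher matches the interface from the exercise skeleton.
type Fetcher interface {
	// Fetch returns the body of URL and a slice of URLs found on that page.
	Fetch(url string) (body string, urls []string, err error)
}

// fakeFetcher is an in-memory Fetcher; the data is illustrative.
type fakeFetcher map[string]*fakeResult

type fakeResult struct {
	body string
	urls []string
}

func (f fakeFetcher) Fetch(url string) (string, []string, error) {
	if res, ok := f[url]; ok {
		return res.body, res.urls, nil
	}
	return "", nil, fmt.Errorf("not found: %s", url)
}

// fetched records every URL we have attempted, guarded by a mutex.
var fetched = struct {
	sync.Mutex
	m map[string]error
}{m: map[string]error{}}

// Crawl uses fetcher to recursively crawl pages starting with url,
// to a maximum of depth, waiting on children with a WaitGroup.
func Crawl(url string, depth int, fetcher Fetcher) {
	fetched.Lock()
	if _, ok := fetched.m[url]; ok || depth <= 0 {
		fetched.Unlock()
		return
	}
	fetched.Unlock()

	body, urls, err := fetcher.Fetch(url)

	fetched.Lock()
	fetched.m[url] = err
	fetched.Unlock()

	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("Found: %s %q\n", url, body)

	var wg sync.WaitGroup
	for _, u := range urls {
		wg.Add(1)
		go func(url string) {
			defer wg.Done()
			Crawl(url, depth-1, fetcher)
		}(u)
	}
	wg.Wait()
}

func main() {
	fetcher := fakeFetcher{
		"https://example.org/":  &fakeResult{"home", []string{"https://example.org/a"}},
		"https://example.org/a": &fakeResult{"page a", []string{"https://example.org/"}},
	}
	Crawl("https://example.org/", 3, fetcher)
}
```

Note that because the duplicate check and the map write happen under separate lock acquisitions, two goroutines racing on the same new URL could both fetch it; the map itself stays consistent, which is all this exercise requires.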
