Named capture groups in Go regular expressions

Time:2022-1-14

Regular expressions support capture groups, and Go's regexp package also supports named groups, written as (?P<name>...).

A single match

The scenario is as follows:

There is one line of text in the format: name, age, email address.

The goal is to convert it to a map.

The code is as follows:

str := `Alice 20 [email protected]`
// Named groups make the intent of each part of the pattern clearer
re := regexp.MustCompile(`(?P<name>[a-zA-Z]+)\s+(?P<age>\d+)\s+(?P<email>\w+@\w+(?:\.\w+)+)`)
match := re.FindStringSubmatch(str)
groupNames := re.SubexpNames()
fmt.Printf("%v, %v, %d, %d\n", match, groupNames, len(match), len(groupNames))
result := make(map[string]string)
// Convert to a map keyed by group name
for i, name := range groupNames {
    if i != 0 && name != "" { // index 0 is the whole match, which has an empty name
        result[name] = match[i]
    }
}
prettyResult, _ := json.MarshalIndent(result, "", "  ")
fmt.Printf("%s\n", prettyResult)

Output is:


[Alice 20 [email protected] Alice 20 [email protected]], [ name age email], 4, 4
{
  "age": "20",
  "email": "[email protected]",
  "name": "Alice"
}

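Note that groupNames, printed above as [ name age email], has four elements. The first is the empty string "", because index 0 corresponds to the whole match, which has no name.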
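One thing the snippet above does not handle is a line that fails to match: FindStringSubmatch then returns nil, and indexing match[i] would panic. A minimal sketch of a reusable helper that guards against this is shown below. The names namedGroups and userRe and the example.com address are assumptions made for illustration; they are not part of the regexp package or of the original example.

```go
package main

import (
	"fmt"
	"regexp"
)

// userRe is the pattern from the example above, with a plain \w+@\w+ email part.
var userRe = regexp.MustCompile(`(?P<name>[a-zA-Z]+)\s+(?P<age>\d+)\s+(?P<email>\w+@\w+(?:\.\w+)+)`)

// namedGroups is a hypothetical helper (not part of the standard library).
// It returns the named captures of the first match of re in s,
// or nil when s does not match at all.
func namedGroups(re *regexp.Regexp, s string) map[string]string {
	match := re.FindStringSubmatch(s)
	if match == nil {
		return nil // no match: avoid indexing into a nil slice
	}
	result := make(map[string]string)
	for i, name := range re.SubexpNames() {
		if i != 0 && name != "" { // skip the whole match and unnamed groups
			result[name] = match[i]
		}
	}
	return result
}

func main() {
	fmt.Println(namedGroups(userRe, "Alice 20 alice@example.com"))
	fmt.Println(namedGroups(userRe, "no match here") == nil)
}
```

Returning nil for a non-match lets the caller tell "no match" apart from "matched, but every group was empty".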

Multiple matches

Building on the example above, here is a requirement closer to real-world use:

There is a file whose contents look roughly like this:

Alice 20 [email protected]
Bob 25 [email protected]
gerrylon 26 [email protected]
...
More

The task is the same as above, but this time the result is a slice of maps, one map per line.

The code is as follows:

// For simplicity, the file contents are represented directly as a string
usersStr := `
    Alice 20 [email protected]
    Bob 25 [email protected]
    gerrylon 26 [email protected]
`
userRe := regexp.MustCompile(`(?P<name>[a-zA-Z]+)\s+(?P<age>\d+)\s+(?P<email>\w+@\w+(?:\.\w+)+)`)
// FindAllStringSubmatch returns every match; -1 means no limit on the number of matches
users := userRe.FindAllStringSubmatch(usersStr, -1)
groupNames := userRe.SubexpNames()
var result []map[string]string // slice of map
// Loop over all matched rows
for _, user := range users {
    m := make(map[string]string)
    // Build one map per row
    for j, name := range groupNames {
        if j != 0 && name != "" {
            m[name] = strings.TrimSpace(user[j])
        }
    }
    result = append(result, m)
}
prettyResult, _ := json.MarshalIndent(result, "", "  ")
fmt.Println(string(prettyResult))

Output is:


[
  {
    "age": "20",
    "email": "[email protected]",
    "name": "Alice"
  },
  {
    "age": "25",
    "email": "[email protected]",
    "name": "Bob"
  },
  {
    "age": "26",
    "email": "[email protected]",
    "name": "gerrylon"
  }
]
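Since the scenario describes a file, another option is to process the input line by line and skip lines that do not match. The following is a minimal sketch under a few assumptions: parseUsers and lineRe are hypothetical names, the pattern is anchored with ^ and $ so that a whole line must match, and the example.com addresses are invented test data. For a real file, the scanner could wrap an *os.File instead of a strings.Reader.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// lineRe matches one complete "name age email" line, allowing surrounding spaces.
var lineRe = regexp.MustCompile(`^\s*(?P<name>[a-zA-Z]+)\s+(?P<age>\d+)\s+(?P<email>\w+@\w+(?:\.\w+)+)\s*$`)

// parseUsers is a hypothetical helper that converts each matching line
// of text into a map keyed by group name. Non-matching lines are skipped.
func parseUsers(text string) ([]map[string]string, error) {
	var users []map[string]string
	names := lineRe.SubexpNames()
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		match := lineRe.FindStringSubmatch(scanner.Text())
		if match == nil {
			continue // blank or malformed line
		}
		m := make(map[string]string)
		for i, name := range names {
			if i != 0 && name != "" {
				m[name] = match[i]
			}
		}
		users = append(users, m)
	}
	return users, scanner.Err()
}

func main() {
	input := "Alice 20 alice@example.com\nnot a user line\n  Bob 25 bob@example.com\n"
	users, err := parseUsers(input)
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	for _, u := range users {
		fmt.Println(u["name"], u["age"], u["email"])
	}
}
```

Because the surrounding whitespace is consumed by the pattern itself, no strings.TrimSpace call is needed on the captured values.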

Summary

Named groups make the meaning of each part of a regular expression clearer.

Converting the match to a map is easier for people to read, but it takes more code than reading group values directly by index.
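When only one or two groups are needed, a middle ground between a full map and hard-coded indexes is the SubexpIndex method, which looks up a group's index by name (it was added to the regexp package in Go 1.15 and returns -1 for an unknown name). The sketch below assumes the same pattern as the examples above; emailOf, userRe, and the example.com address are illustrative names and data only.

```go
package main

import (
	"fmt"
	"regexp"
)

var userRe = regexp.MustCompile(`(?P<name>[a-zA-Z]+)\s+(?P<age>\d+)\s+(?P<email>\w+@\w+(?:\.\w+)+)`)

// emailOf is a hypothetical helper that returns the "email" group of the
// first match in s. The boolean is false when s does not match.
func emailOf(s string) (string, bool) {
	match := userRe.FindStringSubmatch(s)
	if match == nil {
		return "", false
	}
	// SubexpIndex resolves the group name to its position in match.
	return match[userRe.SubexpIndex("email")], true
}

func main() {
	email, ok := emailOf("Bob 25 bob@example.com")
	fmt.Println(email, ok)
}
```

This keeps the pattern self-documenting through its group names without building a map for every match.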

Supplement: matching multiple values with named groups in Go

The code is as follows:


package main

import (
   "encoding/json"
   "fmt"
   "regexp"
)

func main() {
   str := `9x_xx:995:88` // the pattern also matches the shorter form `9x_xx:995`
   // Use named groups to capture several values in one match
   re := regexp.MustCompile(`(?P<fname>\w+):+(?P<mod>[1-9]*):*(?P<strlen>[0-9]*)`)
   match := re.FindStringSubmatch(str)
   groupNames := re.SubexpNames()
   fmt.Printf("%v, %v, %d, %d\n", match, groupNames, len(match), len(groupNames))

   result := make(map[string]string)
   // match is nil when nothing matched, so compare lengths before indexing
   if len(match) == len(groupNames) {
      // Convert to a map keyed by group name
      for i, name := range groupNames {
         if i != 0 && name != "" { // index 0 is the whole match, which has an empty name
            result[name] = match[i]
         }
      }
   }

   prettyResult, _ := json.MarshalIndent(result, "", "  ")
   fmt.Printf("%s\n", prettyResult)
}
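Because the last two groups use the * quantifier, they are optional: when a part is missing, the group still appears in the result, but with an empty string. The sketch below illustrates this with the same pattern; parseSpec and specRe are hypothetical names introduced for the example. It also shows one caveat of the pattern as written: [1-9]* cannot match the digit 0, so a value such as 100 is split between the mod and strlen groups.

```go
package main

import (
	"fmt"
	"regexp"
)

// specRe is the pattern from the code above.
var specRe = regexp.MustCompile(`(?P<fname>\w+):+(?P<mod>[1-9]*):*(?P<strlen>[0-9]*)`)

// parseSpec is a hypothetical helper that returns the named groups of the
// first match in s, or nil if s does not match.
func parseSpec(s string) map[string]string {
	match := specRe.FindStringSubmatch(s)
	if match == nil {
		return nil
	}
	result := make(map[string]string)
	for i, name := range specRe.SubexpNames() {
		if i != 0 && name != "" {
			result[name] = match[i]
		}
	}
	return result
}

func main() {
	full := parseSpec("9x_xx:995:88")
	short := parseSpec("9x_xx:995")
	zero := parseSpec("a:100")
	fmt.Printf("%q %q %q\n", full["fname"], full["mod"], full["strlen"])
	fmt.Printf("%q %q %q\n", short["fname"], short["mod"], short["strlen"]) // strlen is ""
	fmt.Printf("%q %q %q\n", zero["fname"], zero["mod"], zero["strlen"])    // mod stops before the first 0
}
```

If zeros are valid in the second field, changing [1-9]* to [0-9]* would be one way to avoid the split, at the cost of altering the original pattern.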
