diff --git a/en/eBook/07.3.md b/en/eBook/07.3.md
index 4ab0aa00..2d524455 100644
--- a/en/eBook/07.3.md
+++ b/en/eBook/07.3.md
@@ -1,24 +1,24 @@
 # 7.3 Regexp
 
-Regexp is a complicated but powerful tool for pattern match and text manipulation. Although its performance is lower than pure text match, it's more flexible. Base on its syntax, you can almost filter any kind of text from your source content. If you need to collect data in web development, it's not hard to use Regexp to have meaningful data.
+Regexp is a complicated but powerful tool for pattern matching and text manipulation. Although it does not perform as well as pure text matching, it's more flexible. Based on its syntax, you can filter almost any kind of text from your source content. If you need to collect data in web development, it's not hard to use Regexp to retrieve meaningful data.
 
-Go has package `regexp` as official support for regexp, if you've already used regexp in other programming languages, you should be familiar with it. Note that Go implemented RE2 standard except `\C`, more details: [http://code.google.com/p/re2/wiki/Syntax](http://code.google.com/p/re2/wiki/Syntax).
+Go has the `regexp` package, which provides official support for regexp. If you've already used regexp in other programming languages, you should be familiar with it. Note that Go implements the RE2 standard except for `\C`. For more details, follow this link: [http://code.google.com/p/re2/wiki/Syntax](http://code.google.com/p/re2/wiki/Syntax).
 
-Actually, package `strings` does many jobs like search(Contains, Index), replace(Replace), parse(Split, Join), etc. and it's faster than Regexp, but these are simple operations. If you want to search a string without case sensitive, Regexp should be your best choice. So if package `strings` can achieve your goal, just use it, it's easy to use and read; if you need to more advanced operation, use Regexp obviously.
+Go's `strings` package can actually do many jobs like searching (Contains, Index), replacing (Replace), parsing (Split, Join), etc., and it's faster than Regexp. However, these are all trivial operations. If you want to search a string case-insensitively, Regexp should be your best choice. So, if the `strings` package is sufficient for your needs, just use it since it's easy to use and read; if you need to perform more advanced operations, use Regexp.
 
-If you remember form verification we talked before, we used Regexp to verify if input information is valid there already. Be aware that all characters are UTF-8, and let's learn more about Go `regexp`!
+If you recall form verification from previous sections, we used Regexp to verify the validity of user input information. Be aware that all characters are UTF-8. Let's learn more about the Go `regexp` package!
 
 ## Match
 
-Package `regexp` has 3 functions to match, if it matches returns true, returns false otherwise.
+The `regexp` package has 3 functions for matching: each returns true if the input matches the pattern and false otherwise.
 
     func Match(pattern string, b []byte) (matched bool, error error)
     func MatchReader(pattern string, r io.RuneReader) (matched bool, error error)
     func MatchString(pattern string, s string) (matched bool, error error)
 
-All of 3 functions check if `pattern` matches input source, returns true if it matches, but if your Regex has syntax error, it will return error. The 3 input sources of these functions are `slice of byte`, `RuneReader` and `string`.
+All 3 functions check if `pattern` matches the input source, returning true if it matches. However, if your Regex has a syntax error, they will return an error. The 3 input sources of these functions are `slice of byte`, `RuneReader` and `string`.
 
-Here is an example to verify IP address:
+Here is an example of how to verify an IP address:
 
     func IsIP(ip string) (b bool) {
         if m, _ := regexp.MatchString("^[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}$", ip); !m {
@@ -27,7 +27,7 @@ Here is an example to verify IP address:
         return true
     }
 
-As you can see, using pattern in package `regexp` is not that different. One more example, to verify if user input is valid:
+As you can see, using patterns in the `regexp` package is not that different. Here's one more example on verifying if user input is valid:
 
     func main() {
         if len(os.Args) == 1 {
@@ -40,13 +40,13 @@ As you can see, using pattern in package `regexp` is not that different. One mor
         }
     }
 
-In above examples, we use `Match(Reader|Sting)` to check if content is valid, they are all easy to use.
+In the above examples, we use `Match(Reader|String)` to check if content is valid, and they are all easy to use.
 
 ## Filter
 
-Match mode can verify content, but it cannot cut, filter or collect data from content. If you want to do that, you have to use complex mode of Regexp.
+Match mode can verify content, but it cannot cut, filter or collect data from it. If you want to do that, you have to use the complex mode of Regexp.
 
-Sometimes we need to write a crawl, here is an example that shows you have to use Regexp to filter and cut data.
+Let's say we need to write a crawler. Here is an example that shows how to use Regexp to filter and cut data.
 
     package main
 
@@ -95,7 +95,7 @@ Sometimes we need to write a crawl, here is an example that shows you have to us
         fmt.Println(strings.TrimSpace(src))
     }
 
-In this example, we use Compile as the first step for complex mode. It verifies if your Regex syntax is correct, then returns `Regexp` for parsing content in other operations.
+In this example, we use Compile as the first step for complex mode. It verifies that your Regex syntax is correct, then returns a `Regexp` for parsing content in other operations.
 
 Here are some functions to parse your Regexp syntax:
 
@@ -104,9 +104,9 @@ Here are some functions to parse your Regexp syntax:
     func MustCompile(str string) *Regexp
     func MustCompilePOSIX(str string) *Regexp
 
-The difference between `ComplePOSIX` and `Compile` is that the former has to use POSIX syntax which is leftmost longest search, and the latter is only leftmost search. For instance, for Regexp `[a-z]{2,4}` and content `"aa09aaa88aaaa"`, `CompilePOSIX` returns `aaaa` but `Compile` returns `aa`. `Must` prefix means panic when the Regexp syntax is not correct, returns error only otherwise.
+The difference between `CompilePOSIX` and `Compile` is that the former uses POSIX syntax with leftmost-longest matching, while the latter uses leftmost-first matching. For instance, for Regexp `a|ab` and content `"ab"`, `CompilePOSIX` returns `ab` but `Compile` returns `a`. The `Must` prefix means the function panics when the Regexp syntax is not correct, whereas the versions without it return an error instead.
 
-After you knew how to create a new Regexp, let's see this struct provides what methods that help us to operate content:
+Now that we know how to create a new Regexp, let's see how the methods provided by this struct can help us to operate on content:
 
     func (re *Regexp) Find(b []byte) []byte
     func (re *Regexp) FindAll(b []byte, n int) [][]byte
@@ -127,7 +127,7 @@ After you knew how to create a new Regexp, let's see this struct provides what m
     func (re *Regexp) FindSubmatch(b []byte) [][]byte
     func (re *Regexp) FindSubmatchIndex(b []byte) []int
 
-These 18 methods including same function for different input sources(byte slice, string and io.RuneReader), we can simplify it by ignoring input sources as follows:
+These 18 methods include identical functions for different input sources (byte slice, string and io.RuneReader), so we can simplify this list by ignoring input sources as follows:
 
     func (re *Regexp) Find(b []byte) []byte
     func (re *Regexp) FindAll(b []byte, n int) [][]byte
@@ -194,13 +194,13 @@ Code sample:
         fmt.Println(submatchallindex)
     }
 
-As we introduced before, Regexp also has 3 methods for matching, they do exactly same thing as exported functions, those exported functions call these methods underlying:
+As we've previously introduced, Regexp also has 3 methods for matching. They do the exact same things as the exported functions. In fact, those exported functions actually call these methods under the hood:
 
     func (re *Regexp) Match(b []byte) bool
     func (re *Regexp) MatchReader(r io.RuneReader) bool
     func (re *Regexp) MatchString(s string) bool
 
-Next, let's see how to do displacement through Regexp:
+Next, let's see how to replace strings using Regexp:
 
     func (re *Regexp) ReplaceAll(src, repl []byte) []byte
     func (re *Regexp) ReplaceAllFunc(src []byte, repl func([]byte) []byte) []byte
@@ -209,14 +209,14 @@ Next, let's see how to do displacement through Regexp:
     func (re *Regexp) ReplaceAllString(src, repl string) string
     func (re *Regexp) ReplaceAllStringFunc(src string, repl func(string) string) string
 
-These are used in crawl example, so we don't explain more here.
+These are used in the crawling example, so we won't explain them further here.
 
-Let's take a look at explanation of `Expand`:
+Let's take a look at the definition of `Expand`:
 
     func (re *Regexp) Expand(dst []byte, template []byte, src []byte, match []int) []byte
     func (re *Regexp) ExpandString(dst []byte, template string, src string, match []int) []byte
 
-So how to use `Expand`?
+So how do we use `Expand`?
 
     func main() {
         src := []byte(`
@@ -232,10 +232,10 @@ So how to use `Expand`?
         fmt.Println(string(res))
     }
 
-At this point, you learned whole package `regexp` in Go, I hope you can understand more by studying examples of key methods, and do something interesting by yourself.
+At this point, you've learned the whole `regexp` package in Go. I hope that you can understand more by studying examples of key methods, so that you can do something interesting on your own.
 
 ## Links
 
 - [Directory](preface.md)
 - Previous section: [JSON](07.2.md)
-- Next section: [Templates](07.4.md)
\ No newline at end of file
+- Next section: [Templates](07.4.md)
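
Review note on the `Compile` / `CompilePOSIX` paragraph in this patch: the following is a minimal Go sketch of how leftmost-first and leftmost-longest matching can differ. The helper names `firstMatch` and `firstMatchPOSIX` are hypothetical and exist only for this illustration; the `regexp` calls themselves come from the standard library.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatch (hypothetical helper) compiles pattern with the default
// leftmost-first semantics and returns the first match found in s.
func firstMatch(pattern, s string) string {
	return regexp.MustCompile(pattern).FindString(s)
}

// firstMatchPOSIX (hypothetical helper) compiles pattern with POSIX
// leftmost-longest semantics and returns the first match found in s.
func firstMatchPOSIX(pattern, s string) string {
	return regexp.MustCompilePOSIX(pattern).FindString(s)
}

func main() {
	// Both alternatives can match at index 0, so the two modes
	// disagree on which alternative wins.
	fmt.Println(firstMatch(`a|ab`, "ab"))      // a
	fmt.Println(firstMatchPOSIX(`a|ab`, "ab")) // ab

	// With a single greedy repetition, only one match can start at the
	// leftmost position, so both modes return the same result.
	fmt.Println(firstMatch(`[a-z]{2,4}`, "aa09aaa88aaaa"))      // aa
	fmt.Println(firstMatchPOSIX(`[a-z]{2,4}`, "aa09aaa88aaaa")) // aa
}
```

The two modes can only disagree when more than one match is able to start at the same leftmost position, which is why an alternation such as `a|ab` exposes the difference while a single repetition such as `[a-z]{2,4}` does not.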