Improve clarity, grammar and typos in 03.0.md and 03.1.md

Anchor
2014-09-06 23:19:13 -07:00
committed by James Miranda
parent c81135e50f
commit e7e0dbd3c9
2 changed files with 42 additions and 39 deletions


@@ -1,6 +1,6 @@
#3 Web foundation
The reason you are reading this book is that you want to learn to build web applications in Go. As I said before, Go provides many powerful packages like `http`. It helps you a lot when you build web applications. I'll teach you everything you should know in following chapters, and we'll talk about some concepts of the web and how to run web applications in Go in this chapter.
The reason you are reading this book is that you want to learn to build web applications in Go. As I've said before, Go provides many powerful packages like `http`. These packages can help you a lot when trying to build web applications. I'll teach you everything you need to know in the following chapters, and we'll talk about some concepts of the web and how to run web applications in Go in this chapter.
## Links


@@ -1,70 +1,73 @@
# Web working principles
Every time you open your browsers, type some URLs and press enter, then you will see the beautiful web pages appear on you screen. But do you know what is happening behind this simple action?
Every time you open your browsers, type some URLs and press enter, you will see beautiful web pages appear on your screen. But do you know what is happening behind these simple actions?
Normally, your browser is a client, after you typed URL, it sends requests to DNS server, get IP address of the URL. Then it finds the server in that IP address, asks to setup TCP connections. When the browser finished sending HTTP requests, server starts handling your request packages, then return HTTP response packages to your browser. Finally, the browser renders bodies of the web pages, and disconnects from the server.
Normally, your browser is a client. After you type a URL, it sends a request to a DNS server in order to get the IP address of the host in that URL. Then it finds the server at that IP address and asks to set up a TCP connection. When the browser has finished sending HTTP requests, the server starts handling your request packages, then returns HTTP response packages to your browser. Finally, the browser renders the bodies of the web pages and disconnects from the server.
![](images/3.1.web2.png?raw=true)
Figure 3.1 Processes of users visit a website
A web server also known as a HTTP server, it uses HTTP protocol to communicate with clients. All web browsers can be seen as clients.
A web server, also known as an HTTP server, uses the HTTP protocol to communicate with clients. All web browsers can be considered as clients.
We can divide web working principles to following steps:
We can divide the web's working principles into the following steps:
- Client uses TCP/IP protocol to connect to server.
- Client sends HTTP request packages to server.
- Server returns HTTP response packages to client, if request resources including dynamic scripts, server calls script engine first.
- Server returns HTTP response packages to client. If the requested resources include dynamic scripts, server calls script engine first.
- Client disconnects from server, starts rendering HTML.
This a simple work flow of HTTP affairs, notice that every time server closes connections after sent data to clients, and waits for next request.
This is a simple work flow of HTTP affairs. Notice that the server closes the connection after sending data to the client every time, then waits for the next request.
## URL and DNS resolution
We are always using URL to access web pages, but do you know how URL works?
We always use URLs to access web pages, but do you know how URLs work?
The full name of URL is Uniform Resource Locator, this is for describing resources on the internet. Its basic form as follows.
The full name of a URL is Uniform Resource Locator. It's for describing resources on the internet and its basic form is as follows.
scheme://host[:port#]/path/.../[?query-string][#anchor]
scheme assign underlying protocol(such as HTTP, HTTPS, ftp)
scheme assign underlying protocol (such as HTTP, HTTPS, FTP)
host IP or domain name of HTTP server
port# default port is 80, and you can omit in this case. If you want to use other ports, you must to specify which port. For example, http://www.cnblogs.com:8080/
port# default port is 80, and it can be omitted in this case. If you want to use other ports, you must specify which port. For example, http://www.cnblogs.com:8080/
path resources path
query-string data are sent to server
anchor anchor
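The parts of the URL form above can be pulled apart with Go's standard `net/url` package. This is only a minimal sketch: the `urlParts` type, the `describeURL` helper, and the path, query string and anchor in the sample URL are made up for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// urlParts holds the pieces named in the URL form above (hypothetical type).
type urlParts struct {
	Scheme, Host, Port, Path, Query, Anchor string
}

// describeURL parses raw and returns its individual components.
func describeURL(raw string) (urlParts, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return urlParts{}, err
	}
	return urlParts{
		Scheme: u.Scheme,
		Host:   u.Hostname(), // host without the port
		Port:   u.Port(),     // empty when the URL omits the port
		Path:   u.Path,
		Query:  u.RawQuery, // text after "?"
		Anchor: u.Fragment, // text after "#"
	}, nil
}

func main() {
	p, err := describeURL("http://www.cnblogs.com:8080/posts/edit?name=test1&id=123456#top")
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Printf("%+v\n", p)
}
```

Note that `Port` comes back empty when the port is omitted, so the caller has to apply the default (80 for HTTP) itself.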
DNS is abbreviation of Domain Name System, it's the name system for computer network services, it converts domain name to actual IP addresses, just like a translator.
DNS is an abbreviation of Domain Name System. It's the naming system for computer network services, and it converts domain names to actual IP addresses, just like a translator.
![](images/3.1.dns_hierachy.png?raw=true)
Figure 3.2 DNS working principles
To understand more about its working principle, let's see detailed DNS resolution process as follows.
To understand more about its working principle, let's see the detailed DNS resolution process as follows.
1. After typed domain name `www.qq.com` in the browser, operating system will check if there is any mapping relationship in the hosts file for this domain name, if so then finished the domain name resolution.
2. If no mapping relationship in the hosts file, operating system will check if there is any cache in the DNS, if so then finished the domain name resolution.
3. If no mapping relationship in the hosts and DNS cache, operating system finds the first DNS resolution server in your TCP/IP setting, which is local DNS server at this time. When local DNS server received query, if the domain name that you want to query is contained in the local configuration of regional resources, then gives back results to the client. This DNS resolution is authoritative.
4. If local DNS server doesn't contain the domain name, and there is a mapping relationship in the cache, local DNS server gives back this result to client. This DNS resolution is not authoritative.
5. If local DNS server cannot resolve this domain name either by configuration of regional resource or cache, it gets into next step depends on the local DNS server setting. If the local DNS server doesn't enable forward mode, it sends request to root DNS server, then returns the IP of top level DNS server may know this domain name, `.com` in this case. If the first top level DNS server doesn't know, it sends request to next top level DNS server until the one that knows the domain name. Then the top level DNS server asks next level DNS server for `qq.com`, then finds the `www.qq.com` in some servers.
6. If the local DNS server enabled forward mode, it sends request to upper level DNS server, if the upper level DNS server also doesn't know the domain name, then keep sending request to upper level. Whether local DNS server enables forward mode, server's IP address of domain name returns to local DNS server, and local server sends it to clients.
1. After typing the domain name `www.qq.com` in the browser, the operating system will check if there is any mapping relationship in the hosts file for this domain name. If so, then the domain name resolution is complete.
2. If no mapping relationship exists in the hosts file, the operating system will check if there is a cached entry in its DNS cache. If so, then the domain name resolution is complete.
3. If no mapping relationship exists in either the hosts file or the DNS cache, the operating system finds the first DNS resolution server in your TCP/IP settings, which is likely your local DNS server. When the local DNS server receives the query, if the domain name that you want to query is contained within the local configuration of its regional resources, it returns the results to the client. This DNS resolution is authoritative.
4. If the local DNS server doesn't contain the domain name but a mapping relationship exists in the cache, the local DNS server gives back this result to the client. This DNS resolution is not authoritative.
5. If the local DNS server cannot resolve this domain name either by configuration of regional resources or cache, it will proceed to the next step, which depends on the local DNS server's settings.
   - If the local DNS server doesn't enable forwarding, it routes the request to the root DNS server, then returns the IP address of a top level DNS server which may know the domain name, `.com` in this case. If the first top level DNS server doesn't recognize the domain name, it again reroutes the request to the next top level DNS server until it reaches one that recognizes the domain name. Then the top level DNS server asks this next level DNS server for the IP address corresponding to `www.qq.com`.
   - If the local DNS server has forwarding enabled, it sends the request to an upper level DNS server. If the upper level DNS server also doesn't recognize the domain name, then the request keeps getting rerouted to higher levels until it finally reaches a DNS server which recognizes the domain name.
Whether or not the local DNS server enables forwarding, the IP address of the domain name always returns to the local DNS server, and the local DNS server sends it back to the client.
![](images/3.1.dns_inquery.png?raw=true)
Figure 3.3 DNS resolution work flow
`Recursive query process` means the enquirers are changing in the process, and enquirers do not change in `Iterative query process`.
`Recursive query process` simply means that the enquirers change in the process. Enquirers do not change in `Iterative query` processes.
Now we know clients get IP addresses in the end, so the browsers are communicating with servers through IP addresses.
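From a Go program, this whole resolution process sits behind a single call to the standard `net.LookupHost` function. The sketch below is only an illustration: `resolve` is a hypothetical wrapper, and which of the steps above actually run (hosts file, cache, forwarding) depends on how the operating system and the local DNS server are configured.

```go
package main

import (
	"fmt"
	"net"
)

// resolve asks the local resolver for the addresses of host.
// If host is already an IP literal, it is returned unchanged.
func resolve(host string) ([]string, error) {
	return net.LookupHost(host)
}

func main() {
	addrs, err := resolve("www.qq.com")
	if err != nil {
		// Expected on machines without network access or DNS.
		fmt.Println("lookup failed:", err)
		return
	}
	for _, a := range addrs {
		fmt.Println(a)
	}
}
```

The program only sees the final list of addresses; it cannot tell from this call whether the answer was authoritative or came from a cache.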
## HTTP protocol
HTTP protocol is the core part of web services. It's important to know what is HTTP protocol before you understand how web works.
The HTTP protocol is a core part of web services. It's important to know what the HTTP protocol is before you understand how the web works.
HTTP is the protocol that used for communicating between browsers and web servers, it is based on TCP protocol, and usually use port 80 in the web server side. It is a protocol that uses request-response model, clients send request and servers response. According to HTTP protocol, clients always setup a new connection and send a HTTP request to server in every affair. Server is not able to connect to client proactively, or a call back connection. The connection between the client and the server can be closed by any side. For example, you can cancel your download task and HTTP connection. It disconnects from server before you finish downloading.
HTTP is the protocol that is used to facilitate communication between browsers and web servers. It is based on the TCP protocol and usually uses port 80 on the side of the web server. It is a protocol that utilizes the request-response model: clients send requests and servers respond. According to the HTTP protocol, clients always set up new connections and send HTTP requests to servers. Servers are not able to connect to clients proactively, or establish callback connections. The connection between a client and a server can be closed by either side. For example, you can cancel your download request and HTTP connection and your browser will disconnect from the server before you finish downloading.
HTTP protocol is stateless, which means the server has no idea about the relationship between two connections, even though they are both from same client. To solve this problem, web applications use Cookie to maintain sustainable state of connections.
The HTTP protocol is stateless, which means the server has no idea about the relationship between two connections, even when they both come from the same client. To solve this problem, web applications use cookies to maintain the state of connections.
Because HTTP protocol is based on TCP protocol, so all TCP attacks will affect the HTTP communication in your server, such as SYN Flood, DoS and DDoS.
Because the HTTP protocol is based on the TCP protocol, all TCP attacks will affect HTTP communications in your server. Examples of such attacks are SYN flooding, DoS and DDoS attacks.
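The request-response model can be tried out in Go without opening a real connection, by using the standard `net/http/httptest` package. This is a minimal sketch; the `hello` handler and the `roundTrip` helper are hypothetical names, and an in-memory recorder stands in for the TCP connection.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// hello is a hypothetical handler: it receives one request and writes one response.
func hello(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "you requested %s with %s", r.URL.Path, r.Method)
}

// roundTrip feeds a single in-memory request to the handler and
// returns the status code and body of the response it produced.
func roundTrip(method, target string) (int, string) {
	req := httptest.NewRequest(method, target, nil)
	rec := httptest.NewRecorder()
	hello(rec, req)

	res := rec.Result()
	defer res.Body.Close()
	body, _ := io.ReadAll(res.Body)
	return res.StatusCode, string(body)
}

func main() {
	code, body := roundTrip(http.MethodGet, "/index.html")
	fmt.Println(code, body)
}
```

The handler can only answer a request it has been given; it has no way to contact the client on its own, which mirrors the rule that servers cannot connect to clients proactively.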
### HTTP request package (browser information)
@@ -79,19 +82,19 @@ Request packages all have three parts: request line, request header, and body. T
// blank line
// body, request resource arguments (for example, arguments in POST)
We use fiddler to get following request information.
We use fiddler to get the following request information.
![](images/3.1.http.png?raw=true)
Figure 3.4 Information of GET method caught by fiddler
Figure 3.4 Information of a GET request caught by fiddler
![](images/3.1.httpPOST.png?raw=true)
Figure 3.5 Information of POST method caught by fiddler
Figure 3.5 Information of a POST request caught by fiddler
**We can see that GET method doesn't have request body that POST does.**
**We can see that GET does not have a request body, unlike POST, which does.**
There are many methods you can use to communicate with servers in HTTP, and GET, POST, PUT, DELETE are the basic 4 methods that we use. A URL represented a resource on the network, so these 4 method means query, change, add and delete operations. GET and POST are most commonly used in HTTP. GET appends data to the URL and uses `?` to break up them, uses `&` between arguments, like `EditPosts.aspx?name=test1&id=123456`. POST puts data in the request body because URL has length limitation by browsers, so POST can submit much more data than GET method. Also when we submit our user name and password, we don't want this kind of information appear in the URL, so we use POST to keep them invisible.
There are many methods you can use to communicate with servers in HTTP; GET, POST, PUT and DELETE are the 4 basic methods that we typically use. A URL represents a resource on a network, so these 4 methods define the query, change, add and delete operations that can act on these resources. GET and POST are very commonly used in HTTP. GET can append query parameters to the URL, using `?` to separate the URL and parameters and `&` between the arguments, like `EditPosts.aspx?name=test1&id=123456`. POST puts data in the request body because browsers limit the length of URLs. Thus, POST can submit much more data than GET. Also, when we submit user names and passwords, we don't want this kind of information to appear in the URL, so we use POST to keep them invisible.
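The difference between where GET and POST carry their arguments can be seen by building both kinds of request with `http.NewRequest`. This is a sketch only: `example.com`, `sampleArgs`, `buildGET` and `buildPOST` are assumptions made for the example, and no request is actually sent.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// sampleArgs returns the two arguments used in the example URL above.
func sampleArgs() url.Values {
	return url.Values{"name": {"test1"}, "id": {"123456"}}
}

// buildGET puts the arguments in the URL's query string.
func buildGET(base string, args url.Values) (*http.Request, error) {
	return http.NewRequest(http.MethodGet, base+"?"+args.Encode(), nil)
}

// buildPOST puts the same arguments in the request body instead.
func buildPOST(base string, args url.Values) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, base, strings.NewReader(args.Encode()))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	return req, nil
}

func main() {
	base := "http://example.com/EditPosts.aspx" // hypothetical host
	get, _ := buildGET(base, sampleArgs())
	post, _ := buildPOST(base, sampleArgs())
	fmt.Println("GET  URL:", get.URL.String())
	fmt.Println("POST URL:", post.URL.String(), "body bytes:", post.ContentLength)
}
```

`Encode` sorts the arguments by key, so the order in the generated query string may differ from the order in which they were written.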
### HTTP response package (server information)
@@ -107,9 +110,9 @@ Let's see what information is contained in the response packages.
// blank line
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"... // message body
The first line is called status line, it has HTTP version, status code and statue message.
The first line is called the status line. It supplies the HTTP version, status code and status message.
Status code tells the client is HTTP server has expectation response. In HTTP/1.1, we defined 5 kinds of status code.
The status code informs the client of the status of the HTTP server's response. In HTTP/1.1, 5 kinds of status codes were defined:
- 1xx Informational
- 2xx Success
@@ -117,7 +120,7 @@ Status code tells the client is HTTP server has expectation response. In HTTP/1.
- 4xx Client Error
- 5xx Server Error
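Since the class of a status code is decided by its first digit, a few lines of Go are enough to classify one. The `statusClass` helper below is a hypothetical name used for illustration; `http.StatusText` is the standard library function that returns the message for a code.

```go
package main

import (
	"fmt"
	"net/http"
)

// statusClass maps a status code to one of the five HTTP/1.1 classes.
func statusClass(code int) string {
	switch code / 100 {
	case 1:
		return "Informational"
	case 2:
		return "Success"
	case 3:
		return "Redirection"
	case 4:
		return "Client Error"
	case 5:
		return "Server Error"
	}
	return "Unknown"
}

func main() {
	for _, code := range []int{200, 302, 404, 500} {
		fmt.Println(code, http.StatusText(code), "->", statusClass(code))
	}
}
```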
Let's see more examples about response packages, 200 means server responded correctly, 302 means redirection.
Let's see more examples of response packages. 200 means the server responded correctly, and 302 means redirection.
![](images/3.1.response.png?raw=true)
@@ -125,23 +128,23 @@ Figure 3.6 Full information for visiting a website
### HTTP is stateless and Connection: keep-alive
Stateless doesn't means server has no ability to keep a connection, in other words, server doesn't know any relationship between any two requests.
The term stateless doesn't mean that the server has no ability to keep a connection. It simply means that the server doesn't recognize any relationships between any two requests.
In HTTP/1.1, Keep-alive is used as default, if clients have more requests, they will use the same connection for many different requests.
In HTTP/1.1, Keep-alive is used by default. If clients have additional requests, they will use the same connection for them.
Notice that Keep-alive cannot keep one connection forever, the software runs in the server has certain time to keep connection, and you can change it.
Notice that Keep-alive cannot maintain one connection forever; the application running on the server determines how long to keep the connection alive, and in most cases you can configure this limit.
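In a Go server, one place where this limit can be configured is the `IdleTimeout` field of `http.Server`, which bounds how long an idle keep-alive connection waits for the next request. The sketch below only builds the configuration and does not start listening; `newServer`, `idleLimit`, the address and the 30 second value are assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// idleLimit is an arbitrary example value, not a recommendation.
const idleLimit = 30 * time.Second

// newServer returns a server that drops keep-alive connections
// once they have been idle for longer than idle.
func newServer(addr string, idle time.Duration) *http.Server {
	return &http.Server{
		Addr:        addr,
		Handler:     http.NotFoundHandler(), // placeholder handler
		IdleTimeout: idle,
	}
}

func main() {
	srv := newServer(":8080", idleLimit)
	fmt.Println("would listen on", srv.Addr, "with idle timeout", srv.IdleTimeout)
	// Calling srv.ListenAndServe() would start serving; omitted in this sketch.
}
```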
## Request instance
![](images/3.1.web.png?raw=true)
Figure 3.7 All packages for open one web page
Figure 3.7 All packages for opening one web page
We can see the whole process of communication between the client and server by above picture. You may notice that there are many resource files in the list, they are called static files, and Go has specialized processing methods for these files.
We can see the entire communication process between client and server from the above picture. You may notice that there are many resource files in the list; these are called static files, and Go has specialized processing methods for these files.
This is the most important function of browsers, request for a URL and get data from web servers, then render HTML for good user interface. If it finds some files in the DOM, such as CSS or JS files, browsers will request for these resources from server again, until all the resources finished rendering on your screen.
This is the most important function of browsers: to request a URL and retrieve data from web servers, then render the HTML. If the browser finds references to other files in the DOM, such as CSS or JS files, it requests these resources from the server as well, until all the resources have finished rendering on your screen.
Reduce HTTP request times is one of the methods that improves speed of loading web pages, which is reducing CSS and JS files, it reduces pressure in the web servers at the same time.
Reducing the number of HTTP requests is one way of improving the loading speed of web pages. By reducing the number of CSS and JS files that need to be loaded, both request latencies and pressure on your web servers can be reduced at the same time.
## Links