Merging other languages
This commit is contained in:
216
en/06.1.md
216
en/06.1.md
@@ -1,105 +1,111 @@
|
||||
# 6.1 session和cookie
|
||||
session和cookie是网站浏览中较为常见的两个概念,也是比较难以辨析的两个概念,但它们在浏览需要认证的服务页面以及页面统计中却相当关键。我们先来了解一下session和cookie怎么来的?考虑这样一个问题:
|
||||
|
||||
如何抓取一个访问受限的网页?如新浪微博好友的主页,个人微博页面等。
|
||||
|
||||
显然,通过浏览器,我们可以手动输入用户名和密码来访问页面,而所谓的“抓取”,其实就是使用程序来模拟完成同样的工作,因此我们需要了解“登陆”过程中到底发生了什么。
|
||||
|
||||
当用户来到微博登陆页面,输入用户名和密码之后点击“登录”后浏览器将认证信息POST给远端的服务器,服务器执行验证逻辑,如果验证通过,则浏览器会跳转到登录用户的微博首页,在登录成功后,服务器如何验证我们对其他受限制页面的访问呢?因为HTTP协议是无状态的,所以很显然服务器不可能知道我们已经在上一次的HTTP请求中通过了验证。当然,最简单的解决方案就是所有的请求里面都带上用户名和密码,这样虽然可行,但大大加重了服务器的负担(对于每个request都需要到数据库验证),也大大降低了用户体验(每个页面都需要重新输入用户名密码,每个页面都带有登录表单)。既然直接在请求中带上用户名与密码不可行,那么就只有在服务器或客户端保存一些类似的可以代表身份的信息了,所以就有了cookie与session。
|
||||
|
||||
cookie,简而言之就是在本地计算机保存一些用户操作的历史信息(当然包括登录信息),并在用户再次访问该站点时浏览器通过HTTP协议将本地cookie内容发送给服务器,从而完成验证,或继续上一步操作。
|
||||
|
||||

|
||||
|
||||
图6.1 cookie的原理图
|
||||
|
||||
session,简而言之就是在服务器上保存用户操作的历史信息。服务器使用session id来标识session,session id由服务器负责产生,保证随机性与唯一性,相当于一个随机密钥,避免在握手或传输中暴露用户真实密码。但该方式下,仍然需要将发送请求的客户端与session进行对应,所以可以借助cookie机制来获取客户端的标识(即session id),也可以通过GET方式将id提交给服务器。
|
||||
|
||||

|
||||
|
||||
图6.2 session的原理图
|
||||
|
||||
## cookie
|
||||
Cookie是由浏览器维持的,存储在客户端的一小段文本信息,伴随着用户请求和页面在Web服务器和浏览器之间传递。用户每次访问站点时,Web应用程序都可以读取cookie包含的信息。浏览器设置里面有cookie隐私数据选项,打开它,可以看到很多已访问网站的cookies,如下图所示:
|
||||
|
||||

|
||||
|
||||
图6.3 浏览器端保存的cookie信息
|
||||
|
||||
cookie是有时间限制的,根据生命期不同分成两种:会话cookie和持久cookie;
|
||||
|
||||
如果不设置过期时间,则表示这个cookie生命周期为从创建到浏览器关闭止,只要关闭浏览器窗口,cookie就消失了。这种生命期为浏览会话期的cookie被称为会话cookie。会话cookie一般不保存在硬盘上而是保存在内存里。
|
||||
|
||||
如果设置了过期时间(setMaxAge(60*60*24)),浏览器就会把cookie保存到硬盘上,关闭后再次打开浏览器,这些cookie依然有效直到超过设定的过期时间。存储在硬盘上的cookie可以在不同的浏览器进程间共享,比如两个IE窗口。而对于保存在内存的cookie,不同的浏览器有不同的处理方式。
|
||||
|
||||
|
||||
### Go设置cookie
|
||||
Go语言中通过net/http包中的SetCookie来设置:
|
||||
|
||||
http.SetCookie(w ResponseWriter, cookie *Cookie)
|
||||
|
||||
w表示需要写入的response,cookie是一个struct,让我们来看一下cookie对象是怎么样的
|
||||
|
||||
type Cookie struct {
|
||||
Name string
|
||||
Value string
|
||||
Path string
|
||||
Domain string
|
||||
Expires time.Time
|
||||
RawExpires string
|
||||
|
||||
// MaxAge=0 means no 'Max-Age' attribute specified.
|
||||
// MaxAge<0 means delete cookie now, equivalently 'Max-Age: 0'
|
||||
// MaxAge>0 means Max-Age attribute present and given in seconds
|
||||
MaxAge int
|
||||
Secure bool
|
||||
HttpOnly bool
|
||||
Raw string
|
||||
Unparsed []string // Raw text of unparsed attribute-value pairs
|
||||
}
|
||||
|
||||
我们来看一个例子,如何设置cookie
|
||||
|
||||
expiration := time.Now()
|
||||
expiration = expiration.AddDate(1, 0, 0)
|
||||
cookie := http.Cookie{Name: "username", Value: "astaxie", Expires: expiration}
|
||||
http.SetCookie(w, &cookie)
|
||||
|
||||
|
||||
### Go读取cookie
|
||||
上面的例子演示了如何设置cookie数据,我们这里来演示一下如何读取cookie
|
||||
|
||||
cookie, _ := r.Cookie("username")
|
||||
fmt.Fprint(w, cookie)
|
||||
|
||||
还有另外一种读取方式
|
||||
|
||||
for _, cookie := range r.Cookies() {
|
||||
fmt.Fprint(w, cookie.Name)
|
||||
}
|
||||
|
||||
可以看到通过request获取cookie非常方便。
|
||||
|
||||
## session
|
||||
|
||||
session,中文经常翻译为会话,其本来的含义是指有始有终的一系列动作/消息,比如打电话是从拿起电话拨号到挂断电话这中间的一系列过程可以称之为一个session。然而当session一词与网络协议相关联时,它又往往隐含了“面向连接”和/或“保持状态”这样两个含义。
|
||||
|
||||
session在Web开发环境下的语义又有了新的扩展,它的含义是指一类用来在客户端与服务器端之间保持状态的解决方案。有时候Session也用来指这种解决方案的存储结构。
|
||||
|
||||
session机制是一种服务器端的机制,服务器使用一种类似于散列表的结构(也可能就是使用散列表)来保存信息。
|
||||
|
||||
但程序需要为某个客户端的请求创建一个session的时候,服务器首先检查这个客户端的请求里是否包含了一个session标识-称为session id,如果已经包含一个session id则说明以前已经为此客户创建过session,服务器就按照session id把这个session检索出来使用(如果检索不到,可能会新建一个,这种情况可能出现在服务端已经删除了该用户对应的session对象,但用户人为地在请求的URL后面附加上一个JSESSION的参数)。如果客户请求不包含session id,则为此客户创建一个session并且同时生成一个与此session相关联的session id,这个session id将在本次响应中返回给客户端保存。
|
||||
|
||||
session机制本身并不复杂,然而其实现和配置上的灵活性却使得具体情况复杂多变。这也要求我们不能把仅仅某一次的经验或者某一个浏览器,服务器的经验当作普遍适用的。
|
||||
|
||||
## 小结
|
||||
|
||||
如上文所述,session和cookie的目的相同,都是为了克服http协议无状态的缺陷,但完成的方法不同。session通过cookie,在客户端保存session id,而将用户的其他会话消息保存在服务端的session对象中,与此相对的,cookie需要将所有信息都保存在客户端。因此cookie存在着一定的安全隐患,例如本地cookie中保存的用户名密码被破译,或cookie被其他网站收集(例如:1. appA主动设置域B cookie,让域B cookie获取;2. XSS,在appA上通过javascript获取document.cookie,并传递给自己的appB)。
|
||||
|
||||
|
||||
通过上面的一些简单介绍我们了解了cookie和session的一些基础知识,知道他们之间的联系和区别,做web开发之前,有必要将一些必要知识了解清楚,才不会在用到时捉襟见肘,或是在调bug时候如无头苍蝇乱转。接下来的几小节我们将详细介绍session相关的知识。
|
||||
|
||||
## links
|
||||
* [目录](<preface.md>)
|
||||
* 上一节: [session和数据存储](<06.0.md>)
|
||||
* 下一节: [Go如何使用session](<06.2.md>)
|
||||
## 6.1 Session and cookies
|
||||
|
||||
Sessions and cookies are two very common web concepts, and are also very easy to misunderstand. However, they are extremely important for the authorization of pages, as well as for gathering page statistics. Let's take a look at these two use cases.
|
||||
|
||||
Suppose you want to crawl a page that restricts public access, like a twitter user's homepage for instance. Of course you can open your browser and type in your username and password to login and access that information, but so-called "web crawling" means that we use a program to automate this process without any human intervention. Therefore, we have to find out what is really going on behind the scenes when we use a browser to login.
|
||||
|
||||
When we first receive a login page and type in a username and password, after we press the "login" button, the browser sends a POST request to the remote server. The Browser redirects to the user homepage after the server verifies the login information and returns an HTTP response. The question here is, how does the server know that we have access priviledges for the desired webpage? Because HTTP is stateless, the server has no way of knowing whether or not we passed the verification in last step. The easiest and perhaps the most naive solution is to append the username and password to the URL. This works, but puts too much pressure on the server (the server must validate every request against the database), and can be detrimental to the user experience. An alternative way of achieving this goal is to save the user's identity either on the server side or client side using cookies and sessions.
|
||||
|
||||
Cookies, in short, store historical information (including user login information) on the client's computer. The client's browser sends these cookies everytime the user visits the same website, automatically completing the login step for the user.
|
||||
|
||||

|
||||
|
||||
Figure 6.1 cookie principle.
|
||||
|
||||
Sessions, on the other hand, store historical information on the server side. The server uses a session id to identify different sessions, and the session id that is generated by the server should always be random and unique. You can use cookies or URL arguments to get the client's identity.
|
||||
|
||||

|
||||
|
||||
Figure 6.2 session principle.
|
||||
|
||||
## Cookies
|
||||
|
||||
Cookies are maintained by browsers. They can be modified during communication between webservers and browsers. Web applications can access cookie information when users visit the corresponding websites. Within most browser settings, there is one setting pertaining to cookie privacy. You should be able to see something similar to the following when you open it.
|
||||
|
||||

|
||||
|
||||
Figure 6.3 cookie in browsers.
|
||||
|
||||
Cookies have an expiry time, and there are two types of cookies distinguished by their life cyles: session cookies and persistent cookies.
|
||||
|
||||
If your application doesn't set a cookie expiry time, the browser will not save it into the local file system after the browser is closed. These cookies are called session cookies, and this type of cookie is usually saved in memory instead of to the local file system.
|
||||
|
||||
If your application does set an expiry time (for example, setMaxAge(60*60*24)), the browser *will* save this cookie to the local file system, and it will not be deleted until reaching the allotted expiry time. Cookies that are saved to the local file system can be shared by different browser processes -for example, by two IE windows; different browsers use different processes for dealing with cookies that are saved in memory.
|
||||
|
||||
## Set cookies in Go
|
||||
|
||||
Go uses the `SetCookie` function in the `net/http` package to set cookies:
|
||||
|
||||
http.SetCookie(w ResponseWriter, cookie *Cookie)
|
||||
|
||||
`w` is the response of the request and cookie is a struct. Let's see what it looks like:
|
||||
|
||||
type Cookie struct {
|
||||
Name string
|
||||
Value string
|
||||
Path string
|
||||
Domain string
|
||||
Expires time.Time
|
||||
RawExpires string
|
||||
|
||||
// MaxAge=0 means no 'Max-Age' attribute specified.
|
||||
// MaxAge<0 means delete cookie now, equivalently 'Max-Age: 0'
|
||||
// MaxAge>0 means Max-Age attribute present and given in seconds
|
||||
MaxAge int
|
||||
Secure bool
|
||||
HttpOnly bool
|
||||
Raw string
|
||||
Unparsed []string // Raw text of unparsed attribute-value pairs
|
||||
}
|
||||
|
||||
Here is an example of setting a cookie:
|
||||
|
||||
expiration := *time.LocalTime()
|
||||
expiration.Year += 1
|
||||
cookie := http.Cookie{Name: "username", Value: "astaxie", Expires: expiration}
|
||||
http.SetCookie(w, &cookie)
|
||||
|
||||
|
||||
## Fetch cookies in Go
|
||||
|
||||
The above example shows how to set a cookie. Now let's see how to get a cookie that has been set:
|
||||
|
||||
cookie, _ := r.Cookie("username")
|
||||
fmt.Fprint(w, cookie)
|
||||
|
||||
Here is another way to get a cookie:
|
||||
|
||||
for _, cookie := range r.Cookies() {
|
||||
fmt.Fprint(w, cookie.Name)
|
||||
}
|
||||
|
||||
As you can see, it's very convenient to get cookies from requests.
|
||||
|
||||
## Sessions
|
||||
|
||||
A session is a series of actions or messages. For example, you can think of the actions you between picking up your telephone to hanging up to be a type of session. When it comes to network protocols, sessions have more to do with connections between browsers and servers.
|
||||
|
||||
Sessions help to store the connection status between server and client, and this can sometimes be in the form of a data storage struct.
|
||||
|
||||
Sessions are a server side mechanism, and usually employ hash tables (or something similar) to save incoming information.
|
||||
|
||||
When an application needs to assign a new session to a client, the server should check if there are any existing sessions for same client with a unique session id. If the session id already exists, the server will just return the same session to the client. On the other hand, if a session id doesn't exist for the client, the server creates a brand new session (this usually happens when the server has deleted the corresponding session id, but the user has appended the old session manually).
|
||||
|
||||
The session itself is not complex but its implementation and deployment are, so you cannot use "one way to rule them all".
|
||||
|
||||
## Summary
|
||||
|
||||
In conclusion, the purpose of sessions and cookies are the same. They are both for overcoming the statelessness of HTTP, but they use different ways. Sessions use cookies to save session ids on the client side, and save all other information on the server side. Cookies save all client information on the client side. You may have noticed that cookies have some security problems. For example, usernames and passwords can potentially be cracked and collected by malicious third party websites.
|
||||
|
||||
Here are two common exploits:
|
||||
|
||||
1. appA setting an unexpected cookie for appB.
|
||||
2. XSS attack: appA uses the JavaScript `document.cookie` to access the cookies of appB.
|
||||
|
||||
After finishing this section, you should know some of the basic concepts of cookies and sessions. You should be able to understand the differences between them so that you won't kill yourself when bugs inevitably emerge. We'll discuss sessions in more detail in the following sections.
|
||||
|
||||
## Links
|
||||
|
||||
- [Directory](preface.md)
|
||||
- Previous section: [Data storage and session](06.0.md)
|
||||
- Next section: [How to use session in Go](06.2.md)
|
||||
|
||||
Reference in New Issue
Block a user