Merging other languages

This commit is contained in:
James Miranda
2016-09-23 18:01:10 -03:00
parent 380a8ee74c
commit de3c5bdaa4
490 changed files with 24539 additions and 24588 deletions

View File

@@ -1,72 +1,78 @@
# 9.2 确保输入过滤
过滤用户数据是Web应用安全的基础。它是验证数据合法性的过程。通过对所有的输入数据进行过滤可以避免恶意数据在程序中被误信或误用。大多数Web应用的漏洞都是因为没有对用户输入的数据进行恰当过滤所引起的。
我们介绍的过滤数据分成三个步骤:
- 1、识别数据搞清楚需要过滤的数据来自于哪里
- 2、过滤数据弄明白我们需要什么样的数据
- 3、区分已过滤及被污染数据如果存在攻击数据那么保证过滤之后可以让我们使用更安全的数据
## 识别数据
“识别数据”作为第一步是因为在你不知道“数据是什么,它来自于哪里”的前提下,你也就不能正确地过滤它。这里的数据是指所有源自非代码内部提供的数据。例如:所有来自客户端的数据,但客户端并不是唯一的外部数据源,数据库和第三方提供的接口数据等也可以是外部数据源。
由用户输入的数据我们通过Go非常容易识别Go通过`r.ParseForm`之后把用户POST和GET的数据全部放在了`r.Form`里面。其它的输入要难识别得多,例如,`r.Header`中的很多元素是由客户端所操纵的。常常很难确认其中的哪些元素组成了输入,所以,最好的方法是把里面所有的数据都看成是用户输入。(例如`r.Header.Get("Accept-Charset")`这样的也看做是用户输入,虽然这些大多数是浏览器操纵的)
## 过滤数据
在知道数据来源之后,就可以过滤它了。过滤是一个有点正式的术语,它在平时表述中有很多同义词,如验证、清洁及净化。尽管这些术语表面意义不同,但它们都是指的同一个处理:防止非法数据进入你的应用。
过滤数据有很多种方法其中有一些安全性较差。最好的方法是把过滤看成是一个检查的过程在你使用数据之前都检查一下看它们是否是符合合法数据的要求。而且不要试图好心地去纠正非法数据而要让用户按你制定的规则去输入数据。历史证明了试图纠正非法数据往往会导致安全漏洞。这里举个例子“最近建设银行系统升级之后如果密码后面两位是0只要输入前面四位就能登录系统”这是一个非常严重的漏洞。
过滤数据主要采用如下一些库来操作:
- strconv包下面的字符串转化相关函数因为从Request中的`r.Form`返回的是字符串,而有些时候我们需要将之转化成整/浮点数,`Atoi``ParseBool``ParseFloat``ParseInt`等函数就可以派上用场了。
- string包下面的一些过滤函数`Trim``ToLower``ToTitle`等函数,能够帮助我们按照指定的格式获取信息。
- regexp包用来处理一些复杂的需求例如判定输入是否是Email、生日之类。
过滤数据除了检查验证之外,在特殊时候,还可以采用白名单。即假定你正在检查的数据都是非法的,除非能证明它是合法的。使用这个方法,如果出现错误,只会导致把合法的数据当成是非法的,而不会是相反,尽管我们不想犯任何错误,但这样总比把非法数据当成合法数据要安全得多。
## 区分过滤数据
如果完成了上面的两步数据过滤的工作就基本完成了但是在编写Web应用的时候我们还需要区分已过滤和被污染数据因为这样可以保证过滤数据的完整性而不影响输入的数据。我们约定把所有经过过滤的数据放入一个叫全局的Map变量中(CleanMap)。这时需要用两个重要的步骤来防止被污染数据的注入:
- 每个请求都要初始化CleanMap为一个空Map。
- 加入检查及阻止来自外部数据源的变量命名为CleanMap。
接下来,让我们通过一个例子来巩固这些概念,请看下面这个表单
<form action="/whoami" method="POST">
我是谁:
<select name="name">
<option value="astaxie">astaxie</option>
<option value="herry">herry</option>
<option value="marry">marry</option>
</select>
<input type="submit" />
</form>
在处理这个表单的编程逻辑中非常容易犯的错误是认为只能提交三个选择中的一个。其实攻击者可以模拟POST操作递交`name=attack`这样的数据,所以在此时我们需要做类似白名单的处理
r.ParseForm()
name := r.Form.Get("name")
CleanMap := make(map[string]interface{}, 0)
if name == "astaxie" || name == "herry" || name == "marry" {
CleanMap["name"] = name
}
上面代码中我们初始化了一个CleanMap的变量当判断获取的name是`astaxie``herry``marry`三个中的一个之后
我们把数据存储到了CleanMap之中这样就可以确保CleanMap["name"]中的数据是合法的从而在代码的其它部分使用它。当然我们还可以在else部分增加非法数据的处理一种可能是再次显示表单并提示错误。但是不要试图为了友好而输出被污染的数据。
上面的方法对于过滤一组已知的合法值的数据很有效,但是对于过滤有一组已知合法字符组成的数据时就没有什么帮助。例如,你可能需要一个用户名只能由字母及数字组成:
r.ParseForm()
username := r.Form.Get("username")
CleanMap := make(map[string]interface{}, 0)
if ok, _ := regexp.MatchString("^[a-zA-Z0-9].$", username); ok {
CleanMap["username"] = username
}
## 总结
数据过滤在Web安全中起到一个基石的作用大多数的安全问题都是由于没有过滤数据和验证数据引起的例如前面小节的CSRF攻击以及接下来将要介绍的XSS攻击、SQL注入等都是没有认真地过滤数据引起的因此我们需要特别重视这部分的内容。
## links
* [目录](<preface.md>)
* 上一节: [预防CSRF攻击](<09.1.md>)
* 下一节: [避免XSS攻击](<09.3.md>)
# 9.2 Filtering inputs
Filtering user data is one way we can improve the security of our web apps, using it to verify the legitimacy of incoming data. All of the input data is filtered in order to avoid malicious code or data from being mistakenly executed or stored. Most web application vulnerabilities arise form neglecting to filter input data and naively trusting it.
Our introduction to filtering data is divided into three steps:
1. identifying the data; we need to filter the data to figure out where it originated form
2. filtering of the data itself; we need to figure out what kind of data we have received
3. distinguish between filtered (sanitized) and tainted data; after the data has been filtered, we can be assured that it is is secure
## Identifying data
"Identifying the data" is our first step because most of the time, as mentioned, we don't know where it originates from. Without this knowledge, we would be unable to properly filter it. The data here is provided internally all from non-code data. For example: all data comes from clients, however clients that are users are not the only external sources of data. A database interface providing third party data could also be an external data source.
Data that has been entered by a user is very easy to recognize in Go. We use `r.ParseForm` after the user POSTs a form to get all of the data inside the `r.Form`. Other types of input are much harder to identify. For example in `r.Header`s, many of the elements are often manipulated by the client. It can often be difficult to identify which of these elements have been manipulated by clients, so it's best to consider all of them as having been tainted. The `r.Header.Get("Accept-Charset")` header field, for instance, is also considered as user input, although these are typically only manipulated by browsers.
## Filtering data
If we know the source of the data, we can filter it. Filtering is a bit of a formal use of the term. The process is known by many other terms such as input cleaning, validation and sanitization. Despite the fact that these terms somewhat differ in their meaning, they all refer to the same thing: the process of preventing illegal data from making its way into your applications.
There are many ways to filter data, some of which are less secure than others. The best method is to check whether or not the data itself meets the legal requirements dictated by your application. When attempting to do so, it's very important not to make any attempts at correcting the illegal data; this could allow malicious users to manipulate your validation rules for their own needs, altogether defeating the purpose of filtering the data in the first place. History has proven that attempting to correct invalid data often leads to security vulnerabilities. Let's take a look at an overly simple example for illustration purposes. Suppose that a banking system asks users to supply a secure, 6 digit password. The system validates the length of all passwords. One might naively write a validation rule that corrects passwords of illegal lengths: "If a password is shorter than the legal length, fill in the remaining digits with 0s". This simple rule would allow attackers to guess just the first few digits of a password to successfully gain access to user accounts!
We can use several libraries to help us to filter data:
- The strconv package can help us to convert user inputed strings into specific types, since `r.Form`s are maps of string values. Some common string conversions provided by strconv are `Atoi`, `ParseBool`, ` ParseFloat ` and `ParseInt`.
- Go's `strings` package contains some filter functions like `Trim`, `ToLower` and `ToTitle`, which can help us to obtain data in a specific formats, according to our needs.
- Go's `regexp` package can be used to handle cases which are more complex in nature, such as determining whether an input is an email address, a birthday, etc.
Filtering incoming data in addition to authentication can be quite effective. Let's add another technique to our repertoire, called whitelisting. Whitelisting is a good way of confirming the legitimacy of incoming data. Using this method, if an error occurs, it can only mean that the incoming data is illegal, and not the opposite. Of course, we don't want to make any mistakes in our whitelist by falsely labelling legitimate data as illegal, but this scenario is much better than illegal data being labeled as legitimate, and thus much more secure.
## Distinguishing between filtered and tainted data
If you have completed the above steps, the job of filtering data has basically been completed. However when writing web applications, we also need to distinguish between filtered and tainted data because doing so can guarantee the integrity of our data filtering process without affecting the input data. Let's put all of our filtered data into a global map variable called `CleanMap`. Then, two important steps are required to prevent contamination via data injection:
- Each request must initialize `CleanMap` as an empty map.
- Prevent variables from external data sources named `CleanMap` from being introduced into the app.
Next, let's use an example form to reinforce these concepts:
<form action="/whoami" method="POST">
Who am I:
<select name="name">
<option value="astaxie">astaxie</option>
<option value="herry">herry</option>
<option value="marry">marry</option>
</select>
<input type="submit" />
</form>
In dealing with this type of form, it can be very easy to make the mistake of thinking that users will only be able to submit one of the three `select` options. In fact, POST operations can easily be simulated by attackers. For example, by submitting the same form with `name = attack`, a malicious user could introduce illegal data into our system. We can use a simple whitelist to counter these types of attacks:
r.ParseForm()
name := r.Form.Get("name")
CleanMap := make(map[string]interface{}, 0)
if name == "astaxie" || name == "herry" || name == "marry" {
CleanMap["name"] = name
}
The above code initializes a `CleanMap` variable, and a name is only assigned after checking it against an internal whitelist of legitimate values (`astaxie`, `herry` and `marry` in this case). We store the data in the `CleanMap` instance so you can be sure that `CleanMap["name"]` holds a validated value. Any code wishing to access this value can then freely do so. We can also add an additional `else` statement to the above `if` whitelist for dealing with illegal data, a possibility being that the form was displayed with an error. Do not try to be too accommodating though, or you run the risk of accidentally contaminating your `CleanMap`.
The above method for filtering data against a set of known, legitimate values is very effective. There is another method for checking whether or not incoming data consists of legal characters using `regexp`, however this would be ineffectual in the above case where we require that the name be an option from the select. For example, you may require that user names only consist of letters and numbers:
r.ParseForm()
username := r.Form.Get("username")
CleanMap := make(map[string]interface{}, 0)
if ok, _ := regexp.MatchString("^[a-zA-Z0-9].$", username); ok {
CleanMap["username"] = username
}
## Summary
Data filtering plays a vital role in the security of modern web applications. Most security vulnerabilities are the result of improperly filtering data or neglecting to properly validate it. Because the previous section dealt with CSRF attacks and the next two will be introducing XSS attacks and SQL injection, there was no natural segue into dealing with as important a topic as data sanitization, so in this section, we paid special attention to it.
## Links
- [Directory](preface.md)
- Previous section: [CSRF attacks](09.1.md)
- Next section: [XSS attacks](09.3.md)