+ 湖南.大学生科技创新平台's Archiver

maozilee posted on 2008-9-7 00:32

L7-filter: layer-7 protocol filtering

RouterOS 3.x introduces Linux's layer-7 protocol filtering. With it the router can filter at the application layer, which makes QoS much more flexible.
It's fairly easy to add support for more protocols to l7-filter. All you need to do is add a new pattern file to [font=新宋体]/etc/l7-protocols[/font]. This directory and its subdirectories are searched (non-recursively) for pattern files. (Thus, it will find [font=新宋体]/etc/l7-protocols/http.pat[/font] and [font=新宋体]/etc/l7-protocols/protocols/http.pat[/font], but not [font=新宋体]/etc/l7-protocols/foo/bar/http.pat[/font].) Please consider submitting any patterns you write for inclusion into the official distribution.
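The search rule above (the directory itself plus one level of subdirectories) can be sketched in Python. This is only an illustration of the rule as described, not l7-filter's own code; the function name find_pattern_files is made up for this example.

```python
from pathlib import Path


def find_pattern_files(root="/etc/l7-protocols"):
    """Return *.pat files in root and in its immediate subdirectories only."""
    root = Path(root)
    if not root.is_dir():
        return []
    found = sorted(root.glob("*.pat"))      # e.g. root/http.pat
    found += sorted(root.glob("*/*.pat"))   # e.g. root/protocols/http.pat
    return found                            # root/foo/bar/http.pat is never visited
```

Calling find_pattern_files() with no argument looks in /etc/l7-protocols, the directory named in the text.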
[b]File format[/b]
[b]Basic format[/b]
The basic format is very simple:
[list=1][*]The name of the protocol on one line[*]A regular expression defining the protocol on the next line (see [url=http://l7-filter.sourceforge.net/Pattern-HOWTO#regexp][i][color=#800080]regular expressions[/color][/i][/url] below)[/list]
The name of the file must match the name of the protocol. (If the protocol is "ftp", the file must be "ftp.pat".) Lines starting with '#' and blank lines are ignored. Both the [url=http://l7-filter.sourceforge.net/HOWTO-kernel][color=#0000ff]kernel[/color][/url] and [url=http://l7-filter.sourceforge.net/HOWTO-userspace][color=#0000ff]userspace[/color][/url] versions of l7-filter will use the given regular expression. For example, vnc.pat could be:
[font=新宋体]vnc
^rfb 00[1-9]\.00[0-9]\x0a$[/font]
[b]Defining a separate userspace pattern[/b]
Sometimes it will be desirable to define a separate regular expression for the kernel and userspace versions or to pass a custom set of flags to the userspace version's regcomp/regexec. (See [i][url=http://l7-filter.sourceforge.net/Pattern-HOWTO#regexp][color=#800080]regular expressions[/color][/url][/i] below for why.) In this case, add either or both of these lines after the two above:
[font=新宋体]userspace pattern=<userspace pattern>
userspace flags=<regexec and/or regcomp flags, whitespace delimited>[/font]
For example, smtp.pat could be:
[font=新宋体]smtp
^220[\x09-\x0d -~]* (e?smtp|simple mail)
userspace pattern=^220[\x09-\x0d -~]* (E?SMTP|[Ss]imple [Mm]ail)
userspace flags=REG_NOSUB REG_EXTENDED[/font]
[b]Meta-data[/b]
Pattern files that are part of the official distribution need some metadata at the top for [url=http://l7-filter.sourceforge.net/protocols][color=#0000ff]display on the webpage[/color][/url] and for the use of frontends. The top four lines should look like this:
[font=新宋体]# <Protocol name and some concise detail about the protocol>
# Pattern attributes: [attribute word]*
# Protocol groups: [group name]*
# Wiki: [link]*[/font]
"Pattern attributes" give information about how good the pattern is on various scales. Attribute words can be any of [i]undermatch[/i], [i]overmatch[/i], [i]superset[/i], [i]subset[/i], [i]great[/i], [i]good[/i], [i]ok[/i], [i]marginal[/i], [i]poor[/i], [i]veryfast[/i], [i]fast[/i], [i]notsofast[/i], or [i]slow[/i]. Any number of these may be used. They are defined [url=http://l7-filter.sourceforge.net/protocols][color=#0000ff]on the protocols page[/color][/url].
"Protocol groups" are supposed to give frontends a way to group similar protocols. Group names can be whatever you like, but should match existing names if possible. Any number may be used. More relevant groups should be listed first for sorting purposes. Group names in use as of 2007-01-14 are:
[list][*]chat[*]document_retrieval[*]file[*]game[*]ietf_draft_standard[*]ietf_internet_standard[*]ietf_proposed_standard[*]ietf_rfc_documented[*]mail[*]monitoring[*]networking[*]obsolete[*]open_source[*]p2p[*]printer[*]proprietary[*]remote_access[*]secure[*]streaming_audio[*]streaming_video[*]time_synchronization[*]version_control[*]voip[*]worm[*]x_consortium_standard[/list]
"Wiki" gives zero or more links to pages documenting the pattern and other methods of identifying the protocol on [url=http://protocolinfo.org/][color=#0000ff]protocolinfo.org[/color][/url].
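To make the file layout concrete, here is a rough Python sketch of a reader for the format described above: metadata comments, protocol name, pattern, then the optional userspace lines. The function name parse_pattern_file and the header values in EXAMPLE are invented for illustration; real tools may parse the file differently.

```python
def parse_pattern_file(text):
    """Split a .pat file into metadata comments, name, pattern and userspace overrides."""
    info = {"meta": {}, "name": None, "pattern": None,
            "userspace pattern": None, "userspace flags": None}
    for line in text.splitlines():
        if not line.strip():
            continue                              # blank lines are ignored
        if line.startswith("#"):
            key, sep, value = line[1:].partition(":")
            key = key.strip()
            if sep and key in ("Pattern attributes", "Protocol groups", "Wiki"):
                info["meta"][key] = value.split()
            continue                              # any other comment is ignored
        if info["name"] is None:
            info["name"] = line.strip()           # first non-comment line
        elif info["pattern"] is None:
            info["pattern"] = line                # second non-comment line
        else:
            key, sep, value = line.partition("=")
            if sep and key in ("userspace pattern", "userspace flags"):
                info[key] = value
    return info


# Header values below are illustrative assumptions, not the official smtp.pat header.
EXAMPLE = r"""# SMTP - Simple Mail Transfer Protocol
# Pattern attributes: good fast
# Protocol groups: mail ietf_internet_standard
# Wiki:

smtp
^220[\x09-\x0d -~]* (e?smtp|simple mail)
userspace pattern=^220[\x09-\x0d -~]* (E?SMTP|[Ss]imple [Mm]ail)
userspace flags=REG_NOSUB REG_EXTENDED
"""

parsed = parse_pattern_file(EXAMPLE)
```

A frontend could then use parsed["meta"]["Protocol groups"] for grouping and compare parsed["name"] with the file name.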
[b]Regular expressions[/b]
The [url=http://l7-filter.sourceforge.net/HOWTO-kernel][color=#0000ff]kernel[/color][/url] and [url=http://l7-filter.sourceforge.net/HOWTO-userspace][color=#0000ff]userspace[/color][/url] versions of l7-filter use different regular expression libraries. They use generally the same syntax, but have some differences.
[b]General information[/b]
Because patterns frequently need to use non-printable characters, both versions of l7-filter add [url=http://perldoc.perl.org/perlre.html#Regular-Expressions][color=#0000ff]perl-style hex matching[/color][/url] on top of their stock libraries. This uses \xHH notation, so to match a tab, use "[font=新宋体]\x09[/font]". Note that regexp control characters are [b]still[/b] control characters even when written in hex:
[font=新宋体]\x24 == $
\x28 == (
\x29 == )
\x2a == *
\x2b == +
\x2e == .
\x3f == ?
\x5b == [
\x5c == \
\x5d == ]
\x5e == ^
\x7b == { (only a control character for the userspace version)
\x7c == |
\x7d == } (only a control character for the userspace version)[/font]
Both versions of l7-filter strip out the nulls (\x00 bytes) from network data so that they can treat it as normal C strings. So (1) you can't match on nulls and (2) fields may appear shorter than expected. For example, if a protocol has a 4 byte field and any of those bytes can be null, it can appear to be any length from 0 to 4.
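The two points above (hex escapes are expanded before the regexp is interpreted, and nulls are stripped from the data) can be imitated in a few lines of Python. This is a sketch of the described behaviour using Python's re module, not the l7-filter source; expand_hex and strip_nulls are names made up here.

```python
import re


def expand_hex(pattern):
    """Replace each \\xHH escape with the literal character, as a pre-processing step."""
    return re.sub(r"\\x([0-9a-fA-F]{2})",
                  lambda m: chr(int(m.group(1), 16)), pattern)


def strip_nulls(payload):
    """Drop every \\x00 byte so the data can be handled like a C string."""
    return payload.replace(b"\x00", b"")


# "\x2e" turns into ".", which is still the any-character wildcard.
dot_pattern = expand_hex(r"a\x2eb")
wildcard_still_active = re.fullmatch(dot_pattern, "axb") is not None

# A 4-byte field holding the value 300 shrinks to 2 bytes once nulls are removed.
field = strip_nulls(b"\x00\x00\x01\x2c")
```

Here wildcard_still_active is True, and field is two bytes long instead of four.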
[b]Kernel version[/b]
The kernel version of l7-filter uses Henry Spencer's 1987 implementation of [url=http://l7-filter.sourceforge.net/V8regex][color=#0000ff]Bell Version 8 regular expressions[/color][/url] ("V8 regexps"), with a few modifications, noted here. V8 regexps are likely more limited than the regexps you are used to. Notably, you [b]cannot[/b] use bounds ("[font=新宋体]foo{3}[/font]"), character classes ("[font=新宋体][[:punct:]][/font]") or backreferences.
Because this library does not have a flag for case-sensitivity, the kernel version of l7-filter is always case insensitive. Upper case in patterns is identical to lower case. (This is true even if you write an uppercase letter in hex!)
The kernel version completely ignores any lines in the pattern file after the second non-comment line.
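A quick way to catch the restrictions above before loading a pattern into the kernel version is a small lint pass. The sketch below is heuristic only (it is not a V8 regexp parser and can give false positives); the function name and warning texts are invented for this example.

```python
import re

# Heuristic checks only; each regex looks for one construct the text says V8 lacks.
UNSUPPORTED = {
    "bound such as foo{3}": re.compile(r"\{\d+(,\d*)?\}"),
    "character class such as [[:punct:]]": re.compile(r"\[:[a-z]+:\]"),
    "backreference such as \\1": re.compile(r"\\[1-9]"),
}


def kernel_portability_warnings(pattern):
    """Return a list of human-readable warnings for a kernel-side pattern."""
    warnings = [name for name, rx in UNSUPPORTED.items() if rx.search(pattern)]
    # Ignore the hex digits of \xHH escapes when looking for upper-case letters.
    without_hex = re.sub(r"\\x[0-9a-fA-F]{2}", "", pattern)
    if without_hex != without_hex.lower():
        warnings.append("upper case is treated as lower case")
    return warnings
```

For example, kernel_portability_warnings("foo{3}") reports the bound, while the FTP pattern ^220[\x09-\x0d -~]*ftp passes cleanly.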
[b]Userspace version[/b]
The userspace version of l7-filter uses the GNU regular expression library, so its behaviour should be more familiar. This library is documented in [i][url=http://en.wikipedia.org/wiki/Man_pages][color=#0000ff]man[/color][/url] 3 regcomp[/i] and [i]man 7 regex[/i].
If only one regular expression is specified in the pattern file (see [url=http://l7-filter.sourceforge.net/Pattern-HOWTO#format][i][color=#800080]file format[/color][/i][/url] above), the userspace version compiles it with the flags [font=新宋体]REG_EXTENDED | REG_ICASE | REG_NOSUB[/font] and executes it with no flags.
If the [font=新宋体]userspace pattern[/font] and [font=新宋体]userspace flags[/font] lines are given, the userspace pattern will be used instead of the first one. It will be compiled and executed with the given flags. (l7-filter will sort out which flags go to regcomp and which to regexec.)
If only the [font=新宋体]userspace pattern[/font] line is given, the userspace pattern will be compiled with [font=新宋体]REG_EXTENDED | REG_ICASE | REG_NOSUB[/font] and executed with no flags. If only the [font=新宋体]userspace flags[/font] line is given, the single regular expression will be compiled and executed with the given flags.
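The four cases above reduce to two independent choices: which pattern to use and which flags to use. A Python sketch of that decision follows. The flag names are the POSIX regcomp/regexec ones; which of them l7-filter actually accepts is an assumption here, and resolve_userspace is a made-up name.

```python
REGCOMP_FLAGS = {"REG_EXTENDED", "REG_ICASE", "REG_NOSUB", "REG_NEWLINE"}
REGEXEC_FLAGS = {"REG_NOTBOL", "REG_NOTEOL"}
DEFAULT_FLAGS = "REG_EXTENDED REG_ICASE REG_NOSUB"


def resolve_userspace(pattern, userspace_pattern=None, userspace_flags=None):
    """Return (pattern, regcomp flags, regexec flags) following the four cases above."""
    chosen = userspace_pattern if userspace_pattern is not None else pattern
    flags = (userspace_flags if userspace_flags is not None else DEFAULT_FLAGS).split()
    unknown = [f for f in flags if f not in REGCOMP_FLAGS | REGEXEC_FLAGS]
    if unknown:
        raise ValueError(f"unknown flag(s): {unknown}")
    comp = sorted(f for f in flags if f in REGCOMP_FLAGS)    # passed to regcomp
    execf = sorted(f for f in flags if f in REGEXEC_FLAGS)   # passed to regexec
    return chosen, comp, execf
```

Note that giving a userspace flags line replaces the defaults entirely, so REG_ICASE is lost unless it is listed again.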
[b]What l7-filter sees and does[/b]
If you have set up your iptables rules correctly (see the [url=http://l7-filter.sourceforge.net/HOWTO][color=#0000ff]HOWTO[/color][/url]), l7-filter sees the data going in both directions in the order that it passes through the computer. For instance, in FTP, the first thing it sees is "220 server ready", then "USER bob", then "331 send password", then "PASS frogbeard", and so on.
l7-filter can match across packets. For instance, with the above FTP example, the match is first attempted on "220 server ready", then on "220 server readyUSER bob", then "220 server readyUSER bob331 send password",[url=http://l7-filter.sourceforge.net/Pattern-HOWTO#picky][color=#800080][1][/color][/url] so you could match it with "[font=新宋体]220.*user.*331[/font]". At each match attempt, the regexp special character [font=新宋体]^[/font] will match the beginning of the stream and [font=新宋体]$[/font] will match the end of the last packet seen so far. Because the Linux kernel's ip_conntrack module tracks connectionless UDP and ICMP sessions as "connections", this works with them as well as TCP.
Usually the identifying characteristics of a connection are found at the beginning of that connection. For this reason, and to save processing time, l7-filter only looks at the first 10 packets or 2kB of each connection, whichever is smaller. Any match made within this time is applied to the rest of the connection as well.
[1] Yes, there should be CRLFs in there, which I omitted for clarity. Picky, picky.
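The cross-packet behaviour can be simulated roughly in Python: append each payload to a growing buffer and retry the match after every packet. This is a simplified model using Python's re module; how l7-filter enforces the 10-packet and 2 kB limits internally is an assumption, and first_match is a made-up helper.

```python
import re

MAX_PACKETS = 10     # limits quoted in the text
MAX_BYTES = 2048


def first_match(pattern, payloads):
    """Return the 1-based packet number at which pattern first matches, else None."""
    rx = re.compile(pattern, re.IGNORECASE | re.DOTALL)
    stream = b""
    for number, payload in enumerate(payloads[:MAX_PACKETS], start=1):
        # Nulls are dropped and the buffer is capped, mimicking the described limits.
        stream = (stream + payload.replace(b"\x00", b""))[:MAX_BYTES]
        if rx.search(stream):
            return number
    return None


ftp = [b"220 server ready\r\n", b"USER bob\r\n", b"331 send password\r\n"]
packet_needed = first_match(rb"220.*user.*331", ftp)
```

With this FTP exchange the pattern 220.*user.*331 only matches once the third payload has arrived, whereas ^220 matches on the first.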
[b]What makes a good pattern[/b]
There are three general guidelines:
1) A pattern must be neither too specific nor not specific enough.
Example 1: The pattern "[font=新宋体]bear[/font]" for Bearshare is not specific enough. This pattern could match a wide variety of non-Bearshare connections. For instance, an HTTP request for [url]http://bear.com[/url] would be matched.
Example 2: "[font=新宋体]220 .*ftp.*(\[.*\]|\(.*\))[/font]" for FTP is too specific. Not all servers send ()s or []s after their 220. In fact, servers are not even required to send the string "ftp" at any time, but the vast majority do. Good judgement and testing are necessary for instances such as this.
2) It should use a minimum of processing power. If it's possible to reduce the number of instances of [font=新宋体]*[/font], [font=新宋体]+[/font] and [font=新宋体]|[/font] in your pattern, you should do so. Use the performance testing program included in the patterns package.
3) It should complete its match on the earliest packet possible. The FTP pattern could be "[font=新宋体]^220[\x09-\x0d -~]*\x0d\x0aUSER[\x09-\x0d -~]*\x0d\x0a331[/font]", but that won't match until the third data packet. Instead, we use "[font=新宋体]^220[\x09-\x0d -~]*ftp[/font]", which matches on the first data packet.
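The guidelines can be tried out on sample data. The Python sketch below approximates l7-filter matching with re (case-insensitive, dot matches everything); the HTTP request and the FTP banner are invented test inputs, not captured traffic.

```python
import re


def matches(pattern, data):
    """Approximate an l7-filter match attempt on the bytes seen so far."""
    return re.search(pattern, data, re.IGNORECASE | re.DOTALL) is not None


# Hypothetical sample traffic.
HTTP_REQUEST = b"GET / HTTP/1.1\r\nHost: bear.com\r\n\r\n"
FTP_PACKETS = [b"220 Example FTP server ready\r\n",
               b"USER bob\r\n",
               b"331 send password\r\n"]

TOO_SPECIFIC = rb"220 .*ftp.*(\[.*\]|\(.*\))"
EARLY = rb"^220[\x09-\x0d -~]*ftp"
LATE = rb"^220[\x09-\x0d -~]*\x0d\x0aUSER[\x09-\x0d -~]*\x0d\x0a331"

overmatch = matches(rb"bear", HTTP_REQUEST)             # "bear" hits plain HTTP
missed = not matches(TOO_SPECIFIC, FTP_PACKETS[0])      # banner has no () or []
early_ok = matches(EARLY, FTP_PACKETS[0])               # decided on packet 1
late_needs_three = (not matches(LATE, FTP_PACKETS[0])
                    and matches(LATE, b"".join(FTP_PACKETS)))
```

All four results come out True for these inputs, which lines up with guidelines 1 and 3.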
[b]Miscellaneous tips[/b]
[font=新宋体][\x09-\x0d -~] == printable characters, including whitespace
[\x09-\x0d ] == any whitespace
[!-~] == non-whitespace printable characters[/font]
[b]Recommended procedure for writing patterns[/b]
[list=1]
[*]Find and read the spec for the protocol you wish to match. If it's an Internet standard, [url=http://rfc-editor.org/][color=#0000ff]RFCs[/color][/url] are a good place to start, although not all standards are RFCs. If it is a proprietary protocol, it is likely that someone has written a reverse-engineered spec for it. Do a general web search to find it. Skipping this step is a good way to write patterns that are overly specific!
[*]Use something like [url=http://wireshark.org/][color=#0000ff]Wireshark[/color][/url] (formerly known as Ethereal) to watch packets of this protocol go by in a typical session of its use. (If you failed to find a spec for your protocol, but Wireshark can parse it, reading the Wireshark source code may also be worth your time.)
[*]Write a pattern that will reliably match one of the first few packets that are sent in your protocol. Test it. Test its performance.
[*]Send your pattern to l7-filter-developers{/-\T}lists*sf*net for it to be incorporated into the official pattern definitions (you [b]must[/b] [url=http://lists.sourceforge.net/lists/listinfo/l7-filter-developers][color=#0000ff]subscribe first[/color][/url]).
[/list]
[b]HOWTO send a packet dump to the mailing list[/b]
If you do not feel that you are able to do all of the above yourself, you may want to send some packets you have captured to the mailing list so that others can do the rest. In order for this to be useful, please follow these guidelines:
[list]
[*]If you have never done anything like this before, use [url=http://wireshark.org/][color=#0000ff]Wireshark[/color][/url]. It's easy to use and available for GNU/Linux, Mac and Windows (and FreeBSD, HP-UX, NetBSD, Solaris...). Use File→Save to save the captured packets.
[*]Make sure that you start capturing packets before the application that you are testing has started using the network. l7-filter looks at the opening packets of a connection. If these are not present in the packet dump, it is useless.
[*]If it makes sense for the protocol in question, send a recognizable text string so that the relevant connection can be found in the packet dump. For instance, if testing an instant messenger, send a message with "hello hello hello."
[*]Along with your capture, send us anything that could be helpful in picking out the relevant data. For example, this could include the server's IP address, what network operations you performed, the version numbers of all software used, and any strings you expect to appear in the packets (such as instant messenger text, e-mail addresses or gaming handles).
[*]Try not to capture an excessive number of packets. In particular:
[list]
[*]Avoid having other programs use the network during your capture. Assuming their traffic is recognizable, the excess packets can be filtered out, but it's annoying.
[*]Avoid sending captures that have many thousands of packets from the same connection. All but the first few are useless.
[*][b]However[/b], if you are not sure when the application opens connections, or if it opens many simultaneous connections, it might be necessary to send a large number of packets. This is ok.
[/list]
[*]Send the packets in libpcap format or something else that Wireshark can read. [b]Do not:[/b]
[list]
[*]send only a text hexdump of the packets. This is unnecessarily hard to read.
[*]send only the data portion of the packets. The TCP headers in particular are essential for finding streams. You may anonymize addresses if necessary, but try to avoid it.
[*]compress the captured packets with anything other than gzip or bzip2. In fact, no compression is needed unless the file is very large.
[/list]
[/list]
If you aren't sure how to follow these guidelines, try your best and send the result to us. If it's wrong, we'll be happy to tell you how to fix it.
[size=2]Last updated 23 April 2008    Source: [url=http://l7-filter.sourceforge.net/Pattern-HOWTO][color=#800080]http://l7-filter.sourceforge.net/Pattern-HOWTO[/color][/url][/size]

maozilee posted on 2008-9-7 00:34

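[b]How L7-filter works[/b]
L7-filter lets us filter at the application layer. It still works by matching characteristic signatures, but instead of simply matching a single word or phrase it uses the more powerful regular expression for matching.
A regular expression describes a pattern for matching strings. It can be used to check whether a string contains a given substring, to replace the matched substring, or to extract from a string the substring that satisfies some condition.
A regular expression is a text pattern made up of ordinary characters (for example the letters a to z) and special characters (called metacharacters). It serves as a template that is matched against the string being searched.
By default, L7-filter buffers 10 packets or 2 KB of packet content from the same connection. It treats the buffered content as a piece of ordinary text and searches it with the regular expression from the pattern file. If the regular expression matches, the netfilter rule can DROP those packets or mark them.
That is how L7-filter works.
[b]Pattern file format[/b]
The file name must match the name of the protocol. (If the protocol you want to match is "ftp", the file must be named "ftp.pat".) The contents of the file are laid out as follows:
The first line is the protocol name, which must be the same as the file name.
The second line is the regular expression that defines the protocol.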
[b]Regular expressions[/b]
l7-filter uses [url=http://www.hmug.org/man/3/regsub.html][color=#0000ff]Version 8 regular expressions[/color][/url]. This flavour has many limitations and supports only basic regular expressions. For example, you cannot use bounds ("foo{3}"), character classes ("[[:punct:]]") or backreferences.
In addition, [u][url=http://perldoc.perl.org/perlre.html#Regular-Expressions][color=#0000ff]perl-style hex[/color][/url][/u] matching has been added, so a byte can be written as a hexadecimal value \xHH. (For example, to match a tab, use "\x09".) Note that the following characters are still regular expression control characters even when written in hex:
\x24 == $ (only matters if it's the last character)
\x28 == (
\x29 == )
\x2a == *
\x2b == +
\x2e == .
\x3f == ?
\x5b == [
\x5c == \
\x5e == ^ (only matters if it's the first character)
\x7c == |
l7-filter is case insensitive. It treats network data as an ordinary string, so it never sees zero-valued bytes such as \x00 in a packet: they are dropped, which means you cannot match on null bytes. For example, suppose the data portion of a packet is 4 bytes long and one or more of those bytes may be zero. To l7-filter the data may then appear to be 0, 1, 2, 3 or 4 bytes long.
Examples:
[\x09-\x0d -~] == printable characters, including whitespace
[\x09-\x0d ] == any whitespace
[!-~] == non-whitespace printable characters
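To see which bytes each of the three bracket expressions covers, they can be tested one byte at a time. The snippet below is a small Python illustration; the dictionary keys and the classify helper are names chosen for this example.

```python
import re

CLASSES = {
    "printable_or_whitespace": rb"[\x09-\x0d -~]",
    "whitespace": rb"[\x09-\x0d ]",
    "printable_non_whitespace": rb"[!-~]",
}


def classify(byte_value):
    """Return the names of the bracket expressions that match a single byte value."""
    ch = bytes([byte_value])
    return [name for name, rx in CLASSES.items() if re.fullmatch(rx, ch)]


letter = classify(ord("A"))    # printable, not whitespace
tab = classify(0x09)           # whitespace
control = classify(0x01)       # matched by none of the three
```

A letter falls in the first and third sets, a tab in the first and second, and a control byte such as 0x01 in none.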
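[b]How to write a good pattern[/b]
1) A pattern should be neither too specific nor too broad. One that is too specific costs efficiency and can miss valid connections; one that is too broad will wrongly catch other network protocols.
The pattern "bear" is not specific enough. Any connection that contains "bear" will be matched. For example, an HTTP request for [url=http://bear.com/][color=#0000ff]http://bear.com[/color][/url] would be matched too.
How to write a pattern, step by step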
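1. First, capture packets. Use Ethereal to capture traffic and look for characteristic features and regularities in the packet contents.
Take QQ packets as an example:
Look at the data portion of each packet (ignore the IP layer and consider only the application layer) and compare the packets with one another. A regularity appears: the first byte of every packet is 02 (hex) and the last byte is 03 (hex).
With this regularity in hand, we can express it as a regular expression.
2. Regular expression basics.
Only the bare essentials are covered here; if anything is unclear, consult other references.
^ anchors the first character, $ anchors the last character, . matches any single character, .? matches zero or one character, .+ matches one or more characters, and \ is the escape character (\x02 matches the byte with hex value 02).
3. Putting this together: the first byte of a QQ packet is hex 02, so the regex starts with ^\x02; the last byte is hex 03, so the regex ends with \x03$; and the QQ data in the L7-filter buffer may contain any number of bytes in between. Combining the two gives ^\x02.+\x03$
4. Finally, write a pat file named qq.pat with the following contents:
qq
^\x02.+\x03$
Then put qq.pat in the /etc/l7-protocols directory and run the iptables command:
iptables -t mangle -A POSTROUTING -m layer7 --l7proto qq -j DROP
This blocks QQ traffic.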
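Before loading the pattern with iptables, it can be sanity-checked offline. The Python sketch below applies the QQ expression (written here without any space before \x03, on the assumption that the framing bytes are adjacent to the payload) to null-stripped sample data. The sample bytes are invented and only follow the 02 ... 03 framing described above; Python's re stands in for the V8 library.

```python
import re

QQ_PATTERN = rb"^\x02.+\x03$"


def looks_like_qq(buffered):
    """Apply the qq pattern to the null-stripped bytes buffered for one connection."""
    data = buffered.replace(b"\x00", b"")
    return re.search(QQ_PATTERN, data, re.IGNORECASE | re.DOTALL) is not None


# Hypothetical payloads: each starts with 02 and ends with 03.
one_packet = b"\x02\x0e\x35\x00\x22\x4a\x11\x03"
two_packets = b"\x02\x0e\x35\x03" + b"\x02\x10\x03"

single_ok = looks_like_qq(one_packet)
buffered_ok = looks_like_qq(two_packets)
http_rejected = not looks_like_qq(b"GET / HTTP/1.1\r\n\r\n")
```

Both the single packet and the two concatenated packets match, while an HTTP request does not.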
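Another pattern example, this time for Xunlei (Thunder):

After you start Xunlei and double-click a seed file to begin a download, it derives the file's direct download address from the seed file and downloads straight from that address. It then connects again to a resource server, the equivalent of a BT server: hub4t.sandai.net, IP 219.134.132.81, and sends the server a TCP query request:
28 00 00 00 53 00 00 00 64 00 00 00 05 00 00 00 51 55 45 52 59 00 31 00 00 00 (hex values)
first byte constant                                      (the four bytes above are constant)
Bytes 1, 17, 18, 19, 20 and 21 are constant. Bytes 17 to 21 are the ASCII text "QUERY".
The server returns download addresses for the file at other locations:
28 00 00 00 53 00 00 00 04 20 00 00 09 00 00 00 51 55 45 52 59 52 45 53 50 01 98
Here again bytes 1, 17, 18, 19, 20 and 21 are constant. Bytes 17 to 21 are the ASCII text "QUERY".
Finally, Xunlei downloads from the addresses the server returned.
In addition, if the returned address is a file on another Xunlei user's machine, the following command is used to download it:
29 00 00 00 a9 00 00 00 5d 00 00 00 03 00 00 00 47 45 54 22 00 00 00 47 3a
Here:
The first byte may be 28 or 29, which I suspect is the Xunlei version number. Bytes 17, 18 and 19 are constant and are the ASCII text "GET".

One important point: Xunlei pads between the fields of each command with three null bytes, and l7-filter does not see nulls. So the expression is:
^[\x28\x29]...(query|get)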
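The effect of the null padding is easy to verify on the dumps quoted above. This Python sketch strips the nulls and applies the expression case-insensitively; Python's re is used as a stand-in for l7-filter's regexp library, and the helper name is made up.

```python
import re

XUNLEI_PATTERN = rb"^[\x28\x29]...(query|get)"

# The two hex dumps quoted in the post.
QUERY_REQUEST = bytes.fromhex(
    "28 00 00 00 53 00 00 00 64 00 00 00 05 00 00 00 51 55 45 52 59 00 31 00 00 00")
PEER_GET = bytes.fromhex(
    "29 00 00 00 a9 00 00 00 5d 00 00 00 03 00 00 00 47 45 54 22 00 00 00 47 3a")


def looks_like_xunlei(payload):
    """Match the pattern against the payload as l7-filter would see it."""
    data = payload.replace(b"\x00", b"")     # l7-filter never sees the nulls
    return re.search(XUNLEI_PATTERN, data, re.IGNORECASE | re.DOTALL) is not None


query_ok = looks_like_xunlei(QUERY_REQUEST)
get_ok = looks_like_xunlei(PEER_GET)
# Without null stripping the three wildcards are used up by 00 00 00.
raw_fails = re.search(XUNLEI_PATTERN, QUERY_REQUEST,
                      re.IGNORECASE | re.DOTALL) is None
```

After stripping, the request becomes 28 53 64 05 followed by "QUERY", so the three dots cover exactly the three remaining header bytes.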
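Special note: L7-filter does not match the contents of each packet individually; it matches the combined contents of N packets. So in the QQ example the buffer may hold N QQ packets, but the first byte in the buffer is always the first byte of a QQ packet and the last byte in the buffer is always the last byte of a QQ packet.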

oizys posted on 2008-9-7 12:19

I think layer-7 work should still be split out and handled separately; otherwise the load on the machine gets too high.


Powered by Discuz! Archiver 7.0.0  © 2001-2009 Comsenz Inc.