TY - JOUR
T1 - Fast packet classification using recursive endpoint-cutting and bucket compression on FPGA
AU - Chang, Yeim Kuan
AU - Chen, Han Chen
N1 - Publisher Copyright:
© The British Computer Society 2018. All rights reserved. For permissions, please e-mail: [email protected].
PY - 2019/2/1
Y1 - 2019/2/1
N2 - Packet classification is one of the important functions in today's high-speed Internet routers. Many existing FPGA-based approaches can achieve a high throughput but cannot accommodate the memory required for large rule tables because on-chip memory in FPGA devices is limited. In this paper, we propose a high-throughput and low-cost pipelined architecture using a new recursive endpoint-cutting (REC) decision tree. In the software environment, REC needs only 5-66% of the memory needed in Efficuts for various rule tables. Since the rule buckets associated with leaf nodes in decision trees consume a large portion of total memory, a bucket compression scheme is also proposed to reduce rule duplication. Based on experimental results on Xilinx Virtex-5/6 FPGA, the block RAM required by REC is much less than the existing FPGA-based approaches. The proposed parallel and pipelined architecture can accommodate various tables of 20 K or more rules, in the FPGA devices containing 1.6 Mb block RAM. By using dual-ported memory, throughput of beyond 100 Gbps for 40-byte packets can be achieved. The proposed architecture outperforms most FPGA-based search engines for large and complex rule tables.
AB - Packet classification is one of the important functions in today's high-speed Internet routers. Many existing FPGA-based approaches can achieve a high throughput but cannot accommodate the memory required for large rule tables because on-chip memory in FPGA devices is limited. In this paper, we propose a high-throughput and low-cost pipelined architecture using a new recursive endpoint-cutting (REC) decision tree. In the software environment, REC needs only 5-66% of the memory needed in Efficuts for various rule tables. Since the rule buckets associated with leaf nodes in decision trees consume a large portion of total memory, a bucket compression scheme is also proposed to reduce rule duplication. Based on experimental results on Xilinx Virtex-5/6 FPGA, the block RAM required by REC is much less than the existing FPGA-based approaches. The proposed parallel and pipelined architecture can accommodate various tables of 20 K or more rules, in the FPGA devices containing 1.6 Mb block RAM. By using dual-ported memory, throughput of beyond 100 Gbps for 40-byte packets can be achieved. The proposed architecture outperforms most FPGA-based search engines for large and complex rule tables.
UR - http://www.scopus.com/inward/record.url?scp=85062689296&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85062689296&partnerID=8YFLogxK
U2 - 10.1093/comjnl/bxy052
DO - 10.1093/comjnl/bxy052
M3 - Article
AN - SCOPUS:85062689296
SN - 0010-4620
VL - 62
SP - 198
EP - 204
JO - Computer Journal
JF - Computer Journal
IS - 2
ER -