TY - GEN
T1 - DATE
T2 - 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2020
AU - Kim, Sundong
AU - Tsai, Yu Che
AU - Singh, Karandeep
AU - Choi, Yeonsoo
AU - Ibok, Etim
AU - Li, Cheng Te
AU - Cha, Meeyoung
N1 - Funding Information:
This work was supported by the Institute for Basic Science (IBS-R029-C2), Basic Science Research Program through the National Research Foundation of Korea (NRF) (No. NRF-2017R1E1A1A01076400), Ministry of Science and Technology (MOST) of Taiwan (MOST Young Scholar Fellowship 109-2636-E-006-017 and grant 108-2218-E-006-036), Academia Sinica (AS-TP-107-M05), and WCO Customs Cooperation Fund of Korea (CCF Korea).
Publisher Copyright:
© 2020 Owner/Author.
PY - 2020/8/23
Y1 - 2020/8/23
N2 - Intentional manipulation of invoices that leads to undervaluation of trade goods is the most common type of customs fraud used to avoid ad valorem duties and taxes. To secure government revenue without interrupting legitimate trade flows, customs administrations around the world strive to develop ways to detect illicit trades. This paper proposes DATE, a model of Dual-task Attentive Tree-aware Embedding, to classify and rank illegal trade flows that contribute the most to the overall customs revenue when caught. The strength of DATE comes from combining a tree-based model for interpretability and transaction-level embeddings with dual attention mechanisms. To accurately identify illicit transactions and predict tax revenue, DATE learns simultaneously from the illicitness and surtax of each transaction. On five years of customs import data with a test illicit ratio of 2.24%, DATE achieves a precision of 92.7% on illegal cases and a recall of 49.3% on revenue after inspecting only 1% of all trade flows. We also discuss issues in deploying DATE at the Nigeria Customs Service, in collaboration with the World Customs Organization.
UR - http://www.scopus.com/inward/record.url?scp=85090412171&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090412171&partnerID=8YFLogxK
U2 - 10.1145/3394486.3403339
DO - 10.1145/3394486.3403339
M3 - Conference contribution
AN - SCOPUS:85090412171
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 2880
EP - 2890
BT - KDD 2020 - Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
Y2 - 23 August 2020 through 27 August 2020
ER -