TY - JOUR
T1 - DDNAS: Discretized Differentiable Neural Architecture Search for Text Classification
T2 - ACM Transactions on Intelligent Systems and Technology
AU - Chen, Kuan-Chun
AU - Li, Cheng-Te
AU - Lee, Kuo-Jung
N1 - Publisher Copyright:
© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2023/10/3
Y1 - 2023/10/3
AB - Neural Architecture Search (NAS) has shown promising capability in learning text representation. However, existing text-based NAS methods neither perform a learnable fusion of neural operations to optimize the architecture nor encode the latent hierarchical categorization behind text input. This article presents a novel NAS method, Discretized Differentiable Neural Architecture Search (DDNAS), for text representation learning and classification. Through a continuous relaxation of the architecture representation, DDNAS can use gradient descent to optimize the search. We also propose a novel discretization layer, learned via mutual information maximization and imposed on every search node, to model the latent hierarchical categorization in text representation. Extensive experiments conducted on eight diverse real datasets show that DDNAS consistently outperforms state-of-the-art NAS methods. Although DDNAS relies on only three basic operations, i.e., convolution, pooling, and none, as candidate NAS building blocks, its performance is promising and can be further improved by adding more operations.
UR - http://www.scopus.com/inward/record.url?scp=85174902767&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85174902767&partnerID=8YFLogxK
DO - 10.1145/3610299
M3 - Article
AN - SCOPUS:85174902767
SN - 2157-6904
VL - 14
JO - ACM Trans. Intell. Syst. Technol.
JF - ACM Transactions on Intelligent Systems and Technology
IS - 5
M1 - 88
ER -