TY - GEN

T1 - On a learnability question associated to neural networks with continuous activations (Extended Abstract)

AU - DasGupta, Bhaskar

AU - Siegelmann, Hava T.

AU - Sontag, Eduardo

PY - 1994/7/16

Y1 - 1994/7/16

N2 - This paper deals with the learnability of concept classes defined by neural networks, showing the hardness of PAC-learning (in the complexity-theoretic, not merely information-theoretic, sense) for networks with a particular class of activation functions. The obstruction lies not with the VC dimension, which is known to grow slowly; instead, the result follows from the fact that the loading problem is NP-complete. (The complexity scales badly with input dimension; the loading problem is polynomial-time if the input dimension is constant.) Similar and well-known theorems had already been proved by Megiddo and by Blum and Rivest for binary-threshold networks. It turns out that the general problem for continuous sigmoidal-type functions, as used in practical applications involving steepest descent, is not NP-hard (there are "sigmoidals" for which the problem is in fact trivial), so it is an open question to determine what properties of the activation function cause difficulties. Ours is the first result on the hardness of loading networks that do not consist of binary neurons; we employ a piecewise-linear activation function that has been used in the neural network literature. Our theoretical results lend further justification to the use of incremental (architecture-changing) techniques for training networks.

AB - This paper deals with the learnability of concept classes defined by neural networks, showing the hardness of PAC-learning (in the complexity-theoretic, not merely information-theoretic, sense) for networks with a particular class of activation functions. The obstruction lies not with the VC dimension, which is known to grow slowly; instead, the result follows from the fact that the loading problem is NP-complete. (The complexity scales badly with input dimension; the loading problem is polynomial-time if the input dimension is constant.) Similar and well-known theorems had already been proved by Megiddo and by Blum and Rivest for binary-threshold networks. It turns out that the general problem for continuous sigmoidal-type functions, as used in practical applications involving steepest descent, is not NP-hard (there are "sigmoidals" for which the problem is in fact trivial), so it is an open question to determine what properties of the activation function cause difficulties. Ours is the first result on the hardness of loading networks that do not consist of binary neurons; we employ a piecewise-linear activation function that has been used in the neural network literature. Our theoretical results lend further justification to the use of incremental (architecture-changing) techniques for training networks.

UR - http://www.scopus.com/inward/record.url?scp=34648822523&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=34648822523&partnerID=8YFLogxK

U2 - 10.1145/180139.181009

DO - 10.1145/180139.181009

M3 - Conference contribution

AN - SCOPUS:34648822523

T3 - Proceedings of the Annual ACM Conference on Computational Learning Theory

SP - 47

EP - 56

BT - Proceedings of the 7th Annual Conference on Computational Learning Theory, COLT 1994

PB - Association for Computing Machinery

T2 - 7th Annual Conference on Computational Learning Theory, COLT 1994

Y2 - 12 July 1994 through 15 July 1994

ER -