On a learnability question associated to neural networks with continuous activations (Extended Abstract)

Bhaskar DasGupta, Hava T. Siegelmann, Eduardo Sontag

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution


Abstract

This paper deals with the learnability of concept classes defined by neural networks, showing the hardness of PAC-learning (in the complexity-theoretic, not merely information-theoretic, sense) for networks with a particular class of activation functions. The obstruction does not lie with the VC dimension, which is known to grow slowly; instead, the result follows from the fact that the loading problem is NP-complete. (The complexity scales badly with input dimension; the loading problem is polynomial-time if the input dimension is constant.) Similar, well-known theorems had already been proved by Megiddo and by Blum and Rivest for binary-threshold networks. It turns out that the general problem for continuous sigmoidal-type functions, as used in practical applications involving steepest descent, is not NP-hard: there are "sigmoidals" for which the problem is in fact trivial. It therefore remains an open question to determine what properties of the activation function cause difficulties. Ours is the first result on the hardness of loading networks that do not consist of binary neurons; we employ a piecewise-linear activation function that has been used in the neural-network literature. Our theoretical results lend further justification to the use of incremental (architecture-changing) techniques for training networks.
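For illustration only (this is not from the paper's text): a plausible reading of "piecewise-linear activation" is the standard saturated-linear function from the neural-network literature, and the loading problem asks whether some weight assignment makes a fixed architecture fit all labeled examples exactly. The sketch below shows both under those assumptions, using a hypothetical single sat-linear unit and a toy brute-force weight search; the paper's hardness construction concerns a specific multi-unit architecture with growing input dimension, not this toy check.

```python
import itertools

import numpy as np


def sat_linear(x):
    """Saturated-linear (piecewise-linear) activation: 0 for x <= 0,
    x for 0 < x < 1, and 1 for x >= 1. A standard piecewise-linear
    activation from the literature; whether it matches the paper's
    exact choice is an assumption made for illustration."""
    return np.clip(x, 0.0, 1.0)


def loads(weights, samples):
    """Check whether a single sat-linear unit with parameters
    (w1, w2, bias) fits every labeled sample exactly. Illustrative
    only: the paper's result is about a fixed small architecture
    as the input dimension varies."""
    w1, w2, b = weights
    return all(sat_linear(w1 * x1 + w2 * x2 + b) == y
               for (x1, x2), y in samples)


# The loading problem: given an architecture and labeled data, decide
# whether *some* weight assignment fits the data. Brute force over a
# small weight grid is feasible only at toy scale; the paper shows the
# general problem is NP-complete as the input dimension grows.
samples = [((0, 0), 0.0), ((0, 1), 1.0), ((1, 0), 1.0), ((1, 1), 1.0)]  # OR
grid = np.linspace(-2, 2, 9)
solutions = [w for w in itertools.product(grid, repeat=3) if loads(w, samples)]
print(f"{len(solutions)} grid points load the data; e.g. {solutions[0]}")
```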

Original language: English (US)
Title of host publication: Proceedings of the 7th Annual Conference on Computational Learning Theory, COLT 1994
Publisher: Association for Computing Machinery
Pages: 47-56
Number of pages: 10
ISBN (Electronic): 0897916557
DOIs
State: Published - Jul 16, 1994
Event: 7th Annual Conference on Computational Learning Theory, COLT 1994 - New Brunswick, United States
Duration: Jul 12, 1994 - Jul 15, 1994

Publication series

Name: Proceedings of the Annual ACM Conference on Computational Learning Theory
Volume: Part F129415

Other

Other: 7th Annual Conference on Computational Learning Theory, COLT 1994
Country: United States
City: New Brunswick
Period: 7/12/94 - 7/15/94

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Artificial Intelligence
