TY - GEN

T1 - Vapnik-Chervonenkis dimension of recurrent neural networks

AU - Koiran, Pascal

AU - Sontag, Eduardo D.

N1 - Publisher Copyright:
© Springer-Verlag Berlin Heidelberg 1997.

PY - 1997

Y1 - 1997

N2 - Most of the work on the Vapnik-Chervonenkis dimension of neural networks has been focused on feedforward networks. However, recurrent networks are also widely used in learning applications, in particular when time is a relevant parameter. This paper provides lower and upper bounds for the VC dimension of such networks. Several types of activation functions are discussed, including threshold, polynomial, piecewise-polynomial and sigmoidal functions. The bounds depend on two independent parameters: the number w of weights in the network, and the length k of the input sequence. In contrast, for feedforward networks, VC dimension bounds can be expressed as a function of w only. An important difference between recurrent and feedforward nets is that a fixed recurrent net can receive inputs of arbitrary length. Therefore we are particularly interested in the case k ≫ w. Ignoring multiplicative constants, the main results say roughly the following: - For architectures with activation σ = any fixed nonlinear polynomial, the VC dimension is ≈ wk. - For architectures with activation σ = any fixed piecewise-polynomial, the VC dimension is between wk and w²k. - For architectures with activation σ = H (threshold nets), the VC dimension is between w log(k/w) and min{wk log(wk), w² + w log(wk)}. - For the standard sigmoid σ(x) = 1/(1 + e⁻ˣ), the VC dimension is between wk and w⁴k².

AB - Most of the work on the Vapnik-Chervonenkis dimension of neural networks has been focused on feedforward networks. However, recurrent networks are also widely used in learning applications, in particular when time is a relevant parameter. This paper provides lower and upper bounds for the VC dimension of such networks. Several types of activation functions are discussed, including threshold, polynomial, piecewise-polynomial and sigmoidal functions. The bounds depend on two independent parameters: the number w of weights in the network, and the length k of the input sequence. In contrast, for feedforward networks, VC dimension bounds can be expressed as a function of w only. An important difference between recurrent and feedforward nets is that a fixed recurrent net can receive inputs of arbitrary length. Therefore we are particularly interested in the case k ≫ w. Ignoring multiplicative constants, the main results say roughly the following: - For architectures with activation σ = any fixed nonlinear polynomial, the VC dimension is ≈ wk. - For architectures with activation σ = any fixed piecewise-polynomial, the VC dimension is between wk and w²k. - For architectures with activation σ = H (threshold nets), the VC dimension is between w log(k/w) and min{wk log(wk), w² + w log(wk)}. - For the standard sigmoid σ(x) = 1/(1 + e⁻ˣ), the VC dimension is between wk and w⁴k².

UR - http://www.scopus.com/inward/record.url?scp=84949203253&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84949203253&partnerID=8YFLogxK

U2 - 10.1007/3-540-62685-9_19

DO - 10.1007/3-540-62685-9_19

M3 - Conference contribution

AN - SCOPUS:84949203253

SN - 3540626859

SN - 9783540626855

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 223

EP - 237

BT - Computational Learning Theory - 3rd European Conference, EuroCOLT 1997, Proceedings

A2 - Ben-David, Shai

PB - Springer Verlag

T2 - 3rd European Conference on Computational Learning Theory, EuroCOLT 1997

Y2 - 17 March 1997 through 19 March 1997

ER -