Mixture of Experts (ME) is an ensemble of function approximators that fit a clustered data set locally rather than globally. ME provides a useful tool for learning multi-valued mappings (ambiguities) in the data set. Training a Mixture of Experts involves learning a multi-category classifier for the gate distribution and fitting a regressor within each of the clusters. ME learning is based on divide-and-conquer, which is known to increase the error due to variance. To avoid overfitting, several researchers have proposed using linear experts. However, in the absence of any knowledge of the non-linearities present in the data set, it is not clear how many linear experts are needed to model the data accurately. In this work we propose a Bayesian learning framework for Mixture of Experts. Bayesian learning intrinsically embodies regularization and model selection through Occam's razor. In the past, Bayesian learning methods have been applied to classification and regression in order to avoid scale sensitivity and the orthodox model selection procedure of cross-validation. Although true Bayesian learning is computationally intractable, approximations do result in sparser and more compact models.
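
The two components named above, a multi-category gate and a regressor per cluster, can be illustrated with a minimal sketch of the forward pass of a mixture of linear experts on a scalar input. This is not the Bayesian framework proposed in this work: the function names (`gate_probs`, `expert_means`, `me_predict`) and the hand-picked parameters are hypothetical, chosen only to show how a softmax gate combines local linear fits.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gate_probs(x, gate_params):
    """Multi-category (softmax) gate: P(expert k | x) for a scalar input x.

    gate_params is a list of (slope, bias) pairs, one per expert (assumed form).
    """
    return softmax([v * x + c for v, c in gate_params])


def expert_means(x, expert_params):
    """Linear experts: each predicts w * x + b from its own (w, b) pair."""
    return [w * x + b for w, b in expert_params]


def me_predict(x, gate_params, expert_params):
    """Conditional mean of the mixture: gate-weighted sum of expert outputs."""
    g = gate_probs(x, gate_params)
    mu = expert_means(x, expert_params)
    return sum(gk * mk for gk, mk in zip(g, mu))


# Illustrative, hand-picked parameters (not learned, not from this work):
# expert 0 fits y = -x on the left half, expert 1 fits y = x on the right half.
GATE = [(-5.0, 0.0), (5.0, 0.0)]
EXPERTS = [(-1.0, 0.0), (1.0, 0.0)]

if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, gate_probs(x, GATE), me_predict(x, GATE, EXPERTS))
```

With these parameters, two linear experts together approximate the non-linear target |x|, which a single global linear fit could not. For a genuinely multi-valued mapping, the gate-weighted mean would average over the branches, so one would inspect the individual expert outputs and gate probabilities instead.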