Devising a complete and correct set of roles has been recognized as one of the most important and challenging tasks in implementing role based access control. A key problem related to this is the notion of goodness/interestingness - when is a role good/interesting? In this paper, we define the role mining problem (RMP) as the problem of discovering an optimal set of roles from existing user permissions. The main contribution of this paper is to formally define RMP, and analyze its theoretical bounds. In addition to the above basic RMP, we introduce two different variations of the RMP, called the -approx RMP and the Minimal Noise RMP that have pragmatic implications. We reduce the known "set basis problem" to RMP to show that RMP is an NP-complete problem. An important contribution of this paper is also to show the relation of the role mining problem to several problems already identified in the data mining and data analysis literature. By showing that the RMP is in essence reducible to these known problems, we can directly borrow the existing implementation solutions and guide further research in this direction.