Network analysis has been an active area of research for the past few decades. Relational classification, community detection, and link prediction are only a few of the many open research questions that have been studied extensively. Collective classification is a well-known relational classification method for classifying entities (nodes) within a network that uses both the node-based features and the topological features of each node. It infers the unknown labels of the test nodes in the network using the label information of the training nodes. Even though this has been a well-researched topic for years, very little has been done to address the following two challenges: (1) how to actively select the labeled nodes from the network to be used for training, and (2) how to efficiently obtain a sparse representation of the original network without losing much information, so that learning can scale to large networks. A large body of work in theoretical computer science aims to find the best approximation of large graphs. However, little has been done on finding an approximate subgraph that helps in the classification of network datasets. In this paper, we propose an efficient graph sparsification method and a sampling technique that, used together with state-of-the-art network classifiers, give comparable runtime and classification accuracy.