Researchers now have at their disposal valuable data from social networking applications, of which Twitter and Facebook are the most prominent examples. For retrieving this content, the Twitter service provides two distinct Application Programming Interfaces (APIs): a probe-based one and a streaming one, each of which imposes different limitations on the data collection process. In this paper, we present a general architecture to facilitate
of the service, which simplifies retrieval. We give implementation details of our system, while providing a simple way to express the crawling process, i.e., the
. We experimentally evaluate the system on a variety of faceted crawls, demonstrating its efficacy for this online medium.