Constraining Twitter data stream

Orbifold Registered Posts: 4 ✭✭✭✭
How can one keep only the tweets that satisfy a criterion? Say, only tweets with a location, or from a user with more than x followers?

I suppose one could listen to everything and filter afterwards, but that is a waste of storage.

Thank you.


  • jereze
    jereze Alpha Tester, Dataiker Alumni Posts: 190 ✭✭✭✭✭✭✭✭


    1) If you use the Twitter REST API:

    You can try some of the "query operators" defined in the Twitter REST API documentation. Some are not listed there, but you can find them by running an advanced search on the website and looking at the query it generates.

    For example:

    • from:atwitteraccountname
    • near:"Paris, France" within:15mi
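    Operators like these are combined by separating them with spaces, which the search treats as an implicit AND. A minimal sketch of building and URL-encoding such a query string; `build_search_query` is a made-up helper name, not part of any Twitter library:

```python
from urllib.parse import quote


def build_search_query(operators):
    """Join search operators with single spaces (implicit AND), skipping blanks."""
    return " ".join(op.strip() for op in operators if op.strip())


# The two operators from the examples above, split into three tokens.
query = build_search_query([
    "from:atwitteraccountname",
    'near:"Paris, France"',
    "within:15mi",
])

# Percent-encode the query so it can be used as the value of a URL parameter.
encoded_query = quote(query, safe="")
```

    The encoded string is what would be passed as the search query parameter of the request.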

    Selecting users with more than x followers is not an available option.
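    Since follower count is not a search operator, one workaround is to filter in Python before anything is written to storage. A rough sketch, assuming each tweet is a dict shaped like the classic tweet JSON (with `user.followers_count`, `coordinates` and `place` fields); the function names and the sample data are made up for illustration:

```python
def has_location(tweet):
    """True if the tweet carries exact coordinates or a tagged place."""
    return bool(tweet.get("coordinates") or tweet.get("place"))


def follower_count(tweet):
    """Follower count of the author, or 0 if the field is missing or null."""
    user = tweet.get("user") or {}
    return user.get("followers_count") or 0


def keep_tweet(tweet, require_location=False, min_followers=None):
    """Apply only the criteria that were requested; all of them must hold."""
    if require_location and not has_location(tweet):
        return False
    if min_followers is not None and follower_count(tweet) <= min_followers:
        return False
    return True


def filter_tweets(tweets, **criteria):
    """Lazily yield the tweets to store, so rejected ones are never kept."""
    return (t for t in tweets if keep_tweet(t, **criteria))


# Made-up sample tweets, only to show the shape of the input.
sample = [
    {"text": "a", "user": {"followers_count": 50}, "place": {"name": "Paris"}},
    {"text": "b", "user": {"followers_count": 5000}, "place": None},
    {"text": "c", "user": {"followers_count": 5000},
     "coordinates": {"type": "Point", "coordinates": [2.35, 48.85]}},
]

popular = list(filter_tweets(sample, min_followers=1000))
located = list(filter_tweets(sample, require_location=True))
both = list(filter_tweets(sample, require_location=True, min_followers=1000))
```

    Because `filter_tweets` is a generator, it can wrap any iterable of tweets, including a live stream, and only the matching tweets reach the dataset.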

    2) If you use the Twitter Streaming API (used by the DSS built-in connector):

    The filtering options look more limited.

  • Orbifold
    Orbifold Registered Posts: 4 ✭✭✭✭
    Precisely: with the full Twitter API there is no issue, since one can use Python or R to fill a dataset. My question was about the DSS built-in dataset definition, where (as far as I can tell) one can only configure a filter on the content of the "text" field. As far as I'm concerned, this is both too simple and too much: too simple as a filter, and too much data streamed in as a result.