
Constraining Twitter data stream

Level 1
How can one keep only the tweets that satisfy a criterion? Say, only tweets with a location, or only tweets from users with more than x followers?

I suppose one could listen to everything and filter afterwards, but that is a waste of storage.

Thank you.
Dataiker Alumni


1) If you use the Twitter REST API:

You can try some of the "query operators" defined in the Twitter REST API documentation. Some operators are not listed there, but you can find them by running an advanced search on the Twitter website.

For example:

  • from:atwitteraccountname

  • near:"Paris, France" within:15mi

Selecting users with more than x followers is not an available option.
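As an illustration, here is a minimal sketch of how the two operators above could be assembled into a single search query string. The helper name `build_search_query` and its parameters are hypothetical, not part of any Twitter or DSS library; only the `from:`, `near:` and `within:` operators come from the examples above.

```python
def build_search_query(keywords=(), from_account=None, near=None, within=None):
    """Assemble a search query string from optional operators.

    Hypothetical helper: only the operator syntax (from:, near:, within:)
    is taken from the examples in this thread.
    """
    parts = list(keywords)
    if from_account:
        parts.append(f"from:{from_account}")
    if near:
        # The place name is quoted because it may contain spaces and commas.
        parts.append(f'near:"{near}"')
        if within:
            # within: is only meaningful together with near:
            parts.append(f"within:{within}")
    return " ".join(parts)


query = build_search_query(near="Paris, France", within="15mi")
print(query)
```

The resulting string would then be passed as the query of whatever search call you use.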

2) If you use the Twitter Streaming API (used by the DSS built-in connector):

Options look more limited.

Jeremy, Product Manager at Dataiku
Level 1
Precisely: with the full Twitter API there is no issue, since one can use Python or R to fill a dataset. My question was about the DSS built-in dataset definition, where (as far as I can tell) one can only configure a filter on the content of the "text" field. As far as I'm concerned, this is both too simple and too much: too simple as a filter, and too much data streamed in as a result.
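For reference, the Python route could look roughly like the sketch below, which drops tweets before they are written anywhere. It assumes each tweet is a dict shaped like a Twitter API v1.1 tweet object (`user.followers_count`, `coordinates`, `place`); the function names `keep_tweet` and `filter_tweets` and the sample data are made up for illustration.

```python
def keep_tweet(tweet, min_followers=None, require_location=False):
    """Return True if the tweet passes the configured criteria.

    Assumes v1.1-style tweet dicts; adjust the field names if your
    payload differs.
    """
    user = tweet.get("user") or {}
    if min_followers is not None:
        # "More than x followers" is read as a strict comparison.
        if user.get("followers_count", 0) <= min_followers:
            return False
    if require_location:
        # A tweet counts as located if it has exact coordinates or a place.
        if not (tweet.get("coordinates") or tweet.get("place")):
            return False
    return True


def filter_tweets(tweets, **criteria):
    """Lazily yield only the tweets that satisfy the criteria."""
    for tweet in tweets:
        if keep_tweet(tweet, **criteria):
            yield tweet


# Made-up sample data, for illustration only.
sample = [
    {"text": "a", "user": {"followers_count": 50}, "place": {"name": "Paris"}},
    {"text": "b", "user": {"followers_count": 5000}, "place": None},
    {"text": "c", "user": {"followers_count": 5000}, "place": {"name": "Lyon"}},
]
kept = list(filter_tweets(sample, min_followers=1000, require_location=True))
print([t["text"] for t in kept])
```

Because `filter_tweets` is a generator, it can wrap a stream of incoming tweets so that rejected ones are never stored.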

