Python 3.6
twint 2.1.7 updated from master
Have searched issues without finding anything
Running on Ubuntu 18.04, anaconda, jupyter notebook
Commands run:
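import twint
import pandas as pd
# Configure a one-day search for the #nfl hashtag
c = twint.Config()
c.Pandas = True  # keep results in twint's pandas storage
c.Search = "#nfl"
c.Hide_output = True  # do not print each tweet to the console
c.Since = '2019-12-01'
c.Until = '2019-12-02'
twint.run.Search(c)
# Read the collected tweets back as a DataFrame
df = twint.storage.panda.Tweets_df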
Hi, thanks for writing this package, it's very useful. I'm clearly not using it right, though. I ran the commands above as a test, using "#nfl" as the query because it's innocuous and guaranteed to have a lot of results over the course of one day, but the number of tweets returned varies from run to run.
First, when I run it I get a lot of these warnings (which, judging by another issue, are probably related to http/https): CRITICAL:twint.run:Twint:Feed:noDataExpecting value: line 1 column 1 (char 0)
That's fine though, the script still runs. The real problem is the tweet counts. I ran it last night and got back 6,832 tweets, then ran it again this morning as part of testing some other code and got 4,710 tweets. When I saw that, I ran it a third time and got 0 tweets.
I have a few questions, if that's okay. Is twint caching the results of queries somewhere, and if so, how do I clear the cache? Is this inconsistent behaviour expected (is it a Twitter search page thing?), and if so, does it make sense to run the same search multiple times and concatenate the results? Finally, is there a suggested best practice for searching date ranges? (e.g. if you want all the tweets for a hashtag for the past 3 months, is it better to do one big search or to break the search into daily or weekly time ranges?)
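To make the last two questions concrete, this is roughly the workaround I have in mind. It is only a minimal sketch, not something taken from the twint docs: `daily_windows`, `merge_runs`, `collect` and the `fetch` callable are hypothetical names of my own, and I'm assuming each tweet record carries a unique `id` field to de-duplicate on.

```python
from datetime import datetime, timedelta

DATE_FMT = "%Y-%m-%d"  # same format as c.Since / c.Until above


def daily_windows(since, until):
    """Yield (since, until) date strings, one pair per day, covering [since, until)."""
    day = datetime.strptime(since, DATE_FMT)
    end = datetime.strptime(until, DATE_FMT)
    while day < end:
        next_day = day + timedelta(days=1)
        yield day.strftime(DATE_FMT), next_day.strftime(DATE_FMT)
        day = next_day


def merge_runs(runs, key="id"):
    """Combine several result lists, keeping the first record seen for each key."""
    merged = {}
    for run in runs:
        for record in run:
            merged.setdefault(record[key], record)
    return list(merged.values())


def collect(since, until, fetch, attempts=2):
    """Run fetch(since, until) `attempts` times per day and merge everything.

    `fetch` is a hypothetical caller-supplied function that performs one
    search and returns a list of dicts (one per tweet).
    """
    runs = []
    for day_since, day_until in daily_windows(since, until):
        for _ in range(attempts):
            runs.append(fetch(day_since, day_until))
    return merge_runs(runs)
```

Here `fetch` would wrap the commands above: set `c.Since` and `c.Until` to the two strings, call `twint.run.Search(c)`, and return `twint.storage.panda.Tweets_df.to_dict("records")`. Whether repeating and merging runs like this actually recovers the missing tweets is exactly what I'm unsure about.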
Again, thanks for this package. Great work.