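You could use REPLACE instead of ADD (or db:replace instead of db:add) and use the tweet's JSON id as the document path, so that a tweet that arrives a second time overwrites the stored copy instead of creating a duplicate. For more details, have a look at our documentation [1].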
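As a rough sketch of that idea, the query below stores one tweet under a path derived from its id. It assumes the db:replace($db, $path, $input) signature of the BaseX versions that also provide db:open, and that json:parse yields the same json/id__str structure your query already relies on. The external variable $raw and the ".xml" path suffix are hypothetical choices made for illustration only.

```xquery
(: Hypothetical input: one tweet as a raw JSON string, bound by the caller. :)
declare variable $raw as xs:string external;

(: Convert the JSON string to XML; "id_str" is expected to become id__str. :)
let $tweet := json:parse($raw)
(: Use the tweet id as the document path, so each id is stored only once. :)
let $id := $tweet/json/id__str/string()
(: Replaces the document at that path, or adds it if it does not exist yet. :)
return db:replace("twitter", $id || ".xml", $tweet)
```

Running this twice with the same tweet should leave a single document in the database, because the second call targets the same path.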
Deleting duplicates after the insertion would be another approach, but it will most likely be too slow if you plan to store thousands or millions of tweets.
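If you do want to inspect existing duplicates first, a grouping query can answer the counting question. This is only a sketch: it assumes every document in the "twitter" database has a json/id__str element, and it relies on the XQuery 3.0 group by clause. The element name "duplicate" is made up for the output.

```xquery
(: Group all stored tweets by their id string. :)
for $tweet in db:open("twitter")
let $id := $tweet/json/id__str/string()
group by $id
(: After grouping, $tweet holds every document that shares this id. :)
where count($tweet) > 1
return <duplicate id="{ $id }" count="{ count($tweet) }"/>
```

Dropping the where clause returns a count for every id, not only the duplicated ones.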
Not sure of the correct lingo, but I'm building a database of tweets.
As I run it, duplicate tweets are added to the database. I can see the
duplicates with:
  for $tweet in db:open("twitter")
  return <tweet>{ $tweet/json/id__str }</tweet>
First, how would I select the json node for a duplicate entry? And,
before even selecting that node, how would I check whether there is more
than one result for a given id__str value?
How would I generate a count of the occurrences of a specific
id__str value?
thanks,
Thufir