Deduplication
GDELT data often contains duplicate records. Choose a deduplication strategy to control which fields are compared when duplicates are identified and removed.
Strategies
URL_ONLY - Deduplicate by source URL
URL_DATE - By URL and date
URL_DATE_LOCATION - By URL, date, and location
ACTOR_PAIR - By actor pair
FULL - By all fields
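
One way to read the strategies above is as a choice of which fields make up the deduplication key. The sketch below illustrates that idea with a small stand-in record type. It is not py_gdelt's implementation: the `Event` class, its field names (`source_url`, `date`, `location`, `actor1`, `actor2`), the `KEY_FUNCS` table, and the `dedupe` helper are all hypothetical, and keeping the first occurrence of each key is an assumption.

```python
from dataclasses import dataclass, astuple
from typing import Callable, Iterable


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in record; not the py_gdelt event model."""
    source_url: str
    date: str
    location: str
    actor1: str
    actor2: str


# Assumed mapping from strategy name to the fields that form the key.
KEY_FUNCS: dict[str, Callable[[Event], tuple]] = {
    "URL_ONLY": lambda e: (e.source_url,),
    "URL_DATE": lambda e: (e.source_url, e.date),
    "URL_DATE_LOCATION": lambda e: (e.source_url, e.date, e.location),
    "ACTOR_PAIR": lambda e: (e.actor1, e.actor2),
    "FULL": astuple,  # every field takes part in the key
}


def dedupe(events: Iterable[Event], strategy: str) -> list[Event]:
    """Keep the first event seen for each key, preserving input order."""
    key_func = KEY_FUNCS[strategy]
    seen: set[tuple] = set()
    unique: list[Event] = []
    for event in events:
        key = key_func(event)
        if key not in seen:
            seen.add(key)
            unique.append(event)
    return unique


events = [
    Event("https://example.com/a", "2024-01-01", "Paris", "FRA", "DEU"),
    Event("https://example.com/a", "2024-01-01", "Berlin", "FRA", "DEU"),
    Event("https://example.com/a", "2024-01-02", "Paris", "FRA", "USA"),
    # Exact copy of the first record.
    Event("https://example.com/a", "2024-01-01", "Paris", "FRA", "DEU"),
]

# Number of records that survive under each strategy.
counts = {name: len(dedupe(events, name)) for name in KEY_FUNCS}
print(counts)
```

On this sample data, a narrower key such as URL_ONLY collapses more records than URL_DATE_LOCATION, which is the trade-off to weigh when picking a strategy.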
Usage
from py_gdelt.utils.dedup import DedupeStrategy

result = await client.events.query(
    event_filter,
    deduplicate=True,
    dedupe_strategy=DedupeStrategy.URL_DATE_LOCATION,
)
For details, see the Events guide.