Endpoints API
File-Based Endpoints
EventsEndpoint
EventsEndpoint
Endpoint for querying GDELT Events data.
This endpoint orchestrates querying GDELT Events data from multiple sources (files and BigQuery) using a DataFetcher. It handles:

- Source selection and fallback logic
- Type conversion from internal _RawEvent to public Event models
- Optional deduplication
- Both streaming and batch query modes
The endpoint uses dependency injection to receive source instances, making it easy to test and configure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `file_source` | `FileSource` | FileSource instance for downloading GDELT files | *required* |
| `bigquery_source` | `BigQuerySource \| None` | Optional BigQuerySource instance for fallback queries | `None` |
| `fallback_enabled` | `bool` | Whether to fall back to BigQuery on errors (default: True) | `True` |
Note
BigQuery fallback only activates if both fallback_enabled=True AND bigquery_source is provided AND credentials are configured.
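A configuration in which the fallback can actually activate might look like the sketch below (a minimal illustration; it assumes `BigQuerySource` picks up Google Cloud credentials from the environment, as in the `MentionsEndpoint` example later on this page):

    from py_gdelt.sources import FileSource, BigQuerySource

    async with FileSource() as file_source:
        # Both sources provided and fallback_enabled=True (the default),
        # so a rate-limited file download can fall back to BigQuery.
        endpoint = EventsEndpoint(
            file_source=file_source,
            bigquery_source=BigQuerySource(),  # assumes credentials are configured
            fallback_enabled=True,
        )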
Example

    from py_gdelt.sources import FileSource
    from py_gdelt.filters import DateRange, EventFilter
    from datetime import date

    async with FileSource() as file_source:
        endpoint = EventsEndpoint(file_source=file_source)
        filter_obj = EventFilter(
            date_range=DateRange(start=date(2024, 1, 1)),
            actor1_country="USA",
        )
        # Batch query
        result = await endpoint.query(filter_obj, deduplicate=True)
        for event in result:
            print(event.global_event_id)
        # Streaming query
        async for event in endpoint.stream(filter_obj):
            process(event)
Source code in src/py_gdelt/endpoints/events.py
query(filter_obj, *, deduplicate=False, dedupe_strategy=None, use_bigquery=False)
async
Query GDELT Events with automatic fallback.
This is a batch query method that materializes all results into memory. For large datasets, prefer stream() for memory-efficient iteration.
Files are always tried first (free, no credentials), with automatic fallback to BigQuery on rate limit/error if credentials are configured.
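If no BigQuery fallback is configured, rate limits surface as exceptions. A minimal sketch of handling that case (assuming the exception classes live in `py_gdelt.exceptions`, which is not shown on this page):

    from py_gdelt.exceptions import APIError, RateLimitError  # assumed module path

    try:
        result = await endpoint.query(filter_obj)
    except RateLimitError:
        # No bigquery_source configured, so there is nothing to fall back to;
        # back off and retry later.
        ...
    except APIError as exc:
        print(f"Download failed: {exc}")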
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filter_obj` | `EventFilter` | Event filter with date range and query parameters | *required* |
| `deduplicate` | `bool` | If True, deduplicate events based on dedupe_strategy | `False` |
| `dedupe_strategy` | `DedupeStrategy \| None` | Deduplication strategy (default: URL_DATE_LOCATION) | `None` |
| `use_bigquery` | `bool` | If True, skip files and use BigQuery directly | `False` |
Returns:
| Type | Description |
|---|---|
| `FetchResult[Event]` | FetchResult containing Event instances. Use .data to access the list, .failed to see any failed requests, and .complete to check if all requests succeeded. |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | If rate limited and fallback not available |
| `APIError` | If download fails and fallback not available |
| `ConfigurationError` | If BigQuery requested but not configured |
Example

    filter_obj = EventFilter(
        date_range=DateRange(start=date(2024, 1, 1)),
        actor1_country="USA",
    )
    result = await endpoint.query(filter_obj, deduplicate=True)
    print(f"Found {len(result)} unique events")
    for event in result:
        print(event.global_event_id)
Source code in src/py_gdelt/endpoints/events.py
stream(filter_obj, *, deduplicate=False, dedupe_strategy=None, use_bigquery=False)
async
Stream GDELT Events with memory-efficient iteration.
This is a streaming method that yields events one at a time, making it suitable for large datasets. Memory usage is constant regardless of result size.
Files are always tried first (free, no credentials), with automatic fallback to BigQuery on rate limit/error if credentials are configured.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filter_obj` | `EventFilter` | Event filter with date range and query parameters | *required* |
| `deduplicate` | `bool` | If True, deduplicate events based on dedupe_strategy | `False` |
| `dedupe_strategy` | `DedupeStrategy \| None` | Deduplication strategy (default: URL_DATE_LOCATION) | `None` |
| `use_bigquery` | `bool` | If True, skip files and use BigQuery directly | `False` |
Yields:
| Name | Type | Description |
|---|---|---|
| Event | `AsyncIterator[Event]` | Individual Event instances matching the filter |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | If rate limited and fallback not available |
| `APIError` | If download fails and fallback not available |
| `ConfigurationError` | If BigQuery requested but not configured |
Example
filter_obj = EventFilter( ... date_range=DateRange(start=date(2024, 1, 1), end=date(2024, 1, 7)), ... actor1_country="USA", ... ) count = 0 async for event in endpoint.stream(filter_obj, deduplicate=True): ... print(event.global_event_id) ... count += 1 print(f"Streamed {count} unique events")
Source code in src/py_gdelt/endpoints/events.py
query_sync(filter_obj, *, deduplicate=False, dedupe_strategy=None, use_bigquery=False)
Synchronous wrapper for query().
This is a convenience method that runs the async query() method in a new event loop. Prefer using the async version when possible.
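In other words, the call is roughly equivalent to wrapping the async method in `asyncio.run()` yourself (a sketch only; like `asyncio.run()`, `query_sync` cannot be used while an event loop is already running):

    import asyncio

    # These two calls are roughly equivalent:
    result = endpoint.query_sync(filter_obj)
    result = asyncio.run(endpoint.query(filter_obj))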
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filter_obj` | `EventFilter` | Event filter with date range and query parameters | *required* |
| `deduplicate` | `bool` | If True, deduplicate events based on dedupe_strategy | `False` |
| `dedupe_strategy` | `DedupeStrategy \| None` | Deduplication strategy (default: URL_DATE_LOCATION) | `None` |
| `use_bigquery` | `bool` | If True, skip files and use BigQuery directly | `False` |
Returns:
| Type | Description |
|---|---|
| `FetchResult[Event]` | FetchResult containing Event instances |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | If rate limited and fallback not available |
| `APIError` | If download fails and fallback not available |
| `ConfigurationError` | If BigQuery requested but not configured |
| `RuntimeError` | If called from within an already running event loop |
Example
filter_obj = EventFilter( ... date_range=DateRange(start=date(2024, 1, 1)), ... actor1_country="USA", ... ) result = endpoint.query_sync(filter_obj) for event in result: ... print(event.global_event_id)
Source code in src/py_gdelt/endpoints/events.py
stream_sync(filter_obj, *, deduplicate=False, dedupe_strategy=None, use_bigquery=False)
Synchronous wrapper for stream().
This method provides a synchronous iterator interface over async streaming. It internally manages the event loop and yields events one at a time, providing true streaming behavior with memory efficiency.
Note: This creates a new event loop for each iteration, which has some overhead. For better performance, use the async stream() method directly if possible.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filter_obj` | `EventFilter` | Event filter with date range and query parameters | *required* |
| `deduplicate` | `bool` | If True, deduplicate events based on dedupe_strategy | `False` |
| `dedupe_strategy` | `DedupeStrategy \| None` | Deduplication strategy (default: URL_DATE_LOCATION) | `None` |
| `use_bigquery` | `bool` | If True, skip files and use BigQuery directly | `False` |
Returns:
| Type | Description |
|---|---|
| `Iterator[Event]` | Iterator that yields Event instances for each matching event |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | If rate limited and fallback not available |
| `APIError` | If download fails and fallback not available |
| `ConfigurationError` | If BigQuery requested but not configured |
| `RuntimeError` | If called from within an already running event loop |
Example
filter_obj = EventFilter( ... date_range=DateRange(start=date(2024, 1, 1)), ... actor1_country="USA", ... ) for event in endpoint.stream_sync(filter_obj, deduplicate=True): ... print(event.global_event_id)
Source code in src/py_gdelt/endpoints/events.py
MentionsEndpoint
MentionsEndpoint
Endpoint for querying GDELT Mentions data.
Mentions track individual occurrences of events across different news sources. Each mention links to an event in the Events table via GlobalEventID and contains metadata about the source, timing, document position, and confidence.
This endpoint uses DataFetcher for multi-source orchestration:

- Primary: file downloads (free, no credentials needed)
- Fallback: BigQuery (on rate limit/error, if credentials configured)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `file_source` | `FileSource` | FileSource instance for downloading GDELT files | *required* |
| `bigquery_source` | `BigQuerySource \| None` | Optional BigQuerySource instance for fallback queries | `None` |
| `settings` | `GDELTSettings \| None` | Optional GDELTSettings for configuration (currently unused but reserved for future features like caching) | `None` |
| `fallback_enabled` | `bool` | Whether to fall back to BigQuery on errors (default: True) | `True` |
| `error_policy` | `ErrorPolicy` | How to handle errors - 'raise', 'warn', or 'skip' (default: 'warn') | `'warn'` |
Note
Mentions queries require BigQuery as files don't support event-specific filtering. File downloads would require fetching entire date ranges and filtering client-side, which is inefficient for single-event queries. BigQuery fallback only activates if both fallback_enabled=True AND bigquery_source is provided AND credentials are configured.
Example

    from datetime import date
    from py_gdelt.filters import DateRange, EventFilter
    from py_gdelt.sources import FileSource, BigQuerySource
    from py_gdelt.sources.fetcher import DataFetcher

    async with FileSource() as file_source:
        bq_source = BigQuerySource()
        fetcher = DataFetcher(file_source=file_source, bigquery_source=bq_source)
        endpoint = MentionsEndpoint(fetcher=fetcher)

        filter_obj = EventFilter(
            date_range=DateRange(start=date(2024, 1, 1), end=date(2024, 1, 7))
        )

        # Batch query
        result = await endpoint.query(global_event_id="123456789", filter_obj=filter_obj)
        print(f"Found {len(result)} mentions")
        for mention in result:
            print(mention.source_name)

        # Streaming query
        async for mention in endpoint.stream(global_event_id="123456789", filter_obj=filter_obj):
            print(mention.source_name)
Source code in src/py_gdelt/endpoints/mentions.py
query(global_event_id, filter_obj, *, use_bigquery=True)
async
Query mentions for a specific event and return all results.
This method collects all mentions into memory and returns them as a FetchResult. For large result sets or memory-constrained environments, use stream() instead.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `global_event_id` | `int` | Global event ID to fetch mentions for (integer) | *required* |
| `filter_obj` | `EventFilter` | Filter with date range for the query window | *required* |
| `use_bigquery` | `bool` | If True, use BigQuery directly (default: True, recommended for mentions) | `True` |
Returns:
| Type | Description |
|---|---|
| `FetchResult[Mention]` | Container with list of Mention objects and failure tracking |
Raises:
| Type | Description |
|---|---|
| `ConfigurationError` | If BigQuery not configured but required |
| `ValueError` | If date range is invalid or too large |
Example
filter_obj = EventFilter( ... date_range=DateRange(start=date(2024, 1, 1), end=date(2024, 1, 7)) ... ) result = await endpoint.query(123456789, filter_obj) print(f"Complete: {result.complete}, Count: {len(result)}") for mention in result: ... print(f"{mention.source_name}: {mention.confidence}%")
Source code in src/py_gdelt/endpoints/mentions.py
stream(global_event_id, filter_obj, *, use_bigquery=True)
async
Stream mentions for a specific event.
This method yields mentions one at a time, converting from internal _RawMention to public Mention model at the yield boundary. Memory-efficient for large result sets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `global_event_id` | `int` | Global event ID to fetch mentions for (integer) | *required* |
| `filter_obj` | `EventFilter` | Filter with date range for the query window | *required* |
| `use_bigquery` | `bool` | If True, use BigQuery directly (default: True, recommended for mentions) | `True` |
Yields:
| Name | Type | Description |
|---|---|---|
| Mention | `AsyncIterator[Mention]` | Individual mention records with full type safety |
Raises:
| Type | Description |
|---|---|
| `ConfigurationError` | If BigQuery not configured but required |
| `ValueError` | If date range is invalid or too large |
Example
filter_obj = EventFilter( ... date_range=DateRange(start=date(2024, 1, 1), end=date(2024, 1, 7)) ... ) async for mention in endpoint.stream(123456789, filter_obj): ... if mention.confidence >= 80: ... print(f"High confidence: {mention.source_name}")
Source code in src/py_gdelt/endpoints/mentions.py
query_sync(global_event_id, filter_obj, *, use_bigquery=True)
Synchronous wrapper for query().
This is a convenience method for synchronous code. It runs the async query() method in a new event loop. For better performance, use the async version directly.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `global_event_id` | `int` | Global event ID to fetch mentions for (integer) | *required* |
| `filter_obj` | `EventFilter` | Filter with date range for the query window | *required* |
| `use_bigquery` | `bool` | If True, use BigQuery directly (default: True) | `True` |
Returns:
| Type | Description |
|---|---|
| `FetchResult[Mention]` | Container with list of Mention objects |
Raises:
| Type | Description |
|---|---|
| `ConfigurationError` | If BigQuery not configured but required |
| `ValueError` | If date range is invalid |
Example
filter_obj = EventFilter( ... date_range=DateRange(start=date(2024, 1, 1), end=date(2024, 1, 7)) ... ) result = endpoint.query_sync(123456789, filter_obj) for mention in result: ... print(mention.source_name)
Source code in src/py_gdelt/endpoints/mentions.py
stream_sync(global_event_id, filter_obj, *, use_bigquery=True)
Synchronous wrapper for stream().
This method provides a synchronous iterator interface over async streaming. It internally manages the event loop and yields mentions one at a time, providing true streaming behavior with memory efficiency.
Note: This creates a new event loop for each iteration, which has some overhead. For better performance, use the async stream() method directly if possible.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `global_event_id` | `int` | Global event ID to fetch mentions for (integer) | *required* |
| `filter_obj` | `EventFilter` | Filter with date range for the query window | *required* |
| `use_bigquery` | `bool` | If True, use BigQuery directly (default: True) | `True` |
Returns:
| Type | Description |
|---|---|
| `Iterator[Mention]` | Iterator of individual Mention records |
Raises:
| Type | Description |
|---|---|
| `ConfigurationError` | If BigQuery not configured but required |
| `ValueError` | If date range is invalid |
| `RuntimeError` | If called from within an already running event loop |
Example
filter_obj = EventFilter( ... date_range=DateRange(start=date(2024, 1, 1), end=date(2024, 1, 7)) ... ) for mention in endpoint.stream_sync(123456789, filter_obj): ... print(mention.source_name)
Source code in src/py_gdelt/endpoints/mentions.py
GKGEndpoint
GKGEndpoint
GKG (Global Knowledge Graph) endpoint for querying GDELT enriched content data.
The GKGEndpoint provides access to GDELT's Global Knowledge Graph, which contains rich content analysis including themes, people, organizations, locations, counts, tone, and other metadata extracted from news articles.
This endpoint uses DataFetcher to orchestrate source selection:

- Files are ALWAYS primary (free, no credentials needed)
- BigQuery is FALLBACK ONLY (on 429/error, if credentials configured)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `file_source` | `FileSource` | FileSource instance for downloading GDELT files | *required* |
| `bigquery_source` | `BigQuerySource \| None` | Optional BigQuerySource instance for fallback queries | `None` |
| `settings` | `GDELTSettings \| None` | Optional GDELTSettings for configuration (currently unused but reserved for future features like caching) | `None` |
| `fallback_enabled` | `bool` | Whether to fall back to BigQuery on errors (default: True) | `True` |
| `error_policy` | `ErrorPolicy` | How to handle errors - 'raise', 'warn', or 'skip' (default: 'warn') | `'warn'` |
Note
BigQuery fallback only activates if both fallback_enabled=True AND bigquery_source is provided AND credentials are configured.
Example
Basic GKG query:

    from datetime import date
    from py_gdelt.filters import GKGFilter, DateRange
    from py_gdelt.endpoints.gkg import GKGEndpoint
    from py_gdelt.sources.files import FileSource

    async def main():
        async with FileSource() as file_source:
            endpoint = GKGEndpoint(file_source=file_source)
            filter_obj = GKGFilter(
                date_range=DateRange(start=date(2024, 1, 1)),
                themes=["ENV_CLIMATECHANGE"]
            )
            result = await endpoint.query(filter_obj)
            for record in result:
                print(record.record_id, record.source_url)
Streaming large result sets:

    async def stream_example():
        async with FileSource() as file_source:
            endpoint = GKGEndpoint(file_source=file_source)
            filter_obj = GKGFilter(
                date_range=DateRange(start=date(2024, 1, 1)),
                country="USA"
            )
            async for record in endpoint.stream(filter_obj):
                print(record.record_id, record.primary_theme)
Synchronous usage:
endpoint = GKGEndpoint(file_source=file_source) result = endpoint.query_sync(filter_obj) for record in result: ... print(record.record_id)
Source code in src/py_gdelt/endpoints/gkg.py
query(filter_obj, *, use_bigquery=False)
async
Query GKG data with automatic fallback and return all results.
This method fetches all matching GKG records and returns them as a FetchResult container. For large result sets, consider using stream() instead to avoid loading everything into memory.
Files are always tried first (free, no credentials), with automatic fallback to BigQuery on rate limit/error if credentials are configured.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filter_obj` | `GKGFilter` | GKG filter with date range and query parameters | *required* |
| `use_bigquery` | `bool` | If True, skip files and use BigQuery directly (default: False) | `False` |
Returns:
| Type | Description |
|---|---|
| `FetchResult[GKGRecord]` | FetchResult containing all matching records and any failures |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | If rate limited and fallback not available/enabled |
| `APIError` | If download fails and fallback not available/enabled |
| `ConfigurationError` | If BigQuery requested but not configured |
Example

    from datetime import date
    from py_gdelt.filters import GKGFilter, DateRange

    filter_obj = GKGFilter(
        date_range=DateRange(start=date(2024, 1, 1)),
        themes=["ECON_STOCKMARKET"],
        min_tone=0.0,  # Only positive tone
    )
    result = await endpoint.query(filter_obj)
    print(f"Fetched {len(result)} records")
    if not result.complete:
        print(f"Warning: {result.total_failed} requests failed")
    for record in result:
        print(record.record_id, record.tone.tone if record.tone else None)
Source code in src/py_gdelt/endpoints/gkg.py
stream(filter_obj, *, use_bigquery=False)
async
Stream GKG records with automatic fallback.
This method streams GKG records one at a time, which is memory-efficient for large result sets. Records are converted from internal _RawGKG dataclass to public GKGRecord Pydantic model at the yield boundary.
Files are always tried first (free, no credentials), with automatic fallback to BigQuery on rate limit/error if credentials are configured.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filter_obj` | `GKGFilter` | GKG filter with date range and query parameters | *required* |
| `use_bigquery` | `bool` | If True, skip files and use BigQuery directly (default: False) | `False` |
Yields:
| Name | Type | Description |
|---|---|---|
| GKGRecord | `AsyncIterator[GKGRecord]` | Individual GKG records matching the filter criteria |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | If rate limited and fallback not available/enabled |
| `APIError` | If download fails and fallback not available/enabled |
| `ConfigurationError` | If BigQuery requested but not configured |
Example
from datetime import date from py_gdelt.filters import GKGFilter, DateRange
filter_obj = GKGFilter( ... date_range=DateRange(start=date(2024, 1, 1), end=date(2024, 1, 7)), ... organizations=["United Nations"], ... ) count = 0 async for record in endpoint.stream(filter_obj): ... print(f"Processing {record.record_id}") ... count += 1 ... if count >= 1000: ... break # Stop after 1000 records
Source code in src/py_gdelt/endpoints/gkg.py
query_sync(filter_obj, *, use_bigquery=False)
Synchronous wrapper for query().
This is a convenience method for synchronous code that internally uses asyncio.run() to execute the async query() method.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filter_obj` | `GKGFilter` | GKG filter with date range and query parameters | *required* |
| `use_bigquery` | `bool` | If True, skip files and use BigQuery directly (default: False) | `False` |
Returns:
| Type | Description |
|---|---|
| `FetchResult[GKGRecord]` | FetchResult containing all matching records and any failures |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | If rate limited and fallback not available/enabled |
| `APIError` | If download fails and fallback not available/enabled |
| `ConfigurationError` | If BigQuery requested but not configured |
| `RuntimeError` | If called from within an existing event loop |
Example
from datetime import date from py_gdelt.filters import GKGFilter, DateRange
Synchronous usage (no async/await needed)
endpoint = GKGEndpoint(file_source=file_source) filter_obj = GKGFilter( ... date_range=DateRange(start=date(2024, 1, 1)) ... ) result = endpoint.query_sync(filter_obj) for record in result: ... print(record.record_id)
Source code in src/py_gdelt/endpoints/gkg.py
stream_sync(filter_obj, *, use_bigquery=False)
Synchronous wrapper for stream().
This method provides a synchronous iterator interface over async streaming. It internally manages the event loop and yields records one at a time.
Note: This creates a new event loop for each iteration, which has some overhead. For better performance, use the async stream() method directly if possible.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filter_obj` | `GKGFilter` | GKG filter with date range and query parameters | *required* |
| `use_bigquery` | `bool` | If True, skip files and use BigQuery directly (default: False) | `False` |
Returns:
| Type | Description |
|---|---|
| `Iterator[GKGRecord]` | Iterator of GKGRecord instances for each matching record |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | If rate limited and fallback not available/enabled |
| `APIError` | If download fails and fallback not available/enabled |
| `ConfigurationError` | If BigQuery requested but not configured |
| `RuntimeError` | If called from within an existing event loop |
Example
from datetime import date from py_gdelt.filters import GKGFilter, DateRange
Synchronous streaming (no async/await needed)
endpoint = GKGEndpoint(file_source=file_source) filter_obj = GKGFilter( ... date_range=DateRange(start=date(2024, 1, 1)) ... ) for record in endpoint.stream_sync(filter_obj): ... print(record.record_id) ... if record.has_quotations: ... print(f" {len(record.quotations)} quotations found")
Source code in src/py_gdelt/endpoints/gkg.py
NGramsEndpoint
NGramsEndpoint
Endpoint for querying GDELT NGrams 3.0 data.
Provides access to GDELT's NGrams dataset, which tracks word and phrase occurrences across global news with contextual information. NGrams are file-based only (no BigQuery support).
The endpoint uses DataFetcher for orchestrated file downloads with automatic retry, error handling, and intelligent caching. Internal _RawNGram dataclass instances are converted to Pydantic NGramRecord models at the yield boundary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `settings` | `GDELTSettings \| None` | Configuration settings. If None, uses defaults. | `None` |
| `file_source` | `FileSource \| None` | Optional shared FileSource. If None, creates owned instance. When provided, the source lifecycle is managed externally. | `None` |
Example
Batch query with filtering:

    from py_gdelt.filters import NGramsFilter, DateRange
    from datetime import date

    async with NGramsEndpoint() as endpoint:
        filter_obj = NGramsFilter(
            date_range=DateRange(start=date(2024, 1, 1)),
            language="en",
            ngram="climate",
        )
        result = await endpoint.query(filter_obj)
        print(f"Found {len(result)} records")
Streaming for large datasets:

    async with NGramsEndpoint() as endpoint:
        filter_obj = NGramsFilter(
            date_range=DateRange(
                start=date(2024, 1, 1),
                end=date(2024, 1, 7)
            ),
            language="en",
        )
        async for record in endpoint.stream(filter_obj):
            if record.is_early_in_article:
                print(f"Early: {record.ngram} in {record.url}")
Source code in src/py_gdelt/endpoints/ngrams.py
close()
async
Close resources if we own them.
Only closes resources that were created by this instance. Shared resources are not closed to allow reuse.
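A small sketch of the two ownership modes (illustrative only, using the constructor parameters documented above):

    # Owned source: the endpoint creates its own FileSource and closes it on exit.
    async with NGramsEndpoint() as endpoint:
        ...

    # Shared source: the lifecycle is managed externally, so close() leaves it open.
    async with FileSource() as file_source:
        endpoint = NGramsEndpoint(file_source=file_source)
        ...
        await endpoint.close()  # file_source remains usable here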
Source code in src/py_gdelt/endpoints/ngrams.py
__aenter__()
async
Async context manager entry.
Returns:
| Type | Description |
|---|---|
| `NGramsEndpoint` | Self for use in async with statement. |
__aexit__(*args)
async
Async context manager exit - close resources.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `*args` | `object` | Exception info (unused, but required by protocol). | `()` |
query(filter_obj)
async
Query NGrams data and return all results.
Fetches all NGram records matching the filter criteria and returns them as a FetchResult. This method collects all records in memory before returning, so use stream() for large result sets to avoid memory issues.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filter_obj` | `NGramsFilter` | Filter with date range and optional ngram/language constraints | *required* |
Returns:
| Type | Description |
|---|---|
| `FetchResult[NGramRecord]` | FetchResult containing list of NGramRecord instances and any failed requests |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | If rate limited and retries exhausted |
| `APIError` | If downloads fail |
| `DataError` | If file parsing fails |
Example
filter_obj = NGramsFilter( ... date_range=DateRange(start=date(2024, 1, 1)), ... language="en", ... min_position=0, ... max_position=20, ... ) result = await endpoint.query(filter_obj) print(f"Found {len(result)} records in article headlines")
Source code in src/py_gdelt/endpoints/ngrams.py
stream(filter_obj)
async
Stream NGrams data record by record.
Yields NGram records one at a time, converting internal _RawNGram dataclass instances to Pydantic NGramRecord models at the yield boundary. This method is memory-efficient for large result sets.
Client-side filtering is applied for ngram text, language, and position constraints since file downloads provide all records for a date range.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filter_obj` | `NGramsFilter` | Filter with date range and optional ngram/language constraints | *required* |
Yields:
| Name | Type | Description |
|---|---|---|
| NGramRecord | `AsyncIterator[NGramRecord]` | Individual NGram records matching the filter criteria |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | If rate limited and retries exhausted |
| `APIError` | If downloads fail |
| `DataError` | If file parsing fails |
Example
filter_obj = NGramsFilter( ... date_range=DateRange(start=date(2024, 1, 1)), ... ngram="climate", ... language="en", ... ) async for record in endpoint.stream(filter_obj): ... print(f"{record.ngram}: {record.context}")
Source code in src/py_gdelt/endpoints/ngrams.py
query_sync(filter_obj)
Synchronous wrapper for query().
Runs the async query() method in a new event loop. This is a convenience method for synchronous code, but async methods are preferred when possible.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filter_obj` | `NGramsFilter` | Filter with date range and optional constraints | *required* |
Returns:
| Type | Description |
|---|---|
| `FetchResult[NGramRecord]` | FetchResult containing list of NGramRecord instances |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | If rate limited and retries exhausted |
| `APIError` | If downloads fail |
| `DataError` | If file parsing fails |
Example
filter_obj = NGramsFilter( ... date_range=DateRange(start=date(2024, 1, 1)), ... language="en", ... ) result = endpoint.query_sync(filter_obj) print(f"Found {len(result)} records")
Source code in src/py_gdelt/endpoints/ngrams.py
stream_sync(filter_obj)
Synchronous wrapper for stream().
Yields NGram records from the async stream() method in a blocking manner. This is a convenience method for synchronous code, but async methods are preferred when possible.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `filter_obj` | `NGramsFilter` | Filter with date range and optional constraints | *required* |
Yields:
| Name | Type | Description |
|---|---|---|
| NGramRecord | `NGramRecord` | Individual NGram records matching the filter criteria |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | If rate limited and retries exhausted |
| `APIError` | If downloads fail |
| `DataError` | If file parsing fails |
Example
filter_obj = NGramsFilter( ... date_range=DateRange(start=date(2024, 1, 1)), ... ngram="climate", ... ) for record in endpoint.stream_sync(filter_obj): ... print(f"{record.ngram}: {record.url}")
Source code in src/py_gdelt/endpoints/ngrams.py
REST API Endpoints
DocEndpoint
DocEndpoint
Bases: BaseEndpoint
DOC 2.0 API endpoint for searching GDELT articles.
The DOC API provides full-text search across GDELT's monitored news sources with support for various output modes (article lists, timelines, galleries) and flexible filtering by time, source, language, and relevance.
Attributes:
| Name | Type | Description |
|---|---|---|
| `BASE_URL` | | Base URL for the DOC API endpoint |
Example
Basic article search:
async with DocEndpoint() as doc: ... articles = await doc.search("climate change", max_results=100) ... for article in articles: ... print(article.title, article.url)
Using filters for advanced queries:
from py_gdelt.filters import DocFilter async with DocEndpoint() as doc: ... filter = DocFilter( ... query="elections", ... timespan="7d", ... source_country="US", ... sort_by="relevance" ... ) ... articles = await doc.query(filter)
Getting timeline data:
async with DocEndpoint() as doc: ... timeline = await doc.timeline("protests", timespan="30d") ... for point in timeline.points: ... print(point.date, point.value)
Source code in src/py_gdelt/endpoints/doc.py
search(query, *, timespan=None, max_results=250, sort_by='date', source_language=None, source_country=None)
async
Search for articles matching a query.
This is a convenience method that constructs a DocFilter internally. For more control over query parameters, use query() with a DocFilter directly.
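Conceptually, a search() call maps onto a DocFilter like this (a sketch; it assumes the DocFilter field names mirror the search() parameters, of which only `query`, `timespan`, `source_country`, `sort_by`, and `max_results` appear in the examples on this page):

    articles = await doc.search("climate change", timespan="7d", sort_by="relevance")

    # is roughly equivalent to:
    doc_filter = DocFilter(
        query="climate change",
        timespan="7d",
        sort_by="relevance",
        max_results=250,
    )
    articles = await doc.query(doc_filter)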
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search query string (supports boolean operators, phrases). | *required* |
| `timespan` | `str \| None` | Time range like "24h", "7d", "30d". If None, searches all time. | `None` |
| `max_results` | `int` | Maximum results to return (1-250, default: 250). | `250` |
| `sort_by` | `Literal['date', 'relevance', 'tone']` | Sort order - "date", "relevance", or "tone" (default: "date"). | `'date'` |
| `source_language` | `str \| None` | Filter by source language (ISO 639 code). | `None` |
| `source_country` | `str \| None` | Filter by source country (FIPS country code). | `None` |
Returns:
| Type | Description |
|---|---|
| `list[Article]` | List of Article objects matching the query. |
Raises:
| Type | Description |
|---|---|
| `APIError` | On HTTP errors or invalid responses. |
| `APIUnavailableError` | When API is down or unreachable. |
| `RateLimitError` | When rate limited by the API. |
Example

    async with DocEndpoint() as doc:
        # Search recent articles about climate
        articles = await doc.search(
            "climate change",
            timespan="7d",
            max_results=50,
            sort_by="relevance"
        )
        # Filter by country
        us_articles = await doc.search(
            "elections",
            source_country="US",
            timespan="24h"
        )
Source code in src/py_gdelt/endpoints/doc.py
query(query_filter)
async
Query the DOC API with a filter.
Executes a search using a pre-configured DocFilter object, providing full control over all query parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query_filter` | `DocFilter` | DocFilter with query parameters and constraints. | *required* |
Returns:
| Type | Description |
|---|---|
| `list[Article]` | List of Article objects matching the filter criteria. |
Raises:
| Type | Description |
|---|---|
| `APIError` | On HTTP errors or invalid responses. |
| `APIUnavailableError` | When API is down or unreachable. |
| `RateLimitError` | When rate limited by the API. |
Example

    from py_gdelt.filters import DocFilter
    from datetime import datetime

    async with DocEndpoint() as doc:
        # Complex query with datetime range
        doc_filter = DocFilter(
            query='"machine learning" AND python',
            start_datetime=datetime(2024, 1, 1),
            end_datetime=datetime(2024, 1, 31),
            source_country="US",
            max_results=100,
            sort_by="relevance"
        )
        articles = await doc.query(doc_filter)
Source code in src/py_gdelt/endpoints/doc.py
timeline(query, *, timespan='7d')
async
Get timeline data for a query.
Returns time series data showing article volume over time for a given search query. Useful for visualizing trends and tracking story evolution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search query string. | *required* |
| `timespan` | `str \| None` | Time range to analyze (default: "7d" - 7 days). Common values: "24h", "7d", "30d", "3mon". | `'7d'` |
Returns:
| Type | Description |
|---|---|
| `Timeline` | Timeline object with time series data points. |
Raises:
| Type | Description |
|---|---|
| `APIError` | On HTTP errors or invalid responses. |
| `APIUnavailableError` | When API is down or unreachable. |
| `RateLimitError` | When rate limited by the API. |
Example
async with DocEndpoint() as doc: ... # Get article volume over last month ... timeline = await doc.timeline("protests", timespan="30d") ... for point in timeline.points: ... print(f"{point.date}: {point.value} articles")
Source code in src/py_gdelt/endpoints/doc.py
GeoEndpoint
GeoEndpoint
Bases: BaseEndpoint
GEO 2.0 API endpoint for geographic article data.
Returns locations mentioned in news articles matching a query. Supports time-based filtering and geographic bounds.
Example
async with GeoEndpoint() as geo: result = await geo.search("earthquake", max_points=100) for point in result.points: print(f"{point.name}: {point.count} articles")
Attributes:
| Name | Type | Description |
|---|---|---|
| `BASE_URL` | | GEO API base URL |
Source code in src/py_gdelt/endpoints/geo.py
search(query, *, timespan=None, max_points=250, bounding_box=None)
async
Search for geographic locations in news.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search query (full text search) | *required* |
| `timespan` | `str \| None` | Time range (e.g., "24h", "7d", "1m") | `None` |
| `max_points` | `int` | Maximum points to return (1-250) | `250` |
| `bounding_box` | `tuple[float, float, float, float] \| None` | Optional (min_lat, min_lon, max_lat, max_lon) | `None` |
Returns:
| Type | Description |
|---|---|
| `GeoResult` | GeoResult with list of GeoPoints |
Example
async with GeoEndpoint() as geo: result = await geo.search( "earthquake", timespan="7d", max_points=50 ) print(f"Found {len(result.points)} locations")
Source code in src/py_gdelt/endpoints/geo.py
query(query_filter)
async
Query the GEO API with a filter.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query_filter` | `GeoFilter` | GeoFilter with query parameters | *required* |
Returns:
| Type | Description |
|---|---|
| `GeoResult` | GeoResult containing geographic points |
Raises:
| Type | Description |
|---|---|
| `APIError` | On request failure |
| `RateLimitError` | On rate limit |
| `APIUnavailableError` | On server error |
Source code in src/py_gdelt/endpoints/geo.py
to_geojson(query, *, timespan=None, max_points=250)
async
Get raw GeoJSON response.
Useful for direct use with mapping libraries (Leaflet, Folium, etc.).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search query | *required* |
| `timespan` | `str \| None` | Time range (e.g., "24h", "7d") | `None` |
| `max_points` | `int` | Maximum points (1-250) | `250` |
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | Raw GeoJSON dict (FeatureCollection) |
Example
async with GeoEndpoint() as geo: geojson = await geo.to_geojson("climate change", timespan="30d") # Pass directly to mapping library folium.GeoJson(geojson).add_to(map)
Source code in src/py_gdelt/endpoints/geo.py
ContextEndpoint
ContextEndpoint
Bases: BaseEndpoint
Context 2.0 API endpoint for contextual analysis.
Provides contextual information about search terms including related themes, entities, and sentiment analysis.
Attributes:
| Name | Type | Description |
|---|---|---|
| `BASE_URL` | | Base URL for the Context API endpoint |
Example
async with ContextEndpoint() as ctx: result = await ctx.analyze("climate change") for theme in result.themes[:5]: print(f"{theme.theme}: {theme.count} mentions")
Source code in src/py_gdelt/endpoints/context.py
__aenter__()
async
Async context manager entry.
Returns:
| Type | Description |
|---|---|
| `ContextEndpoint` | Self for use in async with statement. |
analyze(query, *, timespan=None)
async
Get contextual analysis for a search term.
Retrieves comprehensive contextual information including themes, entities, tone analysis, and related queries for the specified search term.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search term to analyze | *required* |
| `timespan` | `str \| None` | Time range (e.g., "24h", "7d", "30d") | `None` |
Returns:
| Type | Description |
|---|---|
| `ContextResult` | ContextResult with themes, entities, and tone analysis |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | On 429 response |
| `APIUnavailableError` | On 5xx response or connection error |
| `APIError` | On other HTTP errors or invalid JSON |
Source code in src/py_gdelt/endpoints/context.py
get_themes(query, *, timespan=None, limit=10)
async
Get top themes for a search term.
Convenience method that returns just themes sorted by count.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search term | *required* |
| `timespan` | `str \| None` | Time range | `None` |
| `limit` | `int` | Max themes to return | `10` |
Returns:
| Type | Description |
|---|---|
| `list[ContextTheme]` | List of top themes sorted by count (descending) |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | On 429 response |
| `APIUnavailableError` | On 5xx response or connection error |
| `APIError` | On other HTTP errors or invalid JSON |
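Example (illustrative; the theme fields follow the ContextTheme usage shown in the class-level example above):

    async with ContextEndpoint() as ctx:
        themes = await ctx.get_themes("climate change", timespan="7d", limit=5)
        for theme in themes:
            print(f"{theme.theme}: {theme.count}")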
Source code in src/py_gdelt/endpoints/context.py
get_entities(query, *, timespan=None, entity_type=None, limit=10)
async
Get top entities for a search term.
Convenience method that returns entities, optionally filtered by type and sorted by count.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search term | *required* |
| `timespan` | `str \| None` | Time range | `None` |
| `entity_type` | `str \| None` | Filter by type (PERSON, ORG, LOCATION) | `None` |
| `limit` | `int` | Max entities to return | `10` |
Returns:
| Type | Description |
|---|---|
| `list[ContextEntity]` | List of top entities sorted by count (descending) |
Raises:
| Type | Description |
|---|---|
| `RateLimitError` | On 429 response |
| `APIUnavailableError` | On 5xx response or connection error |
| `APIError` | On other HTTP errors or invalid JSON |
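Example (illustrative; it prints the raw ContextEntity objects since their fields are not spelled out on this page):

    async with ContextEndpoint() as ctx:
        entities = await ctx.get_entities("climate change", entity_type="PERSON", limit=5)
        for entity in entities:
            print(entity)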
Source code in src/py_gdelt/endpoints/context.py
TVEndpoint
TVEndpoint
Bases: BaseEndpoint
TV API endpoint for television news monitoring.
Searches transcripts from major US television networks including CNN, Fox News, MSNBC, and others. Provides three query modes:

- Clip gallery: individual video clips matching the query
- Timeline: time series of mention frequency
- Station chart: breakdown by network
The endpoint handles date formatting, parameter building, and response parsing automatically.
Attributes:
| Name | Type | Description |
|---|---|---|
| `BASE_URL` | | API endpoint URL for TV queries |
Example
async with TVEndpoint() as tv: clips = await tv.search("election", station="CNN") for clip in clips: print(f"{clip.show_name}: {clip.snippet}")
Source code in src/py_gdelt/endpoints/tv.py
search(query, *, timespan=None, start_datetime=None, end_datetime=None, station=None, market=None, max_results=250)
async
Search TV transcripts for clips.
Searches television news transcripts and returns matching video clips with metadata and text excerpts.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search query (keywords, phrases, or boolean expressions) | *required* |
| `timespan` | `str \| None` | Time range (e.g., "24h", "7d", "30d") | `None` |
| `start_datetime` | `datetime \| None` | Start of date range (alternative to timespan) | `None` |
| `end_datetime` | `datetime \| None` | End of date range (alternative to timespan) | `None` |
| `station` | `str \| None` | Filter by station (CNN, FOXNEWS, MSNBC, etc.) | `None` |
| `market` | `str \| None` | Filter by market (National, Philadelphia, etc.) | `None` |
| `max_results` | `int` | Maximum clips to return (1-250) | `250` |
Returns:
| Type | Description |
|---|---|
| `list[TVClip]` | List of TVClip objects matching the query |
Raises:
| Type | Description |
|---|---|
| `APIError` | If the API returns an error |
| `RateLimitError` | If rate limit is exceeded |
| `APIUnavailableError` | If the API is unavailable |
Example
clips = await tv.search("climate change", station="CNN", timespan="7d")
Source code in src/py_gdelt/endpoints/tv.py
query_clips(query_filter)
async
Query for TV clips with a filter.
Lower-level method that accepts a TVFilter object for more control over query parameters.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query_filter` | `TVFilter` | TVFilter object with query parameters | *required* |
Returns:
| Type | Description |
|---|---|
| `list[TVClip]` | List of TVClip objects |
Raises:
| Type | Description |
|---|---|
| `APIError` | If the API returns an error |
| `RateLimitError` | If rate limit is exceeded |
| `APIUnavailableError` | If the API is unavailable |
Source code in src/py_gdelt/endpoints/tv.py
timeline(query, *, timespan='7d', start_datetime=None, end_datetime=None, station=None)
async
Get timeline of TV mentions.
Returns a time series showing when a topic was mentioned on television, useful for tracking coverage patterns over time.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search query | *required* |
| `timespan` | `str \| None` | Time range (default: "7d") | `'7d'` |
| `start_datetime` | `datetime \| None` | Start of date range (alternative to timespan) | `None` |
| `end_datetime` | `datetime \| None` | End of date range (alternative to timespan) | `None` |
| `station` | `str \| None` | Optional station filter | `None` |
Returns:
| Type | Description |
|---|---|
| `TVTimeline` | TVTimeline with time series data |
Raises:
| Type | Description |
|---|---|
| `APIError` | If the API returns an error |
| `RateLimitError` | If rate limit is exceeded |
| `APIUnavailableError` | If the API is unavailable |
Example
timeline = await tv.timeline("election", timespan="30d") for point in timeline.points: print(f"{point.date}: {point.count} mentions")
Source code in src/py_gdelt/endpoints/tv.py
station_chart(query, *, timespan='7d', start_datetime=None, end_datetime=None)
async
Get station comparison chart.
Shows which stations covered a topic the most, useful for understanding which networks are focusing on particular stories.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search query | *required* |
| `timespan` | `str \| None` | Time range (default: "7d") | `'7d'` |
| `start_datetime` | `datetime \| None` | Start of date range (alternative to timespan) | `None` |
| `end_datetime` | `datetime \| None` | End of date range (alternative to timespan) | `None` |
Returns:
| Type | Description |
|---|---|
| `TVStationChart` | TVStationChart with station breakdown |
Raises:
| Type | Description |
|---|---|
| `APIError` | If the API returns an error |
| `RateLimitError` | If rate limit is exceeded |
| `APIUnavailableError` | If the API is unavailable |
Example
chart = await tv.station_chart("healthcare") for station in chart.stations: print(f"{station.station}: {station.percentage}%")
Source code in src/py_gdelt/endpoints/tv.py
TVAIEndpoint
TVAIEndpoint
Bases: BaseEndpoint
TVAI API endpoint for AI-enhanced TV analysis.
Similar to TVEndpoint but uses AI-powered features for enhanced analysis. Uses the same data models and similar interface as TVEndpoint.
Attributes:
| Name | Type | Description |
|---|---|---|
| `BASE_URL` | | API endpoint URL for TVAI queries |
Example
async with TVAIEndpoint() as tvai: clips = await tvai.search("artificial intelligence")
Source code in src/py_gdelt/endpoints/tv.py
search(query, *, timespan=None, start_datetime=None, end_datetime=None, station=None, max_results=250)
async
Search using AI-enhanced analysis.
Searches television transcripts using AI-powered analysis for potentially better semantic matching and relevance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query` | `str` | Search query | *required* |
| `timespan` | `str \| None` | Time range (e.g., "24h", "7d") | `None` |
| `start_datetime` | `datetime \| None` | Start of date range (alternative to timespan) | `None` |
| `end_datetime` | `datetime \| None` | End of date range (alternative to timespan) | `None` |
| `station` | `str \| None` | Filter by station | `None` |
| `max_results` | `int` | Maximum clips to return (1-250) | `250` |
Returns:
| Type | Description |
|---|---|
| `list[TVClip]` | List of TVClip objects |
Raises:
| Type | Description |
|---|---|
| `APIError` | If the API returns an error |
| `RateLimitError` | If rate limit is exceeded |
| `APIUnavailableError` | If the API is unavailable |
Example
clips = await tvai.search("machine learning", timespan="7d")
Source code in src/py_gdelt/endpoints/tv.py