| name | json-extraction |
| description | Extract and transform data from JSON documents and API responses. Use this skill when parsing JSON files, extracting nested data using JSONPath or jq-like queries, flattening nested structures, or transforming JSON data for further processing. |
JSON Extraction
Extract, query, and transform data from JSON documents.
Quick Start
import json
# Load JSON
with open('data.json') as f:
data = json.load(f)
# Access nested data
value = data['key']['nested']['value']
items = data.get('items', [])
JSONPath Queries
Use jsonpath-ng for complex queries:
pip install jsonpath-ng
from jsonpath_ng import parse
# Find all prices
jsonpath_expr = parse('$.products[*].price')
prices = [match.value for match in jsonpath_expr.find(data)]
# Find nested items
jsonpath_expr = parse('$..items[*].name')
names = [match.value for match in jsonpath_expr.find(data)]
Common JSONPath Patterns
| Pattern | Description |
|---|---|
$.store.book[*] |
All books |
$..author |
All authors at any level |
$.store.book[0] |
First book |
$.store.book[-1] |
Last book |
$.store.book[0,1] |
First two books |
$.store.book[?@.price<10] |
Books cheaper than 10 |
Flattening Nested JSON
def flatten_json(nested: dict, prefix: str = '') -> dict:
"""Flatten nested JSON into single-level dictionary."""
flat = {}
for key, value in nested.items():
new_key = f"{prefix}.{key}" if prefix else key
if isinstance(value, dict):
flat.update(flatten_json(value, new_key))
elif isinstance(value, list):
for i, item in enumerate(value):
if isinstance(item, dict):
flat.update(flatten_json(item, f"{new_key}[{i}]"))
else:
flat[f"{new_key}[{i}]"] = item
else:
flat[new_key] = value
return flat
Extracting from API Responses
def extract_paginated(response_json: dict) -> list:
"""Extract data from paginated API response."""
data = response_json.get('data', response_json.get('results', []))
return data
def safe_get(data: dict, *keys, default=None):
"""Safely navigate nested dictionary."""
for key in keys:
if isinstance(data, dict):
data = data.get(key, default)
elif isinstance(data, list) and isinstance(key, int):
data = data[key] if key < len(data) else default
else:
return default
return data
Helper Script
Use extract_json.py for command-line extraction:
python extract_json.py data.json --path "$.items[*].name" --output names.json
python extract_json.py data.json --flatten --output flat.json