Tutorial · March 22, 2025 · 13 min read

Transaction Enrichment in Python: Complete Guide with Code Examples

A hands-on guide to integrating transaction enrichment into your Python applications. From basic API calls to pandas batch processing and building a CLI tool.

If you're building a fintech app, expense tracker, or accounting tool in Python, raw bank transaction descriptions are one of the biggest UX headaches you'll face. Strings like "POS 4829 AMZN MKTP US" mean nothing to your users. Transaction enrichment transforms these cryptic strings into clean merchant names, logos, and categories — and Python makes it straightforward to integrate.

Prerequisites

Before we start, make sure you have the following:

  • Python 3.9+ installed on your machine (the examples use built-in generic type hints such as list[str], which fail at import time on 3.8)
  • An API key from Easy Enrichment (free tier available at the dashboard)
  • requests and pandas libraries installed
pip install requests pandas

Basic: Your First Enrichment Call

The simplest integration is a single POST request to the /enrich endpoint. Pass a raw transaction description and get back structured merchant data.

import requests

API_KEY = "your_api_key_here"
BASE_URL = "https://api.easyenrichment.com"

def enrich_transaction(description: str) -> dict:
    """Enrich a single bank transaction description."""
    response = requests.post(
        f"{BASE_URL}/enrich",
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json"
        },
        json={"description": description},
        timeout=10,  # without a timeout, requests can wait indefinitely
    )
    response.raise_for_status()
    return response.json()

# Example usage
result = enrich_transaction("POS 4829 AMZN MKTP US")
print(result)
# {
#   "merchant": "Amazon",
#   "category": "Shopping",
#   "logo_url": "https://logo.clearbit.com/amazon.com",
#   "website": "amazon.com",
#   "confidence": 0.97
# }

That's it for a basic call. The API returns the cleaned merchant name, a spending category, the merchant's logo URL, their website, and a confidence score. Let's build on this.
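One practical use of the confidence score is deciding when to trust the enriched name. The snippet below is a minimal sketch, assuming the response fields shown in the example output above. The `display_name` helper and the 0.8 threshold are illustrative choices, not part of the API.

```python
def display_name(raw_description: str, result: dict, min_confidence: float = 0.8) -> str:
    """Return the enriched merchant name, or the raw description when the match is weak."""
    merchant = result.get("merchant")
    confidence = result.get("confidence") or 0.0
    if merchant and confidence >= min_confidence:
        return merchant
    return raw_description


# Sample dicts shaped like the example output above (not real API responses)
strong = {"merchant": "Amazon", "confidence": 0.97}
weak = {"merchant": "Amazon", "confidence": 0.41}

print(display_name("POS 4829 AMZN MKTP US", strong))  # Amazon
print(display_name("POS 4829 AMZN MKTP US", weak))    # POS 4829 AMZN MKTP US
```

Falling back to the raw string keeps a low-confidence guess from showing the wrong merchant to a user. Tune the threshold against your own data.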

Batch Processing with Pandas

In production, you're rarely enriching one transaction at a time. Here's how to process a CSV of bank transactions using pandas and add enrichment columns in bulk.

import pandas as pd
import requests
from time import sleep

API_KEY = "your_api_key_here"
BASE_URL = "https://api.easyenrichment.com"

def enrich_batch(descriptions: list[str]) -> list[dict]:
    """Enrich a list of transaction descriptions."""
    results = []
    for desc in descriptions:
        try:
            resp = requests.post(
                f"{BASE_URL}/enrich",
                headers={
                    "Authorization": f"Bearer {API_KEY}",
                    "Content-Type": "application/json"
                },
                json={"description": desc},
                timeout=10
            )
            resp.raise_for_status()
            results.append(resp.json())
        except requests.RequestException as e:
            results.append({"error": str(e), "merchant": None})
        sleep(0.1)  # respect rate limits
    return results

# Load your transactions
df = pd.read_csv("transactions.csv")

# Enrich all descriptions
enriched = enrich_batch(df["description"].tolist())

# Add enrichment columns to the DataFrame
df["merchant_name"] = [r.get("merchant") for r in enriched]
df["category"] = [r.get("category") for r in enriched]
df["logo_url"] = [r.get("logo_url") for r in enriched]
df["confidence"] = [r.get("confidence") for r in enriched]

# Save enriched data
df.to_csv("transactions_enriched.csv", index=False)
print(f"Enriched {len(df)} transactions")
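The loop above sends one request at a time, which gets slow for large files. A concurrent variant can be sketched with asyncio. This is a rough outline under stated assumptions: `enrich_concurrently` and `fake_enrich` are hypothetical names, and the actual HTTP call is passed in as a coroutine (in a real app it would wrap an async HTTP client), so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable


async def enrich_concurrently(
    descriptions: list[str],
    enrich_one: Callable[[str], Awaitable[dict]],
    max_concurrency: int = 5,
) -> list[dict]:
    """Run enrich_one over all descriptions with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(desc: str) -> dict:
        async with semaphore:
            try:
                return await enrich_one(desc)
            except Exception as exc:  # keep one failure from aborting the batch
                return {"error": str(exc), "merchant": None}

    # gather() preserves input order, so results line up with descriptions
    return await asyncio.gather(*(guarded(d) for d in descriptions))


async def fake_enrich(desc: str) -> dict:
    """Stand-in for a real async HTTP call; returns a dummy result."""
    await asyncio.sleep(0.01)
    return {"merchant": desc.title(), "confidence": 1.0}


results = asyncio.run(enrich_concurrently(["starbucks", "uber"], fake_enrich))
print(results)
```

The semaphore caps the number of simultaneous requests, which serves the same purpose as the sleep() call in the sequential version: staying within the API's rate limits.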

Error Handling and Retries

Production code needs to handle network errors, rate limits, and API downtime gracefully. Here's a robust wrapper with exponential backoff.

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def create_session() -> requests.Session:
    """Create a requests session with retry logic."""
    session = requests.Session()
    retry = Retry(
        total=3,
        backoff_factor=0.5,
        status_forcelist=[429, 500, 502, 503, 504],
        allowed_methods=["POST"]
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("https://", adapter)
    session.headers.update({
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    })
    return session

session = create_session()

def enrich_safe(description: str) -> dict:
    """Enrich with automatic retries and error handling."""
    try:
        resp = session.post(
            f"{BASE_URL}/enrich",
            json={"description": description},
            timeout=15
        )
        resp.raise_for_status()
        return resp.json()
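    except requests.exceptions.HTTPError as e:
        if e.response is not None and e.response.status_code == 429:
            print("Rate limited: back off and retry")
        return {"error": str(e), "merchant": None}
    except requests.exceptions.ConnectionError:
        return {"error": "Connection failed", "merchant": None}
    except requests.exceptions.RequestException as e:
        # Timeouts and RetryError (raised once the session's retries are exhausted)
        return {"error": str(e), "merchant": None}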
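To make the idea of exponential backoff concrete, here is a hand-rolled version. It is only a sketch: `backoff_delays` and `call_with_backoff` are hypothetical helpers, and the schedule they produce is not necessarily the exact one urllib3's Retry computes internally. It may be useful when retry logic has to live in application code rather than in the HTTP adapter.

```python
import random
import time
from typing import Callable, Optional


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 8.0) -> list[float]:
    """Delays before each retry: base, 2*base, 4*base, ... capped at `cap` seconds."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]


def call_with_backoff(
    func: Callable[[], dict],
    retries: int = 3,
    base: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call func(); on exception, wait with exponential backoff plus jitter and retry."""
    last_error: Optional[Exception] = None
    # First attempt runs immediately; each retry waits longer than the last
    for delay in [0.0] + backoff_delays(retries, base):
        if delay:
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out retries
        try:
            return func()
        except Exception as exc:
            last_error = exc
    return {"error": str(last_error), "merchant": None}


print(backoff_delays(4))  # [0.5, 1.0, 2.0, 4.0]
```

Doubling the wait after each failure gives a struggling server room to recover, and the random jitter keeps many clients from retrying in lockstep.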

Caching to Reduce API Calls

Many transaction descriptions repeat (e.g., your users buy coffee at the same place daily). A simple cache dramatically reduces API usage and cost.

import hashlib
import json
from pathlib import Path

CACHE_DIR = Path(".enrichment_cache")
CACHE_DIR.mkdir(exist_ok=True)

def get_cache_key(description: str) -> str:
    """Generate a cache key from the transaction description."""
    normalized = description.strip().lower()
    return hashlib.md5(normalized.encode()).hexdigest()

def enrich_with_cache(description: str) -> dict:
    """Enrich with file-based caching."""
    key = get_cache_key(description)
    cache_file = CACHE_DIR / f"{key}.json"

    # Check cache first
    if cache_file.exists():
        return json.loads(cache_file.read_text())

    # Call API
    result = enrich_safe(description)

    # Cache successful results
    if "error" not in result:
        cache_file.write_text(json.dumps(result))

    return result

Production tip

For production systems, replace the file-based cache with Redis or Memcached. Set a TTL of 24-48 hours so merchant data stays fresh. Merchant names rarely change, but logos and categories can be updated.
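The TTL idea can be illustrated without running Redis. The class below is a small in-memory stand-in, assuming a single process. `TTLCache` is a hypothetical name, and a shared cache server would handle expiry for you.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._store: dict[str, tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: drop it so the caller refetches
            return None
        return value

    def set(self, key: str, value: dict) -> None:
        self._store[key] = (self.clock() + self.ttl_seconds, value)


cache = TTLCache(ttl_seconds=24 * 60 * 60)  # 24 hours, the low end of the range above
cache.set("pos 4829 amzn mktp us", {"merchant": "Amazon"})
print(cache.get("pos 4829 amzn mktp us"))  # {'merchant': 'Amazon'}
```

The injectable clock exists only to make expiry easy to test; production code can rely on the default.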

Building a CLI Tool

Let's put it all together into a CLI tool you can use to enrich transactions from the terminal or pipe into other scripts.

#!/usr/bin/env python3
"""enrich_cli.py - Enrich bank transactions from the command line."""

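import argparse
import json

import pandas as pd

# enrich_batch() and enrich_with_cache() are the functions defined in the
# earlier sections; paste them into this file or import them from your own module.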

def main():
    parser = argparse.ArgumentParser(
        description="Enrich bank transaction descriptions"
    )
    parser.add_argument(
        "input",
        help="Transaction description or CSV file path"
    )
    parser.add_argument(
        "--csv", action="store_true",
        help="Treat input as a CSV file"
    )
    parser.add_argument(
        "--column", default="description",
        help="CSV column name containing descriptions"
    )
    parser.add_argument(
        "--output", "-o",
        help="Output file path (default: stdout)"
    )
    args = parser.parse_args()

    if args.csv:
        df = pd.read_csv(args.input)
        enriched = enrich_batch(df[args.column].tolist())
        df["merchant"] = [r.get("merchant") for r in enriched]
        df["category"] = [r.get("category") for r in enriched]
        if args.output:
            df.to_csv(args.output, index=False)
        else:
            print(df.to_string())
    else:
        result = enrich_with_cache(args.input)
        print(json.dumps(result, indent=2))

if __name__ == "__main__":
    main()

Usage examples:

# Enrich a single transaction
python enrich_cli.py "CHECKCARD 1234 STARBUCKS SEATTLE"

# Enrich a CSV file
python enrich_cli.py transactions.csv --csv --column description -o enriched.csv

Using Additional Endpoints

Beyond the core /enrich endpoint, the API offers specialized lookups for deeper data:

  • /enrich/company — Get detailed company info (industry, employee count, funding)
  • /enrich/domain — Look up a merchant by domain name
  • /enrich/social — Retrieve social media profiles for a merchant
  • /enrich/person — Identify individuals associated with a business
# Get detailed company data after enrichment
result = enrich_transaction("UBER *TRIP HELP.UBER.COM")

if result.get("website"):
    company = requests.post(
        f"{BASE_URL}/enrich/company",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"domain": result["website"]}
    ).json()
    print(f"Industry: {company.get('industry')}")
    print(f"Category: {company.get('category')}")

Performance Tips

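  • Use connection pooling: The requests.Session() object reuses TCP connections, so repeated calls skip the TCP and TLS handshake. How much latency this saves depends on your network, so measure it in your own environment.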
  • Batch with asyncio: For high-volume processing, use aiohttp with asyncio.gather() to run concurrent requests.
  • Normalize first: Strip whitespace, lowercase, and remove card numbers before sending — this improves cache hit rates.
  • Handle duplicates: De-duplicate descriptions before calling the API. Enrich unique strings only, then map results back.
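The normalization and de-duplication tips combine naturally. The following is a minimal sketch: `normalize` and `enrich_unique` are hypothetical helpers, the enrichment call is passed in as a function, and treating any run of four or more digits as a card or reference number is an assumption you should check against your own data.

```python
import re
from typing import Callable


def normalize(description: str) -> str:
    """Lowercase, drop long digit runs, and collapse whitespace."""
    text = description.lower()
    text = re.sub(r"\b\d{4,}\b", "", text)  # assumption: 4+ digits is a card/reference number
    return re.sub(r"\s+", " ", text).strip()


def enrich_unique(descriptions: list[str], enrich: Callable[[str], dict]) -> list[dict]:
    """Call enrich once per distinct normalized description, then map results back in order."""
    cache: dict[str, dict] = {}
    for desc in descriptions:
        key = normalize(desc)
        if key not in cache:
            cache[key] = enrich(key)
    return [cache[normalize(d)] for d in descriptions]


calls = []

def fake_enrich(desc: str) -> dict:
    """Stand-in for the real API call; records what it was asked to enrich."""
    calls.append(desc)
    return {"merchant": desc}

rows = ["POS 4829 AMZN MKTP US", "pos 5512 amzn mktp us", "CHECKCARD 1234 STARBUCKS SEATTLE"]
results = enrich_unique(rows, fake_enrich)
print(len(rows), "rows,", len(calls), "API calls")  # 3 rows, 2 API calls
```

In the pandas workflow, the same effect could be had by enriching `df["description"].map(normalize).unique()` and mapping the results back onto the column.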

Start enriching transactions in Python today

Get your free API key and start with 100 enrichments per month on the free tier. No credit card required.
