Easy Enrichment - Transaction Data Intelligence

If you've ever tried to build a personal finance app or expense tracker, you know the pain: bank transaction descriptions look like CHECKCARD 0315 AMZN MKTP US*2K1AB0C9Z AMZN.COM/BILL WA instead of simply "Amazon." Parsing these descriptions into clean merchant names is one of the hardest problems in fintech development.

The Problem: Why Bank Descriptions Are a Mess

Banks don't standardize transaction descriptions. What you get depends on the payment processor, the bank, and the merchant's terminal configuration. Here are real examples:

Raw Description	Actual Merchant
CHECKCARD 0315 AMZN MKTP US*2K1AB0C9Z	Amazon
TST* SWEETGREEN - DOWNT WASHINGTON DC	Sweetgreen
SQ *THE COFFEE BEAN 1234 Los Angeles CA	The Coffee Bean
UBER *TRIP HELP.UBER.COM CA	Uber
PY *SPOTIFY USA 877-778-8672 NY	Spotify
DD *DOORDASH PANERA BREAD 800-958-3...	DoorDash (Panera Bread)

Notice the patterns: processor prefixes (SQ *, TST*, DD *), trailing location data, phone numbers, reference codes, and truncated text. No two banks format these the same way.
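
Before writing any cleanup rules, it can help to see which kinds of noise a given description contains. The sketch below is only an illustration: the function name, the flag names, and the processor labels attached to each prefix are assumptions made for this example, not part of any library.

```python
import re

# Prefix strings are taken from the examples above; the labels are
# illustrative assumptions and the list is far from complete.
PROCESSOR_PREFIXES = {
    "SQ *": "Square",
    "TST*": "Toast",
    "DD *": "DoorDash",
}

PHONE_RE = re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")
# Trailing two-letter token such as " CA"; not validated against real state codes.
STATE_RE = re.compile(r"\s[A-Z]{2}$")
# Alphanumeric run of 8+ characters containing both a digit and a letter.
REFERENCE_RE = re.compile(r"\b(?=[A-Z0-9]*\d)(?=[A-Z0-9]*[A-Z])[A-Z0-9]{8,}\b")


def describe_noise(description: str) -> dict:
    """Report which noise patterns appear in a raw bank description."""
    text = description.upper().strip()
    processor = next(
        (name for prefix, name in PROCESSOR_PREFIXES.items()
         if text.startswith(prefix)),
        None,
    )
    return {
        "processor": processor,
        "has_phone": bool(PHONE_RE.search(text)),
        "has_trailing_state": bool(STATE_RE.search(text)),
        "has_reference_code": bool(REFERENCE_RE.search(text)),
    }
```

Running this over a sample of your own transactions gives a rough picture of which patterns dominate for your banks, which tells you where cleanup effort is likely to pay off.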

Approach 1: Regex (Quick & Dirty)

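The simplest approach is to strip known prefixes and suffixes with regular expressions. This gets you 60-70% of the way there for common merchants, but overall accuracy across all merchants is much lower (closer to 40%, as in the comparison table later in this article).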

Python Example

import re

def parse_transaction(description: str) -> str:
    """Basic regex-based transaction parser."""
    text = description.upper().strip()

    # Remove common prefixes
    prefixes = [
        r'^CHECKCARD \d+ ',
        r'^POS (DEBIT |PURCHASE )?',
        r'^PURCHASE AUTHORIZED ON \d+/\d+ ',
        r'^SQ \*',
        r'^TST\* ?',
        r'^DD \*',
        r'^PY \*',
        r'^UBER \*',
        r'^LYFT \*',
        r'^AMZN MKTP ',
    ]
    for prefix in prefixes:
        text = re.sub(prefix, '', text)

    # Remove trailing location (STATE abbreviation + ZIP)
    text = re.sub(r'\s+[A-Z]{2}\s*\d{5}(-\d{4})?$', '', text)

    # Remove trailing phone numbers
    text = re.sub(r'\s+\d{3}[\-.]\d{3}[\-.]\d{4}.*$', '', text)

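    # Remove trailing reference codes (must contain a digit, so plain
    # words such as "MARKET" are not stripped by mistake)
    text = re.sub(r'\s+(?=[A-Z0-9*#]*\d)[A-Z0-9*#]{6,}$', '', text)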

    # Remove trailing URLs
    text = re.sub(r'\s+\S+\.(COM|NET|ORG)\S*', '', text)

    return text.strip().title()

# Examples
print(parse_transaction("CHECKCARD 0315 AMZN MKTP US*2K1AB0C9Z"))
# => "Us*2K1Ab0C9Z" ← Not great!
print(parse_transaction("SQ *THE COFFEE BEAN 1234 Los Angeles CA"))
# => "The Coffee Bean 1234 Los Angeles Ca" ← Close but noisy

The problem is obvious: regex can't understand context. It doesn't know that "AMZN MKTP" means Amazon, or that "1234" is a store number and not part of the name. You end up maintaining an ever-growing list of patterns that still miss edge cases.

Node.js Example

function parseTransaction(description) {
  let text = description.toUpperCase().trim();

  // Remove common prefixes
  const prefixes = [
    /^CHECKCARD \d+ /,
    /^POS (DEBIT |PURCHASE )?/,
    /^SQ \*/,
    /^TST\* ?/,
    /^DD \*/,
    /^PY \*/,
  ];
  for (const prefix of prefixes) {
    text = text.replace(prefix, '');
  }

  // Remove trailing state + zip
  text = text.replace(/\s+[A-Z]{2}\s*\d{5}(-\d{4})?$/, '');

  // Remove trailing phone numbers
  text = text.replace(/\s+\d{3}[\-.]\d{3}[\-.]\d{4}.*$/, '');

  return text.trim();
}

console.log(parseTransaction("SQ *THE COFFEE BEAN 1234 Los Angeles CA 90001"));
// => "THE COFFEE BEAN 1234 LOS ANGELES" — still noisy

Approach 2: Lookup Table + Fuzzy Matching

A better approach is combining regex cleanup with a merchant database and fuzzy string matching. This handles known merchants well but still fails on local businesses.

from fuzzywuzzy import fuzz

KNOWN_MERCHANTS = {
    "AMZN": "Amazon",
    "AMAZON": "Amazon",
    "STARBUCKS": "Starbucks",
    "UBER": "Uber",
    "LYFT": "Lyft",
    "NETFLIX": "Netflix",
    "SPOTIFY": "Spotify",
    # ... hundreds more needed
}

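def match_merchant(cleaned_text: str) -> str | None:
    # Keys are upper-case, so normalize the input first
    # (parse_transaction returns title case).
    text = cleaned_text.upper()
    # Exact substring match against known patterns
    for pattern, name in KNOWN_MERCHANTS.items():
        if pattern in text:
            return name
    # Fuzzy fallback: keep the best-scoring merchant above the threshold
    best_match = None
    best_score = 80
    for pattern, name in KNOWN_MERCHANTS.items():
        score = fuzz.partial_ratio(text, pattern)
        if score > best_score:
            best_score = score
            best_match = name
    return best_match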

This works for major chains, but you need to maintain a database of thousands of merchants. And you still can't handle local businesses, international merchants, or new companies.
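
One weakness of plain substring checks is that a short pattern such as "UBER" also matches inside unrelated words like "TUBERS". A possible variation, sketched below, matches whole tokens first and only then falls back to fuzzy matching. Note the substitutions: this uses the standard library's difflib instead of fuzzywuzzy, and the alias table, function name, and 0.85 cutoff are assumptions chosen for illustration.

```python
import difflib
import re
from typing import Optional

# Hypothetical alias table: statement token -> display name.
ALIASES = {
    "AMZN": "Amazon",
    "AMAZON": "Amazon",
    "STARBUCKS": "Starbucks",
    "UBER": "Uber",
    "NETFLIX": "Netflix",
    "SPOTIFY": "Spotify",
}


def match_merchant_tokens(cleaned_text: str, cutoff: float = 0.85) -> Optional[str]:
    """Match whole tokens exactly, then fall back to a fuzzy token match."""
    tokens = re.findall(r"[A-Z]+", cleaned_text.upper())

    # Pass 1: exact token match, so "UBER" never matches inside "TUBERS".
    for token in tokens:
        if token in ALIASES:
            return ALIASES[token]

    # Pass 2: fuzzy match per token; very short tokens are skipped
    # because they produce too many accidental matches.
    for token in tokens:
        if len(token) < 4:
            continue
        close = difflib.get_close_matches(token, list(ALIASES), n=1, cutoff=cutoff)
        if close:
            return ALIASES[close[0]]
    return None


print(match_merchant_tokens("Amzn Mktp Us"))           # Amazon
print(match_merchant_tokens("SPOTFY USA"))             # Spotify (typo tolerated)
print(match_merchant_tokens("TUBERS AND ROOTS CAFE"))  # None
```

The cutoff trades recall for precision: lowering it catches more truncated names but also starts matching unrelated words, so it is worth tuning against a labeled sample rather than guessing.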

Approach 3: Use a Transaction Enrichment API

The most accurate and maintainable solution is to use a dedicated transaction enrichment API. These services maintain massive merchant databases, use AI models trained on billions of transactions, and handle all the edge cases for you.

Python — Using Easy Enrichment

import requests

API_KEY = "your_api_key"

def enrich_transaction(description: str, amount: float | None = None) -> dict:
    """Parse a bank transaction using Easy Enrichment API."""
    response = requests.post(
        "https://api.easyenrichment.com/enrich",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            "description": description,
            "amount": amount,
            "currency": "USD"
        },
        timeout=10,  # don't hang forever on a stalled connection
    )
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    return response.json()

# Try it
result = enrich_transaction("CHECKCARD 0315 AMZN MKTP US*2K1AB0C9Z", 49.99)
print(result["merchant_name"])  # "Amazon"
print(result["category"])       # "Shopping"
print(result["logo_url"])       # "https://logo.easyenrichment.com/amazon.com"
print(result["mcc_code"])       # "5942"
print(result["confidence"])     # 0.97
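
In a real app you probably don't want to show a low-confidence guess or call the API twice for the same description. Here is a minimal sketch of a wrapper that handles both. It assumes the response contains the merchant_name and confidence fields shown above; the wrapper name, the in-memory cache, and the 0.8 threshold are illustrative assumptions, not part of the API.

```python
from typing import Callable


def make_display_name_resolver(
    enrich: Callable[[str], dict],
    min_confidence: float = 0.8,  # assumed threshold; tune on your own data
) -> Callable[[str], str]:
    """Wrap an enrichment call with a cache and a low-confidence fallback."""
    cache = {}  # normalized description -> display name

    def resolve(description: str) -> str:
        key = description.strip().upper()
        if key in cache:
            return cache[key]
        try:
            result = enrich(description)
        except Exception:
            # Network or API failure: show the raw text and do not cache,
            # so the next call retries.
            return description
        name = result.get("merchant_name")
        confident = result.get("confidence", 0.0) >= min_confidence
        display = name if (name and confident) else description
        cache[key] = display
        return display

    return resolve
```

You would pass in a function like enrich_transaction from the example above. Because the enrichment call is injected, the same wrapper can be exercised in tests with a stub that returns canned responses.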

Node.js — Using Easy Enrichment

const API_KEY = 'your_api_key';

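async function enrichTransaction(description, amount) {
  const response = await fetch('https://api.easyenrichment.com/enrich', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      description,
      amount,
      currency: 'USD',
    }),
  });
  // fetch only rejects on network errors, so check the HTTP status explicitly
  if (!response.ok) {
    throw new Error(`Enrichment request failed: ${response.status}`);
  }
  return response.json();
}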

// Usage (top-level await requires an ES module)
const result = await enrichTransaction(
  'SQ *THE COFFEE BEAN 1234 Los Angeles CA',
  5.75
);
console.log(result.merchant_name); // "The Coffee Bean & Tea Leaf"
console.log(result.category);      // "Food & Drink"
console.log(result.subcategory);   // "Coffee Shops"

Comparing the Approaches

Criteria	Regex	Lookup + Fuzzy	Enrichment API
Accuracy	~40%	~70%	~95%+
Local businesses	Fails	Fails	Works
Returns categories	No	Manual	Yes
Returns logos	No	No	Yes
Maintenance effort	High	High	None
Cost	Free	Free	$0.002/tx
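
The accuracy figures in the table are approximate, and your own mix of banks and merchants may behave differently. A small harness like the one below lets you measure any parser against a hand-labeled sample. The sample rows come from the examples earlier in this article; the first_word_parser baseline and the function names are hypothetical, included only so the sketch runs on its own.

```python
import re
from typing import Callable, Iterable, Tuple


def accuracy(parser: Callable[[str], str],
             labeled: Iterable[Tuple[str, str]]) -> float:
    """Fraction of descriptions whose parsed name equals the expected name."""
    rows = list(labeled)
    if not rows:
        return 0.0
    hits = sum(
        1 for raw, expected in rows
        if parser(raw).strip().casefold() == expected.casefold()
    )
    return hits / len(rows)


def first_word_parser(description: str) -> str:
    """Deliberately naive baseline: strip a processor prefix, keep one word."""
    text = re.sub(r"^(SQ|TST|DD|PY) ?\*\s*", "", description.upper().strip())
    words = text.split()
    return words[0] if words else ""


SAMPLE = [
    ("TST* SWEETGREEN - DOWNT WASHINGTON DC", "Sweetgreen"),
    ("SQ *THE COFFEE BEAN 1234 Los Angeles CA", "The Coffee Bean"),
    ("UBER *TRIP HELP.UBER.COM CA", "Uber"),
    ("PY *SPOTIFY USA 877-778-8672 NY", "Spotify"),
]

print(f"{accuracy(first_word_parser, SAMPLE):.0%}")  # 75% on this toy sample
```

Four rows prove nothing, of course. The point is the shape of the measurement: a few hundred labeled transactions from your own data will tell you far more than any published number.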

When to Use Each Approach

Regex only: Quick prototyping, internal tools where accuracy doesn't matter much.
Lookup + fuzzy matching: When you only deal with a small set of known merchants (e.g., internal expense tracking for a corporate card).
Enrichment API: Any user-facing application where you need accurate merchant names, categories, and logos. The cost per transaction is negligible compared to the engineering time saved.
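
To put the per-transaction cost in perspective, here is a back-of-the-envelope estimate using the $0.002/tx figure from the table. It assumes flat pricing with no free tier or volume discounts, and the user and transaction counts are made-up inputs, so treat it as arithmetic rather than a quote.

```python
from decimal import Decimal

PRICE_PER_TX = Decimal("0.002")  # per-transaction price from the comparison table


def monthly_cost(users: int, tx_per_user: int) -> Decimal:
    """Estimated monthly spend under flat per-transaction pricing (an assumption)."""
    return users * tx_per_user * PRICE_PER_TX


# Hypothetical app: 1,000 users with 60 transactions each per month
print(monthly_cost(1000, 60))  # 120.000
```

That works out to about $120 a month for 60,000 transactions under these assumptions.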

Stop Writing Regex — Use Easy Enrichment

Parse any bank transaction description into a clean merchant name, category, logo, and more. Start free with 500 requests/month.


How to Parse Bank Transaction Descriptions (with Code Examples)