
API Integration: Best Case vs Worst Case

The first two posts covered why API integration is difficult and the three categories of challenges we’ve encountered. Now I’ll show you the framework we use to evaluate APIs before committing engineering time, and what we built to handle the worst cases.

The Evaluation Framework

When we look at a new API for Plotter, we assess it across the same three categories: Technical Complexity, Data Reliability, and Compliance & Operations. This led us to a 12-point checklist, which I'll use to describe the best- and worst-case scenarios for an API:

Best-Case Scenario (Ideal API)

1. Technical Complexity

API Age

A modern REST API with stable versioning uses predictable patterns that integrate cleanly with existing tools. Its consistency reduces the amount of custom logic required during development.

API Authentication

Authentication relies on a simple API key passed in the header. There are no token refresh cycles, OAuth redirects, or scope-management headaches.
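This is roughly what best-case auth looks like in practice. A minimal sketch, assuming a hypothetical endpoint and header name (`X-API-Key`); real vendors vary:

```python
# Minimal sketch of best-case auth: one static key in a request header.
# The URL and header name are illustrative, not any specific vendor's.
import urllib.request

def build_request(url: str, api_key: str) -> urllib.request.Request:
    """Attach the key as a header -- no OAuth flow, no token refresh."""
    return urllib.request.Request(url, headers={"X-API-Key": api_key})

req = build_request("https://api.example.com/v1/prices", "my-secret-key")
```

One line of setup, and every request after that is stateless: no refresh cycles to schedule, no expiry to handle mid-pipeline.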

API Technical Resources

Ingesting the API requires only lightweight compute jobs that finish quickly. No special hardware, large-memory environments, or orchestration layers are necessary.

API Endpoint Format

Endpoints return clean JSON or CSV with consistent data types. There are no unusual formats, mixed date structures, or quirks that break parsers.

API Accessibility

All data is retrievable with one or two standard HTTPS requests. No separate client installation, chained lookups, or multi-step processes are required.

API System Configuration

The data maps cleanly into your existing warehouse and ETL pipelines. Minimal transformation is needed because the API’s schema aligns with internal structures.

API Rate Limits & Size

Rate limits are generous enough to support parallel ingestion. Data payloads are moderate, allowing fast transfer and quick processing.

Example: Alpaca fits remarkably well into this best-case profile. Its REST endpoints are clean, authentication is trivial, and documentation is accurate, making integration feel nearly plug-and-play.

2. Data Reliability

API Documentation

Documentation is accurate, up to date, and matches real responses exactly. Examples work as advertised, reducing debugging time dramatically.

API Endpoint Updating

The API clearly lists all tables along with “last updated” timestamps. This allows true incremental updates without needing to reload everything.
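The incremental-update check such an API enables is almost trivial. A hypothetical sketch, assuming the API exposes a per-table "last updated" timestamp we can compare against our local copy:

```python
# Hypothetical incremental-update check: compare each table's advertised
# "last updated" timestamp against the timestamp of our local copy.
from datetime import datetime

def tables_to_refresh(remote: dict, local: dict) -> list:
    """Return only the tables whose remote data is newer than ours."""
    return sorted(name for name, ts in remote.items()
                  if name not in local or ts > local[name])

remote = {"gdp": datetime(2024, 6, 1), "cpi": datetime(2024, 5, 15)}
local  = {"gdp": datetime(2024, 6, 1), "cpi": datetime(2024, 5, 1)}
# Only "cpi" changed remotely, so only it needs a reload.
```

Without those timestamps, the only safe option is reloading everything on every run.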

API Data Uniformity

All tables follow the same structure with consistent logic. No cross-referencing is required because each dataset is self-contained and fully documented.

Example: EODHD demonstrates this well with standard formats, consistent schema design, and clean incremental updates. Rate limits exist, but for most workloads they remain easy to manage.

3. Compliance & Operations

API Accessibility

Access is provided through a straightforward, predictable subscription model. The paywall ensures stable uptime, support, and long-term API maintenance.

API GenAI Access

The API provides proprietary or frequently updated data that GenAI models cannot reliably supply. Integrating it offers clear added value beyond what AI can generate.

Real Pattern: Most commercial APIs fall near this ideal — clear licensing, stable operations, and predictable versioning that minimizes surprises during long-term maintenance.

Worst-Case Scenario (Painful API)

1. Technical Complexity

API Age

The API uses legacy SOAP or custom architectures with inconsistent patterns. This requires specialized logic and frequent troubleshooting just to keep the integration running.

API Authentication

Authentication involves multi-step OAuth flows, token refresh cycles, or cryptographic signatures. Tokens may expire mid-process, causing failures and requiring complex retry logic.
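Here is a sketch of the retry logic that short-lived tokens force on a pipeline. `fetch` and `refresh_token` are stand-ins for real HTTP calls, not any particular client library:

```python
# Sketch of the retry logic short-lived tokens force on a pipeline.
# fetch and refresh_token are stand-ins for real HTTP calls.
def call_with_refresh(fetch, refresh_token, token, max_refreshes=1):
    """Retry with a fresh token when a call fails with HTTP 401."""
    for _ in range(max_refreshes + 1):
        status, body = fetch(token)
        if status != 401:
            return body
        token = refresh_token()  # token expired mid-process
    raise RuntimeError("still unauthorized after token refresh")
```

Every call site now has to tolerate a mid-flight expiry, which is exactly the kind of incidental complexity a static API key avoids.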

API Technical Resources

The API demands high-memory machines or constant parallel processing to handle large payloads. Costs rise quickly because long-running jobs or special infrastructure become necessary.

API Endpoint Format

Endpoints return a mix of JSON, XML, CSV, and even compressed archives. Inconsistent formats and mixed data types routinely break ingestion pipelines.
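A mixed-format API forces a dispatch layer like the sketch below in front of the pipeline. The content types and routing are illustrative; real responses often mislabel themselves, which makes this even messier:

```python
# Sketch of the format-dispatch layer a mixed-format API forces on you.
import csv
import io
import json
import xml.etree.ElementTree as ET

def parse_payload(content_type: str, raw: str):
    """Route each response to the right parser by declared content type."""
    if "json" in content_type:
        return json.loads(raw)
    if "csv" in content_type:
        return list(csv.DictReader(io.StringIO(raw)))
    if "xml" in content_type:
        return ET.fromstring(raw)
    raise ValueError(f"unsupported content type: {content_type}")
```

Each branch produces a different shape (dict, list of dicts, element tree), so normalization work only begins after parsing succeeds.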

API Accessibility

Data retrieval requires long chains of dependent requests, such as categories → sections → items. Custom SDKs or local daemons must be installed and maintained.
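The shape of those chained lookups can be sketched as a nested walk, with the `list_*` callables standing in for real HTTP requests:

```python
# Sketch of a chained lookup: three dependent calls just to reach the data.
# The list_* callables stand in for real HTTP requests.
def collect_items(list_categories, list_sections, list_items):
    """Walk categories -> sections -> items and flatten the results."""
    items = []
    for category in list_categories():
        for section in list_sections(category):
            items.extend(list_items(category, section))
    return items
```

Note the request count multiplies at each level: 10 categories with 10 sections each already means over 100 calls before a single item arrives.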

API System Configuration

The API’s schema conflicts with internal systems, forcing heavy restructuring of every dataset. Metadata is incomplete or inconsistent, requiring guesswork and manual repairs.

API Rate Limits & Size

Strict rate limits prevent parallel loading and slow overall ingestion. Large payloads or millions of rows require thousands of sequential requests that take hours or days to complete.
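The back-of-envelope math makes the pain concrete. The numbers below are illustrative, not any specific vendor's limits:

```python
# Back-of-envelope math for a rate-limited sequential load.
# The example numbers are illustrative, not a specific vendor's limits.
import math

def load_hours(total_rows: int, rows_per_request: int,
               requests_per_minute: int) -> float:
    """Hours needed to pull total_rows sequentially under a rate limit."""
    requests_needed = math.ceil(total_rows / rows_per_request)
    return requests_needed / requests_per_minute / 60

# 5M rows at 1,000 rows/request and 60 requests/minute:
# 5,000 requests, roughly 1.4 hours of pure sequential fetching.
```

And that is the optimistic case: retries, pagination quirks, and off-peak-only windows stretch it further.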

Example: BEA lands close to this worst-case end: legacy architecture, PDF-only documentation, and a dozen sub-sources each requiring custom handling. Simply understanding how the system fit together took significant engineering time.

2. Data Reliability

API Documentation

Documentation is outdated or incorrect, with examples that fail when used. Developers must reverse-engineer logic by inspecting raw responses.

API Endpoint Updating

There is no clear inventory of tables or update indicators. This forces full reloads, deduplication, and constant uncertainty about whether data is actually current.
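When full reloads are unavoidable, a deduplication pass like the sketch below becomes a standing fixture of the pipeline. The key column is hypothetical:

```python
# Sketch of the dedup step a full reload forces: without update
# indicators, we re-pull everything and keep one row per key.
def dedupe(rows: list, key: str) -> list:
    """Keep the last occurrence of each key, preserving first-seen order."""
    latest = {}
    for row in rows:
        latest[row[key]] = row
    return list(latest.values())

rows = [{"id": 1, "v": "old"}, {"id": 2, "v": "x"}, {"id": 1, "v": "new"}]
# -> two rows, with id 1 carrying "new"
```

"Keep the last occurrence" is itself a guess about which copy is current, which is the uncertainty the missing update indicators create.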

API Data Uniformity

Data is fragmented across multiple dependent tables requiring complex joins. Relationships are undocumented and rely on opaque IDs, increasing the likelihood of errors.

Example: BLS scatters data across dozens of metadata tables with hidden, undocumented relationships, forcing us to dynamically detect cross-references. Econoday is even harder: you can only fetch events by time windows, release times drift unpredictably, and continuous polling with backoff is required to stay current.
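The polling-with-backoff approach mentioned above can be sketched as a simple delay schedule; the base, factor, and cap are illustrative defaults, not our production values:

```python
# Sketch of an exponential backoff schedule for polling a feed whose
# release times drift. Parameters are illustrative defaults.
def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   cap: float = 60.0, tries: int = 6) -> list:
    """Seconds to wait between successive polls, capped at `cap`."""
    return [min(cap, base * factor ** i) for i in range(tries)]

# backoff_delays() -> [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
```

Capping the delay keeps worst-case staleness bounded while still backing off enough to stay under the rate limit.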

3. Compliance & Operations

API Accessibility

Paywall tiers are opaque or restrictive, limiting which data you can access. Sudden changes in pricing or access levels can break existing workflows.

API GenAI Access

The API’s data is already freely accessible through GenAI models, reducing its strategic value. Integrating it still requires heavy engineering without offering distinct benefits.

Real Pattern: Government APIs like BEA and BLS offer permissive licensing but unpredictable operational behavior. Their older architectures and sudden changes can create recurring maintenance challenges despite being “free.”

The 10-25-65 Split

Based on integrating 50+ sources for Plotter, here’s the rough distribution:

  • 10% are best-case: Modern, clean, well-documented. Integration takes a few days with minimal surprises.
  • 25% are worst-case: Legacy systems, messy data, poor docs. These require 5-10× more engineering effort than you’d expect going in.
  • 65% fall in between: Manageable but not trivial. Usually 2-4 weeks per source depending on complexity and how much of your tooling is reusable.

The problem is you often can’t tell which bucket an API falls into until you’re already committed. The documentation might look fine, but then you discover the date formats are inconsistent. The authentication might seem simple, but then token rotation breaks your pipeline every few weeks.

This is why upfront evaluation matters. If you can spot red flags early, you can either avoid the integration entirely or budget appropriately for the work. At Plotter, these best practices drove a series of infrastructure improvements to better handle API ETL, which is the topic of the next post in this series: Plotter's API Strategy.