I'm redesigning our company's legacy internal API to a modern RESTful structure, and while I understand the basic principles, I'm getting bogged down in the practical details of REST API design best practices that affect long-term developer experience. I'm debating between HATEOAS for discoverability versus a simpler, more predictable endpoint structure, and I'm unsure how to best version the API without breaking existing integrations. For architects who have maintained large-scale REST APIs, what are your concrete rules for resource naming, error handling, and pagination that you've found minimize support tickets? How do you balance strict adherence to REST constraints with the practical need for performance, especially for complex queries that don't map cleanly to simple CRUD operations?
Non-negotiable: don't chase hypermedia for its own sake. HATEOAS can help discoverability but adds coupling and testing friction. Start with a clean, versioned REST surface plus strong docs, then consider hypermedia only if you have multi-step workflows that truly benefit from it.
Resource naming & versioning: Use plural nouns, avoid verbs, and keep nesting to 3–4 levels. Example: /api/v1/users/{id}/orders; keep the top-level resources stable and evolve through versions. For versioning, bump the major version in the URL (v1, v2) and provide a deprecation window with migration guides; try to minimize breaking changes by aiming for extensibility via query params or optional fields rather than replacing endpoints.
Error handling: adopt RFC 7807 Problem Details. In every error include type (a URL), title, status, detail, and instance; add a machine-readable errorCode for client logic. Use a single place to map internal exceptions to HTTP status and payload; include a correlation-id to trace issues across logs.
Pagination: prefer cursor-based or page tokens for large lists; avoid offset-based pagination on large datasets due to performance. Return items plus nextCursor or nextPageUrl; enforce a sane page size (e.g., 20–100) and provide total count only if you can compute it cheaply. Ensure deterministic ordering by a stable field (created_at, id).