MultiHub Forum

Full Version: Understanding latency trade-offs in distributed systems under peak load
I'm a solo developer trying to build a lightweight, offline-capable web app for field botanists to log plant observations, but I'm hitting a wall with the database sync logic. My prototype uses SQLite in the browser via the Origin Private File System for local storage, and a PostgreSQL backend. The core issue is handling merge conflicts when a user has been offline for a week in a remote area, modifies dozens of existing records on their device, and then reconnects—some of those same records may have been updated or deleted by other users on the server in the meantime. I have a basic "last write wins" approach, but that feels irresponsible for scientific data. I'm working with about 5,000 potential users, a near-zero budget for third-party sync services, and a self-imposed deadline of three months to have a robust sync engine. I need a conflict resolution strategy that prioritizes data integrity without overcomplicating the client-side code for slow mobile connections.
Adopt an offline-first CRDT approach (Yjs or Automerge) so that offline edits merge automatically on reconnect. Model each observation as a CRDT document and sync through a lightweight server that stores a change log and causal metadata. Resolve conflicts with deterministic rules: a delete on the server wins over a client edit, client edits to fields the server has not touched are kept, and anything still unresolved is flagged for review. This keeps the client logic minimal and should scale to 5k users. Run a small pilot first.
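To make the deterministic rules concrete, here is a rough sketch of how the server side could apply them to a single observation. It is a plain three-way field merge against the last-synced copy of the record, not the Yjs or Automerge API, and it assumes the server can look up that last-synced ("base") copy. The names `merge_observation` and `MergeResult` and the example fields are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class MergeResult:
    # Merged record, or None when the record stays deleted on the server.
    merged: Optional[Dict[str, Any]]
    # Field names a human should review (hypothetical review queue input).
    conflicts: List[str] = field(default_factory=list)


def merge_observation(
    base: Dict[str, Any],
    client: Dict[str, Any],
    server: Optional[Dict[str, Any]],
) -> MergeResult:
    """Three-way merge of one observation record.

    base   -- copy the client last synced (common ancestor)
    client -- the client's current copy after offline edits
    server -- the server's current copy, or None if it was deleted there

    Assumption: a missing field and a field set to None are treated the same.
    """
    # Rule 1: a server-side delete wins. Any client edits that would be
    # lost are reported so they can be reviewed instead of silently dropped.
    if server is None:
        lost = sorted(k for k in client if client.get(k) != base.get(k))
        return MergeResult(merged=None, conflicts=lost)

    merged: Dict[str, Any] = {}
    conflicts: List[str] = []
    for key in sorted(set(base) | set(client) | set(server)):
        b, c, s = base.get(key), client.get(key), server.get(key)
        if c == s:
            merged[key] = c          # both sides agree
        elif c == b:
            merged[key] = s          # only the server changed this field
        elif s == b:
            merged[key] = c          # Rule 2: only the client changed it
        else:
            merged[key] = s          # Rule 3: both changed it differently;
            conflicts.append(key)    # keep server value for now and flag it
    return MergeResult(merged=merged, conflicts=conflicts)


# Example with made-up data: count was edited on both sides.
base = {"species": "Quercus robur", "count": 3, "notes": ""}
client = {"species": "Quercus robur", "count": 5, "notes": "near stream"}
server = {"species": "Quercus petraea", "count": 4, "notes": ""}

result = merge_observation(base, client, server)
print(result.merged)
print(result.conflicts)
```

In this example the server's species correction and the client's note both survive, and only `count` ends up flagged. Keeping the server value for a flagged field is just one possible default; for scientific data you may prefer to store both values until someone reviews them.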