Senior / Staff Platform Engineer

We're not looking for someone to manage servers. We're looking for an engineer who sees the failure modes in distributed, event-driven systems before they show up in production.

This is a role for someone who treats operational excellence as product work: deterministic processing, recoverable failures, and systems that are understandable under pressure.

About Blue Language Labs

Blue is building foundational infrastructure for the AI economy: a protocol for structured, verifiable, multi-party agreements that execute deterministically, leave an audit trail, and work across real-world systems.

We are building the trust layer AI agents need to coordinate safely across company boundaries, without replacing the financial rails businesses already use.

This is the kind of work that looks obvious only in hindsight: difficult systems problems, a small team, and product decisions that move directly into production.

PayNotes

Programmable payments with conditional capture, milestone releases, refunds, vouchers, and partner settlement.

MyOS

An orchestration layer for merchants and agents that makes cross-platform trust practical without exposing the protocol directly.

The System You'll Own

MyOS processes documents, payments, and agent interactions through a distributed, event-driven pipeline where correctness is operational, not abstract.

State transitions must be recoverable, ordering guarantees must hold, and a bad deployment cannot silently break process guarantees across live workflows.

What you'll do

  • Own AWS infrastructure end-to-end: architecture, cost, security, reliability, and compliance.
  • Design and maintain CI/CD pipelines that let engineers ship confidently and fast.
  • Build monitoring and alerting that surfaces real problems early: queue depth, processing failures, latency spikes, and delivery anomalies.
  • Manage PostgreSQL at scale, including configuration, connection strategy, performance, backup, and restore.
  • Keep async processing boundaries healthy across Lambda, queues, storage, and the services that connect them.
  • Ensure failed work is recoverable through retries, DLQs, idempotency keys, and operational guardrails.
  • Own cost visibility and optimization across a growing AWS footprint.
  • Use modern AI tooling to accelerate infrastructure work and experimentation.

What you bring

  • Deep AWS expertise across compute, storage, messaging, networking, and observability.
  • Strong infrastructure-as-code discipline and a habit of versioning everything.
  • A solid grasp of distributed systems fundamentals: ordering, consistency, delivery guarantees, and partial failure recovery.
  • Experience operating event-driven architectures under production load.
  • Enough application-layer literacy to read TypeScript / Node.js code and understand how services behave, not just that they run.
  • Strong opinions about observability: structured logs, distributed tracing, and dashboards built for real operators.

What you get

  • Competitive salary (B2B, based on experience).
  • Equity: we want you to own a meaningful piece of what you're building.
  • Fully remote work with an async-first culture.
  • A small team where your infrastructure decisions land in production quickly, and you'll feel the impact.
  • Infrastructure problems that are genuinely interesting: deterministic processing, trusted state, real-time fan-out, and payment-grade reliability.

Why this role matters

The protocol is only as trustworthy as the infrastructure running it. If that's the part you want to own, we'd like to hear from you.
