Staff Software Engineer · AI Infrastructure

I build AI infrastructure and run it in production.

Memory, code intelligence, and real-time messaging: three live products on a six-server fleet, 24/7. Fourteen years in production, and code merged into projects I don't own.

Selected results

1830×
Graph hot path, cut from 446ms to 0.24ms.
go-code · krolik.tools
72.5%
LoCoMo memory benchmark, ahead of Mem0's open baseline.
MemDB · memdb.ai
99.6%
Message delivery to 5,000 users in one room, over a 5-minute load test.
OxPulse · oxpulse.chat

By the numbers

3
Live products: memdb.ai, krolik.tools, oxpulse.chat
230+
MCP tools across 9 production MCP servers (go-code alone: 35)
6
Production servers I run across the US and Europe, 24/7
14
Years building and operating production systems

Live products

What I've shipped

Three production systems, each on its own live domain.

01

MemDB · memdb.ai

Self-hosted long-term memory database for AI agents. Pure Go core, zero Python in the hot path, one docker-compose. 72.5% on LoCoMo LLM-Judge, beating Mem0 by 5.62pp with open-source primitives only. My Postgres + pgvector graph backend was also merged into MemOS (PR #966), so it self-hosts without a separate graph store. Apache 2.0, v0.23.0 in production, with a Discord community of early adopters.

Go · PostgreSQL · Apache AGE · pgvector · Qdrant · Redis

02

go-code · krolik.tools

Code intelligence MCP server for AI coding agents. 35 tools across 16 languages: semantic search, call-graph and dataflow analysis, automated PR review. Apache AGE knowledge graph + pgvector + ColBERT reranker. Bridges Prometheus alerts and Jaeger traces directly to file:function hypotheses.

Go · tree-sitter · Apache AGE · pgvector · ColBERT · Prometheus · Jaeger · MCP

03

OxPulse Chat · oxpulse.chat

Censorship-resistant encrypted chat and real-time video, usable down to 1 KB/s. Verified a 5,000-user-per-room broadcast at 99.6% delivery (load test) on a 4-core box. Multi-domain partner-edge architecture (oxpulse.chat + 4 partner brands). Rust backend, SolidJS widget, production CSP.

Rust · Axum · SolidJS · PostgreSQL · WebRTC

Engineering & infrastructure

What makes it run

Libraries, engines, and MCP servers, built by the same hands. Most are open source and link to their repo.

go-kit

The shared Go toolkit the whole fleet is built on: a tiered L1/L2 cache (S3-FIFO + Redis), circuit breaker, rate limiter, hedged requests, in-process pub/sub, plus LLM, rerank, and sparse-embedding clients for hybrid retrieval. Two dozen independent packages, most stdlib-only, one module.

Go · Redis · Prometheus

GitHub →

go-mcpserver

The bootstrap framework behind all nine of my MCP servers: one Run() call instead of ~80 lines of boilerplate, on the official MCP Go SDK.

Go · MCP

GitHub →

go-workflow

A standalone DAG workflow engine for multi-turn LLM and agent tool-loops: 15 step types, native MCP integration, distributed execution over a Postgres SKIP LOCKED queue, crash recovery, and approval flows. Apache 2.0.

Go · PostgreSQL · MCP

GitHub →

oxpulse-sfu-kit

Reusable WebRTC SFU library on str0m: simulcast, dual BWE (Kalman + googcc), pacer, AV1-DD, active speaker. Published v0.11.0 on crates.io.

Rust · str0m · Tokio · WebRTC

crates.io →

ox-whisper

Self-hosted, OpenAI-compatible speech-to-text in Rust: runs on a single ARM64 CPU with no GPU, with 8 languages, real-time WebSocket streaming, and word-level timestamps. Built on sherpa-onnx + Moonshine v2.

Rust · ONNX · WebSocket

GitHub →

go-twitter

A Twitter/X scraping library that gets through the anti-bot wall: TLS fingerprinting, x-client-transaction-id, a multi-account pool with health tracking, TOTP 2FA, and CAPTCHA solving. Built on go-stealth.

Go · GraphQL · go-stealth

GitHub →

go-search

A self-hosted Perplexity: an MCP server that fuses web scrapers, API integrations, and LLM summarization into cited answers across web, GitHub, Hacker News, YouTube, and Hugging Face.

Go · MCP · LLM

dozor

AI-first server monitoring and deploy, exposed over MCP: it walks any Linux fleet (Docker, systemd, SSL, remote hosts) and returns LLM-optimized output instead of dashboards, with non-blocking webhook deploys and a Telegram alert bus.

Go · MCP · Docker · systemd

GitHub →

Why now

Choosing an IC seat, on purpose

For 14 years I ran my own products and carried production myself: on call, no team to escalate to, every outage mine to fix. That taught me what holds up under real load and what only looks good in a demo.

I'm choosing an individual-contributor role deliberately: I want to go deep on hard infrastructure problems alongside a strong team, not manage one. I ship, I operate, and I take the pager. I also work in codebases I don't own: my graph backend was merged into MemOS (PR #966). Contributing upstream is the part of the job I want more of.

Open to

What I'm looking for

Staff+ engineering roles at AI infrastructure companies

Anthropic, Cursor, Sourcegraph, Cognition, and similar: code intelligence, agent memory, distributed systems, platform, internal tooling. After 14 years running production end to end, I want to take on the hard parts with a strong team behind me.

Founding-engineer roles at early-stage AI startups

AI-infrastructure and developer tools, where deep ownership and 0-to-1 building matter. Same work, earlier stage.

Let's talk

Staff and Senior engineering roles at AI-infrastructure and dev-tools companies: code intelligence, agent memory, distributed systems, platform. Open to founding-engineer roles at early-stage startups too. SF Bay Area, onsite or hybrid.

Email →
Available now
location Bay Area · UTC-7
prefer remote, SF hybrid OK