Product updates, engineering deep-dives, and practical guides from the aiAxonIQ team.
All Posts
LLM calls are expensive, non-deterministic, and opaque. Here's what to capture on every model call — tokens, latency, cost, and errors — and how the OpenTelemetry GenAI conventions make it portable.
Your OTel SDK can ship telemetry straight to a backend, or through a Collector. Both are valid. Here's a practical decision guide — and the gateway pattern most teams land on.
Most teams get paged for things that don't need attention. Learn how to build alert rules that fire only when it actually matters.
A step-by-step walkthrough of instrumenting a Node.js API with OpenTelemetry and shipping traces to aiAxonIQ.
These three acronyms are everywhere in reliability engineering. Here's what they actually mean and how to use them correctly.
We evaluated Elasticsearch, Loki, and Tempo. Here's why we landed on ClickHouse as our storage engine for logs and metrics.
Pods, nodes, deployments, namespaces — monitoring Kubernetes can feel overwhelming. Start here.
PromQL's power comes from a handful of patterns. Master these 10 and you'll handle 90% of real-world monitoring use cases.