Introduction

As a Checkmk extension developer and enthusiast who recently became OpenTelemetry certified, I wanted to put observability into practice on a real-world project.

For Lynxmind, simple uptime checks were no longer enough. We needed more than a binary view of “up or down.” The goal was to understand the actual health of the system: how it performs under load, how resources are consumed, and where bottlenecks emerge.

OpenTelemetry dashboard with data

Solution

I instrumented our Node.js and Astro stack with the OpenTelemetry SDK and exported the metrics over OTLP/HTTP to the OpenTelemetry Collector in Checkmk 2.4. The pipeline captures essential runtime signals: CPU usage, memory and heap statistics, garbage collection pauses, event loop latency, and HTTP client/server performance. Using service name mapping with Checkmk’s Dynamic Configuration Daemon (DCD), hosts such as the Lynxmind website were created automatically. Custom dashboards in Checkmk then turned this raw telemetry into actionable insights.
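To make the shape of such a pipeline concrete, here is a minimal sketch that uses only Node's standard library, not the OpenTelemetry SDK used in the actual setup. It samples two of the signals mentioned above (heap usage and event loop delay) and wraps them in an OTLP/HTTP JSON metrics payload. The `service.name` resource attribute is the value a receiver can use for service name mapping. The metric names, the scope name `runtime-sketch`, the service name `lynxmind-website`, and the endpoint URL are all illustrative assumptions, not values from the real deployment.

```typescript
// Hand-rolled illustration only: the real setup uses the OpenTelemetry SDK.
import { monitorEventLoopDelay } from "node:perf_hooks";

interface Sample {
  name: string; // metric name (illustrative, not an official convention)
  unit: string;
  value: number;
}

// Read heap usage from the process and convert the mean loop delay (ns) to ms.
function sampleRuntime(meanLoopDelayNs: number): Sample[] {
  const mem = process.memoryUsage();
  return [
    { name: "nodejs.heap.used", unit: "By", value: mem.heapUsed },
    { name: "nodejs.eventloop.delay.mean", unit: "ms", value: meanLoopDelayNs / 1e6 },
  ];
}

// Build an OTLP JSON body with one gauge data point per sample.
function buildOtlpPayload(serviceName: string, samples: Sample[], nowMs: number) {
  const timeUnixNano = `${nowMs}000000`; // epoch ms -> ns, encoded as a string
  return {
    resourceMetrics: [{
      resource: {
        attributes: [{ key: "service.name", value: { stringValue: serviceName } }],
      },
      scopeMetrics: [{
        scope: { name: "runtime-sketch" },
        metrics: samples.map((s) => ({
          name: s.name,
          unit: s.unit,
          gauge: { dataPoints: [{ asDouble: s.value, timeUnixNano }] },
        })),
      }],
    }],
  };
}

// POST one payload to an OTLP/HTTP metrics endpoint (path /v1/metrics).
async function exportOnce(endpoint: string, serviceName: string, meanLoopDelayNs: number): Promise<number> {
  const body = buildOtlpPayload(serviceName, sampleRuntime(meanLoopDelayNs), Date.now());
  const res = await fetch(`${endpoint}/v1/metrics`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  return res.status;
}

// Export periodically; the caller supplies the (assumed) collector address.
function startExporting(endpoint: string, serviceName: string, intervalMs: number) {
  const loop = monitorEventLoopDelay({ resolution: 20 });
  loop.enable();
  return setInterval(() => {
    exportOnce(endpoint, serviceName, loop.mean).catch(console.error);
    loop.reset(); // start a fresh measurement window
  }, intervalMs);
}
```

Calling something like `startExporting("http://localhost:4318", "lynxmind-website", 60_000)` would push one payload per minute, assuming a collector is listening at that hypothetical address. In practice the SDK handles batching, retries, and a much wider set of instruments, which is why it is the better choice for production use.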

Sample Checkmk dashboard with real-time metrics

Output

The resulting dashboards now provide:

Benefits

This observability stack delivers tangible value:

Conclusion

With OpenTelemetry and Checkmk, we no longer just monitor whether Lynxmind is available. We monitor how healthy it is, how it behaves under stress, and exactly where optimization is needed. In other words, it is like being able to see the cake from the inside while it is still baking.

Elicarlos Dias

As a DevOps Consultant Trainee at Lynxmind, he is building a strong foundation in automation, CI/CD and cloud infrastructure. With a hands-on approach and a drive to grow, he supports the team in streamlining processes and enhancing system reliability.