POPOV R&D tech blog
8 min read java / observability by Andrii Popov

Observability vs profiling: where observability ends and profiling starts

"Here we explore current trends in observability and profiling in the Java ecosystem"


Where does observability end and profiling begin in a Java project?

In this post, I try to put everything (logs, metrics, traces, and profiling) into a single visualization that shows the full picture, with key insights, trends, and best practices.

Observability [macro view] is based on three pillars:

• Logs: discrete event records, errors, etc.
• Metrics: quantified system health, such as latency, memory, and CPU load
• Traces: distributed call chains across services, with latency per span (a timed operation such as a method or service call)

Profiling [micro view] focuses on internal JVM behavior:

• Flamegraphs: which method is eating your CPU?
• Allocation & heap: who’s allocating the most objects?
• Thread dumps: why is this thread always blocked?
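To make the "thread dump" question concrete, here is a minimal sketch that takes an in-process thread dump through the standard ThreadMXBean and counts threads per state. The class name ThreadStateSummary is hypothetical, and a real profiler would also inspect stack traces and lock owners. This only shows where the raw data comes from.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical helper: summarizes a thread dump by thread state. */
class ThreadStateSummary {

    /** Takes a dump of all live threads and counts how many are in each state. */
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // false, false: skip locked-monitor and synchronizer details to keep the dump cheap
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info != null) { // defensive: a thread may have died while the dump was taken
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        countByState().forEach((state, n) -> System.out.println(state + ": " + n));
    }
}
```

A consistently high BLOCKED count across repeated dumps is the usual hint that lock contention deserves a closer look.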

But let’s dive deeper into the JVM to understand where these metrics and insights come from.

𝗠𝗲𝘁𝗿𝗶𝗰𝘀 in the JVM are backed by MXBeans: Java objects that expose live counters maintained by JVM subsystems such as GC, memory, and threads. Metrics SDKs like Micrometer (Spring, Quarkus) pull data from these MXBeans on demand, either when a backend like Prometheus scrapes them or when the app pushes metrics to systems like CloudWatch. These SDKs also let devs define counters, gauges, timers, and histograms for app-specific metrics.
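As a rough illustration of what "pulling data from MXBeans" means, the sketch below reads a few counters directly through java.lang.management. The class name JvmMetricsSnapshot and the metric keys are made up for this example and are not Micrometer's naming. A metrics SDK does conceptually similar reads each time it is scraped or pushes.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: reads a few live JVM counters straight from the platform MXBeans. */
class JvmMetricsSnapshot {

    /** Returns a point-in-time snapshot; the keys are illustrative names only. */
    static Map<String, Long> read() {
        Map<String, Long> snapshot = new LinkedHashMap<>();

        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        snapshot.put("heap.used.bytes", memory.getHeapMemoryUsage().getUsed());

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        snapshot.put("threads.live", (long) threads.getThreadCount());

        // One MXBean per garbage collector; the names depend on the GC in use.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() returns -1 when the collector does not report a count
            snapshot.put("gc." + gc.getName() + ".count", gc.getCollectionCount());
        }
        return snapshot;
    }

    public static void main(String[] args) {
        read().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Nothing is computed until read() is called, which mirrors the pull-on-demand behavior described above.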

𝗟𝗼𝗴𝘀. Devs typically use SLF4J as a common logging API, backed by Logback (the Spring Boot default) or by JBoss Log Manager (Quarkus, which uses JBoss Logging as its own API facade). These libraries handle formatting, filtering, and structured logging, and support both sync and async modes. Logs are written to stdout or files, then collected by external agents and sent to backends for centralized storage, indexing, retention, and querying from dashboards.

𝗧𝗿𝗮𝗰𝗶𝗻𝗴. Distributed tracing in modern Java stacks relies on OpenTelemetry in both Spring (usually through the Micrometer Tracing bridge) and Quarkus. The core output of tracing is latency per span, where a span is one timed operation such as a method or service call. The SDK links spans across distributed services via a shared trace ID, collects them, and exports the data to a tracing backend, where it’s stored, indexed, and visualized in dashboard tools.
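The span and trace ID relationship can be sketched with a toy model. This is deliberately NOT the OpenTelemetry API: ToyTracer, Span, and inSpan are hypothetical names, IDs are plain UUIDs, and everything stays in one process. It only shows how a shared trace ID and a parent span ID tie timed operations together. It assumes JDK 16+ for records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.function.Consumer;

/** Toy span model (not OpenTelemetry): a shared trace ID links spans into one call chain. */
class ToyTracer {

    /** One timed operation; parentSpanId is null for the root span. */
    record Span(String traceId, String spanId, String parentSpanId, String name, long durationNanos) {}

    private final List<Span> finished = new ArrayList<>();

    /** Times the body and records a span; the body receives the new span ID so it can open children. */
    void inSpan(String traceId, String parentSpanId, String name, Consumer<String> body) {
        String spanId = UUID.randomUUID().toString();
        long start = System.nanoTime();
        try {
            body.accept(spanId);
        } finally {
            // Spans are recorded when they end, so children appear before their parent.
            finished.add(new Span(traceId, spanId, parentSpanId, name, System.nanoTime() - start));
        }
    }

    List<Span> finishedSpans() {
        return List.copyOf(finished);
    }

    public static void main(String[] args) {
        ToyTracer tracer = new ToyTracer();
        String traceId = UUID.randomUUID().toString();
        tracer.inSpan(traceId, null, "GET /orders", root ->
                tracer.inSpan(traceId, root, "db.query", child -> { /* simulated work */ }));
        tracer.finishedSpans().forEach(System.out::println);
    }
}
```

In a real distributed setup the trace ID and parent span ID travel between services in request headers, which is the part a tracing SDK automates.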

𝗣𝗿𝗼𝗳𝗶𝗹𝗶𝗻𝗴. JVM profiling follows two paths:

• JVMTI (JVM Tool Interface): the native API used by debuggers and profilers (e.g., JProfiler, YourKit). It offers deep inspection, but at higher overhead, and it is unavailable in AOT/native images.
• JFR (Java Flight Recorder): built into the JVM and open-source since JDK 11 (before that, a commercial Oracle JDK feature), low-overhead (~1–2%), and production-ready. It captures JVM events into .jfr files for analysis in JMC (Java Mission Control), or streams them for continuous profiling.
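For the JFR path, the sketch below uses the standard jdk.jfr API to record one custom event into a .jfr file and read it back. The class JfrSketch, the event name demo.OrderProcessed, and the orderId value are all invented for illustration. Real recordings usually rely on the built-in JVM events (GC, allocation, locks) rather than a single custom one.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordingFile;

/** Hypothetical example: one custom JFR event, written to a file and read back. */
class JfrSketch {

    @Name("demo.OrderProcessed") // made-up event name for this sketch
    @Label("Order Processed")
    static class OrderProcessed extends Event {
        @Label("Order ID")
        String orderId;
    }

    /** Records a single event into a temporary .jfr file and returns the order IDs found in it. */
    static List<String> recordAndReadBack() {
        try {
            Path file = Files.createTempFile("sketch", ".jfr");
            try (Recording recording = new Recording()) {
                recording.enable(OrderProcessed.class);
                recording.start();

                OrderProcessed event = new OrderProcessed();
                event.begin();              // start timing the event
                event.orderId = "order-42";
                event.commit();             // end timing and write it to the recording

                recording.stop();
                recording.dump(file);       // same file format that JMC opens
            }
            // The file also holds JFR bookkeeping events, so filter by our event name.
            return RecordingFile.readAllEvents(file).stream()
                    .filter(ev -> ev.getEventType().getName().equals("demo.OrderProcessed"))
                    .map(ev -> ev.getString("orderId"))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(recordAndReadBack());
    }
}
```

For continuous profiling, the same events can be consumed while the application runs instead of from a file, via the JFR event streaming API available in newer JDKs.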

👉 𝗧𝗵𝗲 𝘁𝗿𝗲𝗻𝗱𝘀 𝗮𝗿𝗲 𝗰𝗹𝗲𝗮𝗿

• Logs, metrics, traces, and profiles are converging into a single developer experience, routed through a scalable collector such as the OpenTelemetry Collector or Grafana Alloy in modern setups
• OpenTelemetry is stabilizing as the unifying standard across frameworks
• Continuous profiling is becoming a default, not a luxury