Postfix delivery monitoring without the ELK or Graylog stack (2026)
Organizations that run ELK (Elasticsearch, Logstash, Kibana), Graylog, or Grafana Loki often ask: should I centralize my Postfix mail logs there? The honest answer depends on your scope and operational tolerance. If you already centralize all logs and can afford the engineering to build Postfix semantics on top, yes, consider it. If you want mail delivery visibility without building a log pipeline, Postfix Insights is purpose-built for exactly that use case. This guide clarifies the tradeoff.
What these log platforms are genuinely good at
ELK, Graylog, and Grafana Loki are production-grade centralized log platforms. They are engineered for breadth: ingesting logs from applications, infrastructure, security services, and business systems; full-text searching across billions of lines; and exposing rich dashboards and alerting across an organization’s entire event stream.
ELK (Elasticsearch, Logstash, Kibana) and Graylog are built on inverted-index full-text search (Elasticsearch or OpenSearch backend). They ingest unstructured log lines, index every word, and let you query across your entire infrastructure: “find all lines matching ‘failed authentication’ from the last 6 hours across all 50 servers.” This is powerful and essential at scale.
Grafana Loki takes a different approach: it indexes only structured labels (host, service, level) and stores raw log lines, then filters and processes with LogQL. Loki is designed to be lighter-weight and less storage-hungry than full inverted-index search, making it accessible for smaller deployments.
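The difference between the two indexing styles can be illustrated with a toy sketch. This is not how Elasticsearch or Loki are implemented; the log records, host names, and function names below are made up for illustration. It only shows why one approach answers word queries from the index alone, while the other narrows by label and then scans raw lines.

```python
from collections import defaultdict

# Toy log records: (labels, raw line). Hosts and lines are invented examples.
LOGS = [
    ({"host": "mx1", "service": "postfix"}, "connect from unknown[203.0.113.5]"),
    ({"host": "mx1", "service": "sshd"}, "failed authentication for root"),
    ({"host": "web1", "service": "nginx"}, "upstream timed out"),
]

def build_inverted_index(logs):
    """Full-text style: map every word to the set of record numbers containing it."""
    index = defaultdict(set)
    for i, (_, line) in enumerate(logs):
        for word in line.lower().split():
            index[word].add(i)
    return index

def build_label_index(logs):
    """Label-only style: map each (label, value) pair to record numbers."""
    index = defaultdict(set)
    for i, (labels, _) in enumerate(logs):
        for pair in labels.items():
            index[pair].add(i)
    return index

def search_inverted(index, *words):
    """Intersect the per-word sets; raw lines are never scanned."""
    sets = [index.get(w.lower(), set()) for w in words]
    return sorted(set.intersection(*sets)) if sets else []

def search_labels(logs, index, labels, substring):
    """Pick candidate records by label, then scan their raw text for a substring."""
    sets = [index.get(pair, set()) for pair in labels.items()]
    candidates = set.intersection(*sets) if sets else set(range(len(logs)))
    return sorted(i for i in candidates if substring in logs[i][1])
```

The label index stays small because it grows with the number of distinct label values rather than with the vocabulary of the log lines, which is the storage tradeoff described above.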
All three platforms assume you want to answer questions like:
- What happened across my entire infrastructure in the last hour?
- Which services are failing and where?
- What patterns emerge when I search 100 million log lines?
That is genuinely valuable. Many sysadmins run one of these platforms to centralize alerts, audit trails, application logs, and system events into a single searchable store.
The honest gap: Postfix mail semantics
Here is where the fit breaks down for mail logs specifically.
All three platforms ingest and search at the log-line level. They will happily store your Postfix maillog and let you search it: “find lines containing ‘bounce’”, “search by recipient email”, “filter by timestamp”. But Postfix mail delivery does not happen in a single log line. A single email message generates multiple log entries, each tagged with a queue ID. To understand whether a message was delivered, bounced, or is still queued, you must correlate log lines by queue ID and reconstruct the delivery record.
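A minimal sketch of that correlation step, assuming the default syslog-style Postfix log format and short (uppercase hexadecimal) queue IDs. The sample lines, host names, and the `correlate` function are illustrative, not taken from any particular tool. Real logs contain many more line shapes than this handles.

```python
import re
from collections import defaultdict

# Assumption: short queue IDs (uppercase hex). Long queue IDs would need a wider pattern.
LINE_RE = re.compile(
    r"postfix/(?P<daemon>[\w-]+)\[\d+\]: (?P<qid>[0-9A-F]{6,}): (?P<rest>.*)"
)
FROM_RE = re.compile(r"from=<(?P<sender>[^>]*)>")
TO_RE = re.compile(
    r"to=<(?P<rcpt>[^>]*)>.*?dsn=(?P<dsn>\d\.\d+\.\d+), status=(?P<status>\w+)"
)

# Invented sample lines in the usual Postfix layout.
SAMPLE = [
    "Jan 10 10:00:01 mx1 postfix/smtpd[1234]: 4F1A2C0D3E: client=mail.example.org[192.0.2.10]",
    "Jan 10 10:00:01 mx1 postfix/qmgr[999]: 4F1A2C0D3E: from=<alice@example.org>, size=1234, nrcpt=1 (queue active)",
    "Jan 10 10:00:03 mx1 postfix/smtp[1236]: 4F1A2C0D3E: to=<bob@example.net>, "
    "relay=mx.example.net[198.51.100.7]:25, delay=2.1, delays=0.1/0/1/1, "
    "dsn=2.0.0, status=sent (250 2.0.0 OK)",
    "Jan 10 10:00:03 mx1 postfix/qmgr[999]: 4F1A2C0D3E: removed",
]

def correlate(lines):
    """Group log lines by queue ID and build one delivery record per message."""
    records = defaultdict(
        lambda: {"sender": None, "recipients": [], "removed": False}
    )
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not a queue-ID line (for example NOQUEUE rejects or warnings)
        rec = records[m["qid"]]
        rest = m["rest"]
        if (f := FROM_RE.match(rest)):
            rec["sender"] = f["sender"]
        elif (t := TO_RE.match(rest)):
            # One entry per delivery attempt; a deferred recipient may appear repeatedly.
            rec["recipients"].append(
                {"to": t["rcpt"], "dsn": t["dsn"], "status": t["status"]}
            )
        elif rest == "removed":
            rec["removed"] = True  # the message has left the queue
    return dict(records)
```

Calling `correlate(SAMPLE)` yields a single record keyed by `4F1A2C0D3E` with the sender, one recipient entry, and the `removed` flag set. The same logic is what a Logstash, Graylog, or LogQL pipeline has to express in its own terms.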
ELK, Graylog, and Loki require you to build this correlation yourself:
- ELK: You write Logstash grok filters and ingest processors to parse Postfix log lines, tag each line with its queue ID, and build a pipeline that correlates lines into delivery records. You engineer the logic yourself: if a line matches pattern X with queue ID Y, enrich it with field Z; if a delivery line for that queue ID carries “status=sent”, mark that recipient delivered. This works, but it is engineering work.
- Graylog: You configure pipeline rules and processors to parse queue IDs, extract fields (sender, recipient, status), and correlate lines. The same work, different interface.
- Loki + LogQL: You write LogQL queries to select streams by label and filter raw log lines, but LogQL does not natively understand Postfix semantics. You build parsing expressions (for example with the regexp or pattern parsers) and metric queries to derive per-message records from lines. Again, engineering.
Once you have built the pipeline, you have per-message data. Then you face the second gap.
Mail-specific metrics are not built in
Out of the box, these platforms give you aggregates on what they ingest: line counts, byte counts, field distributions. They do not give you mail-specific metrics:
- Bounce rate: percentage of mail bounced by reason (user unknown, mailbox full, etc.)
- Defer rate: percentage of mail deferred (temporary failure; will retry).
- SLA by destination: for each domain, median delivery time and 95th percentile latency.
- DSN breakdown: how many messages got each Delivery Status Notification status class (2.x.x delivered, 4.x.x deferred, 5.x.x bounced).
- TLS coverage: percentage of outbound connections negotiating TLS, and which protocol versions.
- DKIM signing rate: percentage of outbound mail signed with DKIM, and which signing policies.
These are not hard to calculate from raw logs, but they are not built in. You build them: you query your pipeline’s output, count bounces, compute percentages, aggregate by domain, and feed the results into Grafana or a custom dashboard. This is more engineering.
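As a rough sketch of what that calculation involves, assuming per-recipient delivery records have already been extracted into dictionaries with hypothetical `domain`, `status`, and `delay` keys (status values follow Postfix's `sent`, `bounced`, `deferred`):

```python
from collections import Counter, defaultdict
from statistics import median, quantiles

def rates(deliveries):
    """Bounce and defer rate as fractions of all delivery attempts."""
    counts = Counter(d["status"] for d in deliveries)
    total = sum(counts.values())
    if total == 0:
        return {"bounce_rate": 0.0, "defer_rate": 0.0}
    return {
        "bounce_rate": counts["bounced"] / total,
        "defer_rate": counts["deferred"] / total,
    }

def latency_by_domain(deliveries):
    """Median and 95th-percentile delay (seconds) of successful deliveries per domain."""
    by_domain = defaultdict(list)
    for d in deliveries:
        if d["status"] == "sent":
            by_domain[d["domain"]].append(d["delay"])
    result = {}
    for domain, delays in by_domain.items():
        # statistics.quantiles needs at least two data points.
        if len(delays) > 1:
            p95 = quantiles(delays, n=20, method="inclusive")[-1]
        else:
            p95 = delays[0]
        result[domain] = {"median": median(delays), "p95": p95}
    return result
```

Whether deferred attempts should count once per message or once per retry is a policy decision this sketch does not make; it counts every record it is given.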
If you are already a Grafana shop, you already have skills to write metric extraction and build a mail dashboard. If you are not, this is real work.
Operational footprint
ELK and Graylog run on top of Elasticsearch or OpenSearch, both Java-based. Elasticsearch is resource-hungry: it typically requires gigabytes of heap memory (starting at 2-4 GB per node), allocates significant disk for indices, and demands periodic index management (rollover, deletion, snapshot/restore). For a single-purpose maillog store on a single mail server, this overhead is not justified. For an organization that already runs ELK to centralize all infrastructure logs, the cost is already sunk and a bit more disk is negligible.
Grafana Loki is Go-based and much lighter: it can run with hundreds of megabytes of memory and stores raw logs in object storage or local files. Loki’s label-only indexing means disk usage is lower than Elasticsearch. If footprint is your concern, Loki is significantly less demanding than ELK or Graylog. The tradeoff is that Loki does not do full-text inverted-index search; you filter by label and pattern-match log lines.
For a dedicated maillog stack on one box, an Elasticsearch-based stack is more than the job needs. Loki is lighter. Postfix Insights is lighter still.
What is Postfix Insights, and why is it fit for purpose?
Postfix Insights is a self-hosted FastAPI application that runs against local logs or SSHFS-mounted remote logs (no external store, no log shipping). It is lightweight (Docker container with a sqld libSQL sidecar) and purpose-built for mail delivery visibility.
Two surfaces:
- Diagnostic search finds delivery records by recipient, domain, or subject with an optional date range. The application parses Postfix log lines in real-time, correlates them by queue ID, and presents structured per-recipient delivery status (raw lines and formatted summary). No manual pipeline to build.
- Delivery-health dashboard (/stats) aggregates volume, bounce and defer rate, SLA, domain mix, slow domains, DSN breakdown, and calendar heatmap trends. A background scanner rolls log data into hourly, daily, and weekly tiers in a libSQL database so trends stay fast across months of mail.
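The idea of tiered rollups can be sketched in general terms as follows. This is not Postfix Insights' actual schema or scanner: the table names, columns, and functions are hypothetical, and Python's built-in `sqlite3` stands in for libSQL. It only shows why a trend query over months can read a few pre-aggregated rows instead of rescanning the logs.

```python
import sqlite3

STATUSES = ("sent", "bounced", "deferred")

def init(conn):
    """Create two hypothetical tiers: per-hour and per-day counters."""
    conn.executescript(
        """
        CREATE TABLE hourly (bucket TEXT PRIMARY KEY,
                             sent INTEGER, bounced INTEGER, deferred INTEGER);
        CREATE TABLE daily  (bucket TEXT PRIMARY KEY,
                             sent INTEGER, bounced INTEGER, deferred INTEGER);
        """
    )

def record(conn, timestamp, status):
    """Count one delivery; timestamp is ISO 8601, e.g. '2026-01-10T10:15:00'."""
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status}")
    bucket = timestamp[:13]  # 'YYYY-MM-DDTHH'
    conn.execute("INSERT OR IGNORE INTO hourly VALUES (?, 0, 0, 0)", (bucket,))
    # The column name comes from the fixed STATUSES tuple, never from user input.
    conn.execute(
        f"UPDATE hourly SET {status} = {status} + 1 WHERE bucket = ?", (bucket,)
    )

def roll_up_daily(conn):
    """Rebuild the daily tier by summing the hourly buckets of each day."""
    conn.execute("DELETE FROM daily")
    conn.execute(
        """
        INSERT INTO daily
        SELECT substr(bucket, 1, 10), SUM(sent), SUM(bounced), SUM(deferred)
        FROM hourly
        GROUP BY substr(bucket, 1, 10)
        """
    )
```

A weekly tier would follow the same pattern, summing daily rows instead of hourly ones.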
Postfix semantics are built in: queue-ID correlation, DSN parsing, bounce and defer classification, TLS version detection, DKIM coverage, SLA calculation. You get them without writing a pipeline.
Operational footprint: lightweight. Memory usage is typically 100-300 MB; disk usage is minimal (just the log file and a small SQLite-compatible stats database). There is no JVM to tune, no index management, no external dependencies. Installation with Docker Compose takes four commands, after which you can start searching.
Scope: mail-only. Postfix Insights does not ingest your application logs, your firewall logs, or your authentication events. If you need to correlate mail delivery with other infrastructure signals, it is not the right tool. But if you want to answer “where did this mail go?” quickly and without engineering, it is fit for exactly that.
Postfix Insights exposes metrics for your existing Grafana
If you already use Grafana for dashboards but do not yet centralize mail logs, Postfix Insights exposes a Prometheus /metrics endpoint. You can scrape it and feed mail metrics (bounce rate, defer rate, SLA, TLS coverage, DKIM rate) into your existing Grafana. This lets you integrate mail-delivery visibility alongside your other dashboards without running Elasticsearch or building a Logstash pipeline.
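To get a feel for what a scraper sees, here is a minimal sketch that parses Prometheus text exposition format using only the standard library. The metric names in the sample are hypothetical and chosen for illustration; check the endpoint's real output for the names it actually exposes. The parser is deliberately simplified and does not handle timestamps or a `}` character inside a label value.

```python
import re

# One sample line: metric name, optional {labels}, then a numeric value.
SAMPLE_RE = re.compile(
    r"^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)"
    r"(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)"
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')

# Hypothetical metric names, for illustration only.
SAMPLE_TEXT = """\
# HELP postfix_bounce_ratio Hypothetical metric name, for illustration only.
# TYPE postfix_bounce_ratio gauge
postfix_bounce_ratio{domain="example.net"} 0.02
postfix_deferred_total 17
"""

def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comments and blanks."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # HELP/TYPE comments and empty lines carry no sample
        m = SAMPLE_RE.match(line)
        if not m:
            continue  # ignore anything this simplified parser cannot read
        labels = dict(LABEL_RE.findall(m["labels"] or ""))
        samples.append((m["name"], labels, float(m["value"])))
    return samples
```

In practice Prometheus itself does the scraping and Grafana queries Prometheus; a parser like this is only useful for a quick check of what the endpoint returns.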
When to use which
| Capability | ELK / Graylog | Grafana Loki | Postfix Insights |
|---|---|---|---|
| Full-text search across all logs | Yes (Elasticsearch) | No (label + pattern matching) | No (Postfix only) |
| Ingest all infrastructure logs | Yes | Yes | No |
| Postfix queue-ID correlation (out of box) | No (build pipeline) | No (build pipeline) | Yes |
| Mail-specific metrics (out of box) | No (build dashboards) | No (build metric extraction) | Yes |
| Per-message delivery status | After pipeline engineering | After metric extraction | Yes |
| SLA and DSN breakdown | After engineering | After engineering | Yes |
| Memory footprint | 2-4 GB+ | 100-500 MB | 100-300 MB |
| Disk footprint | Significant (indices) | Moderate | Minimal |
| Setup overhead | High | Moderate | Low (Docker Compose) |
| Scope | Organization-wide | Organization-wide | Mail delivery only |
Use ELK or Graylog when
- You already run one of these platforms to centralize all infrastructure logs.
- You have the skills and appetite to build Logstash pipelines or Graylog rules for Postfix semantics.
- You want to correlate mail delivery events with application and security events in a single searchable store.
- Your mail volume or server count justifies the operational cost of Elasticsearch/JVM.
Use Grafana Loki when
- You are already invested in Grafana for dashboards and observability.
- You want lightweight log aggregation without full-text search across all logs.
- You can afford the engineering to write metric extraction and mail-parsing pipelines.
- You need logs + metrics in a single platform.
Use Postfix Insights when
- You want to investigate a specific message’s delivery path (recipient, domain, subject) without running a centralized log platform.
- You want mail-delivery metrics and SLA dashboards out of the box, with zero pipeline engineering.
- You operate one or a few mail servers and want a lightweight, self-hosted solution.
- You prefer simple setup and minimal operational overhead.
- You do not need to correlate mail events with other infrastructure signals.
Use both
Postfix Insights and ELK/Graylog/Loki are not mutually exclusive. A common pattern: you run Postfix Insights for fast, purpose-built mail delivery diagnosis, and you also ship mail logs to your org-wide log platform for correlation with other events if needed. Or you use Postfix Insights’ Prometheus /metrics endpoint to feed mail metrics into an existing Grafana without duplicating log ingestion.
For example:
- Postfix Insights runs as a Docker container and handles diagnostic search and the /stats mail-delivery dashboard.
- Independently, a Logstash forwarder or log shipper sends raw mail logs to your ELK cluster for org-wide retention and cross-infrastructure troubleshooting.
- Your mail metrics appear in both Postfix Insights’ dashboard and your Grafana (via Prometheus scrape).
This gives you the best of both: lightweight, mail-focused diagnosis plus org-wide log correlation.
Getting started
If Postfix Insights fits your use case, see the Quick start guide to install with Docker Compose in four commands.
For ELK, Graylog, or Loki, refer to their official documentation:
- Elasticsearch documentation
- Graylog documentation
- Grafana Loki documentation
- Postfix documentation and log formats