almost ready

bob
2026-06-01 10:52:06 -05:00
parent 8b0eb0db78
commit 763305ca89
94 changed files with 8766 additions and 2674 deletions

monitoring/README.md Normal file

@@ -0,0 +1,148 @@
# BlueLaminate observability stack (standalone, Proxmox LXC)
A self-contained Grafana **LGTM** stack — **L**oki (logs), **G**rafana (dashboards),
**T**empo (traces), and Prometheus (**M**etrics) — fronted by **Grafana Alloy** as a single
OTLP ingress. It runs as native systemd services on its own Proxmox LXC, decoupled from the
app's `docker-compose.yml`. The C2 and Python workers push OpenTelemetry data to Alloy, which
fans the three signals out to the backends; Grafana ties them together.
```
C2 / workers ──OTLP(4317 grpc / 4318 http)──► Alloy ──┬─► Loki        (logs, :3100)
(other host)                                          ├─► Prometheus  (metrics, :9090, remote-write)
                                                      └─► Tempo       (traces, :4319 OTLP → store)
                                                          Grafana (:3000)
                                                          datasources: Loki + Prometheus + Tempo
```
Only Alloy's OTLP ports (`4317`/`4318`) and Grafana (`3000`) need to be reachable from the
LAN. Loki and Tempo bind localhost; Alloy is the only client that talks to them.
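To see which listeners are actually LAN-reachable, here is a minimal sketch to run inside the LXC. It assumes `ss` from iproute2 is installed; `exposure` and `audit_listeners` are hypothetical helper names, not part of `install.sh`.

```bash
# Classify a listening address as loopback-only or LAN-reachable.
# exposure() is a hypothetical helper for illustration.
exposure() {
  local addr="$1"            # e.g. "0.0.0.0:4317" or "127.0.0.1:3100"
  local host="${addr%:*}"    # strip the trailing :port
  case "$host" in
    127.*|"[::1]"|localhost) echo "loopback" ;;
    *)                       echo "lan" ;;
  esac
}

# Print every listening TCP socket with its exposure.
# Column 4 of `ss -ltn` is the local address:port.
audit_listeners() {
  command -v ss >/dev/null 2>&1 || { echo "ss not found" >&2; return 1; }
  ss -ltnH | awk '{print $4}' | while read -r addr; do
    printf '%-28s %s\n' "$addr" "$(exposure "$addr")"
  done
}
```

Running `audit_listeners` should report `4317`, `4318`, and `3000` as `lan`. Prometheus (`9090`) also shows up as `lan` because `prometheus.service` listens on `0.0.0.0`; any other `lan` entry is worth checking against the backend configs.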
## Layout
```
monitoring/
  install.sh                        # idempotent provisioner (run as root in the LXC)
  alloy/config.alloy                # OTLP receiver → batch → Loki / Prometheus / Tempo
  prometheus/prometheus.yml         # self-monitoring scrapes (app metrics arrive via remote-write)
  prometheus/prometheus.service     # systemd unit: remote-write + OTLP receivers, 15d retention
  loki/loki.yml                     # single-binary, filesystem store, 15d retention
  tempo/tempo.yml                   # OTLP on :4319, local store, metrics_generator → Prometheus
  grafana/datasources.yml           # Loki + Prometheus(default) + Tempo, correlated
  grafana/dashboards.yml            # file-based dashboard provider
  grafana/dashboards/overview.json  # starter dashboard (target health, span rates, logs)
```
## 1. Create the LXC (run on the Proxmox host)
Reference only: adjust the storage, bridge, and template names to your node. An unprivileged
Debian 13 container with ~2 vCPU / 2-4 GB RAM / 20-40 GB disk is plenty.
```bash
# Make sure a Debian 13 template is present (once):
# pveam update && pveam available | grep debian-13
# pveam download local debian-13-standard_*_amd64.tar.zst
pct create 910 local:vztmpl/debian-13-standard_13.0-1_amd64.tar.zst \
  --hostname grafana-lxc \
  --cores 2 --memory 4096 --swap 1024 \
  --rootfs local-lvm:32 \
  --net0 name=eth0,bridge=vmbr0,ip=dhcp \
  --unprivileged 1 --features nesting=0 \
  --onboot 1 --start 1
# (Optional) give it a static IP instead of dhcp, e.g.
# --net0 name=eth0,bridge=vmbr0,ip=192.168.1.50/24,gw=192.168.1.1
```
`nesting=0` is fine — there's no Docker here, just native binaries.
## 2. Deploy the stack (inside the LXC)
```bash
pct enter 910 # or: ssh root@<lxc-ip>
apt-get update && apt-get install -y git
git clone <this-repo-url> /opt/bluelaminate
cd /opt/bluelaminate/monitoring
bash install.sh   # you are already root inside the LXC; a fresh container may not ship sudo
```
No git on the LXC? Copy just this folder over instead:
`scp -r monitoring root@<lxc-ip>:/opt/monitoring && ssh root@<lxc-ip> 'cd /opt/monitoring && bash install.sh'`
The script adds the Grafana apt repo, installs grafana/loki/tempo/alloy, drops the Prometheus
release binary into `/opt/prometheus`, lays our configs over the packaged defaults, and
enables all five services. It prints the URLs and the OTLP endpoint when done.
## 3. Verify
```bash
systemctl is-active grafana-server loki tempo prometheus alloy # all → active
curl -s localhost:3100/ready # Loki → ready
curl -s localhost:3200/ready # Tempo → ready
curl -s localhost:9090/-/ready # Prometheus → Ready
```
Open Grafana at `http://<lxc-ip>:3000` (first login `admin` / `admin` — change it). The three
datasources and the **BlueLaminate → Stack Overview** dashboard are provisioned automatically.
Alloy's pipeline graph is at `http://<lxc-ip>:12345`.
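The backends can take a few seconds to report ready after a restart. A minimal polling sketch, assuming `curl` is available (`install.sh` installs it); `wait_ready` is a hypothetical helper, not something the script provides:

```bash
# Poll a readiness URL until it answers with a 2xx status or the tries run out.
# Usage: wait_ready <url> [tries]   (2 s pause between tries, default 30 tries)
wait_ready() {
  local url="$1" tries="${2:-30}" i=1
  while [ "$i" -le "$tries" ]; do
    # -f makes curl fail on HTTP >= 400, e.g. the 503 a backend returns while starting
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      echo "ready: $url"
      return 0
    fi
    sleep 2
    i=$((i + 1))
  done
  echo "not ready after $tries tries: $url" >&2
  return 1
}
```

For example, `for u in localhost:3100/ready localhost:3200/ready localhost:9090/-/ready; do wait_ready "$u" || break; done` waits on the three endpoints listed above in turn.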
### End-to-end OTLP smoke test (no app changes needed)
Send synthetic telemetry from any machine that can reach the LXC, using the OpenTelemetry
`telemetrygen` tool (`go install github.com/open-telemetry/opentelemetry-collector-contrib/cmd/telemetrygen@latest`):
```bash
telemetrygen traces --otlp-endpoint <lxc-ip>:4317 --otlp-insecure --traces 5
telemetrygen metrics --otlp-endpoint <lxc-ip>:4317 --otlp-insecure --duration 10s
telemetrygen logs --otlp-endpoint <lxc-ip>:4317 --otlp-insecure --logs 5
```
Then in Grafana **Explore**: pick **Tempo** (search recent traces), **Prometheus** (query
`gen`), and **Loki** (`{service_name=~".+"}`) — seeing data in all three confirms the full
fan-out before any app is wired up.
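The same check can be scripted against the backends' HTTP query APIs. This is a rough sketch to run inside the LXC (Loki only listens on localhost); `has_items` and `check_signals` are hypothetical helpers, and the `grep` test is a crude stand-in for a real JSON parser such as `jq`, which `install.sh` does not install.

```bash
# has_items KEY: succeed if stdin holds a non-empty JSON array under KEY.
# Crude text match, good enough for a smoke test.
has_items() {
  grep -Eq "\"$1\":[[:space:]]*\[[[:space:]]*[^][:space:]]"
}

# Query each backend for the telemetrygen data sent above.
check_signals() {
  curl -fsS 'http://localhost:9090/api/v1/query?query=gen' | has_items result \
    && echo "metrics: ok" || echo "metrics: no data"
  curl -fsS -G 'http://localhost:3100/loki/api/v1/query_range' \
    --data-urlencode 'query={service_name=~".+"}' | has_items result \
    && echo "logs: ok" || echo "logs: no data"
  curl -fsS 'http://localhost:3200/api/search?limit=5' | has_items traces \
    && echo "traces: ok" || echo "traces: no data"
}
```

Call `check_signals` shortly after the `telemetrygen` runs; the Prometheus instant query and Loki's default range only look back a limited window.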
## 4. Wiring the apps later (the OTLP contract)
This deployment is **stack-only**; the C2 and workers aren't instrumented yet. When you do,
point them at this LXC — nothing here changes. The drop-in:
**.NET C2** (`BlueLaminate.C2`) — add packages `OpenTelemetry.Extensions.Hosting`,
`OpenTelemetry.Exporter.OpenTelemetryProtocol`, and the
`OpenTelemetry.Instrumentation.AspNetCore` / `.Http` / runtime instrumentations, then
`builder.Services.AddOpenTelemetry().WithTracing(...).WithMetrics(...)` plus
`builder.Logging.AddOpenTelemetry(...)`. Configure via env:
```
OTEL_EXPORTER_OTLP_ENDPOINT=http://<lxc-ip>:4318
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
OTEL_SERVICE_NAME=bluelaminate-c2
```
**Python workers** (`worker/csmoney_worker.py`, `skinland_worker.py`) — add
`opentelemetry-distro` and `opentelemetry-exporter-otlp` to `worker/requirements.txt`, run
under `opentelemetry-instrument python csmoney_worker.py`, same env vars with
`OTEL_SERVICE_NAME=csmoney-worker` / `skinland-worker`. (Today the workers emit structured
JSON logs to stdout — `LOG_JSON=1`, set by default in the image; an interim option is to
ship their Docker stdout to Loki with an Alloy `loki.source.docker` component on the app
host, which can parse those JSON fields directly, instead of instrumenting in-process.)
Add those env vars to the matching `docker-compose.yml` services when the instrumentation lands.
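As a small illustration of the contract, the three variables can be generated per service. `otel_env` is a hypothetical helper, and `192.168.1.50` is just the example static IP from step 1:

```bash
# Print the OTLP env block for one service.
# Usage: otel_env <lxc-ip> <service-name>
otel_env() {
  local host="$1" service="$2"
  printf 'OTEL_EXPORTER_OTLP_ENDPOINT=http://%s:4318\n' "$host"
  printf 'OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf\n'
  printf 'OTEL_SERVICE_NAME=%s\n' "$service"
}

otel_env 192.168.1.50 csmoney-worker
```

Redirecting the output to a file (say `csmoney-worker.env`) and referencing it through Compose's `env_file:` is one way to keep the endpoint in a single place.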
## Hardening
- **Firewall the OTLP ports.** `4317`/`4318` are bound to `0.0.0.0`. Restrict them to the app
host, e.g. `ufw allow from <app-host-ip> to any port 4317,4318 proto tcp`.
- **Auth on ingest (optional).** Add an `otelcol.auth.bearer` handler to
`otelcol.receiver.otlp` in `alloy/config.alloy` and send a matching
`OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer <token>` from the apps.
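- **Grafana password.** Change `admin` on first login, or set `admin_password` under
`[security]` in `/etc/grafana/grafana.ini` (equivalently, the `GF_SECURITY_ADMIN_PASSWORD`
environment variable) before Grafana's first start. After that, use the UI or
`grafana-cli admin reset-admin-password`.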
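With more than one app host, the firewall rule above repeats per source address. A dry-run sketch that only prints the commands; `otlp_ufw_rules` is a hypothetical helper, the IPs are placeholders, and it assumes `ufw` is installed on the LXC (`install.sh` does not install it):

```bash
# Print one ufw allow rule per app host for the two OTLP ports.
# Nothing is executed; review the output before running it.
otlp_ufw_rules() {
  local host
  for host in "$@"; do
    echo "ufw allow from ${host} to any port 4317,4318 proto tcp"
  done
}

otlp_ufw_rules 192.168.1.20 192.168.1.21
```

Once the output looks right, pipe it to `sh` as root.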
## Retention / sizing
Defaults are LXC-friendly: Prometheus **15d**, Loki **15d**, Tempo **7d**. If you have the
disk, raise the `--storage.tsdb.retention.time` flag (`prometheus.service`),
`limits_config.retention_period` (`loki.yml`), and `compactor.compaction.block_retention`
(`tempo.yml`). Re-run `install.sh` to apply config edits.
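The shipped `loki.yml` and `tempo.yml` express retention in hours (`360h`, `168h`), while the Prometheus flag uses days (`15d`). A small sketch of the conversion, with `days_to_hours` as a hypothetical helper:

```bash
# Convert a retention in days to the hour notation used in loki.yml / tempo.yml.
days_to_hours() {
  echo "$(( $1 * 24 ))h"
}

for d in 7 15 30; do
  printf '%sd = %s\n' "$d" "$(days_to_hours "$d")"
done
```

So a 30-day retention would be `30d` for Prometheus and `720h` for Loki and Tempo.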

monitoring/alloy/config.alloy Normal file

@@ -0,0 +1,67 @@
// Grafana Alloy — the single OTLP ingress for the BlueLaminate fleet.
//
// Receives OTLP (gRPC :4317 / HTTP :4318) from the C2 and the Python workers, batches it,
// then fans the three signals out to the local backends:
// metrics -> Prometheus (remote-write)
// logs -> Loki (push API)
// traces -> Tempo (OTLP gRPC on :4319, a non-colliding port)
//
// OTLP is bound on 0.0.0.0 so apps on other LAN hosts can push to this LXC. Alloy reaches
// every backend over localhost, and it is the only ingest client for Loki and Tempo (see
// each backend's config). See README "Hardening" to add a bearer token.
otelcol.receiver.otlp "in" {
  grpc {
    endpoint = "0.0.0.0:4317"
  }
  http {
    endpoint = "0.0.0.0:4318"
  }
  output {
    metrics = [otelcol.processor.batch.default.input]
    logs    = [otelcol.processor.batch.default.input]
    traces  = [otelcol.processor.batch.default.input]
  }
}

otelcol.processor.batch "default" {
  output {
    metrics = [otelcol.exporter.prometheus.to_prom.input]
    logs    = [otelcol.exporter.loki.to_loki.input]
    traces  = [otelcol.exporter.otlp.to_tempo.input]
  }
}

// --- metrics -> Prometheus remote-write ---------------------------------------------------
otelcol.exporter.prometheus "to_prom" {
  forward_to = [prometheus.remote_write.local.receiver]
}

prometheus.remote_write "local" {
  endpoint {
    url = "http://localhost:9090/api/v1/write"
  }
}

// --- logs -> Loki push --------------------------------------------------------------------
otelcol.exporter.loki "to_loki" {
  forward_to = [loki.write.local.receiver]
}

loki.write "local" {
  endpoint {
    url = "http://localhost:3100/loki/api/v1/push"
  }
}

// --- traces -> Tempo ----------------------------------------------------------------------
// Tempo's own OTLP receiver listens on :4319 so it doesn't collide with this Alloy receiver
// on :4317/:4318. TLS is off because this is a localhost hop.
otelcol.exporter.otlp "to_tempo" {
  client {
    endpoint = "localhost:4319"
    tls {
      insecure = true
    }
  }
}

monitoring/grafana/dashboards.yml Normal file

@@ -0,0 +1,15 @@
# Grafana dashboard provider — loads JSON dashboards from /var/lib/grafana/dashboards.
# Copied to /etc/grafana/provisioning/dashboards/ by install.sh.
apiVersion: 1

providers:
  - name: BlueLaminate
    orgId: 1
    folder: BlueLaminate
    type: file
    disableDeletion: false
    allowUiUpdates: true
    updateIntervalSeconds: 30
    options:
      path: /var/lib/grafana/dashboards
      foldersFromFilesStructure: false

monitoring/grafana/dashboards/overview.json Normal file

@@ -0,0 +1,109 @@
{
"annotations": { "list": [] },
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"links": [],
"panels": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"fieldConfig": {
"defaults": {
"mappings": [
{ "type": "value", "options": { "0": { "text": "DOWN", "color": "red" }, "1": { "text": "UP", "color": "green" } } }
],
"thresholds": { "mode": "absolute", "steps": [ { "color": "red", "value": null }, { "color": "green", "value": 1 } ] }
},
"overrides": []
},
"gridPos": { "h": 6, "w": 24, "x": 0, "y": 0 },
"id": 1,
"options": {
"colorMode": "background",
"graphMode": "none",
"justifyMode": "auto",
"orientation": "auto",
"reduceOptions": { "calcs": ["lastNotNull"], "fields": "", "values": false },
"textMode": "value_and_name"
},
"pluginVersion": "11.0.0",
"targets": [
{ "datasource": { "type": "prometheus", "uid": "prometheus" }, "expr": "up", "legendFormat": "{{job}}", "refId": "A" }
],
"title": "Stack targets — up/down",
"type": "stat"
},
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"fieldConfig": {
"defaults": { "custom": { "drawStyle": "line", "fillOpacity": 10, "lineWidth": 1 }, "unit": "reqps" },
"overrides": []
},
"gridPos": { "h": 8, "w": 12, "x": 0, "y": 6 },
"id": 2,
"options": { "legend": { "displayMode": "list", "placement": "bottom", "showLegend": true }, "tooltip": { "mode": "multi", "sort": "desc" } },
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "sum by (service_name) (rate(traces_spanmetrics_calls_total[5m]))",
"legendFormat": "{{service_name}}",
"refId": "A"
}
],
"title": "Span call rate by service (Tempo span-metrics)",
"type": "timeseries"
},
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"fieldConfig": {
"defaults": { "custom": { "drawStyle": "line", "fillOpacity": 10, "lineWidth": 1 }, "unit": "bytes" },
"overrides": []
},
"gridPos": { "h": 8, "w": 12, "x": 12, "y": 6 },
"id": 3,
"options": { "legend": { "displayMode": "list", "placement": "bottom", "showLegend": true }, "tooltip": { "mode": "multi", "sort": "desc" } },
"targets": [
{
"datasource": { "type": "prometheus", "uid": "prometheus" },
"expr": "process_resident_memory_bytes",
"legendFormat": "{{job}}",
"refId": "A"
}
],
"title": "Stack process memory",
"type": "timeseries"
},
{
"datasource": { "type": "loki", "uid": "loki" },
"gridPos": { "h": 10, "w": 24, "x": 0, "y": 14 },
"id": 4,
"options": {
"dedupStrategy": "none",
"enableLogDetails": true,
"showTime": true,
"sortOrder": "Descending",
"wrapLogMessage": true
},
"targets": [
{
"datasource": { "type": "loki", "uid": "loki" },
"expr": "{service_name=~\".+\"}",
"refId": "A"
}
],
"title": "Recent logs (all services)",
"type": "logs"
}
],
"refresh": "30s",
"schemaVersion": 39,
"tags": ["bluelaminate"],
"templating": { "list": [] },
"time": { "from": "now-6h", "to": "now" },
"timepicker": {},
"timezone": "",
"title": "BlueLaminate — Stack Overview",
"uid": "bl-overview",
"version": 1,
"weekStart": ""
}

monitoring/grafana/datasources.yml Normal file

@@ -0,0 +1,53 @@
# Grafana datasource provisioning — Prometheus (default), Loki, Tempo, wired for
# trace <-> log <-> metric correlation. Copied to
# /etc/grafana/provisioning/datasources/ by install.sh.
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    uid: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
    jsonData:
      httpMethod: POST

  - name: Loki
    type: loki
    uid: loki
    access: proxy
    url: http://localhost:3100
    jsonData:
      # Turn a trace_id found on a log line into a clickable jump to the trace in Tempo.
      # OTLP logs carry the id as structured metadata `trace_id`; adjust the regex if your
      # app instrumentation emits it differently.
      derivedFields:
        - name: TraceID
          matcherType: label
          matcherRegex: trace_id
          datasourceUid: tempo
          # `$$` escapes `$`: provisioning files expand `${...}` as environment variables.
          url: "$${__value.raw}"
          urlDisplayLabel: "View trace"

  - name: Tempo
    type: tempo
    uid: tempo
    access: proxy
    url: http://localhost:3200
    jsonData:
      # Span -> related logs in Loki.
      tracesToLogsV2:
        datasourceUid: loki
        spanStartTimeShift: "-1h"
        spanEndTimeShift: "1h"
        filterByTraceID: true
        filterBySpanID: false
      # Span -> RED metrics in Prometheus (from Tempo's metrics_generator).
      tracesToMetrics:
        datasourceUid: prometheus
      # Service graph + node graph from the generator's service-graph metrics.
      serviceMap:
        datasourceUid: prometheus
      nodeGraph:
        enabled: true

monitoring/install.sh Normal file

@@ -0,0 +1,122 @@
#!/usr/bin/env bash
#
# Provision the standalone BlueLaminate observability stack on a fresh Debian LXC:
# Grafana + Loki + Tempo + Alloy (Grafana apt repo, each with its own systemd unit)
# Prometheus (official release tarball -> /opt/prometheus + our unit)
#
# Idempotent: safe to re-run (re-applies configs and restarts services). Run as root.
#
# sudo ./install.sh
#
# Override the Prometheus version with PROM_VERSION=x.y.z ./install.sh if needed.
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if [[ "${EUID}" -ne 0 ]]; then
echo "ERROR: run as root (sudo ./install.sh)." >&2
exit 1
fi
ARCH="$(dpkg --print-architecture)" # amd64 / arm64
echo "==> Target architecture: ${ARCH}"
# --- prerequisites ------------------------------------------------------------------------
echo "==> Installing prerequisites"
export DEBIAN_FRONTEND=noninteractive
apt-get update -y
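# apt has handled HTTPS natively since 1.5, and software-properties-common is not needed
# here (the repo file is written by hand below) and may be missing on newer Debian releases.
apt-get install -y ca-certificates gpg wget curl tar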
# --- Grafana apt repo: grafana, loki, tempo, alloy ----------------------------------------
echo "==> Adding the Grafana apt repository"
mkdir -p /etc/apt/keyrings
if [[ ! -s /etc/apt/keyrings/grafana.asc ]]; then
wget -qO /etc/apt/keyrings/grafana.asc https://apt.grafana.com/gpg-full.key
fi
echo "deb [signed-by=/etc/apt/keyrings/grafana.asc] https://apt.grafana.com stable main" \
> /etc/apt/sources.list.d/grafana.list
apt-get update -y
echo "==> Installing Grafana, Loki, Tempo, Alloy"
apt-get install -y grafana loki tempo alloy
# --- Prometheus (release tarball) ---------------------------------------------------------
echo "==> Installing Prometheus"
PROM_VERSION="${PROM_VERSION:-$(curl -fsSL https://api.github.com/repos/prometheus/prometheus/releases/latest \
| grep -oP '"tag_name":\s*"v\K[^"]+' || true)}"
PROM_VERSION="${PROM_VERSION:-3.2.1}"
echo " Prometheus version: ${PROM_VERSION}"
id -u prometheus &>/dev/null || useradd --system --no-create-home --shell /usr/sbin/nologin prometheus
PROM_DIR="prometheus-${PROM_VERSION}.linux-${ARCH}"
TMP="$(mktemp -d)"
trap 'rm -rf "${TMP}"' EXIT
wget -qO "${TMP}/prom.tar.gz" \
"https://github.com/prometheus/prometheus/releases/download/v${PROM_VERSION}/${PROM_DIR}.tar.gz"
tar -xzf "${TMP}/prom.tar.gz" -C "${TMP}"
install -d /opt/prometheus
install -m 0755 "${TMP}/${PROM_DIR}/prometheus" /opt/prometheus/prometheus
install -m 0755 "${TMP}/${PROM_DIR}/promtool" /opt/prometheus/promtool
# --- data directories ---------------------------------------------------------------------
echo "==> Creating data directories"
install -d -o prometheus -g prometheus /var/lib/prometheus
install -d -o loki -g loki /var/lib/loki /var/lib/loki/chunks /var/lib/loki/rules /var/lib/loki/compactor
install -d -o tempo -g tempo /var/lib/tempo /var/lib/tempo/wal /var/lib/tempo/blocks \
/var/lib/tempo/generator/wal /var/lib/tempo/generator/traces
# --- configuration ------------------------------------------------------------------------
echo "==> Installing configuration files"
install -d /etc/alloy /etc/loki /etc/tempo /etc/prometheus
install -m 0644 "${SCRIPT_DIR}/alloy/config.alloy" /etc/alloy/config.alloy
install -m 0644 "${SCRIPT_DIR}/loki/loki.yml" /etc/loki/config.yml
install -m 0644 "${SCRIPT_DIR}/tempo/tempo.yml" /etc/tempo/config.yml
install -m 0644 "${SCRIPT_DIR}/prometheus/prometheus.yml" /etc/prometheus/prometheus.yml
install -m 0644 "${SCRIPT_DIR}/prometheus/prometheus.service" /etc/systemd/system/prometheus.service
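# Point Alloy's systemd unit at our config (the package reads /etc/default/alloy).
# Alloy's HTTP server defaults to 127.0.0.1:12345; listen on all interfaces so the debug UI
# advertised in the summary below is reachable from the LAN.
cat > /etc/default/alloy <<'EOF'
CONFIG_FILE="/etc/alloy/config.alloy"
CUSTOM_ARGS="--server.http.listen-addr=0.0.0.0:12345"
RESTART_ON_UPGRADE=true
EOF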
# Grafana provisioning (datasources + dashboards).
echo "==> Installing Grafana provisioning"
install -d /etc/grafana/provisioning/datasources \
/etc/grafana/provisioning/dashboards \
/var/lib/grafana/dashboards
install -m 0644 "${SCRIPT_DIR}/grafana/datasources.yml" /etc/grafana/provisioning/datasources/bluelaminate.yml
install -m 0644 "${SCRIPT_DIR}/grafana/dashboards.yml" /etc/grafana/provisioning/dashboards/bluelaminate.yml
install -m 0644 "${SCRIPT_DIR}"/grafana/dashboards/*.json /var/lib/grafana/dashboards/
chown -R grafana:grafana /var/lib/grafana/dashboards 2>/dev/null || true
# --- start everything ---------------------------------------------------------------------
echo "==> Enabling + starting services"
systemctl daemon-reload
systemctl enable --now grafana-server loki tempo prometheus alloy
systemctl restart loki tempo prometheus alloy grafana-server
# --- summary ------------------------------------------------------------------------------
IP="$(hostname -I 2>/dev/null | awk '{print $1}')"
cat <<EOF
============================================================================
BlueLaminate observability stack installed.
Grafana UI : http://${IP:-<lxc-ip>}:3000 (first login admin/admin)
OTLP ingress : ${IP:-<lxc-ip>}:4317 (gRPC) / ${IP:-<lxc-ip>}:4318 (HTTP)
Alloy debug UI : http://${IP:-<lxc-ip>}:12345
Prometheus : http://${IP:-<lxc-ip>}:9090
Point apps at: OTEL_EXPORTER_OTLP_ENDPOINT=http://${IP:-<lxc-ip>}:4318
Readiness checks:
systemctl is-active grafana-server loki tempo prometheus alloy
curl -s localhost:3100/ready # Loki
curl -s localhost:3200/ready # Tempo
curl -s localhost:9090/-/ready # Prometheus
============================================================================
EOF

monitoring/loki/loki.yml Normal file

@@ -0,0 +1,59 @@
# Loki — single-binary, filesystem-backed, no auth (localhost-only; Alloy is the only writer).
# Tuned for an LXC: TSDB index, 15-day retention with the compactor enforcing deletes.
auth_enabled: false

server:
  http_listen_address: 127.0.0.1
  http_listen_port: 3100
  grpc_listen_port: 9096
  log_level: info

common:
  instance_addr: 127.0.0.1
  path_prefix: /var/lib/loki
  storage:
    filesystem:
      chunks_directory: /var/lib/loki/chunks
      rules_directory: /var/lib/loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2024-01-01
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

limits_config:
  retention_period: 360h # 15 days
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  # Required so OTLP resource/scope attributes (and trace_id/span_id) land as structured metadata.
  allow_structured_metadata: true
  volume_enabled: true

compactor:
  working_directory: /var/lib/loki/compactor
  compaction_interval: 10m
  retention_enabled: true
  retention_delete_delay: 2h
  delete_request_store: filesystem

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

ruler:
  storage:
    type: local
    local:
      directory: /var/lib/loki/rules

monitoring/prometheus/prometheus.service Normal file

@@ -0,0 +1,25 @@
# Prometheus is not in the Grafana apt repo, so install.sh drops the release binary into
# /opt/prometheus and installs this unit. Flags: remote-write + OTLP receivers ON (Alloy and
# Tempo push to it), 15-day local retention.
[Unit]
Description=Prometheus
Documentation=https://prometheus.io/docs/
Wants=network-online.target
After=network-online.target
[Service]
User=prometheus
Group=prometheus
Type=simple
Restart=on-failure
RestartSec=5
ExecStart=/opt/prometheus/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/var/lib/prometheus \
--storage.tsdb.retention.time=15d \
--web.enable-remote-write-receiver \
--web.enable-otlp-receiver \
--web.listen-address=0.0.0.0:9090
[Install]
WantedBy=multi-user.target

monitoring/prometheus/prometheus.yml Normal file

@@ -0,0 +1,32 @@
# Prometheus for the BlueLaminate observability LXC.
#
# App + Tempo metrics arrive via REMOTE-WRITE (Alloy and Tempo's metrics_generator push to
# /api/v1/write — enabled by the --web.enable-remote-write-receiver flag in prometheus.service),
# so they need no scrape config. The scrape jobs below are just the stack's own self-monitoring.
global:
  scrape_interval: 30s
  evaluation_interval: 30s
  external_labels:
    monitor: bluelaminate-lxc

scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: alloy
    static_configs:
      - targets: ["localhost:12345"]

  - job_name: loki
    static_configs:
      - targets: ["localhost:3100"]

  - job_name: tempo
    static_configs:
      - targets: ["localhost:3200"]

  - job_name: grafana
    static_configs:
      - targets: ["localhost:3000"]

monitoring/tempo/tempo.yml Normal file

@@ -0,0 +1,48 @@
# Tempo — local-disk trace store. Receives OTLP from Alloy on :4319 (Alloy owns :4317/:4318),
# and runs the metrics_generator to emit RED + service-graph metrics, remote-written into
# Prometheus so Grafana can draw request rates and the service map without any app metrics.
# Loopback only, as the README states: Alloy (OTLP) and Grafana (HTTP API) both run on this host.
server:
  http_listen_address: 127.0.0.1
  http_listen_port: 3200
  grpc_listen_port: 9095
  log_level: info

distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: "127.0.0.1:4319"

ingester:
  max_block_duration: 5m

compactor:
  compaction:
    block_retention: 168h # 7 days of traces

metrics_generator:
  registry:
    external_labels:
      source: tempo
  storage:
    path: /var/lib/tempo/generator/wal
    remote_write:
      - url: http://localhost:9090/api/v1/write
        send_exemplars: true
  traces_storage:
    path: /var/lib/tempo/generator/traces

storage:
  trace:
    backend: local
    wal:
      path: /var/lib/tempo/wal
    local:
      path: /var/lib/tempo/blocks

# Turn the generator on for every tenant (single-tenant here).
overrides:
  defaults:
    metrics_generator:
      processors: [service-graphs, span-metrics]