.. _rosi-collector-grafana-dashboards:

Grafana Dashboards
==================

.. index::
   pair: ROSI Collector; dashboards
   single: Grafana
   single: Syslog Explorer
   single: LogQL

ROSI Collector includes pre-built Grafana dashboards for exploring logs and
monitoring system health. This guide explains each dashboard and how to use
them effectively.

Dashboard Overview
------------------

ROSI Collector provisions five dashboards:

+------------------------+------------------------------------------------+
| Dashboard              | Purpose                                        |
+========================+================================================+
| Syslog Explorer        | Search and browse logs from all hosts          |
+------------------------+------------------------------------------------+
| Syslog Deep Dive       | Detailed analysis of log patterns              |
+------------------------+------------------------------------------------+
| Node Overview          | System metrics (CPU, memory, disk, network)    |
+------------------------+------------------------------------------------+
| Client Health          | rsyslog client status and statistics           |
+------------------------+------------------------------------------------+
| Alerting Overview      | Active alerts and notification status          |
+------------------------+------------------------------------------------+

Syslog Explorer
---------------

.. figure:: /_static/dashboard-explorer.png
   :alt: Syslog Explorer Dashboard
   :align: center

   Syslog Explorer dashboard for browsing and searching logs

The Syslog Explorer is your primary interface for searching logs. It provides:

- **Time range selector** - Choose the time window for your search
- **Host filter** - Limit results to specific hosts
- **Severity filter** - Filter by syslog severity (err, warning, info, etc.)
- **Facility filter** - Filter by syslog facility (auth, daemon, etc.)
- **Log table** - Browse log entries with timestamp, host, and message

**Common searches**:

- All errors: Set severity to ``err``
- Authentication events: Set facility to ``auth``
- Specific host: Select from host dropdown

**Using LogQL for advanced queries**:

Click "Explore" in the left sidebar and use LogQL queries:

.. code-block:: text

   # Find SSH failures
   {job="syslog"} |= "Failed password"

   # Errors from specific host
   {host="webserver-01"} | json | severity = "err"

   # Count errors by host
   sum by (host) (count_over_time({job="syslog"} | json | severity = "err" [5m]))

Syslog Deep Dive
----------------

.. figure:: /_static/dashboard-deepdive.png
   :alt: Syslog Deep Dive Dashboard
   :align: center

   Syslog Deep Dive dashboard for log analysis

This dashboard provides analytical views of log data:

- **Log volume graph** - Messages over time
- **Severity breakdown** - Pie chart of log severities
- **Top hosts** - Hosts generating the most logs
- **Error trends** - Error count over time

Use this dashboard to:

- Identify hosts with unusual log volumes
- Track error rates over time
- Find the most common error messages

Node Overview
-------------

.. figure:: /_static/dashboard-node.png
   :alt: Node Overview Dashboard
   :align: center

   Node Overview showing system metrics

The Node Overview shows system metrics from node_exporter:

- **CPU usage** - Per-core and total utilization
- **Memory** - Used, available, and cached
- **Disk I/O** - Read/write throughput
- **Network** - Bytes in/out per interface
- **Disk space** - Usage by filesystem

Select a host from the dropdown to view its metrics. The selected time range
applies to all panels.

**Key metrics to watch**:

- CPU usage sustained above 80%
- Memory usage approaching 100%
- Disk usage above 80%
- Network errors or drops

Alerting Overview
-----------------

.. figure:: /_static/dashboard-alerts.png
   :alt: Alerting Overview Dashboard
   :align: center

   Alerting Overview dashboard for monitoring alert status

View active alerts and notification status:

- **Firing alerts** - Currently active alerts
- **Alert history** - Recent alert state changes
- **Silence status** - Active silences

Creating Custom Dashboards
--------------------------

You can create your own dashboards in Grafana:

1. Click the **+** icon in the left sidebar
2. Select **New Dashboard**
3. Add panels using Loki (logs) or Prometheus (metrics) as the data source

**Loki query examples**:

.. code-block:: text

   # Count logs by host
   sum by (host) (count_over_time({job="syslog"}[5m]))

   # Specific application logs
   {job="syslog", host="appserver"} |= "myapp"

   # Parse and filter
   {job="syslog"} | pattern "<_> <_> <_> . " | severity = "err"

**Prometheus query examples**:

.. code-block:: text

   # CPU usage percentage
   100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

   # Memory usage
   (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100

   # Disk usage
   100 - ((node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100)

Exporting Dashboards
--------------------

To back up or share dashboards:

1. Open the dashboard
2. Click the gear icon (Dashboard settings)
3. Select **JSON Model**
4. Copy the JSON and save it

Or use the API:

.. code-block:: bash

   curl -s http://localhost:3000/api/dashboards/db/syslog-explorer \
     -H "Authorization: Bearer YOUR_API_KEY" | jq .
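
For scripted backups, the API export can be wrapped in a small helper. The
following is a minimal sketch, not part of ROSI Collector. It assumes a Grafana
version that serves dashboards by UID at ``/api/dashboards/uid/<uid>`` and
returns the model under a ``dashboard`` key. The function names, the UID
``syslog-explorer``, and the output path are hypothetical placeholders.

.. code-block:: python

   import json
   import urllib.request


   def build_export_request(base_url, uid, api_key):
       """Build a GET request for one dashboard, addressed by UID (assumed endpoint)."""
       url = f"{base_url.rstrip('/')}/api/dashboards/uid/{uid}"
       return urllib.request.Request(
           url, headers={"Authorization": f"Bearer {api_key}"}
       )


   def extract_dashboard(payload):
       """Return the dashboard model from an API response.

       The numeric ``id`` is dropped because it is specific to one Grafana
       instance; this is a common convention, assumed here, for portable exports.
       """
       model = dict(payload["dashboard"])
       model.pop("id", None)
       return model


   def export_dashboard(base_url, uid, api_key, path):
       """Fetch a dashboard over HTTP and write its JSON model to ``path``."""
       request = build_export_request(base_url, uid, api_key)
       with urllib.request.urlopen(request) as response:
           payload = json.load(response)
       with open(path, "w", encoding="utf-8") as handle:
           json.dump(extract_dashboard(payload), handle, indent=2)

A call such as ``export_dashboard("http://localhost:3000", "syslog-explorer",
"YOUR_API_KEY", "syslog-explorer.json")`` would write one dashboard to disk.
Replace the UID with the one shown in the dashboard's URL or JSON model.
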

Dashboard Best Practices
------------------------

**For log exploration**:

- Start with a narrow time range
- Use filters to reduce result set
- Export to CSV for offline analysis

**For monitoring**:

- Set appropriate alert thresholds
- Use template variables for host selection
- Create summary panels for quick status

**For performance**:

- Avoid queries spanning more than 24 hours on busy systems
- Use labels to filter before pattern matching
- Limit log line display count

See Also
--------

- `Grafana Loki documentation `_
- `LogQL query language `_
- `PromQL basics `_