Prometheus

Prometheus is an open source, metrics-based monitoring system. It stores data as time series, each identified by a metric name and a set of key-value pairs called labels.
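To make the data model concrete, here is a minimal Python sketch of how a series can be identified by its name and labels. It is an illustration only, not how Prometheus stores data internally; `SeriesKey`, `series_key`, `append`, `store` and the metric `http_requests_total` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeriesKey:
    """Identity of one time series: a metric name plus its labels."""
    name: str
    labels: frozenset  # frozenset of (label_name, label_value) pairs


def series_key(name, **labels):
    return SeriesKey(name, frozenset(labels.items()))


# Each series maps to a list of (unix_timestamp, value) samples.
store = {}


def append(name, timestamp, value, **labels):
    store.setdefault(series_key(name, **labels), []).append((timestamp, value))


append("http_requests_total", 1700000000, 10.0, method="GET", status="200")
append("http_requests_total", 1700000015, 12.0, method="GET", status="200")
append("http_requests_total", 1700000000, 1.0, method="GET", status="500")

# Same metric name, different label values: two distinct series.
print(len(store))  # 2
```

Changing any label value produces a new series, which is why the choice of labels matters so much.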

PromQL is a query language that can aggregate across any of these labels, letting you see metrics per process, machine, namespace, or cluster. PromQL is used to build graphs in tools like Grafana and to define alerts.
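As a rough illustration of what aggregating across labels means, the Python sketch below mimics a PromQL expression of the form `sum by (namespace) (some_metric)`. The sample data, the metric, and the helper `sum_by` are assumptions made up for this example; real PromQL is evaluated by Prometheus itself.

```python
from collections import defaultdict

# Hypothetical samples of one metric at one instant: (labels, value).
samples = [
    ({"namespace": "shop", "pod": "web-1"}, 3.0),
    ({"namespace": "shop", "pod": "web-2"}, 5.0),
    ({"namespace": "infra", "pod": "dns-1"}, 2.0),
]


def sum_by(samples, *keep):
    """Sum values, grouping by the given label names and dropping the rest."""
    totals = defaultdict(float)
    for labels, value in samples:
        group = tuple((name, labels.get(name, "")) for name in keep)
        totals[group] += value
    return dict(totals)


per_namespace = sum_by(samples, "namespace")   # like: sum by (namespace) (...)
per_pod = sum_by(samples, "namespace", "pod")  # like: sum by (namespace, pod) (...)
overall = sum_by(samples)                      # like: sum(...)
```

Keeping fewer labels gives a coarser view (per namespace), keeping more gives a finer one (per pod).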

Prometheus works as follows:

  • metrics are exposed by the application (Exposing Prometheus Metrics)
  • Prometheus discovers scrape targets (Prometheus Service Discovery)
  • it scrapes their metrics (Prometheus Metric Scraping)
  • it stores their metrics (Prometheus Storage)
  • it exposes these metrics to be queried or used to create alerts (Prometheus Querying, Prometheus Alerting)
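The steps above can be sketched end to end in a few lines of Python. This is a toy model under stated assumptions, not Prometheus code: `expose`, `discover`, `fetch`, `scrape`, `scrape_all`, `query` and the target `app-1:8080` are hypothetical, the HTTP request is replaced by a plain function call, and the parser only handles a simplified subset of the text exposition format.

```python
import re


def expose(metrics):
    """Render metrics in a simplified form of the text exposition format."""
    lines = []
    for (name, labels), value in metrics.items():
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


LINE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')

# Hypothetical application state: (name, label pairs) -> current value.
app_metrics = {("http_requests_total", (("method", "GET"),)): 42}


def discover():
    """Stand-in for service discovery: a static list of targets."""
    return ["app-1:8080"]


def fetch(target):
    """Stand-in for the HTTP scrape request sent to a target."""
    return expose(app_metrics)


def scrape(target):
    """Fetch one target and parse each line into (name, labels, value)."""
    parsed = []
    for line in fetch(target).splitlines():
        match = LINE.match(line)
        if match:
            name, raw_labels, value = match.groups()
            parsed.append((name, dict(LABEL.findall(raw_labels)), float(value)))
    return parsed


storage = []  # rows of (timestamp, name, labels, value)


def scrape_all(now):
    for target in discover():
        for name, labels, value in scrape(target):
            # Record which target the sample came from as a label.
            storage.append((now, name, {**labels, "instance": target}, value))


def query(name):
    return [row for row in storage if row[1] == name]


scrape_all(now=1700000000)
```

The point of the sketch is the pull model: the application only renders its current values, and the monitoring side decides when to fetch and store them.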

It's also worth noting what Prometheus is not:

  • not suitable for storing event logs or individual events
  • not suitable for high cardinality data (e.g. email addresses or usernames as label values)
  • not aimed at 100% accuracy: Prometheus makes tradeoffs and prefers to give you data that is 99.9% correct rather than have your monitoring break while waiting for a 100% correct answer. If high accuracy is needed, use Prometheus with caution
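The high cardinality point is easiest to see with some arithmetic. Since every distinct combination of label values is its own time series, the worst-case series count for a metric is the product of the number of values per label. The label names and counts below are invented for illustration, and `max_series` is a hypothetical helper.

```python
from math import prod


def max_series(label_values):
    """Upper bound on series count: product of distinct values per label."""
    return prod(label_values.values())


# Hypothetical label sets: label name -> number of distinct values.
bounded = {"method": 5, "status": 10, "instance": 20}
with_user = {**bounded, "username": 100_000}

print(max_series(bounded))    # 1000
print(max_series(with_user))  # 100000000
```

A single unbounded label such as a username multiplies the series count for that metric by the number of users, which is why such values belong in logs or another system rather than in labels.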

Status: #🌱
