Friday, July 27, 2018

New Metricbeat Module: Envoyproxy Module

The Envoyproxy module is a Metricbeat module that collects metrics from Envoy Proxy and ships them to Elasticsearch or Logstash.

Like all Beats and Metricbeat modules, the Envoyproxy module is written in Go.

You can check out this blog post for more information about how to create a Metricbeat Module.

What is Envoy Proxy?


Envoy is an L7 proxy and communication bus designed for large modern service-oriented architectures.

Envoy Stats

Envoy outputs numerous statistics which depend on how the server is configured. They can be seen locally via the /stats admin endpoint. The admin endpoint looks directly into the store to load all of the counters and gauges and print them.

The Envoyproxy module reads its data from the Envoy admin endpoint. For now, it collects only the non-dynamic statistics.
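
To give a feel for what reading the admin endpoint involves, here is a minimal Python sketch that parses text in the "name: value" form printed by /stats and groups it by top-level namespace. The function name parse_stats and the sample text are made up for illustration. The module itself is written in Go, and its actual parsing code may well differ.

```python
def parse_stats(text):
    """Group 'namespace.metric: value' lines by their top-level namespace."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        if not sep or "." not in name:
            continue  # skip blank or unexpected lines
        value = value.strip()
        if not value.lstrip("-").isdigit():
            continue  # this sketch keeps integer counters and gauges only
        namespace, _, metric = name.strip().partition(".")
        stats.setdefault(namespace, {})[metric] = int(value)
    return stats


# Hypothetical sample of admin endpoint output, not captured from a real server.
SAMPLE = """\
cluster_manager.active_clusters: 1
cluster_manager.warming_clusters: 0
server.uptime: 15
server.live: 1
"""

parsed = parse_stats(SAMPLE)
print(parsed["server"]["uptime"])
```

The nested dictionary this produces has the same shape as the namespaces the metricset reports, such as cluster_manager and server.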

I have blogged in the past about "Envoy Proxy Statistics" and "Envoy Proxy Installation to Ubuntu 17.10".

Getting Started with Envoyproxy Module


First, enable the Envoyproxy module:
./metricbeat modules enable envoyproxy
You can list the enabled and disabled modules by running the following command:
./metricbeat modules list
When you then run Metricbeat, it loads the corresponding module configurations from the modules.d directory.
For more information, you can check [1].

The Envoyproxy module is now enabled and ready to collect metrics.

Configuration


The Envoyproxy module has a few settings. To change them, edit the modules.d/envoyproxy.yml file to suit your needs.

Here is a sample configuration:
- module: envoyproxy
  metricsets: ["server"]
  period: 10s
  hosts: ["localhost:9901"]

module: The name of the module to run.
metricsets: A list of metricsets to execute. (For now, the Envoyproxy module has only one metricset.)
period: How often the metricsets are executed. If a system is not reachable, Metricbeat returns an error for each period.
hosts: A list of hosts to fetch information from.
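
As a rough sketch of how these settings fit together, the Python snippet below turns the sample period into seconds and builds one stats URL per host. The helper names, the http:// scheme, and the limited set of duration units are assumptions made for this illustration. Metricbeat's own configuration parsing is more general than this.

```python
# Assumption: only simple durations with one unit suffix are handled here.
UNITS = {"s": 1, "m": 60, "h": 3600}


def period_seconds(period):
    """Convert a simple duration such as '10s' or '1m' to seconds."""
    number, unit = period[:-1], period[-1]
    return int(number) * UNITS[unit]


def stats_urls(hosts):
    """Build the admin /stats URL for each host (plain HTTP is assumed)."""
    return ["http://{}/stats".format(host) for host in hosts]


# The sample configuration above, written as a Python dictionary.
config = {
    "module": "envoyproxy",
    "metricsets": ["server"],
    "period": "10s",
    "hosts": ["localhost:9901"],
}

print(period_seconds(config["period"]))
print(stats_urls(config["hosts"]))
```

With the sample configuration, this means one request to localhost:9901 every 10 seconds.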


Metricsets

The Envoyproxy module has one metricset: server.

Server Metricset


Server statistics describe how the Envoy server instance is working. Statistics like server uptime or amount of allocated memory are categorized here.


Metrics Collected


cluster_manager
Envoy’s cluster manager manages all configured upstream clusters. Just as the Envoy configuration can contain any number of listeners, the configuration can also contain any number of independently configured upstream clusters. The cluster manager has a statistics tree rooted at cluster_manager with the following statistics.

  • active_clusters: Number of currently active (warmed) clusters
  • cluster_added: Total clusters added (either via static config or CDS)
  • cluster_modified: Total clusters modified (via CDS)
  • cluster_removed: Total clusters removed (via CDS)
  • warming_clusters: Number of currently warming (not active) clusters


filesystem
Statistics related to the file system are emitted in the filesystem namespace.

  • flushed_by_timer: Total number of times internal flush buffers are written to a file due to flush timeout
  • reopen_failed: Total number of times a file failed to be opened
  • write_buffered: Total number of times file data is moved to Envoy's internal flush buffer
  • write_completed: Total number of times a file was written
  • write_total_buffered: Current total size of internal flush buffer in bytes


runtime
The runtime configuration specifies the location of the local file system tree that contains re-loadable configuration elements. The file system runtime provider emits some statistics in the runtime namespace.

  • load_error: Total number of load attempts that resulted in an error
  • load_success: Total number of load attempts that were successful
  • num_keys: Number of keys currently loaded
  • override_dir_exists: Total number of loads that did use an override directory
  • override_dir_not_exists: Total number of loads that did not use an override directory
  • admin_overrides_active: 1 if any admin overrides are active, 0 otherwise


listener_manager
The top level Envoy configuration contains a list of listeners. The listener manager has a statistics tree rooted at listener_manager with the following statistics.

  • listener_added: Total listeners added (either via static config or LDS)
  • listener_create_failure: Total failed listener object additions to workers
  • listener_create_success: Total listener objects successfully added to workers
  • listener_modified: Total listeners modified (via LDS)
  • listener_removed: Total listeners removed (via LDS)
  • total_listeners_active: Number of currently active listeners
  • total_listeners_draining: Number of currently draining listeners
  • total_listeners_warming: Number of currently warming listeners


stats
A few statistics are emitted to report statistics system behavior:

  • overflow: Total number of times Envoy cannot allocate a statistic due to a shortage of shared memory


server
Server-related statistics are rooted at server, with the following statistics:

  • days_until_first_cert_expiring: Number of days until the next certificate being managed will expire
  • live: 1 if the server is not currently draining, 0 otherwise
  • memory_allocated: Current amount of allocated memory in bytes
  • memory_heap_size: Current reserved heap size in bytes
  • parent_connections: Total connections of the old Envoy process on hot restart
  • total_connections: Total connections of both new and old Envoy processes
  • uptime: Current server uptime in seconds
  • version: Integer represented version number based on SCM revision
  • watchdog_mega_miss: Number of watchdog mega misses
  • watchdog_miss: Number of standard watchdog misses
  • hot_restart_epoch: Current hot restart epoch


http2
Each codec has the option of adding per-codec statistics. Currently only http2 has codec stats. All http2 statistics are rooted at http2.

  • header_overflow: Total number of connections reset due to the headers being larger than 63k
  • headers_cb_no_stream: Total number of errors where a header callback is called without an associated stream. This tracks an unexpected occurrence due to an as yet undiagnosed bug
  • rx_messaging_error: Total number of invalid received frames that violated section 8 of the HTTP/2 spec. This will result in a tx_reset
  • rx_reset: Total number of reset stream frames received by Envoy
  • too_many_header_frames: Total number of times an HTTP/2 connection is reset due to receiving too many headers frames. Envoy currently supports proxying at most one header frame for 100-Continue, one non-100 response code header frame, and one frame with trailers
  • trailers: Total number of trailers seen on requests coming from downstream
  • tx_reset: Total number of reset stream frames transmitted by Envoy


Here is an example document generated by this metricset:

{
  "@timestamp": "2018-07-26T12:02:15.099Z",
  "@metadata": {
    "beat": "metricbeat",
    "type": "doc",
    "version": "7.0.0-alpha1"
  },
  "metricset": {
    "rtt": 11301,
    "name": "server",
    "module": "envoyproxy",
    "host": "localhost:9901"
  },
  "envoyproxy": {
    "server": {
      "cluster_manager": {
        "warming_clusters": 0,
        "active_clusters": 1,
        "cluster_added": 1,
        "cluster_modified": 0,
        "cluster_removed": 0
      },
      "filesystem": {
        "write_buffered": 1,
        "write_completed": 1,
        "write_total_buffered": 0,
        "flushed_by_timer": 0,
        "reopen_failed": 0
      },
      "runtime": {
        "override_dir_exists": 0,
        "override_dir_not_exists": 0,
        "admin_overrides_active": 0,
        "load_error": 0,
        "load_success": 0,
        "num_keys": 0
      },
      "listener_manager": {
        "listener_added": 1,
        "listener_create_failure": 0,
        "listener_create_success": 4,
        "listener_modified": 0,
        "listener_removed": 0,
        "total_listeners_active": 1,
        "total_listeners_draining": 0,
        "total_listeners_warming": 0
      },
      "stats": {
        "overflow": 0
      },
      "server": {
        "live": 1,
        "memory_heap_size": 4194304,
        "watchdog_mega_miss": 0,
        "version": 4151803,
        "uptime": 15,
        "memory_allocated": 3168904,
        "parent_connections": 0,
        "days_until_first_cert_expiring": 2147483647,
        "watchdog_miss": 0,
        "total_connections": 0,
        "hot_restart_epoch": 0
      },
      "http2": {}
    }
  },
  "host": {
    "name": "kripton"
  },
  "beat": {
    "name": "kripton",
    "hostname": "kripton",
    "version": "7.0.0-alpha1"
  }
}
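
Once documents like this are indexed, you will usually want to derive a few health signals from them. The Python sketch below does this for a trimmed copy of the example document above. The summarize function is hypothetical, and treating 2147483647 (the largest 32-bit signed integer) as "no certificate expiry reported" is an assumption based on the value seen in the example, not something this post confirms.

```python
import json

# Assumption: this sentinel value means no certificate expiry was reported.
INT32_MAX = 2**31 - 1


def summarize(doc):
    """Derive a few health signals from one server metricset document."""
    server = doc["envoyproxy"]["server"]["server"]
    days = server["days_until_first_cert_expiring"]
    return {
        "live": server["live"] == 1,
        "heap_used_ratio": server["memory_allocated"] / server["memory_heap_size"],
        "cert_days_left": None if days == INT32_MAX else days,
    }


# Trimmed copy of the example document, keeping only the fields used here.
EXAMPLE = json.loads("""
{
  "envoyproxy": {
    "server": {
      "server": {
        "live": 1,
        "memory_heap_size": 4194304,
        "memory_allocated": 3168904,
        "days_until_first_cert_expiring": 2147483647
      }
    }
  }
}
""")

summary = summarize(EXAMPLE)
print(summary)
```

A real check would also need to guard against a memory_heap_size of zero before dividing.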