- Remove unnecessary whitespace
- Other minor cleanup
Steve Flanders 2020-11-12 09:07:54 -05:00 committed by GitHub
parent 1d5d158609
commit 1dedaeee84
13 changed files with 84 additions and 67 deletions

@ -26,4 +26,4 @@ A clear and concise description of any alternative solutions or features you've
**Additional context**
Add any other context or screenshots about the feature request here.
-_Plesae delete paragraphs that you did not use before submitting._
+_Please delete paragraphs that you did not use before submitting._

@ -276,7 +276,7 @@ See [release](docs/release.md) for details.
## Common Issues
-Build fails due to depenedency issues, e.g.
+Build fails due to dependency issues, e.g.
```sh
go: github.com/golangci/golangci-lint@v1.31.0 requires

@ -216,7 +216,7 @@ dedicated port for Agent, while there could be multiple instrumented processes.
were sent in a subsequent message. Identifier is no longer needed once the
streams are established.
3. On Sender side, if connection to Collector failed, Sender should retry
-indefintely if possible, subject to available/configured memory buffer size.
+indefinitely if possible, subject to available/configured memory buffer size.
(Reason: consider environments where the running applications are already
instrumented with OpenTelemetry Library but Collector is not deployed yet.
Sometime in the future, we can simply roll out the Collector to those

@ -17,15 +17,15 @@ desired reliability level.
### Low on CPU Resources
-This depends on the CPU metrics available on the deployment, eg.:
-`kube_pod_container_resource_limits_cpu_cores` for Kubernetes. Let's call it
-`available_cores` below. The idea here is to have an upper bound of the number
-of available cores, and the maximum expected ingestion rate considered safe,
+This depends on the CPU metrics available on the deployment, eg.:
+`kube_pod_container_resource_limits_cpu_cores` for Kubernetes. Let's call it
+`available_cores` below. The idea here is to have an upper bound of the number
+of available cores, and the maximum expected ingestion rate considered safe,
let's call it `safe_rate`, per core. This should trigger increase of resources/
instances (or raise an alert as appropriate) whenever
`(actual_rate/available_cores) < safe_rate`.
-The `safe_rate` depends on the specific configuration being used.
+The `safe_rate` depends on the specific configuration being used.
// TODO: Provide reference `safe_rate` for a few selected configurations.
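As an illustration of acting on this rule, the observed ingestion rate can be compared against the safe capacity (`available_cores * safe_rate`) in a Prometheus alerting rule. This is a sketch only: the metric name is just an example of an ingestion-rate metric, and `available_cores = 4`, `safe_rate = 10000` spans/sec per core, and the 80% threshold are placeholders.
```yaml
groups:
  - name: otelcol-capacity
    rules:
      - alert: CollectorIngestionNearSafeRate
        # Example ingestion-rate metric; substitute whatever your deployment exposes.
        # Placeholders: available_cores = 4, safe_rate = 10000 spans/sec per core,
        # firing once 80% of the safe capacity is in use.
        expr: sum(rate(otelcol_receiver_accepted_spans[5m])) / (4 * 10000) > 0.8
        for: 5m
        labels:
          severity: warning
```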
## Secondary Monitoring

@ -24,4 +24,4 @@ Support more application-specific metric collection (e.g. Kafka, Hadoop, etc)
| |
**Other Features**|
Graceful shutdown (pipeline draining)| |[#483](https://github.com/open-telemetry/opentelemetry-collector/issues/483)
-Deprecate queue retry processor and enable queueing per exporter by default||[#1721](https://github.com/open-telemetry/opentelemetry-collector/issues/1721)
+Deprecate queue retry processor and enable queuing per exporter by default||[#1721](https://github.com/open-telemetry/opentelemetry-collector/issues/1721)

@ -27,7 +27,7 @@ by passing the `--metrics-addr` flag to the `otelcol` process. See `--help` for
more details.
```bash
-# otelcol --metrics-addr 0.0.0.0:8888
+$ otelcol --metrics-addr 0.0.0.0:8888
```
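These self-observability metrics are exposed in Prometheus format, so they can be scraped with an ordinary Prometheus job; a minimal sketch (the job name and interval are illustrative):
```yaml
scrape_configs:
  - job_name: otel-collector        # illustrative name
    scrape_interval: 10s
    static_configs:
      - targets: ['0.0.0.0:8888']   # the address passed via --metrics-addr
```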
A grafana dashboard for these metrics can be found

examples/README.md (new file)

@ -0,0 +1,5 @@
# Examples
Information on how the examples can be used can be found in the [Getting
Started
documentation](https://opentelemetry.io/docs/collector/getting-started/).

@ -23,6 +23,6 @@ The following configuration options can be modified:
- `requests_per_second` is the average number of requests per seconds.
- `resource_to_telemetry_conversion`
- `enabled` (default = false): If `enabled` is `true`, all the resource attributes will be converted to metric labels by default.
-- `timeout` (defult = 5s): Time to wait per individual attempt to send data to a backend.
+- `timeout` (default = 5s): Time to wait per individual attempt to send data to a backend.
-The full list of settings exposed for this helper exporter are documented [here](factory.go).
+The full list of settings exposed for this helper exporter are documented [here](factory.go).
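As a sketch of how the `timeout` override might look in a Collector config (the exporter name is a placeholder for any exporter built on this helper):
```yaml
exporters:
  someexporter:    # placeholder for an exporter that uses this helper
    timeout: 10s   # overrides the 5s default per-attempt send timeout
```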

@ -14,7 +14,7 @@ The following diagram shows an example of Prometheus remote write API usage, wit
![Cortex Archietecture](./img/cortex.png)
-Our project is focused on developing an exporter for the OpenTelemetry Collector to any Prometheus remote storage backend.
+Our project is focused on developing an exporter for the OpenTelemetry Collector to any Prometheus remote storage backend.
### **1.1 Remote Write API**
@ -29,7 +29,7 @@ More details of Prometheus remote write API can be found in Prometheus [document
### **1.2 Gaps and Assumptions**
-**Gap 1:**
+**Gap 1:**
Currently, metrics from the OpenTelemetry SDKs cannot be exported to Prometheus from the collector correctly ([#1255](https://github.com/open-telemetry/opentelemetry-collector/issues/1255)). This is because the SDKs send metrics to the collector via their OTLP exporter, which exports the delta value of cumulative counters. The same issue will arise for exporting to any Prometheus remote storage backend.
To overcome this gap in the Collector pipeline, we had proposed 2 different solutions:
@ -38,19 +38,19 @@ To overcome this gap in the Collector pipeline, we had proposed 2 different solu
2. Require the OTLP exporters in SDKs to [send cumulative values for cumulative metric types to the Collector by default](https://github.com/open-telemetry/opentelemetry-specification/issues/731). Therefore, no aggregation of delta metric values is required in the Collector pipeline for Prometheus/storage backends to properly process the data.
**Gap 2:**
-Another gap is that OTLP metric definition is still in development. This exporter will require refactoring as OTLP changes in the future.
+Another gap is that OTLP metric definition is still in development. This exporter will require refactoring as OTLP changes in the future.
**Assumptions:**
Because of the gaps mentioned above, this project will convert from the current OTLP metrics and work under the assumption one of the above solutions will be implemented, and all incoming monotonic scalars/histogram/summary metrics should be cumulative or otherwise dropped. More details on the behavior of the exporter is in section 2.2.
## **2. Prometheus Remote Write/Cortex Exporter**
-The Prometheus remote write/Cortex exporter should receive OTLP metrics, group data points by metric name and label set, convert each group to a TimeSeries, and send all TimeSeries to a storage backend via HTTP.
+The Prometheus remote write/Cortex exporter should receive OTLP metrics, group data points by metric name and label set, convert each group to a TimeSeries, and send all TimeSeries to a storage backend via HTTP.
### **2.1 Receiving Metrics**
-The Prometheus remote write/Cortex exporter receives a MetricsData instance in its PushMetrics() function. MetricsData contains a collection of Metric instances. Each Metric instance contains a series of data points, and each data point has a set of labels associated with it. Since Prometheus remote write TimeSeries are identified by unique sets of labels, the exporter needs to group data points within each Metric instance by their label set, and convert each group to a TimeSeries.
+The Prometheus remote write/Cortex exporter receives a MetricsData instance in its PushMetrics() function. MetricsData contains a collection of Metric instances. Each Metric instance contains a series of data points, and each data point has a set of labels associated with it. Since Prometheus remote write TimeSeries are identified by unique sets of labels, the exporter needs to group data points within each Metric instance by their label set, and convert each group to a TimeSeries.
-To group data points by label set, the exporter should create a map with each PushMetrics() call. The key of the map should represent a combination of the following information:
+To group data points by label set, the exporter should create a map with each PushMetrics() call. The key of the map should represent a combination of the following information:
* the metric type
* the metric name
@ -67,20 +67,20 @@ The value of the map should be Prometheus TimeSeries, and each data points va
Pseudocode:
-func PushMetrics(metricsData) {
-// Create a map that stores distinct TimeSeries
-map := make(map[String][]TimeSeries)
-for metric in metricsData:
-for point in metric:
-// Generate signature string
-sig := pointSignature(metric, point)
-// Find corresponding TimeSeries in map
-// Add to TimeSeries
-// Sends TimeSeries to backend
+func PushMetrics(metricsData) {
+// Create a map that stores distinct TimeSeries
+map := make(map[String][]TimeSeries)
+for metric in metricsData:
+for point in metric:
+// Generate signature string
+sig := pointSignature(metric, point)
+// Find corresponding TimeSeries in map
+// Add to TimeSeries
+// Sends TimeSeries to backend
export(map)
}
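To make the grouping step concrete, here is a small Go sketch of the signature idea described above; the function and parameter names are illustrative, not the exporter's actual helpers. The key combines the metric type and name with the sorted label set, so data points carrying the same labels always map to the same TimeSeries.
```go
package prwexample

import (
	"sort"
	"strings"
)

// timeSeriesSignature builds a map key from the metric type, metric name, and
// label set. Labels are sorted so that the same set of labels produces the
// same signature regardless of the order they arrive in.
func timeSeriesSignature(metricType, metricName string, labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var sb strings.Builder
	sb.WriteString(metricType)
	sb.WriteString("-")
	sb.WriteString(metricName)
	for _, k := range keys {
		sb.WriteString("-")
		sb.WriteString(k)
		sb.WriteString("-")
		sb.WriteString(labels[k])
	}
	return sb.String()
}
```
For example, two cumulative data points named `http_requests_total` carrying `{method="GET", code="200"}` would produce the same signature and therefore extend the same TimeSeries.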
@ -125,19 +125,19 @@ Authentication credentials should be added to each request before sending to the
Pseudocode:
-func export(*map) error {
-// Stores timeseries
-arr := make([]TimeSeries)
-for timeseries in map:
-arr = append(arr, timeseries)
-// Converts arr to WriteRequest
-request := proto.Marshal(arr)
-// Sends HTTP request to endpoint
+func export(*map) error {
+// Stores timeseries
+arr := make([]TimeSeries)
+for timeseries in map:
+arr = append(arr, timeseries)
+// Converts arr to WriteRequest
+request := proto.Marshal(arr)
+// Sends HTTP request to endpoint
}
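A hedged Go sketch of that export flow using the Prometheus `prompb` types and the snappy-compressed protobuf encoding the remote write protocol expects; retries, authentication, and error wrapping are omitted, and the exact types and helpers the exporter ends up using may differ.
```go
package prwexample

import (
	"bytes"
	"net/http"

	"github.com/gogo/protobuf/proto"
	"github.com/golang/snappy"
	"github.com/prometheus/prometheus/prompb"
)

// export wraps the accumulated TimeSeries in a WriteRequest, serializes and
// snappy-compresses it, and POSTs it to the remote write endpoint.
func export(endpoint string, tsMap map[string]*prompb.TimeSeries) error {
	arr := make([]prompb.TimeSeries, 0, len(tsMap))
	for _, ts := range tsMap {
		arr = append(arr, *ts)
	}

	data, err := proto.Marshal(&prompb.WriteRequest{Timeseries: arr})
	if err != nil {
		return err
	}

	// Remote write bodies are snappy-compressed protobuf.
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(snappy.Encode(nil, data)))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Encoding", "snappy")
	req.Header.Set("Content-Type", "application/x-protobuf")
	req.Header.Set("X-Prometheus-Remote-Write-Version", "0.1.0")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	return nil
}
```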
## **3. Other Components**
@ -147,13 +147,13 @@ Pseudocode:
This struct is based on an inputted YAML file at the beginning of the pipeline and defines the configurations for an Exporter build. Examples of configuration parameters are HTTP endpoint, compression type, backend program, etc.
-Converting YAML to a Go struct is done by the Collector, using [_the Viper package_](https://github.com/spf13/viper), which is an open-source library that seamlessly converts inputted YAML files into a usable, appropriate Config struct.
+Converting YAML to a Go struct is done by the Collector, using [_the Viper package_](https://github.com/spf13/viper), which is an open-source library that seamlessly converts inputted YAML files into a usable, appropriate Config struct.
An example of the exporter section of the Collector config.yml YAML file can be seen below:
...
exporters:
prometheus_remote_write:
http_endpoint: <string>
@ -167,7 +167,7 @@ An example of the exporter section of the Collector config.yml YAML file can be
[X-Prometheus-Remote-Write-Version:<string>]
[Tenant-id:<int>]
request_timeout: <int>
# ************************************************************************
# below are configurations copied from Prometheus remote write config
# ************************************************************************
@ -178,31 +178,31 @@ An example of the exporter section of the Collector config.yml YAML file can be
[ username: <string> ]
[ password: <string> ]
[ password_file: <string> ]
# Sets the `Authorization` header on every remote write request with
# the configured bearer token. It is mutually exclusive with `bearer_token_file`.
[ bearer_token: <string> ]
# Sets the `Authorization` header on every remote write request with the bearer token
# read from the configured file. It is mutually exclusive with `bearer_token`.
[ bearer_token_file: /path/to/bearer/token/file ]
# Configures the remote write request's TLS settings.
tls_config:
# CA certificate to validate API server certificate with.
[ ca_file: <filename> ]
# Certificate and key files for client cert authentication to the server.
[ cert_file: <filename> ]
[ key_file: <filename> ]
# ServerName extension to indicate the name of the server.
# https://tools.ietf.org/html/rfc4366#section-3.1
[ server_name: <string> ]
# Disable validation of the server certificate.
[ insecure_skip_verify: <boolean> ]
...
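Purely as an illustration of the YAML-to-struct mapping, the fields shown above could be represented like this with `mapstructure` tags; the exporter's actual Config embeds the Collector's shared exporter settings rather than being hand-written this way.
```go
package prwexample

// Config is an illustrative stand-in for the exporter configuration derived
// from the YAML above. Field names and types are assumptions, not the real struct.
type Config struct {
	HTTPEndpoint   string            `mapstructure:"http_endpoint"`
	Headers        map[string]string `mapstructure:"headers"`
	RequestTimeout int               `mapstructure:"request_timeout"` // seconds, matching the <int> placeholder
}
```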
### **3.2 Factory Struct**
@ -241,9 +241,9 @@ Once the shutdown() function is called, the exporter should stop accepting incom
func Shutdown () {
close(stopChan)
-waitGroup.Wait()
+waitGroup.Wait()
}
func PushMetrics() {
select:
case <- stopCh
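A minimal Go sketch of this shutdown coordination, assuming a stop channel plus a WaitGroup tracking in-flight PushMetrics calls; the field and error names are illustrative, and a production implementation also needs to guard against `Add` racing with `Wait`.
```go
package prwexample

import (
	"errors"
	"sync"
)

var errShutdown = errors.New("exporter is shut down")

type exporter struct {
	stopCh    chan struct{}
	waitGroup sync.WaitGroup
}

// Shutdown stops accepting new batches and waits for in-flight exports to finish.
func (e *exporter) Shutdown() {
	close(e.stopCh)
	e.waitGroup.Wait()
}

// PushMetrics rejects new data once Shutdown has been called.
func (e *exporter) PushMetrics( /* metricsData */ ) error {
	select {
	case <-e.stopCh:
		return errShutdown
	default:
	}

	e.waitGroup.Add(1)
	defer e.waitGroup.Done()

	// ... group data points, build TimeSeries, and call export() ...
	return nil
}
```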
@ -280,7 +280,7 @@ We will follow test-driven development practices while completing this project.
## **Request for Feedback**
-We'd like to get some feedback on whether we made the appropriate assumptions in [this](#1.2-gaps-and-ssumptions) section, and appreciate more comments, updates , and suggestions on the topic.
+We'd like to get some feedback on whether we made the appropriate assumptions in [this](#1.2-gaps-and-ssumptions) section, and appreciate more comments, updates , and suggestions on the topic.
Please let us know if there are any revisions, technical or informational, necessary for this document. Thank you!

@ -1,16 +1,25 @@
# General Information
-Extensions provide capabilities on top of the primary functionality of the collector.
-Generally, extensions are used for implementing components that can be added to the Collector, but which do not require direct access to telemetry data and are not part of the pipelines (like receivers, processors or exporters). Example extensions are: Health Check extension that responds to health check requests or PProf extension that allows fetching Collector's performance profile.
+Extensions provide capabilities on top of the primary functionality of the
+collector. Generally, extensions are used for implementing components that can
+be added to the Collector, but which do not require direct access to telemetry
+data and are not part of the pipelines (like receivers, processors or
+exporters). Example extensions are: Health Check extension that responds to
+health check requests or PProf extension that allows fetching Collector's
+performance profile.
Supported service extensions (sorted alphabetically):
- [Health Check](healthcheckextension/README.md)
- [Performance Profiler](pprofextension/README.md)
- [zPages](zpagesextension/README.md)
-The [contributors repository](https://github.com/open-telemetry/opentelemetry-collector-contrib)
-may have more extensions that can be added to custom builds of the Collector.
+The [contributors
+repository](https://github.com/open-telemetry/opentelemetry-collector-contrib)
+may have more extensions that can be added to custom builds of the Collector.
## Ordering Extensions
The order extensions are specified for the service is important as this is the
order in which each extension will be started and the reverse order in which they
will be shutdown. The ordering is determined in the `extensions` tag under the
@ -45,6 +54,7 @@ The full list of settings exposed for this exporter is documented [here](healthc
with detailed sample configurations [here](healthcheckextension/testdata/config.yaml).
## <a name="pprof"></a>Performance Profiler
Performance Profiler extension enables the golang `net/http/pprof` endpoint.
This is typically used by developers to collect performance profiles and
investigate issues with the service.
@ -66,8 +76,8 @@ The following settings can be optionally configured:
Collector starts and is saved to the file when the Collector is terminated.
Example:
-```yaml
+```yaml
extensions:
pprof:
```
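For reference, a slightly fuller sketch with an explicit endpoint; `localhost:1777` is believed to be the extension's default, but check the linked README for the authoritative settings:
```yaml
extensions:
  pprof:
    endpoint: localhost:1777
```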
@ -76,9 +86,10 @@ The full list of settings exposed for this exporter are documented [here](pprofe
with detailed sample configurations [here](pprofextension/testdata/config.yaml).
## <a name="zpages"></a>zPages
Enables an extension that serves zPages, an HTTP endpoint that provides live
data for debugging different components that were properly instrumented for such.
-All core exporters and receivers provide some zPage instrumentation.
+All core exporters and receivers provide some zPages instrumentation.
The following settings are required:
@ -86,6 +97,7 @@ The following settings are required:
zPages.
Example:
```yaml
extensions:
zpages:

@ -40,12 +40,12 @@ processor documentation for more information.
2. *any sampling processors*
3. [batch](batchprocessor/README.md)
4. *any other processors*
5. [queued_retry](queuedprocessor/README.md)
### Metrics
1. [memory_limiter](memorylimiter/README.md)
2. [batch](batchprocessor/README.md)
3. *any other processors*
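Put together, the recommended ordering could look like this in the `service` pipelines section; the receiver and exporter names are placeholders, and sampling or other processors would slot in around `batch` as listed above.
```yaml
service:
  pipelines:
    traces:
      # memory_limiter first, sampling processors next, then batch,
      # any other processors, and queued_retry last (where it is used).
      receivers: [otlp]
      processors: [memory_limiter, batch, queued_retry]
      exporters: [otlp]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp]
```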
## <a name="data-ownership"></a>Data Ownership

@ -46,11 +46,11 @@ allocated by the process heap. Note that typically the total memory usage of
process will be about 50MiB higher than this value.
- `spike_limit_mib` (default = 0): Maximum spike expected between the
measurements of memory usage. The value must be less than `limit_mib`.
-- `limit_percentage` (default = 0): Maximum amount of total memory, in percents, targeted to be
+- `limit_percentage` (default = 0): Maximum amount of total memory targeted to be
allocated by the process heap. This configuration is supported on Linux systems with cgroups
and it's intended to be used in dynamic platforms like docker.
This option is used to calculate `memory_limit` from the total available memory.
-For instance setting of 75% with the total memory of 1GiB will result in the limit of 750 MiB.
+For instance setting of 75% with the total memory of 1GiB will result in the limit of 750 MiB.
The fixed memory setting (`limit_mib`) takes precedence
over the percentage configuration.
- `spike_limit_percentage` (default = 0): Maximum spike expected between the
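For the percentage-based settings above, a configuration along these lines (values are illustrative, not recommendations) caps the heap at 75% of total memory, i.e. 750 MiB on a 1 GiB host:
```yaml
processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 75
    spike_limit_percentage: 25
```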

@ -256,7 +256,7 @@ type MetricsData struct {
The scrape page as whole also can be fit into the above `MetricsData` data
structure, and all the metrics data points can be stored with the `Metrics`
-array. We will explain the mappings of individual metirc types in the following
+array. We will explain the mappings of individual metric types in the following
couple sections
### Metric Value Mapping