15 Commits

Author SHA1 Message Date
Tyler Helmuth 0b88b764f5
[internal/sharedcomponent] Use a ring buffer to restrict remembered status count (#11826)
#### Description
Use a ring buffer to only remember the last 5 events. This ensures we
remember a reasonable number of events during startup, so that a status
aggregator gets the events for all instances. Then during normal
operation, when we're done adding sources and no longer need to replay
events, we don't have to remember every single event.
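
A minimal sketch of the idea, not the code from this PR: a fixed-size ring buffer that keeps only the most recent events so they can be replayed to a newly added source. The `event` type, `newRing`, `add`, and `snapshot` are hypothetical names chosen for illustration; the size of 5 comes from the description above.

```go
package main

import "fmt"

// event is a hypothetical stand-in for a reported status event.
type event struct {
	status string
}

// ring remembers at most len(buf) of the most recently added events.
type ring struct {
	buf   []event
	next  int // index where the next event will be written
	count int // number of valid events currently stored
}

func newRing(size int) *ring {
	return &ring{buf: make([]event, size)}
}

// add stores ev, overwriting the oldest event once the buffer is full.
func (r *ring) add(ev event) {
	r.buf[r.next] = ev
	r.next = (r.next + 1) % len(r.buf)
	if r.count < len(r.buf) {
		r.count++
	}
}

// snapshot returns the remembered events, oldest first, for replay.
func (r *ring) snapshot() []event {
	out := make([]event, 0, r.count)
	start := (r.next - r.count + len(r.buf)) % len(r.buf)
	for i := 0; i < r.count; i++ {
		out = append(out, r.buf[(start+i)%len(r.buf)])
	}
	return out
}

func main() {
	r := newRing(5)
	for i := 1; i <= 7; i++ {
		r.add(event{status: fmt.Sprintf("event-%d", i)})
	}
	// Only the last five events (event-3 through event-7) remain.
	for _, ev := range r.snapshot() {
		fmt.Println(ev.status)
	}
}
```

Memory stays bounded no matter how many events are reported during normal operation, at the cost of dropping anything older than the last five.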

#### Link to tracking issue
Closes
https://github.com/open-telemetry/opentelemetry-collector/issues/11818

#### Testing

`go test status_test.go -count 1000 -failfast` still passes with many
tries.
2024-12-18 17:52:09 +00:00
Matthieu MOREL 824c9f7a43
[chore]: enable gofumpt linter in internal, otelcol, pdata, pipeline and processor (#11855)
#### Description

[gofumpt](https://golangci-lint.run/usage/linters/#gofumpt) is a
stricter format than gofmt, while being backwards compatible.

---------

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
Co-authored-by: Alex Boten <223565+codeboten@users.noreply.github.com>
2024-12-12 19:21:54 +00:00
Bogdan Drutu f5db5dc952
Remove race-condition and cleanup locking in sharedcomponent host (#11819)
Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>
2024-12-09 19:03:44 +00:00
Bogdan Drutu 65dcab1568
[chore] Remove unused parameter for sharedcomponent (#11439)
Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>
2024-10-14 15:36:22 -07:00
Tyler Helmuth cb24d0c7d7
[component] Remove ReportStatus from component.TelemetrySettings (#10777)
#### Description

This PR removes `ReportStatus` from `component.TelemetrySettings` and
instead expects components to check if their `component.Host` implements
a new `componentstatus.Reporter` interface.
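
The pattern described here can be sketched with a type assertion. The `Host`, `Event`, and `Reporter` types below are simplified local stand-ins, not the collector's actual definitions, and `reportStatus` is a hypothetical helper; the real signatures may differ.

```go
package main

import "fmt"

// Host is a simplified stand-in for component.Host.
type Host interface {
	Name() string
}

// Event is a simplified stand-in for a status event.
type Event struct {
	Status string
}

// Reporter is a stand-in for the optional interface a host may implement.
type Reporter interface {
	Report(ev Event)
}

// plainHost implements Host only.
type plainHost struct{}

func (plainHost) Name() string { return "plain" }

// reportingHost implements both Host and Reporter.
type reportingHost struct {
	events []Event
}

func (h *reportingHost) Name() string    { return "reporting" }
func (h *reportingHost) Report(ev Event) { h.events = append(h.events, ev) }

// reportStatus delivers ev only if host also implements Reporter.
// It returns whether the event was delivered.
func reportStatus(host Host, ev Event) bool {
	r, ok := host.(Reporter)
	if !ok {
		return false
	}
	r.Report(ev)
	return true
}

func main() {
	fmt.Println(reportStatus(plainHost{}, Event{Status: "OK"})) // false
	h := &reportingHost{}
	fmt.Println(reportStatus(h, Event{Status: "OK"}), len(h.events)) // true 1
}
```

Because the check happens at the call site, hosts that do not support status reporting need no changes, and components silently skip the report.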

#### Link to tracking issue
Related to
https://github.com/open-telemetry/opentelemetry-collector/pull/10725
Related to
https://github.com/open-telemetry/opentelemetry-collector/pull/10413

#### Testing
unit tests and a sharedinstance e2e test.

The contrib tests will fail because this is a breaking change. If we
merge this, @mwear and I can commit to updating contrib before the next
release.

---------

Co-authored-by: Pablo Baeyens <pablo.baeyens@datadoghq.com>
2024-08-16 09:27:01 +02:00
Antoine Toulme 6a726c955a
[chore] make sharedcomponent Map threadsafe (#9170)
**Description:**
Add just enough code to make sharedcomponent Map thread safe.

**Link to tracking Issue:**
Relates to #9156

Follow-up to #9157; should be reviewed after it is merged.
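
One way to make such a map thread safe is to guard it with a mutex, sketched below. This is an illustration under assumptions, not the package's actual code: the `Map` type, `NewMap`, and the `create` callback are hypothetical, and only the `LoadOrStore` name is taken from the related PR.

```go
package main

import (
	"fmt"
	"sync"
)

// Map is a hypothetical mutex-guarded map of shared instances.
type Map[K comparable, V any] struct {
	mu    sync.Mutex
	items map[K]V
}

func NewMap[K comparable, V any]() *Map[K, V] {
	return &Map[K, V]{items: make(map[K]V)}
}

// LoadOrStore returns the value stored for key, or calls create and
// stores its result if the key is absent. The lock is held for the
// whole operation so that create runs at most once per key.
func (m *Map[K, V]) LoadOrStore(key K, create func() (V, error)) (V, error) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if v, ok := m.items[key]; ok {
		return v, nil
	}
	v, err := create()
	if err != nil {
		var zero V
		return zero, err
	}
	m.items[key] = v
	return v, nil
}

func main() {
	m := NewMap[string, *int]()
	created := 0 // only modified inside create, which runs under the lock
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = m.LoadOrStore("otlp", func() (*int, error) {
				created++
				return new(int), nil
			})
		}()
	}
	wg.Wait()
	fmt.Println("created:", created) // created: 1
}
```

Holding the lock while `create` runs keeps the sketch simple, but it serializes creation across all keys; a per-key scheme would be needed if creation were slow.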
2024-01-09 12:24:40 -08:00
Antoine Toulme c5a2c78d61
Move error out of `ReportComponentStatus` function signature, use `ReportStatus` instead (#9175)
Fixes #9148
2024-01-09 09:36:41 -08:00
Antoine Toulme 1c845787ba
[chore] polish sharedcomponent API (#9157)
Attempt to simplify the API exposed by the sharedcomponent package as
much as possible ahead of exposing it as part of a published
module.

Relates to #9156 

Changes:
* Remove the map field in the struct, make the struct a map
* Remove the `NewSharedComponents` function, just initialize the map
instead.
* Rename GetOrAdd to LoadOrStore
2023-12-21 13:18:11 -08:00
Matthew Wear 433f7aef92
Automate status reporting on start (#8836)
This is part of the continued component status reporting effort.
Currently we have automated status reporting for the following component
lifecycle events: `Starting`, `Stopping`, `Stopped` as well as
definitive errors that occur in the starting or stopping process (e.g.
as determined by an error return value). This leaves the responsibility
to the component to report runtime status after start and before stop.
We'd like to be able to extend the automatic status reporting to report
`StatusOK` if `Start` completes without an error. One complication with
this approach is that some components spawn async work (via goroutines)
that, depending on the Go scheduler, can report status before `Start`
returns. As such, we cannot assume a nil return value from `Start` means
the component has started properly. The solution is to detect whether the
component has already reported status when `Start` returns. If it has, we
will use the component-reported status and will not automatically report
status. If it hasn't, and `Start` returns without an error, we can
report `StatusOK`. Any subsequent reports from the component (async or
otherwise) will transition the component status accordingly.

The tl;dr is that we cannot control the execution of async code, that's
up to the Go scheduler, but we can handle the race, report the status
based on the execution, and not clobber status reported from within the
component during the startup process. That said, for components with
async starts, you may see a `StatusOK` before the component-reported
status, or just the component-reported status depending on the actual
execution of the code. In both cases, the end status will be the same.

The work in this PR will allow us to simplify #8684 and #8788 and
ultimately choose which direction we want to go for runtime status
reporting.

**Link to tracking Issue:** #7682

**Testing:** units / manual

---------

Co-authored-by: Alex Boten <aboten@lightstep.com>
2023-11-28 12:43:32 -08:00
Matthew Wear 53615832e6
Component Status Reporting (#8169)
This PR introduces component status reporting. There have been several
attempts to introduce this functionality previously, with the most
recent being: #6560.

This PR was originally based off of #6560, but has evolved based on the
feedback received and some additional enhancements to improve the ease
of use of the `ReportComponentStatus` API.

In earlier discussions (see
https://github.com/open-telemetry/opentelemetry-collector/pull/8169#issuecomment-1668367246)
we decided to model status as a finite state machine with the following
statuses: `Starting`, `OK`, `RecoverableError`, `PermanentError`,
`FatalError`, `Stopping`, and `Stopped`. A benefit of this design is
that `StatusWatcher`s will be notified on changes in status rather than
on potentially repetitive reports of the same status.

With the additional statuses and modeling them using a finite state
machine, there are more statuses to report. Rather than having each
component be responsible for reporting all of the statuses, I automated
status reporting where possible. A component's status will automatically
be set to `Starting` at startup. If the component's `Start` returns an
error, the status will automatically be set to `PermanentError`. A
component is expected to report `StatusOK` when it has successfully
started (if it has successfully started) and from there can report
changes in status as it runs. It will likely be a common scenario for
components to transition between `StatusOK` and `StatusRecoverableError`
during their lifetime. In extenuating circumstances they can transition
into terminal states of `PermanentError` and `FatalError` (where a fatal
error initiates collector shutdown). Additionally, during component
Shutdown, statuses are automatically reported where possible. A
component's status is set to `Stopping` when Shutdown is initially
called. If Shutdown returns an error, the status will be set to
`PermanentError`; if it does not return an error, the status is set to
`Stopped`.
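
A rough sketch of status modeled as a finite state machine that notifies only on changes. The status names come from the description above, but the transition table is an assumption inferred from it and may not match the table in the PR; `machine`, `allowed`, and `transition` are hypothetical names.

```go
package main

import "fmt"

type Status string

const (
	Starting         Status = "Starting"
	OK               Status = "OK"
	RecoverableError Status = "RecoverableError"
	PermanentError   Status = "PermanentError"
	FatalError       Status = "FatalError"
	Stopping         Status = "Stopping"
	Stopped          Status = "Stopped"
)

// allowed is an assumed transition table: PermanentError, FatalError,
// and Stopped are treated as terminal, and no status may move to itself.
var allowed = map[Status][]Status{
	Starting:         {OK, RecoverableError, PermanentError, FatalError, Stopping},
	OK:               {RecoverableError, PermanentError, FatalError, Stopping},
	RecoverableError: {OK, PermanentError, FatalError, Stopping},
	Stopping:         {RecoverableError, PermanentError, FatalError, Stopped},
}

// machine tracks the current status and calls onChange on each change.
type machine struct {
	current  Status
	onChange func(from, to Status)
}

// transition moves to the new status if the table allows it and reports
// whether a change happened. Repeated or invalid reports are ignored.
func (m *machine) transition(to Status) bool {
	for _, s := range allowed[m.current] {
		if s == to {
			from := m.current
			m.current = to
			if m.onChange != nil {
				m.onChange(from, to)
			}
			return true
		}
	}
	return false
}

func main() {
	m := &machine{
		current: Starting,
		onChange: func(from, to Status) {
			fmt.Printf("%s -> %s\n", from, to)
		},
	}
	// The repeated OK and the report after Stopped produce no notification.
	for _, s := range []Status{OK, OK, RecoverableError, OK, Stopping, Stopped, OK} {
		m.transition(s)
	}
}
```

Rejecting self-transitions is what gives watchers change notifications rather than a stream of repeated reports of the same status.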

In #6560 ReportComponentStatus was implemented on the `Host` interface.
I found that few components use the Host interface, and none of them
save a handle to it (to be used outside of the `start` method). I found
that many components keep a handle to the `TelemetrySettings` that they
are initialized with, and this seemed like a more natural, convenient
place for the `ReportComponentStatus` API. I'm ultimately flexible on
where this method resides, but feel that `TelemetrySettings` is a more
user-friendly place for it.

Regardless of where the `ReportComponentStatus` method resides (Host or
TelemetrySettings), there is a difference in the method signature for
the API based on whether it is used from the service or from a
component. As the service is not bound to a specific component, it needs
to take the `instanceID` of a component as a parameter, whereas the
component version of the method already knows the `instanceID`. In #6560
this led to having both `component.Host` and `servicehost.Host` versions
of the Host interface to be used at the component or service levels. In
this version, we have the same for TelemetrySettings. There is a
`component.TelemetrySettings` and a `servicetelemetry.Settings` with the
only difference being the method signature of `ReportComponentStatus`.

Lastly, this PR sets up the machinery for reporting component status, and
allows extensions to be `StatusWatcher`s, but it does not introduce any
`StatusWatcher`s. We expect the OpAMP extension to be a `StatusWatcher`
and use data from this system as part of its AgentHealth message (the
message is currently being extended to accommodate more component level
details). We also expect there to be a non-OpAMP `StatusWatcher`
implementation, likely via the HealthCheck extension (or something
similar).

**Link to tracking Issue:** #7682

cc: @tigrannajaryan @djaglowski @evan-bradley

---------

Co-authored-by: Tigran Najaryan <tnajaryan@splunk.com>
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
Co-authored-by: Daniel Jaglowski <jaglows3@gmail.com>
Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>
Co-authored-by: Tigran Najaryan <4194920+tigrannajaryan@users.noreply.github.com>
Co-authored-by: Alex Boten <aboten@lightstep.com>
2023-10-06 11:35:38 -07:00
Alex Boten 80d704deb4
[chore] use license shortform (#7694)
* [chore] use license shortform

To remain consistent w/ contrib repo, see https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/22052

Signed-off-by: Alex Boten <aboten@lightstep.com>

* make goporto

Signed-off-by: Alex Boten <aboten@lightstep.com>

---------

Signed-off-by: Alex Boten <aboten@lightstep.com>
2023-05-18 13:11:17 -07:00
Bogdan Drutu a4d8fc1bfa
Use generics for sharedcomponents, removes casting and increases type safety (#6772)
Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>
2023-01-24 11:55:38 -08:00
Bogdan Drutu def55617f5
Fix otlpreceiver transport metrics attribute (#6784)
Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>
2022-12-13 14:10:50 -08:00
José Carlos Chávez 9d3a8a4608
Adds vanity import check (#4180)
* chore: adds porto and fixes vanity imports.

* chore: fixes target overriding.

* chore: fixes install of porto.

* chore: includes porto as a tool.

* chore: upgrades porto to check internals.

* chore: rebase and update vanity import.

* chore: removes unnecessary space.

* chore: rolls back vanity import in generated files.
2021-10-12 13:47:36 -07:00
Bogdan Drutu a19d6ce268
Add an internal sharedcomponent to be shared by receivers with shared resources (#3198)
* Add an internal sharedcomponent to be shared by receivers with shared resources

Use the new code in OTLP receiver.

Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>

* Add comments to sharedcomponent package

Signed-off-by: Bogdan Drutu <bogdandrutu@gmail.com>
2021-05-17 16:45:06 -07:00