#### Description
On the GA roadmap, we defined what modules are in scope, but we did not
explicitly define how to determine the scope within these modules.
This PR adds some wording to define what the scope *within* modules is.
These are roughly the criteria I have personally been using.
#### Description
- Adds a new `expandedValue` struct that holds the original string
representation, if available, for values resolved from a provider.
- Removes any mention of `expandedValue` from the public API by adding a
`sanitize` step before returning from any `Get` or `ToStringMap` call.
- Adds a new decoding hook that checks whether the target field is of `string`
type and, if so, uses the string representation.
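The three changes above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the field names (`Value`, `Original`, `HasOriginal`) and the helper `decodeString` are assumptions made for the example, and the real decoding hook may be structured differently.

```go
package main

import "fmt"

// expandedValue is a simplified stand-in for the struct described above.
// It keeps the parsed value together with the original string, if any.
// All field names here are hypothetical.
type expandedValue struct {
	Value       any    // value after parsing, e.g. the int 7
	Original    string // text exactly as the provider returned it
	HasOriginal bool   // false when no string representation is available
}

// sanitize strips expandedValue wrappers recursively so that callers of the
// public API only ever see plain values.
func sanitize(v any) any {
	switch t := v.(type) {
	case expandedValue:
		return sanitize(t.Value)
	case map[string]any:
		out := make(map[string]any, len(t))
		for k, e := range t {
			out[k] = sanitize(e)
		}
		return out
	case []any:
		out := make([]any, len(t))
		for i, e := range t {
			out[i] = sanitize(e)
		}
		return out
	default:
		return v
	}
}

// decodeString mimics the decoding hook for a string target field:
// it prefers the original text over the parsed value.
func decodeString(v any) string {
	if ev, ok := v.(expandedValue); ok {
		if ev.HasOriginal {
			return ev.Original
		}
		v = ev.Value
	}
	return fmt.Sprint(v)
}

func main() {
	raw := map[string]any{
		"id": expandedValue{Value: 7, Original: "007", HasOriginal: true},
	}
	// A string field keeps the text as written, including leading zeros.
	fmt.Println(decodeString(raw["id"]))
	// The public API only sees the parsed value.
	fmt.Println(sanitize(raw).(map[string]any)["id"])
}
```

Keeping the wrapper internal and stripping it at the API boundary means existing callers of `Get` and `ToStringMap` do not need to know that the wrapper exists.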
#### Link to tracking issue
Fixes #10605, Fixes #10405, Fixes #10659
#### Testing
This changes the behavior in some cases; I updated the test cases accordingly.
#### Documentation
| ENV value | ${ENV} before unification | ${ENV} in v0.105.0 (also ${env:ENV} before unification) | Value after this PR |
|----------------------------|----------------------------|----------------------------|----------------------------|
| foo\nbar | foo\nbar | foo bar | foo\nbar |
| 1111:1111:1111:1111:1111:: | 1111:1111:1111:1111:1111:: | **Error** | 1111:1111:1111:1111:1111:: |
| "0123" | "0123" | 0123 | "0123" |
Improves the docs on how to set up networking in common environments now
that the collector binds to localhost by default.
#### Link to tracking issue
Fixes #10548
#### Testing
Tested the Kubernetes setup on a cluster
This PR adds documentation for the collector status reporting system. It
describes the current state of the system and has a section for best
practices that we intend to evolve as we develop them. The intended
audience is future users of the system and anyone interested in getting
a deeper look into how the system works without having to read all of
the code. This is intended to be complementary to the [in-progress
RFC](https://github.com/open-telemetry/opentelemetry-collector/pull/10413).
[Here is a
preview](61abf91b4f/docs/component-status.md)
with the diagrams properly rendered.
---------
Co-authored-by: Alex Boten <223565+codeboten@users.noreply.github.com>
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
When I performed this step yesterday, I found that it was necessary to
update the builder configs first, otherwise `make update-otel` would
fail. This PR just updates the order of operations to match what worked
for me.
Step 8 in the core release process references a draft PR opened in Step
1. However, Step 1 was changed somewhat recently to state that this PR
should be merged before proceeding. Therefore, a new PR should be
opened. Since this new PR is for contrib and no longer directly
associated with a step in the core process, it makes sense to move it to
Step 1 of the contrib section.
#### Description
In normal GitHub issues and similar places, writing `@user` creates a link
automatically, so it feels odd to see `@user` without one. Feel free to
reject this if the cure is worse than the disease.
#### Documentation
Creates an internal architecture file. In it is a diagram of the startup
flow of the collector as well as links to key files / packages. I also
added package level comments to some key packages.
I wrote some other documentation in
https://github.com/open-telemetry/opentelemetry-collector/pull/10029 but
split the PRs up.
---------
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
I have been unable to provide this position the bandwidth that it
deserves and it is time to formalize recognition of that fact.
Signed-off-by: Anthony J Mirabella <a9@aneurysm9.com>
Co-authored-by: Alex Boten <223565+codeboten@users.noreply.github.com>
**Description:** Adds an RFC about how environment variable resolution
should work
**Link to tracking Issue:** Fixes #9515, relates to:
- #8215
- #8565
- #9162
- #9531
- #9532
---------
Co-authored-by: Alex Boten <223565+codeboten@users.noreply.github.com>
Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>
This document now contains the current focus of the maintainers of the
collector project.
---------
Signed-off-by: Alex Boten <223565+codeboten@users.noreply.github.com>
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
This is a documentation change reflecting the progress we have made in
supporting Linux ARM64 machines.
We now run both core and contrib builds on Ampere machines, supported by
the CNCF, through Actuated GitHub Actions runners.
This PR fixes #9731
**Description:** I searched both the core and contrib Collector
repositories and found that the images are only used in this file, so I
think it's safe to remove them as well.
**Link to tracking Issue:** Fixes #8889
This is in preparation for the next PR, which will introduce the new
proposal for achieving a v1 release of the Collector. The idea is to
collect feedback on the proposal without having to deal with
conflicts/changes in the old, outdated document.
Related to #9718
Signed-off-by: Alex Boten <223565+codeboten@users.noreply.github.com>
**Description:** Adds a warning and risk alert about using `localhost`,
which may go through DNS resolution and resolve to an unexpected IP
address, creating a security risk.
**Link to tracking Issue:** #9338
**Documentation:** Added a warning and risk alert in
https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md
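To illustrate the risk described above, here is a small sketch of how one might check whether an endpoint uses a literal loopback IP, which never goes through DNS, rather than the name `localhost`. The helper `isLiteralLoopback` is hypothetical and is not part of the collector; it only uses the standard `net` package.

```go
package main

import (
	"fmt"
	"net"
)

// isLiteralLoopback reports whether endpoint ("host:port") uses a literal
// loopback IP address. A hostname such as "localhost" returns false because
// it still has to be resolved before it can be used.
func isLiteralLoopback(endpoint string) bool {
	host, _, err := net.SplitHostPort(endpoint)
	if err != nil {
		return false
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

func main() {
	endpoints := []string{"127.0.0.1:4317", "[::1]:4317", "localhost:4317", "0.0.0.0:4317"}
	for _, ep := range endpoints {
		fmt.Printf("%-16s literal loopback: %t\n", ep, isLiteralLoopback(ep))
	}
}
```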
---------
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
**Description:**
Updates release schedule. The March 18th release overlaps with KubeCon
EU, so I am shifting everything by one week starting with that release.
This means the next release cycle will have three weeks.
**Description:**
Updates mentions of the `spanmetrics` processor to reference the
`spanmetrics` connector instead, where applicable.
**Link to tracking Issue:** Relates to
open-telemetry/opentelemetry-collector-contrib#29567
**Description:** Add @IBM-Currency-Helper, @adilhusain-s and @seth-priya
as owners for the `linux/ppc64le` platform.
**Link to tracking Issue:** Fixes #8528
Co-authored-by: Alex Boten <aboten@lightstep.com>
**Description:**
- Define `component.UseLocalHostAsDefaultHost` in the
`internal/localhostgate` package.
- Define `featuregate.ErrIsAlreadyRegistered` error, returned by
`Register` when a gate is already registered.
- Adds support for the localhost gate on the OTLP receiver.
This PR does not remove the current warning in any way; we can remove
it separately.
**Link to tracking Issue:** Updates #8510
**Testing:** Adds unit tests
**Documentation:** Document on OTLP receiver template and add related
logging.
Signed-off-by: Alex Boten <aboten@lightstep.com>
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
Co-authored-by: Curtis Robert <crobert@splunk.com>
**Description:**
Fix run-as-a-service detection on Windows. Instead of trying to detect
whether the process is a service, for which both `svc.IsAnInteractiveSession()`
and `svc.IsWindowsService()` are somehow broken, try to run as a Windows
service and, if that fails, fall back to running as an interactive process. This
follows a recommendation of the Windows [service API
documentation](https://learn.microsoft.com/en-us/windows/win32/api/winsvc/nf-winsvc-startservicectrldispatchera#return-value).
The new code calls `svc.Run` and in case of error checks for
`windows.ERROR_FAILED_SERVICE_CONTROLLER_CONNECT`. If this is the error
the application can safely assume that it is not running as a service.
The duration of a call to `svc.Run` failing with this error was below 3
microseconds in the current GH runner and below 5 microseconds on my
box. While this value seems fine for startup I'm keeping the
`NO_WINDOWS_SERVICE` option instead of deprecating it (it doesn't seem
worth the trouble of deprecating it).
**Link to tracking Issue:**
Fix #7350
**Testing:**
Fix tested on the Splunk fork that deploys the collector as a service
and as an interactive process on Windows containers.
**Documentation:**
Added changelog.
Clarifies our stance regarding RC releases, inspired by
https://github.com/open-telemetry/opentelemetry-collector/pull/8935#discussion_r1397887303.
In a nutshell:
- Stabilization criteria have to be met for at least two minor version
releases before moving to the stable module.
- We treat these two (or more) minor releases as release candidates and
do not release under the `-rc` release family.
- After these releases, we move directly to the `1.x` release family.
**Link to tracking Issue:** Fixes #8063 (together with the upcoming 1.0
release for pdata and featuregate)
cc @braydonk
---------
Co-authored-by: Anthony Mirabella <a9@aneurysm9.com>
This adds a new release template for stabilizing a module.
The intent is to try this out for pdata and featuregate, and iterate on
the template over time.
The roadmap.md was useful in the old times, but it has outlived its
usefulness and we should remove it. We have achieved the goals set in it
and we have not kept it updated so far, resorting instead to github
issues, milestones and the like.
Note that the ga-roadmap.md document remains, since we have not yet
achieved the goals set in it. We can discuss removing/updating it, but I
think this is best done in a separate PR.
This documents how users can enable OTLP export for internal collector
telemetry. NOTE: This feature is still behind feature gates and is
subject to change.
Signed-off-by: Alex Boten <aboten@lightstep.com>
**Description**:
Enforce order of start and shutdown of extensions according to their
internally declared dependencies
**Link to tracking Issue**:
Resolves #8732
**Motivation**:
This is an alternative approach to #8733 which uses declaration order in
the config to start extensions. That approach (a) enforces order when
it's not always necessary to enforce, and (b) exposes unnecessary
complexity to the user by making them responsible for the order.
This PR instead derives the desired order of extensions based on the
dependencies they declare by implementing a `DependentExtension`
interface. That means that extensions that must depend on others can
expose this interface and be guaranteed to start after their
dependencies, while other extensions can be started in arbitrary order
(same as happens today because of iterating over a map).
The extensions that have dependencies have two options to expose them:
1. if the dependency is always static (e.g. `jaeger_query` extension
depending on `jaeger_storage` as in the OP), the extension can express
this statically as well, by returning a predefined ID of the dependent
extension
2. in cases where dependencies are dynamic, the extension can read the
names of the dependencies from its configuration.
The 2nd scenario is illustrated by the following configuration. Here
each complex extension knows that it needs dependencies that implement the
`storage` and `encoding` interfaces (both existing APIs in collector &
contrib), but does not know statically which instances to use; the
actual names are supplied by the user in the configuration.
```yaml
extensions:
  complex_extension_1:
    storage: filestorage
    encoding: otlpencoding
  complex_extension_2:
    storage: dbstorage
    encoding: jsonencoding
  filestorage:
    ...
  dbstorage:
    ...
  otlpencoding:
  jsonencoding:
```
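For the configuration above, deriving a start order from declared dependencies could look roughly like the sketch below. It uses Kahn's algorithm for topological sorting over plain string IDs; the function `startOrder` and the map-based input are assumptions for illustration, not the graph code or the `DependentExtension` interface from this PR.

```go
package main

import (
	"fmt"
	"sort"
)

// startOrder returns extension IDs ordered so that every extension appears
// after all of its dependencies. deps maps an extension ID to the IDs it
// depends on. Ties are broken alphabetically to keep the result stable.
func startOrder(deps map[string][]string) ([]string, error) {
	indegree := make(map[string]int, len(deps))
	dependents := make(map[string][]string)
	for id, ds := range deps {
		indegree[id] = len(ds)
		for _, d := range ds {
			if _, ok := deps[d]; !ok {
				return nil, fmt.Errorf("extension %q depends on unknown extension %q", id, d)
			}
			dependents[d] = append(dependents[d], id)
		}
	}

	// Start with the extensions that have no dependencies.
	var ready []string
	for id, n := range indegree {
		if n == 0 {
			ready = append(ready, id)
		}
	}
	sort.Strings(ready)

	order := make([]string, 0, len(deps))
	for len(ready) > 0 {
		id := ready[0]
		ready = ready[1:]
		order = append(order, id)
		// Starting id may unblock the extensions that depend on it.
		for _, dep := range dependents[id] {
			indegree[dep]--
			if indegree[dep] == 0 {
				ready = append(ready, dep)
			}
		}
		sort.Strings(ready)
	}
	if len(order) != len(deps) {
		return nil, fmt.Errorf("dependency cycle among extensions")
	}
	return order, nil
}

func main() {
	order, err := startOrder(map[string][]string{
		"complex_extension_1": {"filestorage", "otlpencoding"},
		"complex_extension_2": {"dbstorage", "jsonencoding"},
		"filestorage":         nil,
		"dbstorage":           nil,
		"otlpencoding":        nil,
		"jsonencoding":        nil,
	})
	fmt.Println(order, err)
}
```

Shutdown would presumably walk the same list in reverse, although that detail is an assumption here rather than something stated above.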
**Changes**:
* Introduce `DependentExtension` optional interface
* Change `Extensions` constructor to derive the required order using a
directed graph (similar to pipelines)
* Inherited from #8733 - use new ordered list of IDs to
start/stop/notify extensions in the desired order (previously a map was
used to iterate over, which resulted in random order).
* Tests
**Testing**:
Unit tests
---------
Signed-off-by: Yuri Shkuro <github@ysh.us>
Co-authored-by: Antoine Toulme <antoine@lunar-ocean.com>
Adjusts the release schedule to account for KubeCon. I also adjusted the
following dates, but let me know if you'd rather go right from a
2023-11-13 release to a 2023-11-20 release (1 week between releases).
**Description:**
Remove bits mentioning configuration on versions v0.42.0 or below and
versions v0.35.0 and below.
These versions are more than 2 years old, and users can always refer to
the docs as they were when their version was released.
Since we now have support tiers defined, we can clarify what platforms
allow for a bugfix release
---------
Co-authored-by: Alex Boten <aboten@lightstep.com>
This PR adds documentation to introduce a tiered
platform support model for the OpenTelemetry Collector. The tiered
platform support model provides clarity to the project and its users
about how existing platforms are supported today and how requests for
new platforms can be supported in future, while balancing between the
aim to support as many platforms as possible and to guarantee stability
for the most important platforms.
**Link to tracking Issue:** #8209
---------
Co-authored-by: Aunsh Chaudhari <aunsh04@yahoo.in>
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
Co-authored-by: Alex Boten <aboten@lightstep.com>
The following changes are applied:
1. Move the contrib draft PR creation to the first step. This must be
done before the core release because some changes may be
needed in the core codebase to resolve possible core/contrib integration
issues, as was the case with the 0.84.0 release.
2. Add a missing `v` prefix for the `OTEL_RC_VERSION` env var in step 8.
3. Rotate the schedule.
4. Update the 0.55.0 version in the examples to 0.85.0. It was inconsistent
with using the latest version for the stable module set.
5. Mention CHANGELOG-API in step 7.
Based on my experience releasing v0.83.0, I believe the following
clarifications would help reduce potential errors in future releases.
---------
Co-authored-by: Anthony Mirabella <a9@aneurysm9.com>
Updated the Grafana dashboard link. The old dashboard had not been updated
for 3+ years. I linked "my" dashboard, which is also published in the
[repo](https://github.com/monitoringartist/opentelemetry-collector-monitoring),
so users can contribute and improve it easily.
The dashboard is laid out horizontally by receivers/processors/exporters
and vertically by traces/metrics/logs, plus process metrics. It aims to be
generic, with nothing specific to a distro or component.
- v0.82.0 release was delayed but is underway today
- Since we've nearly reached the original release date for v0.83.0, push
future release schedule by 2 weeks
- @codeboten was scheduled for 2 consecutive releases. Rather than shift
dates for all release managers, only shift those up until @codeboten's
first scheduled release. This eliminates the double release and changes
scheduled dates only for myself and @dmitryax.
Adds a note about when we will typically make a release for security-related issues. This introduces an informal 30-day SLA (the tightest TTR SLA for GitLab) for security-related issues. Since we release every ~14 days, this means we will, in most cases, not make bugfix releases for security-related issues. Critical vulnerabilities are added as an exception.
This relates to the discussion around CVE-2023-24534 and CVE-2023-24536.
Added a nudge for the configopaque package in the security recommendations document and also in the contributing doc under "when adding a new component" section.
Fixes #6854
---------
Co-authored-by: Alex Boten <alex@boten.ca>
Updated the release schedule and removed the bit about adding a link to the top of the release notes for the contrib release as this is done automatically now.
Signed-off-by: Alex Boten <aboten@lightstep.com>
@bogdandrutu if you're ok with it, I'd like to do the next release to ensure the changes to the release process I've made work.
Signed-off-by: Alex Boten <aboten@lightstep.com>
* add github action to automate the release process
This PR adds a github action that can be triggered manually to do the following:
- creates tracking issue
- checks for blockers in core
- checks for blockers in contrib
- checks status of build-and-test in core
- checks status of build-and-test in contrib
- runs chlog update
- runs make-prepare for both stable/beta
- creates a PR
Signed-off-by: Alex Boten <aboten@lightstep.com>
* add details to docs
Signed-off-by: Alex Boten <aboten@lightstep.com>
* Apply suggestions from code review
Co-authored-by: Daniel Jaglowski <jaglows3@gmail.com>
* Apply suggestions from code review
Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de>
* update repo, remove chlog-install
* remove unnecessary comments from the action
* move bash commands into script files
Signed-off-by: Alex Boten <aboten@lightstep.com>
Co-authored-by: Daniel Jaglowski <jaglows3@gmail.com>
Co-authored-by: Juraci Paixão Kröhling <juraci.github@kroehling.de>
This removes the need to add the location of the releases repo at the top of the changelog.
Signed-off-by: Alex Boten <aboten@lightstep.com>
Removes the first two points from the bugfix release criteria.
I think the remaining points give a more accurate picture of the decision-making process we have followed so far (e.g. for #6420, where the first two points were not fulfilled).
We can revisit this in the future if there are disagreements on when to do a bugfix release.
I plan to focus on other OpenTelemetry areas, such as OpAMP and Logs and no longer have time to be a Collector maintainer.
It has been a pleasure working with all @open-telemetry/collector-approvers @open-telemetry/collector-contrib-approvers @open-telemetry/collector-maintainers @open-telemetry/collector-contrib-maintainer @open-telemetry/collector-contrib-triagers @open-telemetry/collector-triagers :-)
* [pipelines] Change test to not reuse same processor twice in one pipeline
* Add note to documentation about reuse of processors within a pipeline
* can -> MUST
Updating the existing long-term roadmap to reflect the current status of its items. Note that this roadmap could probably use a refresh; I will add a discussion topic to the next SIG meeting agenda.
* [docs] update design.md doc
Started looking at various documents under the `docs` directory that are somewhat outdated. This PR updates the design document.
* Apply suggestions from code review
Co-authored-by: Anthony Mirabella <a9@aneurysm9.com>
* Minor update to tracking issue docs
* add note about invalid merge error
* Add another issue I ran into
* Undo order swap and add a 'plz commit' instead
Reordering config instructions so that newer versions are put first and older versions later.
With time, the users will be less and less interested in configuration for older versions.
Co-authored-by: Juraci Paixão Kröhling <juraci@kroehling.de>
* add prepare-release make target
make prepare-release does the following:
- checks the local repository is clean
- search/replace PREVIOUS_VERSION with RELEASE_CANDIDATE
- runs make genotelcorecol
- creates a commit on a new branch
- runs make multimod-prerelease
- commits the changes to the branch
- creates a PR
* add check for gh