<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
Code ownership and maintenance of components continue to be an issue,
with varying levels of support across contrib. As we approach 1.0 and
the ability to mark components as stable, we want to make sure that
components that we deem as 'stable' have a healthy community around
them. We have three datapoints that we can leverage here: how many
codeowners a component has, how diverse they are in terms of employers,
and how actively the codeowners have been responding to issues/PRs in
the recent past.
We need criteria that
1. Are reasonable predictors of the component health over the
short/medium term
2. Are not too onerous on the code owners
Some notes:
1. Some beta components do not meet the criteria listed on the PR. This
will be the case even after the transition for some components. This PR
makes no claim as to what should happen to these components' stability
(so, de facto, they will stay as is).
2. The OTLP receiver and exporters do not meet these criteria today
because they don't have listed code owners. We can solve this either by
carving out an exception or by listing code owners.
3. We need automation and templates to enforce this.
<!-- Issue number if applicable -->
#### Link to tracking issue
Fixes #11850
---------
Co-authored-by: Christos Markou <chrismarkou92@gmail.com>
A few weeks ago, I mentioned to the Collector leads about my intention
to resign as maintainer/approver. My current focus on building
OllyGarden isn't leaving much room to be an approver or maintainer.
The plan right now is to ramp up again as approver/maintainer in the
future once time allows.
Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
Added cspell to check spelling in `.md` and `.yaml` files.
<!-- Issue number if applicable -->
#### Link to tracking issue
Fixes #9287
<!--Describe what testing was performed and which tests were added.-->
#### Testing
<!--Describe the documentation added.-->
#### Documentation
<!--Please delete paragraphs that you did not use before submitting.-->
---------
Signed-off-by: Yuri Oliveira <yurimsa@gmail.com>
There was no mention of disabling the merge queue, which is needed if we
want to merge a commit directly (instead of squashing it).
Signed-off-by: Alex Boten <223565+codeboten@users.noreply.github.com>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
This PR changes the release workflow to autofill the release notes from
`CHANGELOG.md` and `CHANGELOG-API.md` into the generated GH release.
It makes use of `awk` and `sed` to build the release notes step by step
from the changelog files.
The [default chloggen
template](c43cb0331c/chloggen/internal/chlog/summary.tmpl)
was added and a `<!--preview-version-->` tag was added to easily filter
out the changelog of just the latest version.
<!-- Issue number if applicable -->
#### Link to tracking issue
Fixes #10191
<!--Describe what testing was performed and which tests were added.-->
#### Testing
Tested on my fork.
Release with autofilled changelog:
https://github.com/mowies/opentelemetry-collector/releases/tag/v0.121.0
Workflow that did it:
https://github.com/mowies/opentelemetry-collector/actions/runs/13899615357/job/38888008499
<!--Describe the documentation added.-->
#### Documentation
The release checklist was updated accordingly.
<!--Please delete paragraphs that you did not use before submitting.-->
---------
Signed-off-by: Moritz Wiesinger <moritz.wiesinger@dynatrace.com>
#### Description
Once
[contrib#38534](https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/38534)
is merged, the manual changes that were necessary in step 1 of releasing
contrib should now be included in step 2 (the Prepare Release CI
workflow). This PR updates the release doc to remove step 1.
#### Link to tracking issue
Updates #12294
#### Description
A very minor whitespace issue was preventing the list from formatting
correctly on one .md doc page. This fixes that _very minor_ issue.
Co-authored-by: Alex Boten <223565+codeboten@users.noreply.github.com>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
Simplifies description of automated release steps. While there is some
value in having the description of the automated steps somewhere, I
think this runs the risk of getting outdated and us having to look at
the code directly, so I would rather just remove it from here and
improve the comments/code of the automation over time. See
open-telemetry/opentelemetry-collector-releases/pull/856 for one
improvement of this kind.
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
Updates #12533
#### Description
This PR:
- requires "level: normal" before outputting batch processor metrics (in
addition to one specific metric which was already restricted to "level:
detailed")
- clarifies wording in the telemetry level guidelines and documentation,
and adds said guidelines to the requirements for stable components.
Some rationale for these changes can be found in the tracking issue and
[this
comment](https://github.com/open-telemetry/opentelemetry-collector/issues/7890#issuecomment-2684652956).
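The gating described above amounts to a simple level comparison; a minimal sketch, using an illustrative `Level` type and helper rather than the Collector's actual `configtelemetry` API:

```go
package main

import "fmt"

// Level mirrors the Collector's telemetry verbosity levels; the type
// and the gating helper below are illustrative only.
type Level int

const (
	LevelNone Level = iota
	LevelBasic
	LevelNormal
	LevelDetailed
)

// recordIf reports whether a metric gated at minLevel should be
// emitted under the configured level. Under this change, most batch
// processor metrics would be gated at LevelNormal.
func recordIf(configured, minLevel Level) bool {
	return configured >= minLevel
}

func main() {
	fmt.Println(recordIf(LevelBasic, LevelNormal))    // false: batch metrics suppressed at "basic"
	fmt.Println(recordIf(LevelNormal, LevelNormal))   // true
	fmt.Println(recordIf(LevelNormal, LevelDetailed)) // false: the "detailed"-only metric stays off
}
```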
#### Link to tracking issue
Resolves #7890
#### To be discussed
Should we add a feature gate for this, in case a user relies on "level:
basic" outputting batch processor metrics? This feels like a niche use
case, so considering the "alpha" stability level of these metrics, I
don't think it's really necessary.
Considering batch processor metrics had already been switched to
"normal" once (#9767), but were turned back to basic at some later point
(not sure when), we might also want to add tests to avoid further
regressions (especially as the handling of telemetry levels is bound to
change further with #11754).
---------
Co-authored-by: Dmitrii Anoshin <anoshindx@gmail.com>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
Add guidelines for naming Go modules
Note the 3-week window between .128 and .129, as we'll likely have OTel
Community Day on the week of June 23. An alternative to that is to have
the release on June 23 and assign to someone who knows already that they
won't be there anyway.
Signed-off-by: Juraci Paixão Kröhling <juraci@kroehling.de>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
<!-- Issue number if applicable -->
Reworks breaking changes section to include information about our
approach to feature gates.
---------
Co-authored-by: Evan Bradley <11745660+evan-bradley@users.noreply.github.com>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
Fixes the `github_project` field in `metadata.yaml`, adds a
`go:generate` instruction to confmap, and adds a banner to the README.
### This PR
- adds the githubgen tool as a dependency in internal/tools
- uses githubgen to generate codeowners and issue template files
- updates lots of metadata files by
- taking the existing codeowners file and feeding the info from there
back into the component metadata.yaml files or creating new
metadata.yaml files where none existed yet
- adds distributions.yaml as a basis for the mostly already existing
`distributions:` keys in metadata.yaml files (needed for githubgen to
work correctly)
- adds relevant make commands to make the githubgen tool usage mostly
transparent to users
This change is a prerequisite to be able to ping codeowners reliably
with automated tooling as a next step.
Part of #11562
---------
Signed-off-by: Moritz Wiesinger <moritz.wiesinger@dynatrace.com>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
<!-- Issue number if applicable -->
Use h2 (hN-1) titles for h2 (hN-1) sections instead of h3 (hN)
### Context
The [Pipeline Component Telemetry
RFC](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/rfcs/component-universal-telemetry.md)
was recently accepted (#11406). The document states the following
regarding error monitoring:
> For both [consumed and produced] metrics, an `outcome` attribute with
possible values `success` and `failure` should be automatically
recorded, corresponding to whether or not the corresponding function
call returned an error. Specifically, consumed measurements will be
recorded with `outcome` as `failure` when a call from the previous
component to the `ConsumeX` function returns an error, and `success`
otherwise. Likewise, produced measurements will be recorded with
`outcome` as `failure` when a call to the next consumer's `ConsumeX`
function returns an error, and `success` otherwise.
[Observability requirements for stable pipeline
components](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/component-stability.md#observability-requirements)
were also recently merged (#11772). The document states the following
regarding error monitoring:
> The goal is to be able to easily pinpoint the source of data loss in
the Collector pipeline, so this should either:
> - only include errors internal to the component, or;
> - allow distinguishing said errors from ones originating in an
external service, or propagated from downstream Collector components.
Because errors are typically propagated across `ConsumeX` calls in a
pipeline (except for components with an internal queue like
`processor/batch`), the error observability mechanism proposed by the
RFC implies that Pipeline Telemetry will record failures for every
component interface upstream of the component that actually emitted the
error, which does not match the goals set out in the observability
requirements, and makes it much harder to tell from the emitted
telemetry which component the errors are coming from.
### Description
This PR amends the Pipeline Component Telemetry RFC with the following:
- restrict the `outcome=failure` value to cases where the error comes
from the very next component (the component on which `ConsumeX` was
called);
- add a third possible value for the `outcome` attribute: `rejected`,
for cases where an error observed at an interface comes from further
downstream (the component did not "fail", but its output was
"rejected");
- propose a mechanism to determine which of the two values should be
used.
The current proposal for the mechanism is for the pipeline
instrumentation layer to wrap errors in an unexported `downstream`
struct, which upstream layers could check for with `errors.As` to check
whether the error has already been "attributed" to a component. This is
the same mechanism currently used for tracking permanent vs. retryable
errors. Please check the diff for details.
### Possible alternatives
There are a few alternatives to this amendment, which were discussed as
part of the observability requirements PR:
- loosen the observability requirements for stable components to not
require distinguishing internal errors from downstream ones → makes it
harder to identify the source of an error;
- modify the way we use the `Consumer` API to no longer propagate errors
upstream → prevents proper propagation of backpressure through the
pipeline (although this is likely already a problem with the `batch`
processor);
- let component authors make their own custom telemetry to solve the
problem → higher barrier to entry, especially for people wanting to
open-source existing components.
---------
Co-authored-by: Pablo Baeyens <pablo.baeyens@datadoghq.com>
<!--Describe the documentation added.-->
#### Documentation
This is a documentation-only change to fix some typos in the Pipeline
Component Telemetry RFC doc.
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
<!-- Issue number if applicable -->
Proposal for creating a new `collector-release-approvers` group.
Announced at:
- #otel-collector-dev on 2024-10-30:
https://cloud-native.slack.com/archives/C07CCCMRXBK/p1730307025302339
- Collector SIG on 2024-11-05 (TBD)
The stakeholders for this PR are:
- @open-telemetry/collector-approvers
- @open-telemetry/collector-contrib-approvers
---------
Co-authored-by: Andrzej Stencel <andrzej.stencel@elastic.co>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
<!-- Issue number if applicable -->
Adds requirements for documentation for different stability levels.
I expect many of these will be done through automation over time :)
#### Link to tracking issue
Fixes #11852
#### Description
This PR slightly changes the wording of the "Stability levels and
versioning" doc (`docs/component-stability.md`), which I found a bit
confusing, in order to:
- Emphasize the important fact that stability levels for a component are
defined _per signal_. At the moment this is only alluded to at the
beginning and assumed in the last section. Moreover, things like the
"Unmaintained" level may give the impression that stability levels
always apply to an entire component.
- More cleanly separate the part about behavior changes from the part
about API changes in the "Versioning" section.
This should not change the content or interpretation of the document.
This was done in the [specification
repo](6c626defb7),
and allows us to use a GitHub action instead of installing npm packages as
part of the build process (which kept bringing security warnings back)
Signed-off-by: Alex Boten <223565+codeboten@users.noreply.github.com>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
updates documentation to include changes in
https://github.com/open-telemetry/opentelemetry-collector-releases/pull/684
<!--Describe what testing was performed and which tests were added.-->
#### Testing
run locally and via workflows in jackgopack4 fork
<!--Describe the documentation added.-->
#### Documentation
updates to release.md in docs folder
<!--Please delete paragraphs that you did not use before submitting.-->
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
- Documents when to split into separate modules, including general rules
as well as specific conventions we are currently using
- Rephrases the wording on #11836 to add it into a general list.
- Documents how to split into separate modules.
<!-- Issue number if applicable -->
#### Link to tracking issue
Follows #11836, Fixes #11436, Fixes #11623
---------
Co-authored-by: Jade Guiton <jade.guiton@datadoghq.com>
#### Description
If more PRs are merged after the release PR commit, make sure to
checkout the release branch to the release PR commit rather than the
mainline head.
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
Split off from #11864, describes how the graduation would work without
any additional criteria.
Rendered diagram:
```mermaid
stateDiagram-v2
state Maintained {
InDevelopment --> Alpha
Alpha --> Beta
Beta --> Stable
}
InDevelopment: In Development
Maintained --> Unmaintained
Unmaintained --> Maintained
Maintained --> Deprecated
Deprecated --> Maintained: (should be rare)
```
---------
Co-authored-by: Christos Markou <chrismarkou92@gmail.com>
#### Description
In #10058 I mentioned:
> There is a tangentially related issue with PermanentErrors and the
underlying finite state machine that governs transitions between
statuses. Currently, a PermanentError is a final state. That is, once a
component enters this state, no further transitions are allowed. In
light of the work I did on the alternative health check extension, I
believe we should allow a transition from PermanentError to Stopping to
consistently prioritize lifecycle events for components. This transition
also makes sense from a practical perspective. A component in a
PermanentError state is one that has been started and is running,
although in a likely degraded state. The collector will call shutdown on
the component (when the collector is shutting down) and we should allow
the status to reflect that.
This PR makes the suggested change and updates the documentation to
reflect that. As this is an internal change, I have not included a
changelog. Also note, we can close #10058 after this as we've already
removed status aggregation from core during the recent component status
refactor.
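The relaxed state machine can be sketched as a transition table; the statuses shown are an illustrative subset, not the actual `componentstatus` API:

```go
package main

import "fmt"

// Status is an illustrative subset of component statuses.
type Status int

const (
	StatusOK Status = iota
	StatusPermanentError
	StatusStopping
	StatusStopped
)

// transitions encodes the allowed moves. The change described above
// adds PermanentError -> Stopping, so a shutdown call is reflected in
// the status even for components that hit a permanent error while
// running. Stopped remains final.
var transitions = map[Status][]Status{
	StatusOK:             {StatusPermanentError, StatusStopping},
	StatusPermanentError: {StatusStopping}, // previously a final state
	StatusStopping:       {StatusStopped},
}

func canTransition(from, to Status) bool {
	for _, next := range transitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition(StatusPermanentError, StatusStopping)) // true with this change
	fmt.Println(canTransition(StatusStopped, StatusOK))              // false: Stopped is final
}
```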
<!-- Issue number if applicable -->
#### Link to tracking issue
Fixes #10058
<!--Describe what testing was performed and which tests were added.-->
#### Testing
Unit tests.
<!--Describe the documentation added.-->
#### Documentation
Updated docs/component-status.md and associated diagram.
<!--Please delete paragraphs that you did not use before submitting.-->
Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>
Co-authored-by: Antoine Toulme <atoulme@splunk.com>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
Documents relationship between component stability and versioning.
<!-- Issue number if applicable -->
#### Link to tracking issue
Fixes #11851
## Description
This PR defines observability requirements for components at the
"Stable" stability levels. The goal is to ensure that Collector
pipelines are properly observable, to help in debugging configuration
issues.
#### Approach
- The requirements are deliberately not too specific, in order to be
adaptable to each specific component, and so as to not over-burden
component authors.
- After discussing it with @mx-psi, this list of requirements explicitly
includes things that may end up being emitted automatically as part of
the Pipeline Instrumentation RFC (#11406), with only a note at the
beginning explaining that not everything may need to be implemented
manually.
Feel free to share if you don't think this is the right approach for
these requirements.
#### Link to tracking issue
Resolves #11581
## Important note regarding the Pipeline Instrumentation RFC
I included this paragraph in the part about error count metrics:
> The goal is to be able to easily pinpoint the source of data loss in
the Collector pipeline, so this should either:
> - only include errors internal to the component, or;
> - allow distinguishing said errors from ones originating in an
external service, or propagated from downstream Collector components.
The [Pipeline Instrumentation
RFC](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/rfcs/component-universal-telemetry.md)
(hereafter abbreviated "PI"), once implemented, should allow monitoring
component errors via the `outcome` attribute, which is either `success`
or `failure`, depending on whether the `Consumer` API call returned an
error.
Note that this does not work for receivers, or allow differentiating
between different types of errors; for that reason, I believe additional
component-specific error metrics will often still be required, but it
would be nice to cover as many cases as possible automatically.
However, at the moment, errors are (usually) propagated upstream through
the chain of `Consume` calls, so in case of error the `failure` state
will end up applied to all components upstream of the actual source of
the error. This means the PI metrics do not fit the first bullet point.
Moreover, I would argue that even post-processing the PI metrics does
not reliably allow distinguishing the ultimate source of errors (the
second bullet point). One simple idea is to compute
`consumed.items{outcome:failure} - produced.items{outcome:failure}` to
get the number of errors originating in a component. But this only works
if output items map one-to-one to input items: if a processor or
connector outputs fewer items than it consumes (because it aggregates
them, or translates to a different signal type), this formula will
return false positives. If these false positives are mixed with real
errors from the component and/or from downstream, the situation becomes
impossible to analyze by just looking at the metrics.
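A small sketch with made-up numbers illustrating the false positive: a hypothetical connector aggregates 10 input items into 1 output item, all failures are propagated from downstream, yet the subtraction formula infers internal errors that do not exist:

```go
package main

import "fmt"

// inferredInternal applies the naive post-processing formula
// consumed.items{outcome:failure} - produced.items{outcome:failure}.
func inferredInternal(consumedFailure, producedFailure int) int {
	return consumedFailure - producedFailure
}

func main() {
	// 5 batches fail downstream. Each batch had 10 input items but
	// only 1 (aggregated) output item. The component itself emitted
	// zero errors.
	consumedFailure := 50 // 5 failed batches x 10 input items
	producedFailure := 5  // the same 5 failed batches, 1 output item each

	fmt.Println(inferredInternal(consumedFailure, producedFailure)) // 45 "internal errors" inferred; actual number is 0
}
```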
For these reasons, I believe we should do one of four things:
1. Change the way we use the `Consumer` API to no longer propagate
errors, making the PI metric outcomes more precise.
We could catch errors in whatever wrapper we already use to emit the PI
metrics, log them for posterity, and simply not propagate them.
Note that some components already more or less do this, such as the
`batchprocessor`, but this option may in principle break components
which rely on downstream errors (for retry purposes for example).
2. Keep propagating errors, but modify or extend the RFC to require
distinguishing between internal and propagated errors (maybe add a third
`outcome` value, or add another attribute).
This could be implemented by somehow propagating additional state from
one `Consume` call to another, allowing us to establish the first
appearance of a given error value in the pipeline.
3. Loosen this requirement so that the PI metrics suffice in their
current state.
4. Leave everything as-is and make component authors implement their own
somewhat redundant error count metrics.
---------
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
Co-authored-by: Pablo Baeyens <pablo.baeyens@datadoghq.com>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
Provide a caption for a link in the README.
<!-- Issue number if applicable -->
#### Link to tracking issue
Fixes #
<!--Describe what testing was performed and which tests were added.-->
#### Testing
<!--Describe the documentation added.-->
#### Documentation
<!--Please delete paragraphs that you did not use before submitting.-->
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
<!-- Issue number if applicable -->
Adds post-release steps including release retro and schedule updating.
#### Link to tracking issue
Fixes #11858
---------
Co-authored-by: Yang Song <songy23@users.noreply.github.com>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description
Fixes #11859