Compare commits

380 Commits

Author SHA1 Message Date
Giedrius Statkevičius 3727363b49
Merge pull request #8335 from pedro-stanaka/fix/flaky-unit-test-store-proxy
fix: make TestProxyStore_SeriesSlowStores less flaky by removing timing assertions
2025-06-26 12:20:19 +03:00
Giedrius Statkevičius 37254e5779
Merge pull request #8336 from thanos-io/lazyindexheader_fix
indexheader: fix race between lazy index header creation
2025-06-26 11:19:12 +03:00
Giedrius Statkevičius 4b31bbaa6b indexheader: create lazy header in singleflight
Creation of the index header shares the underlying storage, so we should
use singleflight here to create it only once.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-26 10:18:07 +03:00
Giedrius Statkevičius d6ee898a06 indexheader: produce race in test
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-26 10:01:21 +03:00
Giedrius Statkevičius 5a95d13802
Merge pull request #8333 from thanos-io/repro_8224
e2e: add repro for 8224
2025-06-26 08:01:35 +03:00
Pedro Tanaka b54d293dbd
fix: make TestProxyStore_SeriesSlowStores less flaky by removing timing assertions
The TestProxyStore_SeriesSlowStores test was failing intermittently in CI due to
strict timing assertions that were sensitive to system load and scheduling variations.

The test now focuses on functional correctness rather than precise timing,
making it more reliable in CI environments while still validating the
proxy store's timeout and partial response behavior.

Signed-off-by: Pedro Tanaka <pedro.stanaka@gmail.com>
2025-06-25 23:09:47 +02:00
Giedrius Statkevičius dfcbfe7c40 e2e: add repro for 8224
Add repro for https://github.com/thanos-io/thanos/issues/8224. Fix in
follow up PRs.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-25 18:07:48 +03:00
Giedrius Statkevičius 8b738c55b1
Merge pull request #8331 from thanos-io/merge-release-0.39-to-main-v2
Merge release 0.39 to main
2025-06-25 15:25:36 +03:00
Giedrius Statkevičius 69624ecbf1 Merge branch 'main' into merge-release-0.39-to-main-v2
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-25 14:59:35 +03:00
Giedrius Statkevičius 0453c9b144
*: release 0.39.0 (#8330)
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-25 14:05:34 +03:00
Saswata Mukherjee 9c955d21df
e2e: Check rule group label works (#8322)
* e2e: Check rule group label works

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix fanout test

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2025-06-23 10:27:07 +01:00
Paul 7de9c13e5f
add rule tsdb.enable-native-histograms flag (#8321)
Signed-off-by: Paul Hsieh <supaulkawaii@gmail.com>
2025-06-23 10:06:00 +01:00
Giedrius Statkevičius a6c05e6df6
*: add CHANGELOG, update VERSION (#8320)
Prepare for 0.39.0-rc.0.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-20 07:12:19 +03:00
Giedrius Statkevičius 34a98c8efb
CHANGELOG: indicate release (#8319)
Indicate that 0.39.0 is in progress.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-19 17:59:12 +03:00
Giedrius Statkevičius 933f04f55e
query_frontend: only ready if downstream is ready (#8315)
We had an incident in prod where QFE reported that it was ready even
though the downstream didn't work due to a misconfigured load balancer.
This PR proposes sending periodic requests to the downstream
to check whether it is working.

TestQueryFrontendTenantForward never worked so I deleted it.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-18 11:56:48 +03:00
dependabot[bot] f1c0f4b9b8
build(deps): bump github.com/KimMachineGun/automemlimit (#8312)
Bumps [github.com/KimMachineGun/automemlimit](https://github.com/KimMachineGun/automemlimit) from 0.7.2 to 0.7.3.
- [Release notes](https://github.com/KimMachineGun/automemlimit/releases)
- [Commits](https://github.com/KimMachineGun/automemlimit/compare/v0.7.2...v0.7.3)

---
updated-dependencies:
- dependency-name: github.com/KimMachineGun/automemlimit
  dependency-version: 0.7.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-18 12:35:30 +05:30
Hongcheng Zhu a6370c7cc6
Add Prometheus counters for pending write requests and series requests in Receive (#8308)
Signed-off-by: HC Zhu <hczhu.mtv@gmail.com>
Co-authored-by: HC Zhu (DB) <hc.zhu@databricks.com>
2025-06-17 10:46:12 +05:30
Hongcheng Zhu 8f715b0b6b
Query: limit LazyRetrieval memory buffer size (#8296)
* Limit lazyRespSet memory buffer size using a ring buffer

Signed-off-by: HC Zhu <hczhu.mtv@gmail.com>

* store: make heap a bit more consistent

Add len comparison to make it more consistent.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Fix linter complaints

Signed-off-by: HC Zhu <hczhu.mtv@gmail.com>

---------

Signed-off-by: HC Zhu <hczhu.mtv@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Co-authored-by: HC Zhu (DB) <hc.zhu@databricks.com>
Co-authored-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Co-authored-by: HC Zhu <hczhu.mtv@gmail.com>
2025-06-14 10:52:46 -07:00
Filip Petkovski 6c27396458
Merge pull request #8306 from GregSharpe1/main
[docs] Updating documentation around --compact flags
2025-06-13 08:53:04 +02:00
Greg Sharpe d1afea6a69 Update the documentation to reflect the correct flags when using --compact.enable-vertical-compaction.
Signed-off-by: Greg Sharpe <git+me@gregsharpe.co.uk>
2025-06-13 08:28:42 +02:00
gabyf 03d5b6bc28
tools: fix tool bucket inspect output arg description (#8252)
* docs: fix tool bucket output arg description

Signed-off-by: gabyf <zweeking.tech@gmail.com>

* fix(tools_bucket): output description from cvs to csv

Signed-off-by: gabyf <zweeking.tech@gmail.com>

---------

Signed-off-by: gabyf <zweeking.tech@gmail.com>
2025-06-12 16:35:42 -07:00
Giedrius Statkevičius 8769b97c86
go.mod: update promql engine + Prom dep (#8305)
Update dependencies. Almost everything works except for
https://github.com/prometheus/prometheus/pull/16252.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-12 10:50:03 +03:00
Aaron Walker 26f6e64365
Revert capnp to v3.0.0-alpha (#8300)
cef0b02 caused a regression of !7944. This reverts the upgrade to the previously working version.

Signed-off-by: Aaron Walker <aaron@vcra.io>
2025-06-10 09:41:59 +05:30
dependabot[bot] 60533e4a22
build(deps): bump golang.org/x/time from 0.11.0 to 0.12.0 (#8302)
Bumps [golang.org/x/time](https://github.com/golang/time) from 0.11.0 to 0.12.0.
- [Commits](https://github.com/golang/time/compare/v0.11.0...v0.12.0)

---
updated-dependencies:
- dependency-name: golang.org/x/time
  dependency-version: 0.12.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-10 09:31:02 +05:30
dependabot[bot] 95a2b00f17
build(deps): bump github.com/alicebob/miniredis/v2 from 2.22.0 to 2.35.0 (#8303)
Bumps [github.com/alicebob/miniredis/v2](https://github.com/alicebob/miniredis) from 2.22.0 to 2.35.0.
- [Release notes](https://github.com/alicebob/miniredis/releases)
- [Changelog](https://github.com/alicebob/miniredis/blob/master/CHANGELOG.md)
- [Commits](https://github.com/alicebob/miniredis/compare/v2.22.0...v2.35.0)

---
updated-dependencies:
- dependency-name: github.com/alicebob/miniredis/v2
  dependency-version: 2.35.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-10 09:30:45 +05:30
dependabot[bot] 2ed24bdf5b
build(deps): bump github/codeql-action from 3.26.13 to 3.28.19 (#8304)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.13 to 3.28.19.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](f779452ac5...fca7ace96b)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.28.19
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-10 09:30:25 +05:30
Naman-Parlecha 23d60b8615
Fix: DataRace in TestEndpointSetUpdate_StrictEndpointMetadata test (#8288)
* fix: Fixing Unit Test TestEndpointSetUpdate_StrictEndpointMetadata

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

* revert: CHANGELOG.md

Signed-off-by: Naman-Parlecha <namanparlecha@gmail.com>

---------

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
Signed-off-by: Naman-Parlecha <namanparlecha@gmail.com>
2025-06-06 15:51:53 +03:00
Naman-Parlecha 290f16c0e9
Resolve GitHub Actions Failure (#8299)
* update: changing to new prometheus page

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

* fix: disable-admin-op flag

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

---------

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
2025-06-05 13:52:12 +03:00
Aaron Walker 4ad45948cd
Receive: Remove migration of legacy storage to multi-tsdb (#8289)
This has been in since 0.13 (~5 years ago). It fixes an issue that occurred when the default tenant had no data and got churned: the migration then assumed that per-tenant directories were actually blocks, leaving those blocks unqueryable.

Signed-off-by: Aaron Walker <aaron@vcra.io>
2025-06-03 16:57:57 +03:00
Daniel Blando 15b1ef2ead
shipper: allow shipper sync to skip corrupted blocks (#8259)
* Allow shipper sync to skip corrupted blocks

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>

* Move check to blockMetasFromOldest

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>

* Split metrics. Return error

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>

* fix test

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>

* Reorder shipper constructor variables

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>

* Use opts in shipper constructor

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>

* Fix typo

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>

---------

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>
2025-06-02 23:30:16 -07:00
Naman-Parlecha 2029c9bee0
store: Add --disable-admin-operations Flag to Store Gateway (#8284)
* fix(sidebar): maintain expanded state based on current page

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

* fixing changelog

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

* store: --disable-admin-operation flag

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

docs: Adding Flag details

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

updated changelog

refactor: changelog

Signed-off-by: Naman-Parlecha <namanparlecha@gmail.com>

---------

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
Signed-off-by: Naman-Parlecha <namanparlecha@gmail.com>
2025-06-01 15:26:58 -07:00
Saumya Shah 4e04420489
query: handle query.Analyze returning nil gracefully (#8199)
* fix: handle analyze returning nil gracefully

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* update CHANGELOG.md

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* fix format

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

---------

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>
2025-05-30 12:15:42 +03:00
Naman-Parlecha 36df30bbe8
fix: maintain expanded state based on current page (#8266)
* fix(sidebar): maintain expanded state based on current page

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

* fixing changelog

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

---------

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
Signed-off-by: Naman-Parlecha <namanparlecha@gmail.com>
2025-05-30 12:07:23 +03:00
Saumya Shah 390fd0a023
query, query-frontend, ruler: Add support for flags to use promQL experimental functions & bump promql-engine (#8245)
* feat: add support for experimental functions, if enabled

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* fix tests

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* allow setting enable-feature flag in ruler

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* add flag info in docs

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* add CHANGELOG

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* add hidden flag to throw err on query fallback, red in tests ^_^

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* bump promql-engine to latest version/commit

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* format docs

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

---------

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>
2025-05-30 10:04:28 +03:00
Anna Tran 12649d8be7
Force sync writes to meta.json in case of host crash (#8282)
* Force sync writes to meta.json in case of host crash

Signed-off-by: Anna Tran <trananna@amazon.com>

* Update CHANGELOG for fsync meta.json

Signed-off-by: Anna Tran <trananna@amazon.com>

---------

Signed-off-by: Anna Tran <trananna@amazon.com>
2025-05-29 12:23:49 +03:00
Giedrius Statkevičius cef0b0200e
go.mod: mass update modules (#8277)
Maintenance task: let's update all modules.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-05-27 18:32:28 +03:00
Saumya Shah efc6eee8c6
query: fix query analyze to return appropriate results (#8262)
* call query analysis once the query is being executed

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* refactor the analyze logic

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* send non-analyzable warnings instead of returning an err

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* add separate warnings for the query's non-analyzable state based on engine

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

---------

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>
2025-05-27 16:13:30 +03:00
Siavash Safi da421eaffe
Shipper: fix missing meta file errors (#8268)
- fix meta file read error check
- use proper logs for missing meta file vs. other read errors

Signed-off-by: Siavash Safi <siavash@cloudflare.com>
2025-05-23 11:46:09 +00:00
Giedrius Statkevičius d71a58cbd4
docs: fix receive page (#8267)
Fix the docs after the most recent merge.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-05-23 10:47:01 +00:00
Giedrius Statkevičius f847ff0262
receive: implement shuffle sharding (#8238)
See the documentation for details.

Closes #3821.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-05-22 11:08:23 +03:00
dronenb ec9601aa0e
feat(promu): add darwin/arm64 (#8263)
* feat(promu): add darwin/arm64

Signed-off-by: Ben Dronen <dronenb@users.noreply.github.com>

* fix(promu): just use darwin

Signed-off-by: Ben Dronen <dronenb@users.noreply.github.com>

---------

Signed-off-by: Ben Dronen <dronenb@users.noreply.github.com>
2025-05-22 10:04:57 +02:00
Michael Hoffmann 759773c4dc
shipper: delete unused functions (#8260)
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-05-21 08:18:52 +00:00
Giedrius Statkevičius 88092449cd
docs: volunteer as shepherd (#8249)
* docs: volunteer as shepherd

Release the next version in a few weeks.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Fix formatting

Signed-off-by: Matej Gera <38492574+matej-g@users.noreply.github.com>

---------

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Signed-off-by: Matej Gera <38492574+matej-g@users.noreply.github.com>
Co-authored-by: Matej Gera <38492574+matej-g@users.noreply.github.com>
2025-05-15 14:56:00 +03:00
Ayoub Mrini 34b3d64034
test(tools_test.go/Test_CheckRules_Glob): take read-only current dirs into consideration while changing file permissions (#8014)

The process may not have the needed permissions on the file (not the owner, not root, or lacking the CAP_FOWNER capability) to chmod it.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-05-14 13:20:14 +01:00
dongjiang 242b5f6307
add otlp clientType (#8243)
Signed-off-by: dongjiang <dongjiang1989@126.com>
2025-05-13 14:18:41 +03:00
Giedrius Statkevičius aa3e4199db
e2e: disable some more flaky tests (#8241)
These are flaky hence disable them.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-05-09 16:46:23 +03:00
Giedrius Statkevičius 81b4260f5f
reloader: disable some flaky tests (#8240)
Disabling some flaky tests.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-05-08 15:59:24 +03:00
Saumya Shah 2dfc749a85
UI: bump codemirror-promql dependency to latest version (#8230)
* bump codemirror-promql react dep to latest version

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* fix lint errors, build react-app

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* sync ui change of input expression

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* revert build files

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* build and update few warnings

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

---------

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>
2025-05-07 11:20:49 +01:00
Philip Gough 2a5a856e34
tools: Extend bucket ls options (#8225)
* tools: Extend bucket ls command with min and max time, selector config and timeout options

Signed-off-by: Philip Gough <philip.p.gough@gmail.com>

* make: docs

Signed-off-by: Philip Gough <philip.p.gough@gmail.com>

Update cmd/thanos/tools_bucket.go

Co-authored-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: Philip Gough <pgough@redhat.com>

Update cmd/thanos/tools_bucket.go

Co-authored-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: Philip Gough <pgough@redhat.com>

---------

Signed-off-by: Philip Gough <philip.p.gough@gmail.com>
Signed-off-by: Philip Gough <pgough@redhat.com>
Co-authored-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2025-04-25 10:33:34 +00:00
Giedrius Statkevičius cff147dbc0
receive: remove Get() method from hashring (#8226)
Get() is equivalent to GetN(1), so remove it; it's not used.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-04-25 09:59:37 +00:00
Saswata Mukherjee 7d7ea650b7
Receive: Ensure forward/replication metrics are incremented in err cases (#8212)
* Ensure forward/replication metrics are incremented in err cases

This commit ensures forward and replication metrics are incremented with
err labels.

This seemed to be missing, came across this whilst working on a
dashboard.

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add changelog

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2025-04-22 11:35:34 +00:00
Andrew Reilly 92db7aabb1
Update query.md documentation where example uses --query.tenant-default-id flag instead of --query.default-tenant-id (#8210)
Signed-off-by: Andrew Reilly <adr@maas.ca>
2025-04-22 11:27:49 +03:00
Filip Petkovski 66f54ac88d
Merge pull request #8216 from yuchen-db/fix-iter-race
Fix Pull iterator race between next() and stop()
2025-04-22 08:19:10 +02:00
Yuchen Wang a8220d7317 simplify unit test
Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>
2025-04-20 17:54:47 -07:00
Yuchen Wang 909c08fa98 add comments
Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>
2025-04-20 17:46:38 -07:00
Yuchen Wang 6663bb01ac update changelog
Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>
2025-04-20 17:16:33 -07:00
Yuchen Wang d7876b4303 fix unit test
Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>
2025-04-20 16:55:57 -07:00
Yuchen Wang 6f556d2bbb add unit test
Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>
2025-04-20 15:46:54 -07:00
Yuchen Wang 0dcc9e9ccd add changelog
Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>
2025-04-20 15:46:54 -07:00
Yuchen Wang f168dc0cbb fix Pull iter race between next() and stop()
Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>
2025-04-20 15:46:54 -07:00
dependabot[bot] 8273ad013c
build(deps): bump github.com/golang-jwt/jwt/v5 from 5.2.1 to 5.2.2 (#8164)
Bumps [github.com/golang-jwt/jwt/v5](https://github.com/golang-jwt/jwt) from 5.2.1 to 5.2.2.
- [Release notes](https://github.com/golang-jwt/jwt/releases)
- [Changelog](https://github.com/golang-jwt/jwt/blob/main/VERSION_HISTORY.md)
- [Commits](https://github.com/golang-jwt/jwt/compare/v5.2.1...v5.2.2)

---
updated-dependencies:
- dependency-name: github.com/golang-jwt/jwt/v5
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-17 15:08:12 +01:00
Michael Hoffmann 31c6115317
Query: fix partial response for distributed instant query (#8211)
This commit fixes a typo in partial response handling for distributed
instant queries.

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-04-17 08:06:42 +00:00
Aaron Walker c0b5500cb5
Unhide tsdb.enable-native-histograms flag in receive (#8202)
Signed-off-by: Aaron Walker <aaron@vcra.io>
2025-04-11 13:33:24 +02:00
Michael Hoffmann ce2b51f93e
Sidecar: increase default prometheus timeout (#8192)
Adjust the default get-config timeout to match the default get-config
interval.

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-04-07 15:29:22 +00:00
Naohiro Okada 1a559f9de8
fix changelog markdown. (#8190)
Signed-off-by: naohiroo <naohiro.dev@gmail.com>
2025-04-04 14:23:16 +02:00
Michael Hoffmann b2f5ee44a7
merge release 0.38.0 to main (#8186)
* Changelog: cut release 0.38-rc.0 (#8174)

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>

* Changelog: cut release 0.38.0-rc.1 (#8180)

* Query: fix endpointset setup

This commit fixes an issue where we add non-strict, non-group endpoints
to the endpointset twice, once with resolved addresses from the dns
provider and once with its dns prefix.

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>

* deps: bump promql-engine (#8181)

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>

* Changelog: cut release 0.38-rc.1

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>

---------

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>

* Changelog: cut release 0.38 (#8185)

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>

---------

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-04-04 06:10:48 +00:00
Michael Hoffmann 08e5907cba
deps: bump promql-engine (#8181)
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-03-31 13:22:35 +00:00
Michael Hoffmann 2fccdfbf5a
Query: fix endpointset setup (#8175)
This commit fixes an issue where we add non-strict, non-group endpoints
to the endpointset twice, once with resolved addresses from the dns
provider and once with its dns prefix.

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-03-27 07:02:21 +00:00
Michael Hoffmann da855a12dc
Changelog: mark 0.38 as in-progress (#8173)
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-03-25 12:57:29 +00:00
Michael Hoffmann 68844d46d7
e2e: use prom 3 (#8165)
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-03-24 18:10:54 +00:00
Saumya Shah d1345b999e
update: interactive tests to update non-supported store flags to endpoint in querier (#8157)
Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>
2025-03-17 11:07:46 +00:00
Ben Kochie 8da0a2b69b
rule: Add support for query offset (#8158)
Support Prometheus rule manager upstream "query offset" feature.
* Add support for a default rule query offset via command flag.
* Add per rule group query_offset support.

Fixes: https://github.com/thanos-io/thanos/issues/7596

Signed-off-by: SuperQ <superq@gmail.com>
2025-03-14 08:24:32 +00:00
Michał Mazur 1f5bff2a01
query: Support chain deduplication algorithm (#7808)
Signed-off-by: Michał Mazur <mmazur.box@gmail.com>
2025-03-13 08:24:18 -07:00
dependabot[bot] e3acaeb8d6
build(deps): bump go.opentelemetry.io/otel/sdk from 1.34.0 to 1.35.0 (#8148)
Bumps [go.opentelemetry.io/otel/sdk](https://github.com/open-telemetry/opentelemetry-go) from 1.34.0 to 1.35.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.34.0...v1.35.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/sdk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-11 10:12:47 +02:00
Michael Hoffmann 097b2a4783
Query: bump promql-engine, fix fallout for distributed mode (#8135)
This PR bumps the thanos promql-engine repository, fixes the fallout and
makes distributed mode respect the user requested partial response
setting.

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-03-10 08:39:45 +00:00
Saswata Mukherjee 0414eef64d
Bump common+client_golang to deal with utf-8 (#8134)
* Bump common+client_golang to deal with utf-8

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix+Add tests

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Bump to 1.21

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2025-03-04 15:31:05 +00:00
Giedrius Statkevičius 03c96d05a0
compact: implement native histogram downsampling (#8110)
* test/e2e: add native histogram downsampling test case

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* downsample: port other PR

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* compact: fix after review

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

---------

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-03-03 21:07:05 +02:00
dependabot[bot] 812688e573
build(deps): bump peter-evans/create-pull-request from 7.0.6 to 7.0.7 (#8123)
Bumps [peter-evans/create-pull-request](https://github.com/peter-evans/create-pull-request) from 7.0.6 to 7.0.7.
- [Release notes](https://github.com/peter-evans/create-pull-request/releases)
- [Commits](67ccf781d6...dd2324fc52)

---
updated-dependencies:
- dependency-name: peter-evans/create-pull-request
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-03 11:20:06 +00:00
dependabot[bot] 7457649c0f
build(deps): bump actions/cache from 4.0.2 to 4.2.1 (#8122)
Bumps [actions/cache](https://github.com/actions/cache) from 4.0.2 to 4.2.1.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](0c45773b62...0c907a75c2)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-03 11:19:49 +00:00
Michael Hoffmann 81dfb50c1e
Query: bump promql-engine (#8118)
Bumping PromQL engine, fixing fallback fallout.

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
Co-authored-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-03-03 11:18:08 +00:00
Ben Ye c69f11214d
Optimize wildcard matchers for .* and .+ (#8131)
* optimize wildcard matchers for .* and .+

Signed-off-by: yeya24 <benye@amazon.com>

* add changelog

Signed-off-by: yeya24 <benye@amazon.com>

---------

Signed-off-by: yeya24 <benye@amazon.com>
2025-03-02 19:05:17 -08:00
Ben Ye 4ba7d596a8
Infer max query downsample resolution from promql query (#7012)
* Adjust max_source_resolution automatically based on PromQL queries

Signed-off-by: Ben Ye <benye@amazon.com>

* fix data race

Signed-off-by: yeya24 <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>
Signed-off-by: yeya24 <benye@amazon.com>
2025-02-25 15:28:43 -08:00
민선 (minnie) 4a83459892
query: add missing xincrease/xrate aggregation (#8120)
2025-02-25 09:14:48 -08:00
dependabot[bot] f230915c1c
build(deps): bump go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp (#8067)
Bumps [go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp](https://github.com/open-telemetry/opentelemetry-go) from 1.29.0 to 1.34.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.29.0...v1.34.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-20 22:42:18 -08:00
Yi Jin 426b2f19f1
[issue-8106] fix tenant hashring glob with multiple match patterns (#8107)
Signed-off-by: Yi Jin <yi.jin@databricks.com>
2025-02-20 12:11:27 +02:00
Michael Hoffmann 151ae7490e
Query: dynamic endpoint groups are allowed (#8113)
This PR fixes a bug where dynamic endpoint groups are silently ignored.

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
Co-authored-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-02-19 10:31:58 +00:00
Marta 71bbafb611
Remove quote from replica label. (#8075)
Signed-off-by: Marta <me@marta.nz>
2025-02-18 11:35:11 -08:00
Saumya Shah 6fa81f797c
*: bump Go to 1.24 (#8105)
* bump go version to newer 1.24

Signed-off-by: Saumyacodes-40 <saumyabshah90@gmail.com>

* update .go-version

Signed-off-by: Saumyacodes-40 <saumyabshah90@gmail.com>

* fix failing test; formatted string func now requires a constant string literal

Signed-off-by: Saumyacodes-40 <saumyabshah90@gmail.com>

* test: point faillint to thanos-community/faillint

Signed-off-by: Saumyacodes-40 <saumyabshah90@gmail.com>

* fix commit hash

Signed-off-by: Saumyacodes-40 <saumyabshah90@gmail.com>

* use branch instead of hash

Signed-off-by: Saumyacodes-40 <saumyabshah90@gmail.com>

* test: update faillint.mod/sum

Signed-off-by: Saumyacodes-40 <saumyabshah90@gmail.com>

* bump golangci-lint

Signed-off-by: Saumyacodes-40 <saumyabshah90@gmail.com>

* update .bingo/faillint.mod based on new deps upgrade

Signed-off-by: Saumyacodes-40 <saumyabshah90@gmail.com>

* address required changes

Signed-off-by: Saumyacodes-40 <saumyabshah90@gmail.com>

---------

Signed-off-by: Saumyacodes-40 <saumyabshah90@gmail.com>
2025-02-18 12:41:03 +00:00
dependabot[bot] fe651280a5
build(deps): bump github.com/tjhop/slog-gokit from 0.1.2 to 0.1.3 (#8109)
Bumps [github.com/tjhop/slog-gokit](https://github.com/tjhop/slog-gokit) from 0.1.2 to 0.1.3.
- [Release notes](https://github.com/tjhop/slog-gokit/releases)
- [Changelog](https://github.com/tjhop/slog-gokit/blob/main/.goreleaser.yaml)
- [Commits](https://github.com/tjhop/slog-gokit/compare/v0.1.2...v0.1.3)

---
updated-dependencies:
- dependency-name: github.com/tjhop/slog-gokit
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-18 10:44:51 +02:00
SungJin1212 346d18bb0f
Update Prometheus version to v3.1.0 (#8090)
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
2025-02-12 12:17:00 +02:00
Giedrius Statkevičius 38f4c3c6a2
store: lock around iterating over s.blocks (#8088)
Hold a lock around s.blocks when iterating over it. I have experienced a
case where a block had somehow been added to a blockSet twice, and the
block being removed from s.blocks mid-iteration is the only way that
could happen. This is the only "bad" thing I've been able to spot.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-02-07 10:54:59 +02:00
Piotr Śliwka a13fc75c04
Fix deadlock in metadata fetcher (#8092)
This is a fix for a bug we encountered in our production deployment at
$WORK, where we experience Thanos Store to spontaneously stop refreshing
its metadata cache every time our (Ceph-based) object storage starts
rate-limiting Thanos's requests too much. This causes the Store to
permanently stop discovering new blocks in the bucket, and keep trying
to access old, long-gone blocks (which have already been compacted and
removed), breaking subsequent queries to the Store and mandating its
manual restart. More details about the bug and the fix below.

Initially, the `GetActiveAndPartialBlockIDs` method spawns a fixed number
of 64 goroutines to issue concurrent requests to remote storage. Each
time the storage causes the `f.bkt.Exists` call to return an error, one
of the goroutines exits returning said error, effectively reducing the
concurrency of processing the remaining block IDs in `metaChan`. While
this is not a big problem in the case of one or two errors, it is entirely
possible that, in case of prolonged storage problems, all 64 goroutines
quit, resulting in `metaChan` filling up and blocking the `f.bkt.Iter`
iterator below. This causes the whole method to be stuck indefinitely,
even if the storage becomes fully operational again.

This commit fixes the issue by allowing the iterator to return as soon
as a single processing goroutine errors out, so that the method can
reliably finish, returning the error as intended. Additionally, the
processing goroutines are adjusted as well, to make them all quit early
without consuming remaining items in `metaChan`. While the latter is not
strictly necessary to fix this bug, it doesn't make sense to let any
remaining goroutines keep issuing requests to the storage if the method
is already bound to return nil result along with the first encountered
error.

Signed-off-by: Piotr Śliwka <psliwka@opera.com>
2025-02-07 10:54:32 +02:00
Célian GARCIA c25a356214
fix: add POST into allowed CORS methods header (#8091)
Signed-off-by: Célian Garcia <celian.garcia@amadeus.com>
2025-02-06 16:49:44 +02:00
SungJin1212 57efc2aacd
Add a func to convert go-kit log to slog (#7969)
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
2025-02-02 23:07:37 -08:00
Ben Ye 8cd83bfd2b
Extend posting-group-max-key-series-ratio for add all posting group (#8083)
Signed-off-by: yeya24 <benye@amazon.com>
2025-02-02 10:32:59 -08:00
Ben Ye 45013e176f
skip match label values for certain matchers (#8084)
Signed-off-by: yeya24 <benye@amazon.com>
2025-02-02 10:30:31 -08:00
Michael Hoffmann 2367777322
query, rule: make endpoint discovery dynamically reloadable (#7890)
* Removed previously deprecated and hidden flags to configure endpoints ( --rule, --target, ...)
* Added new flags --endpoint.sd-config, --endpoint-sd-config-reload-interval to configure a dynamic SD file
* Moved endpoint set construction into cmd/thanos/endpointset.go for a little cleanup
* Renamed "thanos_(querier/ruler)_duplicated_store_addresses_total" to
  "thanos_(querier/ruler)_duplicated_endpoint_addresses_total"

The new config makes it possible to also set "strict" and "group" flags on the endpoint instead
of only their addresses, making it possible to have file based service discovery for endpoint groups too.

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-01-15 16:47:59 +02:00
Nicolas Takashi 300a9ed653
[FEATURE] adding otlp endpoint (#7996)
* [FEATURE] adding otlp endpoint

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [FEATURE] adding otlp endpoint

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [FEATURE] adding otlp endpoint

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [FIX] e2e tests for otlp receiver

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [CHORE] adding otlp flags

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [DOC] updating docs

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [CHORE] copying otlptranslator

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [CHORE] copying otlptranslator tests

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [CHORE] copying otlptranslator tests

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [CHORE] copying otlptranslator tests

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [FIX] lint issues

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [FIX] lint issues

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [FIX] lint issues

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [CHORE] using multi errors

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [FIX] span naming convention

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [TEST] adding handler otlp unit test

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [TEST] upgrade collector version

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [FIX] golang lint

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [CHORE] adding allow size bytes limit gate

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [FIX] unit test otlp endpoint

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* Apply suggestions from code review

Co-authored-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [FIX] unit test otlp endpoint

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [DOC] updating docs

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [CHORE] applying pr comments

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* Update pkg/receive/handler_otlp.go

Co-authored-by: Matej Gera <38492574+matej-g@users.noreply.github.com>
Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* Update cmd/thanos/receive.go

Co-authored-by: Matej Gera <38492574+matej-g@users.noreply.github.com>
Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [DOCS] updating

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* [FIX] go mod lint issues

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>

* Fix TestFromMetrics error comparison

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Co-authored-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Co-authored-by: Matej Gera <38492574+matej-g@users.noreply.github.com>
2025-01-15 11:31:04 +00:00
Pedro Tanaka a3b78c231c
QFE: fixing stats middleware when cache is enabled (#8046)
* QFE: fixing stats middleware when cache is enabled

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Clarify strange config parameter

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

---------

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2025-01-15 10:28:35 +00:00
Ben Ye caffc1181b
make lazy posting series match ratio configurable (#8049)
Signed-off-by: Ben Ye <benye@amazon.com>
2025-01-15 10:16:25 +00:00
Daniel Sabsay 4ba0ba4038
pkg/cacheutil: Async op fix (#8044)
* Add test for AsyncOperationProcessor stop() behavior

The existing implementation sometimes drops existing operations that are
still on the queue when .stop() is called.

If multiple communications in a select statement can proceed, one is
chosen pseudo-randomly: https://go.dev/ref/spec#Select_statements

This means that sometimes a processor worker will process a remaining
operation, and sometimes it won't.

Signed-off-by: Daniel Sabsay <sabsay@adobe.com>

* Fix async_op test regarding stop() behavior

Signed-off-by: Daniel Sabsay <sabsay@adobe.com>

* add header to test file

Signed-off-by: Daniel Sabsay <sabsay@adobe.com>

---------

Signed-off-by: Daniel Sabsay <sabsay@adobe.com>
Co-authored-by: Daniel Sabsay <sabsay@adobe.com>
2025-01-10 09:47:50 +02:00
Michael Hoffmann f250d681fd
query: fix panic when selecting non-default engine (#8050)
Fix duplicate metrics registration when selecting non-default engine

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
Co-authored-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-01-09 10:38:33 +00:00
Alan Protasio 0d42636167
Matcher cache/series (#8045)
* Add option to cache matcher on the get series call

Signed-off-by: alanprot <alanprot@gmail.com>

* Adding matcher cacheoption for the store gateway

Signed-off-by: alanprot <alanprot@gmail.com>

* lint/docs

Signed-off-by: alanprot <alanprot@gmail.com>

* change desc

Signed-off-by: alanprot <alanprot@gmail.com>

* change desc

Signed-off-by: alanprot <alanprot@gmail.com>

* change desc

Signed-off-by: alanprot <alanprot@gmail.com>

* fix test

Signed-off-by: alanprot <alanprot@gmail.com>

* Caching only regex matchers

Signed-off-by: alanprot <alanprot@gmail.com>

* changelog

Signed-off-by: alanprot <alanprot@gmail.com>

---------

Signed-off-by: alanprot <alanprot@gmail.com>
2025-01-07 22:05:37 +00:00
Ben Ye 0e95c464dd
Fix binary reader download duration histogram (#8017)
* fix binary reader download duration histogram

Signed-off-by: Ben Ye <benye@amazon.com>

* enable native histograms

Signed-off-by: Ben Ye <benye@amazon.com>

* changelog

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2025-01-06 09:34:20 -08:00
Ben Ye ca2e23ffb5
add block lifecycle callback (#8036)
Signed-off-by: Ben Ye <benye@amazon.com>
2025-01-06 11:41:20 +00:00
Ben Ye 2ff07b2ce7
Optimize sort keys by server in memcache client (#8026)
* optimize sort keys by server in memcache client

Signed-off-by: Ben Ye <benye@amazon.com>

* address comments

Signed-off-by: Ben Ye <benye@amazon.com>

* remove unused mockAddr

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2025-01-06 09:46:56 +00:00
Alan Protasio bed76cf4dd
Fix matcher cache (#8039)
* Fix matcher cache

Signed-off-by: alanprot <alanprot@gmail.com>

* Simplifying cache interface

Signed-off-by: alanprot <alanprot@gmail.com>

---------

Signed-off-by: alanprot <alanprot@gmail.com>
2025-01-05 16:02:39 -08:00
Ben Ye 6e29530000
optimize store gateway bytes limiter reserve with type request (#8025)
Signed-off-by: Ben Ye <benye@amazon.com>
2025-01-05 09:59:57 +00:00
dependabot[bot] 4a246cee50
build(deps): bump peter-evans/create-pull-request from 6.1.0 to 7.0.6 (#8028)
Bumps [peter-evans/create-pull-request](https://github.com/peter-evans/create-pull-request) from 6.1.0 to 7.0.6.
- [Release notes](https://github.com/peter-evans/create-pull-request/releases)
- [Commits](c5a7806660...67ccf781d6)

---
updated-dependencies:
- dependency-name: peter-evans/create-pull-request
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-05 09:58:37 +00:00
Milind Dethe 40d844de5b
receive: unhide tsdb.out-of-order.time-window and tsdb.out-of-order.cap-max (#8032)
* Unhide tsdb.out-of-order.time-window and tsdb.out-of-order.cap-max

Signed-off-by: milinddethe15 <milinddethe15@gmail.com>

* make docs

Signed-off-by: milinddethe15 <milinddethe15@gmail.com>

* make docs

Signed-off-by: milinddethe15 <milinddethe15@gmail.com>

---------

Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
2025-01-05 09:58:08 +00:00
Ben Ye 07734b97fa
add pool for expanded posting slice (#8035)
* add pool for expanded posting slice

Signed-off-by: Ben Ye <benye@amazon.com>

* check nil postings

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2025-01-05 09:57:43 +00:00
Roberto O. Fernández Crisial 803556cb57
Updating x/net package (#8034)
Signed-off-by: Roberto O. Fernández Crisial <roberto.crisial@ip-192-168-0-7.ec2.internal>
Signed-off-by: Roberto O. Fernández Crisial <rofc@rofc.com.ar>
2025-01-03 11:44:37 -08:00
Pedro Tanaka 626d0e5bfb
Receiver: cache matchers for series calls (#7353)
* Receiver: cache matchers for series calls

We have tried caching matchers before with a time-based expiration cache; this time we are trying an LRU cache.

We saw some of our receivers busy compiling regexes, with high CPU usage similar to the profile of the benchmark I added here:

* Adding matcher cache for method `MatchersToPromMatchers` and a new version which uses the cache.
* The main change is in `matchesExternalLabels` function which now receives a cache instance.

adding matcher cache and refactor matchers

Co-authored-by: Andre Branchizio <andre.branchizio@shopify.com>

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

Using the cache in proxy and tsdb stores (only receiver)

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

fixing problem with deep equality

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

adding some docs

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

Adding benchmark

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

undo unnecessary changes

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

Adjusting metric names

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

adding changelog

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

wiring changes to the receiver

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

Fixing linting

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

docs

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* using singleflight to get or set items

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* improve metrics

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Introduce interface for matchers cache

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* fixing unit test

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* adding changelog

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* fixing benchmark

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* moving matcher cache to storecache package

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Trying to make the cache more reusable introducing interface

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

Fixing problem with wrong initialization

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

Moving interface to storecache package

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

remove empty file and fix calls to constructor passing nil;

Signed-off-by: Pedro Tanaka <pedro.stanaka@gmail.com>

* Fix false entry on change log

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Removing default value for registry and rename test file

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Using fmt.Errorf()

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Remove method that is not on interface anymore

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

* Remove duplicate get call

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>

---------

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Pedro Tanaka <pedro.stanaka@gmail.com>
2025-01-03 09:59:24 -08:00
Pengyu Wang ca40906c83
Bump devcontainer Dockerfile base image from go1.22 to go1.23 (#8031)
Signed-off-by: Pengyu Wang <hncswpy@gmail.com>
2024-12-31 13:44:21 -08:00
Harry John 6f03fcb0ba
QFE: Fix @ modifier not being applied correctly on subqueries (#8016)
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2024-12-26 15:38:17 -08:00
Filip Petkovski 2d041dc774
Merge pull request #8018 from Juneezee/refactor/xxhash
*: replace `cespare/xxhash` with `cespare/xxhash/v2`
2024-12-24 15:37:35 +01:00
Eng Zer Jun d298d5afee
*: replace `cespare/xxhash` with `cespare/xxhash/v2`
`github.com/cespare/xxhash/v2` is the latest version with bug fixes and
improvements.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2024-12-24 20:35:00 +08:00
Filip Petkovski ab13e10b3d
Merge pull request #8021 from Juneezee/refactor/exp
*: replace `golang.org/x/exp` with standard library
2024-12-24 13:20:09 +01:00
Eng Zer Jun 90bfef6a91
Tidy `go.mod` properly
Two sections in total: one for direct dependencies, and one for indirect
dependencies.

Reference: https://github.com/golang/go/issues/56471
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2024-12-24 01:17:41 +08:00
Eng Zer Jun b6556852a7
*: replace `golang.org/x/exp` with standard library
These experimental packages are now available in the Go standard
library. Since we upgraded our minimum Go version to 1.23 in PR
https://github.com/thanos-io/thanos/pull/7796, we can replace them with
the standard library:

	1. golang.org/x/exp/slices -> slices [1]
	2. golang.org/x/exp/maps -> maps [2]
	3. golang.org/x/exp/rand -> math/rand/v2 [3]

[1]: https://go.dev/doc/go1.21#slices
[2]: https://go.dev/doc/go1.21#maps
[3]: https://go.dev/doc/go1.22#math_rand_v2

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2024-12-24 01:16:32 +08:00
Filip Petkovski c5025b79af
Merge pull request #8002 from thanos-io/dependabot/go_modules/golang.org/x/crypto-0.31.0
build(deps): bump golang.org/x/crypto from 0.28.0 to 0.31.0
2024-12-20 16:06:01 +01:00
dependabot[bot] 8311e3de70
build(deps): bump golang.org/x/crypto from 0.28.0 to 0.31.0
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.28.0 to 0.31.0.
- [Commits](https://github.com/golang/crypto/compare/v0.28.0...v0.31.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-12-20 14:51:51 +00:00
Coleen Iona Quadros 4a847851e4
Add labels to rules UI (#8009)
Signed-off-by: Coleen Iona Quadros <coleen.quadros27@gmail.com>
2024-12-19 16:28:06 +00:00
Filip Petkovski b881f6c41b
Merge pull request #8005 from pedro-stanaka/fix-e2e-maybe
Fixing E2E tests using headless Chrome
2024-12-18 12:03:19 +01:00
Abel Simon 80c89fdb19
query: distributed engine - allow querying overlapping intervals (commit signed) (#8003)
* chore: add possibility to run individual e2e tests

Signed-off-by: Abel Simon <abelsimon48@gmail.com>

* chore: add metric pointer for too old sample logs

Signed-off-by: Abel Simon <abelsimon48@gmail.com>

* feat: add flag for distributed queries with overlapping intervals

Signed-off-by: Abel Simon <abelsimon48@gmail.com>

* chore: add failing overlapping interval test

Signed-off-by: Abel Simon <abelsimon48@gmail.com>

* chore: fix base branch diff

Signed-off-by: Abel Simon <abelsimon48@gmail.com>

---------

Signed-off-by: Abel Simon <abelsimon48@gmail.com>
2024-12-18 10:51:21 +00:00
Pedro Tanaka e2cb509f5f
Forcing headless and making sure we can run it before creating the context for testing
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-12-18 11:47:03 +01:00
Giedrius Statkevičius 8e4eb42d4c
.github: run e2e tests on newer ubuntu
I cannot reproduce chromedp panics locally so trying to see if a newer
Ubuntu version would help.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-12-18 12:32:09 +02:00
Rémi Vichery 1b58ed13c9
receive: fix maxBufferedResponses channel size to avoid deadlock (#7978)
* Fix maxBufferedResponses channel size to avoid deadlock

Fixes #7977

Signed-off-by: Remi Vichery <remi@alkira.com>

* Add changelog entry

Signed-off-by: Remi Vichery <remi@alkira.com>

* adjust line numbers in docs/components/receive.md to match updated code

Signed-off-by: Remi Vichery <remi@alkira.com>

---------

Signed-off-by: Remi Vichery <remi@alkira.com>
2024-12-18 10:23:26 +02:00
Michael Hoffmann 1ca8292729
api: bump promql engine and fix fallout (#8000)
* bump to new promql-engine version and fix fallout
* new promql-engine makes it possible to provide more options at runtime

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
Co-authored-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2024-12-17 16:28:23 +00:00
Saswata Mukherjee 683cf171e9
Merge pull request #7982 from saswatamcode/merge-release-0.37.2-to-main
Merge release 0.37.2 back to main
2024-12-11 10:10:45 +00:00
Saswata Mukherjee 3ac552d95a
Merge branch 'main' into merge-release-0.37.2-to-main
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-12-11 09:32:33 +00:00
Saswata Mukherjee 18291a78d4
Merge pull request #7980 from saswatamcode/cut-release-0.37.2
* Fix potential deadlock in hedging request (#7962)

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>

* sidecar: fix limit mintime (#7970)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Cut patch release v0.37.2

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix changelog

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Co-authored-by: SungJin1212 <tjdwls1201@gmail.com>
Co-authored-by: Michael Hoffmann <mhoffm@posteo.de>
2024-12-11 09:03:29 +00:00
Saswata Mukherjee c071d513f9
Fix changelog
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-12-11 08:51:31 +00:00
Saswata Mukherjee 49a0587b54
Cut patch release v0.37.2
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-12-11 08:51:31 +00:00
Michael Hoffmann 351f75b597
sidecar: fix limit mintime (#7970)
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-12-11 08:51:31 +00:00
SungJin1212 6cdc1ae96d
Fix potential deadlock in hedging request (#7962)
Signed-off-by: SungJin1212 <tjdwls1201@gmail.com>
2024-12-11 08:50:48 +00:00
Ben Ye 0ea6bac096
store gateway: fix merge fetched postings with lazy postings (#7979)
Signed-off-by: Ben Ye <benye@amazon.com>
2024-12-10 15:43:02 -08:00
Michael Hoffmann b3645c8017
sidecar: fix limit mintime (#7970)
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
2024-12-10 10:22:07 +00:00
Ben Ye 51c7dcd8c2
Mark posting group lazy if it has a lot of keys (#7961)
* mark posting group lazy if it has a lot of add keys

Signed-off-by: Ben Ye <benye@amazon.com>

* update docs

Signed-off-by: Ben Ye <benye@amazon.com>

* rename labels

Signed-off-by: Ben Ye <benye@amazon.com>

* changelog

Signed-off-by: Ben Ye <benye@amazon.com>

* change to use max key series ratio

Signed-off-by: Ben Ye <benye@amazon.com>

* update docs

Signed-off-by: Ben Ye <benye@amazon.com>

* mention metrics

Signed-off-by: Ben Ye <benye@amazon.com>

* update docs

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2024-12-09 23:13:11 -08:00
SungJin1212 d0d93dbf3e
Fix potential deadlock in hedging request (#7962)
Signed-off-by: kade.lee <tjdwls1201@gmail.com>
2024-12-05 12:39:58 +00:00
Saswata Mukherjee 7037331e6e
Merge pull request #7959 from saswatamcode/merge-release-0.37.1-to-main
Merge release 0.37.1 to main
2024-12-04 09:46:16 +00:00
Saswata Mukherjee 5d2f3b687e
Merge branch 'main' into merge-release-0.37.1-to-main
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-12-04 09:27:43 +00:00
Saswata Mukherjee e0812e2f46
Cut patch release `v0.37.1` (#7952)
* Merge pull request #7674 from didukh86/query_frontend_tls_redis_fix

Query-frontend: Fix connection to Redis cluster with TLS.
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Capnp: Use segment from existing message (#7945)

* Capnp: Use segment from existing message

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Downgrade capnproto

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

---------

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* [Receive] Fix race condition when adding multiple new tenants at once (#7941)

* [Receive] fix race condition

Signed-off-by: Yi Jin <yi.jin@databricks.com>

* add a change log

Signed-off-by: Yi Jin <yi.jin@databricks.com>

* memoize tsdb local clients without race condition

Signed-off-by: Yi Jin <yi.jin@databricks.com>

* fix data race in testing with some concurrent safe helper functions

Signed-off-by: Yi Jin <yi.jin@databricks.com>

* address comments

Signed-off-by: Yi Jin <yi.jin@databricks.com>

---------

Signed-off-by: Yi Jin <yi.jin@databricks.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Cut patch release v0.37.1

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Update promql-engine for subquery fix (#7953)

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Sidecar: Ensure limit param is positive for compatibility with older Prometheus (#7954)

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Update changelog

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix changelog

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: Yi Jin <yi.jin@databricks.com>
Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: Yi Jin <96499497+jnyi@users.noreply.github.com>
2024-12-04 08:22:07 +00:00
Saswata Mukherjee dec2686f99
Sidecar: Ensure limit param is positive for compatibility with older Prometheus (#7954)
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-12-03 18:55:46 +00:00
Saswata Mukherjee 1a328c124b
Update promql-engine for subquery fix (#7953)
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-12-03 13:33:10 +00:00
Yi Jin 1ea4e69908
[Receive] Fix race condition when adding multiple new tenants at once (#7941)
* [Receive] fix race condition

Signed-off-by: Yi Jin <yi.jin@databricks.com>

* add a change log

Signed-off-by: Yi Jin <yi.jin@databricks.com>

* memoize tsdb local clients without race condition

Signed-off-by: Yi Jin <yi.jin@databricks.com>

* fix data race in testing with some concurrent safe helper functions

Signed-off-by: Yi Jin <yi.jin@databricks.com>

* address comments

Signed-off-by: Yi Jin <yi.jin@databricks.com>

---------

Signed-off-by: Yi Jin <yi.jin@databricks.com>
2024-12-03 10:52:06 +02:00
Saswata Mukherjee 51fddeb28d
Merge pull request #7946 from saswatamcode/merge-release-0.37-to-main
Merge release v0.37.0 to main
2024-11-28 17:51:16 +00:00
Saswata Mukherjee 96cc4f17ff
Fix ver
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-11-28 17:22:20 +00:00
Saswata Mukherjee cd0ac33697
Merge branch 'main' into merge-release-0.37-to-main
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-11-28 17:20:28 +00:00
Filip Petkovski dd86ec8d0a
Capnp: Use segment from existing message (#7945)
* Capnp: Use segment from existing message

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Downgrade capnproto

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

---------

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-11-28 17:15:20 +00:00
bluesky6529 e133849d38
remove redundant redis config (#7942)
Signed-off-by: Helen Tseng <bluesky6529@gmail.com>
2024-11-28 17:12:34 +00:00
Filip Petkovski e4d8234c86
Merge pull request #7674 from didukh86/query_frontend_tls_redis_fix
Query-frontend: Fix connection to Redis cluster with TLS.
2024-11-28 17:44:47 +01:00
Filip Petkovski 1a3dc07892
Merge branch 'main' into query_frontend_tls_redis_fix
2024-11-28 17:32:22 +01:00
Philip Gough 1d76335611
receive: Allow specifying a custom gRPC service config via flag (#7907)
Signed-off-by: Philip Gough <philip.p.gough@gmail.com>
2024-11-25 13:00:26 +00:00
Giedrius Statkevičius a55844d52a
receive/expandedpostingscache: fix race (#7937)
Porting https://github.com/cortexproject/cortex/pull/6369 to our code
base. Add test that fails without the fix.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-11-25 12:14:08 +02:00
Saswata Mukherjee 889d527630
Cut release for v0.37.0 (#7936)
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-11-25 09:57:16 +00:00
Giedrius Statkevičius b144ebac20
receive: port expanded postings cache from Cortex (#7914)
Port expanded postings cache from Cortex. Huge kudos to @alanprot for
the implementation. I added a TODO item to convert our whole internal
caching infra to be promise based.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-11-21 11:27:53 +02:00
Saswata Mukherjee 02568235c4
Cut first release candidate for v0.37.0 (#7921)
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-11-19 11:21:32 +00:00
Saswata Mukherjee fd0643206a
docs: Fix formatting again (#7928)
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-11-18 17:25:35 +00:00
Saswata Mukherjee 6a2be98876
docs: Add link to ignore (#7926)
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-11-18 14:45:18 +00:00
Saswata Mukherjee df9cca7d31
Update objstore and promql-engine to latest (#7924)
* Update objstore and promql-engine to latest

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fixes after upgrade

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-11-18 14:44:42 +00:00
Ben Ye f998fc59c1
Close block series client at the end to not reuse chunk buf (#7915)
* always close block series client at the end

Signed-off-by: Ben Ye <benye@amazon.com>

* add back close for loser tree

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2024-11-18 11:36:52 +00:00
Saswata Mukherjee 8c49344fea
Changelog: Mark v0.37 release in progress (#7920)
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-11-18 09:42:38 +00:00
Saswata Mukherjee 2a975d3366
Skip TestDistributedEngineWithDisjointTSDBs (#7911)
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-11-15 10:16:59 +00:00
Michael Hoffmann caa972ffd1
store, query: remote engine bug (#7904)
* Fix a storage GW bug that loses TSDB infos when joining them
* E2E test demonstrating a bug in the MinT calculation in distributed
  Engine

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
2024-11-15 09:50:10 +00:00
Giedrius Statkevičius 20af3eb7df
receive/capnp: remove close (#7909)
I always get this in logs:
```
err: receive capnp conn: close tcp ...: use of closed network connection
```

This is also visible in the e2e test.

After Done() returns, the connection is closed either way so no need to
close it again.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-11-15 09:02:29 +00:00
Milind Dethe dc4c49f249
store: support hedged requests (#7860)
* support hedged requests in store

Signed-off-by: milinddethe15 <milinddethe15@gmail.com>

* hedged roundtripper with tdigest for dynamic delay

Signed-off-by: milinddethe15 <milinddethe15@gmail.com>

* refactor struct and fix lint

Signed-off-by: milinddethe15 <milinddethe15@gmail.com>

* Improve hedging implementation

Signed-off-by: milinddethe15 <milinddethe15@gmail.com>

* Improved hedging implementation

Signed-off-by: milinddethe15 <milinddethe15@gmail.com>

* Update store doc

Signed-off-by: milinddethe15 <milinddethe15@gmail.com>

* fix white space

Signed-off-by: milinddethe15 <milinddethe15@gmail.com>

* add enabled field

Signed-off-by: milinddethe15 <milinddethe15@gmail.com>

---------

Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
2024-11-14 16:06:56 +02:00
Ben Ye f9da21ec0b
Fix store debug matchers panic on regex matcher (#7903)
* fix store debug matchers panic on regex

Signed-off-by: Ben Ye <benye@amazon.com>

* add test

Signed-off-by: Ben Ye <benye@amazon.com>

* changelog

Signed-off-by: Ben Ye <benye@amazon.com>

---------

Signed-off-by: Ben Ye <benye@amazon.com>
2024-11-13 15:49:48 +02:00
Simon Pasquier bfbabbb89a
Fix ExternalLabels() for Prometheus v3.0 (#7893)
Prometheus v3.0.0-rc.0 introduces a new scrape protocol
(`PrometheusText1.0.0`) which is present by default in the global
configuration. It breaks the Thanos sidecar when it wants to retrieve
the external labels.

This change replaces the use of the Prometheus `GlobalConfig` struct by
a minimal struct which unmarshals only the `external_labels` key.

See also https://github.com/prometheus-operator/prometheus-operator/issues/7078

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2024-11-08 09:57:04 +00:00
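The pattern from the fix above — unmarshal only the one key you need instead of reusing the full upstream config struct — can be sketched like this. The real fix parses YAML; `encoding/json` is used here only to keep the sketch dependency-free, and the struct names are illustrative.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// minimalGlobal deliberately declares only external_labels, so unrelated new
// fields (like Prometheus v3's scrape protocols) cannot break parsing.
type minimalGlobal struct {
	ExternalLabels map[string]string `json:"external_labels"`
}

func externalLabels(raw []byte) (map[string]string, error) {
	var cfg struct {
		Global minimalGlobal `json:"global"`
	}
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	return cfg.Global.ExternalLabels, nil
}

func main() {
	raw := []byte(`{"global":{"scrape_protocols":["PrometheusText1.0.0"],"external_labels":{"cluster":"eu1"}}}`)
	lbls, err := externalLabels(raw)
	fmt.Println(lbls, err)
}
```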
Giedrius Statkevičius 928bc7aafb
*: bump Go version (#7891)
Use 1.23.3 as it contains a critical fix: https://github.com/golang/go/issues/70001

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-11-07 16:16:23 +02:00
Filip Petkovski 79593cb834
Merge pull request #7885 from fpetkovski/close-loser-tree
Fix bug in Bucket Series
2024-11-07 12:00:47 +01:00
Filip Petkovski 065c3beff2
Merge branch 'main' into close-loser-tree
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-11-07 11:47:58 +01:00
Giedrius Statkevičius ab43b2b20c
compact: add SyncMetas() timeout (#7887)
Add wait_interval*3 timeout to SyncMetas(). We had an incident in
production where object storage had problems and the syncer got
stuck because there was no timeout. The timeout value is arbitrary; it
exists only so that the sync cannot hang forever.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-11-06 16:26:54 +02:00
Michael Hoffmann 761487ccf5
Sidecar: use prometheus metrics for min timestamp (#7820)
Read "minT" from prometheus metrics so that we also set it for sidecars
that are not uploading blocks.

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
2024-11-06 15:00:35 +02:00
Giedrius Statkevičius df3df36986
discovery: preserve results from other resolve calls (#7886)
Properly preserve results from other resolve calls. There is an
assumption that resolve() is always called with the same addresses but
that is not true with gRPC and `--endpoint-group`. Without this fix,
multiple resolves could happen at the same time but some of the callers
will not be able to retrieve the results, leading to random errors.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-11-06 09:46:21 +02:00
Michał Mazur ebfc03e5fc
query-frontend: Fix cache keys for dynamic split intervals (#7832) 2024-11-05 13:57:03 -08:00
Filip Petkovski 4550964eb5
Close loser tree outside of span
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-11-05 19:56:44 +01:00
Filip Petkovski 62eb843f94
Add CHANGELOG entry
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-11-05 13:38:02 +01:00
Filip Petkovski 77bd9c0cbb
Fix bug in Bucket Series
Applies the fix described in https://github.com/thanos-io/thanos/issues/7883.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-11-05 13:36:42 +01:00
Filip Petkovski 9bc3cc0d05
Merge pull request #7854 from pedro-stanaka/feat/qfe-force-stats-collection
QFE: new middleware to force query statistics collection
2024-11-05 09:31:46 +01:00
Ben Ye d6d19c568f
upgrade Prometheus to fix round function (#7877)
Signed-off-by: Ben Ye <benye@amazon.com>
2024-11-04 13:36:26 -08:00
Pedro Tanaka 3d47cdac9e
fixing one more unit test
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-11-04 16:21:23 +01:00
Pedro Tanaka 457b861228
Bind to existing stats tag
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-11-04 15:12:55 +01:00
Pedro Tanaka d16b0985da
CR comments
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-11-04 15:12:55 +01:00
Pedro Tanaka cb922bb2d8
adjust docs
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-11-04 15:12:55 +01:00
Pedro Tanaka e08d5bcad1
Adding CHANGELOG
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-10-31 17:56:37 +01:00
Pedro Tanaka 11a17086c0
Using context propagation to add sample information
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-10-31 17:48:21 +01:00
Pedro Tanaka e2f6ca34f9
Update stats protobuf
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-10-31 17:48:21 +01:00
Pedro Tanaka 9d2e5e0b69
QFE: Create new stats middleware to force query statistics collection
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-10-31 17:48:21 +01:00
pureiboi 62038110b1
allow user to specify tls version for backward compatibility (#7654)
* optional tls version logic

Signed-off-by: pureiboi <17396188+pureiboi@users.noreply.github.com>

* update cmd description and match doc

Signed-off-by: pureiboi <17396188+pureiboi@users.noreply.github.com>

* feat: update doc with make docs

Signed-off-by: pureiboi <17396188+pureiboi@users.noreply.github.com>

* fix indentation by linter

Signed-off-by: pureiboi <17396188+pureiboi@users.noreply.github.com>

---------

Signed-off-by: pureiboi <17396188+pureiboi@users.noreply.github.com>
2024-10-29 14:58:30 +02:00
Giedrius Statkevičius 19dc4b9478
Cut down test times (#7861)
Refactor so that leak detection is happening in TestMain;
Use t.Parallel() everywhere;
Reduce series/samples count in some tests that reuse the same functions
we use for benchmarking, i.e. leave higher loads for benchmarks.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-10-29 10:08:01 +02:00
Ben Kochie 5749c8c332
Improve replica flag handling (#7855)
Add a string utility parsing function to improve the handling of replica
label flags. This allows for easier handling of flags when multiple
replica labels are needed.
* Split flag parts that are comma separated.
* Remove any empty strings.
* Sort and deduplicate the slice.

For example in the case of multiple replica labels like:
`--query.replica-label=prometheus_replica,thanos_rule_replica`

Signed-off-by: SuperQ <superq@gmail.com>
2024-10-29 07:56:24 +00:00
Filip Petkovski a31af1da03
Merge pull request #7859 from logzio/logzio-logo
Add Logz.io to adopters
2024-10-24 15:13:43 +02:00
Michał Mazur 05724b9e93 Add Logz.io to adopters
Signed-off-by: Michał Mazur <mmazur.box@gmail.com>
2024-10-24 13:38:18 +02:00
Giedrius Statkevičius c10b695a5b
receive/multitsdb: defer unlock properly (#7857)
Do not forget to unlock.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-10-23 23:56:07 +03:00
Yu Long e5bb3a490e
UI: Select time range with mouse drag feature (#7853)
* UI: Select time range with mouse drag

Signed-off-by: Yu Long <yu.long@databricks.com>

* QueryFrontend: pass "stats" parameter forward (#7852)

If a querier sees a "stats" parameter in the query request, it will attach important information about the query execution to the response.
But currently, even if a user sets this value, the Query Frontend will lose it in its middleware/roundtrippers.

This PR fixes this problem by properly encoding/decoding the requests in QFE.

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Yu Long <yu.long@databricks.com>

* build(deps): bump go.opentelemetry.io/otel/bridge/opentracing (#7851)

Bumps [go.opentelemetry.io/otel/bridge/opentracing](https://github.com/open-telemetry/opentelemetry-go) from 1.29.0 to 1.31.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.29.0...v1.31.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/bridge/opentracing
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Yu Long <yu.long@databricks.com>

* Update CHANGELOG

Signed-off-by: Yu Long <yu.long@databricks.com>

* Apply fix to linter error (from orig prom PR)

Signed-off-by: Yu Long <yu.long@databricks.com>

* Fix not-null assertion bug from orig PR

Signed-off-by: Yu Long <yu.long@databricks.com>

* Commit generated files

Signed-off-by: Yu Long <yu.long@databricks.com>

* Fix unit test

Signed-off-by: Yu Long <yu.long@databricks.com>

---------

Signed-off-by: Yu Long <yu.long@databricks.com>
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Yu Long <yu.long@databricks.com>
Co-authored-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-23 11:44:35 +05:30
dependabot[bot] ea89306d0d
build(deps): bump go.opentelemetry.io/otel/bridge/opentracing (#7851)
Bumps [go.opentelemetry.io/otel/bridge/opentracing](https://github.com/open-telemetry/opentelemetry-go) from 1.29.0 to 1.31.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.29.0...v1.31.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/bridge/opentracing
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-22 09:09:57 +05:30
Pedro Tanaka 1bdcc655d8
QueryFrontend: pass "stats" parameter forward (#7852)
If a querier sees a "stats" parameter in the query request, it will attach important information about the query execution to the response.
But currently, even if a user sets this value, the Query Frontend will lose it in its middleware/roundtrippers.

This PR fixes this problem by properly encoding/decoding the requests in QFE.

Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-10-22 09:09:43 +05:30
Filip Petkovski 7d95913c50
Merge pull request #7843 from pedro-stanaka/fix/qfe-only-log-slow-queries
QFE: only log slow queries if it is a query endpoint
2024-10-21 10:44:46 +02:00
Pedro Tanaka 6ab96c3702
QFE: only log slow queries if it is a query endpoint
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-10-21 09:28:46 +02:00
Filip Petkovski 731e4607d3
Merge pull request #7838 from fpetkovski/optimize-validate-labels
Optimize validateLabels
2024-10-17 14:00:53 +02:00
Filip Petkovski 16130149ec
Fix lint
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-10-17 13:19:21 +02:00
Filip Petkovski 37d8e07226
Optimize validateLabels
Validating labels in the capnproto writer seems to use a notable
amount of CPU, mostly because it needlessly allocates bytes for each
label validation.

This commit optimizes that function to have zero allocs.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-10-17 13:13:08 +02:00
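The zero-alloc idea above can be illustrated as follows: validate the raw bytes directly instead of converting them to a string first (a `string([]byte)` conversion allocates on every call). The rules here are the simplified Prometheus label-name shape, not the exact Thanos code.

```go
package main

import "fmt"

// validLabelName checks [a-zA-Z_][a-zA-Z0-9_]* over raw bytes, so the hot
// path performs no allocation at all.
func validLabelName(b []byte) bool {
	if len(b) == 0 {
		return false
	}
	for i, c := range b {
		switch {
		case c == '_', 'a' <= c && c <= 'z', 'A' <= c && c <= 'Z':
		case '0' <= c && c <= '9' && i > 0:
		default:
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(validLabelName([]byte("__name__")), validLabelName([]byte("0bad")))
}
```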
Taras Didukh da6344a0ba
Merge branch 'main' into query_frontend_tls_redis_fix
Signed-off-by: Taras Didukh <didukh86@gmail.com>
2024-10-17 14:03:41 +03:00
Filip Petkovski 65b664c500
Implement capnproto replication (#7659)
* Implement capnproto replication

Our profiles from production show that a lot of CPU and memory in receivers
is used for unmarshaling protobuf messages. Although it is not possible to change
the remote-write format, we have the freedom to change the protocol used
for replicating timeseries data.

This commit introduces a new feature in receivers where replication can be done
using Cap'n Proto instead of gRPC + Protobuf. The advantage of the former protocol
is that deserialization is far cheaper and fields can be accessed directly from
the received message (byte slice) without allocating intermediate objects.
There is an additional cost for serialization because we have to convert from
Protobuf to the Cap'n proto format, but in our setup this still results in a net
reduction in resource usage.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Pass logger

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Update capnp

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Modify flag

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Lint

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix spellcheck

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Use previous version

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Update docker base

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Bump go

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Update docs/components/receive.md

Co-authored-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Validate labels

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* e2e: add receive test with capnp replication

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* receive: make copy only when necessary

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Fix failing test

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Add CHANGELOG entry

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Add capnproto Make target

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Replace panics with errors

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix benchmark

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix CHANGELOG

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

---------

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Co-authored-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Co-authored-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-10-17 09:45:01 +00:00
Vasiliy Rumyantsev 274f95e74f
store: label_values: fetch less postings (#7814)
* label_values: fetch less postings

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* CHANGELOG.md

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* added acceptance test

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* removed redundant comment

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* check if matcher is EQ matcher

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

* Update CHANGELOG.md

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>

---------

Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>
2024-10-17 07:49:26 +00:00
Giedrius Statkevičius fe51bd66c3
rule: add concurrent evals functionality (#7835)
Expose the new concurrent evaluation functionality from Ruler.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-10-16 15:38:46 +03:00
Filip Petkovski 6a3704c12b
Merge pull request #7834 from ntk148v/main
docs: add Thanos store memcached deployment note
2024-10-16 11:41:22 +02:00
Kien Nguyen Tuan a9061f3d84 docs: add Thanos store memcached deployment note
Add a note on Memcached deployment in Kubernetes, similar to Cortex [1].

[1] https://cortexmetrics.io/docs/blocks-storage/store-gateway/#memcached-index-cache

Signed-off-by: Kien Nguyen Tuan <kiennt98@fpt.com>
2024-10-16 15:50:01 +07:00
Filip Petkovski f33a44879c
Merge pull request #7833 from fpetkovski/update-go-1.23
Update go to 1.23 in the CI
2024-10-16 09:48:38 +02:00
Filip Petkovski 9045370ca9
Update go to 1.23 in the CI
This commit updates the go version to 1.23 in the CI, including
unit, e2e tests and promu crossbuild.

It also bumps bingo dependencies where needed.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-10-16 09:14:58 +02:00
Filip Petkovski 9c925bfae1
Fix coroutine leak (#7821)
* Fix coroutine leak

The in-process client uses a pull-based iterator which needs
to be closed; otherwise it will leak the underlying coroutine.
When this happens, the tsdb reader will remain open which blocks head
compaction indefinitely.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix race condition

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix CHANGELOG

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Improve tests

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix blockSeriesClient

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix unit test

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix another unit test

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

---------

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-10-15 08:39:02 +00:00
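The leak described above is the classic pattern of a producer blocked forever when the consumer stops pulling without closing. A minimal goroutine-based sketch (the real code uses coroutine-backed iterators, and the types here are illustrative):

```go
package main

import "fmt"

// seriesIter is a pull-based iterator backed by a goroutine. If the consumer
// stops early without calling Close, the producer blocks on the send forever.
type seriesIter struct {
	ch   chan int
	stop chan struct{}
}

func newSeriesIter(n int) *seriesIter {
	it := &seriesIter{ch: make(chan int), stop: make(chan struct{})}
	go func() {
		defer close(it.ch)
		for i := 0; i < n; i++ {
			select {
			case it.ch <- i:
			case <-it.stop:
				return // Close() was called; exit instead of leaking
			}
		}
	}()
	return it
}

func (it *seriesIter) Next() (int, bool) { v, ok := <-it.ch; return v, ok }
func (it *seriesIter) Close()            { close(it.stop) }

func main() {
	it := newSeriesIter(100)
	defer it.Close() // without this, the producer goroutine leaks
	v, _ := it.Next()
	fmt.Println(v)
}
```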
dependabot[bot] 175baf13de
build(deps): bump golang.org/x/time from 0.6.0 to 0.7.0 (#7826)
Bumps [golang.org/x/time](https://github.com/golang/time) from 0.6.0 to 0.7.0.
- [Commits](https://github.com/golang/time/compare/v0.6.0...v0.7.0)

---
updated-dependencies:
- dependency-name: golang.org/x/time
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 10:18:06 +03:00
dependabot[bot] 8b40ed075d
build(deps): bump github/codeql-action from 3.26.10 to 3.26.13 (#7822)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.10 to 3.26.13.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](e2b3eafc8d...f779452ac5)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 10:04:21 +03:00
dependabot[bot] a9daf63350
build(deps): bump go.opentelemetry.io/otel/trace from 1.29.0 to 1.31.0 (#7825)
Bumps [go.opentelemetry.io/otel/trace](https://github.com/open-telemetry/opentelemetry-go) from 1.29.0 to 1.31.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.29.0...v1.31.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/trace
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 10:03:30 +03:00
dependabot[bot] 1de5a87fb2
build(deps): bump github.com/prometheus/common from 0.59.1 to 0.60.0 (#7824)
Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.59.1 to 0.60.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Changelog](https://github.com/prometheus/common/blob/main/RELEASE.md)
- [Commits](https://github.com/prometheus/common/compare/v0.59.1...v0.60.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 10:03:02 +03:00
dependabot[bot] 8bea523c51
build(deps): bump google.golang.org/protobuf from 1.34.2 to 1.35.1 (#7827)
Bumps google.golang.org/protobuf from 1.34.2 to 1.35.1.

---
updated-dependencies:
- dependency-name: google.golang.org/protobuf
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-15 10:02:27 +03:00
Filip Petkovski 2f39d248d8
Merge pull request #7815 from fpetkovski/disable-chunk-trimming
Disable chunk trimming in Receivers
2024-10-14 12:48:23 +02:00
Filip Petkovski a79c710fc5
Fix docs
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-10-14 12:11:41 +02:00
Filip Petkovski 328385ac14 Extend godoc
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-10-14 11:48:11 +02:00
Filip Petkovski bdbcd9d39a Disable chunk trimming in Receivers
When trimming is not disabled, receivers end up re-encoding all chunks
in order to drop samples that are outside the range.
This ends up being very expensive and causes ingestion problems during high
query load.

This commit disables trimming which should reduce CPU usage in receivers.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-10-14 11:48:11 +02:00
Giedrius Statkevičius d215f5b349
api: use jsoniter (#7816) 2024-10-11 10:38:38 -07:00
Giedrius Statkevičius af0900bfd2
*: bump deps + enable compaction randomization (#7813)
* *: bump deps + enable compaction randomization

Bump go.mod dependencies of prometheus and thanos promql-engine. Enable
randomized compaction start to help with reducing latency with multiple
TSDBs.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* CHANGELOG: add item

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* *: fix CI

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

---------

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-10-11 14:15:32 +03:00
Filip Petkovski 6623a3c02f
Use iterators for in-process Series calls (#7796)
* Use iterators for in-process Series calls

The TSDBStore has two implementations of Series. One uses a goroutine
and the other one buffers series in memory. Both are used for different
use cases and trade off CPU and memory accordingly.

In order to reconcile these two approaches, we can use an iterator
which relies on coroutines that have a much lower overhead than goroutines.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Update golangci-lint

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Fix lint

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

---------

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-10-10 10:48:32 +02:00
Walther Lee 55d0e09fa7
Query: Skip formatting strings if debug logging is disabled (#7678)
* skip formatting debug str if debug logging is disabled

Signed-off-by: Walther Lee <walthere.lee@gmail.com>

* make static strings const

Signed-off-by: Walther Lee <walthere.lee@gmail.com>

---------

Signed-off-by: Walther Lee <walthere.lee@gmail.com>
2024-10-10 07:54:39 +05:30
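The optimization above amounts to guarding the format call behind the level check, so `fmt.Sprintf` (and its allocations) is skipped entirely when debug logging is off. A minimal sketch with a hypothetical `debugEnabled` flag standing in for the logger's level check:

```go
package main

import "fmt"

var debugEnabled = false

// debugf formats only when debug logging is on; the fast path pays nothing.
func debugf(format string, args ...any) bool {
	if !debugEnabled {
		return false // no Sprintf cost at all
	}
	_ = fmt.Sprintf(format, args...)
	return true
}

func main() {
	fmt.Println(debugf("series=%v shard=%d", []string{"a"}, 3))
}
```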
Filip Petkovski f265c3b062
Merge pull request #7787 from niaurys/add_cuckoo_filter_on_metric_names
receive/multitsdb: add cuckoo filter on metric names
2024-10-02 08:07:23 +02:00
Filip Petkovski f8af674646
Disable dedup proxy in multi-tsdb (#7793)
The receiver manages independent TSDBs which do not have duplicated series.
For this reason it should be safe to disable deduplication of chunks and
reduce CPU usage for this path.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-10-01 21:00:58 +03:00
Mindaugas Niaura 701d312a1b rebase on main
Signed-off-by: Mindaugas Niaura <mindaugas.niaura@vinted.com>
2024-10-01 16:45:42 +03:00
Mindaugas Niaura a659d10d3c address PR comments, use options in tsbd initializations
Signed-off-by: Mindaugas Niaura <mindaugas.niaura@vinted.com>
2024-10-01 16:39:09 +03:00
Mindaugas Niaura d2b6089d58 fix TSDB pruning
Signed-off-by: Mindaugas Niaura <mindaugas.niaura@vinted.com>
2024-10-01 16:38:49 +03:00
Mindaugas Niaura f3493a1f92 use matchers in store filter
Signed-off-by: Mindaugas Niaura <mindaugas.niaura@vinted.com>
2024-10-01 16:38:47 +03:00
Mindaugas Niaura 068c92cf3b avoid copy in CuckooFilterMetricNameFilter
Signed-off-by: Mindaugas Niaura <mindaugas.niaura@vinted.com>
2024-10-01 16:38:17 +03:00
Mindaugas Niaura a4298bb850 add test cases for testFilter
Signed-off-by: Mindaugas Niaura <mindaugas.niaura@vinted.com>
2024-10-01 16:38:13 +03:00
Mindaugas Niaura bc3a1828fa add enable-feature flag to Receiver docs, fix newEndpointRef typo
Signed-off-by: Mindaugas Niaura <mindaugas.niaura@vinted.com>
2024-10-01 16:37:39 +03:00
Mindaugas Niaura 1cd8d90d4f receive/multitsdb: add cuckoo filter on metric names
Signed-off-by: Mindaugas Niaura <mindaugas.niaura@vinted.com>
2024-10-01 16:37:34 +03:00
dependabot[bot] b31a6376bd
build(deps): bump github/codeql-action from 3.26.6 to 3.26.10 (#7789)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.6 to 3.26.10.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](4dd16135b6...e2b3eafc8d)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-01 16:36:23 +03:00
dependabot[bot] cec6c6f643
build(deps): bump github.com/redis/rueidis from 1.0.14-go1.18 to 1.0.47 (#7792)
Bumps [github.com/redis/rueidis](https://github.com/redis/rueidis) from 1.0.14-go1.18 to 1.0.47.
- [Release notes](https://github.com/redis/rueidis/releases)
- [Commits](https://github.com/redis/rueidis/compare/v1.0.14-go1.18...v1.0.47)

---
updated-dependencies:
- dependency-name: github.com/redis/rueidis
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-10-01 16:36:00 +03:00
Giedrius Statkevičius bcd20a3b88
*: switch back to gogoproto, rm stringlabels (#7790)
* Revert "store: add chunk pooling (#7771)"

This reverts commit a2113fd81c.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Revert "query/store: memoize PromLabels() call (#7767)"

This reverts commit 735db72a4b.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Revert "store: compare labels directly (#7766)"

This reverts commit 30f453edd8.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Revert "store: don't create intermediate labels (#7762)"

This reverts commit 8cd3fae938.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Revert "*: build with stringlabels (#7745)"

This reverts commit 883fade9bd.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Revert "*: enable gRPC pooling (#7742)"

This reverts commit ca8ab90266.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Revert "*: switch to vtprotobuf (#7721)"

This reverts commit a8e7109d50.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Revert "*: removing gogoproto extensions (#7718)"

This reverts commit 97710f41b0.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Revert "*: rm ZLabels (#7675)"

This reverts commit 8c8a88e2f9.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

---------

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-10-01 16:08:34 +03:00
Giedrius Statkevičius 4b012f9a59
store: reuse chunks map (#7783)
Reuse chunks map instead of creating a new one each time. This is a hot
path and shows up in profiles.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-10-01 12:01:28 +03:00
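Reusing a map on a hot path, as described above, can be sketched like this (the key/value types here are illustrative; since Go 1.21 the builtin `clear` empties a map without discarding its buckets):

```go
package main

import "fmt"

// resetChunksMap returns an emptied, reusable map instead of allocating a
// new one on every call.
func resetChunksMap(m map[uint64][]byte) map[uint64][]byte {
	if m == nil {
		return make(map[uint64][]byte, 64)
	}
	clear(m)
	return m
}

func main() {
	m := map[uint64][]byte{1: {0x1}}
	m = resetChunksMap(m)
	fmt.Println(len(m))
}
```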
Taras Didukh 81350e8a0e
Merge branch 'main' into query_frontend_tls_redis_fix 2024-09-27 15:53:30 +03:00
Filip Petkovski 6ff5e1b3a2
Merge pull request #7785 from xnet-mobile/main
Add XNET to Adopters
2024-09-27 08:10:12 +02:00
Filip Petkovski 80179ce4f1
Merge pull request #7763 from pedro-stanaka/feat/http-wrappers-native-histo
ruler: use native histograms for client metrics
2024-09-27 07:28:24 +02:00
xnet-mobile 8ac0b4de7e
Update adopters.yml
Signed-off-by: xnet-mobile <105046137+xnet-mobile@users.noreply.github.com>
2024-09-27 03:32:56 +01:00
xnet-mobile 972fd1f9f2
Add files via upload
Signed-off-by: xnet-mobile <105046137+xnet-mobile@users.noreply.github.com>
2024-09-27 03:31:59 +01:00
Giedrius Statkevičius 90215ad135
receive: memoize exemplar/TSDB clients (#7782)
We call this on each Series() so memoize the creation of this slice.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-09-25 14:23:54 +03:00
Pedro Tanaka 303b1f1ab5
Merge branch 'main' into feat/http-wrappers-native-histo 2024-09-24 08:15:54 +02:00
Filip Petkovski 585899439b
Merge pull request #7764 from dongjiang1989/support-gomemlimit
chore: add GOMEMLIMIT in runtimeinfo api
2024-09-23 12:38:18 +02:00
dongjiang c797ec5636
Merge branch 'main' into support-gomemlimit 2024-09-23 16:00:47 +08:00
Giedrius Statkevičius a2113fd81c
store: add chunk pooling (#7771)
Pool byte slices inside of aggrchunks.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-09-23 10:39:41 +03:00
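Pooling byte slices as described above is typically done with `sync.Pool`. A minimal sketch (buffer size and helper names are illustrative; a `*[]byte` is stored to avoid boxing the slice header on every Put):

```go
package main

import (
	"fmt"
	"sync"
)

var chunkBufPool = sync.Pool{
	New: func() any {
		b := make([]byte, 0, 1024)
		return &b
	},
}

func getChunkBuf() *[]byte {
	b := chunkBufPool.Get().(*[]byte)
	*b = (*b)[:0] // reset length, keep capacity
	return b
}

func putChunkBuf(b *[]byte) { chunkBufPool.Put(b) }

func main() {
	b := getChunkBuf()
	*b = append(*b, 0xde, 0xad)
	fmt.Println(len(*b))
	putChunkBuf(b)
}
```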
dongjiang 5bc638b392
Merge branch 'main' into support-gomemlimit 2024-09-22 18:19:49 +08:00
Kemal Akkoyun e69bf72337
Update affiliation of kakkoyun (#7773)
Signed-off-by: Kemal Akkoyun <kakkoyun@users.noreply.github.com>
2024-09-22 11:37:03 +02:00
Giedrius Statkevičius 103ef36e4a
e2e/e2ethanos: fix avalanche version (#7772)
`main` has some breakage, so use an older version for now.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-09-21 20:43:20 +03:00
dongjiang b93ec5369d
Merge branch 'main' into support-gomemlimit 2024-09-20 21:47:41 +08:00
Filip Petkovski 832d17ae64
Merge pull request #7768 from pedro-stanaka/banner-remove-thanoscon
Website: remove ThanosCon banner
2024-09-20 13:07:32 +02:00
Giedrius Statkevičius 735db72a4b
query/store: memoize PromLabels() call (#7767)
We use the stringlabels call so some allocations are inevitable but we
can be much smarter about it:

```
func (s *storeSeriesSet) At() (labels.Labels, []*storepb.AggrChunk) {
	return s.series[s.i].PromLabels(), s.series[s.i].Chunks <--- not memoized, new alloc on every At() call; need to memoize because of stringlabel. One alloc is inevitable.
}
```

```
lset, chks := s.SeriesSet.At()
if s.peek == nil {
	s.peek = &Series{Labels: labelpb.PromLabelsToLabelpbLabels(lset), Chunks: chks} <-- converting back to labelpb ?
	continue
}
```

```
if labels.Compare(lset, s.peek.PromLabels()) != 0 { <--- PromLabels() called; we can avoid this call
	s.lset, s.chunks = s.peek.PromLabels(), s.peek.Chunks <- PromLabels() called; we can avoid this
	s.peek = &Series{Labels: labelpb.PromLabelsToLabelpbLabels(lset), Chunks: chks} <--- converting back to labelpb; we can avoid this
	return true
}
```

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-09-20 13:00:02 +03:00
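The memoization above can be sketched as follows: compute the converted labels once per series and cache them, instead of re-allocating on every `At()` call. The types here are simplified stand-ins for the protobuf series and stringlabels conversion.

```go
package main

import "fmt"

type series struct {
	raw    []string // stand-in for the protobuf label representation
	labels []string // memoized converted form
}

// PromLabels converts at most once; repeated calls return the cached slice.
func (s *series) PromLabels() []string {
	if s.labels == nil {
		s.labels = append([]string(nil), s.raw...) // the one inevitable alloc
	}
	return s.labels
}

func main() {
	s := &series{raw: []string{"job=prom"}}
	a, b := s.PromLabels(), s.PromLabels()
	fmt.Println(&a[0] == &b[0]) // same backing array: converted only once
}
```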
Pedro Tanaka 439c12f791
Website: remove ThanosCon banner
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-09-20 11:43:45 +02:00
Giedrius Statkevičius 30f453edd8
store: compare labels directly (#7766)
Do not create intermediate prometheus labels and compare the labels
directly.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-09-20 11:56:07 +03:00
Pedro Tanaka 5d36a5af10
Adding Pedro Tanaka as Triager (#7765)
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-09-20 09:04:43 +01:00
dongjiang1989 87d71ffe05
chore: add GOMEMLIMIT
Signed-off-by: dongjiang1989 <dongjiang1989@126.com>
2024-09-20 11:34:23 +08:00
Pedro Tanaka 91a20eb980
adding changelog
Signed-off-by: Pedro Tanaka <pedro.stanaka@gmail.com>
2024-09-20 01:37:28 +02:00
Pedro Tanaka 4a46856cba
ruler: use native histograms for client metrics
Signed-off-by: Pedro Tanaka <pedro.stanaka@gmail.com>
2024-09-20 01:33:51 +02:00
Giedrius Statkevičius 8cd3fae938
store: don't create intermediate labels (#7762)
Just compare labelpb.Label directly instead of creating promlabels from
them.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-09-19 17:33:12 +03:00
Taras Didukh b95510618c
Merge branch 'main' into query_frontend_tls_redis_fix 2024-09-19 16:18:40 +03:00
Thibault Mange 1625665caf
Fix blog article img rendering for Life of a sample Part I (#7761)
* add img style attribute

Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>

* fix formatting

Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>

* fix link

Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>

* remove internal links

Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>

---------

Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>
2024-09-19 12:28:27 +01:00
Taras Didukh e4e645a40a
Merge branch 'main' into query_frontend_tls_redis_fix 2024-09-19 11:07:26 +03:00
Filip Petkovski 2bdb909af9
Merge pull request #7756 from fpetkovski/generalize-pool
Generalize the bucketed bytes pool
2024-09-18 15:54:43 +02:00
Thibault Mange 9dd7905a88
Blog article submission: Life of a Sample in Thanos Part I (#7748)
* part_1

Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>

* typo

Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>

* Update docs/blog/2023-11-20-life-of-a-sample-part-1.md

Co-authored-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>

* Update docs/blog/2023-11-20-life-of-a-sample-part-1.md

Co-authored-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>

* Update docs/blog/2023-11-20-life-of-a-sample-part-1.md

Co-authored-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>

* Update docs/blog/2023-11-20-life-of-a-sample-part-1.md

Co-authored-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>

* Update docs/blog/2023-11-20-life-of-a-sample-part-1.md

Co-authored-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>

* add sidecar, remove invalid links

Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>

---------

Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>
Co-authored-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-09-18 13:41:00 +01:00
Filip Petkovski 95c9bcfffb
Fix test lint
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-09-17 14:48:27 +02:00
Filip Petkovski 381cad5d76
Generalize the bucketed bytes pool
Now that we have generics, we can generalize the bucketed bytes pool
to be used with slices of any type T.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-09-17 14:43:18 +02:00
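The generalization described above can be sketched with a tiny generic bucketed pool: slices are pooled per capacity bucket, and `Get(n)` serves from the smallest bucket that fits. A hypothetical simplification of the idea, not the Thanos implementation:

```go
package main

import "fmt"

// Pool keeps released slices grouped by capacity bucket so they can be
// reused instead of reallocated. Works for slices of any element type T.
type Pool[T any] struct {
	buckets []int   // sorted bucket capacities
	free    [][][]T // free[i] holds released slices of capacity buckets[i]
}

func New[T any](buckets ...int) *Pool[T] {
	return &Pool[T]{buckets: buckets, free: make([][][]T, len(buckets))}
}

// Get returns a zero-length slice with capacity >= n, reusing a pooled
// slice from the smallest fitting bucket when one is available.
func (p *Pool[T]) Get(n int) []T {
	for i, c := range p.buckets {
		if n <= c {
			if l := len(p.free[i]); l > 0 {
				s := p.free[i][l-1]
				p.free[i] = p.free[i][:l-1]
				return s[:0]
			}
			return make([]T, 0, c)
		}
	}
	return make([]T, 0, n) // larger than any bucket: not pooled
}

// Put returns a slice to its bucket; slices that fit no bucket are dropped.
func (p *Pool[T]) Put(s []T) {
	for i, c := range p.buckets {
		if cap(s) == c {
			p.free[i] = append(p.free[i], s)
			return
		}
	}
}

func main() {
	p := New[byte](8, 64)
	b := p.Get(10) // rounded up to the 64-capacity bucket
	fmt.Println(cap(b))
	p.Put(b)
	b2 := p.Get(20) // reused from the pool
	fmt.Println(cap(b2))
}
```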
Michael Hoffmann 883fade9bd
*: build with stringlabels (#7745)
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
2024-09-13 18:52:22 +02:00
Giedrius Statkevičius ca8ab90266
*: enable gRPC pooling (#7742)
Use the new CodecV2 interface to enable pooling of gRPC
marshaling/unmarshaling buffers. Also, add missing includes to
scripts/genproto.sh so that we can enable the `pool` flag in the next
PR.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-09-13 12:34:41 +03:00
Michael Hoffmann 7bddb603e4
dep: bump objstore (#7741)
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
2024-09-13 11:06:11 +02:00
Tidhar Klein Orbach cbf4fb4b47
store: added a log error print when proxy limits are violated (#7683)
* store: added a log error print when proxy limits are violated

Signed-off-by: Tidhar Klein Orbach <tidhar.o@taboola.com>

* Update pkg/store/proxy.go

Co-authored-by: Pedro Tanaka <pedro.tanaka@shopify.com>
Signed-off-by: Tidhar Klein Orbach <tizki@users.noreply.github.com>

---------

Signed-off-by: Tidhar Klein Orbach <tidhar.o@taboola.com>
Signed-off-by: Tidhar Klein Orbach <tizki@users.noreply.github.com>
Co-authored-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-09-13 08:14:54 +02:00
Harry John f4af8aa9dd
util: Pass limit to MergeSlice (#7706) 2024-09-12 18:30:19 -07:00
Michael Hoffmann d28918916e
query: add partition labels flag (#7722)
* query: add partition labels flag

The distributed engine decides when to push down certain operations by
checking if the external labels are still present, i.e. we can push down
a binary operation if its vector matching includes all external labels.
This is great, but if you have multiple external labels that are
irrelevant for the partition, this becomes problematic: query authors
must be aware of those irrelevant labels and incorporate them into
their queries.
This PR attempts to solve that by giving an option to focus only on the
labels that are relevant for the partition.

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Update cmd/thanos/query.go

Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

---------

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-09-12 19:33:38 +02:00
Giedrius Statkevičius a8e7109d50
*: switch to vtprotobuf (#7721)
Finally, let's enable vtprotobuf!

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-09-12 16:39:18 +03:00
Giedrius Statkevičius 97710f41b0
*: removing gogoproto extensions (#7718)
Removed all gogoproto extensions and dealt with the changes. 2nd step in
removing gogoproto.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-09-10 15:17:06 +03:00
Michael Hoffmann 153607f4bf
receive: mark too-far-in-future flag as non-experimental (#7707)
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
2024-09-09 11:21:09 +02:00
Michael Hoffmann 27412d2868
*: get rid of store info api (#7704)
We have supported the Info gRPC API for 3 years now. We used to fall
back to Store API Info if we encountered an endpoint that did not
implement Info gRPC, but that should not happen anymore.

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
2024-09-06 18:19:41 +02:00
Giedrius Statkevičius 0966192a44
receive/multitsdb: remove double lock (#7700)
Do not double-lock here as in some situations it could lead to a
deadlock.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-09-05 11:17:32 +03:00
Giedrius Statkevičius 8c8a88e2f9
*: rm ZLabels (#7675)
* server/grpc: add pooling

Add pooling for grpc requests/responses.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* *: rm ZLabels and friends

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* *: fix tests

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* go.mod: revert changes

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

---------

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-09-05 07:46:34 +02:00
qiyang 09db52562d
Store: Fix panic caused by too small a buffer (#7658)
Co-authored-by: dominic.qi <dominic.qi@jaco.live>
Co-authored-by: Ben Ye <benye@amazon.com>
2024-09-04 10:39:45 -07:00
Taras Didukh 53250395e2
Remove empty lines
Signed-off-by: Taras Didukh <didukh86@gmail.com>
2024-09-04 20:02:01 +03:00
Taras Didukh 4c5baced3b
Merge branch 'main' into query_frontend_tls_redis_fix
Signed-off-by: Taras Didukh <didukh86@gmail.com>
2024-09-04 18:15:49 +03:00
dependabot[bot] 75f0328ebb Bump golang.org/x/time from 0.5.0 to 0.6.0 (#7601)
Bumps [golang.org/x/time](https://github.com/golang/time) from 0.5.0 to 0.6.0.
- [Commits](https://github.com/golang/time/compare/v0.5.0...v0.6.0)

---
updated-dependencies:
- dependency-name: golang.org/x/time
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Taras Didukh <taras.didukh@advancedmd.com>
2024-09-04 18:00:20 +03:00
dependabot[bot] 1c2d6ca60b build(deps): bump github/codeql-action from 3.26.2 to 3.26.5 (#7667)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.2 to 3.26.5.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](429e197704...2c779ab0d0)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Taras Didukh <taras.didukh@advancedmd.com>
2024-09-04 18:00:20 +03:00
Milind Dethe 93200ea437 website: max-height for version-picker dropdown (#7642)
Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
Signed-off-by: Taras Didukh <taras.didukh@advancedmd.com>
2024-09-04 18:00:20 +03:00
dependabot[bot] 65c3a0e2e7 build(deps): bump go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp (#7666)
Bumps [go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp](https://github.com/open-telemetry/opentelemetry-go) from 1.27.0 to 1.29.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.27.0...v1.29.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Taras Didukh <taras.didukh@advancedmd.com>
2024-09-04 18:00:07 +03:00
Michael Hoffmann a230123399 query: queryable is not respecting limits (#7679)
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
2024-09-04 18:00:07 +03:00
Taras Didukh c89dee19e5 Add record to the changelog
Signed-off-by: Taras Didukh <taras.didukh@advancedmd.com>
2024-09-04 17:58:47 +03:00
didukh86 52d84eb7c1 Query-frontend: Fix connection to Redis with TLS.
Issue: https://github.com/thanos-io/thanos/issues/7672

Signed-off-by: didukh86 <didukh86@gmail.com>
Signed-off-by: didukh86 <78904472+didukh86@users.noreply.github.com>
Signed-off-by: didukh86 <didukh86@gmail.com>
Signed-off-by: Taras Didukh <taras.didukh@advancedmd.com>
2024-09-04 17:58:47 +03:00
Saswata Mukherjee 9f2af3f78f
Fix CodeQL checks on main (#7698)
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-09-04 10:19:55 +01:00
dependabot[bot] d7f45e723f
build(deps): bump github/codeql-action from 3.26.5 to 3.26.6 (#7685)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.5 to 3.26.6.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](2c779ab0d0...4dd16135b6)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-04 09:30:57 +01:00
dependabot[bot] 26392d5515
build(deps): bump go.opentelemetry.io/contrib/propagators/autoprop (#7688)
Bumps [go.opentelemetry.io/contrib/propagators/autoprop](https://github.com/open-telemetry/opentelemetry-go-contrib) from 0.53.0 to 0.54.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go-contrib/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go-contrib/compare/zpages/v0.53.0...zpages/v0.54.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/contrib/propagators/autoprop
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-04 09:29:49 +01:00
dependabot[bot] f752793c65
build(deps): bump go.opentelemetry.io/otel/bridge/opentracing (#7689)
Bumps [go.opentelemetry.io/otel/bridge/opentracing](https://github.com/open-telemetry/opentelemetry-go) from 1.28.0 to 1.29.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.28.0...v1.29.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/bridge/opentracing
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-04 09:29:20 +01:00
Filip Petkovski 956fe47611
Merge pull request #7697 from thanos-io/dependabot/go_modules/github.com/prometheus/common-0.58.0
build(deps): bump github.com/prometheus/common from 0.55.0 to 0.58.0
2024-09-04 08:22:18 +02:00
Harry John 2c488dcf1f
store: Implement metadata API limit in stores (#7652)
* Store: Implement metadata API limit in stores

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Apply seriesLimit in nextBatch

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

---------

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2024-09-03 14:19:14 -07:00
dependabot[bot] 295d8a924c
build(deps): bump go.opentelemetry.io/contrib/samplers/jaegerremote (#7686)
Bumps [go.opentelemetry.io/contrib/samplers/jaegerremote](https://github.com/open-telemetry/opentelemetry-go-contrib) from 0.22.0 to 0.23.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go-contrib/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go-contrib/compare/v0.22.0...v0.23.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/contrib/samplers/jaegerremote
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-03 22:02:22 +01:00
dependabot[bot] 15b221baaf
build(deps): bump github.com/prometheus/common from 0.55.0 to 0.58.0
Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.55.0 to 0.58.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Changelog](https://github.com/prometheus/common/blob/main/RELEASE.md)
- [Commits](https://github.com/prometheus/common/compare/v0.55.0...v0.58.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-09-03 21:00:00 +00:00
dependabot[bot] a1fc99706a
build(deps): bump go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc (#7692)
Bumps [go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc](https://github.com/open-telemetry/opentelemetry-go) from 1.28.0 to 1.29.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.28.0...v1.29.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-03 21:59:50 +01:00
dependabot[bot] 113c416eaf
build(deps): bump github.com/felixge/fgprof from 0.9.4 to 0.9.5 (#7691)
Bumps [github.com/felixge/fgprof](https://github.com/felixge/fgprof) from 0.9.4 to 0.9.5.
- [Release notes](https://github.com/felixge/fgprof/releases)
- [Commits](https://github.com/felixge/fgprof/compare/v0.9.4...v0.9.5)

---
updated-dependencies:
- dependency-name: github.com/felixge/fgprof
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-03 21:58:23 +01:00
dependabot[bot] 727c3c9be1
build(deps): bump github.com/prometheus/client_golang (#7693)
Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.19.1 to 1.20.2.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.19.1...v1.20.2)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-03 21:58:04 +01:00
Mikhail Nozdrachev 74651777de
Receive: fix `thanos_receive_write_{timeseries,samples}` stats (#7643)
* Revert "Receive: fix stats (#7373)"

This reverts commit 66841fbb1e.

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>

* Receive: fix `thanos_receive_write_{timeseries,samples}` stats

There are two paths data can take into a receiver: the HTTP endpoint
and the gRPC endpoint, and `thanos_receive_write_{timeseries,samples}` only
count the number of timeseries/samples received through the HTTP
endpoint.

So, there is no risk that a sample will be counted twice, once as a
remote write and once as a local write. On the other hand, we still need
to account for the replication factor, and counting only local writes is
not enough as there might be no local writes at all (e.g. in RouterOnly
mode).

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>

---------

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
2024-09-03 14:17:08 +02:00
Mikhail Nozdrachev 1c5e7f1ab0
test: Fix flaky receive/multitsdb test (#7694)
There is a race condition in `TestMultiTSDBPrune` due to a dangling goroutine,
which can fail outside of the test function's lifetime if the database object
is closed before `Sync()` has finished.

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
2024-09-03 12:07:32 +02:00
dependabot[bot] dfeaf6e258
build(deps): bump github.com/onsi/gomega from 1.33.1 to 1.34.2 (#7681)
Bumps [github.com/onsi/gomega](https://github.com/onsi/gomega) from 1.33.1 to 1.34.2.
- [Release notes](https://github.com/onsi/gomega/releases)
- [Changelog](https://github.com/onsi/gomega/blob/master/CHANGELOG.md)
- [Commits](https://github.com/onsi/gomega/compare/v1.33.1...v1.34.2)

---
updated-dependencies:
- dependency-name: github.com/onsi/gomega
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-02 13:35:36 +03:00
dependabot[bot] acf423dac6
Bump golang.org/x/time from 0.5.0 to 0.6.0 (#7601)
Bumps [golang.org/x/time](https://github.com/golang/time) from 0.5.0 to 0.6.0.
- [Commits](https://github.com/golang/time/compare/v0.5.0...v0.6.0)

---
updated-dependencies:
- dependency-name: golang.org/x/time
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-02 09:23:41 +00:00
dependabot[bot] c200719861
build(deps): bump github/codeql-action from 3.26.2 to 3.26.5 (#7667)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.2 to 3.26.5.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](429e197704...2c779ab0d0)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-02 08:27:21 +00:00
Milind Dethe d1b8382eab
website: max-height for version-picker dropdown (#7642)
Signed-off-by: milinddethe15 <milinddethe15@gmail.com>
2024-09-02 09:11:54 +01:00
dependabot[bot] 3d03cb4885
build(deps): bump go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp (#7666)
Bumps [go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp](https://github.com/open-telemetry/opentelemetry-go) from 1.27.0 to 1.29.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.27.0...v1.29.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-02 09:09:59 +01:00
Michael Hoffmann 3270568f6b
query: queryable is not respecting limits (#7679)
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
2024-09-02 09:55:16 +03:00
Giedrius Statkevičius 8d3d34be70
receive: change quorum calculation for RF=2 (#7669)
As discussed during ThanosCon, I am updating the handling for RF=2 to
require only one successful write, because requiring all writes to
succeed all the time doesn't make sense and causes lots of confusion for
users. The only other alternative is to forbid RF=2, but I think we
shouldn't do that because people would be forced to add extra resources
when moving from a Sidecar-based setup.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2024-08-28 12:12:45 +03:00
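The change above amounts to special-casing the write quorum for RF=2. A hypothetical sketch of the arithmetic (the classic majority quorum is floor(rf/2)+1, which for rf=2 demands that both writes succeed; the relaxed rule needs only one) — illustrative only, not the actual Thanos code:

```go
package main

import "fmt"

// writeQuorum returns how many successful replica writes are required
// before a receive request is acknowledged. Classic majority quorum is
// floor(rf/2)+1; rf=2 is relaxed to 1 as described in the commit above.
// This function is a hypothetical illustration, not Thanos's code.
func writeQuorum(rf int) int {
	if rf == 2 {
		return 1
	}
	return rf/2 + 1
}

func main() {
	for _, rf := range []int{1, 2, 3, 5} {
		fmt.Printf("rf=%d -> quorum=%d\n", rf, writeQuorum(rf))
	}
}
```

With the majority rule alone, rf=2 would yield quorum 2, meaning any single replica failure fails the whole write — the confusion the commit removes.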
dependabot[bot] 5dc91d287c
Bump github.com/miekg/dns from 1.1.59 to 1.1.62 (#7651)
Bumps [github.com/miekg/dns](https://github.com/miekg/dns) from 1.1.59 to 1.1.62.
- [Changelog](https://github.com/miekg/dns/blob/master/Makefile.release)
- [Commits](https://github.com/miekg/dns/compare/v1.1.59...v1.1.62)

---
updated-dependencies:
- dependency-name: github.com/miekg/dns
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-28 11:23:30 +03:00
Mario Trangoni a82a121c1a
codespell: check `pkg` folder (#7655)
* pkg/store: Fix all spelling issues discovered by codespell.

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>

* pkg/block: Fix all spelling issues discovered by codespell.

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>

* pkg/query: Fix all spelling issues discovered by codespell.

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>

* pkg/rules: Fix all spelling issues discovered by codespell.

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>

* pkg: Fix all spelling issues discovered by codespell.

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>

* codespell: Adjust CI job to exclude some pkg exceptions

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>

---------

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2024-08-28 11:23:05 +03:00
Walther Lee 8af5139a73
implement memcachedClient.Set in internal/cortex (#7610)
Signed-off-by: Walther Lee <walthere.lee@gmail.com>
2024-08-27 11:44:30 +03:00
pureiboi ce52e9fda8
fix(ui): add null check to find overlapping blocks logic (#7644)
Signed-off-by: pureiboi <17396188+pureiboi@users.noreply.github.com>
2024-08-27 11:39:13 +03:00
Walther Lee d96661353d
Store: Fix LabelNames and LabelValues when using non-equal matchers (#7661)
* fix non-equal matchers in bucket FilterExtLabelsMatchers

Signed-off-by: Walther Lee <walthere.lee@gmail.com>

* add acceptance tests

Signed-off-by: Walther Lee <walthere.lee@gmail.com>

---------

Signed-off-by: Walther Lee <walthere.lee@gmail.com>
2024-08-24 07:00:36 +02:00
Ritesh Sonawane 6737c8dd2f
Added Scaling Prometheus with Thanos Blog from CloudRaft (#7653)
* Added Scaling Prometheus with Thanos Blog from CloudRaft

Signed-off-by: riteshsonawane1372 <riteshsonawane1372@gmail.com>

* signed commit

Signed-off-by: riteshsonawane1372 <riteshsonawane1372@gmail.com>

---------

Signed-off-by: riteshsonawane1372 <riteshsonawane1372@gmail.com>
2024-08-20 10:31:21 +01:00
Harshita Sao 9301004d55
fix: fixed the token-permission and pinned dependencies issue (#7649) 2024-08-19 09:10:47 -07:00
Filip Petkovski e197368a0c
Merge pull request #7646 from mjtrangoni/fix-spelling-issues
Fix spelling issues discovered by codespell
2024-08-19 16:03:49 +02:00
Mario Trangoni 1690d5b49b
codespell: Add GitHub actions job to the CI
Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2024-08-19 12:18:52 +02:00
Filip Petkovski c3e83fc4b4
Merge pull request #7650 from harshitasao/vulnerability-fix
vulnerability fix
2024-08-19 11:38:24 +02:00
harshitasao 4d43e436d1 vulnerability fix
Signed-off-by: harshitasao <harshitasao@gmail.com>
2024-08-18 18:50:51 +05:30
Filip Petkovski e62dbebe09
Merge pull request #7645 from fpetkovski/stringlabels
Add support for stringlabels in Thanos Query
2024-08-18 11:03:05 +02:00
Filip Petkovski b55845d084
Add CI step
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-08-16 17:39:45 +02:00
Mario Trangoni 0523e6eae9
Fix all spelling issues discovered by codespell.
Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2024-08-16 16:01:11 +02:00
Mario Trangoni bfa8beec78
mixin: Fix all spelling issues discovered by codespell.
Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2024-08-16 15:49:29 +02:00
Mario Trangoni 3cef1b6bc8
docs: Fix all spelling issues discovered by codespell.
Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2024-08-16 15:43:46 +02:00
Filip Petkovski e86e200155
Remove compatibility label
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-08-16 15:41:27 +02:00
Mario Trangoni af663bc696
tutorials: Fix all spelling issues discovered by codespell.
Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2024-08-16 15:25:15 +02:00
Mario Trangoni b42286198a
examples: Fix all spelling issues discovered by codespell.
Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2024-08-16 15:19:39 +02:00
Filip Petkovski 0bc02dd536
Use EmptyLabels()
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-08-16 14:57:34 +02:00
Filip Petkovski f7befd2339
Add support for stringlabels in Thanos Query
This commit finalizes support for the stringlabels build tag
so that we can build the binary.

I would assume that we will still get panics if we run a store
since the Series call still relies on casting one pointer type
to another. This will be fixed in a follow up PR.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-08-16 14:09:34 +02:00
Filip Petkovski 825b7c66ac
Merge pull request #7641 from mjtrangoni/fix-errcheck
golangci-lint: Update deprecated linter configurations
2024-08-15 11:13:56 +02:00
Mario Trangoni cc138f1840
golangci: Replace deprecated `run.deadline` with `run.timeout`.
Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2024-08-15 10:32:21 +02:00
Mario Trangoni 6612d56338
golangci: Replace deprecated `run.skip-dirs` with `issues.exclude-dirs`.
See,
level=warning msg="[config_reader] The configuration option `run.skip-dirs` is deprecated, please use `issues.exclude-dirs`."

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2024-08-15 10:29:31 +02:00
Mario Trangoni e36f574f06
golangci: Fix output format configuration
See,
level=warning msg="[config_reader] The configuration option `output.format` is deprecated, please use `output.formats`"

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2024-08-15 10:23:03 +02:00
Mario Trangoni 57a3acbba4
golangci: Fix errcheck configuration
See,
level=warning msg="[config_reader] The configuration option `linters.errcheck.exclude` is deprecated, please use `linters.errcheck.exclude-functions`."

Signed-off-by: Mario Trangoni <mjtrangoni@gmail.com>
2024-08-15 10:17:27 +02:00
Ben Ye 692a4a478f
Check context cancellation every 128 iterations (#7622) 2024-08-15 06:36:14 +01:00
Saswata Mukherjee 4e08c206cb
Merge release 0.36.1 to main (#7639)
* CHANGELOG: Mark 0.36 as in progress

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Cut release candidate v0.36.0-rc.0 (#7490)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Cut release candidate 0.36.0 rc.1 (#7510)

* *: fix server grpc histograms (#7493)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Close endpoints after the gRPC server has terminated (#7509)

Endpoints are currently closed as soon as we receive a SIGTERM or SIGINT.
This causes in-flight queries to get cancelled since outgoing connections
get closed instantly.

This commit moves the endpoints.Close call after the grpc server shutdown
to make sure connections are available as long as the server is running.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Cut release candidate v0.36.0-rc.1

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

---------

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Cut release v0.36.0 (#7578)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Cut patch release `v0.36.1` (#7636)

* Proxy: Query goroutine leak when `store.response-timeout` is set (#7618)

time.AfterFunc() returns a time.Timer object whose C field is nil,
according to the documentation. A goroutine blocks forever on reading
from a `nil` channel, leading to a goroutine leak on random slow
queries.

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>

* pkg/clientconfig: fix TLS configs with only CA (#7634)

065e3dd75a introduced a regression: TLS configurations for Thanos Ruler
query and alerting with only a CA file failed to load.

For instance, the following snippet is a valid query configuration:

```
- static_configs:
  - prometheus.example.com:9090
  scheme: https
  http_config:
    tls_config:
      ca_file: /etc/ssl/cert.pem
```

The test fixtures (CA, certificate and key files) are copied from
prometheus/common and are valid until 2072.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* Cut patch release v0.36.1

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix failing e2e test (#7620)

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
Co-authored-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Co-authored-by: Harry John <johrry@amazon.com>

---------

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
Co-authored-by: Michael Hoffmann <mhoffm@posteo.de>
Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Co-authored-by: Harry John <johrry@amazon.com>
2024-08-14 09:45:11 +01:00
Saswata Mukherjee 08b0993244
Fix changelog on main after 0.36 release (#7635)
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2024-08-13 10:52:13 +01:00
Simon Pasquier 4fd2d8a273
pkg/clientconfig: fix TLS configs with only CA (#7634)
065e3dd75a introduced a regression: TLS configurations for Thanos Ruler
query and alerting with only a CA file failed to load.

For instance, the following snippet is a valid query configuration:

```
- static_configs:
  - prometheus.example.com:9090
  scheme: https
  http_config:
    tls_config:
      ca_file: /etc/ssl/cert.pem
```

The test fixtures (CA, certificate and key files) are copied from
prometheus/common and are valid until 2072.

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2024-08-13 10:18:31 +01:00
Mikhail Nozdrachev 4050c73f8f
Proxy: Query goroutine leak when `store.response-timeout` is set (#7618)
time.AfterFunc() returns a time.Timer object whose C field is nil,
according to the documentation. A goroutine blocks forever on reading
from a `nil` channel, leading to a goroutine leak on random slow
queries.

Signed-off-by: Mikhail Nozdrachev <mikhail.nozdrachev@aiven.io>
2024-08-13 08:35:54 +01:00
Harry John 49617f4d16
Fix failing e2e test (#7620)
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2024-08-12 19:15:28 +01:00
Ben Ye 2375b59ee3
fix GetActiveAndPartialBlockIDs panic (#7621)
Signed-off-by: Ben Ye <benye@amazon.com>
2024-08-12 09:13:04 +01:00
Simon Pasquier c9500df77b
Update CHANGELOG.md after #7614 (#7619)
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2024-08-10 07:10:23 +01:00
Simon Pasquier dcadaae80f
*: fix debug log formatting (#7614)
Before the change:

```
... msg="maxprocs: No GOMAXPROCS change to reset%!(EXTRA []interface {}=[])
```

After this change:

```
... msg="maxprocs: No GOMAXPROCS change to reset"
```

Signed-off-by: Simon Pasquier <spasquie@redhat.com>
2024-08-09 09:29:59 +01:00
Harry John a3a7c3b75c
API: Add limit param in metadata APIs (#7609) 2024-08-08 12:59:59 -07:00
Ben Ye 73648360ff
Only increment ruler warning eval metric for non PromQL warnings (#7592) 2024-08-07 11:02:31 -07:00
Filip Petkovski 08af5d7b55
Merge pull request #7608 from ahurtaud/amadeuslogo
website: Update amadeus logo to latest
2024-08-07 18:50:39 +02:00
Alban Hurtaud e24b922593
Merge branch 'main' into amadeuslogo 2024-08-07 15:07:14 +02:00
Alban HURTAUD ca47024035 Update amadeus logo to latest
Signed-off-by: Alban HURTAUD <alban.hurtaud@amadeus.com>
2024-08-07 14:20:09 +02:00
Michael Hoffmann e155196618
Merge release 0.36 to main (#7588)
* CHANGELOG: Mark 0.36 as in progress

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Cut release candidate v0.36.0-rc.0 (#7490)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Cut release candidate 0.36.0 rc.1 (#7510)

* *: fix server grpc histograms (#7493)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

* Close endpoints after the gRPC server has terminated (#7509)

Endpoints are currently closed as soon as we receive a SIGTERM or SIGINT.
This causes in-flight queries to get cancelled since outgoing connections
get closed instantly.

This commit moves the endpoints.Close call after the grpc server shutdown
to make sure connections are available as long as the server is running.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Cut release candidate v0.36.0-rc.1

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

---------

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>

* Cut release v0.36.0 (#7578)

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>

---------

Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-08-02 14:55:26 +02:00
Michael Hoffmann a3d0aad67e
docs: add saswatas youtube introduction to blog (#7589)
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
2024-08-02 13:58:59 +02:00
Yuan-Kui Li 7c360d1930
Add Synology to adopters (#7581)
Signed-off-by: Yuan-Kui Li <yuankuili@synology.com>
2024-08-02 08:27:57 +01:00
Harry John f19b8c6161
Add @harry671003 to triagers (#7576)
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2024-07-31 14:30:35 +02:00
Michael Hoffmann bc42129651
discovery: use thanos resolver for endpoint groups (#7565)
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
2024-07-31 11:35:08 +02:00
Filip Petkovski 70b129888c
Merge pull request #7567 from thanos-io/metalmatze-maintainer-removal
Remove metalmatze from Thanos maintainers
2024-07-30 14:10:24 +02:00
Matthias Loibl e37cebc54a
Merge branch 'main' into metalmatze-maintainer-removal 2024-07-30 11:52:38 +01:00
Filip Petkovski 6b05aa4cc4
Merge pull request #7568 from SuperQ/thanos_engine_doc
Update Thanos PromQL Engine docs
2024-07-30 09:44:54 +02:00
SuperQ 4d2e84c101
Update Thanos PromQL Engine docs
Move the section on the distributed engine mode into the "Thanos PromQL
Engine" section since the new engine is required for distributed mode.
This also fixes an alignment issue which makes the distributed mode look
like it's part of the Tenancy section.

Also rename the section header to give it clearer "Thanos PromQL Engine"
branding.

Signed-off-by: SuperQ <superq@gmail.com>
2024-07-29 15:08:34 +02:00
Matthias Loibl e21dd45c3d
Remove metalmatze from Thanos maintainers
Thank you all for the last 5 years!

Signed-off-by: Matthias Loibl <mail@matthiasloibl.com>
2024-07-29 10:27:42 +02:00
dependabot[bot] 639bf8f216
Bump go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc (#7525)
Bumps [go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc](https://github.com/open-telemetry/opentelemetry-go) from 1.27.0 to 1.28.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.27.0...v1.28.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-23 17:51:14 +05:30
dependabot[bot] 971785e4d5
Bump golang.org/x/net from 0.26.0 to 0.27.0 (#7544)
Bumps [golang.org/x/net](https://github.com/golang/net) from 0.26.0 to 0.27.0.
- [Commits](https://github.com/golang/net/compare/v0.26.0...v0.27.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-23 17:48:38 +05:30
Jacob Baungård Hansen fb20d8ced6
api/rules: Add filtering on rule name/group/file (#7560)
This commit adds the option of filtering rules by rule name, rule
group, or file. This brings the rule API closer in line with the current
Prometheus API.

Signed-off-by: Jacob Baungard Hansen <jacobbaungard@redhat.com>
2024-07-23 09:18:30 +05:30
dependabot[bot] 990a60b726
Bump go.opentelemetry.io/otel/bridge/opentracing from 1.21.0 to 1.28.0 (#7528)
Bumps [go.opentelemetry.io/otel/bridge/opentracing](https://github.com/open-telemetry/opentelemetry-go) from 1.21.0 to 1.28.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.21.0...v1.28.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/bridge/opentracing
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-22 10:28:12 -07:00
dependabot[bot] 0da18ad763
Bump golang.org/x/crypto from 0.24.0 to 0.25.0 (#7545)
Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.24.0 to 0.25.0.
- [Commits](https://github.com/golang/crypto/compare/v0.24.0...v0.25.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-22 10:20:48 -07:00
Harry John 466b0beb97
Update prometheus and promql-engine dependencies (#7558)
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2024-07-22 10:14:22 -07:00
Nishant Bansal 5765d3c1c9
Fix issue #7550: Bug fix and complete test coverage for tools.go (#7552)
Signed-off-by: Nishant Bansal <nishant.bansal.mec21@iitbhu.ac.in>
2024-07-21 18:51:32 +05:30
Harry John f77eff80ab
Build with Go 1.22 (#7559)
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2024-07-19 19:57:45 -07:00
Thomas Hartland 35c0dbec85
compact: Update filtered blocks list before second downsample pass (#7492)
* compact: Update filtered blocks list before second downsample pass

If the second downsampling pass is given the same filteredMetas
list as the first pass, it will create duplicates of blocks
created in the first pass.

It will also not be able to do further downsampling, e.g. 5m->1h,
using blocks created in the first pass, as it will not be aware
of them.

The metadata was already being synced before the second pass,
but not updated into the filteredMetas list.

Signed-off-by: Thomas Hartland <thomas.hartland@diamond.ac.uk>

* Update changelog

Signed-off-by: Thomas Hartland <thomas.hartland@diamond.ac.uk>

* e2e/compact: Fix number of blocks cleaned assertion

The value was increased in 2ed48f7 to fix the test,
with the reasoning that the hardcoded value must
have been taken from a run of the CI that didn't
reach the max value due to CI worker lag.

More likely the real reason is that commit 68bef3f
the day before had caused blocks to be duplicated
during downsampling.

The duplicate block is immediately marked for deletion,
causing an extra +1 in the number of blocks cleaned.

Subtracting one from the value again now that the
block duplication issue is fixed.

Signed-off-by: Thomas Hartland <thomas.hartland@diamond.ac.uk>

* e2e/compact: Revert change to downsample count assertion

Combined with the previous commit this effectively reverts
all of 2ed48f7, in which two assertions were changed to
(unknowingly) account for a bug which had just been
introduced in the downsampling code, causing duplicate blocks.

I am less sure of the reasoning for this assertion change,
but after running through the e2e tests several times locally,
it is consistent that the only downsampling happens in the
"compact-working" step, and so all other steps would report 0
for their total downsamples metric.

Signed-off-by: Thomas Hartland <thomas.hartland@diamond.ac.uk>

---------

Signed-off-by: Thomas Hartland <thomas.hartland@diamond.ac.uk>
2024-07-13 13:11:26 -07:00
dependabot[bot] 34e0729607
Bump go.opentelemetry.io/contrib/samplers/jaegerremote (#7529)
Bumps [go.opentelemetry.io/contrib/samplers/jaegerremote](https://github.com/open-telemetry/opentelemetry-go-contrib) from 0.7.0 to 0.22.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go-contrib/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go-contrib/compare/v0.7.0...v0.22.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/contrib/samplers/jaegerremote
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-11 16:41:48 +02:00
dependabot[bot] 50c304dcd7
Bump go.opentelemetry.io/contrib/propagators/autoprop (#7530)
Bumps [go.opentelemetry.io/contrib/propagators/autoprop](https://github.com/open-telemetry/opentelemetry-go-contrib) from 0.38.0 to 0.53.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go-contrib/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go-contrib/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go-contrib/compare/zpages/v0.38.0...zpages/v0.53.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/contrib/propagators/autoprop
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-11 15:05:52 +01:00
Joel Verezhak fb76b226bd
Remove trailing period from SRV records (#7494)
Recently ran into an issue with Istio in particular, where leaving the
trailing dot on the SRV record returned by `dnssrvnoa` lookups led to an
inability to connect to the endpoint. Removing the trailing dot fixes
this behaviour.

Now, technically, this is a valid URL, and it shouldn't be a problem.
One could definitely argue that Istio should be responsible here for
ensuring that the traffic is delivered. The problem seems rooted in how
Istio attempts to do wildcard matching or URLs it receives - including
the dot leads it to lookup an empty DNS field, which is invalid.

The approach I take here is actually copied from how Prometheus does it.
Therefore I hope we can sneak this through with the argument that 'this
is how Prometheus does it', regardless of whether or not this is
philosophically correct...

Signed-off-by: verejoel <j.verezhak@gmail.com>
2024-07-09 06:07:52 +00:00
Pedro Tanaka 6f1245483e
QFE: disable double compression middleware (#7511)
Signed-off-by: Pedro Tanaka <pedro.tanaka@shopify.com>
2024-07-08 11:07:49 +01:00
Vasiliy Rumyantsev cb27548cc4
removed mention of unused pkg (#7515)
Signed-off-by: Vasiliy Rumyantsev <4119114+xBazilio@users.noreply.github.com>
2024-07-07 19:57:14 -07:00
Filip Petkovski a922b219ef
Close endpoints after the gRPC server has terminated (#7509)
Endpoints are currently closed as soon as we receive a SIGTERM or SIGINT.
This causes in-flight queries to get cancelled since outgoing connections
get closed instantly.

This commit moves the endpoints.Close call after the grpc server shutdown
to make sure connections are available as long as the server is running.

Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
2024-07-03 15:20:23 +02:00
Rishabh Soni 0ae5bfc22e
chore: Add nirmata to adopters (#7506)
* Update adopters.yml

Signed-off-by: Rishabh Soni <risrock02@gmail.com>

* Add files via upload

Signed-off-by: Rishabh Soni <risrock02@gmail.com>

---------

Signed-off-by: Rishabh Soni <risrock02@gmail.com>
2024-07-02 14:45:50 -07:00
Pranshu Srivastava fcc88c028a
reloader: allow suppressing envvar errors (#7429)
Allow suppressing expansion errors for unset environment variables,
leaving them as-is instead of crashing the reloader.

Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
2024-07-02 09:41:27 +01:00
Michael Hoffmann 417595c4e5
*: fix server grpc histograms (#7493)
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
2024-06-27 19:12:15 +02:00
Michael Hoffmann 57b42d1bf4
CHANGELOG: Mark 0.36 as in progress (#7486)
Signed-off-by: Michael Hoffmann <mhoffm@posteo.de>
2024-06-26 17:39:40 +02:00
431 changed files with 31792 additions and 9338 deletions

.bingo/.gitignore vendored

@@ -11,3 +11,4 @@
!variables.env
*tmp.mod
*tmp.sum


@@ -6,7 +6,7 @@ This is directory which stores Go modules with pinned buildable package that is
* Run `bingo get <tool>` to install <tool> that have own module file in this directory.
* For Makefile: Make sure to put `include .bingo/Variables.mk` in your Makefile, then use $(<upper case tool name>) variable where <tool> is the .bingo/<tool>.mod.
* For shell: Run `source .bingo/variables.env` to source all environment variable for each tool.
* For go: Import `.bingo/variables.go` to for variable names.
* For go: Import `.bingo/variables.go` for variable names.
* See https://github.com/bwplotka/bingo or -h on how to add, remove or change binaries dependencies.
## Requirements


@@ -1,4 +1,4 @@
# Auto generated binary variables helper managed by https://github.com/bwplotka/bingo v0.8. DO NOT EDIT.
# Auto generated binary variables helper managed by https://github.com/bwplotka/bingo v0.9. DO NOT EDIT.
# All tools are designed to be build inside $GOBIN.
BINGO_DIR := $(dir $(lastword $(MAKEFILE_LIST)))
GOPATH ?= $(shell go env GOPATH)
@@ -17,29 +17,35 @@ GO ?= $(shell which go)
# @echo "Running alertmanager"
# @$(ALERTMANAGER) <flags/args..>
#
ALERTMANAGER := $(GOBIN)/alertmanager-v0.24.0
ALERTMANAGER := $(GOBIN)/alertmanager-v0.27.0
$(ALERTMANAGER): $(BINGO_DIR)/alertmanager.mod
@# Install binary/ries using Go 1.14+ build command. This is using bwplotka/bingo-controlled, separate go module with pinned dependencies.
@echo "(re)installing $(GOBIN)/alertmanager-v0.24.0"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=alertmanager.mod -o=$(GOBIN)/alertmanager-v0.24.0 "github.com/prometheus/alertmanager/cmd/alertmanager"
@echo "(re)installing $(GOBIN)/alertmanager-v0.27.0"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=alertmanager.mod -o=$(GOBIN)/alertmanager-v0.27.0 "github.com/prometheus/alertmanager/cmd/alertmanager"
BINGO := $(GOBIN)/bingo-v0.8.1-0.20230820182247-0568407746a2
BINGO := $(GOBIN)/bingo-v0.9.0
$(BINGO): $(BINGO_DIR)/bingo.mod
@# Install binary/ries using Go 1.14+ build command. This is using bwplotka/bingo-controlled, separate go module with pinned dependencies.
@echo "(re)installing $(GOBIN)/bingo-v0.8.1-0.20230820182247-0568407746a2"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=bingo.mod -o=$(GOBIN)/bingo-v0.8.1-0.20230820182247-0568407746a2 "github.com/bwplotka/bingo"
@echo "(re)installing $(GOBIN)/bingo-v0.9.0"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=bingo.mod -o=$(GOBIN)/bingo-v0.9.0 "github.com/bwplotka/bingo"
FAILLINT := $(GOBIN)/faillint-v1.11.0
CAPNPC_GO := $(GOBIN)/capnpc-go-v3.0.1-alpha.2.0.20240830165715-46ccd63a72af
$(CAPNPC_GO): $(BINGO_DIR)/capnpc-go.mod
@# Install binary/ries using Go 1.14+ build command. This is using bwplotka/bingo-controlled, separate go module with pinned dependencies.
@echo "(re)installing $(GOBIN)/capnpc-go-v3.0.1-alpha.2.0.20240830165715-46ccd63a72af"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=capnpc-go.mod -o=$(GOBIN)/capnpc-go-v3.0.1-alpha.2.0.20240830165715-46ccd63a72af "capnproto.org/go/capnp/v3/capnpc-go"
FAILLINT := $(GOBIN)/faillint-v1.13.0
$(FAILLINT): $(BINGO_DIR)/faillint.mod
@# Install binary/ries using Go 1.14+ build command. This is using bwplotka/bingo-controlled, separate go module with pinned dependencies.
@echo "(re)installing $(GOBIN)/faillint-v1.11.0"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=faillint.mod -o=$(GOBIN)/faillint-v1.11.0 "github.com/fatih/faillint"
@echo "(re)installing $(GOBIN)/faillint-v1.13.0"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=faillint.mod -o=$(GOBIN)/faillint-v1.13.0 "github.com/fatih/faillint"
GOIMPORTS := $(GOBIN)/goimports-v0.12.0
GOIMPORTS := $(GOBIN)/goimports-v0.23.0
$(GOIMPORTS): $(BINGO_DIR)/goimports.mod
@# Install binary/ries using Go 1.14+ build command. This is using bwplotka/bingo-controlled, separate go module with pinned dependencies.
@echo "(re)installing $(GOBIN)/goimports-v0.12.0"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=goimports.mod -o=$(GOBIN)/goimports-v0.12.0 "golang.org/x/tools/cmd/goimports"
@echo "(re)installing $(GOBIN)/goimports-v0.23.0"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=goimports.mod -o=$(GOBIN)/goimports-v0.23.0 "golang.org/x/tools/cmd/goimports"
GOJSONTOYAML := $(GOBIN)/gojsontoyaml-v0.1.0
$(GOJSONTOYAML): $(BINGO_DIR)/gojsontoyaml.mod
@@ -47,11 +53,11 @@ $(GOJSONTOYAML): $(BINGO_DIR)/gojsontoyaml.mod
@echo "(re)installing $(GOBIN)/gojsontoyaml-v0.1.0"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=gojsontoyaml.mod -o=$(GOBIN)/gojsontoyaml-v0.1.0 "github.com/brancz/gojsontoyaml"
GOLANGCI_LINT := $(GOBIN)/golangci-lint-v1.54.1
GOLANGCI_LINT := $(GOBIN)/golangci-lint-v1.64.5
$(GOLANGCI_LINT): $(BINGO_DIR)/golangci-lint.mod
@# Install binary/ries using Go 1.14+ build command. This is using bwplotka/bingo-controlled, separate go module with pinned dependencies.
@echo "(re)installing $(GOBIN)/golangci-lint-v1.54.1"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=golangci-lint.mod -o=$(GOBIN)/golangci-lint-v1.54.1 "github.com/golangci/golangci-lint/cmd/golangci-lint"
@echo "(re)installing $(GOBIN)/golangci-lint-v1.64.5"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=golangci-lint.mod -o=$(GOBIN)/golangci-lint-v1.64.5 "github.com/golangci/golangci-lint/cmd/golangci-lint"
GOTESPLIT := $(GOBIN)/gotesplit-v0.2.1
$(GOTESPLIT): $(BINGO_DIR)/gotesplit.mod
@@ -95,11 +101,11 @@ $(MDOX): $(BINGO_DIR)/mdox.mod
@echo "(re)installing $(GOBIN)/mdox-v0.9.1-0.20220713110358-25b9abcf90a0"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=mdox.mod -o=$(GOBIN)/mdox-v0.9.1-0.20220713110358-25b9abcf90a0 "github.com/bwplotka/mdox"
MINIO := $(GOBIN)/minio-v0.0.0-20220720015624-ce8397f7d944
MINIO := $(GOBIN)/minio-v0.0.0-20241014163537-3da7c9cce3de
$(MINIO): $(BINGO_DIR)/minio.mod
@# Install binary/ries using Go 1.14+ build command. This is using bwplotka/bingo-controlled, separate go module with pinned dependencies.
@echo "(re)installing $(GOBIN)/minio-v0.0.0-20220720015624-ce8397f7d944"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=minio.mod -o=$(GOBIN)/minio-v0.0.0-20220720015624-ce8397f7d944 "github.com/minio/minio"
@echo "(re)installing $(GOBIN)/minio-v0.0.0-20241014163537-3da7c9cce3de"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=minio.mod -o=$(GOBIN)/minio-v0.0.0-20241014163537-3da7c9cce3de "github.com/minio/minio"
PROMDOC := $(GOBIN)/promdoc-v0.8.0
$(PROMDOC): $(BINGO_DIR)/promdoc.mod
@@ -107,11 +113,11 @@ $(PROMDOC): $(BINGO_DIR)/promdoc.mod
@echo "(re)installing $(GOBIN)/promdoc-v0.8.0"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=promdoc.mod -o=$(GOBIN)/promdoc-v0.8.0 "github.com/plexsystems/promdoc"
PROMETHEUS := $(GOBIN)/prometheus-v0.37.0
PROMETHEUS := $(GOBIN)/prometheus-v0.54.1
$(PROMETHEUS): $(BINGO_DIR)/prometheus.mod
@# Install binary/ries using Go 1.14+ build command. This is using bwplotka/bingo-controlled, separate go module with pinned dependencies.
@echo "(re)installing $(GOBIN)/prometheus-v0.37.0"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=prometheus.mod -o=$(GOBIN)/prometheus-v0.37.0 "github.com/prometheus/prometheus/cmd/prometheus"
@echo "(re)installing $(GOBIN)/prometheus-v0.54.1"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=prometheus.mod -o=$(GOBIN)/prometheus-v0.54.1 "github.com/prometheus/prometheus/cmd/prometheus"
PROMTOOL := $(GOBIN)/promtool-v0.47.0
$(PROMTOOL): $(BINGO_DIR)/promtool.mod
@@ -131,9 +137,9 @@ $(PROTOC_GEN_GOGOFAST): $(BINGO_DIR)/protoc-gen-gogofast.mod
@echo "(re)installing $(GOBIN)/protoc-gen-gogofast-v1.3.2"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=protoc-gen-gogofast.mod -o=$(GOBIN)/protoc-gen-gogofast-v1.3.2 "github.com/gogo/protobuf/protoc-gen-gogofast"
SHFMT := $(GOBIN)/shfmt-v3.7.0
SHFMT := $(GOBIN)/shfmt-v3.8.0
$(SHFMT): $(BINGO_DIR)/shfmt.mod
@# Install binary/ries using Go 1.14+ build command. This is using bwplotka/bingo-controlled, separate go module with pinned dependencies.
@echo "(re)installing $(GOBIN)/shfmt-v3.7.0"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=shfmt.mod -o=$(GOBIN)/shfmt-v3.7.0 "mvdan.cc/sh/v3/cmd/shfmt"
@echo "(re)installing $(GOBIN)/shfmt-v3.8.0"
@cd $(BINGO_DIR) && GOWORK=off $(GO) build -mod=mod -modfile=shfmt.mod -o=$(GOBIN)/shfmt-v3.8.0 "mvdan.cc/sh/v3/cmd/shfmt"


@@ -1,5 +1,7 @@
module _ // Auto generated by https://github.com/bwplotka/bingo. DO NOT EDIT
go 1.14
go 1.21
require github.com/prometheus/alertmanager v0.24.0 // cmd/alertmanager
toolchain go1.23.1
require github.com/prometheus/alertmanager v0.27.0 // cmd/alertmanager

File diff suppressed because it is too large


@@ -2,4 +2,4 @@ module _ // Auto generated by https://github.com/bwplotka/bingo. DO NOT EDIT
go 1.14
require github.com/bwplotka/bingo v0.8.1-0.20230820182247-0568407746a2
require github.com/bwplotka/bingo v0.9.0


@@ -4,6 +4,8 @@ github.com/bwplotka/bingo v0.6.0 h1:AlRrI9J/GVjOUSZbsYQ5WS8X8FnLpTbEAhUVW5iOQ7M=
github.com/bwplotka/bingo v0.6.0/go.mod h1:/qx0tLceUEeAs1R8QnIF+n9+Q0xUe7hmdQTB2w0eDYk=
github.com/bwplotka/bingo v0.8.1-0.20230820182247-0568407746a2 h1:nvLMMDf/Lw2JdJe2KzXjnL7IhIU+j48CXFZEuR9uPHQ=
github.com/bwplotka/bingo v0.8.1-0.20230820182247-0568407746a2/go.mod h1:GxC/y/xbmOK5P29cn+B3HuOSw0s2gruddT3r+rDizDw=
github.com/bwplotka/bingo v0.9.0 h1:slnsdJYExR4iRalHR6/ZiYnr9vSazOuFGmc2LdX293g=
github.com/bwplotka/bingo v0.9.0/go.mod h1:GxC/y/xbmOK5P29cn+B3HuOSw0s2gruddT3r+rDizDw=
github.com/cpuguy83/go-md2man/v2 v2.0.2/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o=
github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
github.com/creack/pty v1.1.15/go.mod h1:MOBLtS5ELjhRRrroQr9kyvTxUAFNvYEK993ew/Vr4O4=

.bingo/capnpc-go.mod Normal file

@@ -0,0 +1,5 @@
module _ // Auto generated by https://github.com/bwplotka/bingo. DO NOT EDIT
go 1.23.1
require capnproto.org/go/capnp/v3 v3.0.1-alpha.2.0.20240830165715-46ccd63a72af // capnpc-go

.bingo/capnpc-go.sum Normal file

@@ -0,0 +1,6 @@
capnproto.org/go/capnp/v3 v3.0.1-alpha.2.0.20240830165715-46ccd63a72af h1:A5wxH0ZidOtYYUGjhtBaRuB87M73bGfc06uWB8sHpg0=
capnproto.org/go/capnp/v3 v3.0.1-alpha.2.0.20240830165715-46ccd63a72af/go.mod h1:2vT5D2dtG8sJGEoEKU17e+j7shdaYp1Myl8X03B3hmc=
github.com/colega/zeropool v0.0.0-20230505084239-6fb4a4f75381 h1:d5EKgQfRQvO97jnISfR89AiCCCJMwMFoSxUiU0OGCRU=
github.com/colega/zeropool v0.0.0-20230505084239-6fb4a4f75381/go.mod h1:OU76gHeRo8xrzGJU3F3I1CqX1ekM8dfJw0+wPeMwnp0=
golang.org/x/sync v0.7.0 h1:YsImfSBoP9QPYL0xyKJPq0gcaJdG3rInoqxTWbfQu9M=
golang.org/x/sync v0.7.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=


@@ -1,5 +1,11 @@
module _ // Auto generated by https://github.com/bwplotka/bingo. DO NOT EDIT
go 1.14
go 1.22.0
require github.com/fatih/faillint v1.11.0
toolchain go1.24.0
replace github.com/fatih/faillint => github.com/thanos-community/faillint v0.0.0-20250217160734-830c2205d383
require github.com/fatih/faillint v1.13.0
require golang.org/x/sync v0.11.0 // indirect


@@ -6,25 +6,52 @@ github.com/fatih/faillint v1.10.0 h1:NQ2zhSNuYp0g23/6gyCSi2IfdVIfOk/JkSzpWSDEnYQ
github.com/fatih/faillint v1.10.0/go.mod h1:upblMxCjN4sL78nBbOHFEH9UGHTSw61M3Kj9BMS0UL0=
github.com/fatih/faillint v1.11.0 h1:EhmAKe8k0Cx2gnf+/JiX/IAeeKjwsQao5dY8oG6cQB4=
github.com/fatih/faillint v1.11.0/go.mod h1:d9kdQwFcr+wD4cLXOdjTw1ENUUvv5+z0ctJ5Wm0dTvA=
github.com/fatih/faillint v1.13.0 h1:9Dn9ZvK7bPTFmAkQ0FvhBRF4qD+LZg0ZgelyeBc7kKE=
github.com/fatih/faillint v1.13.0/go.mod h1:YiTDDtwQSL6MNRPtYG0n/rGE9orYt92aohq/P2QYBLA=
github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/thanos-community/faillint v0.0.0-20250217160734-830c2205d383 h1:cuHWR5WwIVpmvccpJ2iYgEWIo1SQQCvPWtYSPOiVvoU=
github.com/thanos-community/faillint v0.0.0-20250217160734-830c2205d383/go.mod h1:KM6cUIJEIVjYDUACgnDrky9bsAP4/+d37G7sGbEn6I0=
github.com/yuin/goldmark v1.4.1/go.mod h1:mwnBkeHKe2W/ZEtQ+71ViKU8L12m81fl3OWwC1Zlc8k=
github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20210921155107-089bfa567519/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc=
golang.org/x/crypto v0.13.0/go.mod h1:y6Z2r+Rw4iayiXXAIxJIDAJ1zMW4yaTpebo8fPOliYc=
golang.org/x/crypto v0.19.0/go.mod h1:Iy9bg/ha4yyC70EfRS8jz+B6ybOBKMaSxLj6P6oBDfU=
golang.org/x/crypto v0.21.0/go.mod h1:0BP7YvVV9gBbVKyeTG0Gyn+gZm94bibOW5BjDEYAOMs=
golang.org/x/crypto v0.33.0/go.mod h1:bVdXmD7IV/4GdElGPozy6U7lWdRXA4qyRVGJV57uQ5M=
golang.org/x/mod v0.5.1 h1:OJxoQ/rynoF0dcCdI7cLPktw/hR2cueqYfjm43oqK38=
golang.org/x/mod v0.5.1/go.mod h1:5OXOZSfqPIIbmVBIIKWRFfZjPR0E5r58TLhUjH0a2Ro=
golang.org/x/mod v0.6.0-dev.0.20220106191415-9b9b3d81d5e3 h1:kQgndtyPBW/JIYERgdxfwMYh3AVStj88WQTlNDi2a+o=
golang.org/x/mod v0.6.0-dev.0.20220106191415-9b9b3d81d5e3/go.mod h1:3p9vT2HGsQu2K1YbXdKPJLVgG5VJdoTa1poYQBtP1AY=
golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4 h1:6zppjxzCulZykYSLyVDYbneBfbaBIQPYMevg0bEwv2s=
golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4=
golang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/mod v0.12.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/mod v0.15.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
golang.org/x/mod v0.16.0 h1:QX4fJ0Rr5cPQCF7O9lh9Se4pmwfwskqZfq5moyldzic=
golang.org/x/mod v0.16.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
golang.org/x/mod v0.23.0 h1:Zb7khfcRGKk+kqfxFaP5tZqCnDZMjC5VtUBs87Hr6QM=
golang.org/x/mod v0.23.0/go.mod h1:6SkKJ3Xj0I0BrPOZoBy3bdMptDDU9oJrpohJ3eWZ1fY=
golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
golang.org/x/net v0.0.0-20211015210444-4f30a5c0130f/go.mod h1:9nx3DQGgdP8bBQD5qxJ1jj9UTztislL4KSBs9R2vV5Y=
golang.org/x/net v0.0.0-20220722155237-a158d28d115b/go.mod h1:XRhObCWvk6IyKnWLug+ECip1KBveYUHfp+8e9klMJ9c=
golang.org/x/net v0.6.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
golang.org/x/net v0.10.0/go.mod h1:0qNGK6F8kojg2nk9dLZ2mShWaEBan6FAoqfSigmmuDg=
golang.org/x/net v0.15.0/go.mod h1:idbUs1IY1+zTqbi8yxTbhexhEEk5ur9LInksu6HrEpk=
golang.org/x/net v0.21.0/go.mod h1:bIjVDfnllIU7BJ2DNgfnXvpSvtn8VRwhlsaeUTyUS44=
golang.org/x/net v0.22.0/go.mod h1:JKghWKKOSdJwpW2GEx0Ja7fmaKnMsbu+MWVZTokSYmg=
golang.org/x/net v0.35.0/go.mod h1:EglIi67kWsHKlRzzVMUD93VMSWGFOMSZgxFjparz1Qk=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.3.0/go.mod h1:FU7BRWz2tNW+3quACPkgCx/L+uEAv1htQ0V83Z9Rj+Y=
golang.org/x/sync v0.6.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sync v0.11.0 h1:GGz8+XQP4FvTTrjZPzNKTMFtSXH80RAzG+5ghFPgK9w=
golang.org/x/sync v0.11.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
@@ -35,12 +62,31 @@ golang.org/x/sys v0.0.0-20211019181941-9d821ace8654/go.mod h1:oPkhp1MJrh7nUepCBc
golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f h1:v4INt8xihDGvnrfjMDVXGxw9wrfxYyCjk0KbXjhR55s=
golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.12.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.17.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.18.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.30.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/telemetry v0.0.0-20240228155512-f48c80bd79b2/go.mod h1:TeRTkGYfJXctD9OcfyVLyj2J3IxLnKwHJR8f4D8a3YE=
golang.org/x/telemetry v0.0.0-20240521205824-bda55230c457/go.mod h1:pRgIJT+bRLFKnoM1ldnzKoxTIn14Yxz928LQRYYgIN0=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=
golang.org/x/term v0.5.0/go.mod h1:jMB1sMXY+tzblOD4FWmEbocvup2/aLOaQEp7JmGp78k=
golang.org/x/term v0.8.0/go.mod h1:xPskH00ivmX89bAKVGSKKtLOWNx2+17Eiy94tnKShWo=
golang.org/x/term v0.12.0/go.mod h1:owVbMEjm3cBLCHdkQu9b1opXd4ETQWc3BhuQGKgXgvU=
golang.org/x/term v0.17.0/go.mod h1:lLRBjIVuehSbZlaOtGMbcMncT+aqLLLmKrsjNrUguwk=
golang.org/x/term v0.18.0/go.mod h1:ILwASektA3OnRv7amZ1xhE/KTR+u50pbXfZ03+6Nx58=
golang.org/x/term v0.29.0/go.mod h1:6bl4lRlvVuDgSf3179VpIxBF0o10JUpXWOnI7nErv7s=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.3.6/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ=
golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
golang.org/x/text v0.9.0/go.mod h1:e1OnstbJyHTd6l/uOt8jFFHp6TRDWZR/bV3emEE/zU8=
golang.org/x/text v0.13.0/go.mod h1:TvPlkZtksWOMsz7fbANvkp4WM8x/WCo/om8BMLbz+aE=
golang.org/x/text v0.14.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=
golang.org/x/text v0.22.0/go.mod h1:YRoo4H8PVmsu+E3Ou7cqLVH8oXWIHVoX0jqUWALQhfY=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.1.8 h1:P1HhGGuLW4aAclzjtmJdf0mJOjVUZUzOTqkAkWL+l6w=
@@ -49,6 +95,12 @@ golang.org/x/tools v0.1.10 h1:QjFRCZxdOhBJ/UNgnBZLbNV13DlbnK0quyivTnXJM20=
golang.org/x/tools v0.1.10/go.mod h1:Uh6Zz+xoGYZom868N8YTex3t7RhtHDBrE8Gzo9bV56E=
golang.org/x/tools v0.1.12 h1:VveCTK38A2rkS8ZqFY25HIDFscX5X9OoEhJd3quQmXU=
golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc=
golang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU=
golang.org/x/tools v0.13.0/go.mod h1:HvlwmtVNQAhOuCjW7xxvovg8wbNq7LwfXh/k7wXUl58=
golang.org/x/tools v0.19.0 h1:tfGCXNR1OsFG+sVdLAitlpjAvD/I6dHDKnYrpEZUHkw=
golang.org/x/tools v0.19.0/go.mod h1:qoJWxmGSIBmAeriMx19ogtrEPrGtDbPK634QFIcLAhc=
golang.org/x/tools v0.30.0 h1:BgcpHewrV5AUp2G9MebG4XPFI1E2W41zU1SaqVA9vJY=
golang.org/x/tools v0.30.0/go.mod h1:c347cR/OJfw5TI+GfX7RUPNMdDRRbjvYTS0jPyvsVtY=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1 h1:go1bK/D/BFZV2I8cIQd1NKEZ+0owSTG1fDTci4IqFcE=


@@ -2,4 +2,4 @@ module _ // Auto generated by https://github.com/bwplotka/bingo. DO NOT EDIT
go 1.14
require golang.org/x/tools v0.12.0 // cmd/goimports
require golang.org/x/tools v0.23.0 // cmd/goimports


@@ -1,3 +1,4 @@
github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/yuin/goldmark v1.1.27/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
github.com/yuin/goldmark v1.4.1/go.mod h1:mwnBkeHKe2W/ZEtQ+71ViKU8L12m81fl3OWwC1Zlc8k=
github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY=
@@ -5,6 +6,10 @@ golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACk
golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20210921155107-089bfa567519/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc=
golang.org/x/crypto v0.12.0/go.mod h1:NF0Gs7EO5K4qLn+Ylc+fih8BSTeIjAP05siRnAh98yw=
golang.org/x/crypto v0.13.0/go.mod h1:y6Z2r+Rw4iayiXXAIxJIDAJ1zMW4yaTpebo8fPOliYc=
golang.org/x/crypto v0.19.0/go.mod h1:Iy9bg/ha4yyC70EfRS8jz+B6ybOBKMaSxLj6P6oBDfU=
golang.org/x/crypto v0.23.0/go.mod h1:CKFgDieR+mRhux2Lsu27y0fO304Db0wZe70UKqHu0v8=
golang.org/x/crypto v0.25.0/go.mod h1:T+wALwcMOSE0kXgUAnPAHqTLW+XHgcELELW8VaDgm/M=
golang.org/x/mod v0.2.0 h1:KU7oHjnv3XNWfa5COkzUifxZmxp1TyI7ImMXqFxLwvQ=
golang.org/x/mod v0.2.0/go.mod h1:s0Qsj1ACt9ePp/hMypM3fl4fZqREWJwdYDEqhRiZZUA=
golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4 h1:6zppjxzCulZykYSLyVDYbneBfbaBIQPYMevg0bEwv2s=
@@ -12,6 +17,10 @@ golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91
golang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/mod v0.12.0 h1:rmsUpXtvNzj340zd98LZ4KntptpfRHwpFOHG188oHXc=
golang.org/x/mod v0.12.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/mod v0.15.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
golang.org/x/mod v0.17.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
golang.org/x/mod v0.19.0 h1:fEdghXQSo20giMthA7cd28ZC+jts4amQ3YMXiP5oMQ8=
golang.org/x/mod v0.19.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn7q6eTqICYqUVnKs3thJo3Qplg=
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200226121028-0de0cce0169b/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
@@ -21,12 +30,19 @@ golang.org/x/net v0.0.0-20220722155237-a158d28d115b/go.mod h1:XRhObCWvk6IyKnWLug
golang.org/x/net v0.6.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
golang.org/x/net v0.10.0/go.mod h1:0qNGK6F8kojg2nk9dLZ2mShWaEBan6FAoqfSigmmuDg=
golang.org/x/net v0.14.0/go.mod h1:PpSgVXXLK0OxS0F31C1/tv6XNguvCrnXIDrFMspZIUI=
golang.org/x/net v0.15.0/go.mod h1:idbUs1IY1+zTqbi8yxTbhexhEEk5ur9LInksu6HrEpk=
golang.org/x/net v0.21.0/go.mod h1:bIjVDfnllIU7BJ2DNgfnXvpSvtn8VRwhlsaeUTyUS44=
golang.org/x/net v0.25.0/go.mod h1:JkAGAh7GEvH74S6FOH42FLoXpXbE/aqXSrIQjXgsiwM=
golang.org/x/net v0.27.0/go.mod h1:dDi0PyhWNoiUOrAS8uXv/vnScO4wnHQO4mj9fn/RytE=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.3.0/go.mod h1:FU7BRWz2tNW+3quACPkgCx/L+uEAv1htQ0V83Z9Rj+Y=
golang.org/x/sync v0.6.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sync v0.7.0 h1:YsImfSBoP9QPYL0xyKJPq0gcaJdG3rInoqxTWbfQu9M=
golang.org/x/sync v0.7.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190412213103-97732733099d/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
@@ -40,11 +56,21 @@ golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.11.0 h1:eG7RXZHdqOJ1i+0lgLgCpSXAp6M3LYlAo6osgSi0xOM=
golang.org/x/sys v0.11.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.12.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.17.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.20.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.22.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/telemetry v0.0.0-20240228155512-f48c80bd79b2/go.mod h1:TeRTkGYfJXctD9OcfyVLyj2J3IxLnKwHJR8f4D8a3YE=
golang.org/x/telemetry v0.0.0-20240521205824-bda55230c457/go.mod h1:pRgIJT+bRLFKnoM1ldnzKoxTIn14Yxz928LQRYYgIN0=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=
golang.org/x/term v0.5.0/go.mod h1:jMB1sMXY+tzblOD4FWmEbocvup2/aLOaQEp7JmGp78k=
golang.org/x/term v0.8.0/go.mod h1:xPskH00ivmX89bAKVGSKKtLOWNx2+17Eiy94tnKShWo=
golang.org/x/term v0.11.0/go.mod h1:zC9APTIj3jG3FdV/Ons+XE1riIZXG4aZ4GTHiPZJPIU=
golang.org/x/term v0.12.0/go.mod h1:owVbMEjm3cBLCHdkQu9b1opXd4ETQWc3BhuQGKgXgvU=
golang.org/x/term v0.17.0/go.mod h1:lLRBjIVuehSbZlaOtGMbcMncT+aqLLLmKrsjNrUguwk=
golang.org/x/term v0.20.0/go.mod h1:8UkIAJTvZgivsXaD6/pH6U9ecQzZ45awqEOzuCvwpFY=
golang.org/x/term v0.22.0/go.mod h1:F3qCibpT5AMpCRfhfT53vVJwhLtIVHhB9XDjfFvnMI4=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.3.6/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
@@ -52,6 +78,10 @@ golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ=
golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
golang.org/x/text v0.9.0/go.mod h1:e1OnstbJyHTd6l/uOt8jFFHp6TRDWZR/bV3emEE/zU8=
golang.org/x/text v0.12.0/go.mod h1:TvPlkZtksWOMsz7fbANvkp4WM8x/WCo/om8BMLbz+aE=
golang.org/x/text v0.13.0/go.mod h1:TvPlkZtksWOMsz7fbANvkp4WM8x/WCo/om8BMLbz+aE=
golang.org/x/text v0.14.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=
golang.org/x/text v0.15.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=
golang.org/x/text v0.16.0/go.mod h1:GhwF1Be+LQoKShO3cGOHzqOgRrGaYc9AvblQOmPVHnI=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.0.0-20200526224456-8b020aee10d2 h1:21BqcH/onxtGHn1A2GDOJjZnbt4Nlez629S3eaR+eYs=
@@ -62,6 +92,10 @@ golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc
golang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU=
golang.org/x/tools v0.12.0 h1:YW6HUoUmYBpwSgyaGaZq1fHjrBjX1rlpZ54T6mu2kss=
golang.org/x/tools v0.12.0/go.mod h1:Sc0INKfu04TlqNoRA1hgpFZbhYXHPr4V5DzpSBTPqQM=
golang.org/x/tools v0.13.0/go.mod h1:HvlwmtVNQAhOuCjW7xxvovg8wbNq7LwfXh/k7wXUl58=
golang.org/x/tools v0.21.1-0.20240508182429-e35e4ccd0d2d/go.mod h1:aiJjzUbINMkxbQROHiO6hDPo2LHcIPhhQsa9DLh0yGk=
golang.org/x/tools v0.23.0 h1:SGsXPZ+2l4JsgaCKkx+FQ9YZ5XEtA1GZYuoDjenLjvg=
golang.org/x/tools v0.23.0/go.mod h1:pnu6ufv6vQkll6szChhK3C3L/ruaIv5eBeztNG8wtsI=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543 h1:E7g+9GITq07hpfrRu66IVDexMakfv52eLZ2CXBWiKr4=


@@ -1,5 +1,7 @@
module _ // Auto generated by https://github.com/bwplotka/bingo. DO NOT EDIT
go 1.14
go 1.23.0
require github.com/golangci/golangci-lint v1.54.1 // cmd/golangci-lint
toolchain go1.24.0
require github.com/golangci/golangci-lint v1.64.5 // cmd/golangci-lint

File diff suppressed because it is too large


@@ -1,4 +1,6 @@
github.com/Songmu/gotesplit v0.2.1 h1:qJFvR75nJpeKyMQFwyDtFrcc6zDWhrHAkks7DvM8oLo=
github.com/Songmu/gotesplit v0.2.1/go.mod h1:sVBfmLT26b1H5VhUpq8cRhCVK75GAmW9c8r2NiK0gzk=
github.com/jstemmer/go-junit-report v1.0.0 h1:8X1gzZpR+nVQLAht+L/foqOeX2l9DTZoaIPbEQHxsds=
github.com/jstemmer/go-junit-report v1.0.0/go.mod h1:Brl9GWCQeLvo8nXZwPNNblvFj/XSXhF0NWZEnDohbsk=
golang.org/x/sync v0.0.0-20220513210516-0976fa681c29 h1:w8s32wxx3sY+OjLlv9qltkLU5yvJzxjjgiHWLjdIcw4=
golang.org/x/sync v0.0.0-20220513210516-0976fa681c29/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=


@@ -1,5 +1,7 @@
module _ // Auto generated by https://github.com/bwplotka/bingo. DO NOT EDIT
go 1.14
go 1.22
require github.com/minio/minio v0.0.0-20220720015624-ce8397f7d944
toolchain go1.23.1
require github.com/minio/minio v0.0.0-20241014163537-3da7c9cce3de

File diff suppressed because it is too large


@@ -1,10 +1,12 @@
module _ // Auto generated by https://github.com/bwplotka/bingo. DO NOT EDIT
go 1.14
go 1.21.0
toolchain go1.23.1
replace k8s.io/klog => github.com/simonpasquier/klog-gokit v0.3.0
replace k8s.io/klog/v2 => github.com/simonpasquier/klog-gokit/v3 v3.0.0
replace k8s.io/klog/v2 => github.com/simonpasquier/klog-gokit/v3 v3.3.0
exclude github.com/linode/linodego v1.0.0
@@ -12,4 +14,4 @@ exclude github.com/grpc-ecosystem/grpc-gateway v1.14.7
exclude google.golang.org/api v0.30.0
require github.com/prometheus/prometheus v0.37.0 // cmd/prometheus
require github.com/prometheus/prometheus v0.54.1 // cmd/prometheus

File diff suppressed because it is too large


@@ -1,5 +1,7 @@
module _ // Auto generated by https://github.com/bwplotka/bingo. DO NOT EDIT
go 1.14
go 1.21
require mvdan.cc/sh/v3 v3.7.0 // cmd/shfmt
toolchain go1.22.5
require mvdan.cc/sh/v3 v3.8.0 // cmd/shfmt


@@ -1,11 +1,14 @@
github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
github.com/creack/pty v1.1.17/go.mod h1:MOBLtS5ELjhRRrroQr9kyvTxUAFNvYEK993ew/Vr4O4=
github.com/creack/pty v1.1.18/go.mod h1:MOBLtS5ELjhRRrroQr9kyvTxUAFNvYEK993ew/Vr4O4=
github.com/creack/pty v1.1.21/go.mod h1:MOBLtS5ELjhRRrroQr9kyvTxUAFNvYEK993ew/Vr4O4=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/frankban/quicktest v1.14.0/go.mod h1:NeW+ay9A/U67EYXNFA1nPE8e/tnQv/09mUdL/ijj8og=
github.com/frankban/quicktest v1.14.5/go.mod h1:4ptaffx2x8+WTWXmUCuVU6aPUX1/Mz7zb5vbUoiM6w0=
github.com/frankban/quicktest v1.14.6/go.mod h1:4ptaffx2x8+WTWXmUCuVU6aPUX1/Mz7zb5vbUoiM6w0=
github.com/google/go-cmp v0.5.6/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
github.com/google/go-cmp v0.5.9/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/renameio v0.1.0 h1:GOZbcHa3HfsPKPlmyPyN2KEohoMXOhdMbHrvbpl2QaA=
github.com/google/renameio v0.1.0/go.mod h1:KWCgfxg9yswjAJkECMjeO8J8rahYeXnNhOm40UhjYkI=
github.com/google/renameio v1.0.1 h1:Lh/jXZmvZxb0BBeSY5VKEfidcbcbenKjZFzM/q0fSeU=
@@ -29,22 +32,27 @@ github.com/rogpeppe/go-internal v1.6.1/go.mod h1:xXDCJY+GAPziupqXw64V24skbSoqbTE
github.com/rogpeppe/go-internal v1.8.1/go.mod h1:JeRgkft04UBgHMgCIwADu4Pn6Mtm5d4nPKWu0nJ5d+o=
github.com/rogpeppe/go-internal v1.9.0/go.mod h1:WtVeX8xhTBvf0smdhujwtBcq4Qrzq/fJaraNFVN+nFs=
github.com/rogpeppe/go-internal v1.10.1-0.20230524175051-ec119421bb97/go.mod h1:ddIwULY96R17DhadqLgMfk9H9tvdUzkipdSkR5nkCZA=
github.com/rogpeppe/go-internal v1.12.0/go.mod h1:E+RYuTGaKKdloAfM02xzb0FW3Paa99yedzYV+kq4uf4=
github.com/sergi/go-diff v1.0.0/go.mod h1:0CfEIISq7TuYL3j771MWULgwwjU+GofnZX9QAmXWZgo=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/testify v1.4.0/go.mod h1:j7eGeouHqKxXV5pUuKE4zz7dFj8WfuZ+81PSLYec5m4=
github.com/yuin/goldmark v1.4.13/go.mod h1:6yULJ656Px+3vBD8DxQVa3kxgyrAnzto9xy5taEt/CY=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20210921155107-089bfa567519/go.mod h1:GvvjBRRGRdwPK5ydBHafDWAxML/pGHZbMvKqRZ5+Abc=
golang.org/x/crypto v0.18.0/go.mod h1:R0j02AL6hcrfOiy9T4ZYp/rcWeMxM3L6QYxlOuEG1mg=
golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91VN4djpZkiMVwK6gcyfeH4XE8wZrZaV4=
golang.org/x/mod v0.9.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/mod v0.14.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
golang.org/x/net v0.0.0-20220722155237-a158d28d115b/go.mod h1:XRhObCWvk6IyKnWLug+ECip1KBveYUHfp+8e9klMJ9c=
golang.org/x/net v0.20.0/go.mod h1:z8BVo6PvndSri0LbOE3hAn0apkU+1YvI6E70E9jsnvY=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.2.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.6.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20191026070338-33540a1f6037/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20200217220822-9197077df867 h1:JoRuNIf+rpHl+VhScRQQvzbHed86tKkqwPMV34T8myw=
@@ -57,6 +65,8 @@ golang.org/x/sys v0.0.0-20220520151302-bc2c85ada10a/go.mod h1:oPkhp1MJrh7nUepCBc
golang.org/x/sys v0.0.0-20220722155257-8c9f86f7a55f/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.8.0 h1:EBmGv8NaZBZTWvrbjNoL6HVt+IVy3QDQpJs7VRIw3tU=
golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.17.0 h1:25cE3gD+tdBA7lp7QfhuV+rJiE9YXTcS3VG1SqssI/Y=
golang.org/x/sys v0.17.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/term v0.0.0-20191110171634-ad39bd3f0407 h1:5zh5atpUEdIc478E/ebrIaHLKcfVvG6dL/fGv7BcMoM=
golang.org/x/term v0.0.0-20191110171634-ad39bd3f0407/go.mod h1:Nr5EML6q2oocZ2LXRh80K7BxOlk5/8JxuGnuhpl+muw=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
@@ -64,12 +74,16 @@ golang.org/x/term v0.0.0-20210927222741-03fcf44c2211 h1:JGgROgKl9N8DuW20oFS5gxc+
golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=
golang.org/x/term v0.8.0 h1:n5xxQn2i3PC0yLAbjTpNT85q/Kgzcr2gIoX9OrJUols=
golang.org/x/term v0.8.0/go.mod h1:xPskH00ivmX89bAKVGSKKtLOWNx2+17Eiy94tnKShWo=
golang.org/x/term v0.17.0 h1:mkTF7LCd6WGJNL3K1Ad7kwxNfYAW6a8a8QqtMblp/4U=
golang.org/x/term v0.17.0/go.mod h1:lLRBjIVuehSbZlaOtGMbcMncT+aqLLLmKrsjNrUguwk=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ=
golang.org/x/text v0.14.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo=
golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc=
golang.org/x/tools v0.17.0/go.mod h1:xsh6VxdV005rRVaS6SSAf9oiAqljS7UZUacMZ8Bnsps=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
@@ -82,9 +96,13 @@ mvdan.cc/editorconfig v0.1.1-0.20200121172147-e40951bde157 h1:VBYz8greWWP8BDpRX0
mvdan.cc/editorconfig v0.1.1-0.20200121172147-e40951bde157/go.mod h1:Ge4atmRUYqueGppvJ7JNrtqpqokoJEFxYbP0Z+WeKS8=
mvdan.cc/editorconfig v0.2.0 h1:XL+7ys6ls/RKrkUNFQvEwIvNHh+JKx8Mj1pUV5wQxQE=
mvdan.cc/editorconfig v0.2.0/go.mod h1:lvnnD3BNdBYkhq+B4uBuFFKatfp02eB6HixDvEz91C0=
mvdan.cc/editorconfig v0.2.1-0.20231228180347-1925077f8eb2 h1:8nmqQGVnHUtHuT+yvuA49lQK0y5il5IOr2PtCBkDI2M=
mvdan.cc/editorconfig v0.2.1-0.20231228180347-1925077f8eb2/go.mod h1:r8RiQJRtzrPrZdcdEs5VCMqvRxAzYDUu9a4S9z7fKh8=
mvdan.cc/sh/v3 v3.1.2 h1:PG5BYlwtrkZTbJXUy25r0/q9shB5ObttCaknkOIB1XQ=
mvdan.cc/sh/v3 v3.1.2/go.mod h1:F+Vm4ZxPJxDKExMLhvjuI50oPnedVXpfjNSrusiTOno=
mvdan.cc/sh/v3 v3.5.1 h1:hmP3UOw4f+EYexsJjFxvU38+kn+V/s2CclXHanIBkmQ=
mvdan.cc/sh/v3 v3.5.1/go.mod h1:1JcoyAKm1lZw/2bZje/iYKWicU/KMd0rsyJeKHnsK4E=
mvdan.cc/sh/v3 v3.7.0 h1:lSTjdP/1xsddtaKfGg7Myu7DnlHItd3/M2tomOcNNBg=
mvdan.cc/sh/v3 v3.7.0/go.mod h1:K2gwkaesF/D7av7Kxl0HbF5kGOd2ArupNTX3X44+8l8=
mvdan.cc/sh/v3 v3.8.0 h1:ZxuJipLZwr/HLbASonmXtcvvC9HXY9d2lXZHnKGjFc8=
mvdan.cc/sh/v3 v3.8.0/go.mod h1:w04623xkgBVo7/IUK89E0g8hBykgEpN0vgOj3RJr6MY=


@@ -1,4 +1,4 @@
# Auto generated binary variables helper managed by https://github.com/bwplotka/bingo v0.8. DO NOT EDIT.
# Auto generated binary variables helper managed by https://github.com/bwplotka/bingo v0.9. DO NOT EDIT.
# All tools are designed to be build inside $GOBIN.
# Those variables will work only until 'bingo get' was invoked, or if tools were installed via Makefile's Variables.mk.
GOBIN=${GOBIN:=$(go env GOBIN)}
@@ -8,17 +8,19 @@ if [ -z "$GOBIN" ]; then
fi
ALERTMANAGER="${GOBIN}/alertmanager-v0.24.0"
ALERTMANAGER="${GOBIN}/alertmanager-v0.27.0"
BINGO="${GOBIN}/bingo-v0.8.1-0.20230820182247-0568407746a2"
BINGO="${GOBIN}/bingo-v0.9.0"
FAILLINT="${GOBIN}/faillint-v1.11.0"
CAPNPC_GO="${GOBIN}/capnpc-go-v3.0.1-alpha.2.0.20240830165715-46ccd63a72af"
GOIMPORTS="${GOBIN}/goimports-v0.12.0"
FAILLINT="${GOBIN}/faillint-v1.13.0"
GOIMPORTS="${GOBIN}/goimports-v0.23.0"
GOJSONTOYAML="${GOBIN}/gojsontoyaml-v0.1.0"
GOLANGCI_LINT="${GOBIN}/golangci-lint-v1.54.1"
GOLANGCI_LINT="${GOBIN}/golangci-lint-v1.64.5"
GOTESPLIT="${GOBIN}/gotesplit-v0.2.1"
@@ -34,11 +36,11 @@ JSONNETFMT="${GOBIN}/jsonnetfmt-v0.18.0"
MDOX="${GOBIN}/mdox-v0.9.1-0.20220713110358-25b9abcf90a0"
MINIO="${GOBIN}/minio-v0.0.0-20220720015624-ce8397f7d944"
MINIO="${GOBIN}/minio-v0.0.0-20241014163537-3da7c9cce3de"
PROMDOC="${GOBIN}/promdoc-v0.8.0"
PROMETHEUS="${GOBIN}/prometheus-v0.37.0"
PROMETHEUS="${GOBIN}/prometheus-v0.54.1"
PROMTOOL="${GOBIN}/promtool-v0.47.0"
@@ -46,5 +48,5 @@ PROMU="${GOBIN}/promu-v0.5.0"
PROTOC_GEN_GOGOFAST="${GOBIN}/protoc-gen-gogofast-v1.3.2"
SHFMT="${GOBIN}/shfmt-v3.7.0"
SHFMT="${GOBIN}/shfmt-v3.8.0"


@@ -8,56 +8,13 @@ orbs:
executors:
golang:
docker:
- image: cimg/go:1.21-node
- image: cimg/go:1.24.0-node
golang-test:
docker:
- image: cimg/go:1.21-node
- image: cimg/go:1.24.0-node
- image: quay.io/thanos/docker-swift-onlyone-authv2-keystone:v0.1
jobs:
test:
executor: golang-test
environment:
GO111MODULE: "on"
steps:
- git-shallow-clone/checkout
- go/load-cache
- go/mod-download
- run:
name: Download bingo modules
command: |
make install-tool-deps
- go/save-cache
- setup_remote_docker:
version: docker24
- run:
name: Create Secret if PR is not forked
# GCS integration tests are run only for author's PR that have write access, because these tests
# require credentials. Env variables that sets up these tests will work only for these kind of PRs.
command: |
if ! [ -z ${GCP_PROJECT} ]; then
echo $GOOGLE_APPLICATION_CREDENTIALS_CONTENT > $GOOGLE_APPLICATION_CREDENTIALS
echo "Awesome! GCS and S3 AWS integration tests are enabled."
fi
- run:
name: "Run unit tests."
no_output_timeout: "30m"
environment:
THANOS_TEST_OBJSTORE_SKIP: GCS,S3,AZURE,COS,ALIYUNOSS,BOS,OCI,OBS
# Variables for Swift testing.
OS_AUTH_URL: http://127.0.0.1:5000/v2.0
OS_PASSWORD: s3cr3t
OS_PROJECT_NAME: admin
OS_REGION_NAME: RegionOne
OS_USERNAME: admin
# taskset sets CPU affinity to 2 (current CPU limit).
command: |
if [ -z ${GCP_PROJECT} ]; then
export THANOS_TEST_OBJSTORE_SKIP=${THANOS_TEST_OBJSTORE_SKIP}
fi
echo "Skipping tests for object storages: ${THANOS_TEST_OBJSTORE_SKIP}"
taskset 2 make test
# Cross build is needed for publish_release but needs to be done outside of docker.
cross_build:
machine: true
@@ -127,19 +84,11 @@ workflows:
version: 2
thanos:
jobs:
- test:
filters:
tags:
only: /.*/
- publish_main:
requires:
- test
filters:
branches:
only: main
- cross_build:
requires:
- test
filters:
tags:
only: /^v[0-9]+(\.[0-9]+){2}(-.+|[^-.]*)$/
@@ -147,7 +96,6 @@ workflows:
ignore: /.*/
- publish_release:
requires:
- test
- cross_build
filters:
tags:


@@ -1,5 +1,5 @@
# For details, see https://github.com/devcontainers/images/tree/main/src/go
FROM mcr.microsoft.com/devcontainers/go:1.21
FROM mcr.microsoft.com/devcontainers/go:1.23
RUN echo "Downloading prometheus..." \
&& curl -sSL -H "Accept: application/vnd.github.v3+json" "https://api.github.com/repos/prometheus/prometheus/releases" -o /tmp/releases.json \


@@ -1,3 +0,0 @@
(github.com/go-kit/log.Logger).Log
fmt.Fprintln
fmt.Fprint


@@ -20,6 +20,10 @@ on:
schedule:
- cron: '30 12 * * 1'
permissions:
contents: read
security-events: write
jobs:
analyze:
name: Analyze
@@ -35,16 +39,16 @@ jobs:
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Set up Go
uses: actions/setup-go@v5
uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
with:
go-version: 1.21.x
go-version: 1.22.x
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
uses: github/codeql-action/init@fca7ace96b7d713c7035871441bd52efbe39e27e # v3.28.19
with:
languages: ${{ matrix.language }}
config-file: ./.github/codeql/codeql-config.yml
@@ -56,7 +60,7 @@ jobs:
# Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@v3
uses: github/codeql-action/autobuild@fca7ace96b7d713c7035871441bd52efbe39e27e # v3.28.19
# Command-line programs to run using the OS shell.
# 📚 https://git.io/JvXDl
@@ -70,4 +74,4 @@ jobs:
# make release
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
uses: github/codeql-action/analyze@fca7ace96b7d713c7035871441bd52efbe39e27e # v3.28.19


@@ -3,12 +3,18 @@ on:
schedule:
- cron: '0 * * * *'
name: busybox-update workflow
permissions:
contents: read
jobs:
checkVersionAndCreatePR:
permissions:
contents: write # for peter-evans/create-pull-request to create branch
pull-requests: write # for peter-evans/create-pull-request to create a PR
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Run busybox updater
run: |
@@ -17,7 +23,7 @@ jobs:
shell: bash
- name: Create Pull Request
uses: peter-evans/create-pull-request@v6
uses: peter-evans/create-pull-request@dd2324fc52d5d43c699a5636bcf19fceaa70c284 # v7.0.7
with:
signoff: true
token: ${{ secrets.GITHUB_TOKEN }}


@@ -7,6 +7,9 @@ on:
tags:
pull_request:
permissions:
contents: read
jobs:
check:
runs-on: ubuntu-latest
@@ -15,21 +18,21 @@ jobs:
GOBIN: /tmp/.bin
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Install Go
uses: actions/setup-go@v5
uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
with:
go-version: 1.21.x
go-version: 1.24.x
- uses: actions/cache@v4
- uses: actions/cache@0c907a75c2c80ebcb7f088228285e798b750cf8f # v4.2.1
with:
path: ~/go/pkg/mod
key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
restore-keys: |
${{ runner.os }}-go-
- uses: actions/cache@v4
- uses: actions/cache@0c907a75c2c80ebcb7f088228285e798b750cf8f # v4.2.1
with:
path: .mdoxcache
key: ${{ runner.os }}-mdox-${{ hashFiles('docs/**/*.md', 'examples/**/*.md', 'mixin/**/*.md', '*.md') }}


@@ -7,8 +7,42 @@ on:
tags:
pull_request:
# TODO(bwplotka): Add tests here.
permissions:
contents: read
jobs:
unit:
runs-on: ubuntu-latest
name: Thanos unit tests
env:
THANOS_TEST_OBJSTORE_SKIP: GCS,S3,AZURE,COS,ALIYUNOSS,BOS,OCI,OBS,SWIFT
OS_AUTH_URL: http://127.0.0.1:5000/v2.0
OS_PASSWORD: s3cr3t
OS_PROJECT_NAME: admin
OS_REGION_NAME: RegionOne
OS_USERNAME: admin
GOBIN: /tmp/.bin
services:
swift:
image: 'quay.io/thanos/docker-swift-onlyone-authv2-keystone:v0.1'
ports:
- 5000:5000
steps:
- name: Checkout code
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Install Go.
uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
with:
go-version: 1.24.x
- name: Install bingo modules
run: make install-tool-deps
- name: Add GOBIN to path
run: echo "/tmp/.bin" >> $GITHUB_PATH
- name: Run unit tests
run: make test
cross-build-check:
runs-on: ubuntu-latest
name: Go build for different platforms
@@ -16,14 +50,14 @@ jobs:
GOBIN: /tmp/.bin
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Install Go
uses: actions/setup-go@v5
uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
with:
go-version: 1.21.x
go-version: 1.24.x
- uses: actions/cache@v4
- uses: actions/cache@0c907a75c2c80ebcb7f088228285e798b750cf8f # v4.2.1
with:
path: |
~/.cache/go-build
@@ -36,6 +70,33 @@ jobs:
- name: Cross build check
run: make crossbuild
build-stringlabels:
runs-on: ubuntu-latest
name: Go build with -tags=stringlabels
env:
GOBIN: /tmp/.bin
steps:
- name: Checkout code
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Install Go
uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
with:
go-version: 1.24.x
- uses: actions/cache@0c907a75c2c80ebcb7f088228285e798b750cf8f # v4.2.1
with:
path: |
~/.cache/go-build
~/.cache/golangci-lint
~/go/pkg/mod
key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
restore-keys: |
${{ runner.os }}-go-
- name: Cross build check
run: go build -tags=stringlabels ./cmd/thanos
lint:
runs-on: ubuntu-latest
name: Linters (Static Analysis) for Go
@ -43,14 +104,14 @@ jobs:
GOBIN: /tmp/.bin
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Install Go
uses: actions/setup-go@v5
uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
with:
go-version: 1.21.x
go-version: 1.24.x
- uses: actions/cache@v4
- uses: actions/cache@0c907a75c2c80ebcb7f088228285e798b750cf8f # v4.2.1
with:
path: |
~/.cache/go-build
@ -66,26 +127,40 @@ jobs:
- name: Linting & vetting
run: make go-lint
codespell:
runs-on: ubuntu-latest
name: Check misspelled words
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Run codespell
uses: codespell-project/actions-codespell@v2
with:
check_filenames: false
check_hidden: true
skip: ./pkg/ui/*,./pkg/store/6545postingsrepro,./internal/*,./mixin/vendor/*,./.bingo/*,go.mod,go.sum
ignore_words_list: intrumentation,mmaped,nd,ot,re-use,ser,serie,sme,sudu,tast,te,ans
e2e:
strategy:
fail-fast: false
matrix:
parallelism: [8]
index: [0, 1, 2, 3, 4, 5, 6, 7]
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
name: Thanos end-to-end tests
env:
GOBIN: /tmp/.bin
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Install Go.
uses: actions/setup-go@v5
uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
with:
go-version: 1.21.x
go-version: 1.24.x
- uses: actions/cache@v4
- uses: actions/cache@0c907a75c2c80ebcb7f088228285e798b750cf8f # v4.2.1
with:
path: |
~/.cache/go-build
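The `matrix` above fans the e2e suite out across eight runners (`parallelism: [8]`, `index: [0..7]`), with each runner taking a disjoint slice of the tests. A minimal round-robin sketch of that sharding idea (the test names are hypothetical, and gotesplit's actual splitting strategy differs):

```shell
# Round-robin sharding sketch: runner GH_INDEX of GH_PARALLEL runners
# picks every GH_PARALLEL-th test. Test names are hypothetical.
GH_PARALLEL=8
GH_INDEX=3
i=0
for t in TestA TestB TestC TestD TestE TestF TestG TestH TestI TestJ TestK TestL; do
  if [ $((i % GH_PARALLEL)) -eq "$GH_INDEX" ]; then
    echo "$t"
  fi
  i=$((i + 1))
done
```

Every runner executes the same loop with its own `GH_INDEX`, so the union of all eight slices covers the suite exactly once.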

View File

@ -6,17 +6,20 @@ on:
pull_request:
branches: [main]
permissions:
contents: read
jobs:
build:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Set up Go
uses: actions/setup-go@v5
uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
with:
go-version: 1.21.x
go-version: 1.22.x
- name: Generate
run: make examples
@ -29,12 +32,12 @@ jobs:
name: Linters (Static Analysis) for Jsonnet (mixin)
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Install Go
uses: actions/setup-go@v5
uses: actions/setup-go@0a12ed9d6a96ab950c8f026ed9f722fe0da7ef32 # v5.0.2
with:
go-version: 1.21.x
go-version: 1.22.x
- name: Format
run: |

View File

@ -6,6 +6,9 @@ on:
- main
pull_request:
permissions:
contents: read
jobs:
build:
runs-on: ubuntu-latest
@ -15,14 +18,14 @@ jobs:
name: React UI test on Node ${{ matrix.node }}
steps:
- name: Checkout code
uses: actions/checkout@v4
uses: actions/checkout@692973e3d937129bcbf40652eb9f2f61becf3332 # v4.1.7
- name: Install nodejs
uses: actions/setup-node@v4
uses: actions/setup-node@1e60f620b9541d16bece96c5465dc8ee9832be0b # v4.0.3
with:
node-version: ${{ matrix.node }}
- uses: actions/cache@v4
- uses: actions/cache@0c907a75c2c80ebcb7f088228285e798b750cf8f # v4.2.1
with:
path: ~/.npm
key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}

View File

@ -1 +1 @@
1.21
1.24

View File

@ -4,25 +4,17 @@
# options for analysis running
run:
# timeout for analysis, e.g. 30s, 5m, default is 1m
deadline: 5m
timeout: 5m
# exit code when at least one issue was found, default is 1
issues-exit-code: 1
# which dirs to skip: they won't be analyzed;
# can use regexp here: generated.*, regexp is applied on full path;
# default value is empty list, but next dirs are always skipped independently
# from this option's value:
# vendor$, third_party$, testdata$, examples$, Godeps$, builtin$
skip-dirs:
- vendor
- internal/cortex
# output configuration options
output:
# colored-line-number|line-number|json|tab|checkstyle, default is "colored-line-number"
format: colored-line-number
# The formats used to render issues.
formats:
- format: colored-line-number
path: stdout
# print lines of code with issue, default is true
print-issued-lines: true
@ -46,12 +38,15 @@ linters:
- typecheck
- unparam
- unused
- exportloopref
- promlinter
linters-settings:
errcheck:
exclude: ./.errcheck_excludes.txt
# List of functions to exclude from checking, where each entry is a single function to exclude.
exclude-functions:
- (github.com/go-kit/log.Logger).Log
- fmt.Fprintln
- fmt.Fprint
misspell:
locale: US
goconst:
@ -80,3 +75,7 @@ issues:
- linters:
- unused
text: "ruleAndAssert"
# Which dirs to exclude: issues from them won't be reported.
exclude-dirs:
- vendor
- internal/cortex

View File

@ -42,6 +42,11 @@ validators:
type: 'ignore'
- regex: 'twitter\.com'
type: 'ignore'
# 500 when requested by mdox in GH actions.
- regex: 'outshift\.cisco\.com'
- regex: 'outshift\.cisco\.com\/blog\/multi-cluster-monitoring'
type: 'ignore'
# Expired certificate
- regex: 'bestpractices\.coreinfrastructure\.org\/projects\/3048'
type: 'ignore'
# Frequent DNS issues.
- regex: 'build\.thebeat\.co'
type: 'ignore'

View File

@ -1,5 +1,5 @@
go:
version: 1.21
version: 1.23
repository:
path: github.com/thanos-io/thanos
build:
@ -16,7 +16,7 @@ build:
crossbuild:
platforms:
- linux/amd64
- darwin/amd64
- darwin
- linux/arm64
- windows/amd64
- freebsd/amd64

View File

@ -10,11 +10,184 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re
## Unreleased
### Added
### Changed
### Removed
### Fixed
## [v0.39.0](https://github.com/thanos-io/thanos/tree/release-0.39) - 2025 06 25
In short: there are a bunch of fixes and small improvements. The shining items in this release are memory usage improvements in Thanos Query and shuffle sharding support in Thanos Receiver. Information about shuffle sharding support is available in the documentation. Thank you to all contributors!
### Added
- [#8308](https://github.com/thanos-io/thanos/pull/8308) Receive: Prometheus counters for pending write requests and series requests
- [#8225](https://github.com/thanos-io/thanos/pull/8225) tools: Extend bucket ls options.
- [#8238](https://github.com/thanos-io/thanos/pull/8238) Receive: add shuffle sharding support
- [#8284](https://github.com/thanos-io/thanos/pull/8284) Store: Add `--disable-admin-operations` Flag to Store Gateway
- [#8245](https://github.com/thanos-io/thanos/pull/8245) Querier/Query-Frontend/Ruler: Add `--enable-feature=promql-experimental-functions` flag option to enable using promQL experimental functions in respective Thanos components
- [#8259](https://github.com/thanos-io/thanos/pull/8259) Shipper: Add `--shipper.skip-corrupted-blocks` flag to allow `Sync()` to continue upload when finding a corrupted block
### Changed
- [#8282](https://github.com/thanos-io/thanos/pull/8282) Force sync writes to meta.json in case of host crash
- [#8192](https://github.com/thanos-io/thanos/pull/8192) Sidecar: fix default get config timeout
- [#8202](https://github.com/thanos-io/thanos/pull/8202) Receive: Unhide `--tsdb.enable-native-histograms` flag
- [#8315](https://github.com/thanos-io/thanos/pull/8315) Query-Frontend: only ready if downstream is ready
### Removed
- [#8289](https://github.com/thanos-io/thanos/pull/8289) Receive: *breaking :warning:* Removed migration of legacy-TSDB to multi-TSDB. Ensure you are running version >0.13
### Fixed
- [#8199](https://github.com/thanos-io/thanos/pull/8199) Query: handle panics or nil pointer dereference in querier gracefully when query analyze returns nil
- [#8211](https://github.com/thanos-io/thanos/pull/8211) Query: fix panic on nested partial response in distributed instant query
- [#8216](https://github.com/thanos-io/thanos/pull/8216) Query/Receive: fix iter race between `next()` and `stop()` introduced in https://github.com/thanos-io/thanos/pull/7821.
- [#8212](https://github.com/thanos-io/thanos/pull/8212) Receive: Ensure forward/replication metrics are incremented in err cases
- [#8296](https://github.com/thanos-io/thanos/pull/8296) Query: limit LazyRetrieval memory buffer size
## [v0.38.0](https://github.com/thanos-io/thanos/tree/release-0.38) - 03.04.2025
### Fixed
- [#8091](https://github.com/thanos-io/thanos/pull/8091) *: Add POST into allowed CORS methods header
- [#8046](https://github.com/thanos-io/thanos/pull/8046) Query-Frontend: Fix query statistic reporting for range queries when caching is enabled.
- [#7978](https://github.com/thanos-io/thanos/pull/7978) Receive: Fix deadlock during local writes when `split-tenant-label-name` is used
- [#8016](https://github.com/thanos-io/thanos/pull/8016) Query Frontend: Fix @ modifier not being applied correctly on sub queries.
### Added
- [#7907](https://github.com/thanos-io/thanos/pull/7907) Receive: Add `--receive.grpc-service-config` flag to configure gRPC service config for the receivers.
- [#7961](https://github.com/thanos-io/thanos/pull/7961) Store Gateway: Add `--store.posting-group-max-keys` flag to mark posting group as lazy if it exceeds number of keys limit. Added `thanos_bucket_store_lazy_expanded_posting_groups_total` for total number of lazy posting groups and corresponding reasons.
- [#8000](https://github.com/thanos-io/thanos/pull/8000) Query: Bump promql-engine, pass partial response through options
- [#7353](https://github.com/thanos-io/thanos/pull/7353) [#8045](https://github.com/thanos-io/thanos/pull/8045) Receiver/StoreGateway: Add `--matcher-cache-size` option to enable caching for regex matchers in series calls.
- [#8017](https://github.com/thanos-io/thanos/pull/8017) Store Gateway: Use native histogram for binary reader load and download duration and fixed download duration metric. #8017
- [#8131](https://github.com/thanos-io/thanos/pull/8131) Store Gateway: Optimize regex matchers for .* and .+. #8131
- [#7808](https://github.com/thanos-io/thanos/pull/7808) Query: Support chain deduplication algorithm.
- [#8158](https://github.com/thanos-io/thanos/pull/8158) Rule: Add support for query offset.
- [#8110](https://github.com/thanos-io/thanos/pull/8110) Compact: implement native histogram downsampling.
- [#7996](https://github.com/thanos-io/thanos/pull/7996) Receive: Add OTLP endpoint.
### Changed
- [#7890](https://github.com/thanos-io/thanos/pull/7890) Query,Ruler: *breaking :warning:* deprecated `--store.sd-file` and `--store.sd-interval` to be replaced with `--endpoint.sd-config` and `--endpoint-sd-config-reload-interval`; removed legacy flags to pass endpoints `--store`, `--metadata`, `--rule`, `--exemplar`.
- [#7012](https://github.com/thanos-io/thanos/pull/7012) Query: Automatically adjust `max_source_resolution` based on promql query to avoid querying data from higher resolution resulting empty results.
- [#8118](https://github.com/thanos-io/thanos/pull/8118) Query: Bumped promql-engine
- [#8135](https://github.com/thanos-io/thanos/pull/8135) Query: respect partial response in distributed engine
- [#8181](https://github.com/thanos-io/thanos/pull/8181) Deps: bump promql engine
### Removed
## [v0.37.2](https://github.com/thanos-io/thanos/tree/release-0.37) - 11.12.2024
### Fixed
- [#7970](https://github.com/thanos-io/thanos/pull/7970) Sidecar: Respect min-time setting.
- [#7962](https://github.com/thanos-io/thanos/pull/7962) Store: Fix potential deadlock in hedging request.
- [#8175](https://github.com/thanos-io/thanos/pull/8175) Query: fix endpointset setup
### Added
### Changed
### Removed
## [v0.37.1](https://github.com/thanos-io/thanos/tree/release-0.37) - 04.12.2024
### Fixed
- [#7674](https://github.com/thanos-io/thanos/pull/7674) Query-frontend: Fix connection to Redis cluster with TLS.
- [#7945](https://github.com/thanos-io/thanos/pull/7945) Receive: Capnproto - use segment from existing message.
- [#7941](https://github.com/thanos-io/thanos/pull/7941) Receive: Fix race condition when adding multiple new tenants, see [issue-7892](https://github.com/thanos-io/thanos/issues/7892).
- [#7954](https://github.com/thanos-io/thanos/pull/7954) Sidecar: Ensure limit param is positive for compatibility with older Prometheus.
- [#7953](https://github.com/thanos-io/thanos/pull/7953) Query: Update promql-engine for subquery avg fix.
### Added
### Changed
### Removed
## [v0.37.0](https://github.com/thanos-io/thanos/tree/release-0.37) - 25.11.2024
### Fixed
- [#7511](https://github.com/thanos-io/thanos/pull/7511) Query Frontend: fix doubled gzip compression for response body.
- [#7592](https://github.com/thanos-io/thanos/pull/7592) Ruler: Only increment `thanos_rule_evaluation_with_warnings_total` metric for non PromQL warnings.
- [#7614](https://github.com/thanos-io/thanos/pull/7614) *: fix debug log formatting.
- [#7492](https://github.com/thanos-io/thanos/pull/7492) Compactor: update filtered blocks list before second downsample pass.
- [#7658](https://github.com/thanos-io/thanos/pull/7658) Store: Fix panic because too small buffer in pool.
- [#7643](https://github.com/thanos-io/thanos/pull/7643) Receive: fix thanos_receive_write_{timeseries,samples} stats
- [#7644](https://github.com/thanos-io/thanos/pull/7644) fix(ui): add null check to find overlapping blocks logic
- [#7674](https://github.com/thanos-io/thanos/pull/7674) Query-frontend: Fix connection to Redis cluster with TLS.
- [#7814](https://github.com/thanos-io/thanos/pull/7814) Store: label_values: if matchers contain __name__=="something", do not add <labelname> != "" to fetch less postings.
- [#7679](https://github.com/thanos-io/thanos/pull/7679) Query: respect store.limit.* flags when evaluating queries
- [#7821](https://github.com/thanos-io/thanos/pull/7821) Query/Receive: Fix coroutine leak introduced in https://github.com/thanos-io/thanos/pull/7796.
- [#7843](https://github.com/thanos-io/thanos/pull/7843) Query Frontend: fix slow query logging for non-query endpoints.
- [#7852](https://github.com/thanos-io/thanos/pull/7852) Query Frontend: pass "stats" parameter forward to queriers and fix Prometheus stats merging.
- [#7832](https://github.com/thanos-io/thanos/pull/7832) Query Frontend: Fix cache keys for dynamic split intervals.
- [#7885](https://github.com/thanos-io/thanos/pull/7885) Store: Return chunks to the pool after completing a Series call.
- [#7893](https://github.com/thanos-io/thanos/pull/7893) Sidecar: Fix retrieval of external labels for Prometheus v3.0.0.
- [#7903](https://github.com/thanos-io/thanos/pull/7903) Query: Fix panic on regex store matchers.
- [#7915](https://github.com/thanos-io/thanos/pull/7915) Store: Close block series client at the end to not reuse chunk buffer
- [#7941](https://github.com/thanos-io/thanos/pull/7941) Receive: Fix race condition when adding multiple new tenants, see [issue-7892](https://github.com/thanos-io/thanos/issues/7892).
### Added
- [#7763](https://github.com/thanos-io/thanos/pull/7763) Ruler: use native histograms for client latency metrics.
- [#7609](https://github.com/thanos-io/thanos/pull/7609) API: Add limit param to metadata APIs (series, label names, label values).
- [#7429](https://github.com/thanos-io/thanos/pull/7429): Reloader: introduce `TolerateEnvVarExpansionErrors` to allow suppressing errors when expanding environment variables in the configuration file. When set, this will ensure that the reloader won't consider the operation to fail when an unset environment variable is encountered. Note that all unset environment variables are left as is, whereas all set environment variables are expanded as usual.
- [#7560](https://github.com/thanos-io/thanos/pull/7560) Query: Added the possibility of filtering rules by rule_name, rule_group or file to HTTP api.
- [#7652](https://github.com/thanos-io/thanos/pull/7652) Store: Implement metadata API limit in stores.
- [#7659](https://github.com/thanos-io/thanos/pull/7659) Receive: Add support for replication using [Cap'n Proto](https://capnproto.org/). This protocol has a lower CPU and memory footprint, which leads to a reduction in resource usage in Receivers. Before enabling it, make sure that all receivers are updated to a version which supports this replication method.
- [#7853](https://github.com/thanos-io/thanos/pull/7853) UI: Add support for selecting graph time range with mouse drag.
- [#7855](https://github.com/thanos-io/thanos/pull/7855) Compcat/Query: Add support for comma separated replica labels.
- [#7654](https://github.com/thanos-io/thanos/pull/7654) *: Add '--grpc-server-tls-min-version' flag to allow user to specify TLS version, otherwise default to TLS 1.3
- [#7854](https://github.com/thanos-io/thanos/pull/7854) Query Frontend: Add `--query-frontend.force-query-stats` flag to force collection of query statistics from upstream queriers.
- [#7860](https://github.com/thanos-io/thanos/pull/7860) Store: Support hedged requests
- [#7924](https://github.com/thanos-io/thanos/pull/7924) *: Upgrade promql-engine to `v0.0.0-20241106100125-097e6e9f425a` and objstore to `v0.0.0-20241111205755-d1dd89d41f97`
- [#7835](https://github.com/thanos-io/thanos/pull/7835) Ruler: Add ability to do concurrent rule evaluations
- [#7722](https://github.com/thanos-io/thanos/pull/7722) Query: Add partition labels flag to partition leaf querier in distributed mode
### Changed
- [#7494](https://github.com/thanos-io/thanos/pull/7494) Ruler: remove trailing period from SRV records returned by discovery `dnsnosrva` lookups
- [#7567](https://github.com/thanos-io/thanos/pull/7565) Query: Use thanos resolver for endpoint groups.
- [#7741](https://github.com/thanos-io/thanos/pull/7741) Deps: Bump Objstore to `v0.0.0-20240913074259-63feed0da069`
- [#7813](https://github.com/thanos-io/thanos/pull/7813) Receive: enable initial TSDB compaction time randomization
- [#7820](https://github.com/thanos-io/thanos/pull/7820) Sidecar: Use prometheus metrics for min timestamp
- [#7886](https://github.com/thanos-io/thanos/pull/7886) Discovery: Preserve results from other resolve calls
- [#7745](https://github.com/thanos-io/thanos/pull/7745) *: Build with Prometheus stringlabels tags
- [#7669](https://github.com/thanos-io/thanos/pull/7669) Receive: Change quorum calculation for rf=2
### Removed
- [#7704](https://github.com/thanos-io/thanos/pull/7704) *: *breaking :warning:* remove Store gRPC Info function. This has been deprecated for 3 years, its time to remove it.
- [#7793](https://github.com/thanos-io/thanos/pull/7793) Receive: Disable dedup proxy in multi-tsdb
- [#7678](https://github.com/thanos-io/thanos/pull/7678) Query: Skip formatting strings if debug logging is disabled
## [v0.36.1](https://github.com/thanos-io/thanos/tree/release-0.36)
### Fixed
- [#7634](https://github.com/thanos-io/thanos/pull/7634) Rule: fix Query and Alertmanager TLS configurations with CA only.
- [#7618](https://github.com/thanos-io/thanos/pull/7618) Proxy: Query goroutine leak when store.response-timeout is set
### Added
### Changed
### Removed
## [v0.36.0](https://github.com/thanos-io/thanos/tree/release-0.36)
### Fixed
- [#7326](https://github.com/thanos-io/thanos/pull/7326) Query: fixing exemplars proxy when querying stores with multiple tenants.
- [#7403](https://github.com/thanos-io/thanos/pull/7403) Sidecar: fix startup sequence
- [#7484](https://github.com/thanos-io/thanos/pull/7484) Proxy: fix panic in lazy response set
- [#7493](https://github.com/thanos-io/thanos/pull/7493) *: fix server grpc histograms
### Added
@ -299,7 +472,7 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re
- [#6342](https://github.com/thanos-io/thanos/pull/6342) Cache/Redis: Upgrade `rueidis` to v1.0.2 to to improve error handling while shrinking a redis cluster.
- [#6325](https://github.com/thanos-io/thanos/pull/6325) Store: return gRPC resource exhausted error for byte limiter.
- [#6399](https://github.com/thanos-io/thanos/pull/6399) *: Fix double-counting bug in http_request_duration metric
- [#6428](https://github.com/thanos-io/thanos/pull/6428) Report gRPC connnection errors in the logs.
- [#6428](https://github.com/thanos-io/thanos/pull/6428) Report gRPC connection errors in the logs.
- [#6519](https://github.com/thanos-io/thanos/pull/6519) Reloader: Use timeout for initial apply.
- [#6509](https://github.com/thanos-io/thanos/pull/6509) Store Gateway: Remove `memWriter` from `fileWriter` to reduce memory usage when sync index headers.
- [#6556](https://github.com/thanos-io/thanos/pull/6556) Thanos compact: respect block-files-concurrency setting when downsampling
@ -421,7 +594,7 @@ NOTE: Querier's `query.promql-engine` flag enabling new PromQL engine is now unh
- [#5889](https://github.com/thanos-io/thanos/pull/5889) Query Frontend: Added support for vertical sharding `label_replace` and `label_join` functions.
- [#5865](https://github.com/thanos-io/thanos/pull/5865) Compact: Retry on sync metas error.
- [#5819](https://github.com/thanos-io/thanos/pull/5819) Store: Added a few objectives for Store's data summaries (touched/fetched amount and sizes). They are: 50, 95, and 99 quantiles.
- [#5837](https://github.com/thanos-io/thanos/pull/5837) Store: Added streaming retrival of series from object storage.
- [#5837](https://github.com/thanos-io/thanos/pull/5837) Store: Added streaming retrieval of series from object storage.
- [#5940](https://github.com/thanos-io/thanos/pull/5940) Objstore: Support for authenticating to Swift using application credentials.
- [#5945](https://github.com/thanos-io/thanos/pull/5945) Tools: Added new `no-downsample` marker to skip blocks when downsampling via `thanos tools bucket mark --marker=no-downsample-mark.json`. This will skip downsampling for blocks with the new marker.
- [#5977](https://github.com/thanos-io/thanos/pull/5977) Tools: Added remove flag on bucket mark command to remove deletion, no-downsample or no-compact markers on the block
@ -577,7 +750,7 @@ NOTE: Querier's `query.promql-engine` flag enabling new PromQL engine is now unh
- [#5170](https://github.com/thanos-io/thanos/pull/5170) All: Upgraded the TLS version from TLS1.2 to TLS1.3.
- [#5205](https://github.com/thanos-io/thanos/pull/5205) Rule: Add ruler labels as external labels in stateless ruler mode.
- [#5206](https://github.com/thanos-io/thanos/pull/5206) Cache: Add timeout for groupcache's fetch operation.
- [#5218](https://github.com/thanos-io/thanos/pull/5218) Tools: Thanos tools bucket downsample is now running continously.
- [#5218](https://github.com/thanos-io/thanos/pull/5218) Tools: Thanos tools bucket downsample is now running continuously.
- [#5231](https://github.com/thanos-io/thanos/pull/5231) Tools: Bucket verify tool ignores blocks with deletion markers.
- [#5244](https://github.com/thanos-io/thanos/pull/5244) Query: Promote negative offset and `@` modifier to stable features as per Prometheus [#10121](https://github.com/prometheus/prometheus/pull/10121).
- [#5255](https://github.com/thanos-io/thanos/pull/5255) InfoAPI: Set store API unavailable when stores are not ready.
@ -1337,7 +1510,7 @@ sse_config:
- [#1666](https://github.com/thanos-io/thanos/pull/1666) Compact: `thanos_compact_group_compactions_total` now counts block compactions, so operations that resulted in a compacted block. The old behaviour is now exposed by new metric: `thanos_compact_group_compaction_runs_started_total` and `thanos_compact_group_compaction_runs_completed_total` which counts compaction runs overall.
- [#1748](https://github.com/thanos-io/thanos/pull/1748) Updated all dependencies.
- [#1694](https://github.com/thanos-io/thanos/pull/1694) `prober_ready` and `prober_healthy` metrics are removed, for sake of `status`. Now `status` exposes same metric with a label, `check`. `check` can have "healty" or "ready" depending on status of the probe.
- [#1694](https://github.com/thanos-io/thanos/pull/1694) `prober_ready` and `prober_healthy` metrics are removed, for sake of `status`. Now `status` exposes same metric with a label, `check`. `check` can have "healthy" or "ready" depending on status of the probe.
- [#1790](https://github.com/thanos-io/thanos/pull/1790) Ruler: Fixes subqueries support for ruler.
- [#1769](https://github.com/thanos-io/thanos/pull/1769) & [#1545](https://github.com/thanos-io/thanos/pull/1545) Adjusted most of the metrics histogram buckets.
@ -1569,7 +1742,7 @@ This version moved tarballs to Golang 1.12.5 from 1.11 as well, so same warning
- query:
- [BUGFIX] Make sure subquery range is taken into account for selection #5467
- [ENHANCEMENT] Check for cancellation on every step of a range evaluation. #5131
- [BUGFIX] Exponentation operator to drop metric name in result of operation. #5329
- [BUGFIX] Exponentiation operator to drop metric name in result of operation. #5329
- [BUGFIX] Fix output sample values for scalar-to-vector comparison operations. #5454
- rule:
- [BUGFIX] Reload rules: copy state on both name and labels. #5368

View File

@ -68,7 +68,7 @@ The following section explains various suggestions and procedures to note during
* It is strongly recommended that you use Linux distributions systems or macOS for development.
* Running [WSL 2 (on Windows)](https://learn.microsoft.com/en-us/windows/wsl/) is also possible. Note that if during development you run a local Kubernetes cluster and have a Service with `service.spec.sessionAffinity: ClientIP`, it will break things until it's removed[^windows_xt_recent].
* Go 1.21.x or higher.
* Go 1.22.x or higher.
* Docker (to run e2e tests)
* For React UI, you will need a working NodeJS environment and the npm package manager to compile the Web UI assets.
@ -164,7 +164,7 @@ $ git push origin <your_branch_for_new_pr>
**Formatting**
First of all, fall back to `make help` to see all availible commands. There are a few checks that happen when making a PR and these need to pass. We can make sure locally before making the PR by using commands that are related to your changes:
First of all, fall back to `make help` to see all available commands. There are a few checks that happen when making a PR and these need to pass. We can make sure locally before making the PR by using commands that are related to your changes:
- `make docs` generates, formats and cleans up white noise.
- `make changed-docs` does same as above, but just for changed docs by checking `git diff` on which files are changed.
- `make check-docs` generates, formats, cleans up white noise and checks links. Since it can be annoying to wait on link check results - it takes forever - to skip the check, you can use `make docs`).

View File

@ -1,5 +1,5 @@
# Taking a non-alpine image for e2e tests so that cgo can be enabled for the race detector.
FROM golang:1.21 as builder
FROM golang:1.24.0 as builder
WORKDIR $GOPATH/src/github.com/thanos-io/thanos
@ -8,7 +8,7 @@ COPY . $GOPATH/src/github.com/thanos-io/thanos
RUN CGO_ENABLED=1 go build -o $GOBIN/thanos -race ./cmd/thanos
# -----------------------------------------------------------------------------
FROM golang:1.21
FROM golang:1.24.0
LABEL maintainer="The Thanos Authors"
COPY --from=builder $GOBIN/thanos /bin/thanos

View File

@ -1,6 +1,6 @@
# By default we pin to amd64 sha. Use make docker to automatically adjust for arm64 versions.
ARG BASE_DOCKER_SHA="14d68ca3d69fceaa6224250c83d81d935c053fb13594c811038c461194599973"
FROM golang:1.21-alpine3.18 as builder
FROM golang:1.24.0-alpine3.20 as builder
WORKDIR $GOPATH/src/github.com/thanos-io/thanos
# Change in the docker context invalidates the cache so to leverage docker

View File

@ -5,10 +5,9 @@
| Bartłomiej Płotka | bwplotka@gmail.com | `@bwplotka` | [@bwplotka](https://github.com/bwplotka) | Google |
| Frederic Branczyk | fbranczyk@gmail.com | `@brancz` | [@brancz](https://github.com/brancz) | Polar Signals |
| Giedrius Statkevičius | giedriuswork@gmail.com | `@Giedrius Statkevičius` | [@GiedriusS](https://github.com/GiedriusS) | Vinted |
| Kemal Akkoyun | kakkoyun@gmail.com | `@kakkoyun` | [@kakkoyun](https://github.com/kakkoyun) | Fal |
| Kemal Akkoyun | kakkoyun@gmail.com | `@kakkoyun` | [@kakkoyun](https://github.com/kakkoyun) | Independent |
| Lucas Servén Marín | lserven@gmail.com | `@squat` | [@squat](https://github.com/squat) | Red Hat |
| Prem Saraswat | prmsrswt@gmail.com | `@Prem Saraswat` | [@onprem](https://github.com/onprem) | Red Hat |
| Matthias Loibl | mail@matthiasloibl.com | `@metalmatze` | [@metalmatze](https://github.com/metalmatze) | Polar Signals |
| Ben Ye | yb532204897@gmail.com | `@yeya24` | [@yeya24](https://github.com/yeya24) | Amazon Web Services |
| Matej Gera | matejgera@gmail.com | `@Matej Gera` | [@matej-g](https://github.com/matej-g) | Coralogix |
| Filip Petkovski | filip.petkovsky@gmail.com | `@Filip Petkovski` | [@fpetkovski](https://github.com/fpetkovski) | Shopify |
@ -31,15 +30,17 @@ We also have some nice souls that help triaging issues and PRs. See [here](https
Full list of triage persons is displayed below:
| Name | Slack | GitHub | Company |
|----------------|------------------|----------------------------------------------------|---------|
| Adrien Fillon | `@Adrien F` | [@adrien-f](https://github.com/adrien-f) | |
| Ian Billett | `@billett` | [@bill3tt](https://github.com/bill3tt) | Red Hat |
| Martin Chodur | `@FUSAKLA` | [@fusakla](https://github.com/fusakla) | |
| Michael Dai | `@jojohappy` | [@jojohappy](https://github.com/jojohappy) | |
| Xiang Dai | `@daixiang0` | [@daixiang0](https://github.com/daixiang0) | |
| Jimmie Han | `@hanjm` | [@hanjm](https://github.com/hanjm) | Tencent |
| Douglas Camata | `@douglascamata` | [@douglascamata](https://github.com/douglascamata) | Red Hat |
| Name | Slack | GitHub | Company |
|----------------|------------------|----------------------------------------------------|---------------------|
| Adrien Fillon | `@Adrien F` | [@adrien-f](https://github.com/adrien-f) | |
| Ian Billett | `@billett` | [@bill3tt](https://github.com/bill3tt) | Red Hat |
| Martin Chodur | `@FUSAKLA` | [@fusakla](https://github.com/fusakla) | |
| Michael Dai | `@jojohappy` | [@jojohappy](https://github.com/jojohappy) | |
| Xiang Dai | `@daixiang0` | [@daixiang0](https://github.com/daixiang0) | |
| Jimmie Han | `@hanjm` | [@hanjm](https://github.com/hanjm) | Tencent |
| Douglas Camata | `@douglascamata` | [@douglascamata](https://github.com/douglascamata) | Red Hat |
| Harry John | `@harry671003` | [@harry671003](https://github.com/harry671003) | Amazon Web Services |
| Pedro Tanaka | `@pedro.tanaka` | [@pedro-stanaka](https://github.com/pedro-stanaka) | Shopify |
Please reach any of the maintainer on slack or email if you want to help as well.
@ -107,4 +108,4 @@ Fabian Reinartz @fabxc and Bartłomiej Płotka @bwplotka
## Previous Maintainers
Dominic Green, Povilas Versockas, Marco Pracucci
Dominic Green, Povilas Versockas, Marco Pracucci, Matthias Loibl

View File

@ -295,6 +295,13 @@ proto: ## Generates Go files from Thanos proto files.
proto: check-git $(GOIMPORTS) $(PROTOC) $(PROTOC_GEN_GOGOFAST)
@GOIMPORTS_BIN="$(GOIMPORTS)" PROTOC_BIN="$(PROTOC)" PROTOC_GEN_GOGOFAST_BIN="$(PROTOC_GEN_GOGOFAST)" PROTOC_VERSION="$(PROTOC_VERSION)" scripts/genproto.sh
.PHONY: capnp
capnp: ## Generates Go files from Thanos capnproto files.
capnp: check-git
capnp compile -I $(shell go list -m -f '{{.Dir}}' capnproto.org/go/capnp/v3)/std -ogo pkg/receive/writecapnp/write_request.capnp
@$(GOIMPORTS) -w pkg/receive/writecapnp/write_request.capnp.go
go run ./scripts/copyright
.PHONY: tarballs-release
tarballs-release: ## Build tarballs.
tarballs-release: $(PROMU)
@ -333,7 +340,11 @@ test-e2e: docker-e2e $(GOTESPLIT)
# NOTE(GiedriusS):
# * If you want to limit CPU time available in e2e tests then pass E2E_DOCKER_CPUS environment variable. For example, E2E_DOCKER_CPUS=0.05 limits CPU time available
# to spawned Docker containers to 0.05 cores.
@$(GOTESPLIT) -total ${GH_PARALLEL} -index ${GH_INDEX} ./test/e2e/... -- ${GOTEST_OPTS}
@if [ -n "$(SINGLE_E2E_TEST)" ]; then \
$(GOTESPLIT) -total ${GH_PARALLEL} -index ${GH_INDEX} ./test/e2e -- -run $(SINGLE_E2E_TEST) ${GOTEST_OPTS}; \
else \
$(GOTESPLIT) -total ${GH_PARALLEL} -index ${GH_INDEX} ./test/e2e/... -- ${GOTEST_OPTS}; \
fi
.PHONY: test-e2e-local
test-e2e-local: ## Runs all thanos e2e tests locally.
@@ -395,8 +406,7 @@ go-lint: check-git deps $(GOLANGCI_LINT) $(FAILLINT)
$(call require_clean_work_tree,'detected not clean work tree before running lint, previous job changed something?')
@echo ">> verifying modules being imported"
@# TODO(bwplotka): Add, Printf, DefaultRegisterer, NewGaugeFunc and MustRegister once exception are accepted. Add fmt.{Errorf}=github.com/pkg/errors.{Errorf} once https://github.com/fatih/faillint/issues/10 is addressed.
@$(FAILLINT) -paths "errors=github.com/pkg/errors,\
github.com/prometheus/tsdb=github.com/prometheus/prometheus/tsdb,\
@$(FAILLINT) -paths "github.com/prometheus/tsdb=github.com/prometheus/prometheus/tsdb,\
github.com/prometheus/prometheus/pkg/testutils=github.com/thanos-io/thanos/pkg/testutil,\
github.com/prometheus/client_golang/prometheus.{DefaultGatherer,DefBuckets,NewUntypedFunc,UntypedFunc},\
github.com/prometheus/client_golang/prometheus.{NewCounter,NewCounterVec,NewCounterVec,NewGauge,NewGaugeVec,NewGaugeFunc,\


@@ -1 +1 @@
0.36.0-dev
0.40.0-dev


@@ -42,10 +42,12 @@ import (
"github.com/thanos-io/thanos/pkg/extprom"
extpromhttp "github.com/thanos-io/thanos/pkg/extprom/http"
"github.com/thanos-io/thanos/pkg/logging"
"github.com/thanos-io/thanos/pkg/logutil"
"github.com/thanos-io/thanos/pkg/prober"
"github.com/thanos-io/thanos/pkg/runutil"
httpserver "github.com/thanos-io/thanos/pkg/server/http"
"github.com/thanos-io/thanos/pkg/store"
"github.com/thanos-io/thanos/pkg/strutil"
"github.com/thanos-io/thanos/pkg/tracing"
"github.com/thanos-io/thanos/pkg/ui"
)
@@ -205,7 +207,7 @@ func runCompact(
return err
}
bkt, err := client.NewBucket(logger, confContentYaml, component.String())
bkt, err := client.NewBucket(logger, confContentYaml, component.String(), nil)
if err != nil {
return err
}
@@ -254,10 +256,11 @@ func runCompact(
}
enableVerticalCompaction := conf.enableVerticalCompaction
if len(conf.dedupReplicaLabels) > 0 {
dedupReplicaLabels := strutil.ParseFlagLabels(conf.dedupReplicaLabels)
if len(dedupReplicaLabels) > 0 {
enableVerticalCompaction = true
level.Info(logger).Log(
"msg", "deduplication.replica-label specified, enabling vertical compaction", "dedupReplicaLabels", strings.Join(conf.dedupReplicaLabels, ","),
"msg", "deduplication.replica-label specified, enabling vertical compaction", "dedupReplicaLabels", strings.Join(dedupReplicaLabels, ","),
)
}
if enableVerticalCompaction {
@@ -275,7 +278,7 @@ func runCompact(
labelShardedMetaFilter,
consistencyDelayMetaFilter,
ignoreDeletionMarkFilter,
block.NewReplicaLabelRemover(logger, conf.dedupReplicaLabels),
block.NewReplicaLabelRemover(logger, dedupReplicaLabels),
duplicateBlocksFilter,
noCompactMarkerFilter,
}
@@ -288,6 +291,11 @@ func runCompact(
cf.UpdateOnChange(func(blocks []metadata.Meta, err error) {
api.SetLoaded(blocks, err)
})
var syncMetasTimeout = conf.waitInterval
if !conf.wait {
syncMetasTimeout = 0
}
sy, err = compact.NewMetaSyncer(
logger,
reg,
@@ -297,6 +305,7 @@ func runCompact(
ignoreDeletionMarkFilter,
compactMetrics.blocksMarked.WithLabelValues(metadata.DeletionMarkFilename, ""),
compactMetrics.garbageCollectedBlocks,
syncMetasTimeout,
)
if err != nil {
return errors.Wrap(err, "create syncer")
@@ -326,7 +335,7 @@ func runCompact(
case compact.DedupAlgorithmPenalty:
mergeFunc = dedup.NewChunkSeriesMerger()
if len(conf.dedupReplicaLabels) == 0 {
if len(dedupReplicaLabels) == 0 {
return errors.New("penalty based deduplication needs at least one replica label specified")
}
case "":
@@ -338,7 +347,7 @@ func runCompact(
// Instantiate the compactor with different time slices. Timestamps in TSDB
// are in milliseconds.
comp, err := tsdb.NewLeveledCompactor(ctx, reg, logger, levels, downsample.NewPool(), mergeFunc)
comp, err := tsdb.NewLeveledCompactor(ctx, reg, logutil.GoKitLogToSlog(logger), levels, downsample.NewPool(), mergeFunc)
if err != nil {
return errors.Wrap(err, "create compactor")
}
@@ -488,6 +497,14 @@ func runCompact(
return errors.Wrap(err, "sync before second pass of downsampling")
}
// Regenerate the filtered list of blocks after the sync,
// to include the blocks created by the first pass.
filteredMetas = sy.Metas()
noDownsampleBlocks = noDownsampleMarkerFilter.NoDownsampleMarkedBlocks()
for ul := range noDownsampleBlocks {
delete(filteredMetas, ul)
}
if err := downsampleBucket(
ctx,
logger,
@@ -810,8 +827,9 @@ func (cc *compactConfig) registerFlag(cmd extkingpin.FlagClause) {
"When set to penalty, penalty based deduplication algorithm will be used. At least one replica label has to be set via --deduplication.replica-label flag.").
Default("").EnumVar(&cc.dedupFunc, compact.DedupAlgorithmPenalty, "")
cmd.Flag("deduplication.replica-label", "Label to treat as a replica indicator of blocks that can be deduplicated (repeated flag). This will merge multiple replica blocks into one. This process is irreversible."+
"Experimental. When one or more labels are set, compactor will ignore the given labels so that vertical compaction can merge the blocks."+
cmd.Flag("deduplication.replica-label", "Experimental. Label to treat as a replica indicator of blocks that can be deduplicated (repeated flag). This will merge multiple replica blocks into one. This process is irreversible. "+
"Flag may be specified multiple times as well as a comma separated list of labels. "+
"When one or more labels are set, compactor will ignore the given labels so that vertical compaction can merge the blocks. "+
"Please note that by default this uses a NAIVE algorithm for merging which works well for deduplication of blocks with **precisely the same samples** like produced by Receiver replication. "+
"If you need a different deduplication algorithm (e.g one that works well with Prometheus replicas), please set it via --deduplication.func.").
StringsVar(&cc.dedupReplicaLabels)


@@ -8,18 +8,23 @@ package main
import (
"net/url"
"sort"
"strconv"
"strings"
"time"
"github.com/KimMachineGun/automemlimit/memlimit"
extflag "github.com/efficientgo/tools/extkingpin"
"github.com/go-kit/log"
"github.com/opentracing/opentracing-go"
"github.com/pkg/errors"
"google.golang.org/grpc"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/common/model"
"github.com/prometheus/prometheus/model/labels"
"github.com/thanos-io/thanos/pkg/extgrpc"
"github.com/thanos-io/thanos/pkg/extgrpc/snappy"
"github.com/thanos-io/thanos/pkg/extkingpin"
"github.com/thanos-io/thanos/pkg/shipper"
)
@@ -29,6 +34,7 @@ type grpcConfig struct {
tlsSrvCert string
tlsSrvKey string
tlsSrvClientCA string
tlsMinVersion string
gracePeriod time.Duration
maxConnectionAge time.Duration
}
@@ -46,6 +52,9 @@ func (gc *grpcConfig) registerFlag(cmd extkingpin.FlagClause) *grpcConfig {
cmd.Flag("grpc-server-tls-client-ca",
"TLS CA to verify clients against. If no client CA is specified, there is no client verification on server side. (tls.NoClientCert)").
Default("").StringVar(&gc.tlsSrvClientCA)
cmd.Flag("grpc-server-tls-min-version",
"TLS supported minimum version for gRPC server. If no version is specified, it'll default to 1.3. Allowed values: [\"1.0\", \"1.1\", \"1.2\", \"1.3\"]").
Default("1.3").StringVar(&gc.tlsMinVersion)
cmd.Flag("grpc-server-max-connection-age", "The grpc server max connection age. This controls how often to re-establish connections and redo TLS handshakes.").
Default("60m").DurationVar(&gc.maxConnectionAge)
cmd.Flag("grpc-grace-period",
@@ -55,6 +64,38 @@ func (gc *grpcConfig) registerFlag(cmd extkingpin.FlagClause) *grpcConfig {
return gc
}
type grpcClientConfig struct {
secure bool
skipVerify bool
cert, key, caCert string
serverName string
compression string
}
func (gc *grpcClientConfig) registerFlag(cmd extkingpin.FlagClause) *grpcClientConfig {
cmd.Flag("grpc-client-tls-secure", "Use TLS when talking to the gRPC server").Default("false").BoolVar(&gc.secure)
cmd.Flag("grpc-client-tls-skip-verify", "Disable TLS certificate verification i.e self signed, signed by fake CA").Default("false").BoolVar(&gc.skipVerify)
cmd.Flag("grpc-client-tls-cert", "TLS Certificates to use to identify this client to the server").Default("").StringVar(&gc.cert)
cmd.Flag("grpc-client-tls-key", "TLS Key for the client's certificate").Default("").StringVar(&gc.key)
cmd.Flag("grpc-client-tls-ca", "TLS CA Certificates to use to verify gRPC servers").Default("").StringVar(&gc.caCert)
cmd.Flag("grpc-client-server-name", "Server name to verify the hostname on the returned gRPC certificates. See https://tools.ietf.org/html/rfc4366#section-3.1").Default("").StringVar(&gc.serverName)
compressionOptions := strings.Join([]string{snappy.Name, compressionNone}, ", ")
cmd.Flag("grpc-compression", "Compression algorithm to use for gRPC requests to other clients. Must be one of: "+compressionOptions).Default(compressionNone).EnumVar(&gc.compression, snappy.Name, compressionNone)
return gc
}
func (gc *grpcClientConfig) dialOptions(logger log.Logger, reg prometheus.Registerer, tracer opentracing.Tracer) ([]grpc.DialOption, error) {
dialOpts, err := extgrpc.StoreClientGRPCOpts(logger, reg, tracer, gc.secure, gc.skipVerify, gc.cert, gc.key, gc.caCert, gc.serverName)
if err != nil {
return nil, errors.Wrapf(err, "building gRPC client")
}
if gc.compression != compressionNone {
dialOpts = append(dialOpts, grpc.WithDefaultCallOptions(grpc.UseCompressor(gc.compression)))
}
return dialOpts, nil
}
type httpConfig struct {
bindAddress string
tlsConfig string
@@ -95,7 +136,7 @@ func (pc *prometheusConfig) registerFlag(cmd extkingpin.FlagClause) *prometheusC
Default("30s").DurationVar(&pc.getConfigInterval)
cmd.Flag("prometheus.get_config_timeout",
"Timeout for getting Prometheus config").
Default("5s").DurationVar(&pc.getConfigTimeout)
Default("30s").DurationVar(&pc.getConfigTimeout)
pc.httpClient = extflag.RegisterPathOrContent(
cmd,
"prometheus.http-client",
@@ -162,6 +203,7 @@ type shipperConfig struct {
uploadCompacted bool
ignoreBlockSize bool
allowOutOfOrderUpload bool
skipCorruptedBlocks bool
hashFunc string
metaFileName string
}
@@ -178,6 +220,11 @@ func (sc *shipperConfig) registerFlag(cmd extkingpin.FlagClause) *shipperConfig
"This can trigger compaction without those blocks and as a result will create an overlap situation. Set it to true if you have vertical compaction enabled and wish to upload blocks as soon as possible without caring"+
"about order.").
Default("false").Hidden().BoolVar(&sc.allowOutOfOrderUpload)
cmd.Flag("shipper.skip-corrupted-blocks",
"If true, shipper will skip corrupted blocks in the given iteration and retry later. This means that some newer blocks might be uploaded sooner than older blocks. "+
"This can trigger compaction without those blocks and as a result will create an overlap situation. Set it to true if you have vertical compaction enabled and wish to upload blocks as soon as possible without caring "+
"about order.").
Default("false").Hidden().BoolVar(&sc.skipCorruptedBlocks)
cmd.Flag("hash-func", "Specify which hash function to use when calculating the hashes of produced files. If no function has been specified, it does not happen. This permits avoiding downloading some files twice albeit at some performance cost. Possible values are: \"\", \"SHA256\".").
Default("").EnumVar(&sc.hashFunc, "SHA256", "")
cmd.Flag("shipper.meta-file-name", "the file to store shipper metadata in").Default(shipper.DefaultMetaFilename).StringVar(&sc.metaFileName)
@@ -266,23 +313,23 @@ func (ac *alertMgrConfig) registerFlag(cmd extflag.FlagClause) *alertMgrConfig {
}
func parseFlagLabels(s []string) (labels.Labels, error) {
var lset labels.Labels
var lset labels.ScratchBuilder
for _, l := range s {
parts := strings.SplitN(l, "=", 2)
if len(parts) != 2 {
return nil, errors.Errorf("unrecognized label %q", l)
return labels.EmptyLabels(), errors.Errorf("unrecognized label %q", l)
}
if !model.LabelName.IsValid(model.LabelName(parts[0])) {
return nil, errors.Errorf("unsupported format for label %s", l)
return labels.EmptyLabels(), errors.Errorf("unsupported format for label %s", l)
}
val, err := strconv.Unquote(parts[1])
if err != nil {
return nil, errors.Wrap(err, "unquote label value")
return labels.EmptyLabels(), errors.Wrap(err, "unquote label value")
}
lset = append(lset, labels.Label{Name: parts[0], Value: val})
lset.Add(parts[0], val)
}
sort.Sort(lset)
return lset, nil
lset.Sort()
return lset.Labels(), nil
}
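The rewrite above migrates `parseFlagLabels` from the deprecated slice-append-and-sort pattern to `labels.ScratchBuilder`. A stdlib-only sketch of the same parsing steps (split on the first `=`, `strconv.Unquote` the value), with a plain map standing in for the Prometheus builder and the `model.LabelName` validity check omitted:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseFlagLabels mirrors the flag parsing above: each input has the shape
// name="value"; the value is unquoted with strconv.Unquote. A map stands in
// for labels.ScratchBuilder in this sketch.
func parseFlagLabels(ss []string) (map[string]string, error) {
	lset := make(map[string]string, len(ss))
	for _, l := range ss {
		parts := strings.SplitN(l, "=", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("unrecognized label %q", l)
		}
		val, err := strconv.Unquote(parts[1])
		if err != nil {
			return nil, fmt.Errorf("unquote label value %q: %w", parts[1], err)
		}
		lset[parts[0]] = val
	}
	return lset, nil
}

func main() {
	lset, err := parseFlagLabels([]string{`replica="r-0"`})
	fmt.Println(lset, err)
}
```

The real code additionally validates label names and returns `labels.EmptyLabels()` on error, since `labels.Labels` is no longer a nil-able slice in recent Prometheus versions.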
type goMemLimitConfig struct {


@@ -15,13 +15,13 @@ import (
"github.com/go-kit/log"
"github.com/go-kit/log/level"
"github.com/oklog/run"
"github.com/oklog/ulid"
"github.com/oklog/ulid/v2"
"github.com/pkg/errors"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promauto"
"github.com/prometheus/prometheus/tsdb"
"github.com/prometheus/prometheus/tsdb/chunkenc"
"github.com/thanos-io/thanos/pkg/compact"
"github.com/thanos-io/objstore"
"github.com/thanos-io/objstore/client"
@@ -29,10 +29,12 @@ import (
"github.com/thanos-io/thanos/pkg/block"
"github.com/thanos-io/thanos/pkg/block/metadata"
"github.com/thanos-io/thanos/pkg/compact"
"github.com/thanos-io/thanos/pkg/compact/downsample"
"github.com/thanos-io/thanos/pkg/component"
"github.com/thanos-io/thanos/pkg/errutil"
"github.com/thanos-io/thanos/pkg/extprom"
"github.com/thanos-io/thanos/pkg/logutil"
"github.com/thanos-io/thanos/pkg/prober"
"github.com/thanos-io/thanos/pkg/runutil"
httpserver "github.com/thanos-io/thanos/pkg/server/http"
@@ -84,7 +86,7 @@ func RunDownsample(
return err
}
bkt, err := client.NewBucket(logger, confContentYaml, component.Downsample.String())
bkt, err := client.NewBucket(logger, confContentYaml, component.Downsample.String(), nil)
if err != nil {
return err
}
@@ -376,7 +378,7 @@ func processDownsampling(
pool = downsample.NewPool()
}
b, err := tsdb.OpenBlock(logger, bdir, pool)
b, err := tsdb.OpenBlock(logutil.GoKitLogToSlog(logger), bdir, pool, nil)
if err != nil {
return errors.Wrapf(err, "open block %s", m.ULID)
}

cmd/thanos/endpointset.go (new file, 349 lines)

@@ -0,0 +1,349 @@
// Copyright (c) The Thanos Authors.
// Licensed under the Apache License 2.0.
package main
import (
"context"
"fmt"
"sync"
"time"
"github.com/go-kit/log"
"github.com/go-kit/log/level"
"github.com/oklog/run"
"google.golang.org/grpc"
"gopkg.in/yaml.v3"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promauto"
"github.com/prometheus/common/model"
"github.com/prometheus/prometheus/discovery"
"github.com/prometheus/prometheus/discovery/file"
"github.com/prometheus/prometheus/discovery/targetgroup"
"github.com/thanos-io/thanos/pkg/component"
"github.com/thanos-io/thanos/pkg/discovery/cache"
"github.com/thanos-io/thanos/pkg/discovery/dns"
"github.com/thanos-io/thanos/pkg/errors"
"github.com/thanos-io/thanos/pkg/extgrpc"
"github.com/thanos-io/thanos/pkg/extkingpin"
"github.com/thanos-io/thanos/pkg/extprom"
"github.com/thanos-io/thanos/pkg/logutil"
"github.com/thanos-io/thanos/pkg/query"
"github.com/thanos-io/thanos/pkg/runutil"
)
// fileContent is the interface of methods that we need from extkingpin.PathOrContent.
// We need to abstract it for now so we can implement a default if the user does not provide one.
type fileContent interface {
Content() ([]byte, error)
Path() string
}
type endpointSettings struct {
Strict bool `yaml:"strict"`
Group bool `yaml:"group"`
Address string `yaml:"address"`
}
type EndpointConfig struct {
Endpoints []endpointSettings `yaml:"endpoints"`
}
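Given the yaml tags on `endpointSettings` and `EndpointConfig` above, the file this provider watches would look roughly like the following (addresses are hypothetical; `strict` and `group` default to false when omitted):

```yaml
endpoints:
  - address: "storegw-0.example.com:10901"
    strict: true
  - address: "thanos-receive.example.com:10901"
    group: true
  - address: "dns+sidecar.example.com:10901"
```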
type endpointConfigProvider struct {
mu sync.Mutex
cfg EndpointConfig
// statically defined endpoints from flags for backwards compatibility
endpoints []string
endpointGroups []string
strictEndpoints []string
strictEndpointGroups []string
}
func (er *endpointConfigProvider) config() EndpointConfig {
er.mu.Lock()
defer er.mu.Unlock()
res := EndpointConfig{Endpoints: make([]endpointSettings, len(er.cfg.Endpoints))}
copy(res.Endpoints, er.cfg.Endpoints)
return res
}
func (er *endpointConfigProvider) parse(configFile fileContent) (EndpointConfig, error) {
content, err := configFile.Content()
if err != nil {
return EndpointConfig{}, errors.Wrapf(err, "unable to load config content: %s", configFile.Path())
}
var cfg EndpointConfig
if err := yaml.Unmarshal(content, &cfg); err != nil {
return EndpointConfig{}, errors.Wrapf(err, "unable to unmarshal config content: %s", configFile.Path())
}
return cfg, nil
}
func (er *endpointConfigProvider) addStaticEndpoints(cfg *EndpointConfig) {
for _, e := range er.endpoints {
cfg.Endpoints = append(cfg.Endpoints, endpointSettings{
Address: e,
})
}
for _, e := range er.endpointGroups {
cfg.Endpoints = append(cfg.Endpoints, endpointSettings{
Address: e,
Group: true,
})
}
for _, e := range er.strictEndpoints {
cfg.Endpoints = append(cfg.Endpoints, endpointSettings{
Address: e,
Strict: true,
})
}
for _, e := range er.strictEndpointGroups {
cfg.Endpoints = append(cfg.Endpoints, endpointSettings{
Address: e,
Group: true,
Strict: true,
})
}
}
func validateEndpointConfig(cfg EndpointConfig) error {
for _, ecfg := range cfg.Endpoints {
if dns.IsDynamicNode(ecfg.Address) && ecfg.Strict {
return errors.Newf("%s is a dynamically specified endpoint i.e. it uses SD and that is not permitted under strict mode.", ecfg.Address)
}
}
return nil
}
func newEndpointConfigProvider(
logger log.Logger,
configFile fileContent,
configReloadInterval time.Duration,
staticEndpoints []string,
staticEndpointGroups []string,
staticStrictEndpoints []string,
staticStrictEndpointGroups []string,
) (*endpointConfigProvider, error) {
res := &endpointConfigProvider{
endpoints: staticEndpoints,
endpointGroups: staticEndpointGroups,
strictEndpoints: staticStrictEndpoints,
strictEndpointGroups: staticStrictEndpointGroups,
}
if configFile == nil {
configFile = extkingpin.NewNopConfig()
}
cfg, err := res.parse(configFile)
if err != nil {
return nil, errors.Wrapf(err, "unable to load config file")
}
res.addStaticEndpoints(&cfg)
res.cfg = cfg
if err := validateEndpointConfig(cfg); err != nil {
return nil, errors.Wrapf(err, "unable to validate endpoints")
}
// only static endpoints
if len(configFile.Path()) == 0 {
return res, nil
}
if err := extkingpin.PathContentReloader(context.Background(), configFile, logger, func() {
res.mu.Lock()
defer res.mu.Unlock()
level.Info(logger).Log("msg", "reloading endpoint config")
cfg, err := res.parse(configFile)
if err != nil {
level.Error(logger).Log("msg", "failed to reload endpoint config", "err", err)
return
}
res.addStaticEndpoints(&cfg)
if err := validateEndpointConfig(cfg); err != nil {
level.Error(logger).Log("msg", "failed to validate endpoint config", "err", err)
return
}
res.cfg = cfg
}, configReloadInterval); err != nil {
return nil, errors.Wrapf(err, "unable to create config reloader")
}
return res, nil
}
func setupEndpointSet(
g *run.Group,
comp component.Component,
reg prometheus.Registerer,
logger log.Logger,
configFile fileContent,
configReloadInterval time.Duration,
legacyFileSDFiles []string,
legacyFileSDInterval time.Duration,
legacyEndpoints []string,
legacyEndpointGroups []string,
legacyStrictEndpoints []string,
legacyStrictEndpointGroups []string,
dnsSDResolver string,
dnsSDInterval time.Duration,
unhealthyTimeout time.Duration,
endpointTimeout time.Duration,
dialOpts []grpc.DialOption,
queryConnMetricLabels ...string,
) (*query.EndpointSet, error) {
configProvider, err := newEndpointConfigProvider(
logger,
configFile,
configReloadInterval,
legacyEndpoints,
legacyEndpointGroups,
legacyStrictEndpoints,
legacyStrictEndpointGroups,
)
if err != nil {
return nil, errors.Wrapf(err, "unable to load config initially")
}
// Register resolver for the "thanos:///" scheme for endpoint-groups
dns.RegisterGRPCResolver(
logger,
dns.NewProvider(
logger,
extprom.WrapRegistererWithPrefix(fmt.Sprintf("thanos_%s_endpoint_groups_", comp), reg),
dns.ResolverType(dnsSDResolver),
),
dnsSDInterval,
)
dnsEndpointProvider := dns.NewProvider(
logger,
extprom.WrapRegistererWithPrefix(fmt.Sprintf("thanos_%s_endpoints_", comp), reg),
dns.ResolverType(dnsSDResolver),
)
duplicatedEndpoints := promauto.With(reg).NewCounter(prometheus.CounterOpts{
Name: fmt.Sprintf("thanos_%s_duplicated_endpoint_addresses_total", comp),
Help: "The number of times a duplicated endpoint address is detected from the different configs",
})
removeDuplicateEndpointSpecs := func(specs []*query.GRPCEndpointSpec) []*query.GRPCEndpointSpec {
set := make(map[string]*query.GRPCEndpointSpec)
for _, spec := range specs {
addr := spec.Addr()
if _, ok := set[addr]; ok {
level.Warn(logger).Log("msg", "Duplicate endpoint address is provided", "addr", addr)
duplicatedEndpoints.Inc()
}
set[addr] = spec
}
deduplicated := make([]*query.GRPCEndpointSpec, 0, len(set))
for _, value := range set {
deduplicated = append(deduplicated, value)
}
return deduplicated
}
var fileSD *file.Discovery
if len(legacyFileSDFiles) > 0 {
conf := &file.SDConfig{
Files: legacyFileSDFiles,
RefreshInterval: model.Duration(legacyFileSDInterval),
}
var err error
if fileSD, err = file.NewDiscovery(conf, logutil.GoKitLogToSlog(logger), conf.NewDiscovererMetrics(reg, discovery.NewRefreshMetrics(reg))); err != nil {
return nil, fmt.Errorf("unable to create new legacy file sd config: %w", err)
}
}
legacyFileSDCache := cache.New()
ctx, cancel := context.WithCancel(context.Background())
if fileSD != nil {
fileSDUpdates := make(chan []*targetgroup.Group)
g.Add(func() error {
fileSD.Run(ctx, fileSDUpdates)
return nil
}, func(err error) {
cancel()
})
g.Add(func() error {
for {
select {
case update := <-fileSDUpdates:
// Discoverers sometimes send nil updates so need to check for it to avoid panics.
if update == nil {
continue
}
legacyFileSDCache.Update(update)
case <-ctx.Done():
return nil
}
}
}, func(err error) {
cancel()
})
}
{
g.Add(func() error {
return runutil.Repeat(dnsSDInterval, ctx.Done(), func() error {
ctxUpdateIter, cancelUpdateIter := context.WithTimeout(ctx, dnsSDInterval)
defer cancelUpdateIter()
endpointConfig := configProvider.config()
addresses := make([]string, 0, len(endpointConfig.Endpoints))
for _, ecfg := range endpointConfig.Endpoints {
if addr := ecfg.Address; dns.IsDynamicNode(addr) && !ecfg.Group {
addresses = append(addresses, addr)
}
}
addresses = append(addresses, legacyFileSDCache.Addresses()...)
if err := dnsEndpointProvider.Resolve(ctxUpdateIter, addresses, true); err != nil {
level.Error(logger).Log("msg", "failed to resolve addresses for endpoints", "err", err)
}
return nil
})
}, func(error) {
cancel()
})
}
endpointset := query.NewEndpointSet(time.Now, logger, reg, func() []*query.GRPCEndpointSpec {
endpointConfig := configProvider.config()
specs := make([]*query.GRPCEndpointSpec, 0)
// groups and non dynamic endpoints
for _, ecfg := range endpointConfig.Endpoints {
strict, group, addr := ecfg.Strict, ecfg.Group, ecfg.Address
if group {
specs = append(specs, query.NewGRPCEndpointSpec(fmt.Sprintf("thanos:///%s", addr), strict, append(dialOpts, extgrpc.EndpointGroupGRPCOpts()...)...))
} else if !dns.IsDynamicNode(addr) {
specs = append(specs, query.NewGRPCEndpointSpec(addr, strict, dialOpts...))
}
}
// dynamic endpoints
for _, addr := range dnsEndpointProvider.Addresses() {
specs = append(specs, query.NewGRPCEndpointSpec(addr, false, dialOpts...))
}
return removeDuplicateEndpointSpecs(specs)
}, unhealthyTimeout, endpointTimeout, queryConnMetricLabels...)
g.Add(func() error {
return runutil.Repeat(endpointTimeout, ctx.Done(), func() error {
ctxIter, cancelIter := context.WithTimeout(ctx, endpointTimeout)
defer cancelIter()
endpointset.Update(ctxIter)
return nil
})
}, func(error) {
cancel()
})
return endpointset, nil
}
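The `removeDuplicateEndpointSpecs` closure above keeps one spec per address in a map and counts collisions. A stdlib-only sketch of that dedup step (the `spec` type is a stand-in for `query.GRPCEndpointSpec`; unlike the map-iteration original, this version keeps first-seen order so the result is deterministic):

```go
package main

import "fmt"

// spec stands in for query.GRPCEndpointSpec; only the address matters here.
type spec struct{ addr string }

// dedupe keeps the last spec registered for each address, mirroring
// removeDuplicateEndpointSpecs; duplicates would bump a counter metric
// in the real code.
func dedupe(specs []spec) []spec {
	set := make(map[string]spec, len(specs))
	var order []string
	for _, s := range specs {
		if _, ok := set[s.addr]; !ok {
			order = append(order, s.addr)
		}
		set[s.addr] = s
	}
	out := make([]spec, 0, len(set))
	for _, a := range order {
		out = append(out, set[a])
	}
	return out
}

func main() {
	fmt.Println(dedupe([]spec{{"a:10901"}, {"a:10901"}, {"b:10901"}}))
}
```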


@@ -15,6 +15,7 @@ import (
"runtime/debug"
"syscall"
"github.com/alecthomas/kingpin/v2"
"github.com/go-kit/log"
"github.com/go-kit/log/level"
"github.com/oklog/run"
@@ -25,7 +26,6 @@ import (
versioncollector "github.com/prometheus/client_golang/prometheus/collectors/version"
"github.com/prometheus/common/version"
"go.uber.org/automaxprocs/maxprocs"
"gopkg.in/alecthomas/kingpin.v2"
"github.com/thanos-io/thanos/pkg/extkingpin"
"github.com/thanos-io/thanos/pkg/logging"
@@ -74,7 +74,7 @@ func main() {
// Running in container with limits but with empty/wrong value of GOMAXPROCS env var could lead to throttling by cpu
// maxprocs will automate adjustment by using cgroups info about cpu limit if it set as value for runtime.GOMAXPROCS.
undo, err := maxprocs.Set(maxprocs.Logger(func(template string, args ...interface{}) {
level.Debug(logger).Log("msg", fmt.Sprintf(template, args))
level.Debug(logger).Log("msg", fmt.Sprintf(template, args...))
}))
defer undo()
if err != nil {


@@ -14,7 +14,8 @@ import (
"time"
"github.com/go-kit/log"
"github.com/oklog/ulid"
"github.com/oklog/ulid/v2"
"github.com/prometheus/client_golang/prometheus"
promtest "github.com/prometheus/client_golang/prometheus/testutil"
"github.com/prometheus/prometheus/model/labels"
@@ -105,6 +106,16 @@ func (b *erroringBucket) Name() string {
return b.bkt.Name()
}
// IterWithAttributes allows to iterate over objects in the bucket with their attributes.
func (b *erroringBucket) IterWithAttributes(ctx context.Context, dir string, f func(objstore.IterObjectAttributes) error, options ...objstore.IterOption) error {
return b.bkt.IterWithAttributes(ctx, dir, f, options...)
}
// SupportedIterOptions returns the supported iteration options.
func (b *erroringBucket) SupportedIterOptions() []objstore.IterOptionType {
return b.bkt.SupportedIterOptions()
}
// Ensures that downsampleBucket() stops its work properly
// after an error occurs with some blocks in the backlog.
// Testing for https://github.com/thanos-io/thanos/issues/4960.
@@ -126,7 +137,7 @@ func TestRegression4960_Deadlock(t *testing.T) {
[]labels.Labels{{{Name: "a", Value: "1"}}},
1, 0, downsample.ResLevel1DownsampleRange+1, // Pass the minimum ResLevel1DownsampleRange check.
labels.Labels{{Name: "e1", Value: "1"}},
downsample.ResLevel0, metadata.NoneFunc)
downsample.ResLevel0, metadata.NoneFunc, nil)
testutil.Ok(t, err)
testutil.Ok(t, block.Upload(ctx, logger, bkt, path.Join(dir, id.String()), metadata.NoneFunc))
}
@@ -137,7 +148,7 @@ func TestRegression4960_Deadlock(t *testing.T) {
[]labels.Labels{{{Name: "a", Value: "2"}}},
1, 0, downsample.ResLevel1DownsampleRange+1, // Pass the minimum ResLevel1DownsampleRange check.
labels.Labels{{Name: "e1", Value: "2"}},
downsample.ResLevel0, metadata.NoneFunc)
downsample.ResLevel0, metadata.NoneFunc, nil)
testutil.Ok(t, err)
testutil.Ok(t, block.Upload(ctx, logger, bkt, path.Join(dir, id2.String()), metadata.NoneFunc))
}
@@ -148,7 +159,7 @@ func TestRegression4960_Deadlock(t *testing.T) {
[]labels.Labels{{{Name: "a", Value: "2"}}},
1, 0, downsample.ResLevel1DownsampleRange+1, // Pass the minimum ResLevel1DownsampleRange check.
labels.Labels{{Name: "e1", Value: "2"}},
downsample.ResLevel0, metadata.NoneFunc)
downsample.ResLevel0, metadata.NoneFunc, nil)
testutil.Ok(t, err)
testutil.Ok(t, block.Upload(ctx, logger, bkt, path.Join(dir, id3.String()), metadata.NoneFunc))
}
@@ -188,7 +199,7 @@ func TestCleanupDownsampleCacheFolder(t *testing.T) {
[]labels.Labels{{{Name: "a", Value: "1"}}},
1, 0, downsample.ResLevel1DownsampleRange+1, // Pass the minimum ResLevel1DownsampleRange check.
labels.Labels{{Name: "e1", Value: "1"}},
downsample.ResLevel0, metadata.NoneFunc)
downsample.ResLevel0, metadata.NoneFunc, nil)
testutil.Ok(t, err)
testutil.Ok(t, block.Upload(ctx, logger, bkt, path.Join(dir, id.String()), metadata.NoneFunc))
}


@@ -4,16 +4,12 @@
package main
import (
"context"
"fmt"
"math"
"net/http"
"strings"
"time"
extflag "github.com/efficientgo/tools/extkingpin"
"google.golang.org/grpc"
"github.com/go-kit/log"
"github.com/go-kit/log/level"
grpc_logging "github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/logging"
@@ -21,25 +17,19 @@ import (
"github.com/opentracing/opentracing-go"
"github.com/pkg/errors"
"github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promauto"
"github.com/prometheus/common/route"
"github.com/prometheus/prometheus/discovery"
"github.com/prometheus/prometheus/discovery/file"
"github.com/prometheus/prometheus/discovery/targetgroup"
"github.com/prometheus/prometheus/model/labels"
"github.com/prometheus/prometheus/promql"
"github.com/thanos-io/promql-engine/api"
"github.com/prometheus/prometheus/promql/parser"
apiv1 "github.com/thanos-io/thanos/pkg/api/query"
"github.com/thanos-io/thanos/pkg/api/query/querypb"
"github.com/thanos-io/thanos/pkg/block"
"github.com/thanos-io/thanos/pkg/compact/downsample"
"github.com/thanos-io/thanos/pkg/component"
"github.com/thanos-io/thanos/pkg/discovery/cache"
"github.com/thanos-io/thanos/pkg/dedup"
"github.com/thanos-io/thanos/pkg/discovery/dns"
"github.com/thanos-io/thanos/pkg/exemplars"
"github.com/thanos-io/thanos/pkg/extgrpc"
"github.com/thanos-io/thanos/pkg/extgrpc/snappy"
"github.com/thanos-io/thanos/pkg/extkingpin"
"github.com/thanos-io/thanos/pkg/extprom"
extpromhttp "github.com/thanos-io/thanos/pkg/extprom/http"
@@ -47,15 +37,16 @@ import (
"github.com/thanos-io/thanos/pkg/info"
"github.com/thanos-io/thanos/pkg/info/infopb"
"github.com/thanos-io/thanos/pkg/logging"
"github.com/thanos-io/thanos/pkg/logutil"
"github.com/thanos-io/thanos/pkg/metadata"
"github.com/thanos-io/thanos/pkg/prober"
"github.com/thanos-io/thanos/pkg/query"
"github.com/thanos-io/thanos/pkg/rules"
"github.com/thanos-io/thanos/pkg/runutil"
grpcserver "github.com/thanos-io/thanos/pkg/server/grpc"
httpserver "github.com/thanos-io/thanos/pkg/server/http"
"github.com/thanos-io/thanos/pkg/store"
"github.com/thanos-io/thanos/pkg/store/labelpb"
"github.com/thanos-io/thanos/pkg/strutil"
"github.com/thanos-io/thanos/pkg/targets"
"github.com/thanos-io/thanos/pkg/tenancy"
"github.com/thanos-io/thanos/pkg/tls"
@@ -63,16 +54,10 @@ )
)
const (
promqlNegativeOffset = "promql-negative-offset"
promqlAtModifier = "promql-at-modifier"
queryPushdown = "query-pushdown"
)
type queryMode string
const (
queryModeLocal queryMode = "local"
queryModeDistributed queryMode = "distributed"
promqlNegativeOffset = "promql-negative-offset"
promqlAtModifier = "promql-at-modifier"
queryPushdown = "query-pushdown"
promqlExperimentalFunctions = "promql-experimental-functions"
)
// registerQuery registers a query command.
@@ -85,14 +70,8 @@ func registerQuery(app *extkingpin.App) {
var grpcServerConfig grpcConfig
grpcServerConfig.registerFlag(cmd)
secure := cmd.Flag("grpc-client-tls-secure", "Use TLS when talking to the gRPC server").Default("false").Bool()
skipVerify := cmd.Flag("grpc-client-tls-skip-verify", "Disable TLS certificate verification i.e self signed, signed by fake CA").Default("false").Bool()
cert := cmd.Flag("grpc-client-tls-cert", "TLS Certificates to use to identify this client to the server").Default("").String()
key := cmd.Flag("grpc-client-tls-key", "TLS Key for the client's certificate").Default("").String()
caCert := cmd.Flag("grpc-client-tls-ca", "TLS CA Certificates to use to verify gRPC servers").Default("").String()
serverName := cmd.Flag("grpc-client-server-name", "Server name to verify the hostname on the returned gRPC certificates. See https://tools.ietf.org/html/rfc4366#section-3.1").Default("").String()
compressionOptions := strings.Join([]string{snappy.Name, compressionNone}, ", ")
grpcCompression := cmd.Flag("grpc-compression", "Compression algorithm to use for gRPC requests to other clients. Must be one of: "+compressionOptions).Default(compressionNone).Enum(snappy.Name, compressionNone)
var grpcClientConfig grpcClientConfig
grpcClientConfig.registerFlag(cmd)
webRoutePrefix := cmd.Flag("web.route-prefix", "Prefix for API and UI endpoints. This allows thanos UI to be served on a sub-path. Defaults to the value of --web.external-prefix. This option is analogous to --web.route-prefix of Prometheus.").Default("").String()
webExternalPrefix := cmd.Flag("web.external-prefix", "Static prefix for all HTML links and redirect URLs in the UI query web interface. Actual endpoints are still served on / or the web.route-prefix. This allows thanos UI to be served behind a reverse proxy that strips a URL sub-path.").Default("").String()
@@ -104,10 +83,12 @@ func registerQuery(app *extkingpin.App) {
defaultEngine := cmd.Flag("query.promql-engine", "Default PromQL engine to use.").Default(string(apiv1.PromqlEnginePrometheus)).
Enum(string(apiv1.PromqlEnginePrometheus), string(apiv1.PromqlEngineThanos))
disableQueryFallback := cmd.Flag("query.disable-fallback", "If set, the Thanos engine will throw an error if a query falls back to the Prometheus engine").Hidden().Default("false").Bool()
extendedFunctionsEnabled := cmd.Flag("query.enable-x-functions", "Whether to enable extended rate functions (xrate, xincrease and xdelta). Only has effect when used with Thanos engine.").Default("false").Bool()
promqlQueryMode := cmd.Flag("query.mode", "PromQL query mode. One of: local, distributed.").
Default(string(queryModeLocal)).
Enum(string(queryModeLocal), string(queryModeDistributed))
Default(string(apiv1.PromqlQueryModeLocal)).
Enum(string(apiv1.PromqlQueryModeLocal), string(apiv1.PromqlQueryModeDistributed))
maxConcurrentQueries := cmd.Flag("query.max-concurrent", "Maximum number of queries processed concurrently by query node.").
Default("20").Int()
@@ -122,8 +103,17 @@ func registerQuery(app *extkingpin.App) {
Default(string(query.ExternalLabels), string(query.StoreType)).
Enums(string(query.ExternalLabels), string(query.StoreType))
queryReplicaLabels := cmd.Flag("query.replica-label", "Labels to treat as a replica indicator along which data is deduplicated. Still you will be able to query without deduplication using 'dedup=false' parameter. Data includes time series, recording rules, and alerting rules.").
deduplicationFunc := cmd.Flag("deduplication.func", "Experimental. Deduplication algorithm for merging overlapping series. "+
"Possible values are: \"penalty\", \"chain\". If no value is specified, the penalty-based deduplication algorithm will be used. "+
"When set to chain, the default compact deduplication merger is used, which performs 1:1 deduplication for samples. At least one replica label has to be set via the --query.replica-label flag.").
Default(dedup.AlgorithmPenalty).Enum(dedup.AlgorithmPenalty, dedup.AlgorithmChain)
queryReplicaLabels := cmd.Flag("query.replica-label", "Labels to treat as a replica indicator along which data is deduplicated. Still you will be able to query without deduplication using 'dedup=false' parameter. Data includes time series, recording rules, and alerting rules. Flag may be specified multiple times as well as a comma separated list of labels.").
Strings()
queryPartitionLabels := cmd.Flag("query.partition-label", "Labels that partition the leaf queriers. This is used to scope down the labelsets of leaf queriers when using the distributed query mode. If set, these labels must form a partition of the leaf queriers. Partition labels must not intersect with replica labels. Every TSDB of a leaf querier must have these labels. This is useful when there are multiple external labels that are irrelevant for the partition as it allows the distributed engine to ignore them for some optimizations. If this is empty then all labels are used as partition labels.").Strings()
// Currently, we choose the highest MinT of an engine when querying multiple engines. This flag allows changing this behavior to choose the lowest MinT.
queryDistributedWithOverlappingInterval := cmd.Flag("query.distributed-with-overlapping-interval", "Allow distributed queries to use an engine's lowest MinT.").Hidden().Default("false").Bool()
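The MinT selection described in the comment above can be sketched as a simple reduction over the engines' minimum timestamps (a standalone illustration under stated assumptions, not the distributed engine's actual code):

```go
package main

import "fmt"

// minTime picks the effective start of the queryable range across remote
// engines. By default the highest MinT wins, so only the interval covered by
// every engine is queried; with the overlapping-interval behavior enabled,
// the lowest MinT is used instead.
func minTime(engineMinTs []int64, allowOverlapping bool) int64 {
	mint := engineMinTs[0]
	for _, t := range engineMinTs[1:] {
		if allowOverlapping {
			if t < mint { // lowest MinT
				mint = t
			}
		} else if t > mint { // highest MinT
			mint = t
		}
	}
	return mint
}

func main() {
	fmt.Println(minTime([]int64{1000, 2000, 1500}, false)) // highest MinT wins
	fmt.Println(minTime([]int64{1000, 2000, 1500}, true))  // lowest MinT wins
}
```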
instantDefaultMaxSourceResolution := extkingpin.ModelDuration(cmd.Flag("query.instant.default.max_source_resolution", "Default value for max_source_resolution for instant queries. If not set, defaults to 0s, only taking raw resolution into account. 1h can be a good value if you use instant queries over time ranges that include times outside of your raw retention.").Default("0s").Hidden())
@@ -132,55 +122,6 @@ func registerQuery(app *extkingpin.App) {
selectorLabels := cmd.Flag("selector-label", "Query selector labels that will be exposed in info endpoint (repeated).").
PlaceHolder("<name>=\"<value>\"").Strings()
endpoints := extkingpin.Addrs(cmd.Flag("endpoint", "Addresses of statically configured Thanos API servers (repeatable). The scheme may be prefixed with 'dns+' or 'dnssrv+' to detect Thanos API servers through respective DNS lookups.").
PlaceHolder("<endpoint>"))
endpointGroups := extkingpin.Addrs(cmd.Flag("endpoint-group", "Experimental: DNS name of statically configured Thanos API server groups (repeatable). Targets resolved from the DNS name will be queried in a round-robin, instead of a fanout manner. This flag should be used when connecting a Thanos Query to HA groups of Thanos components.").
PlaceHolder("<endpoint-group>"))
stores := extkingpin.Addrs(cmd.Flag("store", "Deprecation Warning - This flag is deprecated and replaced with `endpoint`. Addresses of statically configured store API servers (repeatable). The scheme may be prefixed with 'dns+' or 'dnssrv+' to detect store API servers through respective DNS lookups.").
PlaceHolder("<store>"))
// TODO(bwplotka): Hidden because we plan to extract discovery to separate API: https://github.com/thanos-io/thanos/issues/2600.
ruleEndpoints := extkingpin.Addrs(cmd.Flag("rule", "Deprecation Warning - This flag is deprecated and replaced with `endpoint`. Experimental: Addresses of statically configured rules API servers (repeatable). The scheme may be prefixed with 'dns+' or 'dnssrv+' to detect rule API servers through respective DNS lookups.").
Hidden().PlaceHolder("<rule>"))
metadataEndpoints := extkingpin.Addrs(cmd.Flag("metadata", "Deprecation Warning - This flag is deprecated and replaced with `endpoint`. Experimental: Addresses of statically configured metadata API servers (repeatable). The scheme may be prefixed with 'dns+' or 'dnssrv+' to detect metadata API servers through respective DNS lookups.").
Hidden().PlaceHolder("<metadata>"))
exemplarEndpoints := extkingpin.Addrs(cmd.Flag("exemplar", "Deprecation Warning - This flag is deprecated and replaced with `endpoint`. Experimental: Addresses of statically configured exemplars API servers (repeatable). The scheme may be prefixed with 'dns+' or 'dnssrv+' to detect exemplars API servers through respective DNS lookups.").
Hidden().PlaceHolder("<exemplar>"))
// TODO(atunik): Hidden because we plan to extract discovery to separate API: https://github.com/thanos-io/thanos/issues/2600.
targetEndpoints := extkingpin.Addrs(cmd.Flag("target", "Deprecation Warning - This flag is deprecated and replaced with `endpoint`. Experimental: Addresses of statically configured target API servers (repeatable). The scheme may be prefixed with 'dns+' or 'dnssrv+' to detect target API servers through respective DNS lookups.").
Hidden().PlaceHolder("<target>"))
strictStores := cmd.Flag("store-strict", "Deprecation Warning - This flag is deprecated and replaced with `endpoint-strict`. Addresses of only statically configured store API servers that are always used, even if the health check fails. Useful if you have a caching layer on top.").
PlaceHolder("<staticstore>").Strings()
strictEndpoints := cmd.Flag("endpoint-strict", "Addresses of only statically configured Thanos API servers that are always used, even if the health check fails. Useful if you have a caching layer on top.").
PlaceHolder("<staticendpoint>").Strings()
strictEndpointGroups := extkingpin.Addrs(cmd.Flag("endpoint-group-strict", "Experimental: DNS name of statically configured Thanos API server groups (repeatable) that are always used, even if the health check fails.").
PlaceHolder("<endpoint-group-strict>"))
fileSDFiles := cmd.Flag("store.sd-files", "Path to files that contain addresses of store API servers. The path can be a glob pattern (repeatable).").
PlaceHolder("<path>").Strings()
fileSDInterval := extkingpin.ModelDuration(cmd.Flag("store.sd-interval", "Refresh interval to re-read file SD files. It is used as a resync fallback.").
Default("5m"))
// TODO(bwplotka): Grab this from TTL at some point.
dnsSDInterval := extkingpin.ModelDuration(cmd.Flag("store.sd-dns-interval", "Interval between DNS resolutions.").
Default("30s"))
dnsSDResolver := cmd.Flag("store.sd-dns-resolver", fmt.Sprintf("Resolver to use. Possible options: [%s, %s]", dns.GolangResolverType, dns.MiekgdnsResolverType)).
Default(string(dns.MiekgdnsResolverType)).Hidden().String()
unhealthyStoreTimeout := extkingpin.ModelDuration(cmd.Flag("store.unhealthy-timeout", "Timeout before an unhealthy store is cleaned from the store UI page.").Default("5m"))
endpointInfoTimeout := extkingpin.ModelDuration(cmd.Flag("endpoint.info-timeout", "Timeout of gRPC Info requests.").Default("5s").Hidden())
enableAutodownsampling := cmd.Flag("query.auto-downsampling", "Enable automatic adjustment (step / 5) of which source of data should be used in store gateways if no max_source_resolution param is specified.").
Default("false").Bool()
@@ -198,7 +139,7 @@ func registerQuery(app *extkingpin.App) {
activeQueryDir := cmd.Flag("query.active-query-path", "Directory to log currently active queries in the queries.active file.").Default("").String()
featureList := cmd.Flag("enable-feature", "Comma separated experimental feature names to enable. The current list of features is empty.").Hidden().Default("").Strings()
featureList := cmd.Flag("enable-feature", "Comma separated feature names to enable. Valid options for now: promql-experimental-functions (enables PromQL experimental functions in query)").Default("").Strings()
enableExemplarPartialResponse := cmd.Flag("exemplar.partial-response", "Enable partial response for exemplar endpoint. --no-exemplar.partial-response for disabling.").
Hidden().Default("true").Bool()
@@ -231,6 +172,38 @@ func registerQuery(app *extkingpin.App) {
tenantCertField := cmd.Flag("query.tenant-certificate-field", "Use TLS client's certificate field to determine tenant for write requests. Must be one of "+tenancy.CertificateFieldOrganization+", "+tenancy.CertificateFieldOrganizationalUnit+" or "+tenancy.CertificateFieldCommonName+". This setting will cause the query.tenant-header flag value to be ignored.").Default("").Enum("", tenancy.CertificateFieldOrganization, tenancy.CertificateFieldOrganizationalUnit, tenancy.CertificateFieldCommonName)
enforceTenancy := cmd.Flag("query.enforce-tenancy", "Enforce tenancy on Query APIs. Responses are returned only if the label value of the configured tenant-label-name and the value of the tenant header matches.").Default("false").Bool()
tenantLabel := cmd.Flag("query.tenant-label-name", "Label name to use when enforcing tenancy (if --query.enforce-tenancy is enabled).").Default(tenancy.DefaultTenantLabel).String()
// TODO(bwplotka): Grab this from TTL at some point.
dnsSDInterval := extkingpin.ModelDuration(cmd.Flag("store.sd-dns-interval", "Interval between DNS resolutions.").
Default("30s"))
dnsSDResolver := cmd.Flag("store.sd-dns-resolver", fmt.Sprintf("Resolver to use. Possible options: [%s, %s]", dns.GolangResolverType, dns.MiekgdnsResolverType)).
Default(string(dns.MiekgdnsResolverType)).Hidden().String()
unhealthyStoreTimeout := extkingpin.ModelDuration(cmd.Flag("store.unhealthy-timeout", "Timeout before an unhealthy store is cleaned from the store UI page.").Default("5m"))
endpointInfoTimeout := extkingpin.ModelDuration(cmd.Flag("endpoint.info-timeout", "Timeout of gRPC Info requests.").Default("5s").Hidden())
endpointSetConfig := extflag.RegisterPathOrContent(cmd, "endpoint.sd-config", "Config File with endpoint definitions")
endpointSetConfigReloadInterval := extkingpin.ModelDuration(cmd.Flag("endpoint.sd-config-reload-interval", "Interval between endpoint config refreshes").Default("5m"))
legacyFileSDFiles := cmd.Flag("store.sd-files", "(Deprecated) Path to files that contain addresses of store API servers. The path can be a glob pattern (repeatable).").
PlaceHolder("<path>").Strings()
legacyFileSDInterval := extkingpin.ModelDuration(cmd.Flag("store.sd-interval", "(Deprecated) Refresh interval to re-read file SD files. It is used as a resync fallback.").
Default("5m"))
endpoints := extkingpin.Addrs(cmd.Flag("endpoint", "(Deprecated): Addresses of statically configured Thanos API servers (repeatable). The scheme may be prefixed with 'dns+' or 'dnssrv+' to detect Thanos API servers through respective DNS lookups.").PlaceHolder("<endpoint>"))
endpointGroups := extkingpin.Addrs(cmd.Flag("endpoint-group", "(Deprecated, Experimental): DNS name of statically configured Thanos API server groups (repeatable). Targets resolved from the DNS name will be queried in a round-robin, instead of a fanout manner. This flag should be used when connecting a Thanos Query to HA groups of Thanos components.").PlaceHolder("<endpoint-group>"))
strictEndpoints := extkingpin.Addrs(cmd.Flag("endpoint-strict", "(Deprecated): Addresses of only statically configured Thanos API servers that are always used, even if the health check fails. Useful if you have a caching layer on top.").
PlaceHolder("<endpoint-strict>"))
strictEndpointGroups := extkingpin.Addrs(cmd.Flag("endpoint-group-strict", "(Deprecated, Experimental): DNS name of statically configured Thanos API server groups (repeatable) that are always used, even if the health check fails.").PlaceHolder("<endpoint-group-strict>"))
lazyRetrievalMaxBufferedResponses := cmd.Flag("query.lazy-retrieval-max-buffered-responses", "The lazy retrieval strategy can buffer up to this number of responses. This is to limit the memory usage. This flag takes effect only when the lazy retrieval strategy is enabled.").
Default("20").Hidden().Int()
var storeRateLimits store.SeriesSelectLimits
storeRateLimits.RegisterFlags(cmd)
@@ -242,6 +215,10 @@ func registerQuery(app *extkingpin.App) {
}
for _, feature := range *featureList {
if feature == promqlExperimentalFunctions {
parser.EnableExperimentalFunctions = true
level.Info(logger).Log("msg", "Experimental PromQL functions enabled.", "option", promqlExperimentalFunctions)
}
if feature == promqlAtModifier {
level.Warn(logger).Log("msg", "This option for --enable-feature is now permanently enabled and therefore a no-op.", "option", promqlAtModifier)
}
@@ -259,23 +236,10 @@ func registerQuery(app *extkingpin.App) {
}
grpcLogOpts, logFilterMethods, err := logging.ParsegRPCOptions(reqLogConfig)
if err != nil {
return errors.Wrap(err, "error while parsing config for request logging")
}
var fileSD *file.Discovery
if len(*fileSDFiles) > 0 {
conf := &file.SDConfig{
Files: *fileSDFiles,
RefreshInterval: *fileSDInterval,
}
var err error
if fileSD, err = file.NewDiscovery(conf, logger, conf.NewDiscovererMetrics(reg, discovery.NewRefreshMetrics(reg))); err != nil {
return err
}
}
if *webRoutePrefix == "" {
*webRoutePrefix = *webExternalPrefix
}
@@ -293,23 +257,51 @@ func registerQuery(app *extkingpin.App) {
return err
}
dialOpts, err := grpcClientConfig.dialOptions(logger, reg, tracer)
if err != nil {
return err
}
if *promqlQueryMode != string(apiv1.PromqlQueryModeLocal) {
level.Info(logger).Log("msg", "Distributed query mode enabled, using Thanos as the default query engine.")
*defaultEngine = string(apiv1.PromqlEngineThanos)
}
endpointSet, err := setupEndpointSet(
g,
comp,
reg,
logger,
endpointSetConfig,
time.Duration(*endpointSetConfigReloadInterval),
*legacyFileSDFiles,
time.Duration(*legacyFileSDInterval),
*endpoints,
*endpointGroups,
*strictEndpoints,
*strictEndpointGroups,
*dnsSDResolver,
time.Duration(*dnsSDInterval),
time.Duration(*unhealthyStoreTimeout),
time.Duration(*endpointInfoTimeout),
dialOpts,
*queryConnMetricLabels...,
)
if err != nil {
return err
}
return runQuery(
g,
logger,
debugLogging,
endpointSet,
reg,
tracer,
httpLogOpts,
grpcLogOpts,
logFilterMethods,
grpcServerConfig,
*grpcCompression,
*secure,
*skipVerify,
*cert,
*key,
*caCert,
*serverName,
*httpBindAddr,
*httpTLSConfig,
time.Duration(*httpGracePeriod),
@@ -324,17 +316,11 @@ func registerQuery(app *extkingpin.App) {
*dynamicLookbackDelta,
time.Duration(*defaultEvaluationInterval),
time.Duration(*storeResponseTimeout),
*queryConnMetricLabels,
*deduplicationFunc,
*queryReplicaLabels,
*queryPartitionLabels,
selectorLset,
getFlagsMap(cmd.Flags()),
*endpoints,
*endpointGroups,
*stores,
*ruleEndpoints,
*targetEndpoints,
*metadataEndpoints,
*exemplarEndpoints,
*enableAutodownsampling,
*enableQueryPartialResponse,
*enableRulePartialResponse,
@@ -342,56 +328,44 @@ func registerQuery(app *extkingpin.App) {
*enableMetricMetadataPartialResponse,
*enableExemplarPartialResponse,
*activeQueryDir,
fileSD,
time.Duration(*dnsSDInterval),
*dnsSDResolver,
time.Duration(*unhealthyStoreTimeout),
time.Duration(*endpointInfoTimeout),
time.Duration(*instantDefaultMaxSourceResolution),
*defaultMetadataTimeRange,
*strictStores,
*strictEndpoints,
*strictEndpointGroups,
*webDisableCORS,
*alertQueryURL,
*grpcProxyStrategy,
component.Query,
*queryTelemetryDurationQuantiles,
*queryTelemetrySamplesQuantiles,
*queryTelemetrySeriesQuantiles,
*defaultEngine,
storeRateLimits,
*extendedFunctionsEnabled,
store.NewTSDBSelector(tsdbSelector),
queryMode(*promqlQueryMode),
apiv1.PromqlEngineType(*defaultEngine),
apiv1.PromqlQueryMode(*promqlQueryMode),
*disableQueryFallback,
*tenantHeader,
*defaultTenant,
*tenantCertField,
*enforceTenancy,
*tenantLabel,
*queryDistributedWithOverlappingInterval,
*lazyRetrievalMaxBufferedResponses,
)
})
}
// runQuery starts a server that exposes PromQL Query API. It is responsible for querying configured
// store nodes, merging and duplicating the data to satisfy user query.
// store nodes, merging and deduplicating the data to satisfy user query.
func runQuery(
g *run.Group,
logger log.Logger,
debugLogging bool,
endpointSet *query.EndpointSet,
reg *prometheus.Registry,
tracer opentracing.Tracer,
httpLogOpts []logging.Option,
grpcLogOpts []grpc_logging.Option,
logFilterMethods []string,
grpcServerConfig grpcConfig,
grpcCompression string,
secure bool,
skipVerify bool,
cert string,
key string,
caCert string,
serverName string,
httpBindAddr string,
httpTLSConfig string,
httpGracePeriod time.Duration,
@@ -406,17 +380,11 @@ func runQuery(
dynamicLookbackDelta bool,
defaultEvaluationInterval time.Duration,
storeResponseTimeout time.Duration,
queryConnMetricLabels []string,
deduplicationFunc string,
queryReplicaLabels []string,
queryPartitionLabels []string,
selectorLset labels.Labels,
flagsMap map[string]string,
endpointAddrs []string,
endpointGroupAddrs []string,
storeAddrs []string,
ruleAddrs []string,
targetAddrs []string,
metadataAddrs []string,
exemplarAddrs []string,
enableAutodownsampling bool,
enableQueryPartialResponse bool,
enableRulePartialResponse bool,
@@ -424,34 +392,29 @@ func runQuery(
enableMetricMetadataPartialResponse bool,
enableExemplarPartialResponse bool,
activeQueryDir string,
fileSD *file.Discovery,
dnsSDInterval time.Duration,
dnsSDResolver string,
unhealthyStoreTimeout time.Duration,
endpointInfoTimeout time.Duration,
instantDefaultMaxSourceResolution time.Duration,
defaultMetadataTimeRange time.Duration,
strictStores []string,
strictEndpoints []string,
strictEndpointGroups []string,
disableCORS bool,
alertQueryURL string,
grpcProxyStrategy string,
comp component.Component,
queryTelemetryDurationQuantiles []float64,
queryTelemetrySamplesQuantiles []float64,
queryTelemetrySeriesQuantiles []float64,
defaultEngine string,
storeRateLimits store.SeriesSelectLimits,
extendedFunctionsEnabled bool,
tsdbSelector *store.TSDBSelector,
queryMode queryMode,
defaultEngine apiv1.PromqlEngineType,
queryMode apiv1.PromqlQueryMode,
disableQueryFallback bool,
tenantHeader string,
defaultTenant string,
tenantCertField string,
enforceTenancy bool,
tenantLabel string,
queryDistributedWithOverlappingInterval bool,
lazyRetrievalMaxBufferedResponses int,
) error {
comp := component.Query
if alertQueryURL == "" {
lastColon := strings.LastIndex(httpBindAddr, ":")
if lastColon != -1 {
@@ -459,184 +422,41 @@ func runQuery(
}
// NOTE(GiedriusS): default is set in config.ts.
}
// TODO(bplotka in PR #513 review): Move arguments into struct.
duplicatedStores := promauto.With(reg).NewCounter(prometheus.CounterOpts{
Name: "thanos_query_duplicated_store_addresses_total",
Help: "The number of times a duplicated store address is detected from the different configs in query",
})
dialOpts, err := extgrpc.StoreClientGRPCOpts(logger, reg, tracer, secure, skipVerify, cert, key, caCert, serverName)
if err != nil {
return errors.Wrap(err, "building gRPC client")
}
if grpcCompression != compressionNone {
dialOpts = append(dialOpts, grpc.WithDefaultCallOptions(grpc.UseCompressor(grpcCompression)))
}
fileSDCache := cache.New()
dnsStoreProvider := dns.NewProvider(
logger,
extprom.WrapRegistererWithPrefix("thanos_query_store_apis_", reg),
dns.ResolverType(dnsSDResolver),
)
for _, store := range strictStores {
if dns.IsDynamicNode(store) {
return errors.Errorf("%s is a dynamically specified store i.e. it uses SD and that is not permitted under strict mode. Use --store for this", store)
}
}
for _, endpoint := range strictEndpoints {
if dns.IsDynamicNode(endpoint) {
return errors.Errorf("%s is a dynamically specified endpoint i.e. it uses SD and that is not permitted under strict mode. Use --endpoint for this", endpoint)
}
}
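The strict-mode checks above reject endpoints that `dns.IsDynamicNode` classifies as dynamic. A hypothetical sketch of that classification, assuming "dynamic" means the address carries a service-discovery scheme prefix such as `dns+` or `dnssrv+` (this is an illustration, not the actual `dns` package implementation):

```go
package main

import (
	"fmt"
	"strings"
)

// isDynamicNode is a hypothetical sketch: an address is treated as dynamic
// when it relies on DNS service discovery, i.e. it carries a dns+ or
// dnssrv+ scheme prefix, which strict mode does not permit.
func isDynamicNode(addr string) bool {
	return strings.HasPrefix(addr, "dns+") || strings.HasPrefix(addr, "dnssrv+")
}

func main() {
	fmt.Println(isDynamicNode("dns+query.example:10901")) // dynamic: rejected in strict mode
	fmt.Println(isDynamicNode("query.example:10901"))     // static: allowed
}
```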
dnsEndpointProvider := dns.NewProvider(
logger,
extprom.WrapRegistererWithPrefix("thanos_query_endpoints_", reg),
dns.ResolverType(dnsSDResolver),
)
dnsRuleProvider := dns.NewProvider(
logger,
extprom.WrapRegistererWithPrefix("thanos_query_rule_apis_", reg),
dns.ResolverType(dnsSDResolver),
)
dnsTargetProvider := dns.NewProvider(
logger,
extprom.WrapRegistererWithPrefix("thanos_query_target_apis_", reg),
dns.ResolverType(dnsSDResolver),
)
dnsMetadataProvider := dns.NewProvider(
logger,
extprom.WrapRegistererWithPrefix("thanos_query_metadata_apis_", reg),
dns.ResolverType(dnsSDResolver),
)
dnsExemplarProvider := dns.NewProvider(
logger,
extprom.WrapRegistererWithPrefix("thanos_query_exemplar_apis_", reg),
dns.ResolverType(dnsSDResolver),
)
options := []store.ProxyStoreOption{
store.WithTSDBSelector(tsdbSelector),
store.WithProxyStoreDebugLogging(debugLogging),
store.WithLazyRetrievalMaxBufferedResponsesForProxy(lazyRetrievalMaxBufferedResponses),
}
var (
endpoints = prepareEndpointSet(
g,
logger,
reg,
[]*dns.Provider{
dnsStoreProvider,
dnsRuleProvider,
dnsExemplarProvider,
dnsMetadataProvider,
dnsTargetProvider,
dnsEndpointProvider,
},
duplicatedStores,
strictStores,
strictEndpoints,
endpointGroupAddrs,
strictEndpointGroups,
dialOpts,
unhealthyStoreTimeout,
endpointInfoTimeout,
queryConnMetricLabels...,
)
// Parse and sanitize the provided replica labels flags.
queryReplicaLabels = strutil.ParseFlagLabels(queryReplicaLabels)
proxy = store.NewProxyStore(logger, reg, endpoints.GetStoreClients, component.Query, selectorLset, storeResponseTimeout, store.RetrievalStrategy(grpcProxyStrategy), options...)
rulesProxy = rules.NewProxy(logger, endpoints.GetRulesClients)
targetsProxy = targets.NewProxy(logger, endpoints.GetTargetsClients)
metadataProxy = metadata.NewProxy(logger, endpoints.GetMetricMetadataClients)
exemplarsProxy = exemplars.NewProxy(logger, endpoints.GetExemplarsStores, selectorLset)
var (
proxyStore = store.NewProxyStore(logger, reg, endpointSet.GetStoreClients, component.Query, selectorLset, storeResponseTimeout, store.RetrievalStrategy(grpcProxyStrategy), options...)
seriesProxy = store.NewLimitedStoreServer(store.NewInstrumentedStoreServer(reg, proxyStore), reg, storeRateLimits)
rulesProxy = rules.NewProxy(logger, endpointSet.GetRulesClients)
targetsProxy = targets.NewProxy(logger, endpointSet.GetTargetsClients)
metadataProxy = metadata.NewProxy(logger, endpointSet.GetMetricMetadataClients)
exemplarsProxy = exemplars.NewProxy(logger, endpointSet.GetExemplarsStores, selectorLset)
queryableCreator = query.NewQueryableCreator(
logger,
extprom.WrapRegistererWithPrefix("thanos_query_", reg),
proxy,
seriesProxy,
maxConcurrentSelects,
queryTimeout,
deduplicationFunc,
)
remoteEndpointsCreator = query.NewRemoteEndpointsCreator(
logger,
endpointSet.GetQueryAPIClients,
queryPartitionLabels,
queryTimeout,
queryDistributedWithOverlappingInterval,
enableAutodownsampling,
)
)
// Run File Service Discovery and update the store set when the files are modified.
if fileSD != nil {
var fileSDUpdates chan []*targetgroup.Group
ctxRun, cancelRun := context.WithCancel(context.Background())
fileSDUpdates = make(chan []*targetgroup.Group)
g.Add(func() error {
fileSD.Run(ctxRun, fileSDUpdates)
return nil
}, func(error) {
cancelRun()
})
ctxUpdate, cancelUpdate := context.WithCancel(context.Background())
g.Add(func() error {
for {
select {
case update := <-fileSDUpdates:
// Discoverers sometimes send nil updates, so we need to check for them to avoid panics.
if update == nil {
continue
}
fileSDCache.Update(update)
endpoints.Update(ctxUpdate)
if err := dnsStoreProvider.Resolve(ctxUpdate, append(fileSDCache.Addresses(), storeAddrs...)); err != nil {
level.Error(logger).Log("msg", "failed to resolve addresses for storeAPIs", "err", err)
}
// Rules APIs do not support file service discovery as of now.
case <-ctxUpdate.Done():
return nil
}
}
}, func(error) {
cancelUpdate()
})
}
// Periodically update the addresses from static flags and file SD by resolving them using DNS SD if necessary.
{
ctx, cancel := context.WithCancel(context.Background())
g.Add(func() error {
return runutil.Repeat(dnsSDInterval, ctx.Done(), func() error {
resolveCtx, resolveCancel := context.WithTimeout(ctx, dnsSDInterval)
defer resolveCancel()
if err := dnsStoreProvider.Resolve(resolveCtx, append(fileSDCache.Addresses(), storeAddrs...)); err != nil {
level.Error(logger).Log("msg", "failed to resolve addresses for storeAPIs", "err", err)
}
if err := dnsRuleProvider.Resolve(resolveCtx, ruleAddrs); err != nil {
level.Error(logger).Log("msg", "failed to resolve addresses for rulesAPIs", "err", err)
}
if err := dnsTargetProvider.Resolve(resolveCtx, targetAddrs); err != nil {
level.Error(logger).Log("msg", "failed to resolve addresses for targetsAPIs", "err", err)
}
if err := dnsMetadataProvider.Resolve(resolveCtx, metadataAddrs); err != nil {
level.Error(logger).Log("msg", "failed to resolve addresses for metadataAPIs", "err", err)
}
if err := dnsExemplarProvider.Resolve(resolveCtx, exemplarAddrs); err != nil {
level.Error(logger).Log("msg", "failed to resolve addresses for exemplarsAPI", "err", err)
}
if err := dnsEndpointProvider.Resolve(resolveCtx, endpointAddrs); err != nil {
level.Error(logger).Log("msg", "failed to resolve addresses passed using endpoint flag", "err", err)
}
return nil
})
}, func(error) {
cancel()
})
}
grpcProbe := prober.NewGRPC()
httpProbe := prober.NewHTTP()
statusProber := prober.Combine(
@@ -645,45 +465,26 @@ func runQuery(
prober.NewInstrumentation(comp, logger, extprom.WrapRegistererWithPrefix("thanos_", reg)),
)
engineOpts := promql.EngineOpts{
Logger: logger,
Reg: reg,
// TODO(bwplotka): Expose this as a flag: https://github.com/thanos-io/thanos/issues/703.
MaxSamples: math.MaxInt32,
Timeout: queryTimeout,
LookbackDelta: lookbackDelta,
NoStepSubqueryIntervalFn: func(int64) int64 {
return defaultEvaluationInterval.Milliseconds()
},
EnableNegativeOffset: true,
EnableAtModifier: true,
}
// An active query tracker will be added only if the user specifies a non-default path.
// Otherwise, the nil active query tracker from existing engine options will be used.
var activeQueryTracker *promql.ActiveQueryTracker
if activeQueryDir != "" {
engineOpts.ActiveQueryTracker = promql.NewActiveQueryTracker(activeQueryDir, maxConcurrentQueries, logger)
activeQueryTracker = promql.NewActiveQueryTracker(activeQueryDir, maxConcurrentQueries, logutil.GoKitLogToSlog(logger))
}
var remoteEngineEndpoints api.RemoteEndpoints
if queryMode != queryModeLocal {
level.Info(logger).Log("msg", "Distributed query mode enabled, using Thanos as the default query engine.")
defaultEngine = string(apiv1.PromqlEngineThanos)
remoteEngineEndpoints = query.NewRemoteEndpoints(logger, endpoints.GetQueryAPIClients, query.Opts{
AutoDownsample: enableAutodownsampling,
ReplicaLabels: queryReplicaLabels,
Timeout: queryTimeout,
EnablePartialResponse: enableQueryPartialResponse,
})
}
engineFactory := apiv1.NewQueryEngineFactory(
engineOpts,
remoteEngineEndpoints,
queryCreator := apiv1.NewQueryFactory(
reg,
logger,
queryTimeout,
lookbackDelta,
defaultEvaluationInterval,
extendedFunctionsEnabled,
activeQueryTracker,
queryMode,
disableQueryFallback,
)
lookbackDeltaCreator := LookbackDeltaFactory(engineOpts, dynamicLookbackDelta)
lookbackDeltaCreator := LookbackDeltaFactory(lookbackDelta, dynamicLookbackDelta)
// Start query API + UI HTTP server.
{
@@ -708,15 +509,16 @@ func runQuery(
ins := extpromhttp.NewTenantInstrumentationMiddleware(tenantHeader, defaultTenant, reg, nil)
// TODO(bplotka in PR #513 review): pass all flags, not only the flags needed by prefix rewriting.
ui.NewQueryUI(logger, endpoints, webExternalPrefix, webPrefixHeaderName, alertQueryURL, tenantHeader, defaultTenant, enforceTenancy).Register(router, ins)
ui.NewQueryUI(logger, endpointSet, webExternalPrefix, webPrefixHeaderName, alertQueryURL, tenantHeader, defaultTenant, enforceTenancy).Register(router, ins)
api := apiv1.NewQueryAPI(
logger,
endpoints.GetEndpointStatus,
engineFactory,
endpointSet.GetEndpointStatus,
queryCreator,
apiv1.PromqlEngineType(defaultEngine),
lookbackDeltaCreator,
queryableCreator,
remoteEndpointsCreator,
// NOTE: Will share the same replica label as the query for now.
rules.NewGRPCClientWithDedup(rulesProxy, queryReplicaLabels),
targets.NewGRPCClientWithDedup(targetsProxy, queryReplicaLabels),
@@ -775,23 +577,23 @@ func runQuery(
}
// Start query (proxy) gRPC StoreAPI.
{
tlsCfg, err := tls.NewServerConfig(log.With(logger, "protocol", "gRPC"), grpcServerConfig.tlsSrvCert, grpcServerConfig.tlsSrvKey, grpcServerConfig.tlsSrvClientCA)
tlsCfg, err := tls.NewServerConfig(log.With(logger, "protocol", "gRPC"), grpcServerConfig.tlsSrvCert, grpcServerConfig.tlsSrvKey, grpcServerConfig.tlsSrvClientCA, grpcServerConfig.tlsMinVersion)
if err != nil {
return errors.Wrap(err, "setup gRPC server")
}
infoSrv := info.NewInfoServer(
component.Query.String(),
info.WithLabelSetFunc(func() []labelpb.ZLabelSet { return proxy.LabelSet() }),
info.WithLabelSetFunc(func() []labelpb.ZLabelSet { return proxyStore.LabelSet() }),
info.WithStoreInfoFunc(func() (*infopb.StoreInfo, error) {
if httpProbe.IsReady() {
mint, maxt := proxy.TimeRange()
mint, maxt := proxyStore.TimeRange()
return &infopb.StoreInfo{
MinTime: mint,
MaxTime: maxt,
SupportsSharding: true,
SupportsWithoutReplicaLabels: true,
TsdbInfos: proxy.TSDBInfos(),
TsdbInfos: proxyStore.TSDBInfos(),
}, nil
}
return nil, errors.New("Not ready")
@@ -803,12 +605,11 @@ func runQuery(
info.WithQueryAPIInfoFunc(),
)
defaultEngineType := querypb.EngineType(querypb.EngineType_value[defaultEngine])
grpcAPI := apiv1.NewGRPCAPI(time.Now, queryReplicaLabels, queryableCreator, engineFactory, defaultEngineType, lookbackDeltaCreator, instantDefaultMaxSourceResolution)
storeServer := store.NewLimitedStoreServer(store.NewInstrumentedStoreServer(reg, proxy), reg, storeRateLimits)
defaultEngineType := querypb.EngineType(querypb.EngineType_value[string(defaultEngine)])
grpcAPI := apiv1.NewGRPCAPI(time.Now, queryReplicaLabels, queryableCreator, remoteEndpointsCreator, queryCreator, defaultEngineType, lookbackDeltaCreator, instantDefaultMaxSourceResolution)
s := grpcserver.New(logger, reg, tracer, grpcLogOpts, logFilterMethods, comp, grpcProbe,
grpcserver.WithServer(apiv1.RegisterQueryServer(grpcAPI)),
grpcserver.WithServer(store.RegisterStoreServer(storeServer, logger)),
grpcserver.WithServer(store.RegisterStoreServer(seriesProxy, logger)),
grpcserver.WithServer(rules.RegisterRulesServer(rulesProxy)),
grpcserver.WithServer(targets.RegisterTargetsServer(targetsProxy)),
grpcserver.WithServer(metadata.RegisterMetadataServer(metadataProxy)),
@@ -826,6 +627,7 @@ func runQuery(
}, func(error) {
statusProber.NotReady(err)
s.Shutdown(err)
endpointSet.Close()
})
}
@@ -833,104 +635,11 @@ func runQuery(
return nil
}
func removeDuplicateEndpointSpecs(logger log.Logger, duplicatedStores prometheus.Counter, specs []*query.GRPCEndpointSpec) []*query.GRPCEndpointSpec {
set := make(map[string]*query.GRPCEndpointSpec)
for _, spec := range specs {
addr := spec.Addr()
if _, ok := set[addr]; ok {
level.Warn(logger).Log("msg", "Duplicate store address is provided", "addr", addr)
duplicatedStores.Inc()
}
set[addr] = spec
}
deduplicated := make([]*query.GRPCEndpointSpec, 0, len(set))
for _, value := range set {
deduplicated = append(deduplicated, value)
}
return deduplicated
}
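The deduplication above is a last-writer-wins map keyed by address: later specs silently replace earlier ones for the same address, and only the duplicate counter records that it happened. A minimal standalone sketch of that pattern (the `endpointSpec` type and `dedupe` helper are hypothetical stand-ins, not the real `query.GRPCEndpointSpec`):

```go
package main

import "fmt"

// endpointSpec is a hypothetical stand-in for query.GRPCEndpointSpec.
type endpointSpec struct {
	addr   string
	strict bool
}

// dedupe keeps the last spec seen for each address, mirroring the
// map-based last-writer-wins behavior of removeDuplicateEndpointSpecs.
func dedupe(specs []endpointSpec) []endpointSpec {
	set := make(map[string]endpointSpec, len(specs))
	for _, s := range specs {
		set[s.addr] = s
	}
	out := make([]endpointSpec, 0, len(set))
	for _, s := range set {
		out = append(out, s)
	}
	return out
}

func main() {
	specs := []endpointSpec{
		{addr: "10.0.0.1:10901"},
		{addr: "10.0.0.1:10901", strict: true}, // duplicate address: this one wins
		{addr: "10.0.0.2:10901"},
	}
	fmt.Println(len(dedupe(specs))) // 2
}
```

Note that map iteration order is random in Go, so the deduplicated slice comes back in arbitrary order, which is fine here since the endpoint set treats specs as a set.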
func prepareEndpointSet(
g *run.Group,
logger log.Logger,
reg *prometheus.Registry,
dnsProviders []*dns.Provider,
duplicatedStores prometheus.Counter,
strictStores []string,
strictEndpoints []string,
endpointGroupAddrs []string,
strictEndpointGroups []string,
dialOpts []grpc.DialOption,
unhealthyStoreTimeout time.Duration,
endpointInfoTimeout time.Duration,
queryConnMetricLabels ...string,
) *query.EndpointSet {
endpointSet := query.NewEndpointSet(
time.Now,
logger,
reg,
func() (specs []*query.GRPCEndpointSpec) {
// Add strict & static nodes.
for _, addr := range strictStores {
specs = append(specs, query.NewGRPCEndpointSpec(addr, true))
}
for _, addr := range strictEndpoints {
specs = append(specs, query.NewGRPCEndpointSpec(addr, true))
}
for _, dnsProvider := range dnsProviders {
var tmpSpecs []*query.GRPCEndpointSpec
for _, addr := range dnsProvider.Addresses() {
tmpSpecs = append(tmpSpecs, query.NewGRPCEndpointSpec(addr, false))
}
tmpSpecs = removeDuplicateEndpointSpecs(logger, duplicatedStores, tmpSpecs)
specs = append(specs, tmpSpecs...)
}
for _, eg := range endpointGroupAddrs {
addr := fmt.Sprintf("dns:///%s", eg)
spec := query.NewGRPCEndpointSpec(addr, false, extgrpc.EndpointGroupGRPCOpts()...)
specs = append(specs, spec)
}
for _, eg := range strictEndpointGroups {
addr := fmt.Sprintf("dns:///%s", eg)
spec := query.NewGRPCEndpointSpec(addr, true, extgrpc.EndpointGroupGRPCOpts()...)
specs = append(specs, spec)
}
return specs
},
dialOpts,
unhealthyStoreTimeout,
endpointInfoTimeout,
queryConnMetricLabels...,
)
// Periodically update the store set with the addresses we see in our cluster.
{
ctx, cancel := context.WithCancel(context.Background())
g.Add(func() error {
return runutil.Repeat(5*time.Second, ctx.Done(), func() error {
endpointSet.Update(ctx)
return nil
})
}, func(error) {
cancel()
endpointSet.Close()
})
}
return endpointSet
}
// LookbackDeltaFactory creates from 1 to 3 lookback deltas depending on
// dynamicLookbackDelta and eo.LookbackDelta and returns a function
// that returns appropriate lookback delta for given maxSourceResolutionMillis.
func LookbackDeltaFactory(
eo promql.EngineOpts,
lookbackDelta time.Duration,
dynamicLookbackDelta bool,
) func(int64) time.Duration {
resolutions := []int64{downsample.ResLevel0}
@ -939,10 +648,9 @@ func LookbackDeltaFactory(
}
var (
lds = make([]time.Duration, len(resolutions))
ld = eo.LookbackDelta.Milliseconds()
ld = lookbackDelta.Milliseconds()
)
lookbackDelta := eo.LookbackDelta
for i, r := range resolutions {
if ld < r {
lookbackDelta = time.Duration(r) * time.Millisecond


@ -4,6 +4,7 @@
package main
import (
"context"
"net"
"net/http"
"time"
@ -34,6 +35,7 @@ import (
"github.com/thanos-io/thanos/pkg/logging"
"github.com/thanos-io/thanos/pkg/prober"
"github.com/thanos-io/thanos/pkg/queryfrontend"
"github.com/thanos-io/thanos/pkg/runutil"
httpserver "github.com/thanos-io/thanos/pkg/server/http"
"github.com/thanos-io/thanos/pkg/server/http/middleware"
"github.com/thanos-io/thanos/pkg/tenancy"
@ -97,6 +99,8 @@ func registerQueryFrontend(app *extkingpin.App) {
cmd.Flag("query-frontend.enable-x-functions", "Enable experimental x- functions in query-frontend. --no-query-frontend.enable-x-functions for disabling.").
Default("false").BoolVar(&cfg.EnableXFunctions)
cmd.Flag("enable-feature", "Comma separated feature names to enable. Valid options for now: promql-experimental-functions (enables promql experimental functions in query-frontend)").Default("").StringsVar(&cfg.EnableFeatures)
cmd.Flag("query-range.max-query-length", "Limit the query time range (end - start time) in the query-frontend, 0 disables it.").
Default("0").DurationVar((*time.Duration)(&cfg.QueryRangeConfig.Limits.MaxQueryLength))
@ -146,6 +150,8 @@ func registerQueryFrontend(app *extkingpin.App) {
cmd.Flag("query-frontend.log-queries-longer-than", "Log queries that are slower than the specified duration. "+
"Set to 0 to disable. Set to < 0 to enable on all queries.").Default("0").DurationVar(&cfg.CortexHandlerConfig.LogQueriesLongerThan)
cmd.Flag("query-frontend.force-query-stats", "Enables query statistics for all queries and will export statistics as logs and service headers.").Default("false").BoolVar(&cfg.CortexHandlerConfig.QueryStatsEnabled)
cmd.Flag("query-frontend.org-id-header", "Deprecation Warning - This flag will be soon deprecated in favor of query-frontend.tenant-header"+
" and both flags cannot be used at the same time. "+
"Request header names used to identify the source of slow queries (repeated flag). "+
@ -268,8 +274,9 @@ func runQueryFrontend(
return errors.Wrap(err, "initializing the query range cache config")
}
cfg.QueryRangeConfig.ResultsCacheConfig = &queryrange.ResultsCacheConfig{
Compression: cfg.CacheCompression,
CacheConfig: *cacheConfig,
Compression: cfg.CacheCompression,
CacheConfig: *cacheConfig,
CacheQueryableSamplesStats: cfg.CortexHandlerConfig.QueryStatsEnabled,
}
}
@ -298,6 +305,15 @@ func runQueryFrontend(
}
}
if len(cfg.EnableFeatures) > 0 {
for _, feature := range cfg.EnableFeatures {
if feature == promqlExperimentalFunctions {
parser.EnableExperimentalFunctions = true
level.Info(logger).Log("msg", "Experimental PromQL functions enabled.", "option", promqlExperimentalFunctions)
}
}
}
tripperWare, err := queryfrontend.NewTripperware(cfg.Config, reg, logger)
if err != nil {
return errors.Wrap(err, "setup tripperwares")
@ -359,9 +375,7 @@ func runQueryFrontend(
logger,
ins.NewHandler(
name,
gzhttp.GzipHandler(
logMiddleware.HTTPMiddleware(name, f),
),
logMiddleware.HTTPMiddleware(name, f),
),
// Cortex frontend middlewares require orgID.
),
@ -383,8 +397,55 @@ func runQueryFrontend(
})
}
// Periodically check downstream URL to ensure it is reachable.
{
ctx, cancel := context.WithCancel(context.Background())
g.Add(func() error {
var firstRun = true
for {
if !firstRun {
select {
case <-ctx.Done():
return nil
case <-time.After(10 * time.Second):
}
}
timeoutCtx, cancel := context.WithTimeout(ctx, 10*time.Second)
defer cancel()
readinessUrl := cfg.DownstreamURL + "/-/ready"
req, err := http.NewRequestWithContext(timeoutCtx, http.MethodGet, readinessUrl, nil)
if err != nil {
return errors.Wrap(err, "creating request to downstream URL")
}
resp, err := roundTripper.RoundTrip(req)
if err != nil {
level.Warn(logger).Log("msg", "failed to reach downstream URL", "err", err, "readiness_url", readinessUrl)
statusProber.NotReady(err)
firstRun = false
continue
}
runutil.ExhaustCloseWithLogOnErr(logger, resp.Body, "downstream health check response body")
if resp.StatusCode/100 == 4 || resp.StatusCode/100 == 5 {
level.Warn(logger).Log("msg", "downstream URL returned an error", "status_code", resp.StatusCode, "readiness_url", readinessUrl)
statusProber.NotReady(errors.Errorf("downstream URL %s returned an error: %d", readinessUrl, resp.StatusCode))
firstRun = false
continue
}
statusProber.Ready()
}
}, func(err error) {
cancel()
})
}
level.Info(logger).Log("msg", "starting query frontend")
statusProber.Ready()
return nil
}


@ -7,8 +7,6 @@ import (
"testing"
"time"
"github.com/prometheus/prometheus/promql"
"github.com/efficientgo/core/testutil"
)
@ -87,7 +85,7 @@ func TestLookbackDeltaFactory(t *testing.T) {
}
)
for _, td := range tData {
lookbackCreate := LookbackDeltaFactory(promql.EngineOpts{LookbackDelta: td.lookbackDelta}, td.dynamicLookbackDelta)
lookbackCreate := LookbackDeltaFactory(td.lookbackDelta, td.dynamicLookbackDelta)
for _, tc := range td.tcs {
got := lookbackCreate(tc.stepMillis)
testutil.Equals(t, tc.expect, got)


@ -5,6 +5,8 @@ package main
import (
"context"
"fmt"
"net"
"os"
"path"
"strings"
@ -25,12 +27,11 @@ import (
"github.com/prometheus/prometheus/model/relabel"
"github.com/prometheus/prometheus/tsdb"
"github.com/prometheus/prometheus/tsdb/wlog"
"google.golang.org/grpc"
"gopkg.in/yaml.v2"
"github.com/thanos-io/objstore"
"github.com/thanos-io/objstore/client"
objstoretracing "github.com/thanos-io/objstore/tracing/opentracing"
"google.golang.org/grpc"
"gopkg.in/yaml.v2"
"github.com/thanos-io/thanos/pkg/block/metadata"
"github.com/thanos-io/thanos/pkg/component"
@ -48,12 +49,16 @@ import (
grpcserver "github.com/thanos-io/thanos/pkg/server/grpc"
httpserver "github.com/thanos-io/thanos/pkg/server/http"
"github.com/thanos-io/thanos/pkg/store"
storecache "github.com/thanos-io/thanos/pkg/store/cache"
"github.com/thanos-io/thanos/pkg/store/labelpb"
"github.com/thanos-io/thanos/pkg/tenancy"
"github.com/thanos-io/thanos/pkg/tls"
)
const compressionNone = "none"
const (
compressionNone = "none"
metricNamesFilter = "metric-names-filter"
)
func registerReceive(app *extkingpin.App) {
cmd := app.Command(component.Receive.String(), "Accept Prometheus remote write API requests and write to local tsdb.")
@ -70,7 +75,7 @@ func registerReceive(app *extkingpin.App) {
if !model.LabelName.IsValid(model.LabelName(conf.tenantLabelName)) {
return errors.Errorf("unsupported format for tenant label name, got %s", conf.tenantLabelName)
}
if len(lset) == 0 {
if lset.Len() == 0 {
return errors.New("no external labels configured for receive, uniquely identifying external labels must be configured (ideally with `receive_` prefix); see https://thanos.io/tip/thanos/storage.md#external-labels for details.")
}
@ -136,7 +141,18 @@ func runReceive(
level.Info(logger).Log("mode", receiveMode, "msg", "running receive")
rwTLSConfig, err := tls.NewServerConfig(log.With(logger, "protocol", "HTTP"), conf.rwServerCert, conf.rwServerKey, conf.rwServerClientCA)
multiTSDBOptions := []receive.MultiTSDBOption{
receive.WithHeadExpandedPostingsCacheSize(conf.headExpandedPostingsCacheSize),
receive.WithBlockExpandedPostingsCacheSize(conf.compactedBlocksExpandedPostingsCacheSize),
}
for _, feature := range *conf.featureList {
if feature == metricNamesFilter {
multiTSDBOptions = append(multiTSDBOptions, receive.WithMetricNameFilterEnabled())
level.Info(logger).Log("msg", "metric name filter feature enabled")
}
}
rwTLSConfig, err := tls.NewServerConfig(log.With(logger, "protocol", "HTTP"), conf.rwServerCert, conf.rwServerKey, conf.rwServerClientCA, conf.rwServerTlsMinVersion)
if err != nil {
return err
}
@ -159,6 +175,10 @@ func runReceive(
dialOpts = append(dialOpts, grpc.WithDefaultCallOptions(grpc.UseCompressor(conf.compression)))
}
if conf.grpcServiceConfig != "" {
dialOpts = append(dialOpts, grpc.WithDefaultServiceConfig(conf.grpcServiceConfig))
}
var bkt objstore.Bucket
confContentYaml, err := conf.objStoreConfig.Content()
if err != nil {
@ -180,7 +200,7 @@ func runReceive(
}
// The background shipper continuously scans the data directory and uploads
// new blocks to object storage service.
bkt, err = client.NewBucket(logger, confContentYaml, comp.String())
bkt, err = client.NewBucket(logger, confContentYaml, comp.String(), nil)
if err != nil {
return err
}
@ -190,10 +210,9 @@ func runReceive(
}
}
// TODO(brancz): remove after a couple of versions
// Migrate non-multi-tsdb capable storage to multi-tsdb disk layout.
if err := migrateLegacyStorage(logger, conf.dataDir, conf.defaultTenantID); err != nil {
return errors.Wrapf(err, "migrate legacy storage in %v to default tenant %v", conf.dataDir, conf.defaultTenantID)
// Create TSDB for the default tenant.
if err := createDefautTenantTSDB(logger, conf.dataDir, conf.defaultTenantID); err != nil {
return errors.Wrapf(err, "create default tenant tsdb in %v", conf.dataDir)
}
relabelContentYaml, err := conf.relabelConfigPath.Content()
@ -205,6 +224,15 @@ func runReceive(
return errors.Wrap(err, "parse relabel configuration")
}
var cache = storecache.NoopMatchersCache
if conf.matcherCacheSize > 0 {
cache, err = storecache.NewMatchersCache(storecache.WithSize(conf.matcherCacheSize), storecache.WithPromRegistry(reg))
if err != nil {
return errors.Wrap(err, "failed to create matchers cache")
}
multiTSDBOptions = append(multiTSDBOptions, receive.WithMatchersCache(cache))
}
dbs := receive.NewMultiTSDB(
conf.dataDir,
logger,
@ -214,7 +242,9 @@ func runReceive(
conf.tenantLabelName,
bkt,
conf.allowOutOfOrderUpload,
conf.skipCorruptedBlocks,
hashFunc,
multiTSDBOptions...,
)
writer := receive.NewWriter(log.With(logger, "component", "receive-writer"), dbs, &receive.WriterOptions{
Intern: conf.writerInterning,
@ -259,6 +289,9 @@ func runReceive(
Limiter: limiter,
AsyncForwardWorkerCount: conf.asyncForwardWorkerCount,
ReplicationProtocol: receive.ReplicationProtocol(conf.replicationProtocol),
OtlpEnableTargetInfo: conf.otlpEnableTargetInfo,
OtlpResourceAttributes: conf.otlpResourceAttributes,
})
grpcProbe := prober.NewGRPC()
@ -316,13 +349,19 @@ func runReceive(
level.Debug(logger).Log("msg", "setting up gRPC server")
{
tlsCfg, err := tls.NewServerConfig(log.With(logger, "protocol", "gRPC"), conf.grpcConfig.tlsSrvCert, conf.grpcConfig.tlsSrvKey, conf.grpcConfig.tlsSrvClientCA)
tlsCfg, err := tls.NewServerConfig(log.With(logger, "protocol", "gRPC"), conf.grpcConfig.tlsSrvCert, conf.grpcConfig.tlsSrvKey, conf.grpcConfig.tlsSrvClientCA, conf.grpcConfig.tlsMinVersion)
if err != nil {
return errors.Wrap(err, "setup gRPC server")
}
if conf.lazyRetrievalMaxBufferedResponses <= 0 {
return errors.New("--receive.lazy-retrieval-max-buffered-responses must be > 0")
}
options := []store.ProxyStoreOption{
store.WithProxyStoreDebugLogging(debugLogging),
store.WithMatcherCache(cache),
store.WithoutDedup(),
store.WithLazyRetrievalMaxBufferedResponsesForProxy(conf.lazyRetrievalMaxBufferedResponses),
}
proxy := store.NewProxyStore(
@ -452,6 +491,26 @@ func runReceive(
}
}
{
capNProtoWriter := receive.NewCapNProtoWriter(logger, dbs, &receive.CapNProtoWriterOptions{
TooFarInFutureTimeWindow: int64(time.Duration(*conf.tsdbTooFarInFutureTimeWindow)),
})
handler := receive.NewCapNProtoHandler(logger, capNProtoWriter)
listener, err := net.Listen("tcp", conf.replicationAddr)
if err != nil {
return err
}
server := receive.NewCapNProtoServer(listener, handler, logger)
g.Add(func() error {
return server.ListenAndServe()
}, func(err error) {
server.Shutdown()
if err := listener.Close(); err != nil {
level.Warn(logger).Log("msg", "Cap'n Proto server did not shut down gracefully", "err", err.Error())
}
})
}
level.Info(logger).Log("msg", "starting receiver")
return nil
}
@ -538,7 +597,7 @@ func setupHashring(g *run.Group,
webHandler.Hashring(receive.SingleNodeHashring(conf.endpoint))
level.Info(logger).Log("msg", "Empty hashring config. Set up single node hashring.")
} else {
h, err := receive.NewMultiHashring(algorithm, conf.replicationFactor, c)
h, err := receive.NewMultiHashring(algorithm, conf.replicationFactor, c, reg)
if err != nil {
return errors.Wrap(err, "unable to create new hashring from config")
}
@ -740,38 +799,25 @@ func startTSDBAndUpload(g *run.Group,
return nil
}
func migrateLegacyStorage(logger log.Logger, dataDir, defaultTenantID string) error {
func createDefautTenantTSDB(logger log.Logger, dataDir, defaultTenantID string) error {
defaultTenantDataDir := path.Join(dataDir, defaultTenantID)
if _, err := os.Stat(defaultTenantDataDir); !os.IsNotExist(err) {
level.Info(logger).Log("msg", "default tenant data dir already present, not attempting to migrate storage")
level.Info(logger).Log("msg", "default tenant data dir already present, will not create")
return nil
}
if _, err := os.Stat(dataDir); os.IsNotExist(err) {
level.Info(logger).Log("msg", "no existing storage found, no data migration attempted")
level.Info(logger).Log("msg", "no existing storage found, not creating default tenant data dir")
return nil
}
level.Info(logger).Log("msg", "found legacy storage, migrating to multi-tsdb layout with default tenant", "defaultTenantID", defaultTenantID)
files, err := os.ReadDir(dataDir)
if err != nil {
return errors.Wrapf(err, "read legacy data dir: %v", dataDir)
}
level.Info(logger).Log("msg", "default tenant data dir not found, creating", "defaultTenantID", defaultTenantID)
if err := os.MkdirAll(defaultTenantDataDir, 0750); err != nil {
return errors.Wrapf(err, "create default tenant data dir: %v", defaultTenantDataDir)
}
for _, f := range files {
from := path.Join(dataDir, f.Name())
to := path.Join(defaultTenantDataDir, f.Name())
if err := os.Rename(from, to); err != nil {
return errors.Wrapf(err, "migrate file from %v to %v", from, to)
}
}
return nil
}
@ -782,16 +828,18 @@ type receiveConfig struct {
grpcConfig grpcConfig
rwAddress string
rwServerCert string
rwServerKey string
rwServerClientCA string
rwClientCert string
rwClientKey string
rwClientSecure bool
rwClientServerCA string
rwClientServerName string
rwClientSkipVerify bool
replicationAddr string
rwAddress string
rwServerCert string
rwServerKey string
rwServerClientCA string
rwClientCert string
rwClientKey string
rwClientSecure bool
rwClientServerCA string
rwClientServerName string
rwClientSkipVerify bool
rwServerTlsMinVersion string
dataDir string
labelStrs []string
@ -803,17 +851,19 @@ type receiveConfig struct {
hashringsFileContent string
hashringsAlgorithm string
refreshInterval *model.Duration
endpoint string
tenantHeader string
tenantField string
tenantLabelName string
defaultTenantID string
replicaHeader string
replicationFactor uint64
forwardTimeout *model.Duration
maxBackoff *model.Duration
compression string
refreshInterval *model.Duration
endpoint string
tenantHeader string
tenantField string
tenantLabelName string
defaultTenantID string
replicaHeader string
replicationFactor uint64
forwardTimeout *model.Duration
maxBackoff *model.Duration
compression string
replicationProtocol string
grpcServiceConfig string
tsdbMinBlockDuration *model.Duration
tsdbMaxBlockDuration *model.Duration
@ -836,6 +886,7 @@ type receiveConfig struct {
ignoreBlockSize bool
allowOutOfOrderUpload bool
skipCorruptedBlocks bool
reqLogConfig *extflag.PathOrContent
relabelConfigPath *extflag.PathOrContent
@ -845,6 +896,17 @@ type receiveConfig struct {
limitsConfigReloadTimer time.Duration
asyncForwardWorkerCount uint
matcherCacheSize int
lazyRetrievalMaxBufferedResponses int
featureList *[]string
headExpandedPostingsCacheSize uint64
compactedBlocksExpandedPostingsCacheSize uint64
otlpEnableTargetInfo bool
otlpResourceAttributes []string
}
func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) {
@ -861,6 +923,8 @@ func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) {
cmd.Flag("remote-write.server-tls-client-ca", "TLS CA to verify clients against. If no client CA is specified, there is no client verification on server side. (tls.NoClientCert)").Default("").StringVar(&rc.rwServerClientCA)
cmd.Flag("remote-write.server-tls-min-version", "TLS version for the gRPC server, leave blank to default to TLS 1.3, allow values: [\"1.0\", \"1.1\", \"1.2\", \"1.3\"]").Default("1.3").StringVar(&rc.rwServerTlsMinVersion)
cmd.Flag("remote-write.client-tls-cert", "TLS Certificates to use to identify this client to the server.").Default("").StringVar(&rc.rwClientCert)
cmd.Flag("remote-write.client-tls-key", "TLS Key for the client's certificate.").Default("").StringVar(&rc.rwClientKey)
@ -914,6 +978,15 @@ func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) {
cmd.Flag("receive.replication-factor", "How many times to replicate incoming write requests.").Default("1").Uint64Var(&rc.replicationFactor)
replicationProtocols := []string{string(receive.ProtobufReplication), string(receive.CapNProtoReplication)}
cmd.Flag("receive.replication-protocol", "The protocol to use for replicating remote-write requests. One of "+strings.Join(replicationProtocols, ", ")).
Default(string(receive.ProtobufReplication)).
EnumVar(&rc.replicationProtocol, replicationProtocols...)
cmd.Flag("receive.capnproto-address", "Address for the Cap'n Proto server.").Default(fmt.Sprintf("0.0.0.0:%s", receive.DefaultCapNProtoPort)).StringVar(&rc.replicationAddr)
cmd.Flag("receive.grpc-service-config", "gRPC service configuration file or content in JSON format. See https://github.com/grpc/grpc/blob/master/doc/service_config.md").PlaceHolder("<content>").Default("").StringVar(&rc.grpcServiceConfig)
rc.forwardTimeout = extkingpin.ModelDuration(cmd.Flag("receive-forward-timeout", "Timeout for each forward request.").Default("5s").Hidden())
rc.maxBackoff = extkingpin.ModelDuration(cmd.Flag("receive-forward-max-backoff", "Maximum backoff for each forward fan-out request").Default("5s").Hidden())
@ -925,18 +998,18 @@ func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) {
rc.tsdbMaxBlockDuration = extkingpin.ModelDuration(cmd.Flag("tsdb.max-block-duration", "Max duration for local TSDB blocks").Default("2h").Hidden())
rc.tsdbTooFarInFutureTimeWindow = extkingpin.ModelDuration(cmd.Flag("tsdb.too-far-in-future.time-window",
"[EXPERIMENTAL] Configures the allowed time window for ingesting samples too far in the future. Disabled (0s) by default"+
"Configures the allowed time window for ingesting samples too far in the future. Disabled (0s) by default. "+
"Please note enable this flag will reject samples in the future of receive local NTP time + configured duration due to clock skew in remote write clients.",
).Default("0s"))
rc.tsdbOutOfOrderTimeWindow = extkingpin.ModelDuration(cmd.Flag("tsdb.out-of-order.time-window",
"[EXPERIMENTAL] Configures the allowed time window for ingestion of out-of-order samples. Disabled (0s) by default"+
"Please note if you enable this option and you use compactor, make sure you have the --enable-vertical-compaction flag enabled, otherwise you might risk compactor halt.",
).Default("0s").Hidden())
"Please note if you enable this option and you use compactor, make sure you have the --compact.enable-vertical-compaction flag enabled, otherwise you might risk compactor halt.",
).Default("0s"))
cmd.Flag("tsdb.out-of-order.cap-max",
"[EXPERIMENTAL] Configures the maximum capacity for out-of-order chunks (in samples). If set to <=0, default value 32 is assumed.",
).Default("0").Hidden().Int64Var(&rc.tsdbOutOfOrderCapMax)
).Default("0").Int64Var(&rc.tsdbOutOfOrderCapMax)
cmd.Flag("tsdb.allow-overlapping-blocks", "Allow overlapping blocks, which in turn enables vertical compaction and vertical query merge. Does not do anything, enabled all the time.").Default("false").BoolVar(&rc.tsdbAllowOverlappingBlocks)
@ -946,6 +1019,9 @@ func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) {
cmd.Flag("tsdb.no-lockfile", "Do not create lockfile in TSDB data directory. In any case, the lockfiles will be deleted on next startup.").Default("false").BoolVar(&rc.noLockFile)
cmd.Flag("tsdb.head.expanded-postings-cache-size", "[EXPERIMENTAL] If non-zero, enables expanded postings cache for the head block.").Default("0").Uint64Var(&rc.headExpandedPostingsCacheSize)
cmd.Flag("tsdb.block.expanded-postings-cache-size", "[EXPERIMENTAL] If non-zero, enables expanded postings cache for compacted blocks.").Default("0").Uint64Var(&rc.compactedBlocksExpandedPostingsCacheSize)
cmd.Flag("tsdb.max-exemplars",
"Enables support for ingesting exemplars and sets the maximum number of exemplars that will be stored per tenant."+
" In case the exemplar storage becomes full (number of stored exemplars becomes equal to max-exemplars),"+
@ -963,7 +1039,7 @@ func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) {
cmd.Flag("tsdb.enable-native-histograms",
"[EXPERIMENTAL] Enables the ingestion of native histograms.").
Default("false").Hidden().BoolVar(&rc.tsdbEnableNativeHistograms)
Default("false").BoolVar(&rc.tsdbEnableNativeHistograms)
cmd.Flag("writer.intern",
"[EXPERIMENTAL] Enables string interning in receive writer, for more optimized memory usage.").
@ -980,11 +1056,27 @@ func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) {
"about order.").
Default("false").Hidden().BoolVar(&rc.allowOutOfOrderUpload)
cmd.Flag("shipper.skip-corrupted-blocks",
"If true, shipper will skip corrupted blocks in the given iteration and retry later. This means that some newer blocks might be uploaded sooner than older blocks."+
"This can trigger compaction without those blocks and as a result will create an overlap situation. Set it to true if you have vertical compaction enabled and wish to upload blocks as soon as possible without caring"+
"about order.").
Default("false").Hidden().BoolVar(&rc.skipCorruptedBlocks)
cmd.Flag("matcher-cache-size", "Max number of cached matchers items. Using 0 disables caching.").Default("0").IntVar(&rc.matcherCacheSize)
rc.reqLogConfig = extkingpin.RegisterRequestLoggingFlags(cmd)
rc.writeLimitsConfig = extflag.RegisterPathOrContent(cmd, "receive.limits-config", "YAML file that contains limit configuration.", extflag.WithEnvSubstitution(), extflag.WithHidden())
cmd.Flag("receive.limits-config-reload-timer", "Minimum amount of time to pass for the limit configuration to be reloaded. Helps to avoid excessive reloads.").
Default("1s").Hidden().DurationVar(&rc.limitsConfigReloadTimer)
cmd.Flag("receive.otlp-enable-target-info", "Enables target information in OTLP metrics ingested by Receive. If enabled, it converts the resource to the target info metric").Default("true").BoolVar(&rc.otlpEnableTargetInfo)
cmd.Flag("receive.otlp-promote-resource-attributes", "(Repeatable) Resource attributes to include in OTLP metrics ingested by Receive.").Default("").StringsVar(&rc.otlpResourceAttributes)
rc.featureList = cmd.Flag("enable-feature", "Comma separated experimental feature names to enable. The current list of features is "+metricNamesFilter+".").Default("").Strings()
cmd.Flag("receive.lazy-retrieval-max-buffered-responses", "The lazy retrieval strategy can buffer up to this number of responses. This is to limit the memory usage. This flag takes effect only when the lazy retrieval strategy is enabled.").
Default("20").IntVar(&rc.lazyRetrievalMaxBufferedResponses)
}
// determineMode returns the ReceiverMode that this receiver is configured to run in.


@ -14,6 +14,7 @@ import (
"os"
"path/filepath"
"strings"
"sync"
texttemplate "text/template"
"time"
@ -35,17 +36,18 @@ import (
"github.com/prometheus/prometheus/promql"
"github.com/prometheus/prometheus/promql/parser"
"github.com/prometheus/prometheus/rules"
"github.com/prometheus/prometheus/scrape"
"github.com/prometheus/prometheus/storage"
"github.com/prometheus/prometheus/storage/remote"
"github.com/prometheus/prometheus/tsdb"
"github.com/prometheus/prometheus/tsdb/agent"
"github.com/prometheus/prometheus/tsdb/wlog"
"gopkg.in/yaml.v2"
"github.com/thanos-io/objstore"
"github.com/thanos-io/objstore/client"
objstoretracing "github.com/thanos-io/objstore/tracing/opentracing"
"github.com/thanos-io/promql-engine/execution/parse"
"gopkg.in/yaml.v2"
"github.com/thanos-io/thanos/pkg/alert"
v1 "github.com/thanos-io/thanos/pkg/api/rule"
@ -54,6 +56,7 @@ import (
"github.com/thanos-io/thanos/pkg/component"
"github.com/thanos-io/thanos/pkg/discovery/dns"
"github.com/thanos-io/thanos/pkg/errutil"
"github.com/thanos-io/thanos/pkg/extannotations"
"github.com/thanos-io/thanos/pkg/extgrpc"
"github.com/thanos-io/thanos/pkg/extkingpin"
"github.com/thanos-io/thanos/pkg/extprom"
@ -62,6 +65,7 @@ import (
"github.com/thanos-io/thanos/pkg/info"
"github.com/thanos-io/thanos/pkg/info/infopb"
"github.com/thanos-io/thanos/pkg/logging"
"github.com/thanos-io/thanos/pkg/logutil"
"github.com/thanos-io/thanos/pkg/prober"
"github.com/thanos-io/thanos/pkg/promclient"
"github.com/thanos-io/thanos/pkg/query"
@ -78,8 +82,6 @@ import (
"github.com/thanos-io/thanos/pkg/ui"
)
const dnsSDResolver = "miekgdns"
type ruleConfig struct {
http httpConfig
grpc grpcConfig
@ -97,18 +99,22 @@ type ruleConfig struct {
rwConfig *extflag.PathOrContent
resendDelay time.Duration
evalInterval time.Duration
outageTolerance time.Duration
forGracePeriod time.Duration
ruleFiles []string
objStoreConfig *extflag.PathOrContent
dataDir string
lset labels.Labels
ignoredLabelNames []string
storeRateLimits store.SeriesSelectLimits
resendDelay time.Duration
evalInterval time.Duration
queryOffset time.Duration
outageTolerance time.Duration
forGracePeriod time.Duration
ruleFiles []string
objStoreConfig *extflag.PathOrContent
dataDir string
lset labels.Labels
ignoredLabelNames []string
storeRateLimits store.SeriesSelectLimits
ruleConcurrentEval int64
extendedFunctionsEnabled bool
extendedFunctionsEnabled bool
EnableFeatures []string
tsdbEnableNativeHistograms bool
}
type Expression struct {
@ -149,17 +155,25 @@ func registerRule(app *extkingpin.App) {
Default("1m").DurationVar(&conf.resendDelay)
cmd.Flag("eval-interval", "The default evaluation interval to use.").
Default("1m").DurationVar(&conf.evalInterval)
cmd.Flag("rule-query-offset", "The default rule group query_offset duration to use.").
Default("0s").DurationVar(&conf.queryOffset)
cmd.Flag("for-outage-tolerance", "Max time to tolerate prometheus outage for restoring \"for\" state of alert.").
Default("1h").DurationVar(&conf.outageTolerance)
cmd.Flag("for-grace-period", "Minimum duration between alert and restored \"for\" state. This is maintained only for alerts with configured \"for\" time greater than grace period.").
Default("10m").DurationVar(&conf.forGracePeriod)
cmd.Flag("restore-ignored-label", "Label names to be ignored when restoring alerts from the remote storage. This is only used in stateless mode.").
StringsVar(&conf.ignoredLabelNames)
cmd.Flag("rule-concurrent-evaluation", "How many rules can be evaluated concurrently. Default is 1.").Default("1").Int64Var(&conf.ruleConcurrentEval)
cmd.Flag("grpc-query-endpoint", "Addresses of Thanos gRPC query API servers (repeatable). The scheme may be prefixed with 'dns+' or 'dnssrv+' to detect Thanos API servers through respective DNS lookups.").
PlaceHolder("<endpoint>").StringsVar(&conf.grpcQueryEndpoints)
cmd.Flag("query.enable-x-functions", "Whether to enable extended rate functions (xrate, xincrease and xdelta). Only has effect when used with Thanos engine.").Default("false").BoolVar(&conf.extendedFunctionsEnabled)
cmd.Flag("enable-feature", "Comma separated feature names to enable. Valid options for now: promql-experimental-functions (enables promql experimental functions for ruler)").Default("").StringsVar(&conf.EnableFeatures)
cmd.Flag("tsdb.enable-native-histograms",
"[EXPERIMENTAL] Enables the ingestion of native histograms.").
Default("false").BoolVar(&conf.tsdbEnableNativeHistograms)
conf.rwConfig = extflag.RegisterPathOrContent(cmd, "remote-write.config", "YAML config for the remote-write configurations, that specify servers where samples should be sent to (see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write). This automatically enables stateless mode for ruler and no series will be stored in the ruler's TSDB. If an empty config (or file) is provided, the flag is ignored and ruler is run with its own TSDB.", extflag.WithEnvSubstitution())
@ -180,11 +194,12 @@ func registerRule(app *extkingpin.App) {
}
tsdbOpts := &tsdb.Options{
MinBlockDuration: int64(time.Duration(*tsdbBlockDuration) / time.Millisecond),
MaxBlockDuration: int64(time.Duration(*tsdbBlockDuration) / time.Millisecond),
RetentionDuration: int64(time.Duration(*tsdbRetention) / time.Millisecond),
NoLockfile: *noLockFile,
WALCompression: wlog.ParseCompressionType(*walCompression, string(wlog.CompressionSnappy)),
MinBlockDuration: int64(time.Duration(*tsdbBlockDuration) / time.Millisecond),
MaxBlockDuration: int64(time.Duration(*tsdbBlockDuration) / time.Millisecond),
RetentionDuration: int64(time.Duration(*tsdbRetention) / time.Millisecond),
NoLockfile: *noLockFile,
WALCompression: wlog.ParseCompressionType(*walCompression, string(wlog.CompressionSnappy)),
EnableNativeHistograms: conf.tsdbEnableNativeHistograms,
}
agentOpts := &agent.Options{
@ -292,7 +307,7 @@ func newRuleMetrics(reg *prometheus.Registry) *RuleMetrics {
m.ruleEvalWarnings = factory.NewCounterVec(
prometheus.CounterOpts{
Name: "thanos_rule_evaluation_with_warnings_total",
Help: "The total number of rule evaluation that were successful but had warnings which can indicate partial error.",
Help: "The total number of rule evaluation that were successful but had non PromQL warnings which can indicate partial error.",
}, []string{"strategy"},
)
m.ruleEvalWarnings.WithLabelValues(strings.ToLower(storepb.PartialResponseStrategy_ABORT.String()))
@ -325,7 +340,7 @@ func runRule(
if len(conf.queryConfigYAML) > 0 {
queryCfg, err = clientconfig.LoadConfigs(conf.queryConfigYAML)
if err != nil {
return err
return errors.Wrap(err, "query configuration")
}
} else {
queryCfg, err = clientconfig.BuildConfigFromHTTPAddresses(conf.query.addrs)
@ -382,12 +397,12 @@ func runRule(
cfg.HTTPConfig.HTTPClientConfig.ClientMetrics = queryClientMetrics
c, err := clientconfig.NewHTTPClient(cfg.HTTPConfig.HTTPClientConfig, "query")
if err != nil {
return err
return fmt.Errorf("failed to create HTTP query client: %w", err)
}
c.Transport = tracing.HTTPTripperware(logger, c.Transport)
queryClient, err := clientconfig.NewClient(logger, cfg.HTTPConfig.EndpointsConfig, c, queryProvider.Clone())
if err != nil {
return err
return fmt.Errorf("failed to create query client: %w", err)
}
queryClients = append(queryClients, queryClient)
promClients = append(promClients, promclient.NewClient(queryClient, logger, "thanos-rule"))
@ -401,17 +416,6 @@ func runRule(
}
if len(grpcEndpoints) > 0 {
duplicatedGRPCEndpoints := promauto.With(reg).NewCounter(prometheus.CounterOpts{
Name: "thanos_rule_grpc_endpoints_duplicated_total",
Help: "The number of times a duplicated grpc endpoint is detected from the different configs in rule",
})
dnsEndpointProvider := dns.NewProvider(
logger,
extprom.WrapRegistererWithPrefix("thanos_rule_grpc_endpoints_", reg),
dnsSDResolver,
)
dialOpts, err := extgrpc.StoreClientGRPCOpts(
logger,
reg,
@ -427,36 +431,27 @@ func runRule(
return err
}
grpcEndpointSet = prepareEndpointSet(
grpcEndpointSet, err = setupEndpointSet(
g,
logger,
comp,
reg,
[]*dns.Provider{dnsEndpointProvider},
duplicatedGRPCEndpoints,
logger,
nil,
1*time.Minute,
nil,
1*time.Minute,
grpcEndpoints,
nil,
nil,
nil,
nil,
dialOpts,
conf.query.dnsSDResolver,
conf.query.dnsSDInterval,
5*time.Minute,
5*time.Second,
dialOpts,
)
// Periodically update the GRPC addresses from query config by resolving them using DNS SD if necessary.
{
ctx, cancel := context.WithCancel(context.Background())
g.Add(func() error {
return runutil.Repeat(5*time.Second, ctx.Done(), func() error {
resolveCtx, resolveCancel := context.WithTimeout(ctx, 5*time.Second)
defer resolveCancel()
if err := dnsEndpointProvider.Resolve(resolveCtx, grpcEndpoints); err != nil {
level.Error(logger).Log("msg", "failed to resolve addresses passed using grpc query config", "err", err)
}
return nil
})
}, func(error) {
cancel()
})
if err != nil {
return err
}
}
@ -480,10 +475,11 @@ func runRule(
return errors.Wrapf(err, "failed to parse remote write config %v", string(rwCfgYAML))
}
slogger := logutil.GoKitLogToSlog(logger)
// flushDeadline is set to 1m, but it is for metadata watcher only so not used here.
remoteStore := remote.NewStorage(logger, reg, func() (int64, error) {
remoteStore := remote.NewStorage(slogger, reg, func() (int64, error) {
return 0, nil
}, conf.dataDir, 1*time.Minute, nil)
}, conf.dataDir, 1*time.Minute, &readyScrapeManager{})
if err := remoteStore.ApplyConfig(&config.Config{
GlobalConfig: config.GlobalConfig{
ExternalLabels: labelsTSDBToProm(conf.lset),
@ -493,18 +489,18 @@ func runRule(
return errors.Wrap(err, "applying config to remote storage")
}
agentDB, err = agent.Open(logger, reg, remoteStore, conf.dataDir, agentOpts)
agentDB, err = agent.Open(slogger, reg, remoteStore, conf.dataDir, agentOpts)
if err != nil {
return errors.Wrap(err, "start remote write agent db")
}
fanoutStore := storage.NewFanout(logger, agentDB, remoteStore)
fanoutStore := storage.NewFanout(slogger, agentDB, remoteStore)
appendable = fanoutStore
// Use a separate queryable to restore the ALERTS firing states.
// We cannot use remoteStore directly because it uses remote read for
// query. However, remote read is not implemented in Thanos Receiver.
queryable = thanosrules.NewPromClientsQueryable(logger, queryClients, promClients, conf.query.httpMethod, conf.query.step, conf.ignoredLabelNames)
} else {
tsdbDB, err = tsdb.Open(conf.dataDir, log.With(logger, "component", "tsdb"), reg, tsdbOpts, nil)
tsdbDB, err = tsdb.Open(conf.dataDir, logutil.GoKitLogToSlog(log.With(logger, "component", "tsdb")), reg, tsdbOpts, nil)
if err != nil {
return errors.Wrap(err, "open TSDB")
}
@ -595,6 +591,15 @@ func runRule(
}
}
if len(conf.EnableFeatures) > 0 {
for _, feature := range conf.EnableFeatures {
if feature == promqlExperimentalFunctions {
parser.EnableExperimentalFunctions = true
level.Info(logger).Log("msg", "Experimental PromQL functions enabled.", "option", promqlExperimentalFunctions)
}
}
}
// Run rule evaluation and alert notifications.
notifyFunc := func(ctx context.Context, expr string, alerts ...*rules.Alert) {
res := make([]*notifier.Alert, 0, len(alerts))
@ -623,22 +628,29 @@ func runRule(
alertQ.Push(res)
}
managerOpts := rules.ManagerOptions{
NotifyFunc: notifyFunc,
Logger: logutil.GoKitLogToSlog(logger),
Appendable: appendable,
ExternalURL: nil,
Queryable: queryable,
ResendDelay: conf.resendDelay,
OutageTolerance: conf.outageTolerance,
ForGracePeriod: conf.forGracePeriod,
DefaultRuleQueryOffset: func() time.Duration { return conf.queryOffset },
}
if conf.ruleConcurrentEval > 1 {
managerOpts.MaxConcurrentEvals = conf.ruleConcurrentEval
managerOpts.ConcurrentEvalsEnabled = true
}
ctx, cancel := context.WithCancel(context.Background())
logger = log.With(logger, "component", "rules")
ruleMgr = thanosrules.NewManager(
tracing.ContextWithTracer(ctx, tracer),
reg,
conf.dataDir,
rules.ManagerOptions{
NotifyFunc: notifyFunc,
Logger: logger,
Appendable: appendable,
ExternalURL: nil,
Queryable: queryable,
ResendDelay: conf.resendDelay,
OutageTolerance: conf.outageTolerance,
ForGracePeriod: conf.forGracePeriod,
},
managerOpts,
queryFuncCreator(logger, queryClients, promClients, grpcEndpointSet, metrics.duplicatedQuery, metrics.ruleEvalWarnings, conf.query.httpMethod, conf.query.doNotAddThanosParams),
conf.lset,
// In our case the querying URL is the external URL because in Prometheus
@ -721,7 +733,7 @@ func runRule(
)
// Start gRPC server.
tlsCfg, err := tls.NewServerConfig(log.With(logger, "protocol", "gRPC"), conf.grpc.tlsSrvCert, conf.grpc.tlsSrvKey, conf.grpc.tlsSrvClientCA)
tlsCfg, err := tls.NewServerConfig(log.With(logger, "protocol", "gRPC"), conf.grpc.tlsSrvCert, conf.grpc.tlsSrvKey, conf.grpc.tlsSrvClientCA, conf.grpc.tlsMinVersion)
if err != nil {
return errors.Wrap(err, "setup gRPC server")
}
@ -833,7 +845,7 @@ func runRule(
if len(confContentYaml) > 0 {
// The background shipper continuously scans the data directory and uploads
// new blocks to Google Cloud Storage or an S3-compatible storage service.
bkt, err := client.NewBucket(logger, confContentYaml, component.Rule.String())
bkt, err := client.NewBucket(logger, confContentYaml, component.Rule.String(), nil)
if err != nil {
return err
}
@ -846,7 +858,18 @@ func runRule(
}
}()
s := shipper.New(logger, reg, conf.dataDir, bkt, func() labels.Labels { return conf.lset }, metadata.RulerSource, nil, conf.shipper.allowOutOfOrderUpload, metadata.HashFunc(conf.shipper.hashFunc), conf.shipper.metaFileName)
s := shipper.New(
bkt,
conf.dataDir,
shipper.WithLogger(logger),
shipper.WithRegisterer(reg),
shipper.WithSource(metadata.RulerSource),
shipper.WithHashFunc(metadata.HashFunc(conf.shipper.hashFunc)),
shipper.WithMetaFileName(conf.shipper.metaFileName),
shipper.WithLabels(func() labels.Labels { return conf.lset }),
shipper.WithAllowOutOfOrderUploads(conf.shipper.allowOutOfOrderUpload),
shipper.WithSkipCorruptedBlocks(conf.shipper.skipCorruptedBlocks),
)
ctx, cancel := context.WithCancel(context.Background())
@ -886,13 +909,7 @@ func removeLockfileIfAny(logger log.Logger, dataDir string) error {
}
func labelsTSDBToProm(lset labels.Labels) (res labels.Labels) {
for _, l := range lset {
res = append(res, labels.Label{
Name: l.Name,
Value: l.Value,
})
}
return res
return lset.Copy()
}
func queryFuncCreator(
@ -939,6 +956,8 @@ func queryFuncCreator(
level.Error(logger).Log("err", err, "query", qs)
continue
}
warns = filterOutPromQLWarnings(warns, logger, qs)
if len(warns) > 0 {
ruleEvalWarnings.WithLabelValues(strings.ToLower(partialResponseStrategy.String())).Inc()
// TODO(bwplotka): Propagate those to UI, probably requires changing rule manager code ):
@ -970,12 +989,13 @@ func queryFuncCreator(
continue
}
if len(result.Warnings) > 0 {
warnings := make([]string, 0, len(result.Warnings))
for _, warn := range result.Warnings {
warnings = append(warnings, warn.Error())
}
warnings = filterOutPromQLWarnings(warnings, logger, qs)
if len(warnings) > 0 {
ruleEvalWarnings.WithLabelValues(strings.ToLower(partialResponseStrategy.String())).Inc()
warnings := make([]string, 0, len(result.Warnings))
for _, w := range result.Warnings {
warnings = append(warnings, w.Error())
}
level.Warn(logger).Log("warnings", strings.Join(warnings, ", "), "query", qs)
}
@ -1081,3 +1101,45 @@ func validateTemplate(tmplStr string) error {
}
return nil
}
// Filter out PromQL related warnings from warning response and keep store related warnings only.
func filterOutPromQLWarnings(warns []string, logger log.Logger, query string) []string {
storeWarnings := make([]string, 0, len(warns))
for _, warn := range warns {
if extannotations.IsPromQLAnnotation(warn) {
level.Warn(logger).Log("warning", warn, "query", query)
continue
}
storeWarnings = append(storeWarnings, warn)
}
return storeWarnings
}
// ReadyScrapeManager allows a scrape manager to be retrieved. Even if it's set at a later point in time.
type readyScrapeManager struct {
mtx sync.RWMutex
m *scrape.Manager
}
// Set the scrape manager.
func (rm *readyScrapeManager) Set(m *scrape.Manager) {
rm.mtx.Lock()
defer rm.mtx.Unlock()
rm.m = m
}
// Get the scrape manager. If is not ready, return an error.
func (rm *readyScrapeManager) Get() (*scrape.Manager, error) {
rm.mtx.RLock()
defer rm.mtx.RUnlock()
if rm.m != nil {
return rm.m, nil
}
return nil, ErrNotReady
}
// ErrNotReady is returned if the underlying scrape manager is not ready yet.
var ErrNotReady = errors.New("scrape manager not ready")


@@ -7,6 +7,10 @@ import (
"testing"
"github.com/efficientgo/core/testutil"
"github.com/go-kit/log"
"github.com/prometheus/prometheus/util/annotations"
"github.com/thanos-io/thanos/pkg/extpromql"
)
func Test_parseFlagLabels(t *testing.T) {
@ -19,19 +23,7 @@ func Test_parseFlagLabels(t *testing.T) {
expectErr: false,
},
{
s: []string{`label-Name="LabelVal"`}, // Unsupported labelname.
expectErr: true,
},
{
s: []string{`label:Name="LabelVal"`}, // Unsupported labelname.
expectErr: true,
},
{
s: []string{`1abelName="LabelVal"`}, // Unsupported labelname.
expectErr: true,
},
{
s: []string{`label_Name"LabelVal"`}, // Missing "=" seprator.
s: []string{`label_Name"LabelVal"`}, // Missing "=" separator.
expectErr: true,
},
{
@ -110,3 +102,59 @@ func Test_tableLinkForExpression(t *testing.T) {
testutil.Equals(t, resStr, td.expectStr)
}
}
func TestFilterOutPromQLWarnings(t *testing.T) {
logger := log.NewNopLogger()
query := "foo"
expr, err := extpromql.ParseExpr(`rate(prometheus_build_info[5m])`)
testutil.Ok(t, err)
possibleCounterInfo := annotations.NewPossibleNonCounterInfo("foo", expr.PositionRange())
badBucketLabelWarning := annotations.NewBadBucketLabelWarning("foo", "0.99", expr.PositionRange())
for _, tc := range []struct {
name string
warnings []string
expected []string
}{
{
name: "nil warning",
expected: make([]string, 0),
},
{
name: "empty warning",
warnings: make([]string, 0),
expected: make([]string, 0),
},
{
name: "no PromQL warning",
warnings: []string{
"some_warning_message",
},
expected: []string{
"some_warning_message",
},
},
{
name: "PromQL warning",
warnings: []string{
possibleCounterInfo.Error(),
},
expected: make([]string, 0),
},
{
name: "filter out all PromQL warnings",
warnings: []string{
possibleCounterInfo.Error(),
badBucketLabelWarning.Error(),
"some_warning_message",
},
expected: []string{
"some_warning_message",
},
},
} {
t.Run(tc.name, func(t *testing.T) {
output := filterOutPromQLWarnings(tc.warnings, logger, query)
testutil.Equals(t, tc.expected, output)
})
}
}
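The behavior exercised by the test above reduces to a plain predicate filter over warning strings. A stdlib-only sketch, assuming (as the test cases suggest) that PromQL annotations are recognizable by their rendered text; the real check lives in `extannotations.IsPromQLAnnotation`, and the prefix test here is a stand-in:

```go
package main

import (
	"fmt"
	"strings"
)

// isPromQLAnnotation is a stand-in for extannotations.IsPromQLAnnotation.
// Assumption: PromQL annotations render with these text prefixes.
func isPromQLAnnotation(w string) bool {
	return strings.HasPrefix(w, "PromQL info:") || strings.HasPrefix(w, "PromQL warning:")
}

// filterStoreWarnings keeps only store-related warnings, dropping PromQL
// ones, mirroring filterOutPromQLWarnings in rule.go (minus the logging).
func filterStoreWarnings(warns []string) []string {
	kept := make([]string, 0, len(warns))
	for _, w := range warns {
		if isPromQLAnnotation(w) {
			continue // PromQL warnings are logged, not propagated
		}
		kept = append(kept, w)
	}
	return kept
}

func main() {
	fmt.Println(filterStoreWarnings([]string{
		"PromQL warning: metric might not be a counter",
		"some_store_warning",
	})) // [some_store_warning]
}
```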


@ -235,6 +235,14 @@ func runSidecar(
iterCtx, iterCancel := context.WithTimeout(context.Background(), conf.prometheus.getConfigTimeout)
defer iterCancel()
if err := m.UpdateTimestamps(iterCtx); err != nil {
level.Warn(logger).Log(
"msg", "failed to fetch timestamps. Is Prometheus running? Retrying",
"err", err,
)
return err
}
if err := m.UpdateLabels(iterCtx); err != nil {
level.Warn(logger).Log(
"msg", "failed to fetch initial external labels. Is Prometheus running? Retrying",
@ -253,7 +261,7 @@ func runSidecar(
return errors.Wrap(err, "initial external labels query")
}
if len(m.Labels()) == 0 {
if m.Labels().Len() == 0 {
return errors.New("no external labels configured on Prometheus server, uniquely identifying external labels must be configured; see https://thanos.io/tip/thanos/storage.md#external-labels for details.")
}
promUp.Set(1)
@ -266,16 +274,21 @@ func runSidecar(
return runutil.Repeat(conf.prometheus.getConfigInterval, ctx.Done(), func() error {
iterCtx, iterCancel := context.WithTimeout(context.Background(), conf.prometheus.getConfigTimeout)
defer iterCancel()
if err := m.UpdateLabels(iterCtx); err != nil {
level.Warn(logger).Log("msg", "heartbeat failed", "err", err)
if err := m.UpdateTimestamps(iterCtx); err != nil {
level.Warn(logger).Log("msg", "updating timestamps failed", "err", err)
promUp.Set(0)
statusProber.NotReady(err)
} else {
promUp.Set(1)
statusProber.Ready()
return nil
}
if err := m.UpdateLabels(iterCtx); err != nil {
level.Warn(logger).Log("msg", "updating labels failed", "err", err)
promUp.Set(0)
statusProber.NotReady(err)
return nil
}
promUp.Set(1)
statusProber.Ready()
return nil
})
}, func(error) {
@ -303,7 +316,7 @@ func runSidecar(
}
tlsCfg, err := tls.NewServerConfig(log.With(logger, "protocol", "gRPC"),
conf.grpc.tlsSrvCert, conf.grpc.tlsSrvKey, conf.grpc.tlsSrvClientCA)
conf.grpc.tlsSrvCert, conf.grpc.tlsSrvKey, conf.grpc.tlsSrvClientCA, conf.grpc.tlsMinVersion)
if err != nil {
return errors.Wrap(err, "setup gRPC server")
}
@ -317,7 +330,7 @@ func runSidecar(
}),
info.WithStoreInfoFunc(func() (*infopb.StoreInfo, error) {
if httpProbe.IsReady() {
mint, maxt := promStore.Timestamps()
mint, maxt := m.Timestamps()
return &infopb.StoreInfo{
MinTime: mint,
MaxTime: maxt,
@ -367,7 +380,7 @@ func runSidecar(
if uploads {
// The background shipper continuously scans the data directory and uploads
// new blocks to Google Cloud Storage or an S3-compatible storage service.
bkt, err := client.NewBucket(logger, confContentYaml, component.Sidecar.String())
bkt, err := client.NewBucket(logger, confContentYaml, component.Sidecar.String(), nil)
if err != nil {
return err
}
@ -393,7 +406,7 @@ func runSidecar(
defer cancel()
if err := runutil.Retry(2*time.Second, extLabelsCtx.Done(), func() error {
if len(m.Labels()) == 0 {
if m.Labels().Len() == 0 {
return errors.New("not uploading as no external labels are configured yet - is Prometheus healthy/reachable?")
}
return nil
@ -401,21 +414,24 @@ func runSidecar(
return errors.Wrapf(err, "aborting as no external labels found after waiting %s", promReadyTimeout)
}
uploadCompactedFunc := func() bool { return conf.shipper.uploadCompacted }
s := shipper.New(logger, reg, conf.tsdb.path, bkt, m.Labels, metadata.SidecarSource,
uploadCompactedFunc, conf.shipper.allowOutOfOrderUpload, metadata.HashFunc(conf.shipper.hashFunc), conf.shipper.metaFileName)
s := shipper.New(
bkt,
conf.tsdb.path,
shipper.WithLogger(logger),
shipper.WithRegisterer(reg),
shipper.WithSource(metadata.SidecarSource),
shipper.WithHashFunc(metadata.HashFunc(conf.shipper.hashFunc)),
shipper.WithMetaFileName(conf.shipper.metaFileName),
shipper.WithLabels(m.Labels),
shipper.WithUploadCompacted(conf.shipper.uploadCompacted),
shipper.WithAllowOutOfOrderUploads(conf.shipper.allowOutOfOrderUpload),
shipper.WithSkipCorruptedBlocks(conf.shipper.skipCorruptedBlocks),
)
return runutil.Repeat(30*time.Second, ctx.Done(), func() error {
if uploaded, err := s.Sync(ctx); err != nil {
level.Warn(logger).Log("err", err, "uploaded", uploaded)
}
minTime, _, err := s.Timestamps()
if err != nil {
level.Warn(logger).Log("msg", "reading timestamps failed", "err", err)
return nil
}
m.UpdateTimestamps(minTime, math.MaxInt64)
return nil
})
}, func(error) {
@ -490,16 +506,19 @@ func (s *promMetadata) UpdateLabels(ctx context.Context) error {
return nil
}
func (s *promMetadata) UpdateTimestamps(mint, maxt int64) {
func (s *promMetadata) UpdateTimestamps(ctx context.Context) error {
s.mtx.Lock()
defer s.mtx.Unlock()
if mint < s.limitMinTime.PrometheusTimestamp() {
mint = s.limitMinTime.PrometheusTimestamp()
mint, err := s.client.LowestTimestamp(ctx, s.promURL)
if err != nil {
return err
}
s.mint = mint
s.maxt = maxt
s.mint = max(s.limitMinTime.PrometheusTimestamp(), mint)
s.maxt = math.MaxInt64
return nil
}
func (s *promMetadata) Labels() labels.Labels {


@ -21,6 +21,7 @@ import (
"github.com/prometheus/client_golang/prometheus"
commonmodel "github.com/prometheus/common/model"
"github.com/prometheus/common/route"
"gopkg.in/yaml.v2"
"github.com/thanos-io/objstore"
"github.com/thanos-io/objstore/client"
@ -32,6 +33,7 @@ import (
"github.com/thanos-io/thanos/pkg/block/metadata"
"github.com/thanos-io/thanos/pkg/component"
hidden "github.com/thanos-io/thanos/pkg/extflag"
"github.com/thanos-io/thanos/pkg/exthttp"
"github.com/thanos-io/thanos/pkg/extkingpin"
"github.com/thanos-io/thanos/pkg/extprom"
extpromhttp "github.com/thanos-io/thanos/pkg/extprom/http"
@ -65,42 +67,46 @@ const (
)
type storeConfig struct {
indexCacheConfigs extflag.PathOrContent
objStoreConfig extflag.PathOrContent
dataDir string
cacheIndexHeader bool
grpcConfig grpcConfig
httpConfig httpConfig
indexCacheSizeBytes units.Base2Bytes
chunkPoolSize units.Base2Bytes
estimatedMaxSeriesSize uint64
estimatedMaxChunkSize uint64
seriesBatchSize int
storeRateLimits store.SeriesSelectLimits
maxDownloadedBytes units.Base2Bytes
maxConcurrency int
component component.StoreAPI
debugLogging bool
syncInterval time.Duration
blockListStrategy string
blockSyncConcurrency int
blockMetaFetchConcurrency int
filterConf *store.FilterConfig
selectorRelabelConf extflag.PathOrContent
advertiseCompatibilityLabel bool
consistencyDelay commonmodel.Duration
ignoreDeletionMarksDelay commonmodel.Duration
disableWeb bool
webConfig webConfig
label string
postingOffsetsInMemSampling int
cachingBucketConfig extflag.PathOrContent
reqLogConfig *extflag.PathOrContent
lazyIndexReaderEnabled bool
lazyIndexReaderIdleTimeout time.Duration
lazyExpandedPostingsEnabled bool
indexCacheConfigs extflag.PathOrContent
objStoreConfig extflag.PathOrContent
dataDir string
cacheIndexHeader bool
grpcConfig grpcConfig
httpConfig httpConfig
indexCacheSizeBytes units.Base2Bytes
chunkPoolSize units.Base2Bytes
estimatedMaxSeriesSize uint64
estimatedMaxChunkSize uint64
seriesBatchSize int
storeRateLimits store.SeriesSelectLimits
maxDownloadedBytes units.Base2Bytes
maxConcurrency int
component component.StoreAPI
debugLogging bool
syncInterval time.Duration
blockListStrategy string
blockSyncConcurrency int
blockMetaFetchConcurrency int
filterConf *store.FilterConfig
selectorRelabelConf extflag.PathOrContent
advertiseCompatibilityLabel bool
consistencyDelay commonmodel.Duration
ignoreDeletionMarksDelay commonmodel.Duration
disableWeb bool
webConfig webConfig
label string
postingOffsetsInMemSampling int
cachingBucketConfig extflag.PathOrContent
reqLogConfig *extflag.PathOrContent
lazyIndexReaderEnabled bool
lazyIndexReaderIdleTimeout time.Duration
lazyExpandedPostingsEnabled bool
postingGroupMaxKeySeriesRatio float64
indexHeaderLazyDownloadStrategy string
matcherCacheSize int
disableAdminOperations bool
}
func (sc *storeConfig) registerFlag(cmd extkingpin.FlagClause) {
@ -202,6 +208,9 @@ func (sc *storeConfig) registerFlag(cmd extkingpin.FlagClause) {
cmd.Flag("store.enable-lazy-expanded-postings", "If true, Store Gateway will estimate postings size and try to lazily expand postings if it downloads less data than expanding all postings.").
Default("false").BoolVar(&sc.lazyExpandedPostingsEnabled)
cmd.Flag("store.posting-group-max-key-series-ratio", "Mark posting group as lazy if it fetches more keys than R * max series the query should fetch. With R set to 100, a posting group which fetches 100K keys will be marked as lazy if the current query only fetches 1000 series. thanos_bucket_store_lazy_expanded_posting_groups_total shows lazy expanded postings groups with reasons and you can tune this config accordingly. This config is only valid if lazy expanded posting is enabled. 0 disables the limit.").
Default("100").Float64Var(&sc.postingGroupMaxKeySeriesRatio)
cmd.Flag("store.index-header-lazy-download-strategy", "Strategy of how to download index headers lazily. Supported values: eager, lazy. If eager, always download index header during initial load. If lazy, download index header during query time.").
Default(string(indexheader.EagerDownloadStrategy)).
EnumVar(&sc.indexHeaderLazyDownloadStrategy, string(indexheader.EagerDownloadStrategy), string(indexheader.LazyDownloadStrategy))
@ -219,6 +228,10 @@ func (sc *storeConfig) registerFlag(cmd extkingpin.FlagClause) {
cmd.Flag("bucket-web-label", "External block label to use as group title in the bucket web UI").StringVar(&sc.label)
cmd.Flag("matcher-cache-size", "Max number of cached matchers items. Using 0 disables caching.").Default("0").IntVar(&sc.matcherCacheSize)
cmd.Flag("disable-admin-operations", "Disable UI/API admin operations like marking blocks for deletion and no compaction.").Default("false").BoolVar(&sc.disableAdminOperations)
sc.reqLogConfig = extkingpin.RegisterRequestLoggingFlags(cmd)
}
@ -308,8 +321,11 @@ func runStore(
if err != nil {
return err
}
bkt, err := client.NewBucket(logger, confContentYaml, conf.component.String())
customBktConfig := exthttp.DefaultCustomBucketConfig()
if err := yaml.Unmarshal(confContentYaml, &customBktConfig); err != nil {
return errors.Wrap(err, "parsing config YAML file")
}
bkt, err := client.NewBucket(logger, confContentYaml, conf.component.String(), exthttp.CreateHedgedTransportWithConfig(customBktConfig))
if err != nil {
return err
}
@ -359,6 +375,14 @@ func runStore(
return errors.Wrap(err, "create index cache")
}
var matchersCache = storecache.NoopMatchersCache
if conf.matcherCacheSize > 0 {
matchersCache, err = storecache.NewMatchersCache(storecache.WithSize(conf.matcherCacheSize), storecache.WithPromRegistry(reg))
if err != nil {
return errors.Wrap(err, "failed to create matchers cache")
}
}
var blockLister block.Lister
switch syncStrategy(conf.blockListStrategy) {
case concurrentDiscovery:
@ -404,6 +428,7 @@ func runStore(
}),
store.WithRegistry(reg),
store.WithIndexCache(indexCache),
store.WithMatchersCache(matchersCache),
store.WithQueryGate(queriesGate),
store.WithChunkPool(chunkPool),
store.WithFilterConfig(conf.filterConf),
@ -424,6 +449,8 @@ func runStore(
return conf.estimatedMaxChunkSize
}),
store.WithLazyExpandedPostings(conf.lazyExpandedPostingsEnabled),
store.WithPostingGroupMaxKeySeriesRatio(conf.postingGroupMaxKeySeriesRatio),
store.WithSeriesMatchRatio(0.5), // TODO: expose series match ratio as config.
store.WithIndexHeaderLazyDownloadStrategy(
indexheader.IndexHeaderLazyDownloadStrategy(conf.indexHeaderLazyDownloadStrategy).StrategyToDownloadFunc(),
),
@ -516,7 +543,7 @@ func runStore(
// Start query (proxy) gRPC StoreAPI.
{
tlsCfg, err := tls.NewServerConfig(log.With(logger, "protocol", "gRPC"), conf.grpcConfig.tlsSrvCert, conf.grpcConfig.tlsSrvKey, conf.grpcConfig.tlsSrvClientCA)
tlsCfg, err := tls.NewServerConfig(log.With(logger, "protocol", "gRPC"), conf.grpcConfig.tlsSrvCert, conf.grpcConfig.tlsSrvKey, conf.grpcConfig.tlsSrvClientCA, conf.grpcConfig.tlsMinVersion)
if err != nil {
return errors.Wrap(err, "setup gRPC server")
}
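Several call sites in this diff (`shipper.New` with `shipper.With*`, the `store.With*` calls above) use Go's functional-options pattern: defaults live in the constructor and each `With*` closure overrides one field. A minimal self-contained sketch; the option names below are only loosely modeled on the shipper API and are not the real signatures:

```go
package main

import "fmt"

// config collects settings a constructor would otherwise take as a long,
// ever-growing positional parameter list.
type config struct {
	dir             string
	metaFileName    string
	allowOutOfOrder bool
}

// Option mutates one field of the config.
type Option func(*config)

func WithMetaFileName(name string) Option {
	return func(c *config) { c.metaFileName = name }
}

func WithAllowOutOfOrderUploads(allow bool) Option {
	return func(c *config) { c.allowOutOfOrder = allow }
}

// New applies defaults first, then each option in order.
func New(dir string, opts ...Option) *config {
	c := &config{dir: dir, metaFileName: "thanos.shipper.json"} // defaults
	for _, o := range opts {
		o(c)
	}
	return c
}

func main() {
	s := New("/data", WithAllowOutOfOrderUploads(true))
	fmt.Println(s.dir, s.metaFileName, s.allowOutOfOrder) // /data thanos.shipper.json true
}
```

The payoff is visible in the diff itself: adding `WithSkipCorruptedBlocks` required no change to existing callers, whereas the old positional `shipper.New` forced every call site to be rewritten.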


@ -0,0 +1,18 @@
groups:
- name: test-alert-group
partial_response_strategy: "warn"
interval: 2m
rules:
- alert: TestAlert
expr: 1
labels:
key: value
annotations:
key: value
- name: test-rule-group
partial_response_strategy: "warn"
interval: 2m
rules:
- record: test_metric
expr: 1


@ -55,7 +55,6 @@ func checkRulesFiles(logger log.Logger, patterns *[]string) error {
if err != nil || matches == nil {
err = errors.New("matching file not found")
level.Error(logger).Log("result", "FAILED", "error", err)
level.Info(logger).Log()
failed.Add(err)
continue
}
@ -64,8 +63,7 @@ func checkRulesFiles(logger log.Logger, patterns *[]string) error {
f, er := os.Open(fn)
if er != nil {
level.Error(logger).Log("result", "FAILED", "error", er)
level.Info(logger).Log()
failed.Add(err)
failed.Add(er)
continue
}
defer func() { _ = f.Close() }()
@ -77,7 +75,6 @@ func checkRulesFiles(logger log.Logger, patterns *[]string) error {
level.Error(logger).Log("error", e.Error())
failed.Add(e)
}
level.Info(logger).Log()
continue
}
level.Info(logger).Log("result", "SUCCESS", "rules found", n)


@ -23,7 +23,8 @@ import (
"github.com/go-kit/log"
"github.com/go-kit/log/level"
"github.com/oklog/run"
"github.com/oklog/ulid"
"github.com/oklog/ulid/v2"
"github.com/olekukonko/tablewriter"
"github.com/opentracing/opentracing-go"
"github.com/pkg/errors"
@ -54,6 +55,7 @@ import (
"github.com/thanos-io/thanos/pkg/extprom"
extpromhttp "github.com/thanos-io/thanos/pkg/extprom/http"
"github.com/thanos-io/thanos/pkg/logging"
"github.com/thanos-io/thanos/pkg/logutil"
"github.com/thanos-io/thanos/pkg/model"
"github.com/thanos-io/thanos/pkg/prober"
"github.com/thanos-io/thanos/pkg/promclient"
@ -109,8 +111,11 @@ type bucketVerifyConfig struct {
}
type bucketLsConfig struct {
output string
excludeDelete bool
output string
excludeDelete bool
selectorRelabelConf extflag.PathOrContent
filterConf *store.FilterConfig
timeout time.Duration
}
type bucketWebConfig struct {
@ -180,10 +185,18 @@ func (tbc *bucketVerifyConfig) registerBucketVerifyFlag(cmd extkingpin.FlagClaus
}
func (tbc *bucketLsConfig) registerBucketLsFlag(cmd extkingpin.FlagClause) *bucketLsConfig {
tbc.selectorRelabelConf = *extkingpin.RegisterSelectorRelabelFlags(cmd)
tbc.filterConf = &store.FilterConfig{}
cmd.Flag("output", "Optional format in which to print each block's information. Options are 'json', 'wide' or a custom template.").
Short('o').Default("").StringVar(&tbc.output)
cmd.Flag("exclude-delete", "Exclude blocks marked for deletion.").
Default("false").BoolVar(&tbc.excludeDelete)
cmd.Flag("min-time", "Start of time range limit to list blocks. Thanos Tools will list blocks, which were created later than this value. Option can be a constant time in RFC3339 format or time duration relative to current time, such as -1d or 2h45m. Valid duration units are ms, s, m, h, d, w, y.").
Default("0000-01-01T00:00:00Z").SetValue(&tbc.filterConf.MinTime)
cmd.Flag("max-time", "End of time range limit to list. Thanos Tools will list only blocks, which were created earlier than this value. Option can be a constant time in RFC3339 format or time duration relative to current time, such as -1d or 2h45m. Valid duration units are ms, s, m, h, d, w, y.").
Default("9999-12-31T23:59:59Z").SetValue(&tbc.filterConf.MaxTime)
cmd.Flag("timeout", "Timeout to download metadata from remote storage").Default("5m").DurationVar(&tbc.timeout)
return tbc
}
@ -327,7 +340,7 @@ func registerBucketVerify(app extkingpin.AppClause, objStoreConfig *extflag.Path
return err
}
bkt, err := client.NewBucket(logger, confContentYaml, component.Bucket.String())
bkt, err := client.NewBucket(logger, confContentYaml, component.Bucket.String(), nil)
if err != nil {
return err
}
@ -346,7 +359,7 @@ func registerBucketVerify(app extkingpin.AppClause, objStoreConfig *extflag.Path
}
} else {
// nil Prometheus registerer: don't create conflicting metrics.
backupBkt, err = client.NewBucket(logger, backupconfContentYaml, component.Bucket.String())
backupBkt, err = client.NewBucket(logger, backupconfContentYaml, component.Bucket.String(), nil)
if err != nil {
return err
}
@ -411,18 +424,36 @@ func registerBucketLs(app extkingpin.AppClause, objStoreConfig *extflag.PathOrCo
return err
}
bkt, err := client.NewBucket(logger, confContentYaml, component.Bucket.String())
bkt, err := client.NewBucket(logger, confContentYaml, component.Bucket.String(), nil)
if err != nil {
return err
}
insBkt := objstoretracing.WrapWithTraces(objstore.WrapWithMetrics(bkt, extprom.WrapRegistererWithPrefix("thanos_", reg), bkt.Name()))
var filters []block.MetadataFilter
if tbc.timeout < time.Minute {
level.Warn(logger).Log("msg", "Timeout less than 1m could lead to frequent failures")
}
relabelContentYaml, err := tbc.selectorRelabelConf.Content()
if err != nil {
return errors.Wrap(err, "get content of relabel configuration")
}
relabelConfig, err := block.ParseRelabelConfig(relabelContentYaml, block.SelectorSupportedRelabelActions)
if err != nil {
return err
}
filters := []block.MetadataFilter{
block.NewLabelShardedMetaFilter(relabelConfig),
block.NewTimePartitionMetaFilter(tbc.filterConf.MinTime, tbc.filterConf.MaxTime),
}
if tbc.excludeDelete {
ignoreDeletionMarkFilter := block.NewIgnoreDeletionMarkFilter(logger, insBkt, 0, block.FetcherConcurrency)
filters = append(filters, ignoreDeletionMarkFilter)
}
baseBlockIDsFetcher := block.NewConcurrentLister(logger, insBkt)
fetcher, err := block.NewMetaFetcher(logger, block.FetcherConcurrency, insBkt, baseBlockIDsFetcher, "", extprom.WrapRegistererWithPrefix(extpromPrefix, reg), filters)
if err != nil {
@ -434,7 +465,7 @@ func registerBucketLs(app extkingpin.AppClause, objStoreConfig *extflag.PathOrCo
defer runutil.CloseWithLogOnErr(logger, insBkt, "bucket client")
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
ctx, cancel := context.WithTimeout(context.Background(), tbc.timeout)
defer cancel()
var (
@ -504,7 +535,7 @@ func registerBucketInspect(app extkingpin.AppClause, objStoreConfig *extflag.Pat
tbc := &bucketInspectConfig{}
tbc.registerBucketInspectFlag(cmd)
output := cmd.Flag("output", "Output format for result. Currently supports table, cvs, tsv.").Default("table").Enum(outputTypes...)
output := cmd.Flag("output", "Output format for result. Currently supports table, csv, tsv.").Default("table").Enum(outputTypes...)
cmd.Setup(func(g *run.Group, logger log.Logger, reg *prometheus.Registry, _ opentracing.Tracer, _ <-chan struct{}, _ bool) error {
@ -519,7 +550,7 @@ func registerBucketInspect(app extkingpin.AppClause, objStoreConfig *extflag.Pat
return err
}
bkt, err := client.NewBucket(logger, confContentYaml, component.Bucket.String())
bkt, err := client.NewBucket(logger, confContentYaml, component.Bucket.String(), nil)
if err != nil {
return err
}
@ -629,7 +660,7 @@ func registerBucketWeb(app extkingpin.AppClause, objStoreConfig *extflag.PathOrC
return err
}
bkt, err := client.NewBucket(logger, confContentYaml, component.Bucket.String())
bkt, err := client.NewBucket(logger, confContentYaml, component.Bucket.String(), nil)
if err != nil {
return errors.Wrap(err, "bucket client")
}
@ -826,7 +857,7 @@ func registerBucketCleanup(app extkingpin.AppClause, objStoreConfig *extflag.Pat
return err
}
bkt, err := client.NewBucket(logger, confContentYaml, component.Cleanup.String())
bkt, err := client.NewBucket(logger, confContentYaml, component.Cleanup.String(), nil)
if err != nil {
return err
}
@ -870,6 +901,7 @@ func registerBucketCleanup(app extkingpin.AppClause, objStoreConfig *extflag.Pat
ignoreDeletionMarkFilter,
stubCounter,
stubCounter,
0,
)
if err != nil {
return errors.Wrap(err, "create syncer")
@ -1013,12 +1045,12 @@ func getKeysAlphabetically(labels map[string]string) []string {
// matchesSelector checks if blockMeta contains every label from
// the selector with the correct value.
func matchesSelector(blockMeta *metadata.Meta, selectorLabels labels.Labels) bool {
for _, l := range selectorLabels {
if v, ok := blockMeta.Thanos.Labels[l.Name]; !ok || v != l.Value {
return false
}
}
return true
matches := true
selectorLabels.Range(func(l labels.Label) {
val, ok := blockMeta.Thanos.Labels[l.Name]
matches = matches && ok && val == l.Value
})
return matches
}
// getIndex calculates the index of s in strs.
@ -1083,7 +1115,7 @@ func registerBucketMarkBlock(app extkingpin.AppClause, objStoreConfig *extflag.P
return err
}
bkt, err := client.NewBucket(logger, confContentYaml, component.Mark.String())
bkt, err := client.NewBucket(logger, confContentYaml, component.Mark.String(), nil)
if err != nil {
return err
}
@ -1163,7 +1195,7 @@ func registerBucketRewrite(app extkingpin.AppClause, objStoreConfig *extflag.Pat
return err
}
bkt, err := client.NewBucket(logger, confContentYaml, component.Rewrite.String())
bkt, err := client.NewBucket(logger, confContentYaml, component.Rewrite.String(), nil)
if err != nil {
return err
}
@ -1232,7 +1264,7 @@ func registerBucketRewrite(app extkingpin.AppClause, objStoreConfig *extflag.Pat
if err != nil {
return errors.Wrapf(err, "read meta of %v", id)
}
b, err := tsdb.OpenBlock(logger, filepath.Join(tbc.tmpDir, id.String()), chunkPool)
b, err := tsdb.OpenBlock(logutil.GoKitLogToSlog(logger), filepath.Join(tbc.tmpDir, id.String()), chunkPool, nil)
if err != nil {
return errors.Wrapf(err, "open block %v", id)
}
@ -1371,7 +1403,7 @@ func registerBucketRetention(app extkingpin.AppClause, objStoreConfig *extflag.P
return err
}
bkt, err := client.NewBucket(logger, confContentYaml, component.Retention.String())
bkt, err := client.NewBucket(logger, confContentYaml, component.Retention.String(), nil)
if err != nil {
return err
}
@ -1413,6 +1445,7 @@ func registerBucketRetention(app extkingpin.AppClause, objStoreConfig *extflag.P
ignoreDeletionMarkFilter,
stubCounter,
stubCounter,
0,
)
if err != nil {
return errors.Wrap(err, "create syncer")
@ -1460,7 +1493,7 @@ func registerBucketUploadBlocks(app extkingpin.AppClause, objStoreConfig *extfla
return errors.Wrap(err, "unable to parse objstore config")
}
bkt, err := client.NewBucket(logger, confContentYaml, component.Upload.String())
bkt, err := client.NewBucket(logger, confContentYaml, component.Upload.String(), nil)
if err != nil {
return errors.Wrap(err, "unable to create bucket")
}
@ -1468,8 +1501,15 @@ func registerBucketUploadBlocks(app extkingpin.AppClause, objStoreConfig *extfla
bkt = objstoretracing.WrapWithTraces(objstore.WrapWithMetrics(bkt, extprom.WrapRegistererWithPrefix("thanos_", reg), bkt.Name()))
s := shipper.New(logger, reg, tbc.path, bkt, func() labels.Labels { return lset }, metadata.BucketUploadSource,
nil, false, metadata.HashFunc(""), shipper.DefaultMetaFilename)
s := shipper.New(
bkt,
tbc.path,
shipper.WithLogger(logger),
shipper.WithRegisterer(reg),
shipper.WithSource(metadata.BucketUploadSource),
shipper.WithMetaFileName(shipper.DefaultMetaFilename),
shipper.WithLabels(func() labels.Labels { return lset }),
)
ctx, cancel := context.WithCancel(context.Background())
g.Add(func() error {


@ -4,6 +4,8 @@
package main
import (
"os"
"path"
"testing"
"github.com/go-kit/log"
@ -44,4 +46,14 @@ func Test_CheckRules_Glob(t *testing.T) {
// invalid path
files = &[]string{"./testdata/rules-files/*.yamlaaa"}
testutil.NotOk(t, checkRulesFiles(logger, files), "expected err for file %s", files)
// Unreadable path
// Move the initial file to a temp dir and make it unreadable there, in case the process cannot chmod the file in the current dir.
filename := "./testdata/rules-files/unreadable_valid.yaml"
bytesRead, err := os.ReadFile(filename)
testutil.Ok(t, err)
filename = path.Join(t.TempDir(), "file.yaml")
testutil.Ok(t, os.WriteFile(filename, bytesRead, 0000))
files = &[]string{filename}
testutil.NotOk(t, checkRulesFiles(logger, files), "expected err for file %s", files)
}


@ -0,0 +1,210 @@
---
title: Life of a Sample in Thanos, and How to Configure It (Ingestion, Part I)
date: "2024-09-16"
author: Thibault Mangé (https://github.com/thibaultmg)
---
## Life of a Sample in Thanos, and How to Configure It (Ingestion, Part I)
### Introduction
Thanos is a sophisticated distributed system with a broad range of capabilities, and with that comes a certain level of configuration complexity. In this series of articles, we will take a deep dive into the lifecycle of a sample within Thanos, tracking its journey from initial ingestion to final retrieval. Our focus will be to explain Thanos's critical internal mechanisms and highlight the essential configurations for each component, guiding you toward achieving your desired operational results. We will be covering the following Thanos components:
* **Receive**: Ingests samples from remote Prometheus instances and uploads blocks to object storage.
* **Sidecar**: Attaches to Prometheus pods as a sidecar container, ingests its data and uploads blocks to object storage.
* **Compactor**: Merges and deduplicates blocks in object storage.
* **Store**: Exposes blocks in object storage for querying.
* **Query**: Retrieves data from stores and processes queries.
* **Query Frontend**: Distributes incoming queries to Querier instances.
The objective of this series of articles is to make Thanos more accessible to new users, helping alleviate any initial apprehensions. We will also assume that the working environment is Kubernetes. Given the extensive ground to cover, our goal is to remain concise throughout this exploration.
Before diving deeper, please check the [annexes](#annexes) to clarify some essential terminology. If you are already familiar with these concepts, feel free to skip ahead.
### The Sample Origin: Do You Have Close Integration Capabilities?
The sample usually originates from a Prometheus instance that is scraping targets in a cluster. There are two possible scenarios:
* The **Prometheus instances are under your control and are reachable from your Thanos deployment**. In this case, you can use the Thanos sidecar, which you attach to the pod running the Prometheus server. The sidecar reads raw samples directly from the Prometheus server using the [remote read API](https://prometheus.io/docs/prometheus/latest/querying/remote_read_api/). From there on, it behaves much like a **Receiver**: it exposes its local data via the Store API, just without the routing and ingestion parts. Thus, we will not delve further into this use case.
* The **Prometheus servers are running in clusters that you do not control**. In this case, you cannot attach a sidecar to the Prometheus server and you cannot fetch its data. The samples will travel to your Thanos system using the remote write protocol. This is the scenario we will focus on.
Also, bear in mind that if adding Thanos to collect your clusters' metrics removes the need for a full-fledged local Prometheus (with querying and alerting), you can save some resources by using the [Prometheus Agent mode](https://prometheus.io/docs/prometheus/latest/feature_flags/#prometheus-agent). In this configuration, Prometheus only scrapes the targets and forwards the data to the Thanos system.
The following diagram illustrates the two scenarios:
<img src="img/life-of-a-sample/close-integration.png" alt="Close integration vs external client" style="max-width: 600px; display: block;margin: 0 auto;"/>
Comparing the two deployment modes, the Sidecar Mode is generally preferable due to its simpler configuration and fewer moving parts. However, if this isn't possible, opt for the **Receive Mode**. Bear in mind, this mode requires careful configuration to ensure high availability, scalability, and durability. It adds another layer of indirection and comes with the overhead of operating the additional component.
### Sending Samples to Thanos
#### The Remote Write Protocol
Let's start with our first Thanos component, the **Receive** or **Receiver**, the entry point to the system. It was introduced with this [proposal](https://thanos.io/tip/proposals-done/201812-thanos-remote-receive.md/). This component facilitates the ingestion of metrics from multiple clients, eliminating the need for close integration with the clients' Prometheus deployments.
Thanos Receive exposes a remote-write endpoint (see [Prometheus remote-write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write)) that Prometheus servers can use to transmit metrics. The only prerequisite on the client side is to configure the remote write endpoint on each Prometheus server, a feature natively supported by Prometheus.
On the Receive component, the remote write endpoint is configured with the `--remote-write.address` flag. You can also configure TLS options using other `--remote-write.*` flags. You can see the full list of the Receiver flags [here](https://thanos.io/tip/components/receive.md/#flags).
The remote-write protocol is based on HTTP POST requests. The payload consists of a protobuf message containing a list of time series, each carrying labels and samples. Generally, a payload contains at most one sample per time series and spans numerous time series. Metrics are typically scraped every 15 seconds, with a maximum remote-write delay of 5 seconds, to minimize the latency from scraping to query availability on the receiver.
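To make the payload shape concrete, here is a minimal Go mirror of the remote-write message. The real types are the generated `prompb.WriteRequest` ones in Prometheus; the structs below are a simplified illustration, not the actual API:

```go
package main

import "fmt"

// Label, Sample, TimeSeries and WriteRequest loosely mirror the shape of the
// remote-write protobuf payload; this sketch only illustrates the structure.
type Label struct {
	Name, Value string
}

type Sample struct {
	Value     float64
	Timestamp int64 // milliseconds since epoch
}

type TimeSeries struct {
	Labels  []Label
	Samples []Sample // generally a single sample per series per request
}

type WriteRequest struct {
	Timeseries []TimeSeries
}

func main() {
	req := WriteRequest{Timeseries: []TimeSeries{{
		Labels:  []Label{{"__name__", "http_requests_total"}, {"job", "api"}},
		Samples: []Sample{{Value: 42, Timestamp: 1726000000000}},
	}}}
	fmt.Println(len(req.Timeseries)) // 1
}
```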
#### Tuning the Remote Write Protocol
The Prometheus remote write configuration offers various parameters to tailor the connection specifications, parallelism, and payload properties (compression, batch size, etc.). While these may seem like implementation details for Prometheus, understanding them is essential for optimizing ingestion, as they form a sensitive part of the system.
From an implementation standpoint, the key idea is to read directly from the TSDB WAL (Write-Ahead Log), a simple mechanism commonly used by databases to ensure data durability. If you wish to delve deeper into the TSDB WAL, check out this [great article](https://ganeshvernekar.com/blog/prometheus-tsdb-wal-and-checkpoint). Once samples are extracted from the WAL, they are aggregated into parallel queues (shards) as remote-write payloads. When a queue reaches its limit or a maximum timeout expires, the remote-write client stops reading the WAL and dispatches the data, and the cycle continues. The parallelism is defined by the number of shards, which is dynamically optimized. More insights on Prometheus's remote write can be found in the [documentation](https://prometheus.io/docs/practices/remote_write/). You can also find troubleshooting tips on [Grafana's blog](https://grafana.com/blog/2021/04/12/how-to-troubleshoot-remote-write-issues-in-prometheus/#troubleshooting-and-metrics).
The following diagram illustrates the impacts of each parameter on the remote write protocol:
<img src="img/life-of-a-sample/remote-write.png" alt="Remote write" width="700"/>
Key Points to Consider:
* **The send deadline setting**: `batch_send_deadline` should be set to around 5s to minimize latency. This timeframe strikes a balance between minimizing latency and avoiding excessive request frequency that could burden the Receiver. While a 5-second delay might seem substantial in critical alert scenarios, it is generally acceptable considering the typical resolution time for most issues.
* **The backoff settings**: The `min_backoff` should ideally be no less than 250 milliseconds, and the `max_backoff` should be at least 10 seconds. These settings help prevent Receiver overload, particularly in situations like system restarts, by controlling the rate of data sending.
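For illustration, these parameters live in the `queue_config` section of a Prometheus `remote_write` entry. The endpoint URL below is a placeholder, and the values simply follow the recommendations above:

```yaml
remote_write:
  - url: https://thanos-receive.example.com/api/v1/receive  # placeholder endpoint
    queue_config:
      batch_send_deadline: 5s  # dispatch a shard's pending batch at least this often
      min_backoff: 250ms       # first retry delay after a failed send
      max_backoff: 10s         # cap on the exponential retry backoff
```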
#### Protecting the Receiver from Overuse
In scenarios where you have limited control over client configurations, it becomes essential to shield the Receive component from potential misuse or overload. The Receive component includes several configuration options designed for this purpose, comprehensively detailed in the [official documentation](https://thanos.io/tip/components/receive.md/#limits--gates-experimental). Below is a diagram illustrating the impact of these configuration settings:
<img src="img/life-of-a-sample/receive-limits.png" alt="Receive limits" width="900"/>
When implementing a topology with separate router and ingestor roles (as we will see later), these limits should be enforced at the router level.
Key points to consider:
* **Series and samples limits**: Typically, with a standard target scrape interval of 15 seconds and a maximum remote write delay of 5 seconds, the `series_limit` and `samples_limit` tend to be functionally equivalent. However, in scenarios where the remote writer is recovering from downtime, the `samples_limit` may become more restrictive, as the payload might include multiple samples for the same series.
* **Handling request limits**: If a request exceeds these limits, the system responds with a 413 (Payload Too Large) HTTP error. Currently, Prometheus does not split and retry requests in response to this error, so the data is lost.
* **Active series limiting**: The limitation on active series persists as long as the count remains above the set threshold in the Receivers' TSDBs. Active series represent the number of time series currently stored in the TSDB's (Time Series Database) head block. The head block is the in-memory portion of the TSDB where incoming samples are temporarily stored before being compacted into persistent on-disk blocks. The head block is typically compacted every two hours. This is when stale series are removed, and the active series count decreases. Requests reaching this limit are rejected with a 429 (Too Many Requests) HTTP code, triggering retries.
Considering these aspects, it is important to carefully monitor and adjust these limits. While they serve as necessary safeguards, overly restrictive settings can inadvertently lead to data loss.
### Receiving Samples with High Availability and Durability
#### The Need for Multiple Receive Instances
Relying on a single instance of Thanos Receive is not sufficient for two main reasons:
* Scalability: As your metrics grow, so does the need to scale your infrastructure.
* Reliability: If a single Receive instance fails, it disrupts metric collection, affecting rule evaluation and alerting. Furthermore, during downtime, Prometheus servers will buffer data in their Write-Ahead Log (WAL). If the outage exceeds the WAL's retention duration (default is 2 hours), this can lead to data loss.
#### The Hashring Mechanism
To achieve high availability, it is necessary to deploy multiple Receive replicas. However, it is not just about having more instances; it is crucial to maintain consistency in sample ingestion. In other words, samples from a given time series should always be ingested by the same Receive instance. This keeps operations efficient; when it is not achieved, it imposes a higher load on downstream operations such as compacting or querying the data.
To that effect, you guessed it, the Receive component uses a hashring! With the hashring, every Receive participant knows and agrees on who must ingest which sample. When clients send data, they connect to any Receive instance, which then routes the data to the correct instances based on the hashring. This is why the Receive component is also known as the **IngestorRouter**.
<img src="img/life-of-a-sample/ingestor-router.png" alt="IngestorRouter" style="max-width: 600px; display: block;margin: 0 auto;"/>
Receive instances must share a consistent view of the hashring. This is typically provided as a configuration file (`--receive.hashrings-file` flag) that is watched and reloaded on change.
There are two possible hashrings:
* **hashmod**: This algorithm distributes time series by hashing their labels modulo the number of instances. It is effective at evenly distributing the load. The downside is that scaling operations on the hashring cause a high churn of time series across the nodes, requiring each node to flush its TSDB head and upload its recent blocks to the object storage. During this operation, which can last a few minutes, the receivers cannot ingest data, causing downtime. This is especially critical if you are running big Receive nodes: the more data they hold, the longer the downtime.
* **ketama**: A more recent addition, implementing a consistent hashing algorithm. During scaling operations, most time series remain attached to the same nodes, so no TSDB operation or data upload is needed before switching to the new configuration. As a result, the downtime is minimal: just the time for the nodes to agree on the new hashring. As a downside, it can be less efficient than hashmod at evenly distributing the load.
The hashring algorithm is configured with the `--receive.hashrings-algorithm` flag. You can use the [Thanos Receive Controller](https://github.com/observatorium/thanos-receive-controller) to automate the management of the hashring.
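As a rough sketch of the hashmod idea (not the actual Thanos implementation, which uses its own hashing of the label set), assigning a series to a node boils down to hashing its labels modulo the number of instances:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// labelsKey builds a deterministic string from a label set by sorting names,
// so the same series always produces the same hash input.
func labelsKey(labels map[string]string) string {
	names := make([]string, 0, len(labels))
	for name := range labels {
		names = append(names, name)
	}
	sort.Strings(names)
	key := ""
	for _, name := range names {
		key += name + "=" + labels[name] + ";"
	}
	return key
}

// hashmodAssign picks a node by hashing the series labels modulo the number
// of nodes. Same labels always map to the same node, which is the property
// the hashring provides.
func hashmodAssign(labels map[string]string, numNodes int) int {
	h := fnv.New32a()
	h.Write([]byte(labelsKey(labels)))
	return int(h.Sum32() % uint32(numNodes))
}

func main() {
	series := map[string]string{"__name__": "http_requests_total", "job": "api"}
	// The node index is stable across calls and across every router instance
	// that shares the same hashring configuration.
	fmt.Println(hashmodAssign(series, 3) == hashmodAssign(series, 3)) // true
}
```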
Key points to consider:
* The case for hashmod: If your load is stable for the foreseeable future, the `hashmod` algorithm is a good choice. It is more efficient in evenly distributing the load. Otherwise, `ketama` is recommended for its operational benefits.
* The case for small Receive nodes: If you have smaller Receive nodes, the downtime during scaling operations with the `hashmod` algorithm will be shorter, as there is less data to upload to the object storage. Also, when using the `ketama` algorithm, if a node fails, its requests are directly redistributed to the remaining nodes. This could overwhelm them if there are too few of them and result in downtime. With more nodes, the added load represents a smaller fraction of the total load.
* Protecting the nodes after recovery: During downtime, Receive replies with a 503 to clients, which is interpreted as a temporary failure, and remote writes are retried. Once back up, your Receive will have to catch up and ingest a lot of data. This is why we recommend using the `--receive.limits-config` flag to limit the amount of data that can be received, preventing the Receive from being overwhelmed during the catch-up.
#### Ensuring Samples Durability
For clients requiring high data durability, the `--receive.replication-factor` flag ensures data duplication across multiple receivers. When set to `n`, the Receiver only replies with a successful processing response to the client once it has duplicated the data to `n-1` other receivers. Additionally, an external replica label can be added to each Receive instance (`--label` flag) to mark replicated data. This setup increases data resilience but also expands the data footprint.
For even greater durability, replication can take into account the [availability zones](https://thanos.io/tip/components/receive.md/#az-aware-ketama-hashring-experimental) of the Receive instances. It will ensure that data is replicated to instances in different availability zones, reducing the risk of data loss in case of a zone failure. This is however only supported with the `ketama` algorithm.
Beyond the increased storage cost of replication, another downside is the increased load on the Receive instances, which must now forward a given request to multiple nodes according to the time series labels. Nodes receiving the first replica must then forward the series to the next Receive node until the replication factor is satisfied. This multiplies inter-node communication, especially with big hashrings.
#### Improving Scalability and Reliability
A new deployment topology was [proposed](https://thanos.io/tip/proposals-accepted/202012-receive-split.md/), separating the **router** and **ingestor** roles. The hashring configuration is read by the routers, which will direct each time series to the appropriate ingestor and its replicas. This role separation provides some important benefits:
* **Scalability**: The routers and ingestors have different constraints and can be scaled independently. The router requires a performant network and CPU to route the samples, while the ingestor needs significant memory and storage. The router is stateless, while the ingestor is stateful. This separation of concerns also enables the setup of more complex topologies, such as chaining routers and having multiple hashrings. For example, you can have different hashrings attached to the routers, grouping distinct tenants with different service levels supported by isolated groups of ingestors.
* **Reliability**: During hashring reconfigurations, especially with the hashmod algorithm, some nodes may become ready before others, leading to a partially operational hashring that results in many request failures because replicas cannot be forwarded. This triggers retries, increasing the load and causing instabilities. Relieving the ingestors from the routing responsibilities makes them more stable and less prone to overload. This is especially important as they are stateful components. Routers, on the other hand, are stateless and can be easily scaled up and down.
<img src="img/life-of-a-sample/router-and-ingestor.png" alt="IngestorRouter" style="max-width: 600px; display: block;margin: 0 auto;"/>
The Receive instance behaves in the following way:
* When both a hashring and `receive.local-endpoint` are set, it acts as a **RouterIngestor**. The latter flag enables the router to identify itself in the hashring as an ingestor and ingest the data when appropriate.
* When no hashring is set, it simply ingests the data and acts as an **Ingestor**.
* When only the hashring is set, it acts as a **Router** and forwards the data to the correct ingestor.
#### Handling Out-of-Order Timestamps
To enhance reliability in data ingestion, Thanos Receive supports out-of-order samples.
Samples are ingested into the Receiver's TSDB, which has strict requirements for the order of timestamps:
* Samples are expected to have increasing timestamps for a given time series.
* A new sample cannot be more than 1 hour older than the most recent sample of any time series in the TSDB.
When these requirements are not met, the samples are dropped, and an out-of-order warning is logged. However, there are scenarios where out-of-order samples may occur, often because of [clients' misconfigurations](https://thanos.io/tip/operating/troubleshooting.md/#possible-cause-1) or delayed remote write requests, which can cause samples to arrive out of order depending on the remote write implementation. Additional examples at the Prometheus level can be found in [this article](https://promlabs.com/blog/2022/12/15/understanding-duplicate-samples-and-out-of-order-timestamp-errors-in-prometheus/).
As you are not necessarily in control of your clients' setups, you may want to increase resilience against these issues. Support for out-of-order samples has been implemented for the TSDB. This feature can be enabled with the `tsdb.out-of-order.time-window` flag on the Receiver. The downsides are:
* An increase in the TSDB's memory usage, proportional to the number of out-of-order samples.
* The TSDB will produce blocks with overlapping time periods, which the compactor must handle. Ensure the `--compact.enable-vertical-compaction` [flag](https://thanos.io/tip/components/compact.md/#enabling-vertical-compaction) is enabled on the compactor to manage these overlapping blocks. We will cover this in more detail in the next article.
Additionally, consider setting the `tsdb.too-far-in-future.time-window` flag to a value higher than the default 0s to account for possible clock drifts between clients and the Receiver.
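The acceptance logic discussed in this section can be modelled with a small sketch. The `head` type, its fields, and the millisecond arithmetic are illustrative, not the actual Prometheus TSDB API:

```go
package main

import "fmt"

// head models just enough TSDB head state to decide whether an incoming
// sample is accepted.
type head struct {
	lastTimestamp   int64 // newest sample of the series being appended to
	maxSeenAnywhere int64 // newest sample across all series in the head
	oooWindow       int64 // out-of-order window in ms; 0 means disabled
}

// accept applies the two ordering rules: per-series timestamps must increase,
// and a sample cannot lag the newest sample in the head by more than one hour
// (or by oooWindow when out-of-order support is enabled).
func (h head) accept(t int64) bool {
	if h.oooWindow > 0 {
		return t >= h.maxSeenAnywhere-h.oooWindow
	}
	return t > h.lastTimestamp && t >= h.maxSeenAnywhere-3_600_000
}

func main() {
	h := head{lastTimestamp: 10_000_000, maxSeenAnywhere: 10_000_000}
	fmt.Println(h.accept(10_000_001)) // true: newer sample
	fmt.Println(h.accept(9_999_999))  // false: out of order, no window configured

	h.oooWindow = 3_600_000 // e.g. --tsdb.out-of-order.time-window=1h
	fmt.Println(h.accept(9_999_999)) // true: within the out-of-order window
}
```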
### Conclusion
In this first part, we have covered the initial steps of the sample lifecycle in Thanos, focusing on the ingestion process. We have explored the remote write protocol, the Receive component, and the critical configurations needed to ensure high availability and durability. Now, our sample is safely ingested and stored in the system. In the next part, we will continue following our sample's journey, delving into the data management and querying processes.
See the full list of articles in this series:
* Life of a Sample in Thanos, and How to Configure It (Ingestion, Part I)
* Life of a Sample in Thanos, and How to Configure It (Data Management, Part II)
* Life of a Sample in Thanos, and How to Configure It (Querying, Part III)
### Annexes
#### Metrics Terminology: Samples, Labels and Series
* **Sample**: A sample in Prometheus represents a single data point, capturing a measurement of a specific system aspect or property at a given moment. It is the fundamental unit of data in Prometheus, reflecting real-time system states.
* **Labels**: Every sample in Prometheus is tagged with labels, which are key-value pairs that add context and metadata. These labels typically include:
* The nature of the metric being measured.
* The source or origin of the metric.
* Other relevant contextual details.
* **External labels**: External labels are appended by the scraping or receiving component (like a Prometheus server or Thanos Receive). They enable:
* **Sharding**: Included in the `meta.json` file of the block created by Thanos, these labels are used by the compactor and the store to shard block processing effectively.
* **Deduplication**: In high-availability setups where Prometheus servers scrape the same targets, external labels help identify and deduplicate similar samples.
* **Tenancy isolation**: In multi-tenant systems, external labels are used to segregate data per tenant, ensuring logical data isolation.
* **Series** or **Time Series**: In the context of monitoring, a series (the more generic term) is necessarily a time series. A series is defined by a unique combination of label-value pairs. For instance:
```
http_requests_total{method="GET", handler="/users", status="200"}
^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Series name         Labels (key=value format)
(label `__name__`)
```
In this example, `http_requests_total` is the metric name, stored in the special `__name__` label. The unique combination of labels creates a distinct series. Prometheus scrapes these series, attaching timestamps to each sample, thereby forming a dynamic time series.
Samples can be of various types, but for our discussion we will treat them as simple integers for simplicity.
The following diagram illustrates the relationship between samples, labels and series:
<img src="img/life-of-a-sample/series-terminology.png" alt="Series terminology" width="500"/>
#### TSDB Terminology: Chunks, Chunk Files and Blocks
Thanos adopts its [storage architecture](https://thanos.io/tip/thanos/storage.md/#data-in-object-storage) from [Prometheus](https://prometheus.io/docs/prometheus/latest/storage/), utilizing the TSDB (Time Series Database) [file format](https://github.com/prometheus/prometheus/blob/release-2.48/tsdb/docs/format/README.md) as its foundation. Let's review some key terminology that is needed to understand some of the configuration options.
**Samples** from a given time series are first aggregated into small **chunks**. The storage format of a chunk is highly compressed ([see documentation](https://github.com/prometheus/prometheus/blob/release-2.48/tsdb/docs/format/chunks.md#xor-chunk-data)). Accessing a given sample of the chunk requires decoding all preceding values stored in this chunk. This is why chunks hold up to 120 samples, a number chosen to strike a balance between compression benefits and the performance of reading data.
Chunks are created over time for each time series. As time progresses, these chunks are assembled into **chunk files**. Each chunk file, encapsulating chunks from various time series, is limited to 512MiB to manage memory usage effectively during read operations. Initially, these files cover a span of two hours and are subsequently organized into a larger entity known as a **block**.
A **block** is a directory containing the chunk files in a specific time range, an index and some metadata. The two-hour duration for initial blocks is chosen for optimizing factors like storage efficiency and read performance. Over time, these two-hour blocks undergo horizontal compaction by the compactor, merging them into larger blocks. This process is designed to optimize long-term storage by extending the time period each block covers.
The following diagram illustrates the relationship between chunks, chunk files and blocks:
<img src="img/life-of-a-sample/storage-terminology.png" alt="TSDB terminology" width="900"/>



@ -8,4 +8,4 @@ Welcome 👋🏼
This space was created for the Thanos community to share learnings, insights, best practices and cool things to the world. If you are interested in contributing relevant content to Thanos blog, feel free to add Pull Request (PR) to [Thanos repo's blog directory](http://github.com/thanos-io/thanos). See ya there!
PS: For Prometheus specific content, consider contributing to [Prometheus blog space](https://prometheus.io/blog/) by creating PR to [Prometheus docs repo](https://github.com/prometheus/docs/tree/main/content/blog).
PS: For Prometheus specific content, consider contributing to [Prometheus blog space](https://prometheus.io/blog/) by creating PR to [Prometheus docs repo](https://github.com/prometheus/docs/tree/main/blog-posts).


@ -106,7 +106,7 @@ external_labels: {cluster="us1", replica="1", receive="true", environment="produ
external_labels: {cluster="us1", replica="1", receive="true", environment="staging"}
```
and set `--deduplication.replica-label="replica"`, Compactor will assume those as:
and set `--deduplication.replica-label=replica`, Compactor will assume those as:
```
external_labels: {cluster="eu1", receive="true", environment="production"} (2 streams, resulted in one)
@ -152,6 +152,8 @@ message AggrChunk {
This means that for each series we collect various aggregations with a given interval: 5m or 1h (depending on resolution). This allows us to keep precision on large duration queries, without fetching too many samples.
Native histogram downsampling leverages the fact that one can aggregate & reduce schema i.e. downsample native histograms. Native histograms only store 3 aggregations - counter, count, and sum. Sum and count are used to produce "an average" native histogram. Counter is a counter that is used with functions irate, rate, increase, and resets.
### ⚠ Downsampling: Note About Resolution and Retention ⚠️
Resolution is a distance between data points on your graphs. E.g.
@ -278,10 +280,75 @@ usage: thanos compact [<flags>]
Continuously compacts blocks in an object store bucket.
Flags:
-h, --[no-]help Show context-sensitive help (also try
--help-long and --help-man).
--[no-]version Show application version.
--log.level=info Log filtering level.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--[no-]enable-auto-gomemlimit
Enable go runtime to automatically limit memory
consumption.
--auto-gomemlimit.ratio=0.9
The ratio of reserved GOMEMLIMIT memory to the
detected maximum container or system memory.
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for
HTTP Server.
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--data-dir="./data" Data directory in which to cache blocks and
process compactions.
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config=<content>
Alternative to 'objstore.config-file'
flag (mutually exclusive). Content of
YAML file that contains object store
configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--consistency-delay=30m Minimum age of fresh (non-compacted)
blocks before they are being processed.
Malformed blocks older than the maximum of
consistency-delay and 48h0m0s will be removed.
--retention.resolution-raw=0d
How long to retain raw samples in bucket.
Setting this to 0d will retain samples of this
resolution forever
--retention.resolution-5m=0d
How long to retain samples of resolution 1 (5
minutes) in bucket. Setting this to 0d will
retain samples of this resolution forever
--retention.resolution-1h=0d
How long to retain samples of resolution 2 (1
hour) in bucket. Setting this to 0d will retain
samples of this resolution forever
-w, --[no-]wait Do not exit after all compactions have been
processed and wait for new work.
--wait-interval=5m Wait interval between consecutive compaction
runs and bucket refreshes. Only works when
--wait flag specified.
--[no-]downsampling.disable
Disables downsampling. This is not recommended
as querying long time ranges without
non-downsampled data is not efficient and useful
e.g. it is not possible to render all samples for
a human eye anyway
--block-discovery-strategy="concurrent"
One of concurrent, recursive. When set to
concurrent, stores will concurrently issue
@@ -291,13 +358,13 @@ Flags:
recursively traversing into each directory.
This avoids N+1 calls at the expense of having
slower bucket iterations.
--block-meta-fetch-concurrency=32
Number of goroutines to use when fetching block
metadata from object storage.
--block-files-concurrency=1
Number of goroutines to use when
fetching/uploading block files from object
storage.
--block-meta-fetch-concurrency=32
Number of goroutines to use when fetching block
metadata from object storage.
--block-viewer.global.sync-block-interval=1m
Repeat interval for syncing the blocks between
local and remote view for /global Block Viewer
@@ -306,56 +373,26 @@ Flags:
Maximum time for syncing the blocks between
local and remote view for /global Block Viewer
UI.
--bucket-web-label=BUCKET-WEB-LABEL
External block label to use as group title in
the bucket web UI
--compact.blocks-fetch-concurrency=1
Number of goroutines to use when downloading
blocks during compaction.
--compact.cleanup-interval=5m
How often we should clean up partially uploaded
blocks and blocks with deletion mark in the
background when --wait has been enabled. Setting
it to "0s" disables it - the cleaning will only
happen at the end of an iteration.
--compact.concurrency=1 Number of goroutines to use when compacting
groups.
--compact.progress-interval=5m
Frequency of calculating the compaction progress
in the background when --wait has been enabled.
Setting it to "0s" disables it. Now compaction,
downsampling and retention progress are
supported.
--consistency-delay=30m Minimum age of fresh (non-compacted)
blocks before they are being processed.
Malformed blocks older than the maximum of
consistency-delay and 48h0m0s will be removed.
--data-dir="./data" Data directory in which to cache blocks and
process compactions.
--deduplication.func= Experimental. Deduplication algorithm for
merging overlapping blocks. Possible values are:
"", "penalty". If no value is specified,
the default compact deduplication merger
is used, which performs 1:1 deduplication
for samples. When set to penalty, penalty
based deduplication algorithm will be used.
At least one replica label has to be set via
--deduplication.replica-label flag.
--deduplication.replica-label=DEDUPLICATION.REPLICA-LABEL ...
Label to treat as a replica indicator of blocks
that can be deduplicated (repeated flag). This
will merge multiple replica blocks into one.
This process is irreversible. Experimental.
When one or more labels are set, compactor
will ignore the given labels so that vertical
compaction can merge the blocks. Please note
that by default this uses a NAIVE algorithm
for merging which works well for deduplication
of blocks with **precisely the same samples**
like produced by Receiver replication. If you
need a different deduplication algorithm (e.g.
one that works well with Prometheus replicas),
please set it via --deduplication.func.
--compact.concurrency=1 Number of goroutines to use when compacting
groups.
--compact.blocks-fetch-concurrency=1
Number of goroutines to use when downloading
blocks during compaction.
--downsample.concurrency=1
Number of goroutines to use when downsampling
blocks.
--delete-delay=48h Time before a block marked for deletion is
deleted from bucket. If delete-delay is non
zero, blocks will be marked for deletion and
@@ -367,37 +404,45 @@ Flags:
block loaded, or compactor is ignoring the
deletion because it's compacting the block at
the same time.
--disable-admin-operations
Disable UI/API admin operations like marking
blocks for deletion and no compaction.
--downsample.concurrency=1
Number of goroutines to use when downsampling
blocks.
--downsampling.disable Disables downsampling. This is not recommended
as querying long time ranges without
non-downsampled data is not efficient and useful
e.g. it is not possible to render all samples for
a human eye anyway
--enable-auto-gomemlimit Enable go runtime to automatically limit memory
consumption.
--deduplication.func= Experimental. Deduplication algorithm for
merging overlapping blocks. Possible values are:
"", "penalty". If no value is specified,
the default compact deduplication merger
is used, which performs 1:1 deduplication
for samples. When set to penalty, penalty
based deduplication algorithm will be used.
At least one replica label has to be set via
--deduplication.replica-label flag.
--deduplication.replica-label=DEDUPLICATION.REPLICA-LABEL ...
Experimental. Label to treat as a replica
indicator of blocks that can be deduplicated
(repeated flag). This will merge multiple
replica blocks into one. This process is
irreversible. Flag may be specified multiple
times as well as a comma separated list of
labels. When one or more labels are set,
compactor will ignore the given labels so that
vertical compaction can merge the blocks. Please
note that by default this uses a NAIVE algorithm
for merging which works well for deduplication
of blocks with **precisely the same samples**
like produced by Receiver replication. If you
need a different deduplication algorithm (e.g.
one that works well with Prometheus replicas),
please set it via --deduplication.func.
--hash-func= Specify which hash function to use when
calculating the hashes of produced files.
If no function has been specified, it does not
happen. This permits avoiding downloading some
files twice albeit at some performance cost.
Possible values are: "", "SHA256".
-h, --help Show context-sensitive help (also try
--help-long and --help-man).
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for
HTTP Server.
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--log.level=info Log filtering level.
--min-time=0000-01-01T00:00:00Z
Start of time range limit to compact.
Thanos Compactor will compact only blocks, which
happened later than this value. Option can be a
constant time in RFC3339 format or time duration
relative to current time, such as -1d or 2h45m.
Valid duration units are ms, s, m, h, d, w, y.
--max-time=9999-12-31T23:59:59Z
End of time range limit to compact.
Thanos Compactor will compact only blocks,
@@ -406,35 +451,14 @@ Flags:
duration relative to current time, such as -1d
or 2h45m. Valid duration units are ms, s, m, h,
d, w, y.
--min-time=0000-01-01T00:00:00Z
Start of time range limit to compact.
Thanos Compactor will compact only blocks, which
happened later than this value. Option can be a
constant time in RFC3339 format or time duration
relative to current time, such as -1d or 2h45m.
Valid duration units are ms, s, m, h, d, w, y.
--objstore.config=<content>
Alternative to 'objstore.config-file'
flag (mutually exclusive). Content of
YAML file that contains object store
configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--retention.resolution-1h=0d
How long to retain samples of resolution 2 (1
hour) in bucket. Setting this to 0d will retain
samples of this resolution forever
--retention.resolution-5m=0d
How long to retain samples of resolution 1 (5
minutes) in bucket. Setting this to 0d will
retain samples of this resolution forever
--retention.resolution-raw=0d
How long to retain raw samples in bucket.
Setting this to 0d will retain samples of this
resolution forever
--[no-]web.disable Disable Block Viewer UI.
--selector.relabel-config-file=<file-path>
Path to YAML file with relabeling
configuration that allows selecting blocks
to act on based on their external labels.
It follows thanos sharding relabel-config
syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--selector.relabel-config=<content>
Alternative to 'selector.relabel-config-file'
flag (mutually exclusive). Content of YAML
@@ -443,32 +467,10 @@ Flags:
external labels. It follows thanos sharding
relabel-config syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--selector.relabel-config-file=<file-path>
Path to YAML file with relabeling
configuration that allows selecting blocks
to act on based on their external labels.
It follows thanos sharding relabel-config
syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--version Show application version.
-w, --wait Do not exit after all compactions have been
processed and wait for new work.
--wait-interval=5m Wait interval between consecutive compaction
runs and bucket refreshes. Only works when
--wait flag specified.
--web.disable Disable Block Viewer UI.
--web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to
be allowed by all.
--web.route-prefix="" Prefix for API and UI endpoints. This allows
thanos UI to be served on a sub-path. This
option is analogous to --web.route-prefix of
Prometheus.
--web.external-prefix="" Static prefix for all HTML links and redirect
URLs in the bucket web UI interface.
Actual endpoints are still served on / or the
@@ -488,9 +490,14 @@ Flags:
stripped prefix value in X-Forwarded-Prefix
header. This allows thanos UI to be served on a
sub-path.
--web.route-prefix="" Prefix for API and UI endpoints. This allows
thanos UI to be served on a sub-path. This
option is analogous to --web.route-prefix of
Prometheus.
--[no-]web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to
be allowed by all.
--bucket-web-label=BUCKET-WEB-LABEL
External block label to use as group title in
the bucket web UI
--[no-]disable-admin-operations
Disable UI/API admin operations like marking
blocks for deletion and no compaction.
```


@@ -200,194 +200,196 @@ usage: thanos query-frontend [<flags>]
Query frontend command implements a service deployed in front of queriers to
improve query parallelization and caching.
Flags:
--auto-gomemlimit.ratio=0.9
The ratio of reserved GOMEMLIMIT memory to the
detected maximum container or system memory.
--cache-compression-type=""
Use compression in results cache.
Supported values are: 'snappy' and ” (disable
compression).
--enable-auto-gomemlimit Enable go runtime to automatically limit memory
consumption.
-h, --help Show context-sensitive help (also try
--help-long and --help-man).
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for
HTTP Server.
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--labels.default-time-range=24h
The default metadata time range duration for
retrieving labels through Labels and Series API
when the range parameters are not specified.
--labels.max-query-parallelism=14
Maximum number of labels requests will be
scheduled in parallel by the Frontend.
--labels.max-retries-per-request=5
Maximum number of retries for a single
label/series API request; beyond this,
the downstream error is returned.
--labels.partial-response Enable partial response for labels requests
if no partial_response param is specified.
--no-labels.partial-response for disabling.
--labels.response-cache-config=<content>
Alternative to
'labels.response-cache-config-file' flag
(mutually exclusive). Content of YAML file that
contains response cache configuration.
--labels.response-cache-config-file=<file-path>
Path to YAML file that contains response cache
configuration.
--labels.response-cache-max-freshness=1m
Most recent allowed cacheable result for
labels requests, to prevent caching very recent
results that might still be in flux.
--labels.split-interval=24h
Split labels requests by an interval and
execute in parallel, it should be greater
than 0 when labels.response-cache-config is
configured.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--log.level=info Log filtering level.
--query-frontend.compress-responses
Compress HTTP responses.
--query-frontend.downstream-tripper-config=<content>
Alternative to
'query-frontend.downstream-tripper-config-file'
flag (mutually exclusive). Content of YAML file
that contains downstream tripper configuration.
If your downstream URL is localhost or
127.0.0.1 then it is highly recommended to
increase max_idle_conns_per_host to at least
100.
--query-frontend.downstream-tripper-config-file=<file-path>
Path to YAML file that contains downstream
tripper configuration. If your downstream URL
is localhost or 127.0.0.1 then it is highly
recommended to increase max_idle_conns_per_host
to at least 100.
--query-frontend.downstream-url="http://localhost:9090"
URL of downstream Prometheus Query compatible
API.
--query-frontend.enable-x-functions
Enable experimental x-functions in query-frontend.
--no-query-frontend.enable-x-functions for
disabling.
--query-frontend.forward-header=<http-header-name> ...
List of headers forwarded by the query-frontend
to downstream queriers, default is empty
--query-frontend.log-queries-longer-than=0
Log queries that are slower than the specified
duration. Set to 0 to disable. Set to < 0 to
enable on all queries.
--query-frontend.org-id-header=<http-header-name> ...
Deprecation Warning - This flag
will soon be deprecated in favor of
query-frontend.tenant-header and both flags
cannot be used at the same time. Request header
names used to identify the source of slow
queries (repeated flag). The values of the
header will be added to the org id field in
the slow query log. If multiple headers match
the request, the first matching arg specified
will take precedence. If no headers match
'anonymous' will be used.
--query-frontend.slow-query-logs-user-header=<http-header-name>
Set the value of the field remote_user in the
slow query logs to the value of the given HTTP
header. Falls back to reading the user from the
basic auth header.
--query-frontend.vertical-shards=QUERY-FRONTEND.VERTICAL-SHARDS
Number of shards to use when
distributing shardable PromQL queries.
For more details, you can refer to
the Vertical query sharding proposal:
https://thanos.io/tip/proposals-accepted/202205-vertical-query-sharding.md
--query-range.align-range-with-step
Mutate incoming queries to align their
start and end with their step for better
cache-ability. Note: Grafana dashboards do that
by default.
--query-range.horizontal-shards=0
Split queries in this many requests
when query duration is below
query-range.max-split-interval.
--query-range.max-query-length=0
Limit the query time range (end - start time)
in the query-frontend, 0 disables it.
--query-range.max-query-parallelism=14
Maximum number of query range requests will be
scheduled in parallel by the Frontend.
--query-range.max-retries-per-request=5
Maximum number of retries for a single query
range request; beyond this, the downstream
error is returned.
--query-range.max-split-interval=0
Split query range below this interval in
query-range.horizontal-shards. Queries with a
range longer than this value will be split in
multiple requests of this length.
--query-range.min-split-interval=0
Split query range requests above this
interval in query-range.horizontal-shards
requests of equal range. Using
this parameter is not allowed with
query-range.split-interval. One should also set
query-range.split-min-horizontal-shards to a
value greater than 1 to enable splitting.
--query-range.partial-response
Enable partial response for query range
requests if no partial_response param is
specified. --no-query-range.partial-response
for disabling.
--query-range.request-downsampled
Make additional query for downsampled data in
case of empty or incomplete response to range
request.
--query-range.response-cache-config=<content>
Alternative to
'query-range.response-cache-config-file' flag
(mutually exclusive). Content of YAML file that
contains response cache configuration.
--query-range.response-cache-config-file=<file-path>
Path to YAML file that contains response cache
configuration.
--query-range.response-cache-max-freshness=1m
Most recent allowed cacheable result for query
range requests, to prevent caching very recent
results that might still be in flux.
--query-range.split-interval=24h
Split query range requests by an interval and
execute in parallel, it should be greater than
0 when query-range.response-cache-config is
configured.
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
-h, --[no-]help Show context-sensitive help (also try --help-long
and --help-man).
--[no-]version Show application version.
--log.level=info Log filtering level.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--version Show application version.
--web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to
be allowed by all.
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--[no-]enable-auto-gomemlimit
Enable go runtime to automatically limit memory
consumption.
--auto-gomemlimit.ratio=0.9
The ratio of reserved GOMEMLIMIT memory to the
detected maximum container or system memory.
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for HTTP
Server.
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--[no-]web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to be
allowed by all.
--[no-]query-range.align-range-with-step
Mutate incoming queries to align their start and
end with their step for better cache-ability.
Note: Grafana dashboards do that by default.
--[no-]query-range.request-downsampled
Make additional query for downsampled data in
case of empty or incomplete response to range
request.
--query-range.split-interval=24h
Split query range requests by an interval and
execute in parallel, it should be greater than
0 when query-range.response-cache-config is
configured.
--query-range.min-split-interval=0
Split query range requests above this interval
in query-range.horizontal-shards requests of
equal range. Using this parameter is not allowed
with query-range.split-interval. One should also
set query-range.split-min-horizontal-shards to a
value greater than 1 to enable splitting.
--query-range.max-split-interval=0
Split query range below this interval in
query-range.horizontal-shards. Queries with a
range longer than this value will be split in
multiple requests of this length.
--query-range.horizontal-shards=0
Split queries in this many requests when query
duration is below query-range.max-split-interval.
--query-range.max-retries-per-request=5
Maximum number of retries for a single query
range request; beyond this, the downstream error
is returned.
--[no-]query-frontend.enable-x-functions
Enable experimental x-functions in query-frontend.
--no-query-frontend.enable-x-functions for
disabling.
--enable-feature= ... Comma separated feature names to enable. Valid
options for now: promql-experimental-functions
(enables promql experimental functions in
query-frontend)
--query-range.max-query-length=0
Limit the query time range (end - start time) in
the query-frontend, 0 disables it.
--query-range.max-query-parallelism=14
Maximum number of query range requests will be
scheduled in parallel by the Frontend.
--query-range.response-cache-max-freshness=1m
Most recent allowed cacheable result for query
range requests, to prevent caching very recent
results that might still be in flux.
--[no-]query-range.partial-response
Enable partial response for query range requests
if no partial_response param is specified.
--no-query-range.partial-response for disabling.
--query-range.response-cache-config-file=<file-path>
Path to YAML file that contains response cache
configuration.
--query-range.response-cache-config=<content>
Alternative to
'query-range.response-cache-config-file' flag
(mutually exclusive). Content of YAML file that
contains response cache configuration.
--labels.split-interval=24h
Split labels requests by an interval and execute
in parallel, it should be greater than 0 when
labels.response-cache-config is configured.
--labels.max-retries-per-request=5
Maximum number of retries for a single
label/series API request; beyond this, the
downstream error is returned.
--labels.max-query-parallelism=14
Maximum number of labels requests will be
scheduled in parallel by the Frontend.
--labels.response-cache-max-freshness=1m
Most recent allowed cacheable result for labels
requests, to prevent caching very recent results
that might still be in flux.
--[no-]labels.partial-response
Enable partial response for labels requests
if no partial_response param is specified.
--no-labels.partial-response for disabling.
--labels.default-time-range=24h
The default metadata time range duration for
retrieving labels through Labels and Series API
when the range parameters are not specified.
--labels.response-cache-config-file=<file-path>
Path to YAML file that contains response cache
configuration.
--labels.response-cache-config=<content>
Alternative to
'labels.response-cache-config-file' flag
(mutually exclusive). Content of YAML file that
contains response cache configuration.
--cache-compression-type=""
Use compression in results cache. Supported
values are: 'snappy' and '' (disable compression).
--query-frontend.downstream-url="http://localhost:9090"
URL of downstream Prometheus Query compatible
API.
--query-frontend.downstream-tripper-config-file=<file-path>
Path to YAML file that contains downstream
tripper configuration. If your downstream URL
is localhost or 127.0.0.1 then it is highly
recommended to increase max_idle_conns_per_host
to at least 100.
--query-frontend.downstream-tripper-config=<content>
Alternative to
'query-frontend.downstream-tripper-config-file'
flag (mutually exclusive). Content of YAML file
that contains downstream tripper configuration.
If your downstream URL is localhost or 127.0.0.1
then it is highly recommended to increase
max_idle_conns_per_host to at least 100.
--[no-]query-frontend.compress-responses
Compress HTTP responses.
--query-frontend.log-queries-longer-than=0
Log queries that are slower than the specified
duration. Set to 0 to disable. Set to < 0 to
enable on all queries.
--[no-]query-frontend.force-query-stats
Enables query statistics for all queries and will
export statistics as logs and service headers.
--query-frontend.org-id-header=<http-header-name> ...
Deprecation Warning - This flag
will soon be deprecated in favor of
query-frontend.tenant-header and both flags
cannot be used at the same time. Request header
names used to identify the source of slow queries
(repeated flag). The values of the header will be
added to the org id field in the slow query log.
If multiple headers match the request, the first
matching arg specified will take precedence.
If no headers match 'anonymous' will be used.
--query-frontend.forward-header=<http-header-name> ...
List of headers forwarded by the query-frontend
to downstream queriers, default is empty
--query-frontend.vertical-shards=QUERY-FRONTEND.VERTICAL-SHARDS
Number of shards to use when
distributing shardable PromQL queries.
For more details, you can refer to
the Vertical query sharding proposal:
https://thanos.io/tip/proposals-accepted/202205-vertical-query-sharding.md
--query-frontend.slow-query-logs-user-header=<http-header-name>
Set the value of the field remote_user in the
slow query logs to the value of the given HTTP
header. Falls back to reading the user from the
basic auth header.
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
```


@@ -103,7 +103,16 @@ thanos query \
This logic can also be controlled via parameter on QueryAPI. More details below.
## Experimental PromQL Engine
### Deduplication Algorithms
Thanos Querier supports different algorithms for deduplicating overlapping series. You can choose the deduplication algorithm using the `--deduplication.func` flag. The available options are:
* `penalty` (default): This is the default deduplication algorithm used by Thanos. It fills gaps only after a certain penalty window. This helps avoid flapping between replicas due to minor differences or delays.
* `chain`: This algorithm performs 1:1 deduplication for samples. It merges all available data points from the replicas without applying any penalty. This is useful in deployments based on receivers, where each replica is populated by the same data. In such cases, using the penalty algorithm may cause gaps even when data is available in other replicas.
Note that deduplication of HA groups is not supported by the `chain` algorithm.
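For a receiver-based deployment as described above, a Querier using chain deduplication could look like this sketch (the endpoint address and the `replica` label name are placeholders for your setup):

```bash
# Sketch: chain deduplication merges all replica data points 1:1, with no penalty window.
thanos query \
  --endpoint=receiver.example.com:10901 \
  --query.replica-label=replica \
  --deduplication.func=chain
```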
## Thanos PromQL Engine (experimental)
By default, the Thanos Querier comes with the standard Prometheus PromQL engine. However, when `--query.promql-engine=thanos` is specified, Thanos will use the [experimental Thanos PromQL engine](http://github.com/thanos-community/promql-engine), which is a drop-in, efficient implementation of a PromQL engine with a query planner and optimizers.
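Opting into the experimental engine is a single flag; a minimal sketch (the store address is a placeholder):

```bash
thanos query \
  --query.promql-engine=thanos \
  --endpoint=store.example.com:10901
```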
@@ -113,6 +122,14 @@ This feature is still **experimental** given active development. All queries sho
For new engine bugs/issues, please use https://github.com/thanos-community/promql-engine GitHub issues.
### Distributed execution mode
When using the Thanos PromQL Engine, the distributed execution mode can be enabled using `--query.mode=distributed`. When this mode is enabled, the Querier will break down each query into independent fragments and delegate them to components which implement the Query API.
This mode is particularly useful in architectures where multiple independent Queriers are deployed in separate environments (different regions or different Kubernetes clusters) and are federated through a separate central Querier. A Querier running in the distributed mode will only talk to Queriers, or other components which implement the Query API. Endpoints which only act as Stores (e.g. Store Gateways or Rulers), and are directly connected to a distributed Querier, will not be included in the execution of a distributed query. This constraint should help with keeping the distributed query execution simple and efficient, but could be removed in the future if there are good use cases for it.
For further details on the design and use cases of this feature, see the [official design document](https://thanos.io/tip/proposals-done/202301-distributed-query-execution.md/).
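A central Querier federating two remote Queriers might look like this sketch (the addresses are placeholders; distributed mode assumes the Thanos engine is enabled):

```bash
# Sketch: the central Querier fragments queries across the remote Queriers,
# which each execute their fragment against their local stores.
thanos query \
  --query.promql-engine=thanos \
  --query.mode=distributed \
  --endpoint=querier-eu.example.com:10901 \
  --endpoint=querier-us.example.com:10901
```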
## Query API Overview
As mentioned, Query API exposed by Thanos is guaranteed to be compatible with [Prometheus 2.x. API](https://prometheus.io/docs/prometheus/latest/querying/api/). However for additional Thanos features on top of Prometheus, Thanos adds:
@@ -264,7 +281,7 @@ Example file SD file in YAML:
### Tenant Metrics
Tenant information is captured in relevant Thanos exported metrics in the Querier, Query Frontend and Store. In order to make use of this functionality, requests to the Query/Query Frontend component should include the tenant-id in the appropriate HTTP request header as configured with `--query.tenant-header`. The tenant information is passed through components (including Query Frontend), down to the Thanos Store, enabling per-tenant metrics in these components also. If no tenant header is set on requests to the query component, the default tenant as defined by `--query.tenant-default-id` will be used.
Tenant information is captured in relevant Thanos exported metrics in the Querier, Query Frontend and Store. In order to make use of this functionality, requests to the Query/Query Frontend component should include the tenant-id in the appropriate HTTP request header as configured with `--query.tenant-header`. The tenant information is passed through components (including Query Frontend), down to the Thanos Store, enabling per-tenant metrics in these components also. If no tenant header is set on requests to the query component, the default tenant as defined by `--query.default-tenant-id` will be used.
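Assuming the default tenant header name `THANOS-TENANT` (it is configurable via `--query.tenant-header`, so verify your deployment's value), a request tagged for a tenant could look like this sketch (host and tenant ID are placeholders):

```bash
curl -H 'THANOS-TENANT: team-a' \
  'http://querier.example.com:10902/api/v1/query?query=up'
```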
### Tenant Enforcement
@@ -274,14 +291,6 @@ In case of nested Thanos Query components, it's important to note that tenancy e
Further, note that there are no authentication mechanisms in Thanos, so anyone can set an arbitrary tenant in the HTTP header. It is recommended to use a proxy in front of the querier in case an authentication mechanism is needed. The Query UI also includes an option to set an arbitrary tenant, and should therefore not be exposed to end-users if users should not be able to see each others data.
### Distributed execution mode
The distributed execution mode can be enabled using `--query.mode=distributed`. When this mode is enabled, the Querier will break down each query into independent fragments and delegate them to components which implement the Query API.
This mode is particularly useful in architectures where multiple independent Queriers are deployed in separate environments (different regions or different Kubernetes clusters) and are federated through a separate central Querier. A Querier running in the distributed mode will only talk to Queriers, or other components which implement the Query API. Endpoints which only act as Stores (e.g. Store Gateways or Rulers), and are directly connected to a distributed Querier, will not be included in the execution of a distributed query. This constraint should help with keeping the distributed query execution simple and efficient, but could be removed in the future if there are good use cases for it.
For further details on the design and use cases of this feature, see the [official design document](https://thanos.io/tip/proposals-done/202301-distributed-query-execution.md/).
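For illustration, a central Querier federating two regional Queriers might be started as below. The hostnames are placeholders, and the static `--endpoint` flags could equally be replaced by the endpoint SD config flags:

```shell
# Central querier in distributed mode: each query is broken into fragments
# that are delegated to the regional queriers listed below.
thanos query \
  --query.mode=distributed \
  --endpoint=dns+querier-eu.example.org:10901 \
  --endpoint=dns+querier-us.example.org:10901
```

Remember that endpoints which only implement the Store API (Store Gateways, Rulers) attached directly to this central Querier would not take part in distributed execution.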
## Flags
```$ mdox-exec="thanos query --help"
@@ -290,71 +299,29 @@ usage: thanos query [<flags>]
Query node exposing PromQL enabled Query API with data retrieved from multiple
store nodes.
Flags:
--alert.query-url=ALERT.QUERY-URL
The external Thanos Query URL that would be set
in all alerts 'Source' field.
-h, --[no-]help Show context-sensitive help (also try
--help-long and --help-man).
--[no-]version Show application version.
--log.level=info Log filtering level.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--[no-]enable-auto-gomemlimit
Enable go runtime to automatically limit memory
consumption.
--auto-gomemlimit.ratio=0.9
The ratio of reserved GOMEMLIMIT memory to the
detected maximum container or system memory.
--enable-auto-gomemlimit Enable go runtime to automatically limit memory
consumption.
--endpoint=<endpoint> ... Addresses of statically configured Thanos
API servers (repeatable). The scheme may be
prefixed with 'dns+' or 'dnssrv+' to detect
Thanos API servers through respective DNS
lookups.
--endpoint-group=<endpoint-group> ...
Experimental: DNS name of statically configured
Thanos API server groups (repeatable). Targets
resolved from the DNS name will be queried in
a round-robin, instead of a fanout manner.
This flag should be used when connecting a
Thanos Query to HA groups of Thanos components.
--endpoint-group-strict=<endpoint-group-strict> ...
Experimental: DNS name of statically configured
Thanos API server groups (repeatable) that are
always used, even if the health check fails.
--endpoint-strict=<staticendpoint> ...
Addresses of only statically configured Thanos
API servers that are always used, even if
the health check fails. Useful if you have a
caching layer on top.
--grpc-address="0.0.0.0:10901"
Listen ip:port address for gRPC endpoints
(StoreAPI). Make sure this address is routable
from other components.
--grpc-client-server-name=""
Server name to verify the hostname on
the returned gRPC certificates. See
https://tools.ietf.org/html/rfc4366#section-3.1
--grpc-client-tls-ca="" TLS CA Certificates to use to verify gRPC
servers
--grpc-client-tls-cert="" TLS Certificates to use to identify this client
to the server
--grpc-client-tls-key="" TLS Key for the client's certificate
--grpc-client-tls-secure Use TLS when talking to the gRPC server
--grpc-client-tls-skip-verify
Disable TLS certificate verification i.e self
signed, signed by fake CA
--grpc-compression=none Compression algorithm to use for gRPC requests
to other clients. Must be one of: snappy, none
--grpc-grace-period=2m Time to wait after an interrupt received for
GRPC Server.
--grpc-server-max-connection-age=60m
The grpc server max connection age. This
controls how often to re-establish connections
and redo TLS handshakes.
--grpc-server-tls-cert="" TLS Certificate for gRPC server, leave blank to
disable TLS
--grpc-server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--grpc-server-tls-key="" TLS Key for the gRPC server, leave blank to
disable TLS
-h, --help Show context-sensitive help (also try
--help-long and --help-man).
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for
@@ -362,177 +329,50 @@ Flags:
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--log.level=info Log filtering level.
--query.active-query-path=""
Directory to log currently active queries in
the queries.active file.
--query.auto-downsampling Enable automatic adjustment (step / 5) to what
source of data should be used in store gateways
if no max_source_resolution param is specified.
--query.conn-metric.label=external_labels... ...
Optional selection of query connection metric
labels to be collected from endpoint set
--query.default-evaluation-interval=1m
Set default evaluation interval for sub
queries.
--query.default-step=1s Set default step for range queries. Default
step is only used when step is not set in UI.
In such cases, Thanos UI will use default
step to calculate resolution (resolution
= max(rangeSeconds / 250, defaultStep)).
This will not work from Grafana, but Grafana
has __step variable which can be used.
--query.default-tenant-id="default-tenant"
Default tenant ID to use if tenant header is
not present
--query.enable-x-functions
Whether to enable extended rate functions
(xrate, xincrease and xdelta). Only has effect
when used with Thanos engine.
--query.enforce-tenancy Enforce tenancy on Query APIs. Responses
are returned only if the label value of the
configured tenant-label-name and the value of
the tenant header matches.
--query.lookback-delta=QUERY.LOOKBACK-DELTA
The maximum lookback duration for retrieving
metrics during expression evaluations.
PromQL always evaluates the query for the
certain timestamp (query range timestamps are
deduced by step). Since scrape intervals might
be different, PromQL looks back for given
amount of time to get latest sample. If it
exceeds the maximum lookback delta it assumes
series is stale and returns none (a gap).
This is why lookback delta should be set to at
least 2 times of the slowest scrape interval.
If unset it will use the promql default of 5m.
--query.max-concurrent=20 Maximum number of queries processed
concurrently by query node.
--query.max-concurrent-select=4
Maximum number of select requests made
concurrently per a query.
--query.metadata.default-time-range=0s
The default metadata time range duration for
retrieving labels through Labels and Series API
when the range parameters are not specified.
The zero value means range covers the time
since the beginning.
--query.mode=local PromQL query mode. One of: local, distributed.
--query.partial-response Enable partial response for queries if
no partial_response param is specified.
--no-query.partial-response for disabling.
--query.promql-engine=prometheus
Default PromQL engine to use.
--query.replica-label=QUERY.REPLICA-LABEL ...
Labels to treat as a replica indicator along
which data is deduplicated. Still you will
be able to query without deduplication using
'dedup=false' parameter. Data includes time
series, recording rules, and alerting rules.
--query.telemetry.request-duration-seconds-quantiles=0.1... ...
The quantiles for exporting metrics about the
request duration quantiles.
--query.telemetry.request-samples-quantiles=100... ...
The quantiles for exporting metrics about the
samples count quantiles.
--query.telemetry.request-series-seconds-quantiles=10... ...
The quantiles for exporting metrics about the
series count quantiles.
--query.tenant-certificate-field=
Use TLS client's certificate field to determine
tenant for write requests. Must be one of
organization, organizationalUnit or commonName.
This setting will cause the query.tenant-header
flag value to be ignored.
--query.tenant-header="THANOS-TENANT"
HTTP header to determine tenant.
--query.tenant-label-name="tenant_id"
Label name to use when enforcing tenancy (if
--query.enforce-tenancy is enabled).
--query.timeout=2m Maximum time to process query by query node.
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--selector-label=<name>="<value>" ...
Query selector labels that will be exposed in
info endpoint (repeated).
--selector.relabel-config=<content>
Alternative to 'selector.relabel-config-file'
flag (mutually exclusive). Content of YAML
file with relabeling configuration that allows
selecting blocks to query based on their
external labels. It follows the Thanos sharding
relabel-config syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--selector.relabel-config-file=<file-path>
Path to YAML file with relabeling
configuration that allows selecting blocks
to query based on their external labels.
It follows the Thanos sharding relabel-config
syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--store=<store> ... Deprecation Warning - This flag is deprecated
and replaced with `endpoint`. Addresses of
statically configured store API servers
(repeatable). The scheme may be prefixed with
'dns+' or 'dnssrv+' to detect store API servers
through respective DNS lookups.
--store-strict=<staticstore> ...
Deprecation Warning - This flag is deprecated
and replaced with `endpoint-strict`. Addresses
of only statically configured store API servers
that are always used, even if the health check
fails. Useful if you have a caching layer on
top.
--store.limits.request-samples=0
The maximum samples allowed for a single
Series request, The Series call fails if
this limit is exceeded. 0 means no limit.
NOTE: For efficiency the limit is internally
implemented as 'chunks limit' considering each
chunk contains a maximum of 120 samples.
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--store.response-timeout=0ms
If a Store doesn't send any data in this
specified duration then a Store will be ignored
and partial data will be returned if it's
enabled. 0 disables timeout.
--store.sd-dns-interval=30s
Interval between DNS resolutions.
--store.sd-files=<path> ...
Path to files that contain addresses of store
API servers. The path can be a glob pattern
(repeatable).
--store.sd-interval=5m Refresh interval to re-read file SD files.
It is used as a resync fallback.
--store.unhealthy-timeout=5m
Timeout before an unhealthy store is cleaned
from the store UI page.
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--version Show application version.
--web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to
be allowed by all.
--grpc-address="0.0.0.0:10901"
Listen ip:port address for gRPC endpoints
(StoreAPI). Make sure this address is routable
from other components.
--grpc-server-tls-cert="" TLS Certificate for gRPC server, leave blank to
disable TLS
--grpc-server-tls-key="" TLS Key for the gRPC server, leave blank to
disable TLS
--grpc-server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--grpc-server-tls-min-version="1.3"
TLS supported minimum version for gRPC server.
If no version is specified, it'll default to
1.3. Allowed values: ["1.0", "1.1", "1.2",
"1.3"]
--grpc-server-max-connection-age=60m
The grpc server max connection age. This
controls how often to re-establish connections
and redo TLS handshakes.
--grpc-grace-period=2m Time to wait after an interrupt received for
GRPC Server.
--[no-]grpc-client-tls-secure
Use TLS when talking to the gRPC server
--[no-]grpc-client-tls-skip-verify
Disable TLS certificate verification i.e self
signed, signed by fake CA
--grpc-client-tls-cert="" TLS Certificates to use to identify this client
to the server
--grpc-client-tls-key="" TLS Key for the client's certificate
--grpc-client-tls-ca="" TLS CA Certificates to use to verify gRPC
servers
--grpc-client-server-name=""
Server name to verify the hostname on
the returned gRPC certificates. See
https://tools.ietf.org/html/rfc4366#section-3.1
--grpc-compression=none Compression algorithm to use for gRPC requests
to other clients. Must be one of: snappy, none
--web.route-prefix="" Prefix for API and UI endpoints. This allows
thanos UI to be served on a sub-path.
Defaults to the value of --web.external-prefix.
This option is analogous to --web.route-prefix
of Prometheus.
--web.external-prefix="" Static prefix for all HTML links and
redirect URLs in the UI query web interface.
Actual endpoints are still served on / or the
@@ -552,11 +392,217 @@ Flags:
stripped prefix value in X-Forwarded-Prefix
header. This allows thanos UI to be served on a
sub-path.
--web.route-prefix="" Prefix for API and UI endpoints. This allows
thanos UI to be served on a sub-path.
Defaults to the value of --web.external-prefix.
This option is analogous to --web.route-prefix
of Prometheus.
--[no-]web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to
be allowed by all.
--query.timeout=2m Maximum time to process query by query node.
--query.promql-engine=prometheus
Default PromQL engine to use.
--[no-]query.enable-x-functions
Whether to enable extended rate functions
(xrate, xincrease and xdelta). Only has effect
when used with Thanos engine.
--query.mode=local PromQL query mode. One of: local, distributed.
--query.max-concurrent=20 Maximum number of queries processed
concurrently by query node.
--query.lookback-delta=QUERY.LOOKBACK-DELTA
The maximum lookback duration for retrieving
metrics during expression evaluations.
PromQL always evaluates the query for the
certain timestamp (query range timestamps are
deduced by step). Since scrape intervals might
be different, PromQL looks back for given
amount of time to get latest sample. If it
exceeds the maximum lookback delta it assumes
series is stale and returns none (a gap).
This is why lookback delta should be set to at
least 2 times of the slowest scrape interval.
If unset it will use the promql default of 5m.
--query.max-concurrent-select=4
Maximum number of select requests made
concurrently per a query.
--query.conn-metric.label=external_labels... ...
Optional selection of query connection metric
labels to be collected from endpoint set
--deduplication.func=penalty
Experimental. Deduplication algorithm for
merging overlapping series. Possible values
are: "penalty", "chain". If no value is
specified, penalty based deduplication
algorithm will be used. When set to chain, the
default compact deduplication merger is used,
which performs 1:1 deduplication for samples.
At least one replica label has to be set via
--query.replica-label flag.
--query.replica-label=QUERY.REPLICA-LABEL ...
Labels to treat as a replica indicator along
which data is deduplicated. Still you will
be able to query without deduplication using
'dedup=false' parameter. Data includes time
series, recording rules, and alerting rules.
Flag may be specified multiple times as well as
a comma separated list of labels.
--query.partition-label=QUERY.PARTITION-LABEL ...
Labels that partition the leaf queriers. This
is used to scope down the labelsets of leaf
queriers when using the distributed query mode.
If set, these labels must form a partition
of the leaf queriers. Partition labels must
not intersect with replica labels. Every TSDB
of a leaf querier must have these labels.
This is useful when there are multiple external
labels that are irrelevant for the partition as
it allows the distributed engine to ignore them
for some optimizations. If this is empty then
all labels are used as partition labels.
--query.metadata.default-time-range=0s
The default metadata time range duration for
retrieving labels through Labels and Series API
when the range parameters are not specified.
The zero value means range covers the time
since the beginning.
--selector-label=<name>="<value>" ...
Query selector labels that will be exposed in
info endpoint (repeated).
--[no-]query.auto-downsampling
Enable automatic adjustment (step / 5) to what
source of data should be used in store gateways
if no max_source_resolution param is specified.
--[no-]query.partial-response
Enable partial response for queries if
no partial_response param is specified.
--no-query.partial-response for disabling.
--query.active-query-path=""
Directory to log currently active queries in
the queries.active file.
--enable-feature= ... Comma separated feature names to enable. Valid
options for now: promql-experimental-functions
(enables promql experimental functions in
query)
--query.default-evaluation-interval=1m
Set default evaluation interval for sub
queries.
--query.default-step=1s Set default step for range queries. Default
step is only used when step is not set in UI.
In such cases, Thanos UI will use default
step to calculate resolution (resolution
= max(rangeSeconds / 250, defaultStep)).
This will not work from Grafana, but Grafana
has __step variable which can be used.
--store.response-timeout=0ms
If a Store doesn't send any data in this
specified duration then a Store will be ignored
and partial data will be returned if it's
enabled. 0 disables timeout.
--selector.relabel-config-file=<file-path>
Path to YAML file with relabeling
configuration that allows selecting blocks
to query based on their external labels.
It follows the Thanos sharding relabel-config
syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--selector.relabel-config=<content>
Alternative to 'selector.relabel-config-file'
flag (mutually exclusive). Content of YAML
file with relabeling configuration that allows
selecting blocks to query based on their
external labels. It follows the Thanos sharding
relabel-config syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--alert.query-url=ALERT.QUERY-URL
The external Thanos Query URL that would be set
in all alerts 'Source' field.
--query.telemetry.request-duration-seconds-quantiles=0.1... ...
The quantiles for exporting metrics about the
request duration quantiles.
--query.telemetry.request-samples-quantiles=100... ...
The quantiles for exporting metrics about the
samples count quantiles.
--query.telemetry.request-series-seconds-quantiles=10... ...
The quantiles for exporting metrics about the
series count quantiles.
--query.tenant-header="THANOS-TENANT"
HTTP header to determine tenant.
--query.default-tenant-id="default-tenant"
Default tenant ID to use if tenant header is
not present
--query.tenant-certificate-field=
Use TLS client's certificate field to determine
tenant for write requests. Must be one of
organization, organizationalUnit or commonName.
This setting will cause the query.tenant-header
flag value to be ignored.
--[no-]query.enforce-tenancy
Enforce tenancy on Query APIs. Responses
are returned only if the label value of the
configured tenant-label-name and the value of
the tenant header matches.
--query.tenant-label-name="tenant_id"
Label name to use when enforcing tenancy (if
--query.enforce-tenancy is enabled).
--store.sd-dns-interval=30s
Interval between DNS resolutions.
--store.unhealthy-timeout=5m
Timeout before an unhealthy store is cleaned
from the store UI page.
--endpoint.sd-config-file=<file-path>
Path to Config File with endpoint definitions
--endpoint.sd-config=<content>
Alternative to 'endpoint.sd-config-file' flag
(mutually exclusive). Content of Config File
with endpoint definitions
--endpoint.sd-config-reload-interval=5m
Interval between endpoint config refreshes
--store.sd-files=<path> ...
(Deprecated) Path to files that contain
addresses of store API servers. The path can be
a glob pattern (repeatable).
--store.sd-interval=5m (Deprecated) Refresh interval to re-read file
SD files. It is used as a resync fallback.
--endpoint=<endpoint> ... (Deprecated): Addresses of statically
configured Thanos API servers (repeatable).
The scheme may be prefixed with 'dns+' or
'dnssrv+' to detect Thanos API servers through
respective DNS lookups.
--endpoint-group=<endpoint-group> ...
(Deprecated, Experimental): DNS name of
statically configured Thanos API server groups
(repeatable). Targets resolved from the DNS
name will be queried in a round-robin, instead
of a fanout manner. This flag should be used
when connecting a Thanos Query to HA groups of
Thanos components.
--endpoint-strict=<endpoint-strict> ...
(Deprecated): Addresses of only statically
configured Thanos API servers that are always
used, even if the health check fails. Useful if
you have a caching layer on top.
--endpoint-group-strict=<endpoint-group-strict> ...
(Deprecated, Experimental): DNS name of
statically configured Thanos API server groups
(repeatable) that are always used, even if the
health check fails.
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--store.limits.request-samples=0
The maximum samples allowed for a single
Series request, The Series call fails if
this limit is exceeded. 0 means no limit.
NOTE: For efficiency the limit is internally
implemented as 'chunks limit' considering each
chunk contains a maximum of 120 samples.
```


@@ -22,10 +22,69 @@ The Ketama algorithm is a consistent hashing scheme which enables stable scaling
If you are using the `hashmod` algorithm and wish to migrate to `ketama`, the simplest and safest way is to set up a new pool of receivers with `ketama` hashrings and start remote-writing to them. Provided you are on the latest Thanos version, the old receivers will flush their TSDBs after the configured retention period and upload blocks to object storage. Once you have verified that this is done, decommission the old receivers.
#### Shuffle sharding
Ketama also supports [shuffle sharding](https://aws.amazon.com/builders-library/workload-isolation-using-shuffle-sharding/). It allows you to provide a single-tenant experience in a multi-tenant system. With shuffle sharding, a tenant gets a subset of all nodes in a hashring. You can configure shuffle sharding for any Ketama hashring like so:
```json
[
{
"endpoints": [
{"address": "node-1:10901", "capnproto_address": "node-1:19391", "az": "foo"},
{"address": "node-2:10901", "capnproto_address": "node-2:19391", "az": "bar"},
{"address": "node-3:10901", "capnproto_address": "node-3:19391", "az": "qux"},
{"address": "node-4:10901", "capnproto_address": "node-4:19391", "az": "foo"},
{"address": "node-5:10901", "capnproto_address": "node-5:19391", "az": "bar"},
{"address": "node-6:10901", "capnproto_address": "node-6:19391", "az": "qux"}
],
"algorithm": "ketama",
"shuffle_sharding_config": {
"shard_size": 2,
"cache_size": 100,
"overrides": [
{
"shard_size": 3,
"tenants": ["prefix-tenant-*"],
"tenant_matcher_type": "glob"
}
]
}
}
]
```
This will enable shuffle sharding with the default shard size of 2 and override it to 3 for every tenant that starts with `prefix-tenant-`.
`cache_size` sets the size of the in-memory LRU cache of computed subrings. It is not possible to cache everything, because an attacker could spam requests with random tenants and those subrings would otherwise stay in memory forever.
With this config, `shard_size/number_of_azs` is chosen from each availability zone for each tenant. So, each tenant will get a unique and consistent set of 3 nodes.
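The arithmetic described above can be sketched as follows. This is only an illustration of the described behavior (shard size divided across availability zones, rounded up), not the actual Thanos implementation:

```go
package main

import "fmt"

// nodesPerAZ sketches how many nodes a tenant's shard takes from each
// availability zone under zone-aware shuffle sharding: roughly
// shard_size / number_of_azs, rounded up.
func nodesPerAZ(shardSize, numAZs int) int {
	return (shardSize + numAZs - 1) / numAZs
}

func main() {
	// With the override above (shard_size=3) and three AZs,
	// each tenant gets one node per AZ, i.e. three nodes in total.
	fmt.Println(nodesPerAZ(3, 3))
}
```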
You can use `zone_awareness_disabled` to disable zone awareness. This is useful when you have many separate AZs and it does not matter which one is chosen. The shards will ignore AZs, but the Ketama algorithm will still prefer spreading load across as many AZs as possible. That is why, with zone awareness disabled, it is recommended to set the shard size to `max(nodes_in_any_az, replication_factor)`.
Receive currently supports only stateless shuffle sharding, so it does not store or check whether there have been any overlaps between shards.
### Hashmod (discouraged)
This algorithm uses a `hashmod` function over all labels to decide which receiver is responsible for a given timeseries. This is the default algorithm for historical reasons. However, its use for new Receive installations is discouraged, since adding new Receiver nodes leads to series churn and memory usage spikes.
### Replication protocols
By default, Receivers replicate data using Protobuf over gRPC. Deserializing protobuf-encoded messages can be resource-intensive and cause significant GC pressure. Alternatively, you can use [Cap'N Proto](https://capnproto.org/) for replication encoding and as the RPC framework.
To enable this mode, use the `--receive.replication-protocol=capnproto` option on the receiver. Thanos will try to infer the Cap'N Proto address of each peer in the hashring from the existing gRPC address. You can also set the Cap'N Proto address explicitly as follows:
```json
[
{
"endpoints": [
{"address": "node-1:10901", "capnproto_address": "node-1:19391"},
{"address": "node-2:10901", "capnproto_address": "node-2:19391"},
{"address": "node-3:10901", "capnproto_address": "node-3:19391"}
]
}
]
```
### Hashring management and autoscaling in Kubernetes
The [Thanos Receive Controller](https://github.com/observatorium/thanos-receive-controller) project aims to automate hashring management when running Thanos in Kubernetes. In combination with the Ketama hashring algorithm, this controller can also be used to keep hashrings up to date when Receivers are scaled automatically using an HPA or [Keda](https://keda.sh/).
@@ -94,6 +153,8 @@ config:
server_name: ""
insecure_skip_verify: false
disable_compression: false
chunk_size_bytes: 0
max_retries: 0
prefix: ""
```
@@ -307,6 +368,26 @@ This number of workers is controlled by `--receive.forward.async-workers=`.
Please see the metric `thanos_receive_forward_delay_seconds` to see if you need to increase the number of forwarding workers.
## Quorum
The following formula is used for calculating quorum:
```go mdox-exec="sed -n '1046,1056p' pkg/receive/handler.go"
// writeQuorum returns minimum number of replicas that has to confirm write success before claiming replication success.
func (h *Handler) writeQuorum() int {
// NOTE(GiedriusS): this is here because otherwise RF=2 doesn't make sense as all writes
// would need to succeed all the time. Another way to think about it is when migrating
// from a Sidecar based setup with 2 Prometheus nodes to a Receiver setup, we want to
// keep the same guarantees.
if h.options.ReplicationFactor == 2 {
return 1
}
return int((h.options.ReplicationFactor / 2) + 1)
}
```
So, if the replication factor is 2 then at least one write must succeed. With RF=3, two writes must succeed, and so on.
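The formula can be checked for a few replication factors with a small standalone sketch that mirrors the excerpt above:

```go
package main

import "fmt"

// writeQuorum mirrors the formula from pkg/receive/handler.go shown above:
// RF=2 is special-cased to 1; otherwise floor(RF/2)+1.
func writeQuorum(replicationFactor int) int {
	if replicationFactor == 2 {
		return 1
	}
	return replicationFactor/2 + 1
}

func main() {
	for _, rf := range []int{1, 2, 3, 5} {
		fmt.Printf("RF=%d quorum=%d\n", rf, writeQuorum(rf))
	}
}
```

This prints quorum 1 for RF=1 and RF=2, quorum 2 for RF=3, and quorum 3 for RF=5, matching the text above.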
## Flags
```$ mdox-exec="thanos receive --help"
@@ -314,38 +395,29 @@ usage: thanos receive [<flags>]
Accept Prometheus remote write API requests and write to local tsdb.
Flags:
-h, --[no-]help Show context-sensitive help (also try
--help-long and --help-man).
--[no-]version Show application version.
--log.level=info Log filtering level.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--[no-]enable-auto-gomemlimit
Enable go runtime to automatically limit memory
consumption.
--auto-gomemlimit.ratio=0.9
The ratio of reserved GOMEMLIMIT memory to the
detected maximum container or system memory.
--enable-auto-gomemlimit Enable go runtime to automatically limit memory
consumption.
--grpc-address="0.0.0.0:10901"
Listen ip:port address for gRPC endpoints
(StoreAPI). Make sure this address is routable
from other components.
--grpc-grace-period=2m Time to wait after an interrupt received for
GRPC Server.
--grpc-server-max-connection-age=60m
The grpc server max connection age. This
controls how often to re-establish connections
and redo TLS handshakes.
--grpc-server-tls-cert="" TLS Certificate for gRPC server, leave blank to
disable TLS
--grpc-server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--grpc-server-tls-key="" TLS Key for the gRPC server, leave blank to
disable TLS
--hash-func= Specify which hash function to use when
calculating the hashes of produced files.
If no function has been specified, it does not
happen. This permits avoiding downloading some
files twice albeit at some performance cost.
Possible values are: "", "SHA256".
-h, --help Show context-sensitive help (also try
--help-long and --help-man).
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for
@@ -353,32 +425,100 @@ Flags:
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--grpc-address="0.0.0.0:10901"
Listen ip:port address for gRPC endpoints
(StoreAPI). Make sure this address is routable
from other components.
--grpc-server-tls-cert="" TLS Certificate for gRPC server, leave blank to
disable TLS
--grpc-server-tls-key="" TLS Key for the gRPC server, leave blank to
disable TLS
--grpc-server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--grpc-server-tls-min-version="1.3"
TLS supported minimum version for gRPC server.
If no version is specified, it'll default to
1.3. Allowed values: ["1.0", "1.1", "1.2",
"1.3"]
--grpc-server-max-connection-age=60m
The grpc server max connection age. This
controls how often to re-establish connections
and redo TLS handshakes.
--grpc-grace-period=2m Time to wait after an interrupt received for
GRPC Server.
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--store.limits.request-samples=0
The maximum samples allowed for a single
                                 Series request. The Series call fails if
this limit is exceeded. 0 means no limit.
NOTE: For efficiency the limit is internally
implemented as 'chunks limit' considering each
chunk contains a maximum of 120 samples.
--remote-write.address="0.0.0.0:19291"
Address to listen on for remote write requests.
--remote-write.server-tls-cert=""
TLS Certificate for HTTP server, leave blank to
disable TLS.
--remote-write.server-tls-key=""
TLS Key for the HTTP server, leave blank to
disable TLS.
--remote-write.server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--remote-write.server-tls-min-version="1.3"
TLS version for the gRPC server, leave blank
to default to TLS 1.3, allow values: ["1.0",
"1.1", "1.2", "1.3"]
--remote-write.client-tls-cert=""
TLS Certificates to use to identify this client
to the server.
--remote-write.client-tls-key=""
TLS Key for the client's certificate.
--[no-]remote-write.client-tls-secure
Use TLS when talking to the other receivers.
--[no-]remote-write.client-tls-skip-verify
Disable TLS certificate verification when
                                 talking to the other receivers, i.e. self-signed,
signed by fake CA.
--remote-write.client-tls-ca=""
TLS CA Certificates to use to verify servers.
--remote-write.client-server-name=""
Server name to verify the hostname
on the returned TLS certificates. See
https://tools.ietf.org/html/rfc4366#section-3.1
--tsdb.path="./data" Data directory of TSDB.
--label=key="value" ... External labels to announce. This flag will be
removed in the future when handling multiple
tsdb instances is added.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--log.level=info Log filtering level.
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config=<content>
Alternative to 'objstore.config-file'
flag (mutually exclusive). Content of
YAML file that contains object store
configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--receive.default-tenant-id="default-tenant"
Default tenant ID to use when none is provided
via a header.
--receive.forward.async-workers=5
Number of concurrent workers processing
forwarding of remote-write requests.
--receive.grpc-compression=snappy
Compression algorithm to use for gRPC requests
to other receivers. Must be one of: snappy,
none
--tsdb.retention=15d How long to retain raw samples on local
storage. 0d - disables the retention
policy (i.e. infinite retention).
For more details on how retention is
enforced for individual tenants, please
refer to the Tenant lifecycle management
section in the Receive documentation:
https://thanos.io/tip/components/receive.md/#tenant-lifecycle-management
--receive.hashrings-file=<path>
Path to file that contains the hashring
configuration. A watcher is initialized
to watch changes and update the hashring
dynamically.
--receive.hashrings=<content>
Alternative to 'receive.hashrings-file' flag
(lower priority). Content of file that contains
@@ -388,11 +528,6 @@ Flags:
the hashrings. Must be one of hashmod, ketama.
Will be overwritten by the tenant-specific
algorithm in the hashring config.
--receive.hashrings-file=<path>
Path to file that contains the hashring
configuration. A watcher is initialized
to watch changes and update the hashring
dynamically.
--receive.hashrings-file-refresh-interval=5m
Refresh interval to re-read the hashring
configuration file. (used as a fallback)
@@ -402,98 +537,95 @@ Flags:
configuration. If it's empty AND hashring
configuration was provided, it means that
receive will run in RoutingOnly mode.
--receive.relabel-config=<content>
Alternative to 'receive.relabel-config-file'
flag (mutually exclusive). Content of YAML file
that contains relabeling configuration.
--receive.relabel-config-file=<file-path>
Path to YAML file that contains relabeling
configuration.
--receive.replica-header="THANOS-REPLICA"
HTTP header specifying the replica number of a
write request.
--receive.replication-factor=1
How many times to replicate incoming write
--receive.tenant-header="THANOS-TENANT"
HTTP header to determine tenant for write
requests.
--receive.split-tenant-label-name=""
Label name through which the request will
be split into multiple tenants. This takes
precedence over the HTTP header.
--receive.tenant-certificate-field=
Use TLS client's certificate field to
determine tenant for write requests.
Must be one of organization, organizationalUnit
or commonName. This setting will cause the
receive.tenant-header flag value to be ignored.
--receive.tenant-header="THANOS-TENANT"
HTTP header to determine tenant for write
requests.
--receive.default-tenant-id="default-tenant"
Default tenant ID to use when none is provided
via a header.
--receive.split-tenant-label-name=""
Label name through which the request will
be split into multiple tenants. This takes
precedence over the HTTP header.
--receive.tenant-label-name="tenant_id"
Label name through which the tenant will be
announced.
--remote-write.address="0.0.0.0:19291"
Address to listen on for remote write requests.
--remote-write.client-server-name=""
Server name to verify the hostname
on the returned TLS certificates. See
https://tools.ietf.org/html/rfc4366#section-3.1
--remote-write.client-tls-ca=""
TLS CA Certificates to use to verify servers.
--remote-write.client-tls-cert=""
TLS Certificates to use to identify this client
to the server.
--remote-write.client-tls-key=""
TLS Key for the client's certificate.
--remote-write.client-tls-secure
Use TLS when talking to the other receivers.
--remote-write.client-tls-skip-verify
Disable TLS certificate verification when
                                 talking to the other receivers, i.e. self-signed,
signed by fake CA.
--remote-write.server-tls-cert=""
TLS Certificate for HTTP server, leave blank to
disable TLS.
--remote-write.server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--remote-write.server-tls-key=""
TLS Key for the HTTP server, leave blank to
disable TLS.
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--store.limits.request-samples=0
The maximum samples allowed for a single
                                 Series request. The Series call fails if
this limit is exceeded. 0 means no limit.
NOTE: For efficiency the limit is internally
implemented as 'chunks limit' considering each
chunk contains a maximum of 120 samples.
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tsdb.allow-overlapping-blocks
--receive.replica-header="THANOS-REPLICA"
HTTP header specifying the replica number of a
write request.
--receive.forward.async-workers=5
Number of concurrent workers processing
forwarding of remote-write requests.
--receive.grpc-compression=snappy
Compression algorithm to use for gRPC requests
to other receivers. Must be one of: snappy,
none
--receive.replication-factor=1
How many times to replicate incoming write
requests.
--receive.replication-protocol=protobuf
The protocol to use for replicating
remote-write requests. One of protobuf,
capnproto
--receive.capnproto-address="0.0.0.0:19391"
Address for the Cap'n Proto server.
--receive.grpc-service-config=<content>
gRPC service configuration file
or content in JSON format. See
https://github.com/grpc/grpc/blob/master/doc/service_config.md
--receive.relabel-config-file=<file-path>
Path to YAML file that contains relabeling
configuration.
--receive.relabel-config=<content>
Alternative to 'receive.relabel-config-file'
flag (mutually exclusive). Content of YAML file
that contains relabeling configuration.
--tsdb.too-far-in-future.time-window=0s
Configures the allowed time window for
ingesting samples too far in the future.
                                 Disabled (0s) by default. Note that enabling
this flag will reject samples in the future of
receive local NTP time + configured duration
due to clock skew in remote write clients.
--tsdb.out-of-order.time-window=0s
[EXPERIMENTAL] Configures the allowed
time window for ingestion of out-of-order
                                 samples. Disabled (0s) by default. Please
note if you enable this option and you
use compactor, make sure you have the
--compact.enable-vertical-compaction flag
enabled, otherwise you might risk compactor
halt.
--tsdb.out-of-order.cap-max=0
[EXPERIMENTAL] Configures the maximum capacity
for out-of-order chunks (in samples). If set to
<=0, default value 32 is assumed.
--[no-]tsdb.allow-overlapping-blocks
Allow overlapping blocks, which in turn enables
vertical compaction and vertical query merge.
Does not do anything, enabled all the time.
--tsdb.max-retention-bytes=0
Maximum number of bytes that can be stored for
blocks. A unit is required, supported units: B,
KB, MB, GB, TB, PB, EB. Ex: "512MB". Based on
powers-of-2, so 1KB is 1024B.
--[no-]tsdb.wal-compression
Compress the tsdb WAL.
--[no-]tsdb.no-lockfile Do not create lockfile in TSDB data directory.
In any case, the lockfiles will be deleted on
next startup.
--tsdb.head.expanded-postings-cache-size=0
[EXPERIMENTAL] If non-zero, enables expanded
postings cache for the head block.
--tsdb.block.expanded-postings-cache-size=0
[EXPERIMENTAL] If non-zero, enables expanded
postings cache for compacted blocks.
--tsdb.max-exemplars=0 Enables support for ingesting exemplars and
sets the maximum number of exemplars that will
be stored per tenant. In case the exemplar
@@ -502,32 +634,41 @@ Flags:
ingesting a new exemplar will evict the oldest
exemplar from storage. 0 (or less) value of
this flag disables exemplars storage.
--tsdb.max-retention-bytes=0
Maximum number of bytes that can be stored for
blocks. A unit is required, supported units: B,
KB, MB, GB, TB, PB, EB. Ex: "512MB". Based on
powers-of-2, so 1KB is 1024B.
--tsdb.no-lockfile Do not create lockfile in TSDB data directory.
In any case, the lockfiles will be deleted on
next startup.
--tsdb.path="./data" Data directory of TSDB.
--tsdb.retention=15d How long to retain raw samples on local
storage. 0d - disables the retention
policy (i.e. infinite retention).
For more details on how retention is
enforced for individual tenants, please
refer to the Tenant lifecycle management
section in the Receive documentation:
https://thanos.io/tip/components/receive.md/#tenant-lifecycle-management
--tsdb.too-far-in-future.time-window=0s
[EXPERIMENTAL] Configures the allowed time
window for ingesting samples too far in the
                                 future. Disabled (0s) by default. Note that
                                 enabling this flag will reject samples in the
future of receive local NTP time + configured
duration due to clock skew in remote write
clients.
--tsdb.wal-compression Compress the tsdb WAL.
--version Show application version.
--[no-]tsdb.enable-native-histograms
[EXPERIMENTAL] Enables the ingestion of native
histograms.
--hash-func= Specify which hash function to use when
calculating the hashes of produced files.
If no function has been specified, it does not
happen. This permits avoiding downloading some
files twice albeit at some performance cost.
Possible values are: "", "SHA256".
--matcher-cache-size=0 Max number of cached matchers items. Using 0
disables caching.
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--[no-]receive.otlp-enable-target-info
Enables target information in OTLP metrics
ingested by Receive. If enabled, it converts
                                 the resource to the target info metric.
--receive.otlp-promote-resource-attributes= ...
(Repeatable) Resource attributes to include in
OTLP metrics ingested by Receive.
--enable-feature= ... Comma separated experimental feature names
to enable. The current list of features is
metric-names-filter.
--receive.lazy-retrieval-max-buffered-responses=20
The lazy retrieval strategy can buffer up to
this number of responses. This is to limit the
memory usage. This flag takes effect only when
the lazy retrieval strategy is enabled.
```
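For illustration only, a minimal hashring configuration file of the kind consumed by `--receive.hashrings-file` could look like the sketch below. The hashring names, tenant IDs, and endpoint addresses are placeholder values, not taken from this document:

```json
[
  {
    "hashring": "tenant-a",
    "tenants": ["tenant-a"],
    "endpoints": ["receive-0.receive:10901", "receive-1.receive:10901"]
  },
  {
    "hashring": "default",
    "endpoints": ["receive-2.receive:10901"]
  }
]
```

A hashring entry without a `tenants` list acts as the catch-all for tenants that no other hashring matches.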


@@ -18,6 +18,7 @@ The data of each Rule node can be labeled to satisfy the clusters labeling schem
thanos rule \
--data-dir "/path/to/data" \
--eval-interval "30s" \
--rule-query-offset "10s" \
--rule-file "/path/to/rules/*.rules.yaml" \
--alert.query-url "http://0.0.0.0:9090" \ # This tells what query URL to link to in UI.
--alertmanagers.url "http://alert.thanos.io" \
@ -64,6 +65,9 @@ name: <string>
# How often rules in the group are evaluated.
[ interval: <duration> | default = global.evaluation_interval ]
# Offset the rule evaluation timestamp of this particular group by the specified duration into the past.
[ query_offset: <duration> | default = --rule-query-offset flag ]
rules:
[ - <rule> ... ]
```
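As an illustrative sketch of the schema above (group name, interval, and rule expression are placeholder values), a group that evaluates every 30s and offsets its queries by 1m could be written as:

```yaml
groups:
  - name: example-group
    interval: 30s
    query_offset: 1m
    rules:
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))
```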
@@ -265,101 +269,29 @@ usage: thanos rule [<flags>]
Ruler evaluating Prometheus rules against given Query nodes, exposing Store API
and storing old blocks in bucket.
Flags:
--alert.label-drop=ALERT.LABEL-DROP ...
Labels by name to drop before sending
to alertmanager. This allows alert to be
deduplicated on replica label (repeated).
Similar Prometheus alert relabelling
--alert.query-template="/graph?g0.expr={{.Expr}}&g0.tab=1"
Template to use in alerts source field.
Need only include {{.Expr}} parameter
--alert.query-url=ALERT.QUERY-URL
The external Thanos Query URL that would be set
in all alerts 'Source' field
--alert.relabel-config=<content>
Alternative to 'alert.relabel-config-file' flag
(mutually exclusive). Content of YAML file that
contains alert relabelling configuration.
--alert.relabel-config-file=<file-path>
Path to YAML file that contains alert
relabelling configuration.
--alertmanagers.config=<content>
Alternative to 'alertmanagers.config-file'
flag (mutually exclusive). Content
of YAML file that contains alerting
-h, --[no-]help Show context-sensitive help (also try
--help-long and --help-man).
--[no-]version Show application version.
--log.level=info Log filtering level.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/components/rule.md/#configuration.
If defined, it takes precedence
over the '--alertmanagers.url' and
'--alertmanagers.send-timeout' flags.
--alertmanagers.config-file=<file-path>
Path to YAML file that contains alerting
configuration. See format details:
https://thanos.io/tip/components/rule.md/#configuration.
If defined, it takes precedence
over the '--alertmanagers.url' and
'--alertmanagers.send-timeout' flags.
--alertmanagers.sd-dns-interval=30s
Interval between DNS resolutions of
Alertmanager hosts.
--alertmanagers.send-timeout=10s
Timeout for sending alerts to Alertmanager
--alertmanagers.url=ALERTMANAGERS.URL ...
Alertmanager replica URLs to push firing
alerts. Ruler claims success if push to
at least one alertmanager from discovered
succeeds. The scheme should not be empty
                                 e.g. `http` might be used. The scheme may be
prefixed with 'dns+' or 'dnssrv+' to detect
Alertmanager IPs through respective DNS
lookups. The port defaults to 9093 or the
SRV record's value. The URL path is used as a
prefix for the regular Alertmanager API path.
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--[no-]enable-auto-gomemlimit
Enable go runtime to automatically limit memory
consumption.
--auto-gomemlimit.ratio=0.9
The ratio of reserved GOMEMLIMIT memory to the
detected maximum container or system memory.
--data-dir="data/" data directory
--enable-auto-gomemlimit Enable go runtime to automatically limit memory
consumption.
--eval-interval=1m The default evaluation interval to use.
--for-grace-period=10m Minimum duration between alert and restored
"for" state. This is maintained only for alerts
with configured "for" time greater than grace
period.
--for-outage-tolerance=1h Max time to tolerate prometheus outage for
restoring "for" state of alert.
--grpc-address="0.0.0.0:10901"
Listen ip:port address for gRPC endpoints
(StoreAPI). Make sure this address is routable
from other components.
--grpc-grace-period=2m Time to wait after an interrupt received for
GRPC Server.
--grpc-query-endpoint=<endpoint> ...
Addresses of Thanos gRPC query API servers
(repeatable). The scheme may be prefixed
with 'dns+' or 'dnssrv+' to detect Thanos API
servers through respective DNS lookups.
--grpc-server-max-connection-age=60m
The grpc server max connection age. This
controls how often to re-establish connections
and redo TLS handshakes.
--grpc-server-tls-cert="" TLS Certificate for gRPC server, leave blank to
disable TLS
--grpc-server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--grpc-server-tls-key="" TLS Key for the gRPC server, leave blank to
disable TLS
--hash-func= Specify which hash function to use when
calculating the hashes of produced files.
If no function has been specified, it does not
happen. This permits avoiding downloading some
files twice albeit at some performance cost.
Possible values are: "", "SHA256".
-h, --help Show context-sensitive help (also try
--help-long and --help-man).
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for
@@ -367,140 +299,33 @@ Flags:
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--label=<name>="<value>" ...
Labels to be applied to all generated metrics
(repeated). Similar to external labels for
Prometheus, used to identify ruler and its
blocks as unique source.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--log.level=info Log filtering level.
--objstore.config=<content>
Alternative to 'objstore.config-file'
flag (mutually exclusive). Content of
YAML file that contains object store
configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--query=<query> ... Addresses of statically configured query
API servers (repeatable). The scheme may be
prefixed with 'dns+' or 'dnssrv+' to detect
query API servers through respective DNS
lookups.
--query.config=<content> Alternative to 'query.config-file' flag
(mutually exclusive). Content of YAML
file that contains query API servers
configuration. See format details:
https://thanos.io/tip/components/rule.md/#configuration.
If defined, it takes precedence over the
'--query' and '--query.sd-files' flags.
--query.config-file=<file-path>
Path to YAML file that contains query API
servers configuration. See format details:
https://thanos.io/tip/components/rule.md/#configuration.
If defined, it takes precedence over the
'--query' and '--query.sd-files' flags.
--query.default-step=1s Default range query step to use. This is
only used in stateless Ruler and alert state
restoration.
--query.enable-x-functions
Whether to enable extended rate functions
(xrate, xincrease and xdelta). Only has effect
when used with Thanos engine.
--query.http-method=POST HTTP method to use when sending queries.
Possible options: [GET, POST]
--query.sd-dns-interval=30s
Interval between DNS resolutions.
--query.sd-files=<path> ...
Path to file that contains addresses of query
API servers. The path can be a glob pattern
(repeatable).
--query.sd-interval=5m Refresh interval to re-read file SD files.
(used as a fallback)
--remote-write.config=<content>
Alternative to 'remote-write.config-file'
flag (mutually exclusive). Content
of YAML config for the remote-write
                                 configurations that specify servers
where samples should be sent to (see
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write).
This automatically enables stateless mode
for ruler and no series will be stored in the
ruler's TSDB. If an empty config (or file) is
provided, the flag is ignored and ruler is run
with its own TSDB.
--remote-write.config-file=<file-path>
Path to YAML config for the remote-write
                                 configurations that specify servers
where samples should be sent to (see
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write).
This automatically enables stateless mode
for ruler and no series will be stored in the
ruler's TSDB. If an empty config (or file) is
provided, the flag is ignored and ruler is run
with its own TSDB.
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--resend-delay=1m Minimum amount of time to wait before resending
an alert to Alertmanager.
--restore-ignored-label=RESTORE-IGNORED-LABEL ...
Label names to be ignored when restoring alerts
from the remote storage. This is only used in
stateless mode.
--rule-file=rules/ ... Rule files that should be used by rule
manager. Can be in glob format (repeated).
Note that rules are not automatically detected,
use SIGHUP or do HTTP POST /-/reload to re-read
them.
--shipper.meta-file-name="thanos.shipper.json"
the file to store shipper metadata in
--shipper.upload-compacted
If true shipper will try to upload compacted
blocks as well. Useful for migration purposes.
Works only if compaction is disabled on
Prometheus. Do it once and then disable the
flag when done.
--store.limits.request-samples=0
The maximum samples allowed for a single
                                 Series request. The Series call fails if
this limit is exceeded. 0 means no limit.
NOTE: For efficiency the limit is internally
implemented as 'chunks limit' considering each
chunk contains a maximum of 120 samples.
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tsdb.block-duration=2h Block duration for TSDB block.
--tsdb.no-lockfile Do not create lockfile in TSDB data directory.
In any case, the lockfiles will be deleted on
next startup.
--tsdb.retention=48h Block retention time on local disk.
--tsdb.wal-compression Compress the tsdb WAL.
--version Show application version.
--web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to
be allowed by all.
--grpc-address="0.0.0.0:10901"
Listen ip:port address for gRPC endpoints
(StoreAPI). Make sure this address is routable
from other components.
--grpc-server-tls-cert="" TLS Certificate for gRPC server, leave blank to
disable TLS
--grpc-server-tls-key="" TLS Key for the gRPC server, leave blank to
disable TLS
--grpc-server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--grpc-server-tls-min-version="1.3"
TLS supported minimum version for gRPC server.
If no version is specified, it'll default to
1.3. Allowed values: ["1.0", "1.1", "1.2",
"1.3"]
--grpc-server-max-connection-age=60m
The grpc server max connection age. This
controls how often to re-establish connections
and redo TLS handshakes.
--grpc-grace-period=2m Time to wait after an interrupt received for
GRPC Server.
--web.route-prefix="" Prefix for API and UI endpoints. This allows
thanos UI to be served on a sub-path. This
option is analogous to --web.route-prefix of
Prometheus.
--web.external-prefix="" Static prefix for all HTML links and redirect
URLs in the bucket web UI interface.
Actual endpoints are still served on / or the
@@ -520,10 +345,209 @@ Flags:
stripped prefix value in X-Forwarded-Prefix
header. This allows thanos UI to be served on a
sub-path.
--web.route-prefix="" Prefix for API and UI endpoints. This allows
thanos UI to be served on a sub-path. This
option is analogous to --web.route-prefix of
Prometheus.
--[no-]web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to
be allowed by all.
--[no-]shipper.upload-compacted
If true shipper will try to upload compacted
blocks as well. Useful for migration purposes.
Works only if compaction is disabled on
Prometheus. Do it once and then disable the
flag when done.
--hash-func= Specify which hash function to use when
calculating the hashes of produced files.
If no function has been specified, it does not
happen. This permits avoiding downloading some
files twice albeit at some performance cost.
Possible values are: "", "SHA256".
--shipper.meta-file-name="thanos.shipper.json"
the file to store shipper metadata in
--query=<query> ... Addresses of statically configured query
API servers (repeatable). The scheme may be
prefixed with 'dns+' or 'dnssrv+' to detect
query API servers through respective DNS
lookups.
--query.config-file=<file-path>
Path to YAML file that contains query API
servers configuration. See format details:
https://thanos.io/tip/components/rule.md/#configuration.
If defined, it takes precedence over the
'--query' and '--query.sd-files' flags.
--query.config=<content> Alternative to 'query.config-file' flag
(mutually exclusive). Content of YAML
file that contains query API servers
configuration. See format details:
https://thanos.io/tip/components/rule.md/#configuration.
If defined, it takes precedence over the
'--query' and '--query.sd-files' flags.
--query.sd-files=<path> ...
Path to file that contains addresses of query
API servers. The path can be a glob pattern
(repeatable).
--query.sd-interval=5m Refresh interval to re-read file SD files.
(used as a fallback)
--query.sd-dns-interval=30s
Interval between DNS resolutions.
--query.http-method=POST HTTP method to use when sending queries.
Possible options: [GET, POST]
--query.default-step=1s Default range query step to use. This is
only used in stateless Ruler and alert state
restoration.
--alertmanagers.config-file=<file-path>
Path to YAML file that contains alerting
configuration. See format details:
https://thanos.io/tip/components/rule.md/#configuration.
If defined, it takes precedence
over the '--alertmanagers.url' and
'--alertmanagers.send-timeout' flags.
--alertmanagers.config=<content>
Alternative to 'alertmanagers.config-file'
flag (mutually exclusive). Content
of YAML file that contains alerting
configuration. See format details:
https://thanos.io/tip/components/rule.md/#configuration.
If defined, it takes precedence
over the '--alertmanagers.url' and
'--alertmanagers.send-timeout' flags.
--alertmanagers.url=ALERTMANAGERS.URL ...
Alertmanager replica URLs to push firing
alerts. Ruler claims success if push to
at least one alertmanager from discovered
succeeds. The scheme should not be empty
                                 e.g. `http` might be used. The scheme may be
prefixed with 'dns+' or 'dnssrv+' to detect
Alertmanager IPs through respective DNS
lookups. The port defaults to 9093 or the
SRV record's value. The URL path is used as a
prefix for the regular Alertmanager API path.
--alertmanagers.send-timeout=10s
Timeout for sending alerts to Alertmanager
--alertmanagers.sd-dns-interval=30s
Interval between DNS resolutions of
Alertmanager hosts.
--alert.query-url=ALERT.QUERY-URL
The external Thanos Query URL that would be set
in all alerts 'Source' field
--alert.label-drop=ALERT.LABEL-DROP ...
Labels by name to drop before sending
to alertmanager. This allows alert to be
deduplicated on replica label (repeated).
Similar Prometheus alert relabelling
--alert.relabel-config-file=<file-path>
Path to YAML file that contains alert
relabelling configuration.
--alert.relabel-config=<content>
Alternative to 'alert.relabel-config-file' flag
(mutually exclusive). Content of YAML file that
contains alert relabelling configuration.
--alert.query-template="/graph?g0.expr={{.Expr}}&g0.tab=1"
Template to use in alerts source field.
Need only include {{.Expr}} parameter
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--store.limits.request-samples=0
The maximum samples allowed for a single
                                 Series request. The Series call fails if
this limit is exceeded. 0 means no limit.
NOTE: For efficiency the limit is internally
implemented as 'chunks limit' considering each
chunk contains a maximum of 120 samples.
--label=<name>="<value>" ...
Labels to be applied to all generated metrics
(repeated). Similar to external labels for
Prometheus, used to identify ruler and its
blocks as unique source.
--tsdb.block-duration=2h Block duration for TSDB block.
--tsdb.retention=48h Block retention time on local disk.
--[no-]tsdb.no-lockfile Do not create lockfile in TSDB data directory.
In any case, the lockfiles will be deleted on
next startup.
--[no-]tsdb.wal-compression
Compress the tsdb WAL.
--data-dir="data/" data directory
--rule-file=rules/ ... Rule files that should be used by rule
manager. Can be in glob format (repeated).
Note that rules are not automatically detected,
use SIGHUP or do HTTP POST /-/reload to re-read
them.
--resend-delay=1m Minimum amount of time to wait before resending
an alert to Alertmanager.
--eval-interval=1m The default evaluation interval to use.
--rule-query-offset=0s The default rule group query_offset duration to
use.
--for-outage-tolerance=1h Max time to tolerate prometheus outage for
restoring "for" state of alert.
--for-grace-period=10m Minimum duration between alert and restored
"for" state. This is maintained only for alerts
with configured "for" time greater than grace
period.
--restore-ignored-label=RESTORE-IGNORED-LABEL ...
Label names to be ignored when restoring alerts
from the remote storage. This is only used in
stateless mode.
--rule-concurrent-evaluation=1
How many rules can be evaluated concurrently.
Default is 1.
--grpc-query-endpoint=<endpoint> ...
Addresses of Thanos gRPC query API servers
(repeatable). The scheme may be prefixed
with 'dns+' or 'dnssrv+' to detect Thanos API
servers through respective DNS lookups.
--[no-]query.enable-x-functions
Whether to enable extended rate functions
(xrate, xincrease and xdelta). Only has effect
when used with Thanos engine.
--enable-feature= ... Comma separated feature names to enable. Valid
options for now: promql-experimental-functions
(enables promql experimental functions for
ruler)
--[no-]tsdb.enable-native-histograms
[EXPERIMENTAL] Enables the ingestion of native
histograms.
--remote-write.config-file=<file-path>
Path to YAML config for the remote-write
configurations, that specify servers
where samples should be sent to (see
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write).
This automatically enables stateless mode
for ruler and no series will be stored in the
ruler's TSDB. If an empty config (or file) is
provided, the flag is ignored and ruler is run
with its own TSDB.
--remote-write.config=<content>
Alternative to 'remote-write.config-file'
flag (mutually exclusive). Content
of YAML config for the remote-write
configurations, that specify servers
where samples should be sent to (see
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write).
This automatically enables stateless mode
for ruler and no series will be stored in the
ruler's TSDB. If an empty config (or file) is
provided, the flag is ignored and ruler is run
with its own TSDB.
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config=<content>
Alternative to 'objstore.config-file'
flag (mutually exclusive). Content of
YAML file that contains object store
configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
```


@@ -74,6 +74,8 @@ config:
server_name: ""
insecure_skip_verify: false
disable_compression: false
chunk_size_bytes: 0
max_retries: 0
prefix: ""
```
@@ -93,38 +95,29 @@ usage: thanos sidecar [<flags>]
Sidecar for Prometheus server.
Flags:
-h, --[no-]help Show context-sensitive help (also try
--help-long and --help-man).
--[no-]version Show application version.
--log.level=info Log filtering level.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--[no-]enable-auto-gomemlimit
Enable go runtime to automatically limit memory
consumption.
--auto-gomemlimit.ratio=0.9
The ratio of reserved GOMEMLIMIT memory to the
detected maximum container or system memory.
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for
HTTP Server.
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--grpc-address="0.0.0.0:10901"
Listen ip:port address for gRPC endpoints
(StoreAPI). Make sure this address is routable
from other components.
--grpc-server-tls-cert="" TLS Certificate for gRPC server, leave blank to
disable TLS
--grpc-server-tls-key="" TLS Key for the gRPC server, leave blank to
disable TLS
--grpc-server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--grpc-server-tls-min-version="1.3"
TLS supported minimum version for gRPC server.
If no version is specified, it'll default to
1.3. Allowed values: ["1.0", "1.1", "1.2",
"1.3"]
--grpc-server-max-connection-age=60m
The grpc server max connection age. This
controls how often to re-establish connections
and redo TLS handshakes.
--grpc-grace-period=2m Time to wait after an interrupt received for
GRPC Server.
--prometheus.url=http://localhost:9090
URL at which to reach Prometheus's API.
For better performance use local network.
--prometheus.ready_timeout=10m
Maximum time to wait for the Prometheus
instance to start up
--prometheus.get_config_interval=30s
How often to get Prometheus config
--prometheus.get_config_timeout=30s
Timeout for getting Prometheus config
--prometheus.http-client=<content>
Alternative to 'prometheus.http-client-file'
flag (mutually exclusive). Content
of YAML file or string with http
client configs. See Format details:
https://thanos.io/tip/components/sidecar.md/#configuration.
--prometheus.http-client-file=<file-path>
Path to YAML file or string with http
client configs. See Format details:
https://thanos.io/tip/components/sidecar.md/#configuration.
--tsdb.path="./data" Data directory of TSDB.
--reloader.config-file="" Config file watched by the reloader.
--reloader.config-envsubst-file=""
Output file for environment variable
substituted config file.
--reloader.rule-dir=RELOADER.RULE-DIR ...
Rule directories for the reloader to refresh
(repeated field).
--reloader.watch-interval=3m
Controls how often reloader re-reads config and
rules.
--reloader.retry-interval=5s
Controls how often reloader retries config
reload in case of error.
--reloader.method=http Method used to reload the configuration.
--reloader.process-name="prometheus"
Executable name used to match the process being
reloaded when using the signal method.
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config=<content>
Alternative to 'objstore.config-file'
flag (mutually exclusive). Content of
YAML file that contains object store
configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--[no-]shipper.upload-compacted
If true shipper will try to upload compacted
blocks as well. Useful for migration purposes.
Works only if compaction is disabled on
Prometheus. Do it once and then disable the
flag when done.
--hash-func= Specify which hash function to use when
calculating the hashes of produced files.
If no function has been specified, it does not
happen. This permits avoiding downloading some
files twice albeit at some performance cost.
Possible values are: "", "SHA256".
--shipper.meta-file-name="thanos.shipper.json"
the file to store shipper metadata in
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--store.limits.request-samples=0
The maximum samples allowed for a single
Series request. The Series call fails if
this limit is exceeded. 0 means no limit.
NOTE: For efficiency the limit is internally
implemented as 'chunks limit' considering each
chunk contains a maximum of 120 samples.
--min-time=0000-01-01T00:00:00Z
Start of time range limit to serve. Thanos
sidecar will serve only metrics, which happened
later than this value. Option can be a constant
time in RFC3339 format or time duration
relative to current time, such as -1d or 2h45m.
Valid duration units are ms, s, m, h, d, w, y.
```


@@ -33,6 +33,8 @@ config:
server_name: ""
insecure_skip_verify: false
disable_compression: false
chunk_size_bytes: 0
max_retries: 0
prefix: ""
```
@@ -46,10 +48,124 @@ usage: thanos store [<flags>]
Store node giving access to blocks in a bucket provider. Now supported GCS, S3,
Azure, Swift, Tencent COS and Aliyun OSS.
Flags:
-h, --[no-]help Show context-sensitive help (also try
--help-long and --help-man).
--[no-]version Show application version.
--log.level=info Log filtering level.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--[no-]enable-auto-gomemlimit
Enable go runtime to automatically limit memory
consumption.
--auto-gomemlimit.ratio=0.9
The ratio of reserved GOMEMLIMIT memory to the
detected maximum container or system memory.
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for
HTTP Server.
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--grpc-address="0.0.0.0:10901"
Listen ip:port address for gRPC endpoints
(StoreAPI). Make sure this address is routable
from other components.
--grpc-server-tls-cert="" TLS Certificate for gRPC server, leave blank to
disable TLS
--grpc-server-tls-key="" TLS Key for the gRPC server, leave blank to
disable TLS
--grpc-server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--grpc-server-tls-min-version="1.3"
TLS supported minimum version for gRPC server.
If no version is specified, it'll default to
1.3. Allowed values: ["1.0", "1.1", "1.2",
"1.3"]
--grpc-server-max-connection-age=60m
The grpc server max connection age. This
controls how often to re-establish connections
and redo TLS handshakes.
--grpc-grace-period=2m Time to wait after an interrupt received for
GRPC Server.
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--store.limits.request-samples=0
The maximum samples allowed for a single
Series request. The Series call fails if
this limit is exceeded. 0 means no limit.
NOTE: For efficiency the limit is internally
implemented as 'chunks limit' considering each
chunk contains a maximum of 120 samples.
--data-dir="./data" Local data directory used for caching
purposes (index-header, in-mem cache items and
meta.jsons). If removed, no data will be lost,
just store will have to rebuild the cache.
NOTE: Putting raw blocks here will not
cause the store to read them. For such use
cases use Prometheus + sidecar. Ignored if
--no-cache-index-header option is specified.
--[no-]cache-index-header Cache TSDB index-headers on disk to reduce
startup time. When set to true, Thanos Store
will download index headers from remote object
storage on startup and create a header file on
disk. Use --data-dir to set the directory in
which index headers will be downloaded.
--index-cache-size=250MB Maximum size of items held in the in-memory
index cache. Ignored if --index-cache.config or
--index-cache.config-file option is specified.
--index-cache.config-file=<file-path>
Path to YAML file that contains index
cache configuration. See format details:
https://thanos.io/tip/components/store.md/#index-cache
--index-cache.config=<content>
Alternative to 'index-cache.config-file'
flag (mutually exclusive). Content of
YAML file that contains index cache
configuration. See format details:
https://thanos.io/tip/components/store.md/#index-cache
--chunk-pool-size=2GB Maximum size of concurrently allocatable
bytes reserved strictly to reuse for chunks in
memory.
--store.grpc.touched-series-limit=0
DEPRECATED: use store.limits.request-series.
--store.grpc.series-sample-limit=0
DEPRECATED: use store.limits.request-samples.
--store.grpc.downloaded-bytes-limit=0
Maximum amount of downloaded (either
fetched or touched) bytes in a single
Series/LabelNames/LabelValues call. The Series
call fails if this limit is exceeded. 0 means
no limit.
--store.grpc.series-max-concurrency=20
Maximum number of concurrent Series calls.
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config=<content>
Alternative to 'objstore.config-file'
flag (mutually exclusive). Content of
YAML file that contains object store
configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--sync-block-duration=15m Repeat interval for syncing the blocks between
local and remote view.
--block-discovery-strategy="concurrent"
One of concurrent, recursive. When set to
concurrent, stores will concurrently issue
one call per directory to discover active
blocks in the bucket. The recursive strategy
iterates through all objects in the bucket,
recursively traversing into each directory.
This avoids N+1 calls at the expense of having
slower bucket iterations.
--block-meta-fetch-concurrency=32
Number of goroutines to use when fetching block
metadata from object storage.
--block-sync-concurrency=20
Number of goroutines to use when constructing
index-cache.json blocks from object storage.
Must be equal or greater than 1.
--min-time=0000-01-01T00:00:00Z
Start of time range limit to serve. Thanos
Store will serve only metrics, which happened
later than this value. Option can be a constant
time in RFC3339 format or time duration
relative to current time, such as -1d or 2h45m.
Valid duration units are ms, s, m, h, d, w, y.
--max-time=9999-12-31T23:59:59Z
End of time range limit to serve. Thanos Store
will serve only blocks, which happened earlier
than this value. Option can be a constant time
in RFC3339 format or time duration relative
to current time, such as -1d or 2h45m. Valid
duration units are ms, s, m, h, d, w, y.
--selector.relabel-config-file=<file-path>
Path to YAML file with relabeling
configuration that allows selecting blocks
to act on based on their external labels.
It follows thanos sharding relabel-config
syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--selector.relabel-config=<content>
Alternative to 'selector.relabel-config-file'
flag (mutually exclusive). Content of YAML
file with relabeling configuration that allows
selecting blocks to act on based on their
external labels. It follows thanos sharding
relabel-config syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--consistency-delay=0s Minimum age of all blocks before they are
being read. Set it to safe value (e.g 30m) if
your object storage is eventually consistent.
GCS and S3 are (roughly) strongly consistent.
--ignore-deletion-marks-delay=24h
Duration after which the blocks marked for
deletion will be filtered out while fetching
@@ -140,125 +236,34 @@ Flags:
blocks before being deleted from bucket.
Default is 24h, half of the default value for
--delete-delay on compactor.
--[no-]store.enable-index-header-lazy-reader
If true, Store Gateway will lazy memory map
index-header only once the block is required by
a query.
--[no-]store.enable-lazy-expanded-postings
If true, Store Gateway will estimate postings
size and try to lazily expand postings if
it downloads less data than expanding all
postings.
--store.posting-group-max-key-series-ratio=100
Mark posting group as lazy if it fetches more
keys than R * max series the query should
fetch. With R set to 100, a posting group which
fetches 100K keys will be marked as lazy if
the current query only fetches 1000 series.
thanos_bucket_store_lazy_expanded_posting_groups_total
shows lazy expanded postings groups with
reasons and you can tune this config
accordingly. This config is only valid if lazy
expanded posting is enabled. 0 disables the
limit.
--store.index-header-lazy-download-strategy=eager
Strategy of how to download index headers
lazily. Supported values: eager, lazy.
If eager, always download index header during
initial load. If lazy, download index header
during query time.
--[no-]web.disable Disable Block Viewer UI.
--web.external-prefix="" Static prefix for all HTML links and redirect
URLs in the bucket web UI interface.
Actual endpoints are still served on / or the
@@ -278,6 +283,27 @@ Flags:
stripped prefix value in X-Forwarded-Prefix
header. This allows thanos UI to be served on a
sub-path.
--[no-]web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to
be allowed by all.
--bucket-web-label=BUCKET-WEB-LABEL
External block label to use as group title in
the bucket web UI
--matcher-cache-size=0 Max number of cached matchers items. Using 0
disables caching.
--[no-]disable-admin-operations
Disable UI/API admin operations like marking
blocks for deletion and no compaction.
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
```
@@ -373,6 +399,12 @@ The **required** settings are:
- `addresses`: list of memcached addresses, that will get resolved with the [DNS service discovery](../service-discovery.md#dns-service-discovery) provider. If your cluster supports auto-discovery, you should use the flag `auto_discovery` instead and only point to *one of* the memcached servers. This typically means that there should be only one address specified that resolves to any of the alive memcached servers. Use this for Amazon ElastiCache and other similar services.
**NOTE**: The Memcached client uses a jump hash algorithm to shard cached entries across a cluster of Memcached servers. For this reason, you should make sure memcached servers are not behind any kind of load balancer and their address is configured so that servers are added/removed to the end of the list whenever a scale up/down occurs. For example, if you're running Memcached in Kubernetes, you may:
1. Deploy your Memcached cluster using a [StatefulSet](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/)
2. Create a [headless](https://kubernetes.io/docs/concepts/services-networking/service/#headless-services) service for Memcached StatefulSet
3. Configure Thanos's memcached `addresses` using the `dnssrvnoa+` [DNS service discovery](../service-discovery.md#dns-service-discovery)
While the remaining settings are **optional**:
- `timeout`: the socket read/write timeout.
@@ -380,7 +412,7 @@ While the remaining settings are **optional**:
- `max_async_concurrency`: maximum number of concurrent asynchronous operations can occur.
- `max_async_buffer_size`: maximum number of enqueued asynchronous operations allowed.
- `max_get_multi_concurrency`: maximum number of concurrent connections when fetching keys. If set to `0`, the concurrency is unlimited.
- `max_get_multi_batch_size`: maximum number of keys a single underlying operation should fetch. If more keys are specified, internally keys are split into multiple batches and fetched concurrently, honoring `max_get_multi_concurrency`. If set to `0`, the batch size is unlimited.
- `max_item_size`: maximum size of an item to be stored in memcached. This option should be set to the same value of memcached `-I` flag (defaults to 1MB) in order to avoid wasting network round trips to store items larger than the max item size allowed in memcached. If set to `0`, the item size is unlimited.
- `dns_provider_update_interval`: the DNS discovery update interval.
- `auto_discovery`: whether to use the auto-discovery mechanism for memcached.
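Putting the required and optional settings together, a memcached-backed cache configuration might look like the sketch below. The address and all tuning values are illustrative, not recommendations:

```yaml
type: MEMCACHED
config:
  # Required: resolved via DNS service discovery, per the note above.
  addresses: ["dnssrvnoa+_memcached._tcp.memcached.monitoring.svc.cluster.local:11211"]
  # Optional tuning knobs described above.
  timeout: 500ms
  max_async_concurrency: 20
  max_async_buffer_size: 10000
  max_get_multi_concurrency: 100
  max_get_multi_batch_size: 0
  max_item_size: 1MiB
  dns_provider_update_interval: 10s
  auto_discovery: false
```

Remember to keep `max_item_size` in sync with the memcached `-I` flag, as noted above.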
@@ -454,10 +486,6 @@ Here is an example of what effect client-side caching could have:
<img src="../img/rueidis-client-side.png" class="img-fluid" alt="Example of client-side in action - reduced network usage by a lot"/>
- `pool_size`: maximum number of socket connections.
- `min_idle_conns`: specifies the minimum number of idle connections which is useful when establishing new connection is slow.
- `idle_timeout`: amount of time after which client closes idle connections. Should be less than server's timeout.
- `max_conn_age`: connection age at which client retires (closes) the connection.
- `max_get_multi_concurrency`: specifies the maximum number of concurrent GetMulti() operations.
- `get_multi_batch_size`: specifies the maximum size per batch for mget.
- `max_set_multi_concurrency`: specifies the maximum number of concurrent SetMulti() operations.
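As a sketch, these client options map onto a Redis cache configuration as follows; the `addr` value and all numbers are illustrative:

```yaml
type: REDIS
config:
  addr: "redis.monitoring.svc.cluster.local:6379"
  pool_size: 100
  min_idle_conns: 10
  idle_timeout: 5m
  max_conn_age: 0s
  max_get_multi_concurrency: 100
  get_multi_batch_size: 100
  max_set_multi_concurrency: 100
```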
@@ -564,6 +592,33 @@ Note that there must be no trailing slash in the `peers` configuration i.e. one
If the timeout is set to zero then there is no timeout for fetching, and the fetch's lifetime is equal to that of the original request. It is recommended to keep it higher than zero; a generous value is preferred because the fetching operation potentially includes loading data from remote object storage.
## Hedged Requests
Thanos Store Gateway supports `hedged requests` to enhance performance and reliability, particularly in high-latency environments. This feature addresses `long-tail latency issues` that can occur between the Thanos Store Gateway and an external cache, reducing the impact of slower response times on overall performance.
The configuration options for hedged requests allow for tuning based on latency tolerance and cost considerations, as some providers may charge per request.
In the `bucket.yml` file, you can specify the following fields under `hedging_config`:
- `enabled`: bool to enable hedged requests.
- `up_to`: maximum number of hedged requests allowed for each initial request.
- **Purpose**: controls the redundancy level of hedged requests to improve response times.
- **Cost vs. Benefit**: increasing `up_to` can reduce latency but may increase costs, as some providers charge per request. Higher values yield diminishing returns on latency beyond a certain point.
- `quantile`: latency threshold, specified as a quantile (e.g. 0.9 = the 90th percentile), which determines when additional hedged requests should be sent.
- **Purpose**: controls when hedged requests are triggered based on response time distribution.
- **Cost vs. Benefit**: lower quantile (e.g., 0.7) initiates hedged requests sooner, potentially raising costs while lowering latency variance. A higher quantile (e.g., 0.95) will initiate hedged requests later, reducing cost by limiting redundancy.
By default, `hedging_config` is set as follows:
```yaml
hedging_config:
enabled: false
up_to: 3
quantile: 0.9
```
With `enabled: true`, this configuration sends up to three additional requests whenever the initial request's response time exceeds the 90th-percentile latency.
## Index Header
In order to query series inside blocks from object storage, the Store Gateway has to know certain initial info from each block's index. To achieve this, on startup the Gateway builds an `index-header` for each block and stores it on local disk; the `index-header` is built by downloading specific pieces of the original block's index, storing them on local disk, and then memory-mapping (mmap) them for use by the Store Gateway.


@ -38,7 +38,7 @@ There are certain rules you can follow to make the MOST from your time with us!
- Try to be independent and responsible for the feature you want to deliver. The sooner you start to lead your task, the better for you! It's hard in the beginning but try to think about the user experience. Is it hard or easy to make mistake using it? How difficult is it to migrate to this feature? Is there anything we can do to reduce data loss errors?
- Try to help others by **reviewing** other contributors, mentees or mentors' Pull Requests! It sounds scary, but this is actually the best way to learn about coding practices, patterns and how to maintain high quality codebase! (GIFs on PRs are welcome as well!)
- Try using an [iterative process for development](https://en.wikipedia.org/wiki/Iterative_and_incremental_development). Start with small and simple assumptions, and once you have a working example ready, keep improving and discussing with the mentors. Small changes are easy to review and easy to accept 😄.
- Try working out a [proof of concept](https://en.wikipedia.org/wiki/Proof_of_concept), which can be used as a baseline, and can be improved upon. These are real-world projects, so it's not possible to have a deterministic solution everytime, and proof of concepts are quick way to determine feasibility.
- Try working out a [proof of concept](https://en.wikipedia.org/wiki/Proof_of_concept), which can be used as a baseline, and can be improved upon. These are real-world projects, so it's not possible to have a deterministic solution every time, and proof of concepts are a quick way to determine feasibility.
> At the end of mentorship, it's not the end! You are welcome to join our Community Office Hours. See [this](https://docs.google.com/document/d/137XnxfOT2p1NcNUq6NWZjwmtlSdA6Wyti86Pd6cyQhs/edit#) for details. This is the meeting for any Thanos contributor, but you will find fellow current and ex-mentees on the meeting too.


@ -89,6 +89,7 @@ See up to date [jsonnet mixins](https://github.com/thanos-io/thanos/tree/main/mi
## Talks
* 2024
* [Enlightning - Scaling Your Metrics with Thanos](https://www.youtube.com/live/1qvcVJiVx7M)
* [6 Learnings from Building Thanos Project](https://www.youtube.com/watch?v=ur8dDFaNEFg)
* [Monitoring the World: Scaling Thanos in Dynamic Prometheus Environments](https://www.youtube.com/watch?v=ofhvbG0iTjU)
* [Scaling Thanos at Reddit](https://www.youtube.com/watch?v=c18RGbAxCfI)
@ -137,6 +138,7 @@ See up to date [jsonnet mixins](https://github.com/thanos-io/thanos/tree/main/mi
## Blog posts
* 2024:
* [Scaling Prometheus with Thanos.](https://www.cloudraft.io/blog/scaling-prometheus-with-thanos)
* [Streamlining Long-Term Storage Query Performance for Metrics With Thanos.](https://blog.devops.dev/streamlining-long-term-storage-query-performance-for-metrics-with-thanos-b44419c70cc4)
* 2023:
@ -160,7 +162,7 @@ See up to date [jsonnet mixins](https://github.com/thanos-io/thanos/tree/main/mi
* [HelloFresh blog posts part 1](https://engineering.hellofresh.com/monitoring-at-hellofresh-part-1-architecture-677b4bd6b728)
* [HelloFresh blog posts part 2](https://engineering.hellofresh.com/monitoring-at-hellofresh-part-2-operating-the-monitoring-system-8175cd939c1d)
* [Thanos deployment](https://www.metricfire.com/blog/ha-kubernetes-monitoring-using-prometheus-and-thanos)
* [Taboola user story](https://blog.taboola.com/monitoring-and-metering-scale/)
* [Taboola user story](https://www.taboola.com/engineering/monitoring-and-metering-scale/)
* [Thanos via Prometheus Operator](https://kkc.github.io/2019/02/10/prometheus-operator-with-thanos/)
* 2018:


@ -164,7 +164,7 @@ metadata:
### Forward proxy Envoy configuration `envoy.yaml`
This is a static v2 envoy configuration (v3 example below). You will need to update this configuration for every sidecar you would like to talk to. There are also several options for dynamic configuration, like envoy XDS (and other associated dynamic config modes), or using something like terraform (if thats your deployment method) to generate the configs at deployment time. NOTE: This config **does not** send a client certificate to authenticate with remote clusters, see envoy v3 config.
This is a static v2 envoy configuration (v3 example below). You will need to update this configuration for every sidecar you would like to talk to. There are also several options for dynamic configuration, like envoy XDS (and other associated dynamic config modes), or using something like terraform (if that's your deployment method) to generate the configs at deployment time. NOTE: This config **does not** send a client certificate to authenticate with remote clusters, see envoy v3 config.
```yaml
admin:


@ -55,7 +55,7 @@ The main motivation for considering deletions in the object storage are the foll
* **reason for deletion**
* The entered details are processed by the CLI tool to create a tombstone file (unique for a request and irrespective of the presence of series), and the file is uploaded to the object storage making it accessible to all components.
* **Filename optimization**: The filename is created from the hash of the matchers, minTime and maxTime. This lets the same future request re-write the existing tombstone, hence avoiding duplication of the same request. (NOTE: Requests which entail common deletions still create different tombstones.)
* Store Gateway masks the series on processing the global tombstone files from the object storage. At chunk level, whenever there's a match with the data corresponding to atleast one of the tombstones, we skip the chunk, potentially resulting in the masking of chunk.
* Store Gateway masks the series on processing the global tombstone files from the object storage. At chunk level, whenever there's a match with the data corresponding to at least one of the tombstones, we skip the chunk, potentially resulting in the masking of chunk.
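The filename scheme above can be sketched in Go. This is illustrative only — the hash function, field encoding, and whether matchers are canonicalized are choices of this sketch, not necessarily what Thanos does:

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"strings"
)

// tombstoneFilename derives a deterministic name from a deletion request's
// matchers and time range, so re-submitting the same request overwrites the
// existing tombstone instead of creating a duplicate.
func tombstoneFilename(matchers []string, minTime, maxTime int64) string {
	sorted := append([]string(nil), matchers...)
	sort.Strings(sorted) // canonicalize so matcher order does not change the name
	h := sha256.Sum256([]byte(fmt.Sprintf("%s|%d|%d", strings.Join(sorted, ","), minTime, maxTime)))
	return fmt.Sprintf("%x.json", h[:8])
}

func main() {
	a := tombstoneFilename([]string{`job="api"`, `__name__="up"`}, 100, 200)
	b := tombstoneFilename([]string{`__name__="up"`, `job="api"`}, 100, 200)
	fmt.Println(a == b) // same request in a different order, same filename
}
```

Note that two requests with different time ranges hash to different names even if their matchers overlap, which is exactly the "common deletions still create different tombstones" behavior noted above.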
## Considerations


@ -203,9 +203,9 @@ func (s *seriesServer) Send(r *storepb.SeriesResponse) error {
}
```
Now that the `SeriesStats` are propagated into the `storepb.SeriesServer`, we can ammend the `selectFn` function to return a tuple of `(storage.SeriesSet, storage.SeriesSetCounter, error)`
Now that the `SeriesStats` are propagated into the `storepb.SeriesServer`, we can amend the `selectFn` function to return a tuple of `(storage.SeriesSet, storage.SeriesSetCounter, error)`
Ammending the QueryableCreator to provide a func parameter:
Amending the QueryableCreator to provide a func parameter:
```go
type SeriesStatsReporter func(seriesStats storepb.SeriesStatsCounter)


@ -134,7 +134,7 @@ Using the reference implementation, we benchmarked query execution and memory us
We then ran the following query on the reference dataset for 10-15 minutes: `sum by (pod) (http_requests_total)`
The memory usage of Queriers with and without sharding was ~650MB and ~1.5GB respectively, as shown n the screenshots bellow.
The memory usage of Queriers with and without sharding was ~650MB and ~1.5GB respectively, as shown in the screenshots below.
Memory usage with sharding:


@ -180,7 +180,7 @@ message SeriesRefMap {
### 9.2 Per-Receive Validation
We can implement the same new endpoints as mentioned in the previous approach, on Thanos Receive, but do merging and checking operations on each Receive node in the hashring, i.e change the existing Router and Ingestor modes to handle the same limting logic.
We can implement the same new endpoints as mentioned in the previous approach, on Thanos Receive, but do merging and checking operations on each Receive node in the hashring, i.e. change the existing Router and Ingestor modes to handle the same limiting logic.
The implementation would be as follows,


@ -192,7 +192,7 @@ Receivers do not need to re-shard data on rollouts; instead, they must flush the
This may produce small, and therefore unoptimized, TSDB blocks in object storage, however these are optimized away by the Thanos compactor by merging the small blocks into bigger blocks. The compaction process is done concurrently in a separate deployment to the receivers. Timestamps involved are produced by the sending Prometheus, therefore no clock synchronization is necessary.
When changing a soft tenant to a hard tenant (or vise versa), all blocks on all nodes in hashrings in which the tenant is present must be flushed.
When changing a soft tenant to a hard tenant (or vice versa), all blocks on all nodes in hashrings in which the tenant is present must be flushed.
## Open questions


@ -131,7 +131,7 @@ Example usages would be:
* Add/import relabel config into Thanos, add relevant logic.
* Hook it for selecting blocks on Store Gateway
* Advertise original labels of "approved" blocs on resulted external labels.
* Advertise original labels of "approved" blocks on resulted external labels.
* Hook it for selecting blocks on Compactor.
* Add documentation about the following concern: care must be taken when changing block selection for the compactor, to ensure only a single compactor is ever running over each Source's blocks.


@ -35,7 +35,7 @@ Thus, this logic needs to be changed somehow. There are a few possible options:
2. Another option could be introduced such as `--store.hold-timeout` which would be `--store.unhealthy-timeout`'s brother and we would hold the StoreAPI nodes for `max(hold_timeout, unhealthy_timeout)`.
3. Another option such as `--store.strict-mode` could be introduced which means that we would always retain the last information of the StoreAPI nodes of the last successful check.
4. The StoreAPI node specification format that is used in `--store` could be extended to include another flag which would let specify the previous option per-specific node.
5. Instead of extending the specification format, we could move the same inforamtion to the command line options themselves. This would increase the explicitness of this new mode i.e. that it only applies to statically defined nodes.
5. Instead of extending the specification format, we could move the same information to the command line options themselves. This would increase the explicitness of this new mode i.e. that it only applies to statically defined nodes.
Let's look through their pros and cons:


@ -13,7 +13,7 @@ We want to be able to distinguish between gRPC Store APIs and other Queriers in
This is useful for a few reasons:
* When Queriers register disjoint Store targets, they should be able to deduplicate series and then execute the query without concerns of duplicate data from other queriers. This new API would enable users to effectively partition by Querier, and avoid shipping raw series back from each disjointed Querier to the root Querier.
* If Queriers register Store targets with overlapping series, users would be able to express a query sharding strategy between Queriers to more effectively distribute query load amongst a fleet of homogenous Queriers.
* If Queriers register Store targets with overlapping series, users would be able to express a query sharding strategy between Queriers to more effectively distribute query load amongst a fleet of homogeneous Queriers.
* The proposed Query API utilizes gRPC instead of HTTP, which would enable gRPC streaming from root Querier all the way to the underlying Store targets (Query API -> Store API) and unlock the performance benefits of gRPC over HTTP.
* When there is only one StoreAPI connected to Thanos Query which completely covers the requested range of the original user's query, then it is more optimal to execute the query directly in the store, instead of sending raw samples to the querier. This scenario is not unlikely given query-frontend's sharding capabilities.


@ -190,7 +190,7 @@ sum(
The root querier would need to know that downstream queriers have already executed the `count` and should convert the aggregation into a `sum`.
A similar problem can happen with a `sum(rate(metric[2m]))` expression where downstream queriers calculate the `sum` over the metric's `rate`. In order for the values to not get rated twice, either the downstream queriers need to invert the rate into a cumulative value, or the central querier needs to omit the rate and only calcualte the sum.
A similar problem can happen with a `sum(rate(metric[2m]))` expression where downstream queriers calculate the `sum` over the metric's `rate`. In order for the values to not get rated twice, either the downstream queriers need to invert the rate into a cumulative value, or the central querier needs to omit the rate and only calculate the sum.
Managing this complexity in Thanos itself seems error prone and hard to maintain over time. As a result, this proposal suggests to localize the complexity into a single logical optimizer as suggested in the sections above.


@ -23,6 +23,11 @@ Release shepherd responsibilities:
| Release | Time of first RC | Shepherd (GitHub handle) |
|---------|------------------|-------------------------------|
| v0.39.0 | 2025.05.29 | `@GiedriusS` |
| v0.38.0 | 2025.03.25 | `@MichaHoffmann` |
| v0.37.0 | 2024.11.19 | `@saswatamcode` |
| v0.36.0 | 2024.06.26 | `@MichaHoffmann` |
| v0.35.0 | 2024.04.09 | `@saswatamcode` |
| v0.34.0 | 2024.01.14 | `@MichaHoffmann` |
| v0.33.0 | 2023.10.24 | `@MichaHoffmann` |
| v0.32.0 | 2023.08.23 | `@saswatamcode` |


@ -103,6 +103,7 @@ config:
kms_encryption_context: {}
encryption_key: ""
sts_endpoint: ""
max_retries: 0
prefix: ""
```
@ -138,7 +139,7 @@ For debug and testing purposes you can set
##### S3 Server-Side Encryption
SSE can be configued using the `sse_config`. [SSE-S3](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingServerSideEncryption.html), [SSE-KMS](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingKMSEncryption.html), and [SSE-C](https://docs.aws.amazon.com/AmazonS3/latest/dev/ServerSideEncryptionCustomerKeys.html) are supported.
SSE can be configured using the `sse_config`. [SSE-S3](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingServerSideEncryption.html), [SSE-KMS](https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingKMSEncryption.html), and [SSE-C](https://docs.aws.amazon.com/AmazonS3/latest/dev/ServerSideEncryptionCustomerKeys.html) are supported.
* If type is set to `SSE-S3` you do not need to configure other options.
@ -304,6 +305,8 @@ config:
server_name: ""
insecure_skip_verify: false
disable_compression: false
chunk_size_bytes: 0
max_retries: 0
prefix: ""
```
@ -494,6 +497,7 @@ config:
endpoint: ""
secret_key: ""
secret_id: ""
max_retries: 0
http_config:
idle_conn_timeout: 1m30s
response_header_timeout: 2m


@ -411,7 +411,7 @@ rules:
severity: critical
- alert: ThanosQueryOverload
annotations:
description: Thanos Query {{$labels.job}} has been overloaded for more than 15 minutes. This may be a symptom of excessive simultanous complex requests, low performance of the Prometheus API, or failures within these components. Assess the health of the Thanos query instances, the connnected Prometheus instances, look for potential senders of these requests and then contact support.
description: Thanos Query {{$labels.job}} has been overloaded for more than 15 minutes. This may be a symptom of excessive simultaneous complex requests, low performance of the Prometheus API, or failures within these components. Assess the health of the Thanos query instances, the connected Prometheus instances, look for potential senders of these requests and then contact support.
runbook_url: https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanosqueryoverload
summary: Thanos query reaches its maximum capacity serving concurrent requests.
expr: |


@ -160,7 +160,7 @@ groups:
severity: critical
- alert: ThanosQueryOverload
annotations:
description: Thanos Query {{$labels.job}} has been overloaded for more than 15 minutes. This may be a symptom of excessive simultanous complex requests, low performance of the Prometheus API, or failures within these components. Assess the health of the Thanos query instances, the connnected Prometheus instances, look for potential senders of these requests and then contact support.
description: Thanos Query {{$labels.job}} has been overloaded for more than 15 minutes. This may be a symptom of excessive simultaneous complex requests, low performance of the Prometheus API, or failures within these components. Assess the health of the Thanos query instances, the connected Prometheus instances, look for potential senders of these requests and then contact support.
runbook_url: https://github.com/thanos-io/thanos/tree/main/mixin/runbook.md#alert-name-thanosqueryoverload
summary: Thanos query reaches its maximum capacity serving concurrent requests.
expr: |


@ -1365,7 +1365,7 @@
"dashLength": 10,
"dashes": false,
"datasource": "$datasource",
"description": "Shows ratio of errors compared to the total number of forwareded requests to other receive nodes.",
"description": "Shows ratio of errors compared to the total number of forwarded requests to other receive nodes.",
"fill": 10,
"id": 17,
"legend": {


@ -8,6 +8,7 @@ import (
"os"
execlib "os/exec"
"path/filepath"
"strings"
"testing"
"github.com/efficientgo/e2e"
@ -329,19 +330,25 @@ func TestReadOnlyThanosSetup(t *testing.T) {
// │ Sidecar │◄─────┘
// └────────────┘
//
storeAPIEndpoints := []string{
store1.InternalEndpoint("grpc"),
store2.InternalEndpoint("grpc"),
sidecarHA0.InternalEndpoint("grpc"),
sidecarHA1.InternalEndpoint("grpc"),
sidecar2.InternalEndpoint("grpc"),
receive1.InternalEndpoint("grpc"),
}
query1 := e2edb.NewThanosQuerier(
e,
"query1",
[]string{
store1.InternalEndpoint("grpc"),
store2.InternalEndpoint("grpc"),
sidecarHA0.InternalEndpoint("grpc"),
sidecarHA1.InternalEndpoint("grpc"),
sidecar2.InternalEndpoint("grpc"),
receive1.InternalEndpoint("grpc"),
},
[]string{},
e2edb.WithImage("thanos:latest"),
e2edb.WithFlagOverride(map[string]string{"--tracing.config": string(jaegerConfig)}),
e2edb.WithFlagOverride(map[string]string{
"--tracing.config": string(jaegerConfig),
"--endpoint": strings.Join(storeAPIEndpoints, ","),
}),
)
testutil.Ok(t, e2e.StartAndWaitReady(query1))

go.mod

@ -1,282 +1,315 @@
module github.com/thanos-io/thanos
go 1.21
go 1.24.0
require (
cloud.google.com/go/storage v1.40.0 // indirect
cloud.google.com/go/trace v1.10.7
github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace v1.8.3
github.com/alecthomas/units v0.0.0-20231202071711-9a357b53e9c9
github.com/alicebob/miniredis/v2 v2.22.0
capnproto.org/go/capnp/v3 v3.1.0-alpha.1
cloud.google.com/go/trace v1.11.4
github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace v1.27.0
github.com/KimMachineGun/automemlimit v0.7.3
github.com/alecthomas/units v0.0.0-20240927000941-0f3dac36c52b
github.com/alicebob/miniredis/v2 v2.35.0
github.com/blang/semver/v4 v4.0.0
github.com/bradfitz/gomemcache v0.0.0-20190913173617-a41fca850d0b
github.com/cespare/xxhash v1.1.0
github.com/bradfitz/gomemcache v0.0.0-20250403215159-8d39553ac7cf
github.com/caio/go-tdigest v3.1.0+incompatible
github.com/cespare/xxhash/v2 v2.3.0
github.com/chromedp/cdproto v0.0.0-20230802225258-3cf4e6d46a89
github.com/chromedp/chromedp v0.9.2
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
github.com/cortexproject/promqlsmith v0.0.0-20250407233056-90db95b1a4e4
github.com/cristalhq/hedgedhttp v0.9.1
github.com/dustin/go-humanize v1.0.1
github.com/efficientgo/core v1.0.0-rc.3
github.com/efficientgo/e2e v0.14.1-0.20230710114240-c316eb95ae5b
github.com/efficientgo/tools/extkingpin v0.0.0-20220817170617-6c25e3b627dd
github.com/efficientgo/tools/extkingpin v0.0.0-20230505153745-6b7392939a60
github.com/facette/natsort v0.0.0-20181210072756-2cd4dd1e2dcb
github.com/fatih/structtag v1.2.0
github.com/felixge/fgprof v0.9.4
github.com/felixge/fgprof v0.9.5
github.com/fortytw2/leaktest v1.3.0
github.com/fsnotify/fsnotify v1.7.0
github.com/fsnotify/fsnotify v1.9.0
github.com/go-kit/log v0.2.1
github.com/go-openapi/strfmt v0.23.0
github.com/gogo/protobuf v1.3.2
github.com/gogo/status v1.1.1
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da
github.com/golang/groupcache v0.0.0-20241129210726-2c02b8208cf8
github.com/golang/protobuf v1.5.4
github.com/golang/snappy v0.0.4
github.com/golang/snappy v1.0.0
github.com/google/go-cmp v0.7.0
github.com/google/uuid v1.6.0
github.com/googleapis/gax-go v2.0.2+incompatible
github.com/gorilla/mux v1.8.0 // indirect
github.com/grafana/regexp v0.0.0-20240518133315-a468a5bfb3bc // indirect
github.com/grpc-ecosystem/go-grpc-middleware/v2 v2.1.0
github.com/grpc-ecosystem/go-grpc-middleware/providers/prometheus v1.0.1
github.com/grpc-ecosystem/go-grpc-middleware/v2 v2.3.2
github.com/hashicorp/golang-lru/v2 v2.0.7
github.com/jpillora/backoff v1.0.0
github.com/json-iterator/go v1.1.12
github.com/klauspost/compress v1.17.9
github.com/klauspost/compress v1.18.0
github.com/leanovate/gopter v0.2.9
github.com/lightstep/lightstep-tracer-go v0.25.0
github.com/lightstep/lightstep-tracer-go v0.26.0
github.com/lovoo/gcloud-opentracing v0.3.0
github.com/miekg/dns v1.1.59
github.com/minio/minio-go/v7 v7.0.72 // indirect
github.com/miekg/dns v1.1.66
github.com/minio/sha256-simd v1.0.1
github.com/mitchellh/go-ps v1.0.0
github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f
github.com/oklog/run v1.1.0
github.com/oklog/ulid v1.3.1
github.com/oklog/ulid v1.3.1 // indirect
github.com/olekukonko/tablewriter v0.0.5
github.com/opentracing-contrib/go-grpc v0.0.0-20210225150812-73cb765af46e // indirect
github.com/opentracing-contrib/go-stdlib v1.0.0 // indirect
github.com/onsi/gomega v1.36.2
github.com/opentracing/basictracer-go v1.1.0
github.com/opentracing/opentracing-go v1.2.0
github.com/pkg/errors v0.9.1
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
github.com/prometheus/alertmanager v0.27.0
github.com/prometheus/client_golang v1.19.1
github.com/prometheus/client_model v0.6.1
github.com/prometheus/common v0.54.1-0.20240615204547-04635d2962f9
github.com/prometheus/exporter-toolkit v0.11.0
// Prometheus maps version 2.x.y to tags v0.x.y.
github.com/prometheus/prometheus v0.52.2-0.20240614130246-4c1e71fa0b3d
github.com/sony/gobreaker v0.5.0
github.com/stretchr/testify v1.9.0
github.com/thanos-io/objstore v0.0.0-20240622095743-1afe5d4bc3cd
github.com/thanos-io/promql-engine v0.0.0-20240515161521-93aa311933cf
github.com/prometheus-community/prom-label-proxy v0.11.1
github.com/prometheus/alertmanager v0.28.1
github.com/prometheus/client_golang v1.22.0
github.com/prometheus/client_model v0.6.2
github.com/prometheus/common v0.63.0
github.com/prometheus/exporter-toolkit v0.14.0
// Prometheus maps version 3.x.y to tags v0.30x.y.
github.com/prometheus/prometheus v0.303.1
github.com/redis/rueidis v1.0.61
github.com/seiflotfy/cuckoofilter v0.0.0-20240715131351-a2f2c23f1771
github.com/sony/gobreaker v1.0.0
github.com/stretchr/testify v1.10.0
github.com/thanos-io/objstore v0.0.0-20241111205755-d1dd89d41f97
github.com/thanos-io/promql-engine v0.0.0-20250522103302-dd83bd8fdb50
github.com/uber/jaeger-client-go v2.30.0+incompatible
github.com/uber/jaeger-lib v2.4.1+incompatible // indirect
github.com/vimeo/galaxycache v0.0.0-20210323154928-b7e5d71c067a
github.com/vimeo/galaxycache v1.3.1
github.com/weaveworks/common v0.0.0-20230728070032-dd9e68f319d5
go.elastic.co/apm v1.15.0
go.elastic.co/apm/module/apmot v1.15.0
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.52.0 // indirect
go.opentelemetry.io/otel v1.27.0
go.opentelemetry.io/otel/bridge/opentracing v1.21.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.27.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.27.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.27.0
go.opentelemetry.io/otel/sdk v1.27.0
go.opentelemetry.io/otel/trace v1.27.0
go.opentelemetry.io/contrib/propagators/autoprop v0.61.0
go.opentelemetry.io/contrib/samplers/jaegerremote v0.30.0
go.opentelemetry.io/otel v1.36.0
go.opentelemetry.io/otel/bridge/opentracing v1.36.0
go.opentelemetry.io/otel/exporters/jaeger v1.17.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.36.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.36.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.36.0
go.opentelemetry.io/otel/sdk v1.36.0
go.opentelemetry.io/otel/trace v1.36.0
go.uber.org/atomic v1.11.0
go.uber.org/automaxprocs v1.5.3
go.uber.org/automaxprocs v1.6.0
go.uber.org/goleak v1.3.0
golang.org/x/crypto v0.24.0
golang.org/x/net v0.26.0
golang.org/x/sync v0.7.0
golang.org/x/text v0.16.0
golang.org/x/time v0.5.0
google.golang.org/api v0.183.0 // indirect
google.golang.org/genproto v0.0.0-20240528184218-531527333157 // indirect
google.golang.org/grpc v1.64.0
go4.org/intern v0.0.0-20230525184215-6c62f75575cb
golang.org/x/crypto v0.39.0
golang.org/x/net v0.41.0
golang.org/x/sync v0.15.0
golang.org/x/text v0.26.0
golang.org/x/time v0.12.0
google.golang.org/grpc v1.73.0
google.golang.org/grpc/examples v0.0.0-20211119005141-f45e61797429
gopkg.in/alecthomas/kingpin.v2 v2.2.6
google.golang.org/protobuf v1.36.6
gopkg.in/yaml.v2 v2.4.0
gopkg.in/yaml.v3 v3.0.1
)
require (
github.com/efficientgo/core v1.0.0-rc.2
github.com/minio/sha256-simd v1.0.1
cloud.google.com/go v0.118.0 // indirect
cloud.google.com/go/auth v0.15.1-0.20250317171031-671eed979bfd // indirect
cloud.google.com/go/auth/oauth2adapt v0.2.8 // indirect
cloud.google.com/go/compute/metadata v0.7.0 // indirect
cloud.google.com/go/iam v1.3.1 // indirect
cloud.google.com/go/storage v1.43.0 // indirect
github.com/Azure/azure-sdk-for-go/sdk/azcore v1.18.0 // indirect
github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.10.0 // indirect
github.com/Azure/azure-sdk-for-go/sdk/internal v1.11.1 // indirect
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob v1.6.1 // indirect
github.com/AzureAD/microsoft-authentication-library-for-go v1.4.2 // indirect
)
require (
cloud.google.com/go v0.114.0 // indirect
cloud.google.com/go/iam v1.1.8 // indirect
github.com/Azure/azure-sdk-for-go/sdk/azcore v1.11.1 // indirect
github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.5.2 // indirect
github.com/Azure/azure-sdk-for-go/sdk/internal v1.6.0 // indirect
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob v1.3.0 // indirect
github.com/AzureAD/microsoft-authentication-library-for-go v1.2.2 // indirect
go.opentelemetry.io/contrib/samplers/jaegerremote v0.7.0
go.opentelemetry.io/otel/exporters/jaeger v1.16.0
github.com/alecthomas/kingpin/v2 v2.4.0
github.com/oklog/ulid/v2 v2.1.1
github.com/prometheus/otlptranslator v0.0.0-20250527173959-2573485683d5
github.com/tjhop/slog-gokit v0.1.4
go.opentelemetry.io/collector/pdata v1.34.0
go.opentelemetry.io/collector/semconv v0.128.0
)
require (
github.com/cortexproject/promqlsmith v0.0.0-20240326071418-c2a9ca1e89f5
github.com/grpc-ecosystem/go-grpc-middleware/providers/prometheus v1.0.1
github.com/hashicorp/golang-lru/v2 v2.0.7
github.com/mitchellh/go-ps v1.0.0
github.com/onsi/gomega v1.33.1
github.com/prometheus-community/prom-label-proxy v0.8.1-0.20240127162815-c1195f9aabc0
go.opentelemetry.io/contrib/propagators/autoprop v0.38.0
go4.org/intern v0.0.0-20230525184215-6c62f75575cb
golang.org/x/exp v0.0.0-20240119083558-1b970713d09a
)
require github.com/dgryski/go-metro v0.0.0-20250106013310-edb8663e5e33 // indirect
require (
cloud.google.com/go/auth v0.5.1 // indirect
cloud.google.com/go/auth/oauth2adapt v0.2.2 // indirect
github.com/HdrHistogram/hdrhistogram-go v1.1.2 // indirect
github.com/bboreham/go-loser v0.0.0-20230920113527-fcc2c21820a3 // indirect
github.com/cilium/ebpf v0.11.0 // indirect
github.com/containerd/cgroups/v3 v3.0.3 // indirect
github.com/docker/go-units v0.5.0 // indirect
github.com/elastic/go-licenser v0.3.1 // indirect
github.com/go-openapi/runtime v0.27.1 // indirect
github.com/goccy/go-json v0.10.3 // indirect
github.com/godbus/dbus/v5 v5.0.4 // indirect
github.com/golang-jwt/jwt/v5 v5.2.1 // indirect
github.com/google/s2a-go v0.1.7 // indirect
github.com/huaweicloud/huaweicloud-sdk-go-obs v3.23.3+incompatible // indirect
github.com/jcchavezs/porto v0.1.0 // indirect
github.com/elastic/go-licenser v0.4.2 // indirect
github.com/go-ini/ini v1.67.0 // indirect
github.com/go-openapi/runtime v0.28.0 // indirect
github.com/goccy/go-json v0.10.5 // indirect
github.com/golang-jwt/jwt/v5 v5.2.2 // indirect
github.com/google/s2a-go v0.1.9 // indirect
github.com/huaweicloud/huaweicloud-sdk-go-obs v3.25.4+incompatible // indirect
github.com/jcchavezs/porto v0.7.0 // indirect
github.com/leesper/go_rng v0.0.0-20190531154944-a612b043e353 // indirect
github.com/mdlayher/socket v0.5.1 // indirect
github.com/mdlayher/vsock v1.2.1 // indirect
github.com/metalmatze/signal v0.0.0-20210307161603-1c9aa721a97a // indirect
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
github.com/onsi/ginkgo v1.16.5 // indirect
github.com/opencontainers/runtime-spec v1.0.2 // indirect
github.com/pbnjay/memory v0.0.0-20210728143218-7b4eea64cf58 // indirect
github.com/sercand/kuberesolver/v4 v4.0.0 // indirect
github.com/zhangyunhao116/umap v0.0.0-20221211160557-cb7705fafa39 // indirect
go.opentelemetry.io/collector/pdata v1.8.0 // indirect
go.opentelemetry.io/collector/semconv v0.101.0 // indirect
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.49.0 // indirect
go.opentelemetry.io/contrib/propagators/ot v1.13.0 // indirect
go4.org/unsafe/assume-no-moving-gc v0.0.0-20230525183740-e7c30c78aeb2 // indirect
golang.org/x/lint v0.0.0-20210508222113-6edffad5e616 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20240528184218-531527333157 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20240528184218-531527333157 // indirect
k8s.io/apimachinery v0.29.3 // indirect
k8s.io/client-go v0.29.3 // indirect
k8s.io/klog/v2 v2.120.1 // indirect
k8s.io/utils v0.0.0-20230726121419-3b25d923346b // indirect
github.com/zhangyunhao116/umap v0.0.0-20250307031311-0b61e69e958b // indirect
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.61.0 // indirect
go.opentelemetry.io/contrib/propagators/ot v1.36.0 // indirect
go4.org/unsafe/assume-no-moving-gc v0.0.0-20231121144256-b99613f794b6 // indirect
golang.org/x/lint v0.0.0-20241112194109-818c5a804067 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20250603155806-513f23925822 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20250603155806-513f23925822 // indirect
k8s.io/apimachinery v0.33.1 // indirect
k8s.io/client-go v0.33.1 // indirect
k8s.io/klog/v2 v2.130.1 // indirect
k8s.io/utils v0.0.0-20250604170112-4c0f3b243397 // indirect
)
require (
cloud.google.com/go/compute/metadata v0.3.0 // indirect
github.com/GoogleCloudPlatform/opentelemetry-operations-go/internal/resourcemapping v0.32.3 // indirect
github.com/KimMachineGun/automemlimit v0.6.1
github.com/OneOfOne/xxhash v1.2.6 // indirect
github.com/alecthomas/template v0.0.0-20190718012654-fb15b899a751 // indirect
github.com/alicebob/gopher-json v0.0.0-20200520072559-a9ecdc9d1d3a // indirect
github.com/aliyun/aliyun-oss-go-sdk v2.2.2+incompatible // indirect
github.com/GoogleCloudPlatform/opentelemetry-operations-go/internal/resourcemapping v0.52.0 // indirect
github.com/aliyun/aliyun-oss-go-sdk v3.0.2+incompatible // indirect
github.com/armon/go-radix v1.0.0 // indirect
github.com/asaskevich/govalidator v0.0.0-20230301143203-a9d515a09cc2 // indirect
github.com/aws/aws-sdk-go v1.53.16 // indirect
github.com/aws/aws-sdk-go-v2 v1.16.0 // indirect
github.com/aws/aws-sdk-go-v2/config v1.15.1 // indirect
github.com/aws/aws-sdk-go-v2/credentials v1.11.0 // indirect
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.12.1 // indirect
github.com/aws/aws-sdk-go-v2/internal/configsources v1.1.7 // indirect
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.4.1 // indirect
github.com/aws/aws-sdk-go-v2/internal/ini v1.3.8 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.9.1 // indirect
github.com/aws/aws-sdk-go-v2/service/sso v1.11.1 // indirect
github.com/aws/aws-sdk-go-v2/service/sts v1.16.1 // indirect
github.com/aws/smithy-go v1.11.1 // indirect
github.com/baidubce/bce-sdk-go v0.9.111 // indirect
github.com/aws/aws-sdk-go v1.55.7 // indirect
github.com/aws/aws-sdk-go-v2 v1.36.3 // indirect
github.com/aws/aws-sdk-go-v2/config v1.29.15 // indirect
github.com/aws/aws-sdk-go-v2/credentials v1.17.68 // indirect
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.16.30 // indirect
github.com/aws/aws-sdk-go-v2/internal/configsources v1.3.34 // indirect
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.6.34 // indirect
github.com/aws/aws-sdk-go-v2/internal/ini v1.8.3 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.12.3 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.12.15 // indirect
github.com/aws/aws-sdk-go-v2/service/sso v1.25.3 // indirect
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.30.1 // indirect
github.com/aws/aws-sdk-go-v2/service/sts v1.33.20 // indirect
github.com/aws/smithy-go v1.22.3 // indirect
github.com/baidubce/bce-sdk-go v0.9.230 // indirect
github.com/beorn7/perks v1.0.1 // indirect
github.com/cenkalti/backoff/v4 v4.3.0 // indirect
github.com/cenkalti/backoff/v5 v5.0.2 // indirect
github.com/chromedp/sysutil v1.0.0 // indirect
github.com/clbanning/mxj v1.8.4 // indirect
github.com/cncf/xds/go v0.0.0-20250501225837-2ac532fd4443 // indirect
github.com/coreos/go-systemd/v22 v22.5.0 // indirect
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
github.com/dennwc/varint v1.0.0 // indirect
github.com/edsrzf/mmap-go v1.1.0 // indirect
github.com/elastic/go-sysinfo v1.8.1 // indirect
github.com/elastic/go-windows v1.0.1 // indirect
github.com/edsrzf/mmap-go v1.2.0 // indirect
github.com/elastic/go-sysinfo v1.15.3 // indirect
github.com/elastic/go-windows v1.0.2 // indirect
github.com/envoyproxy/go-control-plane/envoy v1.32.4 // indirect
github.com/fatih/color v1.18.0 // indirect
github.com/felixge/httpsnoop v1.0.4 // indirect
github.com/go-logfmt/logfmt v0.6.0 // indirect
github.com/go-logr/logr v1.4.1 // indirect
github.com/go-logr/logr v1.4.3 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/go-ole/go-ole v1.2.6 // indirect
github.com/go-openapi/analysis v0.22.2 // indirect
github.com/go-openapi/errors v0.22.0 // indirect
github.com/go-openapi/jsonpointer v0.20.2 // indirect
github.com/go-openapi/jsonreference v0.20.4 // indirect
github.com/go-openapi/loads v0.21.5 // indirect
github.com/go-openapi/spec v0.20.14 // indirect
github.com/go-openapi/swag v0.22.9 // indirect
github.com/go-openapi/validate v0.23.0 // indirect
github.com/go-openapi/analysis v0.23.0 // indirect
github.com/go-openapi/errors v0.22.1 // indirect
github.com/go-openapi/jsonpointer v0.21.1 // indirect
github.com/go-openapi/jsonreference v0.21.0 // indirect
github.com/go-openapi/loads v0.22.0 // indirect
github.com/go-openapi/spec v0.21.0 // indirect
github.com/go-openapi/swag v0.23.1 // indirect
github.com/go-openapi/validate v0.24.0 // indirect
github.com/go-viper/mapstructure/v2 v2.2.1 // indirect
github.com/gobwas/glob v0.2.3 // indirect
github.com/gobwas/httphead v0.1.0 // indirect
github.com/gobwas/pool v0.2.1 // indirect
github.com/gobwas/ws v1.2.1 // indirect
github.com/gofrs/flock v0.8.1 // indirect
github.com/gogo/googleapis v1.4.0 // indirect
github.com/google/go-cmp v0.6.0
github.com/gofrs/flock v0.12.1 // indirect
github.com/gogo/googleapis v1.4.1 // indirect
github.com/google/go-querystring v1.1.0 // indirect
github.com/google/pprof v0.0.0-20240528025155-186aa0362fba // indirect
github.com/google/uuid v1.6.0
github.com/googleapis/enterprise-certificate-proxy v0.3.2 // indirect
github.com/googleapis/gax-go/v2 v2.12.4 // indirect
github.com/grpc-ecosystem/grpc-gateway/v2 v2.20.0 // indirect
github.com/google/pprof v0.0.0-20250607225305-033d6d78b36a // indirect
github.com/googleapis/enterprise-certificate-proxy v0.3.6 // indirect
github.com/googleapis/gax-go/v2 v2.14.1 // indirect
github.com/gorilla/mux v1.8.1 // indirect
github.com/grafana/regexp v0.0.0-20240518133315-a468a5bfb3bc // indirect
github.com/grpc-ecosystem/grpc-gateway/v2 v2.26.3 // indirect
github.com/hashicorp/go-version v1.7.0 // indirect
github.com/jaegertracing/jaeger-idl v0.6.0 // indirect
github.com/jmespath/go-jmespath v0.4.0 // indirect
github.com/joeshaw/multierror v0.0.0-20140124173710-69b34d4ec901 // indirect
github.com/josharian/intern v1.0.0 // indirect
github.com/julienschmidt/httprouter v1.3.0 // indirect
github.com/klauspost/cpuid/v2 v2.2.8 // indirect
github.com/klauspost/cpuid/v2 v2.2.10 // indirect
github.com/knadh/koanf/maps v0.1.2 // indirect
github.com/knadh/koanf/providers/confmap v1.0.0 // indirect
github.com/knadh/koanf/v2 v2.2.1 // indirect
github.com/kylelemons/godebug v1.1.0 // indirect
github.com/lightstep/lightstep-tracer-common/golang/gogo v0.0.0-20210210170715-a8dfcb80d3a7 // indirect
github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0 // indirect
github.com/mailru/easyjson v0.7.7 // indirect
github.com/mattn/go-runewidth v0.0.13 // indirect
github.com/mailru/easyjson v0.9.0 // indirect
github.com/mattn/go-colorable v0.1.14 // indirect
github.com/mattn/go-runewidth v0.0.16 // indirect
github.com/minio/md5-simd v1.1.2 // indirect
github.com/minio/minio-go/v7 v7.0.80 // indirect
github.com/mitchellh/copystructure v1.2.0 // indirect
github.com/mitchellh/mapstructure v1.5.0 // indirect
github.com/mitchellh/reflectwalk v1.0.2 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.2 // indirect
github.com/mozillazg/go-httpheader v0.2.1 // indirect
github.com/mozillazg/go-httpheader v0.4.0 // indirect
github.com/ncw/swift v1.0.53 // indirect
github.com/oracle/oci-go-sdk/v65 v65.41.1 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/internal/exp/metrics v0.128.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/pkg/pdatautil v0.128.0 // indirect
github.com/open-telemetry/opentelemetry-collector-contrib/processor/deltatocumulativeprocessor v0.128.0 // indirect
github.com/opentracing-contrib/go-grpc v0.1.2 // indirect
github.com/opentracing-contrib/go-stdlib v1.1.0 // indirect
github.com/oracle/oci-go-sdk/v65 v65.93.1 // indirect
github.com/pkg/browser v0.0.0-20240102092130-5ac0b6a4141c // indirect
github.com/power-devops/perfstat v0.0.0-20210106213030-5aafc221ea8c // indirect
github.com/prometheus/common/sigv4 v0.1.0 // indirect
github.com/prometheus/procfs v0.15.1 // indirect
github.com/redis/rueidis v1.0.14-go1.18
github.com/rivo/uniseg v0.2.0 // indirect
github.com/rs/xid v1.5.0 // indirect
github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
github.com/prometheus/procfs v0.16.1 // indirect
github.com/prometheus/sigv4 v0.1.2 // indirect
github.com/puzpuzpuz/xsync/v3 v3.5.1 // indirect
github.com/rivo/uniseg v0.4.7 // indirect
github.com/rs/xid v1.6.0 // indirect
github.com/santhosh-tekuri/jsonschema v1.2.4 // indirect
github.com/shirou/gopsutil/v3 v3.22.9 // indirect
github.com/sirupsen/logrus v1.9.3 // indirect
github.com/spaolacci/murmur3 v1.1.0 // indirect
github.com/stretchr/objx v0.5.2 // indirect
github.com/tencentyun/cos-go-sdk-v5 v0.7.40 // indirect
github.com/tklauser/go-sysconf v0.3.10 // indirect
github.com/tklauser/numcpus v0.4.0 // indirect
github.com/tencentyun/cos-go-sdk-v5 v0.7.66 // indirect
github.com/uber/jaeger-lib v2.4.1+incompatible // indirect
github.com/weaveworks/promrus v1.2.0 // indirect
github.com/yuin/gopher-lua v0.0.0-20210529063254-f4c35e4016d9 // indirect
github.com/yusufpapurcu/wmi v1.2.2 // indirect
github.com/xhit/go-str2duration/v2 v2.1.0 // indirect
github.com/youmark/pkcs8 v0.0.0-20240726163527-a2c0da244d78 // indirect
github.com/yuin/gopher-lua v1.1.1 // indirect
go.elastic.co/apm/module/apmhttp v1.15.0 // indirect
go.elastic.co/fastjson v1.1.0 // indirect
go.mongodb.org/mongo-driver v1.14.0 // indirect
go.elastic.co/fastjson v1.5.1 // indirect
go.mongodb.org/mongo-driver v1.17.4 // indirect
go.opencensus.io v0.24.0 // indirect
go.opentelemetry.io/contrib/propagators/aws v1.13.0 // indirect
go.opentelemetry.io/contrib/propagators/b3 v1.13.0 // indirect
go.opentelemetry.io/contrib/propagators/jaeger v1.13.0 // indirect
go.opentelemetry.io/otel/metric v1.27.0 // indirect
go.opentelemetry.io/proto/otlp v1.2.0 // indirect
go.opentelemetry.io/auto/sdk v1.1.0 // indirect
go.opentelemetry.io/collector/component v1.34.0 // indirect
go.opentelemetry.io/collector/confmap v1.34.0 // indirect
go.opentelemetry.io/collector/confmap/xconfmap v0.128.0 // indirect
go.opentelemetry.io/collector/consumer v1.34.0 // indirect
go.opentelemetry.io/collector/featuregate v1.34.0 // indirect
go.opentelemetry.io/collector/internal/telemetry v0.128.0 // indirect
go.opentelemetry.io/collector/pipeline v0.128.0 // indirect
go.opentelemetry.io/collector/processor v1.34.0 // indirect
go.opentelemetry.io/contrib/bridges/otelzap v0.11.0 // indirect
go.opentelemetry.io/contrib/instrumentation/net/http/httptrace/otelhttptrace v0.61.0 // indirect
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.61.0 // indirect
go.opentelemetry.io/contrib/propagators/aws v1.36.0 // indirect
go.opentelemetry.io/contrib/propagators/b3 v1.36.0 // indirect
go.opentelemetry.io/contrib/propagators/jaeger v1.36.0 // indirect
go.opentelemetry.io/otel/log v0.12.2 // indirect
go.opentelemetry.io/otel/metric v1.36.0 // indirect
go.opentelemetry.io/proto/otlp v1.7.0 // indirect
go.uber.org/multierr v1.11.0 // indirect
golang.org/x/mod v0.18.0 // indirect
golang.org/x/oauth2 v0.21.0 // indirect
golang.org/x/sys v0.21.0 // indirect
golang.org/x/tools v0.22.0 // indirect
gonum.org/v1/gonum v0.12.0 // indirect
google.golang.org/protobuf v1.34.2
gopkg.in/ini.v1 v1.67.0 // indirect
howett.net/plist v0.0.0-20181124034731-591f970eefbb // indirect
go.uber.org/zap v1.27.0 // indirect
golang.org/x/exp v0.0.0-20250606033433-dcc06ee1d476 // indirect
golang.org/x/mod v0.25.0 // indirect
golang.org/x/oauth2 v0.30.0 // indirect
golang.org/x/sys v0.33.0 // indirect
golang.org/x/tools v0.34.0 // indirect
gonum.org/v1/gonum v0.16.0 // indirect
google.golang.org/api v0.228.0 // indirect
google.golang.org/genproto v0.0.0-20250122153221-138b5a5a4fd4 // indirect
howett.net/plist v1.0.1 // indirect
sigs.k8s.io/yaml v1.4.0 // indirect
zenhack.net/go/util v0.0.0-20230607025951-8b02fee814ae // indirect
)
replace (
// Pinning capnp due to https://github.com/thanos-io/thanos/issues/7944
capnproto.org/go/capnp/v3 => capnproto.org/go/capnp/v3 v3.0.0-alpha.30
// Using a 3rd-party branch for custom dialer - see https://github.com/bradfitz/gomemcache/pull/86.
// Required by Cortex https://github.com/cortexproject/cortex/pull/3051.
github.com/bradfitz/gomemcache => github.com/themihai/gomemcache v0.0.0-20180902122335-24332e2d58ab
// v3.3.1 with https://github.com/prometheus/prometheus/pull/16252.
github.com/prometheus/prometheus => github.com/thanos-io/thanos-prometheus v0.0.0-20250610133519-082594458a88
// Pin kuberesolver/v5 to support new grpc version. Need to upgrade kuberesolver version on weaveworks/common.
github.com/sercand/kuberesolver/v4 => github.com/sercand/kuberesolver/v5 v5.1.1
@@ -287,8 +320,4 @@ replace (
// Overriding to use latest commit.
gopkg.in/alecthomas/kingpin.v2 => github.com/alecthomas/kingpin v1.3.8-0.20210301060133-17f40c25f497
// From Prometheus.
k8s.io/klog => github.com/simonpasquier/klog-gokit v0.3.0
k8s.io/klog/v2 => github.com/simonpasquier/klog-gokit/v3 v3.0.0
)

go.sum — 898 changed lines; diff suppressed because it is too large.

@@ -22,6 +22,7 @@ import (
"github.com/thanos-io/thanos/pkg/clientconfig"
"github.com/thanos-io/thanos/pkg/discovery/dns"
memcacheDiscovery "github.com/thanos-io/thanos/pkg/discovery/memcache"
"github.com/thanos-io/thanos/pkg/errors"
"github.com/thanos-io/thanos/pkg/extprom"
)
@@ -202,6 +203,28 @@ func (c *memcachedClient) dialViaCircuitBreaker(network, address string, timeout
return conn.(net.Conn), nil
}
func (c *memcachedClient) Set(item *memcache.Item) error {
// Skip hitting memcached at all if the item is bigger than the max allowed size.
if c.maxItemSize > 0 && len(item.Value) > c.maxItemSize {
c.skipped.Inc()
return nil
}
err := c.Client.Set(item)
if err == nil {
return nil
}
// Inject the server address in order to have more information about which memcached
// backend server failed. This is a best effort.
addr, addrErr := c.serverList.PickServer(item.Key)
if addrErr != nil {
return err
}
return errors.Wrapf(err, "server=%s", addr)
}
func (c *memcachedClient) updateLoop(updateInterval time.Duration) {
defer c.wait.Done()
ticker := time.NewTicker(updateInterval)
@@ -228,7 +251,7 @@ func (c *memcachedClient) updateMemcacheServers() error {
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
- if err := c.provider.Resolve(ctx, c.addresses); err != nil {
+ if err := c.provider.Resolve(ctx, c.addresses, true); err != nil {
return err
}
servers = c.provider.Addresses()


@@ -9,7 +9,7 @@ import (
"sync"
"github.com/bradfitz/gomemcache/memcache"
"github.com/cespare/xxhash"
"github.com/cespare/xxhash/v2"
"github.com/facette/natsort"
)

Some files were not shown because too many files have changed in this diff.