Compare commits


122 Commits

Author SHA1 Message Date
Giedrius Statkevičius 49a560d09d
Merge pull request #7758 from thibaultmg/life_of_a_sample_part_2
Blog article submission: Life of a Sample in Thanos Part II
2025-07-26 15:03:50 +03:00
Harry John c3d4ea7cdd
*: Update promql-engine and prometheus (#8388)
* *: Update promql-engine and prometheus

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

* Fix data race

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>

---------

Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2025-07-25 10:39:09 -07:00
Thibault Mange 98130c25d6
fix inaccuracies
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>
2025-07-25 14:32:51 +02:00
Thibault Mange bf8777dcc5
Update docs/blog/2023-11-20-life-of-a-sample-part-2.md
Co-authored-by: Giedrius Statkevičius <giedriuswork@gmail.com>
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>
2025-07-25 10:44:25 +02:00
James Geisler be2f408d9e
[tools] add flag for uploading compacted blocks to bucket upload-blocks (#8359)
* add flag for uploading compacted blocks to thanos tools

Signed-off-by: James Geisler <geislerjamesd@gmail.com>

* update changelog

Signed-off-by: James Geisler <geislerjamesd@gmail.com>

* fix doc check

Signed-off-by: James Geisler <geislerjamesd@gmail.com>

---------

Signed-off-by: James Geisler <geislerjamesd@gmail.com>
2025-07-23 17:58:01 -07:00
Giedrius Statkevičius e30e831b1c
Merge pull request #8389 from thanos-io/bust_cache
block: bust cache if modified timestamp differs
2025-07-23 14:51:08 +03:00
Giedrius Statkevičius cdecd4ee3f block: use sync.Map for fetcher
f.cached can now be modified concurrently so use a sync.Map.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-07-23 14:06:24 +03:00
Giedrius Statkevičius 97196973f3 block: bust cache if modified timestamp differs
In the parquet converter, we mark the original meta.json file with a
flag when it gets converted so that Thanos Store wouldn't load it. For
that to work, we need to bust the local cache when that happens.

For tests, we need the updated objstore module so I am doing that as
well.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-07-23 13:32:41 +03:00
Giedrius Statkevičius de1a2236eb
Merge pull request #8372 from harry671003/update_grpc
*: Update GRPC
2025-07-22 16:29:23 +03:00
🌲 Harry 🌊 John 🏔 ba255aaccd Fix data race
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2025-07-21 15:18:32 -07:00
🌲 Harry 🌊 John 🏔 f1991970bf *: update GRPC
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
2025-07-21 14:25:25 -07:00
Michael Hoffmann 9073c8d0c5
Merge pull request #8384 from thanos-io/r0392_merge_to_main
Merge release 0.39.2 to main
2025-07-21 09:23:29 +02:00
Michael Hoffmann ba5c91aefb Merge remote-tracking branch 'origin/main' into r0392_merge_to_main 2025-07-21 06:55:52 +00:00
Michael Hoffmann 36681afb5e
Merge pull request #8379 from thanos-io/rel_0392
Release 0.39.2
2025-07-21 08:19:58 +02:00
Michael Hoffmann 5dd0031fab Release 0.39.2
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-07-17 14:17:28 +00:00
Joel Verezhak c0273e1d1a fix: querier panic (#8374)
Thanos Query crashes with "concurrent map iteration and map write" panic
in distributed mode when multiple goroutines access the same `annotations.Annotations`
map concurrently.

```
panic: concurrent map iteration and map write
github.com/prometheus/prometheus/util/annotations.(*Annotations).Merge(...)
github.com/thanos-io/promql-engine/engine.(*compatibilityQuery).Exec(...)
```

Here I replaced direct access to `res.Warnings.AsErrors()` with a thread-safe copy:
```go
// Before (unsafe)
warnings = append(warnings, res.Warnings.AsErrors()...)

// After (thread-safe)
safeWarnings := annotations.New().Merge(res.Warnings)
warnings = append(warnings, safeWarnings.AsErrors()...)
```

Signed-off-by: Joel Verezhak <jverezhak@open-systems.com>
Co-authored-by: Joel Verezhak <jverezhak@open-systems.com>
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-07-17 14:17:28 +00:00
Michael Hoffmann e78458176e query: add custom values to prompb methods (#8375)
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-07-17 14:17:28 +00:00
Michael Hoffmann 20900389bb
query: add custom values to prompb methods (#8375)
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-07-17 13:28:16 +00:00
Joel Verezhak 6f4895633a
fix: querier panic (#8374)
Thanos Query crashes with "concurrent map iteration and map write" panic
in distributed mode when multiple goroutines access the same `annotations.Annotations`
map concurrently.

```
panic: concurrent map iteration and map write
github.com/prometheus/prometheus/util/annotations.(*Annotations).Merge(...)
github.com/thanos-io/promql-engine/engine.(*compatibilityQuery).Exec(...)
```

Here I replaced direct access to `res.Warnings.AsErrors()` with a thread-safe copy:
```go
// Before (unsafe)
warnings = append(warnings, res.Warnings.AsErrors()...)

// After (thread-safe)
safeWarnings := annotations.New().Merge(res.Warnings)
warnings = append(warnings, safeWarnings.AsErrors()...)
```

Signed-off-by: Joel Verezhak <jverezhak@open-systems.com>
Co-authored-by: Joel Verezhak <jverezhak@open-systems.com>
2025-07-17 13:28:03 +00:00
Giedrius Statkevičius 0dc0b29fc8
Merge pull request #8366 from verejoel/feature/parquet-migration-flag
feat: ignore parquet migrated blocks in store gateway
2025-07-16 22:51:23 +03:00
Giedrius Statkevičius b4951291c7 *: always enable, clean up tests+code
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-07-16 18:30:16 +03:00
Giedrius Statkevičius 77f12e3e97
Merge pull request #8370 from open-ch/fix/querier-relabel-config
fix: query announced endpoints match relabel-config
2025-07-15 14:03:48 +03:00
Joel Verezhak dddffa99c4
fix acceptance test
Signed-off-by: Joel Verezhak <jverezhak@open-systems.com>
2025-07-14 01:01:43 +02:00
Joel Verezhak f2ff735e76
return only one store
Signed-off-by: Joel Verezhak <jverezhak@open-systems.com>
2025-07-13 21:52:56 +02:00
Joel Verezhak dee991e0d9
acceptance test
Signed-off-by: Joel Verezhak <jverezhak@open-systems.com>
2025-07-13 21:42:13 +02:00
Joel Verezhak 0972c43f29
acceptance test
Signed-off-by: Joel Verezhak <jverezhak@open-systems.com>
2025-07-13 21:17:30 +02:00
Joel Verezhak bd88416a19
rename method
Signed-off-by: Joel Verezhak <jverezhak@open-systems.com>
2025-07-13 20:35:41 +02:00
Joel Verezhak 0bb3e73e9d
refactor
Signed-off-by: Joel Verezhak <jverezhak@open-systems.com>
2025-07-13 20:24:08 +02:00
Joel Verezhak 9f2acf9df9
lint
Signed-off-by: Joel Verezhak <jverezhak@open-systems.com>
2025-07-12 01:38:19 +02:00
Joel Verezhak 8b3c29acc7
fix: querier external labels match relabel config
Signed-off-by: Joel Verezhak <jverezhak@open-systems.com>
2025-07-12 01:16:20 +02:00
Joel Verezhak ecd54dafd0
feat: ignore parquet migrated blocks in store gateway
Signed-off-by: Joel Verezhak <j.verezhak@gmail.com>
2025-07-08 17:46:19 +02:00
Giedrius Statkevičius b51ef67654
Merge pull request #8364 from thanos-io/use_prom_consts
*: use prometheus consts
2025-07-08 15:33:45 +03:00
Giedrius Statkevičius c8e9c2b12c *: use prometheus consts
Use Prometheus consts instead of using our own.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-07-08 14:50:55 +03:00
Michael Hoffmann 0f81bb792a
query: make grpc service config for endpoint groups configurable (#8287)
We add a "service_config" field to the endpoint config file that can be used
to override the default service config for endpoint groups. This enables
configuring retry policy or load balancing at the endpoint level.

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-07-08 08:54:32 +01:00
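The commit above embeds a gRPC service config per endpoint group. A hedged sketch of what such a config could look like: the surrounding endpoint-config fields are assumptions (only the "service_config" field name comes from the commit), while the embedded JSON follows the standard gRPC service config format (`loadBalancingConfig`, `methodConfig`, `retryPolicy`):

```yaml
# Hypothetical endpoint group entry; verify field names against the
# Thanos endpoint config docs before use.
- group: store-replicas
  service_config: |
    {
      "loadBalancingConfig": [ { "round_robin": {} } ],
      "methodConfig": [ {
        "name": [ { "service": "thanos.Store" } ],
        "retryPolicy": {
          "maxAttempts": 3,
          "initialBackoff": "0.1s",
          "maxBackoff": "1s",
          "backoffMultiplier": 2,
          "retryableStatusCodes": [ "UNAVAILABLE" ]
        }
      } ]
    }
```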
Giedrius Statkevičius ddd5ff85f4
Merge pull request #8352 from thanos-io/r0391_merge_to_main
Merge release-0.39 to main
2025-07-01 17:13:53 +03:00
Giedrius Statkevičius 49cccb4d83 CHANGELOG: fix formatting
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-07-01 16:34:53 +03:00
Giedrius Statkevičius d6a926e613 Merge branch 'main' into r0391_merge_to_main
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-07-01 16:28:16 +03:00
Giedrius Statkevičius ad743914dd
Merge pull request #8351 from thanos-io/rel_0391
Release 0.39.1
2025-07-01 13:15:10 +03:00
Giedrius Statkevičius 35309514d1
Merge pull request #8347 from Saumya40-codes/update-docs-links
docs: update changed repositories links in docs/ to correct location
2025-07-01 13:02:03 +03:00
Giedrius Statkevičius e9bdd79df2 CHANGELOG: release 0.39.1
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-07-01 12:59:58 +03:00
Giedrius Statkevičius 5583757964 qfe: defer properly
Refactor this check into a separate function so that defer runs at
the end of it and cleans up resources properly.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-07-01 12:58:11 +03:00
Giedrius Statkevičius 4240ff3579 cmd/query_frontend: use original roundtripper + close immediately
Let's avoid using all the Cortex roundtripper machinery by using the
downstream roundtripper directly and then closing the body immediately
so as not to allocate any memory for the body of the response.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-07-01 12:58:05 +03:00
Giedrius Statkevičius 7c5ba37e5e
Merge pull request #8349 from thanos-io/defer_qfe
qfe: defer properly
2025-07-01 11:28:58 +03:00
Giedrius Statkevičius 938c083d6b qfe: defer properly
Refactor this check into a separate function so that defer runs at
the end of it and cleans up resources properly.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-07-01 10:34:37 +03:00
Saumya Shah 9847758315 update changed repositories urls in docs/
Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>
2025-07-01 08:44:37 +05:30
Giedrius Statkevičius 246502a29b
Merge pull request #8338 from thanos-io/tweak_qfe
cmd/query_frontend: use original roundtripper + close immediately
2025-06-30 14:16:06 +03:00
Giedrius Statkevičius d87029eea4 cmd/query_frontend: use original roundtripper + close immediately
Let's avoid using all the Cortex roundtripper machinery by using the
downstream roundtripper directly and then closing the body immediately
so as not to allocate any memory for the body of the response.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-26 12:40:15 +03:00
Giedrius Statkevičius 3727363b49
Merge pull request #8335 from pedro-stanaka/fix/flaky-unit-test-store-proxy
fix: make TestProxyStore_SeriesSlowStores less flaky by removing timing assertions
2025-06-26 12:20:19 +03:00
Giedrius Statkevičius 37254e5779
Merge pull request #8336 from thanos-io/lazyindexheader_fix
indexheader: fix race between lazy index header creation
2025-06-26 11:19:12 +03:00
Giedrius Statkevičius 4b31bbaa6b indexheader: create lazy header in singleflight
Creation of the index header shares the underlying storage so we should
use singleflight here to only create it once.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-26 10:18:07 +03:00
Giedrius Statkevičius d6ee898a06 indexheader: produce race in test
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-26 10:01:21 +03:00
Giedrius Statkevičius 5a95d13802
Merge pull request #8333 from thanos-io/repro_8224
e2e: add repro for 8224
2025-06-26 08:01:35 +03:00
Pedro Tanaka b54d293dbd
fix: make TestProxyStore_SeriesSlowStores less flaky by removing timing assertions
The TestProxyStore_SeriesSlowStores test was failing intermittently in CI due to
strict timing assertions that were sensitive to system load and scheduling variations.

The test now focuses on functional correctness rather than precise timing,
making it more reliable in CI environments while still validating the
proxy store's timeout and partial response behavior.

Signed-off-by: Pedro Tanaka <pedro.stanaka@gmail.com>
2025-06-25 23:09:47 +02:00
Giedrius Statkevičius dfcbfe7c40 e2e: add repro for 8224
Add repro for https://github.com/thanos-io/thanos/issues/8224. Fix in
follow up PRs.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-25 18:07:48 +03:00
Giedrius Statkevičius 8b738c55b1
Merge pull request #8331 from thanos-io/merge-release-0.39-to-main-v2
Merge release 0.39 to main
2025-06-25 15:25:36 +03:00
Giedrius Statkevičius 69624ecbf1 Merge branch 'main' into merge-release-0.39-to-main-v2
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-25 14:59:35 +03:00
Giedrius Statkevičius 0453c9b144
*: release 0.39.0 (#8330)
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-25 14:05:34 +03:00
Saswata Mukherjee 9c955d21df
e2e: Check rule group label works (#8322)
* e2e: Check rule group label works

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Fix fanout test

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2025-06-23 10:27:07 +01:00
Paul 7de9c13e5f
add rule tsdb.enable-native-histograms flag (#8321)
Signed-off-by: Paul Hsieh <supaulkawaii@gmail.com>
2025-06-23 10:06:00 +01:00
Giedrius Statkevičius a6c05e6df6
*: add CHANGELOG, update VERSION (#8320)
Prepare for 0.39.0-rc.0.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-20 07:12:19 +03:00
Giedrius Statkevičius 34a98c8efb
CHANGELOG: indicate release (#8319)
Indicate that 0.39.0 is in progress.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-19 17:59:12 +03:00
Giedrius Statkevičius 933f04f55e
query_frontend: only ready if downstream is ready (#8315)
We had an incident in prod where QFE was reporting that it was ready even
though the downstream didn't work due to a misconfigured load balancer.
In this PR I am proposing sending periodic requests to the downstream
to check whether it is working.

TestQueryFrontendTenantForward never worked so I deleted it.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-18 11:56:48 +03:00
dependabot[bot] f1c0f4b9b8
build(deps): bump github.com/KimMachineGun/automemlimit (#8312)
Bumps [github.com/KimMachineGun/automemlimit](https://github.com/KimMachineGun/automemlimit) from 0.7.2 to 0.7.3.
- [Release notes](https://github.com/KimMachineGun/automemlimit/releases)
- [Commits](https://github.com/KimMachineGun/automemlimit/compare/v0.7.2...v0.7.3)

---
updated-dependencies:
- dependency-name: github.com/KimMachineGun/automemlimit
  dependency-version: 0.7.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-18 12:35:30 +05:30
Hongcheng Zhu a6370c7cc6
Add Prometheus counters for pending write requests and series requests in Receive (#8308)
Signed-off-by: HC Zhu <hczhu.mtv@gmail.com>
Co-authored-by: HC Zhu (DB) <hc.zhu@databricks.com>
2025-06-17 10:46:12 +05:30
Hongcheng Zhu 8f715b0b6b
Query: limit LazyRetrieval memory buffer size (#8296)
* Limit lazyRespSet memory buffer size using a ring buffer

Signed-off-by: HC Zhu <hczhu.mtv@gmail.com>

* store: make heap a bit more consistent

Add len comparison to make it more consistent.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Fix linter complains

Signed-off-by: HC Zhu <hczhu.mtv@gmail.com>

---------

Signed-off-by: HC Zhu <hczhu.mtv@gmail.com>
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Co-authored-by: HC Zhu (DB) <hc.zhu@databricks.com>
Co-authored-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Co-authored-by: HC Zhu <hczhu.mtv@gmail.com>
2025-06-14 10:52:46 -07:00
Filip Petkovski 6c27396458
Merge pull request #8306 from GregSharpe1/main
[docs] Updating documentation around --compact flags
2025-06-13 08:53:04 +02:00
Greg Sharpe d1afea6a69 Updating the documentation to reflect the correct flags when using --compact.enable-vertical-compaction.
Signed-off-by: Greg Sharpe <git+me@gregsharpe.co.uk>
2025-06-13 08:28:42 +02:00
gabyf 03d5b6bc28
tools: fix tool bucket inspect output arg description (#8252)
* docs: fix tool bucket output arg description

Signed-off-by: gabyf <zweeking.tech@gmail.com>

* fix(tools_bucket): output description from cvs to csv

Signed-off-by: gabyf <zweeking.tech@gmail.com>

---------

Signed-off-by: gabyf <zweeking.tech@gmail.com>
2025-06-12 16:35:42 -07:00
Giedrius Statkevičius 8769b97c86
go.mod: update promql engine + Prom dep (#8305)
Update dependencies. Almost everything works except for
https://github.com/prometheus/prometheus/pull/16252.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-06-12 10:50:03 +03:00
Aaron Walker 26f6e64365
Revert capnp to v3.0.0-alpha (#8300)
cef0b02 caused a regression of !7944. This reverts the version upgrade to the previously working version

Signed-off-by: Aaron Walker <aaron@vcra.io>
2025-06-10 09:41:59 +05:30
dependabot[bot] 60533e4a22
build(deps): bump golang.org/x/time from 0.11.0 to 0.12.0 (#8302)
Bumps [golang.org/x/time](https://github.com/golang/time) from 0.11.0 to 0.12.0.
- [Commits](https://github.com/golang/time/compare/v0.11.0...v0.12.0)

---
updated-dependencies:
- dependency-name: golang.org/x/time
  dependency-version: 0.12.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-10 09:31:02 +05:30
dependabot[bot] 95a2b00f17
build(deps): bump github.com/alicebob/miniredis/v2 from 2.22.0 to 2.35.0 (#8303)
Bumps [github.com/alicebob/miniredis/v2](https://github.com/alicebob/miniredis) from 2.22.0 to 2.35.0.
- [Release notes](https://github.com/alicebob/miniredis/releases)
- [Changelog](https://github.com/alicebob/miniredis/blob/master/CHANGELOG.md)
- [Commits](https://github.com/alicebob/miniredis/compare/v2.22.0...v2.35.0)

---
updated-dependencies:
- dependency-name: github.com/alicebob/miniredis/v2
  dependency-version: 2.35.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-10 09:30:45 +05:30
dependabot[bot] 2ed24bdf5b
build(deps): bump github/codeql-action from 3.26.13 to 3.28.19 (#8304)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.26.13 to 3.28.19.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](f779452ac5...fca7ace96b)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.28.19
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-10 09:30:25 +05:30
Naman-Parlecha 23d60b8615
Fix: DataRace in TestEndpointSetUpdate_StrictEndpointMetadata test (#8288)
* fix: Fixing Unit Test TestEndpointSetUpdate_StrictEndpointMetadata

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

* revert: CHANGELOG.md

Signed-off-by: Naman-Parlecha <namanparlecha@gmail.com>

---------

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
Signed-off-by: Naman-Parlecha <namanparlecha@gmail.com>
2025-06-06 15:51:53 +03:00
Naman-Parlecha 290f16c0e9
Resolve GitHub Actions Failure (#8299)
* update: changing to new prometheus page

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

* fix: disable-admin-op flag

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

---------

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
2025-06-05 13:52:12 +03:00
Aaron Walker 4ad45948cd
Receive: Remove migration of legacy storage to multi-tsdb (#8289)
This has been in since 0.13 (~5 years ago). This fixes issues caused when the default tenant has no data and gets churned: the migration then assumes that per-tenant directories are actually blocks, leaving those blocks unqueryable.

Signed-off-by: Aaron Walker <aaron@vcra.io>
2025-06-03 16:57:57 +03:00
Daniel Blando 15b1ef2ead
shipper: allow shipper sync to skip corrupted blocks (#8259)
* Allow shipper sync to skip corrupted blocks

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>

* Move check to blockMetasFromOldest

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>

* Split metrics. Return error

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>

* fix test

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>

* Reorder shipper constructor variables

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>

* Use opts in shipper constructor

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>

* Fix typo

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>

---------

Signed-off-by: Daniel Deluiggi <ddeluigg@amazon.com>
2025-06-02 23:30:16 -07:00
Naman-Parlecha 2029c9bee0
store: Add --disable-admin-operations Flag to Store Gateway (#8284)
* fix(sidebar): maintain expanded state based on current page

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

* fixing changelog

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

* store: --disable-admin-operation flag

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

docs: Adding Flag details

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

updated changelog

refactor: changelog

Signed-off-by: Naman-Parlecha <namanparlecha@gmail.com>

---------

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
Signed-off-by: Naman-Parlecha <namanparlecha@gmail.com>
2025-06-01 15:26:58 -07:00
Saumya Shah 4e04420489
query: handle query.Analyze returning nil gracefully (#8199)
* fix: handle analyze returning nil gracefully

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* update CHANGELOG.md

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* fix format

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

---------

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>
2025-05-30 12:15:42 +03:00
Naman-Parlecha 36df30bbe8
fix: maintain expanded state based on current page (#8266)
* fix(sidebar): maintain expanded state based on current page

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

* fixing changelog

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>

---------

Signed-off-by: Naman-B-Parlecha <namanparlecha@gmail.com>
Signed-off-by: Naman-Parlecha <namanparlecha@gmail.com>
2025-05-30 12:07:23 +03:00
Saumya Shah 390fd0a023
query, query-frontend, ruler: Add support for flags to use promQL experimental functions & bump promql-engine (#8245)
* feat: add support for experimental functions, if enabled

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* fix tests

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* allow setting enable-feature flag in ruler

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* add flag info in docs

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* add CHANGELOG

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* add hidden flag to throw err on query fallback, red in tests ^_^

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* bump promql-engine to latest version/commit

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* format docs

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

---------

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>
2025-05-30 10:04:28 +03:00
Anna Tran 12649d8be7
Force sync writes to meta.json in case of host crash (#8282)
* Force sync writes to meta.json in case of host crash

Signed-off-by: Anna Tran <trananna@amazon.com>

* Update CHANGELOG for fsync meta.json

Signed-off-by: Anna Tran <trananna@amazon.com>

---------

Signed-off-by: Anna Tran <trananna@amazon.com>
2025-05-29 12:23:49 +03:00
Giedrius Statkevičius cef0b0200e
go.mod: mass update modules (#8277)
Maintenance task: let's update all modules.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-05-27 18:32:28 +03:00
Saumya Shah efc6eee8c6
query: fix query analyze to return appropriate results (#8262)
* call query analysis once the query is executed

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* refactor the analyze logic

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* send not analyzable warnings instead of returning err

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* add separate warnings for non-analyzable query state based on engine

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

---------

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>
2025-05-27 16:13:30 +03:00
Siavash Safi da421eaffe
Shipper: fix missing meta file errors (#8268)
- fix meta file read error check
- use proper logs for missing meta file vs. other read errors

Signed-off-by: Siavash Safi <siavash@cloudflare.com>
2025-05-23 11:46:09 +00:00
Giedrius Statkevičius d71a58cbd4
docs: fix receive page (#8267)
Fix the docs after the most recent merge.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-05-23 10:47:01 +00:00
Giedrius Statkevičius f847ff0262
receive: implement shuffle sharding (#8238)
See the documentation for details.

Closes #3821.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-05-22 11:08:23 +03:00
dronenb ec9601aa0e
feat(promu): add darwin/arm64 (#8263)
* feat(promu): add darwin/arm64

Signed-off-by: Ben Dronen <dronenb@users.noreply.github.com>

* fix(promu): just use darwin

Signed-off-by: Ben Dronen <dronenb@users.noreply.github.com>

---------

Signed-off-by: Ben Dronen <dronenb@users.noreply.github.com>
2025-05-22 10:04:57 +02:00
Michael Hoffmann 759773c4dc
shipper: delete unused functions (#8260)
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-05-21 08:18:52 +00:00
Giedrius Statkevičius 88092449cd
docs: volunteer as shepherd (#8249)
* docs: volunteer as shepherd

Release the next version in a few weeks.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>

* Fix formatting

Signed-off-by: Matej Gera <38492574+matej-g@users.noreply.github.com>

---------

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Signed-off-by: Matej Gera <38492574+matej-g@users.noreply.github.com>
Co-authored-by: Matej Gera <38492574+matej-g@users.noreply.github.com>
2025-05-15 14:56:00 +03:00
Ayoub Mrini 34b3d64034
test(tools_test.go/Test_CheckRules_Glob): take read-only current dirs into consideration while changing file permissions (#8014)

The process may not have the needed permissions on the file (not the owner, not root, and lacking the CAP_FOWNER capability)
to chmod it.

Signed-off-by: machine424 <ayoubmrini424@gmail.com>
2025-05-14 13:20:14 +01:00
dongjiang 242b5f6307
add otlp clientType (#8243)
Signed-off-by: dongjiang <dongjiang1989@126.com>
2025-05-13 14:18:41 +03:00
Giedrius Statkevičius aa3e4199db
e2e: disable some more flaky tests (#8241)
These are flaky hence disable them.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-05-09 16:46:23 +03:00
Giedrius Statkevičius 81b4260f5f
reloader: disable some flaky tests (#8240)
Disabling some flaky tests.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-05-08 15:59:24 +03:00
Saumya Shah 2dfc749a85
UI: bump codemirror-promql dependency to latest version (#8230)
* bump codemirror-promql react dep to latest version

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* fix lint errors, build react-app

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* sync ui change of input expression

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* revert build files

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

* build and update few warnings

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>

---------

Signed-off-by: Saumya Shah <saumyabshah90@gmail.com>
2025-05-07 11:20:49 +01:00
Philip Gough 2a5a856e34
tools: Extend bucket ls options (#8225)
* tools: Extend bucket ls command with min and max time, selector config and timeout options

Signed-off-by: Philip Gough <philip.p.gough@gmail.com>

* make: docs

Signed-off-by: Philip Gough <philip.p.gough@gmail.com>

Update cmd/thanos/tools_bucket.go

Co-authored-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: Philip Gough <pgough@redhat.com>

Update cmd/thanos/tools_bucket.go

Co-authored-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: Philip Gough <pgough@redhat.com>

---------

Signed-off-by: Philip Gough <philip.p.gough@gmail.com>
Signed-off-by: Philip Gough <pgough@redhat.com>
Co-authored-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2025-04-25 10:33:34 +00:00
Giedrius Statkevičius cff147dbc0
receive: remove Get() method from hashring (#8226)
Get() is equivalent to GetN(1) so remove it. It's not used.

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
2025-04-25 09:59:37 +00:00
Saswata Mukherjee 7d7ea650b7
Receive: Ensure forward/replication metrics are incremented in err cases (#8212)
* Ensure forward/replication metrics are incremented in err cases

This commit ensures forward and replication metrics are incremented with
err labels.

This seemed to be missing, came across this whilst working on a
dashboard.

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

* Add changelog

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>

---------

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
2025-04-22 11:35:34 +00:00
Andrew Reilly 92db7aabb1
Update query.md documentation where example uses --query.tenant-default-id flag instead of --query.default-tenant-id (#8210)
Signed-off-by: Andrew Reilly <adr@maas.ca>
2025-04-22 11:27:49 +03:00
Filip Petkovski 66f54ac88d
Merge pull request #8216 from yuchen-db/fix-iter-race
Fix Pull iterator race between next() and stop()
2025-04-22 08:19:10 +02:00
Yuchen Wang a8220d7317 simplify unit test
Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>
2025-04-20 17:54:47 -07:00
Yuchen Wang 909c08fa98 add comments
Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>
2025-04-20 17:46:38 -07:00
Yuchen Wang 6663bb01ac update changelog
Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>
2025-04-20 17:16:33 -07:00
Yuchen Wang d7876b4303 fix unit test
Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>
2025-04-20 16:55:57 -07:00
Yuchen Wang 6f556d2bbb add unit test
Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>
2025-04-20 15:46:54 -07:00
Yuchen Wang 0dcc9e9ccd add changelog
Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>
2025-04-20 15:46:54 -07:00
Yuchen Wang f168dc0cbb fix Pull iter race between next() and stop()
Signed-off-by: Yuchen Wang <yuchen.wang@databricks.com>
2025-04-20 15:46:54 -07:00
dependabot[bot] 8273ad013c
build(deps): bump github.com/golang-jwt/jwt/v5 from 5.2.1 to 5.2.2 (#8164)
Bumps [github.com/golang-jwt/jwt/v5](https://github.com/golang-jwt/jwt) from 5.2.1 to 5.2.2.
- [Release notes](https://github.com/golang-jwt/jwt/releases)
- [Changelog](https://github.com/golang-jwt/jwt/blob/main/VERSION_HISTORY.md)
- [Commits](https://github.com/golang-jwt/jwt/compare/v5.2.1...v5.2.2)

---
updated-dependencies:
- dependency-name: github.com/golang-jwt/jwt/v5
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-17 15:08:12 +01:00
Michael Hoffmann 31c6115317
Query: fix partial response for distributed instant query (#8211)
This commit fixes a typo in partial response handling for distributed
instant queries.

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-04-17 08:06:42 +00:00
Aaron Walker c0b5500cb5
Unhide tsdb.enable-native-histograms flag in receive (#8202)
Signed-off-by: Aaron Walker <aaron@vcra.io>
2025-04-11 13:33:24 +02:00
Michael Hoffmann ce2b51f93e
Sidecar: increase default prometheus timeout (#8192)
Adjust the default get-config timeout to match the default get-config
interval.

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-04-07 15:29:22 +00:00
Naohiro Okada 1a559f9de8
fix changelog markdown. (#8190)
Signed-off-by: naohiroo <naohiro.dev@gmail.com>
2025-04-04 14:23:16 +02:00
Michael Hoffmann b2f5ee44a7
merge release 0.38.0 to main (#8186)
* Changelog: cut release 0.38-rc.0 (#8174)

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>

* Changelog: cut release 0.38.0-rc.1 (#8180)

* Query: fix endpointset setup

This commit fixes an issue where we add non-strict, non-group endpoints
to the endpointset twice, once with resolved addresses from the dns
provider and once with its dns prefix.

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>

* deps: bump promql-engine (#8181)

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>

* Changelog: cut release 0.38-rc.1

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>

---------

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>

* Changelog: cut release 0.38 (#8185)

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>

---------

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-04-04 06:10:48 +00:00
Michael Hoffmann 08e5907cba
deps: bump promql-engine (#8181)
Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-03-31 13:22:35 +00:00
Michael Hoffmann 2fccdfbf5a
Query: fix endpointset setup (#8175)
This commit fixes an issue where we add non-strict, non-group endpoints
to the endpointset twice, once with resolved addresses from the dns
provider and once with its dns prefix.

Signed-off-by: Michael Hoffmann <mhoffmann@cloudflare.com>
2025-03-27 07:02:21 +00:00
Thibault Mange ade0aed6f4
remove internal links
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>
2024-09-19 13:29:18 +02:00
Thibault Mange a9ae3070b9
fix links
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>
2024-09-19 13:29:18 +02:00
Thibault Mange 62ec424747
format
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>
2024-09-19 13:29:18 +02:00
Thibault Mange a631728945
fix img size
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>
2024-09-19 13:29:18 +02:00
Thibault Mange 38a98c7ec0
add store limits
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>
2024-09-19 13:29:18 +02:00
Thibault Mange e2fb8c034b
fix typo
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>
2024-09-19 13:29:18 +02:00
Thibault Mange 72a4952f48
add part II
Signed-off-by: Thibault Mange <22740367+thibaultmg@users.noreply.github.com>
2024-09-19 13:29:13 +02:00
197 changed files with 6345 additions and 5090 deletions

View File

@@ -48,7 +48,7 @@ jobs:
       # Initializes the CodeQL tools for scanning.
       - name: Initialize CodeQL
-        uses: github/codeql-action/init@f779452ac5af1c261dce0346a8f964149f49322b # v3.26.13
+        uses: github/codeql-action/init@fca7ace96b7d713c7035871441bd52efbe39e27e # v3.28.19
         with:
           languages: ${{ matrix.language }}
           config-file: ./.github/codeql/codeql-config.yml
@@ -60,7 +60,7 @@ jobs:
       # Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
       # If this step fails, then you should remove it and run the build manually (see below)
       - name: Autobuild
-        uses: github/codeql-action/autobuild@f779452ac5af1c261dce0346a8f964149f49322b # v3.26.13
+        uses: github/codeql-action/autobuild@fca7ace96b7d713c7035871441bd52efbe39e27e # v3.28.19
       # Command-line programs to run using the OS shell.
       # 📚 https://git.io/JvXDl
@@ -74,4 +74,4 @@ jobs:
       # make release
       - name: Perform CodeQL Analysis
-        uses: github/codeql-action/analyze@f779452ac5af1c261dce0346a8f964149f49322b # v3.26.13
+        uses: github/codeql-action/analyze@fca7ace96b7d713c7035871441bd52efbe39e27e # v3.28.19

View File

@@ -9,6 +9,9 @@ run:
   # exit code when at least one issue was found, default is 1
   issues-exit-code: 1
 
+  build-tags:
+    - slicelabels
+
 # output configuration options
 output:
   # The formats used to render issues.
@@ -57,7 +60,7 @@ issues:
     # We don't check metrics naming in the tests.
     - path: _test\.go
       linters:
         - promlinter
     # These are not being checked since these methods exist
     # so that no one else could implement them.
     - linters:

View File

@@ -6,7 +6,11 @@ build:
   binaries:
     - name: thanos
       path: ./cmd/thanos
-      flags: -a -tags netgo
+      flags: -a
+      tags:
+        - netgo
+        - slicelabels
   ldflags: |
     -X github.com/prometheus/common/version.Version={{.Version}}
     -X github.com/prometheus/common/version.Revision={{.Revision}}
@@ -16,7 +20,7 @@ build:
 crossbuild:
   platforms:
     - linux/amd64
-    - darwin/amd64
+    - darwin
     - linux/arm64
     - windows/amd64
     - freebsd/amd64

View File

@@ -10,15 +10,66 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re
 ## Unreleased
 
+### Fixed
+
 ### Added
 
+- [#8366](https://github.com/thanos-io/thanos/pull/8366) Store: optionally ignore Parquet migrated blocks
+- [#8359](https://github.com/thanos-io/thanos/pull/8359) Tools: add `--shipper.upload-compacted` flag for uploading compacted blocks to bucket upload-blocks
+
 ### Changed
 
+- [#8370](https://github.com/thanos-io/thanos/pull/8370) Query: announced labelset now reflects relabel-config
+
 ### Removed
 
-### Fixed
+### [v0.39.2](https://github.com/thanos-io/thanos/tree/release-0.39) - 2025 07 17
 
-## [v0.38.0 - <in progress>](https://github.com/thanos-io/thanos/tree/release-0.38)
+### Fixed
+
+- [#8374](https://github.com/thanos-io/thanos/pull/8374) Query: fix panic when concurrently accessing annotations map
+- [#8375](https://github.com/thanos-io/thanos/pull/8375) Query: fix native histogram buckets in distributed queries
+
+### [v0.39.1](https://github.com/thanos-io/thanos/tree/release-0.39) - 2025 07 01
+
+Fixes a memory leak issue on query-frontend. The bug only affects v0.39.0.
+
+### Fixed
+
+- [#8349](https://github.com/thanos-io/thanos/pull/8349) Query-Frontend: properly clean up resources
+- [#8338](https://github.com/thanos-io/thanos/pull/8338) Query-Frontend: use original roundtripper + close immediately
+
+## [v0.39.0](https://github.com/thanos-io/thanos/tree/release-0.39) - 2025 06 25
+
+In short: there are a bunch of fixes and small improvements. The shining items in this release are memory usage improvements in Thanos Query and shuffle sharding support in Thanos Receiver. Information about shuffle sharding support is available in the documentation. Thank you to all contributors!
+
+### Added
+
+- [#8308](https://github.com/thanos-io/thanos/pull/8308) Receive: Prometheus counters for pending write requests and series requests
+- [#8225](https://github.com/thanos-io/thanos/pull/8225) tools: Extend bucket ls options.
+- [#8238](https://github.com/thanos-io/thanos/pull/8238) Receive: add shuffle sharding support
+- [#8284](https://github.com/thanos-io/thanos/pull/8284) Store: Add `--disable-admin-operations` Flag to Store Gateway
+- [#8245](https://github.com/thanos-io/thanos/pull/8245) Querier/Query-Frontend/Ruler: Add `--enable-feature=promql-experimental-functions` flag option to enable using promQL experimental functions in respective Thanos components
+- [#8259](https://github.com/thanos-io/thanos/pull/8259) Shipper: Add `--shipper.skip-corrupted-blocks` flag to allow `Sync()` to continue upload when finding a corrupted block
+
+### Changed
+
+- [#8282](https://github.com/thanos-io/thanos/pull/8282) Force sync writes to meta.json in case of host crash
+- [#8192](https://github.com/thanos-io/thanos/pull/8192) Sidecar: fix default get config timeout
+- [#8202](https://github.com/thanos-io/thanos/pull/8202) Receive: Unhide `--tsdb.enable-native-histograms` flag
+- [#8315](https://github.com/thanos-io/thanos/pull/8315) Query-Frontend: only ready if downstream is ready
+
+### Removed
+
+- [#8289](https://github.com/thanos-io/thanos/pull/8289) Receive: *breaking :warning:* Removed migration of legacy-TSDB to multi-TSDB. Ensure you are running version >0.13
+
+### Fixed
+
+- [#8199](https://github.com/thanos-io/thanos/pull/8199) Query: handle panics or nil pointer dereference in querier gracefully when query analyze returns nil
+- [#8211](https://github.com/thanos-io/thanos/pull/8211) Query: fix panic on nested partial response in distributed instant query
+- [#8216](https://github.com/thanos-io/thanos/pull/8216) Query/Receive: fix iter race between `next()` and `stop()` introduced in https://github.com/thanos-io/thanos/pull/7821.
+- [#8212](https://github.com/thanos-io/thanos/pull/8212) Receive: Ensure forward/replication metrics are incremented in err cases
+- [#8296](https://github.com/thanos-io/thanos/pull/8296) Query: limit LazyRetrieval memory buffer size
+
+## [v0.38.0](https://github.com/thanos-io/thanos/tree/release-0.38) - 03.04.2025
 
 ### Fixed
 
 - [#8091](https://github.com/thanos-io/thanos/pull/8091) *: Add POST into allowed CORS methods header
@@ -36,6 +87,8 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re
 - [#8131](https://github.com/thanos-io/thanos/pull/8131) Store Gateway: Optimize regex matchers for .* and .+. #8131
 - [#7808](https://github.com/thanos-io/thanos/pull/7808) Query: Support chain deduplication algorithm.
 - [#8158](https://github.com/thanos-io/thanos/pull/8158) Rule: Add support for query offset.
+- [#8110](https://github.com/thanos-io/thanos/pull/8110) Compact: implement native histogram downsampling.
+- [#7996](https://github.com/thanos-io/thanos/pull/7996) Receive: Add OTLP endpoint.
 
 ### Changed
@@ -43,6 +96,7 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re
 - [#7012](https://github.com/thanos-io/thanos/pull/7012) Query: Automatically adjust `max_source_resolution` based on promql query to avoid querying data from higher resolution resulting empty results.
 - [#8118](https://github.com/thanos-io/thanos/pull/8118) Query: Bumped promql-engine
 - [#8135](https://github.com/thanos-io/thanos/pull/8135) Query: respect partial response in distributed engine
+- [#8181](https://github.com/thanos-io/thanos/pull/8181) Deps: bump promql engine
 
 ### Removed
@@ -52,6 +106,7 @@ We use *breaking :warning:* to mark changes that are not backward compatible (re
 - [#7970](https://github.com/thanos-io/thanos/pull/7970) Sidecar: Respect min-time setting.
 - [#7962](https://github.com/thanos-io/thanos/pull/7962) Store: Fix potential deadlock in hedging request.
+- [#8175](https://github.com/thanos-io/thanos/pull/8175) Query: fix endpointset setup
 
 ### Added

View File

@@ -5,7 +5,7 @@ WORKDIR $GOPATH/src/github.com/thanos-io/thanos
 COPY . $GOPATH/src/github.com/thanos-io/thanos
 
-RUN CGO_ENABLED=1 go build -o $GOBIN/thanos -race ./cmd/thanos
+RUN CGO_ENABLED=1 go build -tags slicelabels -o $GOBIN/thanos -race ./cmd/thanos
 
 # -----------------------------------------------------------------------------
 FROM golang:1.24.0

View File

@@ -319,7 +319,7 @@ test: export THANOS_TEST_ALERTMANAGER_PATH= $(ALERTMANAGER)
 test: check-git install-tool-deps
	@echo ">> install thanos GOOPTS=${GOOPTS}"
	@echo ">> running unit tests (without /test/e2e). Do export THANOS_TEST_OBJSTORE_SKIP=GCS,S3,AZURE,SWIFT,COS,ALIYUNOSS,BOS,OCI,OBS if you want to skip e2e tests against all real store buckets. Current value: ${THANOS_TEST_OBJSTORE_SKIP}"
-	@go test -race -timeout 15m $(shell go list ./... | grep -v /vendor/ | grep -v /test/e2e);
+	@go test -tags slicelabels -race -timeout 15m $(shell go list ./... | grep -v /vendor/ | grep -v /test/e2e);
 
 .PHONY: test-local
 test-local: ## Runs test excluding tests for ALL object storage integrations.
@@ -341,9 +341,9 @@ test-e2e: docker-e2e $(GOTESPLIT)
 # * If you want to limit CPU time available in e2e tests then pass E2E_DOCKER_CPUS environment variable. For example, E2E_DOCKER_CPUS=0.05 limits CPU time available
 # to spawned Docker containers to 0.05 cores.
	@if [ -n "$(SINGLE_E2E_TEST)" ]; then \
-		$(GOTESPLIT) -total ${GH_PARALLEL} -index ${GH_INDEX} ./test/e2e -- -run $(SINGLE_E2E_TEST) ${GOTEST_OPTS}; \
+		$(GOTESPLIT) -total ${GH_PARALLEL} -index ${GH_INDEX} ./test/e2e -- -tags slicelabels -run $(SINGLE_E2E_TEST) ${GOTEST_OPTS}; \
	else \
-		$(GOTESPLIT) -total ${GH_PARALLEL} -index ${GH_INDEX} ./test/e2e/... -- ${GOTEST_OPTS}; \
+		$(GOTESPLIT) -total ${GH_PARALLEL} -index ${GH_INDEX} ./test/e2e/... -- -tags slicelabels ${GOTEST_OPTS}; \
	fi
 
 .PHONY: test-e2e-local
@@ -418,7 +418,7 @@ github.com/prometheus/prometheus/promql/parser.{ParseExpr,ParseMetricSelector}=g
	io/ioutil.{Discard,NopCloser,ReadAll,ReadDir,ReadFile,TempDir,TempFile,Writefile}" $(shell go list ./... | grep -v "internal/cortex")
	@$(FAILLINT) -paths "fmt.{Print,Println,Sprint}" -ignore-tests ./...
	@echo ">> linting all of the Go files GOGC=${GOGC}"
-	@$(GOLANGCI_LINT) run
+	@$(GOLANGCI_LINT) run --build-tags=slicelabels
	@echo ">> ensuring Copyright headers"
	@go run ./scripts/copyright
	@echo ">> ensuring generated proto files are up to date"

View File

@@ -1 +1 @@
-0.39.0-dev
+0.40.0-dev

View File

@@ -136,7 +136,7 @@ func (pc *prometheusConfig) registerFlag(cmd extkingpin.FlagClause) *prometheusC
		Default("30s").DurationVar(&pc.getConfigInterval)
	cmd.Flag("prometheus.get_config_timeout",
		"Timeout for getting Prometheus config").
-		Default("5s").DurationVar(&pc.getConfigTimeout)
+		Default("30s").DurationVar(&pc.getConfigTimeout)
	pc.httpClient = extflag.RegisterPathOrContent(
		cmd,
		"prometheus.http-client",
@@ -203,6 +203,7 @@ type shipperConfig struct {
	uploadCompacted       bool
	ignoreBlockSize       bool
	allowOutOfOrderUpload bool
+	skipCorruptedBlocks   bool
	hashFunc              string
	metaFileName          string
 }
@@ -219,6 +220,11 @@ func (sc *shipperConfig) registerFlag(cmd extkingpin.FlagClause) *shipperConfig
		"This can trigger compaction without those blocks and as a result will create an overlap situation. Set it to true if you have vertical compaction enabled and wish to upload blocks as soon as possible without caring"+
		"about order.").
		Default("false").Hidden().BoolVar(&sc.allowOutOfOrderUpload)
+	cmd.Flag("shipper.skip-corrupted-blocks",
+		"If true, shipper will skip corrupted blocks in the given iteration and retry later. This means that some newer blocks might be uploaded sooner than older blocks."+
+		"This can trigger compaction without those blocks and as a result will create an overlap situation. Set it to true if you have vertical compaction enabled and wish to upload blocks as soon as possible without caring"+
+		"about order.").
+		Default("false").Hidden().BoolVar(&sc.skipCorruptedBlocks)
	cmd.Flag("hash-func", "Specify which hash function to use when calculating the hashes of produced files. If no function has been specified, it does not happen. This permits avoiding downloading some files twice albeit at some performance cost. Possible values are: \"\", \"SHA256\".").
		Default("").EnumVar(&sc.hashFunc, "SHA256", "")
	cmd.Flag("shipper.meta-file-name", "the file to store shipper metadata in").Default(shipper.DefaultMetaFilename).StringVar(&sc.metaFileName)

View File

@@ -15,7 +15,8 @@ import (
	"github.com/go-kit/log"
	"github.com/go-kit/log/level"
	"github.com/oklog/run"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
+
	"github.com/pkg/errors"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"

View File

@@ -42,9 +42,10 @@ type fileContent interface {
 }
 
 type endpointSettings struct {
	Strict        bool   `yaml:"strict"`
	Group         bool   `yaml:"group"`
	Address       string `yaml:"address"`
+	ServiceConfig string `yaml:"service_config"`
 }
 
 type EndpointConfig struct {
@@ -115,6 +116,9 @@ func validateEndpointConfig(cfg EndpointConfig) error {
		if dns.IsDynamicNode(ecfg.Address) && ecfg.Strict {
			return errors.Newf("%s is a dynamically specified endpoint i.e. it uses SD and that is not permitted under strict mode.", ecfg.Address)
		}
+		if !ecfg.Group && len(ecfg.ServiceConfig) != 0 {
+			return errors.Newf("%s service_config is only valid for endpoint groups.", ecfg.Address)
+		}
	}
	return nil
 }
@@ -298,8 +302,7 @@ func setupEndpointSet(
	addresses := make([]string, 0, len(endpointConfig.Endpoints))
	for _, ecfg := range endpointConfig.Endpoints {
-		if addr := ecfg.Address; !ecfg.Group && !ecfg.Strict {
-			// originally only "--endpoint" addresses got resolved
+		if addr := ecfg.Address; dns.IsDynamicNode(addr) && !ecfg.Group {
			addresses = append(addresses, addr)
		}
	}
@@ -318,14 +321,16 @@ func setupEndpointSet(
		endpointConfig := configProvider.config()
 
		specs := make([]*query.GRPCEndpointSpec, 0)
+		// groups and non dynamic endpoints
		for _, ecfg := range endpointConfig.Endpoints {
			strict, group, addr := ecfg.Strict, ecfg.Group, ecfg.Address
			if group {
-				specs = append(specs, query.NewGRPCEndpointSpec(fmt.Sprintf("thanos:///%s", addr), strict, append(dialOpts, extgrpc.EndpointGroupGRPCOpts()...)...))
-			} else {
+				specs = append(specs, query.NewGRPCEndpointSpec(fmt.Sprintf("thanos:///%s", addr), strict, append(dialOpts, extgrpc.EndpointGroupGRPCOpts(ecfg.ServiceConfig)...)...))
+			} else if !dns.IsDynamicNode(addr) {
				specs = append(specs, query.NewGRPCEndpointSpec(addr, strict, dialOpts...))
			}
		}
+		// dynamic endpoints
		for _, addr := range dnsEndpointProvider.Addresses() {
			specs = append(specs, query.NewGRPCEndpointSpec(addr, false, dialOpts...))
		}
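The endpoint setup change above splits endpoints into three buckets: endpoint groups (load-balanced behind a `thanos:///` target), static addresses (added directly), and dynamically discovered addresses (added only once, via the DNS provider), which removes the double registration the commit message describes. A simplified, hypothetical sketch of that classification — `isDynamic` stands in for `dns.IsDynamicNode`, and all names are illustrative rather than the real Thanos API:

```go
package main

import (
	"fmt"
	"strings"
)

type endpoint struct {
	addr  string
	group bool
}

// isDynamic stands in for dns.IsDynamicNode: addresses with a service
// discovery prefix (dns+, dnssrv+, ...) resolve to a changing IP set.
func isDynamic(addr string) bool {
	return strings.HasPrefix(addr, "dns+") || strings.HasPrefix(addr, "dnssrv+")
}

// buildSpecs mirrors the fixed flow: groups get a thanos:/// target,
// static addresses are added directly, and dynamic addresses appear
// only through the resolved list, so nothing is registered twice.
func buildSpecs(endpoints []endpoint, resolved []string) []string {
	var specs []string
	for _, e := range endpoints {
		switch {
		case e.group:
			specs = append(specs, "thanos:///"+e.addr)
		case !isDynamic(e.addr):
			specs = append(specs, e.addr)
		}
	}
	for _, addr := range resolved { // addresses produced by the DNS provider
		specs = append(specs, addr)
	}
	return specs
}

func main() {
	eps := []endpoint{
		{addr: "store:10901"},
		{addr: "dns+stores.svc:10901"},
		{addr: "rulers", group: true},
	}
	fmt.Println(buildSpecs(eps, []string{"10.0.0.1:10901"}))
	// prints [store:10901 thanos:///rulers 10.0.0.1:10901]
}
```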

View File

@@ -15,6 +15,7 @@ import (
	"runtime/debug"
	"syscall"
 
+	"github.com/alecthomas/kingpin/v2"
	"github.com/go-kit/log"
	"github.com/go-kit/log/level"
	"github.com/oklog/run"
@@ -25,7 +26,6 @@ import (
	versioncollector "github.com/prometheus/client_golang/prometheus/collectors/version"
	"github.com/prometheus/common/version"
	"go.uber.org/automaxprocs/maxprocs"
-	"gopkg.in/alecthomas/kingpin.v2"
 
	"github.com/thanos-io/thanos/pkg/extkingpin"
	"github.com/thanos-io/thanos/pkg/logging"

View File

@@ -14,7 +14,8 @@ import (
	"time"
 
	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
+
	"github.com/prometheus/client_golang/prometheus"
	promtest "github.com/prometheus/client_golang/prometheus/testutil"
	"github.com/prometheus/prometheus/model/labels"
@@ -32,6 +33,11 @@ type erroringBucket struct {
	bkt objstore.InstrumentedBucket
 }
 
+// Provider returns the provider of the bucket.
+func (b *erroringBucket) Provider() objstore.ObjProvider {
+	return b.bkt.Provider()
+}
+
 func (b *erroringBucket) Close() error {
	return b.bkt.Close()
 }
@@ -90,8 +96,8 @@ func (b *erroringBucket) Attributes(ctx context.Context, name string) (objstore.
 // Upload the contents of the reader as an object into the bucket.
 // Upload should be idempotent.
-func (b *erroringBucket) Upload(ctx context.Context, name string, r io.Reader) error {
-	return b.bkt.Upload(ctx, name, r)
+func (b *erroringBucket) Upload(ctx context.Context, name string, r io.Reader, opts ...objstore.ObjectUploadOption) error {
+	return b.bkt.Upload(ctx, name, r, opts...)
 }
 
 // Delete removes the object with the given name.
@@ -133,9 +139,9 @@ func TestRegression4960_Deadlock(t *testing.T) {
		id, err = e2eutil.CreateBlock(
			ctx,
			dir,
-			[]labels.Labels{{{Name: "a", Value: "1"}}},
+			[]labels.Labels{labels.FromStrings("a", "1")},
			1, 0, downsample.ResLevel1DownsampleRange+1, // Pass the minimum ResLevel1DownsampleRange check.
-			labels.Labels{{Name: "e1", Value: "1"}},
+			labels.FromStrings("e1", "1"),
			downsample.ResLevel0, metadata.NoneFunc, nil)
		testutil.Ok(t, err)
		testutil.Ok(t, block.Upload(ctx, logger, bkt, path.Join(dir, id.String()), metadata.NoneFunc))
@@ -144,9 +150,9 @@ func TestRegression4960_Deadlock(t *testing.T) {
		id2, err = e2eutil.CreateBlock(
			ctx,
			dir,
-			[]labels.Labels{{{Name: "a", Value: "2"}}},
+			[]labels.Labels{labels.FromStrings("a", "2")},
			1, 0, downsample.ResLevel1DownsampleRange+1, // Pass the minimum ResLevel1DownsampleRange check.
-			labels.Labels{{Name: "e1", Value: "2"}},
+			labels.FromStrings("e1", "2"),
			downsample.ResLevel0, metadata.NoneFunc, nil)
		testutil.Ok(t, err)
		testutil.Ok(t, block.Upload(ctx, logger, bkt, path.Join(dir, id2.String()), metadata.NoneFunc))
@@ -155,9 +161,9 @@ func TestRegression4960_Deadlock(t *testing.T) {
		id3, err = e2eutil.CreateBlock(
			ctx,
			dir,
-			[]labels.Labels{{{Name: "a", Value: "2"}}},
+			[]labels.Labels{labels.FromStrings("a", "2")},
			1, 0, downsample.ResLevel1DownsampleRange+1, // Pass the minimum ResLevel1DownsampleRange check.
-			labels.Labels{{Name: "e1", Value: "2"}},
+			labels.FromStrings("e1", "2"),
			downsample.ResLevel0, metadata.NoneFunc, nil)
		testutil.Ok(t, err)
		testutil.Ok(t, block.Upload(ctx, logger, bkt, path.Join(dir, id3.String()), metadata.NoneFunc))
@@ -195,9 +201,9 @@ func TestCleanupDownsampleCacheFolder(t *testing.T) {
	id, err = e2eutil.CreateBlock(
		ctx,
		dir,
-		[]labels.Labels{{{Name: "a", Value: "1"}}},
+		[]labels.Labels{labels.FromStrings("a", "1")},
		1, 0, downsample.ResLevel1DownsampleRange+1, // Pass the minimum ResLevel1DownsampleRange check.
-		labels.Labels{{Name: "e1", Value: "1"}},
+		labels.FromStrings("e1", "1"),
		downsample.ResLevel0, metadata.NoneFunc, nil)
	testutil.Ok(t, err)
	testutil.Ok(t, block.Upload(ctx, logger, bkt, path.Join(dir, id.String()), metadata.NoneFunc))
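The test changes above swap `labels.Labels` composite literals for `labels.FromStrings`, which is required once the internal labels representation becomes opaque to callers (as with the `slicelabels` build tag threaded through this changeset). A tiny, hypothetical sketch of the constructor pattern — alternating name/value pairs returned as a sorted set, with the storage layout hidden behind the function; names here are illustrative, not the Prometheus API:

```go
package main

import (
	"fmt"
	"sort"
)

type label struct{ Name, Value string }

// fromStrings mimics the labels.FromStrings idea: accept alternating
// name/value pairs and return a sorted label set, so callers never
// depend on the concrete storage layout.
func fromStrings(ss ...string) []label {
	if len(ss)%2 != 0 {
		panic("fromStrings: odd number of arguments")
	}
	ls := make([]label, 0, len(ss)/2)
	for i := 0; i < len(ss); i += 2 {
		ls = append(ls, label{Name: ss[i], Value: ss[i+1]})
	}
	// Label sets are kept sorted by name, matching Prometheus semantics.
	sort.Slice(ls, func(i, j int) bool { return ls[i].Name < ls[j].Name })
	return ls
}

func main() {
	fmt.Println(fromStrings("e1", "1", "a", "2"))
	// prints [{a 2} {e1 1}]
}
```

Constructing through a function rather than a struct literal is what lets the library switch representations (struct slice vs. packed string) behind a build tag without breaking callers.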

View File

@ -20,6 +20,7 @@ import (
"github.com/prometheus/common/route" "github.com/prometheus/common/route"
"github.com/prometheus/prometheus/model/labels" "github.com/prometheus/prometheus/model/labels"
"github.com/prometheus/prometheus/promql" "github.com/prometheus/prometheus/promql"
"github.com/prometheus/prometheus/promql/parser"
apiv1 "github.com/thanos-io/thanos/pkg/api/query" apiv1 "github.com/thanos-io/thanos/pkg/api/query"
"github.com/thanos-io/thanos/pkg/api/query/querypb" "github.com/thanos-io/thanos/pkg/api/query/querypb"
@ -53,9 +54,10 @@ import (
) )
const ( const (
promqlNegativeOffset = "promql-negative-offset" promqlNegativeOffset = "promql-negative-offset"
promqlAtModifier = "promql-at-modifier" promqlAtModifier = "promql-at-modifier"
queryPushdown = "query-pushdown" queryPushdown = "query-pushdown"
promqlExperimentalFunctions = "promql-experimental-functions"
) )
// registerQuery registers a query command. // registerQuery registers a query command.
@ -81,6 +83,8 @@ func registerQuery(app *extkingpin.App) {
defaultEngine := cmd.Flag("query.promql-engine", "Default PromQL engine to use.").Default(string(apiv1.PromqlEnginePrometheus)). defaultEngine := cmd.Flag("query.promql-engine", "Default PromQL engine to use.").Default(string(apiv1.PromqlEnginePrometheus)).
Enum(string(apiv1.PromqlEnginePrometheus), string(apiv1.PromqlEngineThanos))
+disableQueryFallback := cmd.Flag("query.disable-fallback", "If set then thanos engine will throw an error if query falls back to prometheus engine").Hidden().Default("false").Bool()
extendedFunctionsEnabled := cmd.Flag("query.enable-x-functions", "Whether to enable extended rate functions (xrate, xincrease and xdelta). Only has effect when used with Thanos engine.").Default("false").Bool()
promqlQueryMode := cmd.Flag("query.mode", "PromQL query mode. One of: local, distributed.").
	Default(string(apiv1.PromqlQueryModeLocal)).
@@ -135,7 +139,7 @@ func registerQuery(app *extkingpin.App) {
activeQueryDir := cmd.Flag("query.active-query-path", "Directory to log currently active queries in the queries.active file.").Default("").String()
-featureList := cmd.Flag("enable-feature", "Comma separated experimental feature names to enable.The current list of features is empty.").Hidden().Default("").Strings()
+featureList := cmd.Flag("enable-feature", "Comma separated feature names to enable. Valid options for now: promql-experimental-functions (enables promql experimental functions in query)").Default("").Strings()
enableExemplarPartialResponse := cmd.Flag("exemplar.partial-response", "Enable partial response for exemplar endpoint. --no-exemplar.partial-response for disabling.").
	Hidden().Default("true").Bool()
@@ -198,6 +202,9 @@ func registerQuery(app *extkingpin.App) {
strictEndpointGroups := extkingpin.Addrs(cmd.Flag("endpoint-group-strict", "(Deprecated, Experimental): DNS name of statically configured Thanos API server groups (repeatable) that are always used, even if the health check fails.").PlaceHolder("<endpoint-group-strict>"))
+lazyRetrievalMaxBufferedResponses := cmd.Flag("query.lazy-retrieval-max-buffered-responses", "The lazy retrieval strategy can buffer up to this number of responses. This is to limit the memory usage. This flag takes effect only when the lazy retrieval strategy is enabled.").
+	Default("20").Hidden().Int()
var storeRateLimits store.SeriesSelectLimits
storeRateLimits.RegisterFlags(cmd)
@@ -208,6 +215,10 @@ func registerQuery(app *extkingpin.App) {
}
for _, feature := range *featureList {
+	if feature == promqlExperimentalFunctions {
+		parser.EnableExperimentalFunctions = true
+		level.Info(logger).Log("msg", "Experimental PromQL functions enabled.", "option", promqlExperimentalFunctions)
+	}
	if feature == promqlAtModifier {
		level.Warn(logger).Log("msg", "This option for --enable-feature is now permanently enabled and therefore a no-op.", "option", promqlAtModifier)
	}
@@ -225,7 +236,6 @@ func registerQuery(app *extkingpin.App) {
}
grpcLogOpts, logFilterMethods, err := logging.ParsegRPCOptions(reqLogConfig)
if err != nil {
	return errors.Wrap(err, "error while parsing config for request logging")
}
@@ -331,12 +341,14 @@ func registerQuery(app *extkingpin.App) {
	store.NewTSDBSelector(tsdbSelector),
	apiv1.PromqlEngineType(*defaultEngine),
	apiv1.PromqlQueryMode(*promqlQueryMode),
+	*disableQueryFallback,
	*tenantHeader,
	*defaultTenant,
	*tenantCertField,
	*enforceTenancy,
	*tenantLabel,
	*queryDistributedWithOverlappingInterval,
+	*lazyRetrievalMaxBufferedResponses,
)
})
}
@@ -393,12 +405,14 @@ func runQuery(
	tsdbSelector *store.TSDBSelector,
	defaultEngine apiv1.PromqlEngineType,
	queryMode apiv1.PromqlQueryMode,
+	disableQueryFallback bool,
	tenantHeader string,
	defaultTenant string,
	tenantCertField string,
	enforceTenancy bool,
	tenantLabel string,
	queryDistributedWithOverlappingInterval bool,
+	lazyRetrievalMaxBufferedResponses int,
) error {
comp := component.Query
if alertQueryURL == "" {
@@ -412,6 +426,7 @@ func runQuery(
options := []store.ProxyStoreOption{
	store.WithTSDBSelector(tsdbSelector),
	store.WithProxyStoreDebugLogging(debugLogging),
+	store.WithLazyRetrievalMaxBufferedResponsesForProxy(lazyRetrievalMaxBufferedResponses),
}
// Parse and sanitize the provided replica labels flags.
@@ -466,6 +481,7 @@ func runQuery(
	extendedFunctionsEnabled,
	activeQueryTracker,
	queryMode,
+	disableQueryFallback,
)
lookbackDeltaCreator := LookbackDeltaFactory(lookbackDelta, dynamicLookbackDelta)
@@ -537,6 +553,7 @@ func runQuery(
	tenantCertField,
	enforceTenancy,
	tenantLabel,
+	tsdbSelector,
)
api.Register(router.WithPrefix("/api/v1"), tracer, logger, ins, logMiddleware)


@@ -4,6 +4,7 @@
package main
import (
+	"context"
	"net"
	"net/http"
	"time"
@@ -34,6 +35,7 @@ import (
	"github.com/thanos-io/thanos/pkg/logging"
	"github.com/thanos-io/thanos/pkg/prober"
	"github.com/thanos-io/thanos/pkg/queryfrontend"
+	"github.com/thanos-io/thanos/pkg/runutil"
	httpserver "github.com/thanos-io/thanos/pkg/server/http"
	"github.com/thanos-io/thanos/pkg/server/http/middleware"
	"github.com/thanos-io/thanos/pkg/tenancy"
@@ -97,6 +99,8 @@ func registerQueryFrontend(app *extkingpin.App) {
cmd.Flag("query-frontend.enable-x-functions", "Enable experimental x- functions in query-frontend. --no-query-frontend.enable-x-functions for disabling.").
	Default("false").BoolVar(&cfg.EnableXFunctions)
+cmd.Flag("enable-feature", "Comma separated feature names to enable. Valid options for now: promql-experimental-functions (enables promql experimental functions in query-frontend)").Default("").StringsVar(&cfg.EnableFeatures)
cmd.Flag("query-range.max-query-length", "Limit the query time range (end - start time) in the query-frontend, 0 disables it.").
	Default("0").DurationVar((*time.Duration)(&cfg.QueryRangeConfig.Limits.MaxQueryLength))
@@ -301,6 +305,15 @@ func runQueryFrontend(
	}
}
+if len(cfg.EnableFeatures) > 0 {
+	for _, feature := range cfg.EnableFeatures {
+		if feature == promqlExperimentalFunctions {
+			parser.EnableExperimentalFunctions = true
+			level.Info(logger).Log("msg", "Experimental PromQL functions enabled.", "option", promqlExperimentalFunctions)
+		}
+	}
+}
tripperWare, err := queryfrontend.NewTripperware(cfg.Config, reg, logger)
if err != nil {
	return errors.Wrap(err, "setup tripperwares")
@@ -316,13 +329,13 @@ func runQueryFrontend(
	return err
}
-roundTripper, err := cortexfrontend.NewDownstreamRoundTripper(cfg.DownstreamURL, downstreamTripper)
+downstreamRT, err := cortexfrontend.NewDownstreamRoundTripper(cfg.DownstreamURL, downstreamTripper)
if err != nil {
	return errors.Wrap(err, "setup downstream roundtripper")
}
// Wrap the downstream RoundTripper into query frontend Tripperware.
-roundTripper = tripperWare(roundTripper)
+roundTripper := tripperWare(downstreamRT)
// Create the query frontend transport.
handler := transport.NewHandler(*cfg.CortexHandlerConfig, roundTripper, logger, nil)
@@ -384,8 +397,57 @@ func runQueryFrontend(
})
}
+// Periodically check downstream URL to ensure it is reachable.
+{
+	ctx, cancel := context.WithCancel(context.Background())
+	g.Add(func() error {
+		var firstRun = true
+		doCheckDownstream := func() (rerr error) {
+			timeoutCtx, cancel := context.WithTimeout(ctx, 10*time.Second)
+			defer cancel()
+			readinessUrl := cfg.DownstreamURL + "/-/ready"
+			req, err := http.NewRequestWithContext(timeoutCtx, http.MethodGet, readinessUrl, nil)
+			if err != nil {
+				return errors.Wrap(err, "creating request to downstream URL")
+			}
+			resp, err := downstreamRT.RoundTrip(req)
+			if err != nil {
+				return errors.Wrapf(err, "roundtripping to downstream URL %s", readinessUrl)
+			}
+			defer runutil.CloseWithErrCapture(&rerr, resp.Body, "downstream health check response body")
+			if resp.StatusCode/100 == 4 || resp.StatusCode/100 == 5 {
+				return errors.Errorf("downstream URL %s returned an error: %d", readinessUrl, resp.StatusCode)
+			}
+			return nil
+		}
+		for {
+			if !firstRun {
+				select {
+				case <-ctx.Done():
+					return nil
+				case <-time.After(10 * time.Second):
+				}
+			}
+			firstRun = false
+			if err := doCheckDownstream(); err != nil {
+				statusProber.NotReady(err)
+			} else {
+				statusProber.Ready()
+			}
+		}
+	}, func(err error) {
+		cancel()
+	})
+}
level.Info(logger).Log("msg", "starting query frontend")
-statusProber.Ready()
return nil
}


@@ -26,7 +26,7 @@ import (
	"github.com/prometheus/prometheus/model/labels"
	"github.com/prometheus/prometheus/model/relabel"
	"github.com/prometheus/prometheus/tsdb"
-	"github.com/prometheus/prometheus/tsdb/wlog"
+	"github.com/prometheus/prometheus/util/compression"
	"github.com/thanos-io/objstore"
	"github.com/thanos-io/objstore/client"
	objstoretracing "github.com/thanos-io/objstore/tracing/opentracing"
@@ -35,6 +35,7 @@ import (
	"github.com/thanos-io/thanos/pkg/block/metadata"
	"github.com/thanos-io/thanos/pkg/component"
+	"github.com/thanos-io/thanos/pkg/compressutil"
	"github.com/thanos-io/thanos/pkg/exemplars"
	"github.com/thanos-io/thanos/pkg/extgrpc"
	"github.com/thanos-io/thanos/pkg/extgrpc/snappy"
@@ -93,7 +94,7 @@ func registerReceive(app *extkingpin.App) {
	MaxBytes: int64(conf.tsdbMaxBytes),
	OutOfOrderCapMax: conf.tsdbOutOfOrderCapMax,
	NoLockfile: conf.noLockFile,
-	WALCompression: wlog.ParseCompressionType(conf.walCompression, string(wlog.CompressionSnappy)),
+	WALCompression: compressutil.ParseCompressionType(conf.walCompression, compression.Snappy),
	MaxExemplars: conf.tsdbMaxExemplars,
	EnableExemplarStorage: conf.tsdbMaxExemplars > 0,
	HeadChunksWriteQueueSize: int(conf.tsdbWriteQueueSize),
@@ -210,10 +211,9 @@ func runReceive(
	}
}
-// TODO(brancz): remove after a couple of versions
-// Migrate non-multi-tsdb capable storage to multi-tsdb disk layout.
-if err := migrateLegacyStorage(logger, conf.dataDir, conf.defaultTenantID); err != nil {
-	return errors.Wrapf(err, "migrate legacy storage in %v to default tenant %v", conf.dataDir, conf.defaultTenantID)
+// Create TSDB for the default tenant.
+if err := createDefautTenantTSDB(logger, conf.dataDir, conf.defaultTenantID); err != nil {
+	return errors.Wrapf(err, "create default tenant tsdb in %v", conf.dataDir)
}
relabelContentYaml, err := conf.relabelConfigPath.Content()
@@ -243,6 +243,7 @@ func runReceive(
	conf.tenantLabelName,
	bkt,
	conf.allowOutOfOrderUpload,
+	conf.skipCorruptedBlocks,
	hashFunc,
	multiTSDBOptions...,
)
@@ -354,10 +355,14 @@ func runReceive(
	return errors.Wrap(err, "setup gRPC server")
}
+if conf.lazyRetrievalMaxBufferedResponses <= 0 {
+	return errors.New("--receive.lazy-retrieval-max-buffered-responses must be > 0")
+}
options := []store.ProxyStoreOption{
	store.WithProxyStoreDebugLogging(debugLogging),
	store.WithMatcherCache(cache),
	store.WithoutDedup(),
+	store.WithLazyRetrievalMaxBufferedResponsesForProxy(conf.lazyRetrievalMaxBufferedResponses),
}
proxy := store.NewProxyStore(
@@ -593,7 +598,7 @@ func setupHashring(g *run.Group,
	webHandler.Hashring(receive.SingleNodeHashring(conf.endpoint))
	level.Info(logger).Log("msg", "Empty hashring config. Set up single node hashring.")
} else {
-	h, err := receive.NewMultiHashring(algorithm, conf.replicationFactor, c)
+	h, err := receive.NewMultiHashring(algorithm, conf.replicationFactor, c, reg)
	if err != nil {
		return errors.Wrap(err, "unable to create new hashring from config")
	}
@@ -795,38 +800,25 @@ func startTSDBAndUpload(g *run.Group,
	return nil
}
-func migrateLegacyStorage(logger log.Logger, dataDir, defaultTenantID string) error {
+func createDefautTenantTSDB(logger log.Logger, dataDir, defaultTenantID string) error {
	defaultTenantDataDir := path.Join(dataDir, defaultTenantID)
	if _, err := os.Stat(defaultTenantDataDir); !os.IsNotExist(err) {
-		level.Info(logger).Log("msg", "default tenant data dir already present, not attempting to migrate storage")
+		level.Info(logger).Log("msg", "default tenant data dir already present, will not create")
		return nil
	}
	if _, err := os.Stat(dataDir); os.IsNotExist(err) {
-		level.Info(logger).Log("msg", "no existing storage found, no data migration attempted")
+		level.Info(logger).Log("msg", "no existing storage found, not creating default tenant data dir")
		return nil
	}
-	level.Info(logger).Log("msg", "found legacy storage, migrating to multi-tsdb layout with default tenant", "defaultTenantID", defaultTenantID)
+	level.Info(logger).Log("msg", "default tenant data dir not found, creating", "defaultTenantID", defaultTenantID)
-	files, err := os.ReadDir(dataDir)
-	if err != nil {
-		return errors.Wrapf(err, "read legacy data dir: %v", dataDir)
-	}
	if err := os.MkdirAll(defaultTenantDataDir, 0750); err != nil {
		return errors.Wrapf(err, "create default tenant data dir: %v", defaultTenantDataDir)
	}
-	for _, f := range files {
-		from := path.Join(dataDir, f.Name())
-		to := path.Join(defaultTenantDataDir, f.Name())
-		if err := os.Rename(from, to); err != nil {
-			return errors.Wrapf(err, "migrate file from %v to %v", from, to)
-		}
-	}
	return nil
}
@@ -895,6 +887,7 @@ type receiveConfig struct {
	ignoreBlockSize bool
	allowOutOfOrderUpload bool
+	skipCorruptedBlocks bool
	reqLogConfig *extflag.PathOrContent
	relabelConfigPath *extflag.PathOrContent
@@ -907,6 +900,8 @@ type receiveConfig struct {
	matcherCacheSize int
+	lazyRetrievalMaxBufferedResponses int
	featureList *[]string
	headExpandedPostingsCacheSize uint64
@@ -1010,7 +1005,7 @@ func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) {
rc.tsdbOutOfOrderTimeWindow = extkingpin.ModelDuration(cmd.Flag("tsdb.out-of-order.time-window",
	"[EXPERIMENTAL] Configures the allowed time window for ingestion of out-of-order samples. Disabled (0s) by default"+
-	"Please note if you enable this option and you use compactor, make sure you have the --enable-vertical-compaction flag enabled, otherwise you might risk compactor halt.",
+	"Please note if you enable this option and you use compactor, make sure you have the --compact.enable-vertical-compaction flag enabled, otherwise you might risk compactor halt.",
).Default("0s"))
cmd.Flag("tsdb.out-of-order.cap-max",
@@ -1045,7 +1040,7 @@ func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) {
cmd.Flag("tsdb.enable-native-histograms",
	"[EXPERIMENTAL] Enables the ingestion of native histograms.").
-	Default("false").Hidden().BoolVar(&rc.tsdbEnableNativeHistograms)
+	Default("false").BoolVar(&rc.tsdbEnableNativeHistograms)
cmd.Flag("writer.intern",
	"[EXPERIMENTAL] Enables string interning in receive writer, for more optimized memory usage.").
@@ -1062,6 +1057,12 @@ func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) {
	"about order.").
	Default("false").Hidden().BoolVar(&rc.allowOutOfOrderUpload)
+cmd.Flag("shipper.skip-corrupted-blocks",
+	"If true, shipper will skip corrupted blocks in the given iteration and retry later. This means that some newer blocks might be uploaded sooner than older blocks."+
+	"This can trigger compaction without those blocks and as a result will create an overlap situation. Set it to true if you have vertical compaction enabled and wish to upload blocks as soon as possible without caring"+
+	"about order.").
+	Default("false").Hidden().BoolVar(&rc.skipCorruptedBlocks)
cmd.Flag("matcher-cache-size", "Max number of cached matchers items. Using 0 disables caching.").Default("0").IntVar(&rc.matcherCacheSize)
rc.reqLogConfig = extkingpin.RegisterRequestLoggingFlags(cmd)
@@ -1074,6 +1075,9 @@ func (rc *receiveConfig) registerFlag(cmd extkingpin.FlagClause) {
cmd.Flag("receive.otlp-promote-resource-attributes", "(Repeatable) Resource attributes to include in OTLP metrics ingested by Receive.").Default("").StringsVar(&rc.otlpResourceAttributes)
rc.featureList = cmd.Flag("enable-feature", "Comma separated experimental feature names to enable. The current list of features is "+metricNamesFilter+".").Default("").Strings()
+cmd.Flag("receive.lazy-retrieval-max-buffered-responses", "The lazy retrieval strategy can buffer up to this number of responses. This is to limit the memory usage. This flag takes effect only when the lazy retrieval strategy is enabled.").
+	Default("20").IntVar(&rc.lazyRetrievalMaxBufferedResponses)
}
// determineMode returns the ReceiverMode that this receiver is configured to run in. // determineMode returns the ReceiverMode that this receiver is configured to run in.


@@ -14,6 +14,7 @@ import (
	"os"
	"path/filepath"
	"strings"
+	"sync"
	texttemplate "text/template"
	"time"
@@ -35,11 +36,12 @@ import (
	"github.com/prometheus/prometheus/promql"
	"github.com/prometheus/prometheus/promql/parser"
	"github.com/prometheus/prometheus/rules"
+	"github.com/prometheus/prometheus/scrape"
	"github.com/prometheus/prometheus/storage"
	"github.com/prometheus/prometheus/storage/remote"
	"github.com/prometheus/prometheus/tsdb"
	"github.com/prometheus/prometheus/tsdb/agent"
-	"github.com/prometheus/prometheus/tsdb/wlog"
+	"github.com/prometheus/prometheus/util/compression"
	"gopkg.in/yaml.v2"
	"github.com/thanos-io/objstore"
@@ -52,6 +54,7 @@ import (
	"github.com/thanos-io/thanos/pkg/block/metadata"
	"github.com/thanos-io/thanos/pkg/clientconfig"
	"github.com/thanos-io/thanos/pkg/component"
+	"github.com/thanos-io/thanos/pkg/compressutil"
	"github.com/thanos-io/thanos/pkg/discovery/dns"
	"github.com/thanos-io/thanos/pkg/errutil"
	"github.com/thanos-io/thanos/pkg/extannotations"
@@ -110,7 +113,9 @@ type ruleConfig struct {
	storeRateLimits store.SeriesSelectLimits
	ruleConcurrentEval int64
	extendedFunctionsEnabled bool
+	EnableFeatures []string
+	tsdbEnableNativeHistograms bool
}
type Expression struct {
@@ -165,6 +170,11 @@ func registerRule(app *extkingpin.App) {
	PlaceHolder("<endpoint>").StringsVar(&conf.grpcQueryEndpoints)
cmd.Flag("query.enable-x-functions", "Whether to enable extended rate functions (xrate, xincrease and xdelta). Only has effect when used with Thanos engine.").Default("false").BoolVar(&conf.extendedFunctionsEnabled)
+cmd.Flag("enable-feature", "Comma separated feature names to enable. Valid options for now: promql-experimental-functions (enables promql experimental functions for ruler)").Default("").StringsVar(&conf.EnableFeatures)
+cmd.Flag("tsdb.enable-native-histograms",
+	"[EXPERIMENTAL] Enables the ingestion of native histograms.").
+	Default("false").BoolVar(&conf.tsdbEnableNativeHistograms)
conf.rwConfig = extflag.RegisterPathOrContent(cmd, "remote-write.config", "YAML config for the remote-write configurations, that specify servers where samples should be sent to (see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write). This automatically enables stateless mode for ruler and no series will be stored in the ruler's TSDB. If an empty config (or file) is provided, the flag is ignored and ruler is run with its own TSDB.", extflag.WithEnvSubstitution())
@@ -185,15 +195,16 @@ func registerRule(app *extkingpin.App) {
}
tsdbOpts := &tsdb.Options{
	MinBlockDuration: int64(time.Duration(*tsdbBlockDuration) / time.Millisecond),
	MaxBlockDuration: int64(time.Duration(*tsdbBlockDuration) / time.Millisecond),
	RetentionDuration: int64(time.Duration(*tsdbRetention) / time.Millisecond),
	NoLockfile: *noLockFile,
-	WALCompression: wlog.ParseCompressionType(*walCompression, string(wlog.CompressionSnappy)),
+	WALCompression: compressutil.ParseCompressionType(*walCompression, compression.Snappy),
+	EnableNativeHistograms: conf.tsdbEnableNativeHistograms,
}
agentOpts := &agent.Options{
-	WALCompression: wlog.ParseCompressionType(*walCompression, string(wlog.CompressionSnappy)),
+	WALCompression: compressutil.ParseCompressionType(*walCompression, compression.Snappy),
	NoLockfile: *noLockFile,
}
@@ -469,7 +480,7 @@ func runRule(
// flushDeadline is set to 1m, but it is for metadata watcher only so not used here.
remoteStore := remote.NewStorage(slogger, reg, func() (int64, error) {
	return 0, nil
-}, conf.dataDir, 1*time.Minute, nil, false)
+}, conf.dataDir, 1*time.Minute, &readyScrapeManager{})
if err := remoteStore.ApplyConfig(&config.Config{
	GlobalConfig: config.GlobalConfig{
		ExternalLabels: labelsTSDBToProm(conf.lset),
@@ -581,6 +592,15 @@ func runRule(
	}
}
+if len(conf.EnableFeatures) > 0 {
+	for _, feature := range conf.EnableFeatures {
+		if feature == promqlExperimentalFunctions {
+			parser.EnableExperimentalFunctions = true
+			level.Info(logger).Log("msg", "Experimental PromQL functions enabled.", "option", promqlExperimentalFunctions)
+		}
+	}
+}
// Run rule evaluation and alert notifications.
notifyFunc := func(ctx context.Context, expr string, alerts ...*rules.Alert) {
	res := make([]*notifier.Alert, 0, len(alerts))
@@ -839,7 +859,18 @@ func runRule(
	}
}()
-s := shipper.New(logger, reg, conf.dataDir, bkt, func() labels.Labels { return conf.lset }, metadata.RulerSource, nil, conf.shipper.allowOutOfOrderUpload, metadata.HashFunc(conf.shipper.hashFunc), conf.shipper.metaFileName)
+s := shipper.New(
+	bkt,
+	conf.dataDir,
+	shipper.WithLogger(logger),
+	shipper.WithRegisterer(reg),
+	shipper.WithSource(metadata.RulerSource),
+	shipper.WithHashFunc(metadata.HashFunc(conf.shipper.hashFunc)),
+	shipper.WithMetaFileName(conf.shipper.metaFileName),
+	shipper.WithLabels(func() labels.Labels { return conf.lset }),
+	shipper.WithAllowOutOfOrderUploads(conf.shipper.allowOutOfOrderUpload),
+	shipper.WithSkipCorruptedBlocks(conf.shipper.skipCorruptedBlocks),
+)
ctx, cancel := context.WithCancel(context.Background())
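The rewritten `shipper.New` call replaces a long positional parameter list with Go's functional options pattern, which is what lets new knobs like skipping corrupted blocks be added without breaking existing callers. A minimal, self-contained sketch of the pattern under illustrative names (not the actual thanos-io shipper types):

```go
package main

import "fmt"

// Shipper is a stand-in for the configured component; fields are illustrative.
type Shipper struct {
	dir             string
	source          string
	allowOutOfOrder bool
}

// Option mutates a Shipper during construction.
type Option func(*Shipper)

// WithSource sets the metadata source label.
func WithSource(source string) Option {
	return func(s *Shipper) { s.source = source }
}

// WithAllowOutOfOrderUploads toggles out-of-order block uploads.
func WithAllowOutOfOrderUploads(allow bool) Option {
	return func(s *Shipper) { s.allowOutOfOrder = allow }
}

// New applies defaults first, then each supplied option in order.
func New(dir string, opts ...Option) *Shipper {
	s := &Shipper{dir: dir, source: "unknown"}
	for _, o := range opts {
		o(s)
	}
	return s
}

func main() {
	s := New("/var/thanos/data",
		WithSource("sidecar"),
		WithAllowOutOfOrderUploads(true),
	)
	fmt.Printf("%s %s %v\n", s.dir, s.source, s.allowOutOfOrder) // prints "/var/thanos/data sidecar true"
}
```

Because unspecified options keep their defaults, callers such as the ruler and the sidecar can pass only the options they care about.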
@@ -1084,3 +1115,32 @@ func filterOutPromQLWarnings(warns []string, logger log.Logger, query string) []
	}
	return storeWarnings
}
+// ReadyScrapeManager allows a scrape manager to be retrieved. Even if it's set at a later point in time.
+type readyScrapeManager struct {
+	mtx sync.RWMutex
+	m   *scrape.Manager
+}
+// Set the scrape manager.
+func (rm *readyScrapeManager) Set(m *scrape.Manager) {
+	rm.mtx.Lock()
+	defer rm.mtx.Unlock()
+	rm.m = m
+}
+// Get the scrape manager. If is not ready, return an error.
+func (rm *readyScrapeManager) Get() (*scrape.Manager, error) {
+	rm.mtx.RLock()
+	defer rm.mtx.RUnlock()
+	if rm.m != nil {
+		return rm.m, nil
+	}
+	return nil, ErrNotReady
+}
+// ErrNotReady is returned if the underlying scrape manager is not ready yet.
+var ErrNotReady = errors.New("scrape manager not ready")
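The `readyScrapeManager` added above is a small RWMutex guard around a value that only becomes available after startup: `Get` returns an error until `Set` has been called. A self-contained sketch of the same pattern with a stand-in manager type:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// manager is a stand-in for *scrape.Manager.
type manager struct{ name string }

var errNotReady = errors.New("manager not ready")

// readyManager mirrors the guard pattern: Get fails until Set is called,
// and an RWMutex allows many concurrent readers once the value is present.
type readyManager struct {
	mtx sync.RWMutex
	m   *manager
}

// Set stores the manager, making subsequent Get calls succeed.
func (rm *readyManager) Set(m *manager) {
	rm.mtx.Lock()
	defer rm.mtx.Unlock()
	rm.m = m
}

// Get returns the manager, or an error if it has not been set yet.
func (rm *readyManager) Get() (*manager, error) {
	rm.mtx.RLock()
	defer rm.mtx.RUnlock()
	if rm.m != nil {
		return rm.m, nil
	}
	return nil, errNotReady
}

func main() {
	var rm readyManager
	if _, err := rm.Get(); err != nil {
		fmt.Println("before Set:", err)
	}
	rm.Set(&manager{name: "scrape"})
	m, _ := rm.Get()
	fmt.Println("after Set:", m.name)
}
```

This lets the ruler hand `remote.NewStorage` a non-nil ready-manager placeholder even though no scrape manager exists yet.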


@@ -414,9 +414,19 @@ func runSidecar(
	return errors.Wrapf(err, "aborting as no external labels found after waiting %s", promReadyTimeout)
}
-uploadCompactedFunc := func() bool { return conf.shipper.uploadCompacted }
-s := shipper.New(logger, reg, conf.tsdb.path, bkt, m.Labels, metadata.SidecarSource,
-	uploadCompactedFunc, conf.shipper.allowOutOfOrderUpload, metadata.HashFunc(conf.shipper.hashFunc), conf.shipper.metaFileName)
+s := shipper.New(
+	bkt,
+	conf.tsdb.path,
+	shipper.WithLogger(logger),
+	shipper.WithRegisterer(reg),
+	shipper.WithSource(metadata.SidecarSource),
+	shipper.WithHashFunc(metadata.HashFunc(conf.shipper.hashFunc)),
+	shipper.WithMetaFileName(conf.shipper.metaFileName),
+	shipper.WithLabels(m.Labels),
+	shipper.WithUploadCompacted(conf.shipper.uploadCompacted),
+	shipper.WithAllowOutOfOrderUploads(conf.shipper.allowOutOfOrderUpload),
+	shipper.WithSkipCorruptedBlocks(conf.shipper.skipCorruptedBlocks),
+)
return runutil.Repeat(30*time.Second, ctx.Done(), func() error {
	if uploaded, err := s.Sync(ctx); err != nil {

@@ -105,7 +105,8 @@ type storeConfig struct {
	indexHeaderLazyDownloadStrategy string
	matcherCacheSize int
+	disableAdminOperations bool
}
func (sc *storeConfig) registerFlag(cmd extkingpin.FlagClause) {
@@ -229,6 +230,8 @@ func (sc *storeConfig) registerFlag(cmd extkingpin.FlagClause) {
cmd.Flag("matcher-cache-size", "Max number of cached matchers items. Using 0 disables caching.").Default("0").IntVar(&sc.matcherCacheSize)
+cmd.Flag("disable-admin-operations", "Disable UI/API admin operations like marking blocks for deletion and no compaction.").Default("false").BoolVar(&sc.disableAdminOperations)
sc.reqLogConfig = extkingpin.RegisterRequestLoggingFlags(cmd)
}
@@ -390,14 +393,16 @@ func runStore(
	return errors.Errorf("unknown sync strategy %s", conf.blockListStrategy)
}
ignoreDeletionMarkFilter := block.NewIgnoreDeletionMarkFilter(logger, insBkt, time.Duration(conf.ignoreDeletionMarksDelay), conf.blockMetaFetchConcurrency)
-metaFetcher, err := block.NewMetaFetcher(logger, conf.blockMetaFetchConcurrency, insBkt, blockLister, dataDir, extprom.WrapRegistererWithPrefix("thanos_", reg),
-	[]block.MetadataFilter{
-		block.NewTimePartitionMetaFilter(conf.filterConf.MinTime, conf.filterConf.MaxTime),
-		block.NewLabelShardedMetaFilter(relabelConfig),
-		block.NewConsistencyDelayMetaFilter(logger, time.Duration(conf.consistencyDelay), extprom.WrapRegistererWithPrefix("thanos_", reg)),
-		ignoreDeletionMarkFilter,
-		block.NewDeduplicateFilter(conf.blockMetaFetchConcurrency),
-	})
+filters := []block.MetadataFilter{
+	block.NewTimePartitionMetaFilter(conf.filterConf.MinTime, conf.filterConf.MaxTime),
+	block.NewLabelShardedMetaFilter(relabelConfig),
+	block.NewConsistencyDelayMetaFilter(logger, time.Duration(conf.consistencyDelay), extprom.WrapRegistererWithPrefix("thanos_", reg)),
+	ignoreDeletionMarkFilter,
+	block.NewDeduplicateFilter(conf.blockMetaFetchConcurrency),
+	block.NewParquetMigratedMetaFilter(logger),
+}
+metaFetcher, err := block.NewMetaFetcher(logger, conf.blockMetaFetchConcurrency, insBkt, blockLister, dataDir, extprom.WrapRegistererWithPrefix("thanos_", reg), filters)
if err != nil {
	return errors.Wrap(err, "meta fetcher")
}


@ -23,7 +23,8 @@ import (
"github.com/go-kit/log" "github.com/go-kit/log"
"github.com/go-kit/log/level" "github.com/go-kit/log/level"
"github.com/oklog/run" "github.com/oklog/run"
"github.com/oklog/ulid" "github.com/oklog/ulid/v2"
"github.com/olekukonko/tablewriter" "github.com/olekukonko/tablewriter"
"github.com/opentracing/opentracing-go" "github.com/opentracing/opentracing-go"
"github.com/pkg/errors" "github.com/pkg/errors"
@ -110,8 +111,11 @@ type bucketVerifyConfig struct {
} }
type bucketLsConfig struct { type bucketLsConfig struct {
output string output string
excludeDelete bool excludeDelete bool
selectorRelabelConf extflag.PathOrContent
filterConf *store.FilterConfig
timeout time.Duration
} }
type bucketWebConfig struct { type bucketWebConfig struct {
@ -162,8 +166,9 @@ type bucketMarkBlockConfig struct {
} }
type bucketUploadBlocksConfig struct { type bucketUploadBlocksConfig struct {
path string path string
labels []string labels []string
uploadCompacted bool
} }
func (tbc *bucketVerifyConfig) registerBucketVerifyFlag(cmd extkingpin.FlagClause) *bucketVerifyConfig { func (tbc *bucketVerifyConfig) registerBucketVerifyFlag(cmd extkingpin.FlagClause) *bucketVerifyConfig {
@ -181,10 +186,18 @@ func (tbc *bucketVerifyConfig) registerBucketVerifyFlag(cmd extkingpin.FlagClaus
} }
func (tbc *bucketLsConfig) registerBucketLsFlag(cmd extkingpin.FlagClause) *bucketLsConfig { func (tbc *bucketLsConfig) registerBucketLsFlag(cmd extkingpin.FlagClause) *bucketLsConfig {
tbc.selectorRelabelConf = *extkingpin.RegisterSelectorRelabelFlags(cmd)
tbc.filterConf = &store.FilterConfig{}
cmd.Flag("output", "Optional format in which to print each block's information. Options are 'json', 'wide' or a custom template."). cmd.Flag("output", "Optional format in which to print each block's information. Options are 'json', 'wide' or a custom template.").
Short('o').Default("").StringVar(&tbc.output) Short('o').Default("").StringVar(&tbc.output)
cmd.Flag("exclude-delete", "Exclude blocks marked for deletion."). cmd.Flag("exclude-delete", "Exclude blocks marked for deletion.").
Default("false").BoolVar(&tbc.excludeDelete) Default("false").BoolVar(&tbc.excludeDelete)
cmd.Flag("min-time", "Start of time range limit to list blocks. Thanos Tools will list blocks, which were created later than this value. Option can be a constant time in RFC3339 format or time duration relative to current time, such as -1d or 2h45m. Valid duration units are ms, s, m, h, d, w, y.").
Default("0000-01-01T00:00:00Z").SetValue(&tbc.filterConf.MinTime)
cmd.Flag("max-time", "End of time range limit to list. Thanos Tools will list only blocks, which were created earlier than this value. Option can be a constant time in RFC3339 format or time duration relative to current time, such as -1d or 2h45m. Valid duration units are ms, s, m, h, d, w, y.").
Default("9999-12-31T23:59:59Z").SetValue(&tbc.filterConf.MaxTime)
cmd.Flag("timeout", "Timeout to download metadata from remote storage").Default("5m").DurationVar(&tbc.timeout)
return tbc return tbc
} }
@ -288,6 +301,7 @@ func (tbc *bucketRetentionConfig) registerBucketRetentionFlag(cmd extkingpin.Fla
func (tbc *bucketUploadBlocksConfig) registerBucketUploadBlocksFlag(cmd extkingpin.FlagClause) *bucketUploadBlocksConfig { func (tbc *bucketUploadBlocksConfig) registerBucketUploadBlocksFlag(cmd extkingpin.FlagClause) *bucketUploadBlocksConfig {
cmd.Flag("path", "Path to the directory containing blocks to upload.").Default("./data").StringVar(&tbc.path) cmd.Flag("path", "Path to the directory containing blocks to upload.").Default("./data").StringVar(&tbc.path)
cmd.Flag("label", "External labels to add to the uploaded blocks (repeated).").PlaceHolder("key=\"value\"").StringsVar(&tbc.labels) cmd.Flag("label", "External labels to add to the uploaded blocks (repeated).").PlaceHolder("key=\"value\"").StringsVar(&tbc.labels)
cmd.Flag("shipper.upload-compacted", "If true shipper will try to upload compacted blocks as well.").Default("false").BoolVar(&tbc.uploadCompacted)
return tbc return tbc
} }
@ -418,12 +432,30 @@ func registerBucketLs(app extkingpin.AppClause, objStoreConfig *extflag.PathOrCo
} }
insBkt := objstoretracing.WrapWithTraces(objstore.WrapWithMetrics(bkt, extprom.WrapRegistererWithPrefix("thanos_", reg), bkt.Name())) insBkt := objstoretracing.WrapWithTraces(objstore.WrapWithMetrics(bkt, extprom.WrapRegistererWithPrefix("thanos_", reg), bkt.Name()))
var filters []block.MetadataFilter if tbc.timeout < time.Minute {
level.Warn(logger).Log("msg", "Timeout less than 1m could lead to frequent failures")
}
relabelContentYaml, err := tbc.selectorRelabelConf.Content()
if err != nil {
return errors.Wrap(err, "get content of relabel configuration")
}
relabelConfig, err := block.ParseRelabelConfig(relabelContentYaml, block.SelectorSupportedRelabelActions)
if err != nil {
return err
}
filters := []block.MetadataFilter{
block.NewLabelShardedMetaFilter(relabelConfig),
block.NewTimePartitionMetaFilter(tbc.filterConf.MinTime, tbc.filterConf.MaxTime),
}
if tbc.excludeDelete { if tbc.excludeDelete {
ignoreDeletionMarkFilter := block.NewIgnoreDeletionMarkFilter(logger, insBkt, 0, block.FetcherConcurrency) ignoreDeletionMarkFilter := block.NewIgnoreDeletionMarkFilter(logger, insBkt, 0, block.FetcherConcurrency)
filters = append(filters, ignoreDeletionMarkFilter) filters = append(filters, ignoreDeletionMarkFilter)
} }
baseBlockIDsFetcher := block.NewConcurrentLister(logger, insBkt) baseBlockIDsFetcher := block.NewConcurrentLister(logger, insBkt)
fetcher, err := block.NewMetaFetcher(logger, block.FetcherConcurrency, insBkt, baseBlockIDsFetcher, "", extprom.WrapRegistererWithPrefix(extpromPrefix, reg), filters) fetcher, err := block.NewMetaFetcher(logger, block.FetcherConcurrency, insBkt, baseBlockIDsFetcher, "", extprom.WrapRegistererWithPrefix(extpromPrefix, reg), filters)
if err != nil { if err != nil {
@ -435,7 +467,7 @@ func registerBucketLs(app extkingpin.AppClause, objStoreConfig *extflag.PathOrCo
defer runutil.CloseWithLogOnErr(logger, insBkt, "bucket client") defer runutil.CloseWithLogOnErr(logger, insBkt, "bucket client")
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute) ctx, cancel := context.WithTimeout(context.Background(), tbc.timeout)
defer cancel() defer cancel()
var ( var (
@ -505,7 +537,7 @@ func registerBucketInspect(app extkingpin.AppClause, objStoreConfig *extflag.Pat
tbc := &bucketInspectConfig{} tbc := &bucketInspectConfig{}
tbc.registerBucketInspectFlag(cmd) tbc.registerBucketInspectFlag(cmd)
output := cmd.Flag("output", "Output format for result. Currently supports table, cvs, tsv.").Default("table").Enum(outputTypes...) output := cmd.Flag("output", "Output format for result. Currently supports table, csv, tsv.").Default("table").Enum(outputTypes...)
cmd.Setup(func(g *run.Group, logger log.Logger, reg *prometheus.Registry, _ opentracing.Tracer, _ <-chan struct{}, _ bool) error { cmd.Setup(func(g *run.Group, logger log.Logger, reg *prometheus.Registry, _ opentracing.Tracer, _ <-chan struct{}, _ bool) error {
@ -1471,8 +1503,16 @@ func registerBucketUploadBlocks(app extkingpin.AppClause, objStoreConfig *extfla
bkt = objstoretracing.WrapWithTraces(objstore.WrapWithMetrics(bkt, extprom.WrapRegistererWithPrefix("thanos_", reg), bkt.Name())) bkt = objstoretracing.WrapWithTraces(objstore.WrapWithMetrics(bkt, extprom.WrapRegistererWithPrefix("thanos_", reg), bkt.Name()))
s := shipper.New(logger, reg, tbc.path, bkt, func() labels.Labels { return lset }, metadata.BucketUploadSource, s := shipper.New(
nil, false, metadata.HashFunc(""), shipper.DefaultMetaFilename) bkt,
tbc.path,
shipper.WithLogger(logger),
shipper.WithRegisterer(reg),
shipper.WithSource(metadata.BucketUploadSource),
shipper.WithMetaFileName(shipper.DefaultMetaFilename),
shipper.WithLabels(func() labels.Labels { return lset }),
shipper.WithUploadCompacted(tbc.uploadCompacted),
)
ctx, cancel := context.WithCancel(context.Background()) ctx, cancel := context.WithCancel(context.Background())
g.Add(func() error { g.Add(func() error {


@ -5,6 +5,7 @@ package main
import ( import (
"os" "os"
"path"
"testing" "testing"
"github.com/go-kit/log" "github.com/go-kit/log"
@ -47,9 +48,12 @@ func Test_CheckRules_Glob(t *testing.T) {
testutil.NotOk(t, checkRulesFiles(logger, files), "expected err for file %s", files) testutil.NotOk(t, checkRulesFiles(logger, files), "expected err for file %s", files)
// Unreadble path // Unreadble path
files = &[]string{"./testdata/rules-files/unreadable_valid.yaml"} // Move the initial file to a temp dir and make it unreadble there, in case the process cannot chmod the file in the current dir.
filename := (*files)[0] filename := "./testdata/rules-files/unreadable_valid.yaml"
testutil.Ok(t, os.Chmod(filename, 0000), "failed to change file permissions of %s to 0000", filename) bytesRead, err := os.ReadFile(filename)
testutil.Ok(t, err)
filename = path.Join(t.TempDir(), "file.yaml")
testutil.Ok(t, os.WriteFile(filename, bytesRead, 0000))
files = &[]string{filename}
testutil.NotOk(t, checkRulesFiles(logger, files), "expected err for file %s", files) testutil.NotOk(t, checkRulesFiles(logger, files), "expected err for file %s", files)
testutil.Ok(t, os.Chmod(filename, 0777), "failed to change file permissions of %s to 0777", filename)
} }


@ -0,0 +1,245 @@
---
title: "Life of a Sample in Thanos and How to Configure It: Data Management (Part II)"
date: "2024-09-16"
author: Thibault Mangé (https://github.com/thibaultmg)
---
## Life of a Sample in Thanos and How to Configure It: Data Management (Part II)
### Introduction
In the first part of this series, we followed the life of a sample from its inception in a Prometheus server to our Thanos Receivers. We will now explore how Thanos manages the data ingested by the Receivers and optimizes it in the object store for reduced cost and fast retrieval.
Let's delve into these topics and more in the second part of the series.
### Preparing Samples for Object Storage: Building Chunks and Blocks
#### Using Object Storage
A key feature of Thanos is its ability to leverage economical object storage solutions like AWS S3 for long-term data retention. This contrasts with Prometheus's typical approach of storing data locally for shorter periods.
The Receive component is responsible for preparing data for object storage. Thanos adopts the TSDB (Time Series Database) data model, with some adaptations, for its object storage. This involves aggregating samples over time to construct TSDB Blocks. Please refer to the annexes of the first part if this vocabulary is not clear to you.
These blocks are built by aggregating data over two-hour periods. Once a block is ready, it is sent to the object storage, which is configured using the `--objstore.config` flag. This configuration is uniform across all components requiring object storage access.
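All components that need bucket access take the same `--objstore.config` (or `--objstore.config-file`) document. As an illustration, a minimal S3 configuration might look like the sketch below; the bucket name, endpoint, and credentials are placeholders, and many more options are available in the Thanos storage documentation.

```yaml
type: S3
config:
  bucket: "thanos-blocks"                  # placeholder bucket name
  endpoint: "s3.us-east-1.amazonaws.com"   # placeholder region endpoint
  access_key: "<ACCESS_KEY>"
  secret_key: "<SECRET_KEY>"
```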
On restarts, the Receive component ensures data preservation by immediately flushing existing data to object storage, even if it does not constitute a full two-hour block. These partial blocks are less efficient but are then optimized by the compactor, as we will see later.
The Receive is also able to [isolate data](https://thanos.io/tip/components/receive.md/#tenant-lifecycle-management) coming from different tenants. The tenant can be identified in the request by different means: a header (`--receive.tenant-header`), a label (`--receive.split-tenant-label-name`) or a certificate (`--receive.tenant-certificate-field`). Their data is ingested into different TSDB instances (you might hear this referred to as the multiTSDB). The benefits are twofold:
* It allows for parallelization of the block-building process, especially on the compactor side as we will see later.
* It allows for smaller indexes. Indeed, labels tend to be similar for samples coming from the same source, leading to more effective compression.
<img src="img/life-of-a-sample/multi-tsdb.png" alt="Data expansion" style="max-width: 600px; display: block;margin: 0 auto;"/>
When a block is ready, it is uploaded to the object store with the block external label defined by the flag `--receive.tenant-label-name`. This corresponds to the `thanos.labels` field of the [block metadata](https://thanos.io/tip/thanos/storage.md/#metadata-file-metajson). This will be used by the compactor to group blocks together, as we will see later.
#### Exposing Local Data for Queries
During the block-building phase, the data is not accessible to the Store Gateway as it has not been uploaded to the object store yet. To counter that, the Receive component also serves as a data store, making the local data available for query through the `Store API`. This is a common gRPC API used across all Thanos components for time series data access, set with the `--grpc-address` flag. The Receive serves all the local data it holds.
<img src="img/life-of-a-sample/receive-store-api.png" alt="Data expansion" style="max-width: 600px; display: block;margin: 0 auto;"/>
The amount of data the Receive component serves can be managed through two parameters:
* `--tsdb.retention`: Sets the local storage retention duration. The minimum is 2 hours, aligning with block construction periods.
* `--store.limits.request-samples` and `--store.limits.request-series`: These parameters limit the volume of data that can be queried by setting a maximum on the number of samples and/or the number of series. If these limits are exceeded, the query will be denied to ensure system stability.
Key points to consider:
* The primary objective of the Receive component is to ensure **reliable data ingestion**. However, the more data it serves through the Store API, the more resources it will use for this duty in addition to ingesting client data. You should set the retention duration to the minimum required for your use case to optimize resource allocation. The minimum value for 2-hour blocks would be a 4-hour retention to account for availability in the Store Gateway after the block is uploaded to object storage. To prevent data loss, if the Receive component fails to upload blocks before the retention limit is reached, it will hold them until the upload succeeds.
* Even when the retention duration is short, your Receive instance could be overwhelmed by a query selecting too much data. You should set limits in place to ensure the stability of the Receive instances. These limits must be carefully set to enable Store API clients to retrieve the data they need while preventing resource exhaustion. The longer the retention, the higher the limits should be as the number of samples and series will increase.
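Putting these recommendations together, a Receive invocation might carry flags like the sketch below. The limit values and file names are illustrative only; tune them to your own series cardinality and query patterns.

```shell
thanos receive \
  --grpc-address=0.0.0.0:10901 \
  --objstore.config-file=bucket.yml \
  --tsdb.retention=4h \
  --store.limits.request-samples=100000000 \
  --store.limits.request-series=1000000
```

Here `--tsdb.retention=4h` is the practical minimum for 2-hour blocks (leaving time for the Store Gateway to pick up uploaded blocks), and the two limit flags deny queries that would select more samples or series than the instance can safely serve.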
### Maintaining Data: Compaction, Downsampling, and Retention
#### The Need for Compaction
The Receive component implements many strategies to ingest samples reliably. However, this can result in unoptimized data in object storage. This is due to:
* Inefficient partial blocks sent to object storage on shutdowns.
* Duplicated data when replication is set. Several Receive instances will send the same data to object storage.
* Incomplete blocks (invalid blocks) sent to object storage when the Receive fails in the middle of an upload.
The following diagram illustrates the impact on data expansion in object storage when samples from a given target are ingested from a high-availability Prometheus setup (with 2 instances) and replication is set on the Receive (factor 3):
<img src="img/life-of-a-sample/data-expansion.png" alt="Data expansion" style="max-width: 600px; display: block;margin: 0 auto;"/>
This leads to a threefold increase in label volume (one for each block) and a sixfold increase in sample volume! This is where the Compactor comes into play.
The Compactor component is responsible for maintaining and optimizing data in object storage. It is a long-running process when configured to wait for new blocks with the `--wait` flag. It also needs access to the object storage using the `--objstore.config` flag.
Under normal operating conditions, the Compactor will check for new blocks every 5 minutes. By default, it will only consider blocks that are older than 30 minutes (configured with the `--consistency-delay` flag) to avoid reading partially uploaded blocks. It will then process these blocks in a structured manner, compacting them according to defined settings that we will discuss in the next sections.
#### Compaction Modes
Compaction consists of merging blocks that have overlapping or adjacent time ranges. This is called **horizontal compaction**. Using the [Metadata file](https://thanos.io/tip/thanos/storage.md/#metadata-file-metajson) which contains the minimum and maximum timestamps of samples in the block, the Compactor can determine if two blocks overlap. If they do, they are merged into a new block. This new block will have its compaction level index increased by one. So from two adjacent blocks of 2 hours each, we will get a new block of 4 hours.
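The bookkeeping behind horizontal compaction can be sketched in a few lines. This is a hypothetical illustration of the idea, not Thanos code: the merged block covers both time ranges and its compaction level increases.

```python
from dataclasses import dataclass

@dataclass
class BlockMeta:
    min_time: int          # milliseconds since epoch
    max_time: int
    compaction_level: int

def horizontal_compact(a: BlockMeta, b: BlockMeta) -> BlockMeta:
    """Merge two overlapping or adjacent blocks into one block that
    spans both time ranges, with a bumped compaction level."""
    return BlockMeta(
        min_time=min(a.min_time, b.min_time),
        max_time=max(a.max_time, b.max_time),
        compaction_level=max(a.compaction_level, b.compaction_level) + 1,
    )

# Two adjacent 2-hour blocks merge into one 4-hour block at level 2.
merged = horizontal_compact(
    BlockMeta(0, 7_200_000, 1),
    BlockMeta(7_200_000, 14_400_000, 1),
)
```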
During this compaction, the Compactor will also deduplicate samples. This is called [**vertical compaction**](https://thanos.io/tip/components/compact.md/#vertical-compactions). The Compactor provides two deduplication modes:
* `one-to-one`: This is the default mode. It deduplicates samples that have the same timestamp and the same value but different replica label values. The replica label is configured by the `--deduplication.replica-label` flag, which can be repeated to account for several replication labels. It is usually set to `replica`; make sure it is set as an external label on the Receivers with the flag `--label=replica=xxx`. The benefit of this mode is that it is straightforward and removes the data replicated by the Receive. However, it cannot remove data replicated by high-availability Prometheus setups, because those samples are rarely scraped at exactly the same timestamps, as demonstrated by the diagram below.
* `penalty`: This is a more complex deduplication algorithm that can deduplicate data coming from high-availability Prometheus setups. It is selected with the `--deduplication.func` flag and also requires setting the `--deduplication.replica-label` flag to the label that identifies replicas, usually `prometheus_replica`.
Here is a diagram illustrating how Prometheus replicas generate samples with different timestamps that cannot be deduplicated with the `one-to-one` mode:
<img src="img/life-of-a-sample/ha-prometheus-duplicates.png" alt="High availability prometheus duplication" style="max-width: 600px; display: block;margin: 0 auto;"/>
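To make the one-to-one mode concrete, here is a minimal sketch (hypothetical, not Thanos code): after stripping the replica label, samples from different replicas collapse onto the same series, and samples sharing a timestamp are kept only once. Note that the HA samples in the diagram above, having slightly different timestamps, would all survive this pass; that is exactly what the penalty mode addresses.

```python
def dedup_one_to_one(series, replica_label="replica"):
    """series: dict mapping a tuple of (name, value) label pairs
    to a list of (timestamp_ms, value) samples.

    Strips the replica label, merges replicas of the same series, and
    keeps the first sample seen for each timestamp."""
    out = {}
    for labels, samples in series.items():
        key = tuple(l for l in labels if l[0] != replica_label)
        merged = out.setdefault(key, {})
        for ts, v in samples:
            merged.setdefault(ts, v)  # first write wins for a given timestamp
    # Return sorted sample lists per deduplicated series.
    return {k: sorted(m.items()) for k, m in out.items()}

replicated = {
    (("job", "api"), ("replica", "0")): [(1000, 1.0), (2000, 2.0)],
    (("job", "api"), ("replica", "1")): [(1000, 1.0), (3000, 3.0)],
}
deduped = dedup_one_to_one(replicated)
```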
Getting back to our example illustrating the data duplication happening in the object storage, here is how each compaction process will impact the data:
<img src="img/life-of-a-sample/compactor-compaction.png" alt="Compactor compaction" width="700"/>
First, horizontal compaction will merge blocks together. This will mostly have an effect on the labels data that are stored in a compressed format in a single index binary file attached to a single block. Then, one-to-one deduplication will remove identical samples and delete the related replica label. Finally, penalty deduplication will remove duplicated samples resulting from concurrent scrapes in high-availability Prometheus setups and remove the related replica label.
You want to deduplicate data as much as possible because it will lower your object storage cost and improve query performance. However, using the penalty mode presents some limitations. For more details, see [the documentation](https://thanos.io/tip/components/compact.md/#vertical-compaction-risks).
Key points to consider:
* You want blocks that are not too big because they will be slow to query. However, you also want to limit the number of blocks because having too many will increase the number of requests to the object storage. Also, the more blocks there are, the less compaction occurs, and the more data there is to store and load into memory.
* You do not need to worry about blocks that are too small, as the Compactor will merge them together. However, you can end up with blocks that are too big. This can happen with very high cardinality workloads or churn-heavy workloads like CI runs, build pipelines, serverless functions, or batch jobs, which often lead to huge cardinality explosions as the metric labels change frequently.
* The main solution to this is splitting the data into several block streams, as we will see later. This is Thanos's sharding strategy.
* There are also cases where you might want to limit the size of the blocks. To that effect, you can use the following parameters:
* You can limit the compaction levels with `--debug.max-compaction-level` to prevent the Compactor from creating blocks that are too big. This is especially useful when you have a high metrics churn rate. Level 1 corresponds to the freshly ingested 2-hour blocks. Level 2 creates blocks of 8 hours, level 3 of 2 days, and level 4 of 14 days. Without this limit, the Compactor will create blocks of up to 2 weeks. This is not a magic bullet; it does not limit the data size of the blocks. It just limits the number of blocks that can be merged together. The downside of using this setting is that it will increase the number of blocks in the object storage. They will use more space, and the query performance might be impacted.
* The `--compact.block-max-index-size` flag is more effective: it specifies a maximum index size beyond which the Compactor stops compacting a block, independently of its compaction level. Once a block's index exceeds this size, the system marks it for no further compaction. The default value is 64 GB, which is the maximum index size the TSDB supports. As a result, some block streams might appear discontinuous in the UI, displaying a lower compaction level than the surrounding blocks.
#### Scaling the Compactor: Block Streams
Not all blocks covering the same time range are compacted together. Instead, the Compactor organizes them into distinct [compaction groups or block streams](https://thanos.io/tip/components/compact.md/#compaction-groups--block-streams). The key here is to leverage external labels to group data originating from the same source. This strategic grouping is particularly effective for compacting indexes, as blocks from the same source tend to have nearly identical labels.
You can improve the performance of the Compactor by:
* Increasing the number of concurrent compactions using the `--max-concurrent` flag. Bear in mind that you must scale storage, memory and CPU resources accordingly (linearly).
* Sharding the data. In this mode, each Compactor will process a disjoint set of block streams. This is done by setting up the `--selector.relabel-config` flag on the external labels. For example:
```yaml
- action: hashmod
source_labels:
- tenant_id # An external label that identifies some block streams
target_label: shard
modulus: 2 # The number of Compactor replicas
- action: keep
source_labels:
- shard
regex: 0 # The shard number assigned to this Compactor
```
In this configuration, the `hashmod` action is used to distribute blocks across multiple Compactor instances based on the `tenant_id` label. The `modulus` should match the number of Compactor replicas you have. Each replica will then only process the blocks that match its shard number, as defined by the `regex` in the `keep` action.
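As a sanity check for such a sharding configuration, the `hashmod` action can be approximated in a few lines. To the best of our understanding, Prometheus's relabel `hashmod` takes the MD5 of the joined source label values and reduces the lower 8 bytes, read as a big-endian integer, modulo `modulus`; treat the sketch below as an approximation rather than a reference implementation.

```python
import hashlib
import struct

def hashmod(value: str, modulus: int) -> int:
    # MD5 of the joined source label values; lower 8 bytes interpreted as a
    # big-endian uint64, then reduced modulo `modulus`. Approximates the
    # Prometheus relabel `hashmod` action.
    digest = hashlib.md5(value.encode()).digest()
    return struct.unpack(">Q", digest[8:])[0] % modulus

# Each Compactor replica keeps only the streams whose shard matches its own
# index, so tenants are partitioned disjointly across replicas.
tenants = ["tenant-a", "tenant-b", "tenant-c", "tenant-d"]
shards = {t: hashmod(t, 2) for t in tenants}
```

Because the hash is deterministic, every replica computes the same shard for a given tenant, which is what makes the `keep` action safe.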
#### Downsampling and Retention
The Compactor also optimizes data reads for long-range queries. If you are querying data for several months, you do not need the typical 15-second raw resolution. Processing such a query will be very inefficient, as it will retrieve a lot of unnecessary data that you will not be able to visualize with such detail in your UI. In worst-case scenarios, it may even cause some components of your Thanos setup to fail due to memory exhaustion.
To enable performant long-range queries, the Compactor can downsample data using `--retention.resolution-*` flags. It supports two downsampling levels: 5 minutes and 1 hour. These are the resolutions of the downsampled series. They will typically come on top of the raw data, so that you can have both raw and downsampled data. This will enable you to spot abnormal patterns over long-range queries and then zoom into specific parts using the raw data. We will discuss how to configure the query to use the downsampled data in the next article.
When the Compactor performs downsampling, it does more than simply reduce the number of data points by removing intermediate samples. While reducing the volume of data is a primary goal, especially to improve performance for long-range queries, the Compactor ensures that essential statistical properties of the original data are preserved. This is crucial for maintaining the accuracy and integrity of any aggregations or analyses performed on the downsampled data. In addition to the downsampled data, it stores the count, minimum, maximum, and sum of the downsampled window. Functions like sum(), min(), max(), and avg() can then be computed correctly over the downsampled data because the necessary statistical information is preserved.
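The idea of keeping per-window aggregates can be sketched as follows. This is a hypothetical illustration of the principle, not the actual TSDB downsampling code.

```python
def downsample(samples, window_ms):
    """samples: sorted list of (timestamp_ms, value).

    For each window, keep count/sum/min/max so that sum(), min(), max()
    and avg() remain exactly computable over the downsampled data."""
    windows = {}
    for ts, v in samples:
        start = ts - ts % window_ms  # align to the window boundary
        agg = windows.setdefault(start, {
            "count": 0, "sum": 0.0,
            "min": float("inf"), "max": float("-inf"),
        })
        agg["count"] += 1
        agg["sum"] += v
        agg["min"] = min(agg["min"], v)
        agg["max"] = max(agg["max"], v)
    return sorted(windows.items())

raw = [(0, 1.0), (15_000, 3.0), (30_000, 2.0), (300_000, 10.0)]
down = downsample(raw, 300_000)  # 5-minute windows
```

Because the count and sum are preserved, the average over a window is exactly `sum / count`, which is why aggregations stay correct on downsampled series.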
This downsampled data is then stored in its own block, one per downsampling level for each corresponding raw block.
Key points to consider:
* Downsampling is not for reducing the volume of data in object storage. It is for improving long-range query performance, making your system more versatile and stable.
* Thanos recommends having the same retention duration for raw and downsampled data. This will enable you to have a consistent view of your data over time.
* As a rule of thumb, you can consider that each downsampling level adds roughly as much storage again as the raw data, although in practice it is often a bit less than that.
#### The Compactor UI and the Block Streams
The Compactor's functionality and the progress of its operations can be monitored through the **Block Viewer UI**. This web-based interface is accessible if the Compactor is configured with the `--http-address` flag. Additional UI settings are controlled via `--web.*` and `--block-viewer.*` flags. The Compactor UI provides a visual representation of the compaction process, showing how blocks are grouped and compacted over time. Here is a glimpse of what the UI looks like:
<img src="img/life-of-a-sample/compactor-ui.png" alt="Receive and Store data overlap" width="800"/>
Occasionally, some blocks may display an artificially high compaction level in the UI, appearing lower in the stream compared to adjacent blocks. This scenario often occurs in situations like rolling Receiver upgrades, where Receivers restart sequentially, leading to the creation and upload of partial blocks to the object store. The Compactor then vertically compacts these blocks as they arrive, resulting in a temporary increase in compaction levels. When these blocks are horizontally compacted with adjacent blocks, they will be displayed higher up in the stream.
As explained earlier with compaction levels, by default the Compactor's strategy involves compacting 2-hour blocks into 8-hour blocks once they are available, then progressing to 2-day blocks, and up to 14 days, following a structured compaction timeline.
### Exposing Bucket Data for Queries: The Store Gateway and the Store API
#### Exposing Data for Queries
The Store Gateway acts as a facade for the object storage, making bucket data accessible via the Thanos Store API, a feature first introduced with the Receive component. The Store Gateway exposes the Store API with the `--grpc-address` flag.
The Store Gateway requires access to the object storage bucket to retrieve data, which is configured with the `--objstore.config` flag. You can use the `--max-time` flag to specify which blocks should be considered by the Store Gateway. For example, if your Receive instances are serving data up to 10 hours, you may configure `--max-time=-8h` so that it does not consider blocks more recent than 8 hours. This avoids returning the same data as the Receivers while ensuring some overlap between the two.
To function optimally, the Store Gateway relies on caches. To understand their usefulness, let's first explore how the Store Gateway retrieves data from the blocks in the object storage.
#### Retrieving Samples from the Object Store
Consider the following simple query issued to the Querier:
```promql
# Between now and 2 days ago, compute the rate of http requests per second, filtered by method and status
rate(http_requests_total{method="GET", status="200"}[5m])
```
This PromQL query will be parsed by the Querier, which will emit a Thanos [Store API](https://github.com/thanos-io/thanos/blob/main/pkg/store/storepb/rpc.proto) request to the Store Gateway with the following parameters:
```proto
SeriesRequest request = {
min_time: [Timestamp 2 days ago],
max_time: [Current Timestamp],
max_resolution_window: 1h, // the minimum time range between two samples, relates to the downsampling levels
matchers: [
{ name: "__name__", value: "http_requests_total", type: EQUAL },
{ name: "method", value: "GET", type: EQUAL },
{ name: "status", value: "200", type: EQUAL }
]
}
```
The Store Gateway processes this request in several steps:
* **Metadata processing**: The Store Gateway first examines the block [metadata](https://thanos.io/tip/thanos/storage.md/#metadata-file-metajson) to determine the relevance of each block to the query. It evaluates the time range (`minTime` and `maxTime`) and external labels (`thanos.labels`). Blocks are deemed relevant if their timestamps overlap with the query's time range and if their resolution (`thanos.downsample.resolution`) matches the query's maximum allowed resolution.
* **Index processing**: Next, the Store Gateway retrieves the [indexes](https://thanos.io/tip/thanos/storage.md/#index-format-index) of candidate blocks. This involves:
* Fetching postings lists for each label specified in the query. These are inverted indexes where each label and value has an associated sorted list of all the corresponding time series IDs. Example:
* `"__name__=http_requests_total": [1, 2, 3]`,
* `"method=GET": [1, 2, 6]`,
* `"status=200": [1, 2, 5]`
* Intersecting these postings lists to select series matching all query labels. In our example these are series 1 and 2.
* Retrieving the series section from the index for these series, which includes the chunk files, the time ranges and offset position in the file. Example:
* Series 1: [Chunk 1: mint=t0, maxt=t1, fileRef=0001, offset=0], ...
* Determining the relevant chunks based on their time range intersection with the query.
* **Chunks retrieval**: The Store Gateway then fetches the appropriate chunks, either from the object storage directly or from a chunk cache. When retrieving from the object store, the Gateway leverages its API to read only the needed bytes (i.e., using S3 range requests), bypassing the need to download entire chunk files.
Then, the Gateway streams the selected chunks to the requesting Querier.
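The index lookup described above boils down to intersecting sorted postings lists, one per label matcher. A minimal sketch, using the example lists from the steps above (illustrative only):

```python
def intersect_postings(postings):
    """postings: list of sorted lists of series IDs, one per label matcher.
    Returns the IDs present in every list."""
    result = set(postings[0])
    for p in postings[1:]:
        result &= set(p)
    return sorted(result)

postings = [
    [1, 2, 3],   # __name__=http_requests_total
    [1, 2, 6],   # method=GET
    [1, 2, 5],   # status=200
]
matching_series = intersect_postings(postings)
```

Real index implementations exploit the sorted order and walk the lists in lockstep instead of materializing sets, but the result is the same: only series 1 and 2 match all three label matchers.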
#### Optimizing the Store Gateway
Understanding the retrieval algorithm highlights the critical role of an external [index cache](https://thanos.io/tip/components/store.md/#index-cache) in the Store Gateway's operation. This is configured using the `--index-cache.config` flag. Indexes contain all labels and values of the block, which can result in large sizes. When the cache is full, Least Recently Used (LRU) eviction is applied. In scenarios where no external cache is configured, a portion of the memory will be utilized as a cache, managed via the `--index-cache.size` flag.
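As an illustration, a memcached-backed index cache might be configured like the fragment below; the address is a placeholder, and the full option set is described in the Thanos Store documentation.

```yaml
type: MEMCACHED
config:
  addresses: ["memcached.monitoring.svc.cluster.local:11211"]
```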
Moreover, the direct retrieval of chunks from object storage can be suboptimal, and result in excessive costs, especially with a high volume of queries. To mitigate this, employing a [caching bucket](https://thanos.io/tip/components/store.md/#caching-bucket) can significantly reduce the number of queries to the object storage. This is configured using the `--store.caching-bucket.config` flag. This chunk caching strategy is particularly effective when data access patterns are predominantly focused on recent data. By caching these frequently accessed chunks, query performance is enhanced, and the load on object storage is reduced.
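For illustration, a memcached-backed caching bucket might look like the following (values are examples only, to be tuned to your access patterns):

```yaml
type: MEMCACHED
config:
  addresses: ["memcached-chunks.monitoring.svc.cluster.local:11211"]
chunk_subrange_size: 16000
max_chunks_get_range_requests: 3
chunk_object_attrs_ttl: 24h
chunk_subrange_ttl: 24h
```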
Finally, you can implement the same safeguards as the Receive component by setting limits on the number of samples and series that can be queried. This is accomplished using the same `--store.limits.request-samples` and `--store.limits.request-series` flags.
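For example (the limit values below are illustrative and should be sized to your workload):

```bash
thanos store \
  --store.limits.request-samples=100000000 \
  --store.limits.request-series=1000000
```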
#### Scaling the Store Gateway
The performance of Thanos Store components can be notably improved by managing concurrency and implementing sharding strategies.
Adjusting the level of concurrency can have a significant impact on performance. This is managed through the `--store.grpc.series-max-concurrency` flag, which sets the number of allowed concurrent series requests on the Store API. Other lower-level concurrency settings are also available.
After optimizing store processing, you can distribute the query load using sharding strategies similar to those used with the Compactor. Using a relabel configuration, you can assign a disjoint set of blocks to each Store Gateway replica. Here's an example of how to set up sharding using the `--selector.relabel-config` flag:
```yaml
- action: hashmod
source_labels:
- tenant_id # An external label that identifies some block streams
target_label: shard
  modulus: 2 # The number of Store Gateway replicas
- action: keep
source_labels:
- shard
regex: 0 # The shard number assigned to this Store Gateway
```
Sharding based on the `__block_id` is not recommended because it prevents Stores from selecting the most relevant data resolution needed for a query. For example, one store might see only the raw data and return it, while another store sees the downsampled version for the same query and also returns it. This duplication creates unnecessary overhead.
External-label-based sharding avoids this issue: by giving a Store Gateway a complete view of a stream's data (both raw and downsampled), it can select the most appropriate resolution for each query.
If external label sharding is not sufficient, you can combine it with time partitioning using the `--min-time` and `--max-time` flags. This partitioning applies at the chunk level, so you can use shorter time intervals for recent data in 2-hour blocks, but must use longer intervals for older data to account for horizontal compaction. The goal is for any store instance to have a complete view of a stream's data at every resolution for its assigned time slot, allowing it to return the unique and most appropriate data.
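As an illustrative sketch, two replicas of the same external-label shard could split time like this (the six-week cutoff is an arbitrary example; both flags accept RFC3339 timestamps or durations relative to now):

```bash
# Replica serving recent data:
thanos store --min-time=-6w ...
# Replica serving older, compacted and downsampled data:
thanos store --max-time=-6w ...
```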
### Conclusion
In this second part, we explored how Thanos manages data for efficient storage and retrieval. We examined how the Receive component prepares samples and exposes local data for queries, and how the Compactor optimizes data through compaction and downsampling. We also discussed how the Store Gateway retrieves data and can be optimized by leveraging indexes and implementing sharding strategies.
Now that our samples are efficiently stored and prepared for queries, we can move on to the final part of this series, where we will explore how this distributed data is retrieved by query components like the Querier.
See the full list of articles in this series:
* Life of a Sample in Thanos, and how to configure it: Ingestion (Part I)
* Life of a Sample in Thanos, and how to configure it: Data Management (Part II)
* Life of a Sample in Thanos, and how to configure it: Querying (Part III)

View File

@ -8,4 +8,4 @@ Welcome 👋🏼
This space was created for the Thanos community to share learnings, insights, best practices and cool things to the world. If you are interested in contributing relevant content to Thanos blog, feel free to add Pull Request (PR) to [Thanos repo's blog directory](http://github.com/thanos-io/thanos). See ya there! This space was created for the Thanos community to share learnings, insights, best practices and cool things to the world. If you are interested in contributing relevant content to Thanos blog, feel free to add Pull Request (PR) to [Thanos repo's blog directory](http://github.com/thanos-io/thanos). See ya there!
PS: For Prometheus specific content, consider contributing to [Prometheus blog space](https://prometheus.io/blog/) by creating PR to [Prometheus docs repo](https://github.com/prometheus/docs/tree/main/content/blog). PS: For Prometheus specific content, consider contributing to [Prometheus blog space](https://prometheus.io/blog/) by creating PR to [Prometheus docs repo](https://github.com/prometheus/docs/tree/main/blog-posts).

View File

@ -280,10 +280,75 @@ usage: thanos compact [<flags>]
Continuously compacts blocks in an object store bucket. Continuously compacts blocks in an object store bucket.
Flags: Flags:
-h, --[no-]help Show context-sensitive help (also try
--help-long and --help-man).
--[no-]version Show application version.
--log.level=info Log filtering level.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--[no-]enable-auto-gomemlimit
Enable go runtime to automatically limit memory
consumption.
--auto-gomemlimit.ratio=0.9 --auto-gomemlimit.ratio=0.9
The ratio of reserved GOMEMLIMIT memory to the The ratio of reserved GOMEMLIMIT memory to the
detected maximum container or system memory. detected maximum container or system memory.
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for
HTTP Server.
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--data-dir="./data" Data directory in which to cache blocks and
process compactions.
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config=<content>
Alternative to 'objstore.config-file'
flag (mutually exclusive). Content of
YAML file that contains object store
configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--consistency-delay=30m Minimum age of fresh (non-compacted)
blocks before they are being processed.
Malformed blocks older than the maximum of
consistency-delay and 48h0m0s will be removed.
--retention.resolution-raw=0d
How long to retain raw samples in bucket.
Setting this to 0d will retain samples of this
resolution forever
--retention.resolution-5m=0d
How long to retain samples of resolution 1 (5
minutes) in bucket. Setting this to 0d will
retain samples of this resolution forever
--retention.resolution-1h=0d
How long to retain samples of resolution 2 (1
hour) in bucket. Setting this to 0d will retain
samples of this resolution forever
-w, --[no-]wait Do not exit after all compactions have been
processed and wait for new work.
--wait-interval=5m Wait interval between consecutive compaction
runs and bucket refreshes. Only works when
--wait flag specified.
--[no-]downsampling.disable
Disables downsampling. This is not recommended
as querying long time ranges without
non-downsampled data is not efficient and useful
e.g it is not possible to render all samples for
a human eye anyway
--block-discovery-strategy="concurrent" --block-discovery-strategy="concurrent"
One of concurrent, recursive. When set to One of concurrent, recursive. When set to
concurrent, stores will concurrently issue concurrent, stores will concurrently issue
@ -293,13 +358,13 @@ Flags:
recursively traversing into each directory. recursively traversing into each directory.
This avoids N+1 calls at the expense of having This avoids N+1 calls at the expense of having
slower bucket iterations. slower bucket iterations.
--block-meta-fetch-concurrency=32
Number of goroutines to use when fetching block
metadata from object storage.
--block-files-concurrency=1 --block-files-concurrency=1
Number of goroutines to use when Number of goroutines to use when
fetching/uploading block files from object fetching/uploading block files from object
storage. storage.
--block-meta-fetch-concurrency=32
Number of goroutines to use when fetching block
metadata from object storage.
--block-viewer.global.sync-block-interval=1m --block-viewer.global.sync-block-interval=1m
Repeat interval for syncing the blocks between Repeat interval for syncing the blocks between
local and remote view for /global Block Viewer local and remote view for /global Block Viewer
@ -308,32 +373,37 @@ Flags:
Maximum time for syncing the blocks between Maximum time for syncing the blocks between
local and remote view for /global Block Viewer local and remote view for /global Block Viewer
UI. UI.
--bucket-web-label=BUCKET-WEB-LABEL
External block label to use as group title in
the bucket web UI
--compact.blocks-fetch-concurrency=1
Number of goroutines to use when download block
during compaction.
--compact.cleanup-interval=5m --compact.cleanup-interval=5m
How often we should clean up partially uploaded How often we should clean up partially uploaded
blocks and blocks with deletion mark in the blocks and blocks with deletion mark in the
background when --wait has been enabled. Setting background when --wait has been enabled. Setting
it to "0s" disables it - the cleaning will only it to "0s" disables it - the cleaning will only
happen at the end of an iteration. happen at the end of an iteration.
--compact.concurrency=1 Number of goroutines to use when compacting
groups.
--compact.progress-interval=5m --compact.progress-interval=5m
Frequency of calculating the compaction progress Frequency of calculating the compaction progress
in the background when --wait has been enabled. in the background when --wait has been enabled.
Setting it to "0s" disables it. Now compaction, Setting it to "0s" disables it. Now compaction,
downsampling and retention progress are downsampling and retention progress are
supported. supported.
--consistency-delay=30m Minimum age of fresh (non-compacted) --compact.concurrency=1 Number of goroutines to use when compacting
blocks before they are being processed. groups.
Malformed blocks older than the maximum of --compact.blocks-fetch-concurrency=1
consistency-delay and 48h0m0s will be removed. Number of goroutines to use when download block
--data-dir="./data" Data directory in which to cache blocks and during compaction.
process compactions. --downsample.concurrency=1
Number of goroutines to use when downsampling
blocks.
--delete-delay=48h Time before a block marked for deletion is
deleted from bucket. If delete-delay is non
zero, blocks will be marked for deletion and
compactor component will delete blocks marked
for deletion from the bucket. If delete-delay
is 0, blocks will be deleted straight away.
Note that deleting blocks immediately can cause
query failures, if store gateway still has the
block loaded, or compactor is ignoring the
deletion because it's compacting the block at
the same time.
--deduplication.func= Experimental. Deduplication algorithm for --deduplication.func= Experimental. Deduplication algorithm for
merging overlapping blocks. Possible values are: merging overlapping blocks. Possible values are:
"", "penalty". If no value is specified, "", "penalty". If no value is specified,
@ -360,48 +430,19 @@ Flags:
need a different deduplication algorithm (e.g need a different deduplication algorithm (e.g
one that works well with Prometheus replicas), one that works well with Prometheus replicas),
please set it via --deduplication.func. please set it via --deduplication.func.
--delete-delay=48h Time before a block marked for deletion is
deleted from bucket. If delete-delay is non
zero, blocks will be marked for deletion and
compactor component will delete blocks marked
for deletion from the bucket. If delete-delay
is 0, blocks will be deleted straight away.
Note that deleting blocks immediately can cause
query failures, if store gateway still has the
block loaded, or compactor is ignoring the
deletion because it's compacting the block at
the same time.
--disable-admin-operations
Disable UI/API admin operations like marking
blocks for deletion and no compaction.
--downsample.concurrency=1
Number of goroutines to use when downsampling
blocks.
--downsampling.disable Disables downsampling. This is not recommended
as querying long time ranges without
non-downsampled data is not efficient and useful
e.g it is not possible to render all samples for
a human eye anyway
--enable-auto-gomemlimit Enable go runtime to automatically limit memory
consumption.
--hash-func= Specify which hash function to use when --hash-func= Specify which hash function to use when
calculating the hashes of produced files. calculating the hashes of produced files.
If no function has been specified, it does not If no function has been specified, it does not
happen. This permits avoiding downloading some happen. This permits avoiding downloading some
files twice albeit at some performance cost. files twice albeit at some performance cost.
Possible values are: "", "SHA256". Possible values are: "", "SHA256".
-h, --help Show context-sensitive help (also try --min-time=0000-01-01T00:00:00Z
--help-long and --help-man). Start of time range limit to compact.
--http-address="0.0.0.0:10902" Thanos Compactor will compact only blocks, which
Listen host:port for HTTP endpoints. happened later than this value. Option can be a
--http-grace-period=2m Time to wait after an interrupt received for constant time in RFC3339 format or time duration
HTTP Server. relative to current time, such as -1d or 2h45m.
--http.config="" [EXPERIMENTAL] Path to the configuration file Valid duration units are ms, s, m, h, d, w, y.
that can enable TLS or authentication for all
HTTP endpoints.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--log.level=info Log filtering level.
--max-time=9999-12-31T23:59:59Z --max-time=9999-12-31T23:59:59Z
End of time range limit to compact. End of time range limit to compact.
Thanos Compactor will compact only blocks, Thanos Compactor will compact only blocks,
@ -410,35 +451,14 @@ Flags:
duration relative to current time, such as -1d duration relative to current time, such as -1d
or 2h45m. Valid duration units are ms, s, m, h, or 2h45m. Valid duration units are ms, s, m, h,
d, w, y. d, w, y.
--min-time=0000-01-01T00:00:00Z --[no-]web.disable Disable Block Viewer UI.
Start of time range limit to compact. --selector.relabel-config-file=<file-path>
Thanos Compactor will compact only blocks, which Path to YAML file with relabeling
happened later than this value. Option can be a configuration that allows selecting blocks
constant time in RFC3339 format or time duration to act on based on their external labels.
relative to current time, such as -1d or 2h45m. It follows thanos sharding relabel-config
Valid duration units are ms, s, m, h, d, w, y. syntax. For format details see:
--objstore.config=<content> https://thanos.io/tip/thanos/sharding.md/#relabelling
Alternative to 'objstore.config-file'
flag (mutually exclusive). Content of
YAML file that contains object store
configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--retention.resolution-1h=0d
How long to retain samples of resolution 2 (1
hour) in bucket. Setting this to 0d will retain
samples of this resolution forever
--retention.resolution-5m=0d
How long to retain samples of resolution 1 (5
minutes) in bucket. Setting this to 0d will
retain samples of this resolution forever
--retention.resolution-raw=0d
How long to retain raw samples in bucket.
Setting this to 0d will retain samples of this
resolution forever
--selector.relabel-config=<content> --selector.relabel-config=<content>
Alternative to 'selector.relabel-config-file' Alternative to 'selector.relabel-config-file'
flag (mutually exclusive). Content of YAML flag (mutually exclusive). Content of YAML
@ -447,32 +467,10 @@ Flags:
external labels. It follows thanos sharding external labels. It follows thanos sharding
relabel-config syntax. For format details see: relabel-config syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling https://thanos.io/tip/thanos/sharding.md/#relabelling
--selector.relabel-config-file=<file-path> --web.route-prefix="" Prefix for API and UI endpoints. This allows
Path to YAML file with relabeling thanos UI to be served on a sub-path. This
configuration that allows selecting blocks option is analogous to --web.route-prefix of
to act on based on their external labels. Prometheus.
It follows thanos sharding relabel-config
syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--version Show application version.
-w, --wait Do not exit after all compactions have been
processed and wait for new work.
--wait-interval=5m Wait interval between consecutive compaction
runs and bucket refreshes. Only works when
--wait flag specified.
--web.disable Disable Block Viewer UI.
--web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to
be allowed by all.
--web.external-prefix="" Static prefix for all HTML links and redirect --web.external-prefix="" Static prefix for all HTML links and redirect
URLs in the bucket web UI interface. URLs in the bucket web UI interface.
Actual endpoints are still served on / or the Actual endpoints are still served on / or the
@ -492,9 +490,14 @@ Flags:
stripped prefix value in X-Forwarded-Prefix stripped prefix value in X-Forwarded-Prefix
header. This allows thanos UI to be served on a header. This allows thanos UI to be served on a
sub-path. sub-path.
--web.route-prefix="" Prefix for API and UI endpoints. This allows --[no-]web.disable-cors Whether to disable CORS headers to be set by
thanos UI to be served on a sub-path. This Thanos. By default Thanos sets CORS headers to
option is analogous to --web.route-prefix of be allowed by all.
Prometheus. --bucket-web-label=BUCKET-WEB-LABEL
External block label to use as group title in
the bucket web UI
--[no-]disable-admin-operations
Disable UI/API admin operations like marking
blocks for deletion and no compaction.
``` ```

View File

@ -200,198 +200,196 @@ usage: thanos query-frontend [<flags>]
Query frontend command implements a service deployed in front of queriers to Query frontend command implements a service deployed in front of queriers to
improve query parallelization and caching. improve query parallelization and caching.
Flags: Flags:
--auto-gomemlimit.ratio=0.9 -h, --[no-]help Show context-sensitive help (also try --help-long
The ratio of reserved GOMEMLIMIT memory to the and --help-man).
detected maximum container or system memory. --[no-]version Show application version.
--cache-compression-type="" --log.level=info Log filtering level.
Use compression in results cache. --log.format=logfmt Log format to use. Possible options: logfmt or
Supported values are: 'snappy' and ” (disable json.
compression).
--enable-auto-gomemlimit Enable go runtime to automatically limit memory
consumption.
-h, --help Show context-sensitive help (also try
--help-long and --help-man).
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for
HTTP Server.
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--labels.default-time-range=24h
The default metadata time range duration for
retrieving labels through Labels and Series API
when the range parameters are not specified.
--labels.max-query-parallelism=14
Maximum number of labels requests will be
scheduled in parallel by the Frontend.
--labels.max-retries-per-request=5
Maximum number of retries for a single
label/series API request; beyond this,
the downstream error is returned.
--labels.partial-response Enable partial response for labels requests
if no partial_response param is specified.
--no-labels.partial-response for disabling.
--labels.response-cache-config=<content>
Alternative to
'labels.response-cache-config-file' flag
(mutually exclusive). Content of YAML file that
contains response cache configuration.
--labels.response-cache-config-file=<file-path>
Path to YAML file that contains response cache
configuration.
--labels.response-cache-max-freshness=1m
Most recent allowed cacheable result for
labels requests, to prevent caching very recent
results that might still be in flux.
--labels.split-interval=24h
Split labels requests by an interval and
execute in parallel, it should be greater
than 0 when labels.response-cache-config is
configured.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--log.level=info Log filtering level.
--query-frontend.compress-responses
Compress HTTP responses.
--query-frontend.downstream-tripper-config=<content>
Alternative to
'query-frontend.downstream-tripper-config-file'
flag (mutually exclusive). Content of YAML file
that contains downstream tripper configuration.
If your downstream URL is localhost or
127.0.0.1 then it is highly recommended to
increase max_idle_conns_per_host to at least
100.
--query-frontend.downstream-tripper-config-file=<file-path>
Path to YAML file that contains downstream
tripper configuration. If your downstream URL
is localhost or 127.0.0.1 then it is highly
recommended to increase max_idle_conns_per_host
to at least 100.
--query-frontend.downstream-url="http://localhost:9090"
URL of downstream Prometheus Query compatible
API.
--query-frontend.enable-x-functions
Enable experimental x-
functions in query-frontend.
--no-query-frontend.enable-x-functions for
disabling.
--query-frontend.force-query-stats
Enables query statistics for all queries and
will export statistics as logs and service
headers.
--query-frontend.forward-header=<http-header-name> ...
List of headers forwarded by the query-frontend
to downstream queriers, default is empty
--query-frontend.log-queries-longer-than=0
Log queries that are slower than the specified
duration. Set to 0 to disable. Set to < 0 to
enable on all queries.
--query-frontend.org-id-header=<http-header-name> ...
Deprecation Warning - This flag
will be soon deprecated in favor of
query-frontend.tenant-header and both flags
cannot be used at the same time. Request header
names used to identify the source of slow
queries (repeated flag). The values of the
header will be added to the org id field in
the slow query log. If multiple headers match
the request, the first matching arg specified
will take precedence. If no headers match
'anonymous' will be used.
--query-frontend.slow-query-logs-user-header=<http-header-name>
Set the value of the field remote_user in the
slow query logs to the value of the given HTTP
header. Falls back to reading the user from the
basic auth header.
--query-frontend.vertical-shards=QUERY-FRONTEND.VERTICAL-SHARDS
Number of shards to use when
distributing shardable PromQL queries.
For more details, you can refer to
the Vertical query sharding proposal:
https://thanos.io/tip/proposals-accepted/202205-vertical-query-sharding.md
--query-range.align-range-with-step
Mutate incoming queries to align their
start and end with their step for better
cache-ability. Note: Grafana dashboards do that
by default.
--query-range.horizontal-shards=0
Split queries in this many requests
when query duration is below
query-range.max-split-interval.
--query-range.max-query-length=0
Limit the query time range (end - start time)
in the query-frontend, 0 disables it.
--query-range.max-query-parallelism=14
Maximum number of query range requests will be
scheduled in parallel by the Frontend.
--query-range.max-retries-per-request=5
Maximum number of retries for a single query
range request; beyond this, the downstream
error is returned.
--query-range.max-split-interval=0
Split query range below this interval in
query-range.horizontal-shards. Queries with a
range longer than this value will be split in
multiple requests of this length.
--query-range.min-split-interval=0
Split query range requests above this
interval in query-range.horizontal-shards
requests of equal range. Using
this parameter is not allowed with
query-range.split-interval. One should also set
query-range.split-min-horizontal-shards to a
value greater than 1 to enable splitting.
--query-range.partial-response
Enable partial response for query range
requests if no partial_response param is
specified. --no-query-range.partial-response
for disabling.
--query-range.request-downsampled
Make additional query for downsampled data in
case of empty or incomplete response to range
request.
--query-range.response-cache-config=<content>
Alternative to
'query-range.response-cache-config-file' flag
(mutually exclusive). Content of YAML file that
contains response cache configuration.
--query-range.response-cache-config-file=<file-path>
Path to YAML file that contains response cache
configuration.
--query-range.response-cache-max-freshness=1m
Most recent allowed cacheable result for query
range requests, to prevent caching very recent
results that might still be in flux.
--query-range.split-interval=24h
Split query range requests by an interval and
execute in parallel, it should be greater than
0 when query-range.response-cache-config is
configured.
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config-file=<file-path> --tracing.config-file=<file-path>
Path to YAML file with tracing Path to YAML file with tracing
configuration. See format details: configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration https://thanos.io/tip/thanos/tracing.md/#configuration
--version Show application version. --tracing.config=<content>
--web.disable-cors Whether to disable CORS headers to be set by Alternative to 'tracing.config-file' flag
Thanos. By default Thanos sets CORS headers to (mutually exclusive). Content of YAML file
be allowed by all. with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--[no-]enable-auto-gomemlimit
Enable go runtime to automatically limit memory
consumption.
--auto-gomemlimit.ratio=0.9
The ratio of reserved GOMEMLIMIT memory to the
detected maximum container or system memory.
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for HTTP
Server.
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--[no-]web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to be
allowed by all.
--[no-]query-range.align-range-with-step
Mutate incoming queries to align their start and
end with their step for better cache-ability.
Note: Grafana dashboards do that by default.
--[no-]query-range.request-downsampled
Make additional query for downsampled data in
case of empty or incomplete response to range
request.
--query-range.split-interval=24h
Split query range requests by an interval and
execute in parallel, it should be greater than
0 when query-range.response-cache-config is
configured.
--query-range.min-split-interval=0
Split query range requests above this interval
in query-range.horizontal-shards requests of
equal range. Using this parameter is not allowed
with query-range.split-interval. One should also
set query-range.split-min-horizontal-shards to a
value greater than 1 to enable splitting.
--query-range.max-split-interval=0
Split query range below this interval in
query-range.horizontal-shards. Queries with a
range longer than this value will be split in
multiple requests of this length.
--query-range.horizontal-shards=0
Split queries in this many requests when query
duration is below query-range.max-split-interval.
--query-range.max-retries-per-request=5
Maximum number of retries for a single query
range request; beyond this, the downstream error
is returned.
--[no-]query-frontend.enable-x-functions
Enable experimental x-
functions in query-frontend.
--no-query-frontend.enable-x-functions for
disabling.
--enable-feature= ... Comma separated feature names to enable. Valid
options for now: promql-experimental-functions
(enables promql experimental functions in
query-frontend)
--query-range.max-query-length=0
Limit the query time range (end - start time) in
the query-frontend, 0 disables it.
--query-range.max-query-parallelism=14
Maximum number of query range requests will be
scheduled in parallel by the Frontend.
--query-range.response-cache-max-freshness=1m
Most recent allowed cacheable result for query
range requests, to prevent caching very recent
results that might still be in flux.
--[no-]query-range.partial-response
Enable partial response for query range requests
if no partial_response param is specified.
--no-query-range.partial-response for disabling.
--query-range.response-cache-config-file=<file-path>
Path to YAML file that contains response cache
configuration.
--query-range.response-cache-config=<content>
Alternative to
'query-range.response-cache-config-file' flag
(mutually exclusive). Content of YAML file that
contains response cache configuration.
--labels.split-interval=24h
Split labels requests by an interval and execute
in parallel, it should be greater than 0 when
labels.response-cache-config is configured.
--labels.max-retries-per-request=5
Maximum number of retries for a single
label/series API request; beyond this, the
downstream error is returned.
--labels.max-query-parallelism=14
Maximum number of labels requests will be
scheduled in parallel by the Frontend.
--labels.response-cache-max-freshness=1m
Most recent allowed cacheable result for labels
requests, to prevent caching very recent results
that might still be in flux.
--[no-]labels.partial-response
Enable partial response for labels requests
if no partial_response param is specified.
--no-labels.partial-response for disabling.
--labels.default-time-range=24h
The default metadata time range duration for
retrieving labels through Labels and Series API
when the range parameters are not specified.
--labels.response-cache-config-file=<file-path>
Path to YAML file that contains response cache
configuration.
--labels.response-cache-config=<content>
Alternative to
'labels.response-cache-config-file' flag
(mutually exclusive). Content of YAML file that
contains response cache configuration.
--cache-compression-type=""
Use compression in results cache. Supported
values are: 'snappy' and '' (disable compression).
--query-frontend.downstream-url="http://localhost:9090"
URL of downstream Prometheus Query compatible
API.
--query-frontend.downstream-tripper-config-file=<file-path>
Path to YAML file that contains downstream
tripper configuration. If your downstream URL
is localhost or 127.0.0.1 then it is highly
recommended to increase max_idle_conns_per_host
to at least 100.
--query-frontend.downstream-tripper-config=<content>
Alternative to
'query-frontend.downstream-tripper-config-file'
flag (mutually exclusive). Content of YAML file
that contains downstream tripper configuration.
If your downstream URL is localhost or 127.0.0.1
then it is highly recommended to increase
max_idle_conns_per_host to at least 100.
--[no-]query-frontend.compress-responses
Compress HTTP responses.
--query-frontend.log-queries-longer-than=0
Log queries that are slower than the specified
duration. Set to 0 to disable. Set to < 0 to
enable on all queries.
--[no-]query-frontend.force-query-stats
Enables query statistics for all queries and will
export statistics as logs and service headers.
--query-frontend.org-id-header=<http-header-name> ...
Deprecation Warning - This flag
will be soon deprecated in favor of
query-frontend.tenant-header and both flags
cannot be used at the same time. Request header
names used to identify the source of slow queries
(repeated flag). The values of the header will be
added to the org id field in the slow query log.
If multiple headers match the request, the first
matching arg specified will take precedence.
If no headers match 'anonymous' will be used.
--query-frontend.forward-header=<http-header-name> ...
List of headers forwarded by the query-frontend
to downstream queriers, default is empty
--query-frontend.vertical-shards=QUERY-FRONTEND.VERTICAL-SHARDS
Number of shards to use when
distributing shardable PromQL queries.
For more details, you can refer to
the Vertical query sharding proposal:
https://thanos.io/tip/proposals-accepted/202205-vertical-query-sharding.md
--query-frontend.slow-query-logs-user-header=<http-header-name>
Set the value of the field remote_user in the
slow query logs to the value of the given HTTP
header. Falls back to reading the user from the
basic auth header.
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
```
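The `--query-range.response-cache-config` / `--query-range.response-cache-config-file` flags (and their `--labels.*` equivalents) take a small YAML document describing the cache backend. A minimal in-memory sketch, assuming the `IN-MEMORY` cache type and illustrative size values:

```yaml
type: IN-MEMORY
config:
  max_size: "500MB"   # maximum total size of cached items; empty/0 means unlimited
  max_size_items: 0   # maximum number of cached items; 0 means unlimited
  validity: 0s        # how long cached items stay valid; 0s means forever
```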


@@ -114,13 +114,13 @@ Note that deduplication of HA groups is not supported by the `chain` algorithm.
## Thanos PromQL Engine (experimental)
By default, Thanos querier comes with the standard Prometheus PromQL engine. However, when `--query.promql-engine=thanos` is specified, Thanos will use the [experimental Thanos PromQL engine](http://github.com/thanos-io/promql-engine), which is a drop-in, efficient implementation of the PromQL engine with a query planner and optimizers.
To learn more, see [the introduction talk](https://youtu.be/pjkWzDVxWk4?t=3609) from [the PromConEU 2022](https://promcon.io/2022-munich/talks/opening-pandoras-box-redesigning/).
This feature is still **experimental** given active development. All queries should be supported due to the built-in fallback to the old PromQL engine if something is not yet implemented.
For new engine bugs/issues, please use https://github.com/thanos-io/promql-engine GitHub issues.
### Distributed execution mode
@@ -281,7 +281,7 @@ Example file SD file in YAML:
### Tenant Metrics
Tenant information is captured in relevant Thanos exported metrics in the Querier, Query Frontend and Store. In order to make use of this functionality, requests to the Query/Query Frontend component should include the tenant-id in the appropriate HTTP request header as configured with `--query.tenant-header`. The tenant information is passed through components (including Query Frontend), down to the Thanos Store, enabling per-tenant metrics in these components also. If no tenant header is set on requests to the query component, the default tenant as defined by `--query.default-tenant-id` will be used.
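To make the header plumbing concrete, here is a minimal sketch using Python's standard `urllib`, assuming a hypothetical Querier address and the default `THANOS-TENANT` header name (the value of `--query.tenant-header`):

```python
import urllib.request

# Hypothetical Querier / Query Frontend address; THANOS-TENANT is the
# default value of --query.tenant-header.
req = urllib.request.Request(
    "http://thanos-query.example.com:10902/api/v1/query?query=up",
    headers={"THANOS-TENANT": "team-a"},
)

# The request is only constructed here, not sent; on the receiving side,
# per-tenant metrics in Querier/Query Frontend/Store would be attributed
# to tenant "team-a". Note urllib normalizes stored header names.
print(req.get_header("Thanos-tenant"))
```

Without the header, the request would be attributed to the tenant configured via `--query.default-tenant-id`.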
### Tenant Enforcement
@@ -299,96 +299,29 @@ usage: thanos query [<flags>]
Query node exposing PromQL enabled Query API with data retrieved from multiple
store nodes.
Flags:
-h, --[no-]help Show context-sensitive help (also try
--help-long and --help-man).
--[no-]version Show application version.
--log.level=info Log filtering level.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--[no-]enable-auto-gomemlimit
Enable go runtime to automatically limit memory
consumption.
--auto-gomemlimit.ratio=0.9
The ratio of reserved GOMEMLIMIT memory to the
detected maximum container or system memory.
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for
@@ -396,179 +329,50 @@
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--grpc-address="0.0.0.0:10901"
Listen ip:port address for gRPC endpoints
(StoreAPI). Make sure this address is routable
from other components.
--grpc-server-tls-cert="" TLS Certificate for gRPC server, leave blank to
disable TLS
--grpc-server-tls-key="" TLS Key for the gRPC server, leave blank to
disable TLS
--grpc-server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--grpc-server-tls-min-version="1.3"
TLS supported minimum version for gRPC server.
If no version is specified, it'll default to
1.3. Allowed values: ["1.0", "1.1", "1.2",
"1.3"]
--grpc-server-max-connection-age=60m
The grpc server max connection age. This
controls how often to re-establish connections
and redo TLS handshakes.
--grpc-grace-period=2m Time to wait after an interrupt received for
GRPC Server.
--[no-]grpc-client-tls-secure
Use TLS when talking to the gRPC server
--[no-]grpc-client-tls-skip-verify
Disable TLS certificate verification i.e self
signed, signed by fake CA
--grpc-client-tls-cert="" TLS Certificates to use to identify this client
to the server
--grpc-client-tls-key="" TLS Key for the client's certificate
--grpc-client-tls-ca="" TLS CA Certificates to use to verify gRPC
servers
--grpc-client-server-name=""
Server name to verify the hostname on
the returned gRPC certificates. See
https://tools.ietf.org/html/rfc4366#section-3.1
--grpc-compression=none Compression algorithm to use for gRPC requests
to other clients. Must be one of: snappy, none
--web.route-prefix="" Prefix for API and UI endpoints. This allows
thanos UI to be served on a sub-path.
Defaults to the value of --web.external-prefix.
This option is analogous to --web.route-prefix
of Prometheus.
--web.external-prefix="" Static prefix for all HTML links and
redirect URLs in the UI query web interface.
Actual endpoints are still served on / or the
@@ -588,11 +392,217 @@
stripped prefix value in X-Forwarded-Prefix
header. This allows thanos UI to be served on a
sub-path.
--[no-]web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to
be allowed by all.
--query.timeout=2m Maximum time to process query by query node.
--query.promql-engine=prometheus
Default PromQL engine to use.
--[no-]query.enable-x-functions
Whether to enable extended rate functions
(xrate, xincrease and xdelta). Only has effect
when used with Thanos engine.
--query.mode=local PromQL query mode. One of: local, distributed.
--query.max-concurrent=20 Maximum number of queries processed
concurrently by query node.
--query.lookback-delta=QUERY.LOOKBACK-DELTA
The maximum lookback duration for retrieving
metrics during expression evaluations.
PromQL always evaluates the query for the
certain timestamp (query range timestamps are
deduced by step). Since scrape intervals might
be different, PromQL looks back for given
amount of time to get latest sample. If it
exceeds the maximum lookback delta it assumes
series is stale and returns none (a gap).
This is why lookback delta should be set to at
least 2 times of the slowest scrape interval.
If unset it will use the promql default of 5m.
--query.max-concurrent-select=4
Maximum number of select requests made
concurrently per a query.
--query.conn-metric.label=external_labels... ...
Optional selection of query connection metric
labels to be collected from endpoint set
--deduplication.func=penalty
Experimental. Deduplication algorithm for
merging overlapping series. Possible values
are: "penalty", "chain". If no value is
specified, penalty based deduplication
algorithm will be used. When set to chain, the
default compact deduplication merger is used,
which performs 1:1 deduplication for samples.
At least one replica label has to be set via
--query.replica-label flag.
--query.replica-label=QUERY.REPLICA-LABEL ...
Labels to treat as a replica indicator along
which data is deduplicated. Still you will
be able to query without deduplication using
'dedup=false' parameter. Data includes time
series, recording rules, and alerting rules.
Flag may be specified multiple times as well as
a comma separated list of labels.
--query.partition-label=QUERY.PARTITION-LABEL ...
Labels that partition the leaf queriers. This
is used to scope down the labelsets of leaf
queriers when using the distributed query mode.
If set, these labels must form a partition
of the leaf queriers. Partition labels must
not intersect with replica labels. Every TSDB
of a leaf querier must have these labels.
This is useful when there are multiple external
labels that are irrelevant for the partition as
it allows the distributed engine to ignore them
for some optimizations. If this is empty then
all labels are used as partition labels.
--query.metadata.default-time-range=0s
The default metadata time range duration for
retrieving labels through Labels and Series API
when the range parameters are not specified.
The zero value means range covers the time
since the beginning.
--selector-label=<name>="<value>" ...
Query selector labels that will be exposed in
info endpoint (repeated).
--[no-]query.auto-downsampling
Enable automatic adjustment (step / 5) to what
source of data should be used in store gateways
if no max_source_resolution param is specified.
--[no-]query.partial-response
Enable partial response for queries if
no partial_response param is specified.
--no-query.partial-response for disabling.
--query.active-query-path=""
Directory to log currently active queries in
the queries.active file.
--enable-feature= ... Comma separated feature names to enable. Valid
options for now: promql-experimental-functions
(enables promql experimental functions in
query)
--query.default-evaluation-interval=1m
Set default evaluation interval for sub
queries.
--query.default-step=1s Set default step for range queries. Default
step is only used when step is not set in UI.
In such cases, Thanos UI will use default
step to calculate resolution (resolution
= max(rangeSeconds / 250, defaultStep)).
This will not work from Grafana, but Grafana
has __step variable which can be used.
--store.response-timeout=0ms
If a Store doesn't send any data in this
specified duration then a Store will be ignored
and partial data will be returned if it's
enabled. 0 disables timeout.
--selector.relabel-config-file=<file-path>
Path to YAML file with relabeling
configuration that allows selecting blocks
to query based on their external labels.
It follows the Thanos sharding relabel-config
syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--selector.relabel-config=<content>
Alternative to 'selector.relabel-config-file'
flag (mutually exclusive). Content of YAML
file with relabeling configuration that allows
selecting blocks to query based on their
external labels. It follows the Thanos sharding
relabel-config syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--alert.query-url=ALERT.QUERY-URL
The external Thanos Query URL that would be set
in all alerts 'Source' field.
--query.telemetry.request-duration-seconds-quantiles=0.1... ...
The quantiles for exporting metrics about the
request duration quantiles.
--query.telemetry.request-samples-quantiles=100... ...
The quantiles for exporting metrics about the
samples count quantiles.
--query.telemetry.request-series-seconds-quantiles=10... ...
The quantiles for exporting metrics about the
series count quantiles.
--query.tenant-header="THANOS-TENANT"
HTTP header to determine tenant.
--query.default-tenant-id="default-tenant"
Default tenant ID to use if tenant header is
not present
--query.tenant-certificate-field=
Use TLS client's certificate field to determine
tenant for write requests. Must be one of
organization, organizationalUnit or commonName.
This setting will cause the query.tenant-header
flag value to be ignored.
--[no-]query.enforce-tenancy
Enforce tenancy on Query APIs. Responses
are returned only if the label value of the
configured tenant-label-name and the value of
the tenant header matches.
--query.tenant-label-name="tenant_id"
Label name to use when enforcing tenancy (if
--query.enforce-tenancy is enabled).
--store.sd-dns-interval=30s
Interval between DNS resolutions.
--store.unhealthy-timeout=5m
Timeout before an unhealthy store is cleaned
from the store UI page.
--endpoint.sd-config-file=<file-path>
Path to Config File with endpoint definitions
--endpoint.sd-config=<content>
Alternative to 'endpoint.sd-config-file' flag
(mutually exclusive). Content of Config File
with endpoint definitions
--endpoint.sd-config-reload-interval=5m
Interval between endpoint config refreshes
--store.sd-files=<path> ...
(Deprecated) Path to files that contain
addresses of store API servers. The path can be
a glob pattern (repeatable).
--store.sd-interval=5m (Deprecated) Refresh interval to re-read file
SD files. It is used as a resync fallback.
--endpoint=<endpoint> ... (Deprecated): Addresses of statically
configured Thanos API servers (repeatable).
The scheme may be prefixed with 'dns+' or
'dnssrv+' to detect Thanos API servers through
respective DNS lookups.
--endpoint-group=<endpoint-group> ...
(Deprecated, Experimental): DNS name of
statically configured Thanos API server groups
(repeatable). Targets resolved from the DNS
name will be queried in a round-robin, instead
of a fanout manner. This flag should be used
when connecting a Thanos Query to HA groups of
Thanos components.
--endpoint-strict=<endpoint-strict> ...
(Deprecated): Addresses of only statically
configured Thanos API servers that are always
used, even if the health check fails. Useful if
you have a caching layer on top.
--endpoint-group-strict=<endpoint-group-strict> ...
(Deprecated, Experimental): DNS name of
statically configured Thanos API server groups
(repeatable) that are always used, even if the
health check fails.
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--store.limits.request-samples=0
The maximum samples allowed for a single
Series request, The Series call fails if
this limit is exceeded. 0 means no limit.
NOTE: For efficiency the limit is internally
implemented as 'chunks limit' considering each
chunk contains a maximum of 120 samples.
```


@@ -22,6 +22,47 @@ The Ketama algorithm is a consistent hashing scheme which enables stable scaling
If you are using the `hashmod` algorithm and wish to migrate to `ketama`, the simplest and safest way would be to set up a new pool of receivers with `ketama` hashrings and start remote-writing to them. Provided you are on the latest Thanos version, old receivers will flush their TSDBs after the configured retention period and will upload blocks to object storage. Once you have verified that is done, decommission the old receivers.
#### Shuffle sharding
Ketama also supports [shuffle sharding](https://aws.amazon.com/builders-library/workload-isolation-using-shuffle-sharding/). It allows you to provide a single-tenant experience in a multi-tenant system. With shuffle sharding, a tenant gets a subset of all nodes in a hashring. You can configure shuffle sharding for any Ketama hashring like so:
```json
[
{
"endpoints": [
{"address": "node-1:10901", "capnproto_address": "node-1:19391", "az": "foo"},
{"address": "node-2:10901", "capnproto_address": "node-2:19391", "az": "bar"},
{"address": "node-3:10901", "capnproto_address": "node-3:19391", "az": "qux"},
{"address": "node-4:10901", "capnproto_address": "node-4:19391", "az": "foo"},
{"address": "node-5:10901", "capnproto_address": "node-5:19391", "az": "bar"},
{"address": "node-6:10901", "capnproto_address": "node-6:19391", "az": "qux"}
],
"algorithm": "ketama",
"shuffle_sharding_config": {
"shard_size": 2,
"cache_size": 100,
"overrides": [
{
"shard_size": 3,
"tenants": ["prefix-tenant-*"],
"tenant_matcher_type": "glob"
}
]
}
}
]
```
This will enable shuffle sharding with the default shard size of 2 and override it to 3 for every tenant that starts with `prefix-tenant-`.
`cache_size` sets the size of the in-memory LRU cache of the computed subrings. Caching everything is not possible, because an attacker could spam requests with random tenants and those subrings would then stay in memory forever.
With this config, `shard_size/number_of_azs` nodes are chosen from each availability zone for each tenant. So, each tenant will get a unique and consistent set of 3 nodes.
You can use `zone_awareness_disabled` to disable zone awareness. This is useful when you have many separate AZs and it does not matter which ones are chosen. The shards will ignore AZs, but the Ketama algorithm will still prefer spreading load across as many AZs as possible. That is why, with zone awareness disabled, it is recommended to set the shard size to `max(nodes_in_any_az, replication_factor)`.
Receive currently supports only stateless shuffle sharding, so it does not store shard assignments or check whether shards have overlapped over time.
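The subring computation described above can be sketched in Go. This is a hypothetical illustration of the core idea (a deterministic, per-tenant subset of nodes), not the actual Thanos implementation, and it ignores the AZ-spreading step:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/rand"
)

// shuffleShard deterministically selects shardSize nodes for a tenant:
// the tenant name seeds a PRNG, the PRNG permutes the node list, and
// the first shardSize entries form the tenant's subring.
func shuffleShard(nodes []string, tenant string, shardSize int) []string {
	h := fnv.New64a()
	h.Write([]byte(tenant))
	rnd := rand.New(rand.NewSource(int64(h.Sum64())))
	shard := make([]string, 0, shardSize)
	for _, i := range rnd.Perm(len(nodes))[:shardSize] {
		shard = append(shard, nodes[i])
	}
	return shard
}

func main() {
	nodes := []string{"node-1", "node-2", "node-3", "node-4", "node-5", "node-6"}
	// The same tenant always maps to the same subset of nodes.
	fmt.Println(shuffleShard(nodes, "tenant-a", 2))
	fmt.Println(shuffleShard(nodes, "prefix-tenant-x", 3))
}
```

Because the subset is a pure function of the tenant name, every router computes the same subring without coordination; the LRU cache mentioned above only avoids recomputing it.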
### Hashmod (discouraged)
This algorithm uses a `hashmod` function over all labels to decide which receiver is responsible for a given timeseries. This is the default algorithm due to historical reasons. However, its usage for new Receive installations is discouraged since adding new Receiver nodes leads to series churn and memory usage spikes.
@@ -331,7 +372,7 @@ Please see the metric `thanos_receive_forward_delay_seconds` to see if you need
The following formula is used for calculating quorum:
```go mdox-exec="sed -n '1046,1056p' pkg/receive/handler.go"
// writeQuorum returns minimum number of replicas that has to confirm write success before claiming replication success.
func (h *Handler) writeQuorum() int {
	// NOTE(GiedriusS): this is here because otherwise RF=2 doesn't make sense as all writes
@@ -354,46 +395,29 @@ usage: thanos receive [<flags>]
Accept Prometheus remote write API requests and write to local tsdb.
Flags:
  -h, --[no-]help            Show context-sensitive help (also try
                             --help-long and --help-man).
      --[no-]version         Show application version.
      --log.level=info       Log filtering level.
      --log.format=logfmt    Log format to use. Possible options: logfmt or
                             json.
      --tracing.config-file=<file-path>
                             Path to YAML file with tracing
                             configuration. See format details:
                             https://thanos.io/tip/thanos/tracing.md/#configuration
      --tracing.config=<content>
                             Alternative to 'tracing.config-file' flag
                             (mutually exclusive). Content of YAML file
                             with tracing configuration. See format details:
                             https://thanos.io/tip/thanos/tracing.md/#configuration
      --[no-]enable-auto-gomemlimit
                             Enable go runtime to automatically limit memory
                             consumption.
      --auto-gomemlimit.ratio=0.9
                             The ratio of reserved GOMEMLIMIT memory to the
                             detected maximum container or system memory.
      --http-address="0.0.0.0:10902"
                             Listen host:port for HTTP endpoints.
      --http-grace-period=2m Time to wait after an interrupt received for
@@ -401,40 +425,100 @@ Flags:
      --http.config=""       [EXPERIMENTAL] Path to the configuration file
                             that can enable TLS or authentication for all
                             HTTP endpoints.
      --grpc-address="0.0.0.0:10901"
                             Listen ip:port address for gRPC endpoints
                             (StoreAPI). Make sure this address is routable
                             from other components.
      --grpc-server-tls-cert=""
                             TLS Certificate for gRPC server, leave blank to
                             disable TLS
      --grpc-server-tls-key=""
                             TLS Key for the gRPC server, leave blank to
                             disable TLS
      --grpc-server-tls-client-ca=""
                             TLS CA to verify clients against. If no
                             client CA is specified, there is no client
                             verification on server side. (tls.NoClientCert)
      --grpc-server-tls-min-version="1.3"
                             TLS supported minimum version for gRPC server.
                             If no version is specified, it'll default to
                             1.3. Allowed values: ["1.0", "1.1", "1.2",
                             "1.3"]
      --grpc-server-max-connection-age=60m
                             The grpc server max connection age. This
                             controls how often to re-establish connections
                             and redo TLS handshakes.
      --grpc-grace-period=2m Time to wait after an interrupt received for
                             GRPC Server.
      --store.limits.request-series=0
                             The maximum series allowed for a single Series
                             request. The Series call fails if this limit is
                             exceeded. 0 means no limit.
      --store.limits.request-samples=0
                             The maximum samples allowed for a single
                             Series request. The Series call fails if
                             this limit is exceeded. 0 means no limit.
                             NOTE: For efficiency the limit is internally
                             implemented as 'chunks limit' considering each
                             chunk contains a maximum of 120 samples.
      --remote-write.address="0.0.0.0:19291"
                             Address to listen on for remote write requests.
      --remote-write.server-tls-cert=""
                             TLS Certificate for HTTP server, leave blank to
                             disable TLS.
      --remote-write.server-tls-key=""
                             TLS Key for the HTTP server, leave blank to
                             disable TLS.
      --remote-write.server-tls-client-ca=""
                             TLS CA to verify clients against. If no
                             client CA is specified, there is no client
                             verification on server side. (tls.NoClientCert)
      --remote-write.server-tls-min-version="1.3"
                             TLS version for the gRPC server, leave blank
                             to default to TLS 1.3, allowed values: ["1.0",
                             "1.1", "1.2", "1.3"]
      --remote-write.client-tls-cert=""
                             TLS Certificates to use to identify this client
                             to the server.
      --remote-write.client-tls-key=""
                             TLS Key for the client's certificate.
      --[no-]remote-write.client-tls-secure
                             Use TLS when talking to the other receivers.
      --[no-]remote-write.client-tls-skip-verify
                             Disable TLS certificate verification when
                             talking to the other receivers i.e self signed,
                             signed by fake CA.
      --remote-write.client-tls-ca=""
                             TLS CA Certificates to use to verify servers.
      --remote-write.client-server-name=""
                             Server name to verify the hostname
                             on the returned TLS certificates. See
                             https://tools.ietf.org/html/rfc4366#section-3.1
      --tsdb.path="./data"   Data directory of TSDB.
      --label=key="value" ...
                             External labels to announce. This flag will be
                             removed in the future when handling multiple
                             tsdb instances is added.
      --objstore.config-file=<file-path>
                             Path to YAML file that contains object
                             store configuration. See format details:
                             https://thanos.io/tip/thanos/storage.md/#configuration
      --objstore.config=<content>
                             Alternative to 'objstore.config-file'
                             flag (mutually exclusive). Content of
                             YAML file that contains object store
                             configuration. See format details:
                             https://thanos.io/tip/thanos/storage.md/#configuration
      --tsdb.retention=15d   How long to retain raw samples on local
                             storage. 0d - disables the retention
                             policy (i.e. infinite retention).
                             For more details on how retention is
                             enforced for individual tenants, please
                             refer to the Tenant lifecycle management
                             section in the Receive documentation:
                             https://thanos.io/tip/components/receive.md/#tenant-lifecycle-management
      --receive.hashrings-file=<path>
                             Path to file that contains the hashring
                             configuration. A watcher is initialized
                             to watch changes and update the hashring
                             dynamically.
      --receive.hashrings=<content>
                             Alternative to 'receive.hashrings-file' flag
                             (lower priority). Content of file that contains
@@ -444,11 +528,6 @@ Flags:
                             the hashrings. Must be one of hashmod, ketama.
                             Will be overwritten by the tenant-specific
                             algorithm in the hashring config.
      --receive.hashrings-file-refresh-interval=5m
                             Refresh interval to re-read the hashring
                             configuration file. (used as a fallback)
@@ -458,23 +537,35 @@ Flags:
                             configuration. If it's empty AND hashring
                             configuration was provided, it means that
                             receive will run in RoutingOnly mode.
      --receive.tenant-header="THANOS-TENANT"
                             HTTP header to determine tenant for write
                             requests.
      --receive.tenant-certificate-field=
                             Use TLS client's certificate field to
                             determine tenant for write requests.
                             Must be one of organization, organizationalUnit
                             or commonName. This setting will cause the
                             receive.tenant-header flag value to be ignored.
      --receive.default-tenant-id="default-tenant"
                             Default tenant ID to use when none is provided
                             via a header.
      --receive.split-tenant-label-name=""
                             Label name through which the request will
                             be split into multiple tenants. This takes
                             precedence over the HTTP header.
      --receive.tenant-label-name="tenant_id"
                             Label name through which the tenant will be
                             announced.
      --receive.replica-header="THANOS-REPLICA"
                             HTTP header specifying the replica number of a
                             write request.
      --receive.forward.async-workers=5
                             Number of concurrent workers processing
                             forwarding of remote-write requests.
      --receive.grpc-compression=snappy
                             Compression algorithm to use for gRPC requests
                             to other receivers. Must be one of: snappy,
                             none
      --receive.replication-factor=1
                             How many times to replicate incoming write
                             requests.
@@ -482,95 +573,59 @@ Flags:
                             The protocol to use for replicating
                             remote-write requests. One of protobuf,
                             capnproto
      --receive.capnproto-address="0.0.0.0:19391"
                             Address for the Cap'n Proto server.
      --receive.grpc-service-config=<content>
                             gRPC service configuration file
                             or content in JSON format. See
                             https://github.com/grpc/grpc/blob/master/doc/service_config.md
      --receive.relabel-config-file=<file-path>
                             Path to YAML file that contains relabeling
                             configuration.
      --receive.relabel-config=<content>
                             Alternative to 'receive.relabel-config-file'
                             flag (mutually exclusive). Content of YAML file
                             that contains relabeling configuration.
      --tsdb.too-far-in-future.time-window=0s
                             Configures the allowed time window for
                             ingesting samples too far in the future.
                             Disabled (0s) by default. Please note enabling
                             this flag will reject samples in the future of
                             receive local NTP time + configured duration
                             due to clock skew in remote write clients.
      --tsdb.out-of-order.time-window=0s
                             [EXPERIMENTAL] Configures the allowed
                             time window for ingestion of out-of-order
                             samples. Disabled (0s) by default. Please
                             note if you enable this option and you
                             use compactor, make sure you have the
                             --compact.enable-vertical-compaction flag
                             enabled, otherwise you might risk compactor
                             halt.
      --tsdb.out-of-order.cap-max=0
                             [EXPERIMENTAL] Configures the maximum capacity
                             for out-of-order chunks (in samples). If set to
                             <=0, default value 32 is assumed.
      --[no-]tsdb.allow-overlapping-blocks
                             Allow overlapping blocks, which in turn enables
                             vertical compaction and vertical query merge.
                             Does not do anything, enabled all the time.
      --tsdb.max-retention-bytes=0
                             Maximum number of bytes that can be stored for
                             blocks. A unit is required, supported units: B,
                             KB, MB, GB, TB, PB, EB. Ex: "512MB". Based on
                             powers-of-2, so 1KB is 1024B.
      --[no-]tsdb.wal-compression
                             Compress the tsdb WAL.
      --[no-]tsdb.no-lockfile
                             Do not create lockfile in TSDB data directory.
                             In any case, the lockfiles will be deleted on
                             next startup.
      --tsdb.head.expanded-postings-cache-size=0
                             [EXPERIMENTAL] If non-zero, enables expanded
                             postings cache for the head block.
      --tsdb.block.expanded-postings-cache-size=0
                             [EXPERIMENTAL] If non-zero, enables expanded
                             postings cache for compacted blocks.
      --tsdb.max-exemplars=0 Enables support for ingesting exemplars and
                             sets the maximum number of exemplars that will
                             be stored per tenant. In case the exemplar
@@ -579,43 +634,41 @@ Flags:
                             ingesting a new exemplar will evict the oldest
                             exemplar from storage. 0 (or less) value of
                             this flag disables exemplars storage.
      --[no-]tsdb.enable-native-histograms
                             [EXPERIMENTAL] Enables the ingestion of native
                             histograms.
      --hash-func=           Specify which hash function to use when
                             calculating the hashes of produced files.
                             If no function has been specified, it does not
                             happen. This permits avoiding downloading some
                             files twice albeit at some performance cost.
                             Possible values are: "", "SHA256".
      --matcher-cache-size=0 Max number of cached matchers items. Using 0
                             disables caching.
      --request.logging-config-file=<file-path>
                             Path to YAML file with request logging
                             configuration. See format details:
                             https://thanos.io/tip/thanos/logging.md/#configuration
      --request.logging-config=<content>
                             Alternative to 'request.logging-config-file'
                             flag (mutually exclusive). Content
                             of YAML file with request logging
                             configuration. See format details:
                             https://thanos.io/tip/thanos/logging.md/#configuration
      --[no-]receive.otlp-enable-target-info
                             Enables target information in OTLP metrics
                             ingested by Receive. If enabled, it converts
                             the resource to the target info metric
      --receive.otlp-promote-resource-attributes= ...
                             (Repeatable) Resource attributes to include in
                             OTLP metrics ingested by Receive.
      --enable-feature= ...  Comma separated experimental feature names
                             to enable. The current list of features is
                             metric-names-filter.
      --receive.lazy-retrieval-max-buffered-responses=20
                             The lazy retrieval strategy can buffer up to
                             this number of responses. This is to limit the
                             memory usage. This flag takes effect only when
                             the lazy retrieval strategy is enabled.
```
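For reference, the `--receive.hashrings-file` flag above expects the same JSON hashring format shown in the Ketama section. A minimal single-hashring sketch (hypothetical addresses, shuffle sharding omitted):

```json
[
  {
    "hashring": "default",
    "algorithm": "ketama",
    "endpoints": [
      {"address": "receive-1:10901"},
      {"address": "receive-2:10901"},
      {"address": "receive-3:10901"}
    ]
  }
]
```

With `--receive.hashrings-file`, Receive watches this file and reloads the hashring dynamically when it changes.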


@@ -269,106 +269,29 @@ usage: thanos rule [<flags>]
Ruler evaluating Prometheus rules against given Query nodes, exposing Store API
and storing old blocks in bucket.
Flags:
  -h, --[no-]help            Show context-sensitive help (also try
                             --help-long and --help-man).
      --[no-]version         Show application version.
      --log.level=info       Log filtering level.
      --log.format=logfmt    Log format to use. Possible options: logfmt or
                             json.
      --tracing.config-file=<file-path>
                             Path to YAML file with tracing
                             configuration. See format details:
                             https://thanos.io/tip/thanos/tracing.md/#configuration
      --tracing.config=<content>
                             Alternative to 'tracing.config-file' flag
                             (mutually exclusive). Content of YAML file
                             with tracing configuration. See format details:
                             https://thanos.io/tip/thanos/tracing.md/#configuration
      --[no-]enable-auto-gomemlimit
                             Enable go runtime to automatically limit memory
                             consumption.
      --auto-gomemlimit.ratio=0.9
                             The ratio of reserved GOMEMLIMIT memory to the
                             detected maximum container or system memory.
      --http-address="0.0.0.0:10902"
                             Listen host:port for HTTP endpoints.
      --http-grace-period=2m Time to wait after an interrupt received for
@@ -376,145 +299,33 @@ Flags:
      --http.config=""       [EXPERIMENTAL] Path to the configuration file
                             that can enable TLS or authentication for all
                             HTTP endpoints.
      --grpc-address="0.0.0.0:10901"
                             Listen ip:port address for gRPC endpoints
                             (StoreAPI). Make sure this address is routable
                             from other components.
      --grpc-server-tls-cert=""
                             TLS Certificate for gRPC server, leave blank to
                             disable TLS
      --grpc-server-tls-key=""
                             TLS Key for the gRPC server, leave blank to
                             disable TLS
      --grpc-server-tls-client-ca=""
                             TLS CA to verify clients against. If no
                             client CA is specified, there is no client
                             verification on server side. (tls.NoClientCert)
      --grpc-server-tls-min-version="1.3"
                             TLS supported minimum version for gRPC server.
                             If no version is specified, it'll default to
                             1.3. Allowed values: ["1.0", "1.1", "1.2",
                             "1.3"]
      --grpc-server-max-connection-age=60m
                             The grpc server max connection age. This
                             controls how often to re-establish connections
                             and redo TLS handshakes.
      --grpc-grace-period=2m Time to wait after an interrupt received for
                             GRPC Server.
      --web.route-prefix=""  Prefix for API and UI endpoints. This allows
                             thanos UI to be served on a sub-path. This
                             option is analogous to --web.route-prefix of
                             Prometheus.
How many rules can be evaluated concurrently.
Default is 1.
--rule-file=rules/ ... Rule files that should be used by rule
manager. Can be in glob format (repeated).
Note that rules are not automatically detected,
use SIGHUP or do HTTP POST /-/reload to re-read
them.
--rule-query-offset=0s The default rule group query_offset duration to
use.
--shipper.meta-file-name="thanos.shipper.json"
the file to store shipper metadata in
--shipper.upload-compacted
If true shipper will try to upload compacted
blocks as well. Useful for migration purposes.
Works only if compaction is disabled on
Prometheus. Do it once and then disable the
flag when done.
--store.limits.request-samples=0
The maximum samples allowed for a single
Series request, The Series call fails if
this limit is exceeded. 0 means no limit.
NOTE: For efficiency the limit is internally
implemented as 'chunks limit' considering each
chunk contains a maximum of 120 samples.
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tsdb.block-duration=2h Block duration for TSDB block.
--tsdb.no-lockfile Do not create lockfile in TSDB data directory.
In any case, the lockfiles will be deleted on
next startup.
--tsdb.retention=48h Block retention time on local disk.
--tsdb.wal-compression Compress the tsdb WAL.
--version Show application version.
--web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to
be allowed by all.
--web.external-prefix="" Static prefix for all HTML links and redirect --web.external-prefix="" Static prefix for all HTML links and redirect
URLs in the bucket web UI interface. URLs in the bucket web UI interface.
Actual endpoints are still served on / or the Actual endpoints are still served on / or the
@ -534,10 +345,209 @@ Flags:
stripped prefix value in X-Forwarded-Prefix stripped prefix value in X-Forwarded-Prefix
header. This allows thanos UI to be served on a header. This allows thanos UI to be served on a
sub-path. sub-path.
--web.route-prefix="" Prefix for API and UI endpoints. This allows --[no-]web.disable-cors Whether to disable CORS headers to be set by
thanos UI to be served on a sub-path. This Thanos. By default Thanos sets CORS headers to
option is analogous to --web.route-prefix of be allowed by all.
Prometheus. --[no-]shipper.upload-compacted
If true shipper will try to upload compacted
blocks as well. Useful for migration purposes.
Works only if compaction is disabled on
Prometheus. Do it once and then disable the
flag when done.
--hash-func= Specify which hash function to use when
calculating the hashes of produced files.
If no function has been specified, it does not
happen. This permits avoiding downloading some
files twice albeit at some performance cost.
Possible values are: "", "SHA256".
--shipper.meta-file-name="thanos.shipper.json"
the file to store shipper metadata in
--query=<query> ... Addresses of statically configured query
API servers (repeatable). The scheme may be
prefixed with 'dns+' or 'dnssrv+' to detect
query API servers through respective DNS
lookups.
--query.config-file=<file-path>
Path to YAML file that contains query API
servers configuration. See format details:
https://thanos.io/tip/components/rule.md/#configuration.
If defined, it takes precedence over the
'--query' and '--query.sd-files' flags.
--query.config=<content> Alternative to 'query.config-file' flag
(mutually exclusive). Content of YAML
file that contains query API servers
configuration. See format details:
https://thanos.io/tip/components/rule.md/#configuration.
If defined, it takes precedence over the
'--query' and '--query.sd-files' flags.
--query.sd-files=<path> ...
Path to file that contains addresses of query
API servers. The path can be a glob pattern
(repeatable).
--query.sd-interval=5m Refresh interval to re-read file SD files.
(used as a fallback)
--query.sd-dns-interval=30s
Interval between DNS resolutions.
--query.http-method=POST HTTP method to use when sending queries.
Possible options: [GET, POST]
--query.default-step=1s Default range query step to use. This is
only used in stateless Ruler and alert state
restoration.
--alertmanagers.config-file=<file-path>
Path to YAML file that contains alerting
configuration. See format details:
https://thanos.io/tip/components/rule.md/#configuration.
If defined, it takes precedence
over the '--alertmanagers.url' and
'--alertmanagers.send-timeout' flags.
--alertmanagers.config=<content>
Alternative to 'alertmanagers.config-file'
flag (mutually exclusive). Content
of YAML file that contains alerting
configuration. See format details:
https://thanos.io/tip/components/rule.md/#configuration.
If defined, it takes precedence
over the '--alertmanagers.url' and
'--alertmanagers.send-timeout' flags.
--alertmanagers.url=ALERTMANAGERS.URL ...
Alertmanager replica URLs to push firing
alerts. Ruler claims success if push to
at least one alertmanager from discovered
succeeds. The scheme should not be empty
e.g `http` might be used. The scheme may be
prefixed with 'dns+' or 'dnssrv+' to detect
Alertmanager IPs through respective DNS
lookups. The port defaults to 9093 or the
SRV record's value. The URL path is used as a
prefix for the regular Alertmanager API path.
--alertmanagers.send-timeout=10s
Timeout for sending alerts to Alertmanager
--alertmanagers.sd-dns-interval=30s
Interval between DNS resolutions of
Alertmanager hosts.
--alert.query-url=ALERT.QUERY-URL
The external Thanos Query URL that would be set
in all alerts 'Source' field
--alert.label-drop=ALERT.LABEL-DROP ...
Labels by name to drop before sending
to alertmanager. This allows alert to be
deduplicated on replica label (repeated).
Similar Prometheus alert relabelling
--alert.relabel-config-file=<file-path>
Path to YAML file that contains alert
relabelling configuration.
--alert.relabel-config=<content>
Alternative to 'alert.relabel-config-file' flag
(mutually exclusive). Content of YAML file that
contains alert relabelling configuration.
--alert.query-template="/graph?g0.expr={{.Expr}}&g0.tab=1"
Template to use in alerts source field.
Need only include {{.Expr}} parameter
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--store.limits.request-samples=0
The maximum samples allowed for a single
Series request, The Series call fails if
this limit is exceeded. 0 means no limit.
NOTE: For efficiency the limit is internally
implemented as 'chunks limit' considering each
chunk contains a maximum of 120 samples.
--label=<name>="<value>" ...
Labels to be applied to all generated metrics
(repeated). Similar to external labels for
Prometheus, used to identify ruler and its
blocks as unique source.
--tsdb.block-duration=2h Block duration for TSDB block.
--tsdb.retention=48h Block retention time on local disk.
--[no-]tsdb.no-lockfile Do not create lockfile in TSDB data directory.
In any case, the lockfiles will be deleted on
next startup.
--[no-]tsdb.wal-compression
Compress the tsdb WAL.
--data-dir="data/" data directory
--rule-file=rules/ ... Rule files that should be used by rule
manager. Can be in glob format (repeated).
Note that rules are not automatically detected,
use SIGHUP or do HTTP POST /-/reload to re-read
them.
--resend-delay=1m Minimum amount of time to wait before resending
an alert to Alertmanager.
--eval-interval=1m The default evaluation interval to use.
--rule-query-offset=0s The default rule group query_offset duration to
use.
--for-outage-tolerance=1h Max time to tolerate prometheus outage for
restoring "for" state of alert.
--for-grace-period=10m Minimum duration between alert and restored
"for" state. This is maintained only for alerts
with configured "for" time greater than grace
period.
--restore-ignored-label=RESTORE-IGNORED-LABEL ...
Label names to be ignored when restoring alerts
from the remote storage. This is only used in
stateless mode.
--rule-concurrent-evaluation=1
How many rules can be evaluated concurrently.
Default is 1.
--grpc-query-endpoint=<endpoint> ...
Addresses of Thanos gRPC query API servers
(repeatable). The scheme may be prefixed
with 'dns+' or 'dnssrv+' to detect Thanos API
servers through respective DNS lookups.
--[no-]query.enable-x-functions
Whether to enable extended rate functions
(xrate, xincrease and xdelta). Only has effect
when used with Thanos engine.
--enable-feature= ... Comma separated feature names to enable. Valid
options for now: promql-experimental-functions
(enables promql experimental functions for
ruler)
--[no-]tsdb.enable-native-histograms
[EXPERIMENTAL] Enables the ingestion of native
histograms.
--remote-write.config-file=<file-path>
Path to YAML config for the remote-write
configurations, that specify servers
where samples should be sent to (see
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write).
This automatically enables stateless mode
for ruler and no series will be stored in the
ruler's TSDB. If an empty config (or file) is
provided, the flag is ignored and ruler is run
with its own TSDB.
--remote-write.config=<content>
Alternative to 'remote-write.config-file'
flag (mutually exclusive). Content
of YAML config for the remote-write
configurations, that specify servers
where samples should be sent to (see
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write).
This automatically enables stateless mode
for ruler and no series will be stored in the
ruler's TSDB. If an empty config (or file) is
provided, the flag is ignored and ruler is run
with its own TSDB.
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config=<content>
Alternative to 'objstore.config-file'
flag (mutually exclusive). Content of
YAML file that contains object store
configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
```
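For context on the `--remote-write.config` flag listed above: it takes standard Prometheus remote-write YAML, and a non-empty value switches the ruler into stateless mode. A minimal sketch, assuming a hypothetical receive endpoint (the URL is a placeholder, not from this repository):

```yaml
# Placeholder endpoint; any non-empty config enables stateless ruler mode,
# so no series are stored in the ruler's own TSDB.
remote_write:
  - url: "http://thanos-receive.example.com:19291/api/v1/receive"
```

The same document can be passed inline via `--remote-write.config` or as a file via `--remote-write.config-file`; the two flags are mutually exclusive.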


@ -95,43 +95,29 @@ usage: thanos sidecar [<flags>]
Sidecar for Prometheus server.
Flags:
-h, --[no-]help Show context-sensitive help (also try
--help-long and --help-man).
--[no-]version Show application version.
--log.level=info Log filtering level.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--[no-]enable-auto-gomemlimit
Enable go runtime to automatically limit memory
consumption.
--auto-gomemlimit.ratio=0.9
The ratio of reserved GOMEMLIMIT memory to the
detected maximum container or system memory.
--enable-auto-gomemlimit Enable go runtime to automatically limit memory
consumption.
--grpc-address="0.0.0.0:10901"
Listen ip:port address for gRPC endpoints
(StoreAPI). Make sure this address is routable
from other components.
--grpc-grace-period=2m Time to wait after an interrupt received for
GRPC Server.
--grpc-server-max-connection-age=60m
The grpc server max connection age. This
controls how often to re-establish connections
and redo TLS handshakes.
--grpc-server-tls-cert="" TLS Certificate for gRPC server, leave blank to
disable TLS
--grpc-server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--grpc-server-tls-key="" TLS Key for the gRPC server, leave blank to
disable TLS
--grpc-server-tls-min-version="1.3"
TLS supported minimum version for gRPC server.
If no version is specified, it'll default to
1.3. Allowed values: ["1.0", "1.1", "1.2",
"1.3"]
--hash-func= Specify which hash function to use when
calculating the hashes of produced files.
If no function has been specified, it does not
happen. This permits avoiding downloading some
files twice albeit at some performance cost.
Possible values are: "", "SHA256".
-h, --help Show context-sensitive help (also try
--help-long and --help-man).
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for --http-grace-period=2m Time to wait after an interrupt received for
@ -139,81 +125,105 @@ Flags:
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--log.level=info Log filtering level.
--min-time=0000-01-01T00:00:00Z
Start of time range limit to serve. Thanos
sidecar will serve only metrics, which happened
later than this value. Option can be a constant
time in RFC3339 format or time duration
relative to current time, such as -1d or 2h45m.
Valid duration units are ms, s, m, h, d, w, y.
--objstore.config=<content>
Alternative to 'objstore.config-file'
flag (mutually exclusive). Content of
YAML file that contains object store
configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--grpc-address="0.0.0.0:10901"
Listen ip:port address for gRPC endpoints
(StoreAPI). Make sure this address is routable
from other components.
--grpc-server-tls-cert="" TLS Certificate for gRPC server, leave blank to
disable TLS
--grpc-server-tls-key="" TLS Key for the gRPC server, leave blank to
disable TLS
--grpc-server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--grpc-server-tls-min-version="1.3"
TLS supported minimum version for gRPC server.
If no version is specified, it'll default to
1.3. Allowed values: ["1.0", "1.1", "1.2",
"1.3"]
--grpc-server-max-connection-age=60m
The grpc server max connection age. This
controls how often to re-establish connections
and redo TLS handshakes.
--grpc-grace-period=2m Time to wait after an interrupt received for
GRPC Server.
--prometheus.url=http://localhost:9090
URL at which to reach Prometheus's API.
For better performance use local network.
--prometheus.ready_timeout=10m
Maximum time to wait for the Prometheus
instance to start up
--prometheus.get_config_interval=30s
How often to get Prometheus config
--prometheus.get_config_timeout=5s
Timeout for getting Prometheus config
--prometheus.get_config_timeout=30s
Timeout for getting Prometheus config
--prometheus.http-client-file=<file-path>
Path to YAML file or string with http
client configs. See Format details:
https://thanos.io/tip/components/sidecar.md/#configuration.
--prometheus.http-client=<content>
Alternative to 'prometheus.http-client-file'
flag (mutually exclusive). Content
of YAML file or string with http
client configs. See Format details:
https://thanos.io/tip/components/sidecar.md/#configuration.
--prometheus.http-client-file=<file-path>
Path to YAML file or string with http
client configs. See Format details:
https://thanos.io/tip/components/sidecar.md/#configuration.
--prometheus.ready_timeout=10m
Maximum time to wait for the Prometheus
instance to start up
--prometheus.url=http://localhost:9090
URL at which to reach Prometheus's API.
For better performance use local network.
--tsdb.path="./data" Data directory of TSDB.
--reloader.config-file="" Config file watched by the reloader.
--reloader.config-envsubst-file=""
Output file for environment variable
substituted config file.
--reloader.config-file="" Config file watched by the reloader.
--reloader.method=http Method used to reload the configuration.
--reloader.process-name="prometheus"
Executable name used to match the process being
reloaded when using the signal method.
--reloader.retry-interval=5s
Controls how often reloader retries config
reload in case of error.
--reloader.rule-dir=RELOADER.RULE-DIR ...
Rule directories for the reloader to refresh
(repeated field).
--reloader.watch-interval=3m
Controls how often reloader re-reads config and
rules.
--reloader.retry-interval=5s
Controls how often reloader retries config
reload in case of error.
--reloader.method=http Method used to reload the configuration.
--reloader.process-name="prometheus"
Executable name used to match the process being
reloaded when using the signal method.
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--shipper.meta-file-name="thanos.shipper.json"
the file to store shipper metadata in
--shipper.upload-compacted
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config=<content>
Alternative to 'objstore.config-file'
flag (mutually exclusive). Content of
YAML file that contains object store
configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--[no-]shipper.upload-compacted
If true shipper will try to upload compacted
blocks as well. Useful for migration purposes.
Works only if compaction is disabled on
Prometheus. Do it once and then disable the
flag when done.
--hash-func= Specify which hash function to use when
calculating the hashes of produced files.
If no function has been specified, it does not
happen. This permits avoiding downloading some
files twice albeit at some performance cost.
Possible values are: "", "SHA256".
--shipper.meta-file-name="thanos.shipper.json"
the file to store shipper metadata in
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--store.limits.request-samples=0
The maximum samples allowed for a single
Series request, The Series call fails if
@ -221,21 +231,13 @@ Flags:
NOTE: For efficiency the limit is internally
implemented as 'chunks limit' considering each
chunk contains a maximum of 120 samples.
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tsdb.path="./data" Data directory of TSDB.
--version Show application version.
--min-time=0000-01-01T00:00:00Z
Start of time range limit to serve. Thanos
sidecar will serve only metrics, which happened
later than this value. Option can be a constant
time in RFC3339 format or time duration
relative to current time, such as -1d or 2h45m.
Valid duration units are ms, s, m, h, d, w, y.
```
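The `--objstore.config` / `--objstore.config-file` flags in the listing above take the object storage YAML documented at https://thanos.io/tip/thanos/storage.md/#configuration. A minimal S3 sketch; the bucket and endpoint values are placeholders:

```yaml
# Placeholder bucket and endpoint; other providers (GCS, Azure, Swift,
# Tencent COS, Aliyun OSS) use the same top-level type/config shape.
type: S3
config:
  bucket: "example-bucket"
  endpoint: "s3.example.com"
```

The same document works inline (`--objstore.config`) or from a file (`--objstore.config-file`); the two flags are mutually exclusive.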


@ -48,10 +48,124 @@ usage: thanos store [<flags>]
Store node giving access to blocks in a bucket provider. Now supported GCS, S3,
Azure, Swift, Tencent COS and Aliyun OSS.
Flags:
-h, --[no-]help Show context-sensitive help (also try
--help-long and --help-man).
--[no-]version Show application version.
--log.level=info Log filtering level.
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--tracing.config=<content>
Alternative to 'tracing.config-file' flag
(mutually exclusive). Content of YAML file
with tracing configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--[no-]enable-auto-gomemlimit
Enable go runtime to automatically limit memory
consumption.
--auto-gomemlimit.ratio=0.9
The ratio of reserved GOMEMLIMIT memory to the
detected maximum container or system memory.
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for
HTTP Server.
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--grpc-address="0.0.0.0:10901"
Listen ip:port address for gRPC endpoints
(StoreAPI). Make sure this address is routable
from other components.
--grpc-server-tls-cert="" TLS Certificate for gRPC server, leave blank to
disable TLS
--grpc-server-tls-key="" TLS Key for the gRPC server, leave blank to
disable TLS
--grpc-server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--grpc-server-tls-min-version="1.3"
TLS supported minimum version for gRPC server.
If no version is specified, it'll default to
1.3. Allowed values: ["1.0", "1.1", "1.2",
"1.3"]
--grpc-server-max-connection-age=60m
The grpc server max connection age. This
controls how often to re-establish connections
and redo TLS handshakes.
--grpc-grace-period=2m Time to wait after an interrupt received for
GRPC Server.
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--store.limits.request-samples=0
The maximum samples allowed for a single
Series request, The Series call fails if
this limit is exceeded. 0 means no limit.
NOTE: For efficiency the limit is internally
implemented as 'chunks limit' considering each
chunk contains a maximum of 120 samples.
--data-dir="./data" Local data directory used for caching
purposes (index-header, in-mem cache items and
meta.jsons). If removed, no data will be lost,
just store will have to rebuild the cache.
NOTE: Putting raw blocks here will not
cause the store to read them. For such use
cases use Prometheus + sidecar. Ignored if
--no-cache-index-header option is specified.
--[no-]cache-index-header Cache TSDB index-headers on disk to reduce
startup time. When set to true, Thanos Store
will download index headers from remote object
storage on startup and create a header file on
disk. Use --data-dir to set the directory in
which index headers will be downloaded.
--index-cache-size=250MB Maximum size of items held in the in-memory
index cache. Ignored if --index-cache.config or
--index-cache.config-file option is specified.
--index-cache.config-file=<file-path>
Path to YAML file that contains index
cache configuration. See format details:
https://thanos.io/tip/components/store.md/#index-cache
--index-cache.config=<content>
Alternative to 'index-cache.config-file'
flag (mutually exclusive). Content of
YAML file that contains index cache
configuration. See format details:
https://thanos.io/tip/components/store.md/#index-cache
--chunk-pool-size=2GB Maximum size of concurrently allocatable
bytes reserved strictly to reuse for chunks in
memory.
--store.grpc.touched-series-limit=0
DEPRECATED: use store.limits.request-series.
--store.grpc.series-sample-limit=0
DEPRECATED: use store.limits.request-samples.
--store.grpc.downloaded-bytes-limit=0
Maximum amount of downloaded (either
fetched or touched) bytes in a single
Series/LabelNames/LabelValues call. The Series
call fails if this limit is exceeded. 0 means
no limit.
--store.grpc.series-max-concurrency=20
Maximum number of concurrent Series calls.
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config=<content>
Alternative to 'objstore.config-file'
flag (mutually exclusive). Content of
YAML file that contains object store
configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--sync-block-duration=15m Repeat interval for syncing the blocks between
local and remote view.
--block-discovery-strategy="concurrent"
One of concurrent, recursive. When set to
concurrent, stores will concurrently issue
@ -61,71 +175,46 @@ Flags:
recursively traversing into each directory.
This avoids N+1 calls at the expense of having
slower bucket iterations.
--block-meta-fetch-concurrency=32
Number of goroutines to use when fetching block
metadata from object storage.
--block-sync-concurrency=20
Number of goroutines to use when constructing
index-cache.json blocks from object storage.
Must be equal or greater than 1.
--bucket-web-label=BUCKET-WEB-LABEL
External block label to use as group title in
the bucket web UI
--cache-index-header Cache TSDB index-headers on disk to reduce
startup time. When set to true, Thanos Store
will download index headers from remote object
storage on startup and create a header file on
disk. Use --data-dir to set the directory in
which index headers will be downloaded.
--chunk-pool-size=2GB Maximum size of concurrently allocatable
bytes reserved strictly to reuse for chunks in
memory.
--block-meta-fetch-concurrency=32
Number of goroutines to use when fetching block
metadata from object storage.
--min-time=0000-01-01T00:00:00Z
Start of time range limit to serve. Thanos
Store will serve only metrics, which happened
later than this value. Option can be a constant
time in RFC3339 format or time duration
relative to current time, such as -1d or 2h45m.
Valid duration units are ms, s, m, h, d, w, y.
--max-time=9999-12-31T23:59:59Z
End of time range limit to serve. Thanos Store
will serve only blocks, which happened earlier
than this value. Option can be a constant time
in RFC3339 format or time duration relative
to current time, such as -1d or 2h45m. Valid
duration units are ms, s, m, h, d, w, y.
--selector.relabel-config-file=<file-path>
Path to YAML file with relabeling
configuration that allows selecting blocks
to act on based on their external labels.
It follows thanos sharding relabel-config
syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--selector.relabel-config=<content>
Alternative to 'selector.relabel-config-file'
flag (mutually exclusive). Content of YAML
file with relabeling configuration that allows
selecting blocks to act on based on their
external labels. It follows thanos sharding
relabel-config syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--consistency-delay=0s Minimum age of all blocks before they are --consistency-delay=0s Minimum age of all blocks before they are
being read. Set it to safe value (e.g 30m) if being read. Set it to safe value (e.g 30m) if
your object storage is eventually consistent. your object storage is eventually consistent.
GCS and S3 are (roughly) strongly consistent. GCS and S3 are (roughly) strongly consistent.
--data-dir="./data" Local data directory used for caching
purposes (index-header, in-mem cache items and
meta.jsons). If removed, no data will be lost,
just store will have to rebuild the cache.
NOTE: Putting raw blocks here will not
cause the store to read them. For such use
cases use Prometheus + sidecar. Ignored if
--no-cache-index-header option is specified.
--enable-auto-gomemlimit Enable go runtime to automatically limit memory
consumption.
--grpc-address="0.0.0.0:10901"
Listen ip:port address for gRPC endpoints
(StoreAPI). Make sure this address is routable
from other components.
--grpc-grace-period=2m Time to wait after an interrupt received for
GRPC Server.
--grpc-server-max-connection-age=60m
The grpc server max connection age. This
controls how often to re-establish connections
and redo TLS handshakes.
--grpc-server-tls-cert="" TLS Certificate for gRPC server, leave blank to
disable TLS
--grpc-server-tls-client-ca=""
TLS CA to verify clients against. If no
client CA is specified, there is no client
verification on server side. (tls.NoClientCert)
--grpc-server-tls-key="" TLS Key for the gRPC server, leave blank to
disable TLS
--grpc-server-tls-min-version="1.3"
TLS supported minimum version for gRPC server.
If no version is specified, it'll default to
1.3. Allowed values: ["1.0", "1.1", "1.2",
"1.3"]
-h, --help Show context-sensitive help (also try
--help-long and --help-man).
--http-address="0.0.0.0:10902"
Listen host:port for HTTP endpoints.
--http-grace-period=2m Time to wait after an interrupt received for
HTTP Server.
--http.config="" [EXPERIMENTAL] Path to the configuration file
that can enable TLS or authentication for all
HTTP endpoints.
--ignore-deletion-marks-delay=24h --ignore-deletion-marks-delay=24h
Duration after which the blocks marked for Duration after which the blocks marked for
deletion will be filtered out while fetching deletion will be filtered out while fetching
@ -147,111 +236,15 @@ Flags:
blocks before being deleted from bucket. blocks before being deleted from bucket.
Default is 24h, half of the default value for Default is 24h, half of the default value for
--delete-delay on compactor. --delete-delay on compactor.
--index-cache-size=250MB Maximum size of items held in the in-memory --[no-]store.enable-index-header-lazy-reader
index cache. Ignored if --index-cache.config or
--index-cache.config-file option is specified.
--index-cache.config=<content>
Alternative to 'index-cache.config-file'
flag (mutually exclusive). Content of
YAML file that contains index cache
configuration. See format details:
https://thanos.io/tip/components/store.md/#index-cache
--index-cache.config-file=<file-path>
Path to YAML file that contains index
cache configuration. See format details:
https://thanos.io/tip/components/store.md/#index-cache
--log.format=logfmt Log format to use. Possible options: logfmt or
json.
--log.level=info Log filtering level.
--matcher-cache-size=0 Max number of cached matchers items. Using 0
disables caching.
--max-time=9999-12-31T23:59:59Z
End of time range limit to serve. Thanos Store
will serve only blocks, which happened earlier
than this value. Option can be a constant time
in RFC3339 format or time duration relative
to current time, such as -1d or 2h45m. Valid
duration units are ms, s, m, h, d, w, y.
--min-time=0000-01-01T00:00:00Z
Start of time range limit to serve. Thanos
Store will serve only metrics, which happened
later than this value. Option can be a constant
time in RFC3339 format or time duration
relative to current time, such as -1d or 2h45m.
Valid duration units are ms, s, m, h, d, w, y.
--objstore.config=<content>
Alternative to 'objstore.config-file'
flag (mutually exclusive). Content of
YAML file that contains object store
configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--objstore.config-file=<file-path>
Path to YAML file that contains object
store configuration. See format details:
https://thanos.io/tip/thanos/storage.md/#configuration
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--selector.relabel-config=<content>
Alternative to 'selector.relabel-config-file'
flag (mutually exclusive). Content of YAML
file with relabeling configuration that allows
selecting blocks to act on based on their
external labels. It follows thanos sharding
relabel-config syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--selector.relabel-config-file=<file-path>
Path to YAML file with relabeling
configuration that allows selecting blocks
to act on based on their external labels.
It follows thanos sharding relabel-config
syntax. For format details see:
https://thanos.io/tip/thanos/sharding.md/#relabelling
--store.enable-index-header-lazy-reader
If true, Store Gateway will lazy memory map If true, Store Gateway will lazy memory map
index-header only once the block is required by index-header only once the block is required by
a query. a query.
--store.enable-lazy-expanded-postings --[no-]store.enable-lazy-expanded-postings
If true, Store Gateway will estimate postings If true, Store Gateway will estimate postings
size and try to lazily expand postings if size and try to lazily expand postings if
it downloads less data than expanding all it downloads less data than expanding all
postings. postings.
--store.grpc.downloaded-bytes-limit=0
Maximum amount of downloaded (either
fetched or touched) bytes in a single
Series/LabelNames/LabelValues call. The Series
call fails if this limit is exceeded. 0 means
no limit.
--store.grpc.series-max-concurrency=20
Maximum number of concurrent Series calls.
--store.grpc.series-sample-limit=0
DEPRECATED: use store.limits.request-samples.
--store.grpc.touched-series-limit=0
DEPRECATED: use store.limits.request-series.
--store.index-header-lazy-download-strategy=eager
Strategy of how to download index headers
lazily. Supported values: eager, lazy.
If eager, always download index header during
initial load. If lazy, download index header
during query time.
--store.limits.request-samples=0
The maximum samples allowed for a single
Series request, The Series call fails if
this limit is exceeded. 0 means no limit.
NOTE: For efficiency the limit is internally
implemented as 'chunks limit' considering each
chunk contains a maximum of 120 samples.
--store.limits.request-series=0
The maximum series allowed for a single Series
request. The Series call fails if this limit is
exceeded. 0 means no limit.
--store.posting-group-max-key-series-ratio=100 --store.posting-group-max-key-series-ratio=100
Mark posting group as lazy if it fetches more Mark posting group as lazy if it fetches more
keys than R * max series the query should keys than R * max series the query should
@ -264,22 +257,13 @@ Flags:
accordingly. This config is only valid if lazy accordingly. This config is only valid if lazy
expanded posting is enabled. 0 disables the expanded posting is enabled. 0 disables the
limit. limit.
--sync-block-duration=15m Repeat interval for syncing the blocks between --store.index-header-lazy-download-strategy=eager
local and remote view. Strategy of how to download index headers
--tracing.config=<content> lazily. Supported values: eager, lazy.
Alternative to 'tracing.config-file' flag If eager, always download index header during
(mutually exclusive). Content of YAML file initial load. If lazy, download index header
with tracing configuration. See format details: during query time.
https://thanos.io/tip/thanos/tracing.md/#configuration --[no-]web.disable Disable Block Viewer UI.
--tracing.config-file=<file-path>
Path to YAML file with tracing
configuration. See format details:
https://thanos.io/tip/thanos/tracing.md/#configuration
--version Show application version.
--web.disable Disable Block Viewer UI.
--web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to
be allowed by all.
--web.external-prefix="" Static prefix for all HTML links and redirect --web.external-prefix="" Static prefix for all HTML links and redirect
URLs in the bucket web UI interface. URLs in the bucket web UI interface.
Actual endpoints are still served on / or the Actual endpoints are still served on / or the
@ -299,6 +283,27 @@ Flags:
stripped prefix value in X-Forwarded-Prefix stripped prefix value in X-Forwarded-Prefix
header. This allows thanos UI to be served on a header. This allows thanos UI to be served on a
sub-path. sub-path.
--[no-]web.disable-cors Whether to disable CORS headers to be set by
Thanos. By default Thanos sets CORS headers to
be allowed by all.
--bucket-web-label=BUCKET-WEB-LABEL
External block label to use as group title in
the bucket web UI
--matcher-cache-size=0 Max number of cached matchers items. Using 0
disables caching.
--[no-]disable-admin-operations
Disable UI/API admin operations like marking
blocks for deletion and no compaction.
--request.logging-config-file=<file-path>
Path to YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
--request.logging-config=<content>
Alternative to 'request.logging-config-file'
flag (mutually exclusive). Content
of YAML file with request logging
configuration. See format details:
https://thanos.io/tip/thanos/logging.md/#configuration
``` ```
File diff suppressed because it is too large
@@ -11,7 +11,7 @@ menu: proposals-accepted
 * https://github.com/thanos-io/thanos/pull/5250
 * https://github.com/thanos-io/thanos/pull/4917
 * https://github.com/thanos-io/thanos/pull/5350
-* https://github.com/thanos-community/promql-engine/issues/25
+* https://github.com/thanos-io/promql-engine/issues/25

 ## 2 Why
@@ -75,7 +75,7 @@ Keeping PromQL execution in Query components allows for deduplication between Pr
 <img src="../img/distributed-execution-proposal-1.png" alt="Distributed query execution" width="400"/>

-The initial version of the solution can be found here: https://github.com/thanos-community/promql-engine/pull/139
+The initial version of the solution can be found here: https://github.com/thanos-io/promql-engine/pull/139

 ### Query rewrite algorithm
@@ -92,7 +92,7 @@ Enforcing tenancy label in queries:
 #### Apply verification and enforcement logic in the Query Frontend instead of Querier.

-The Query Frontend is an optional component on any Thanos deployment, while the Querier is always present. Plus, there might be deployments with multiple Querier layers where one or more might need to apply tenant verification and enforcement. On top of this, doing it in the Querier supports future work on using the [new Thanos PromQL engine](https://github.com/thanos-community/promql-engine), which can potentially make the Query Frontend unnecessary.
+The Query Frontend is an optional component on any Thanos deployment, while the Querier is always present. Plus, there might be deployments with multiple Querier layers where one or more might need to apply tenant verification and enforcement. On top of this, doing it in the Querier supports future work on using the [new Thanos PromQL engine](https://github.com/thanos-io/promql-engine), which can potentially make the Query Frontend unnecessary.

 #### Add the tenant identification as an optional field in the Store API protobuffer spec instead of an HTTP header.
@@ -143,4 +143,4 @@ An alternative to this is to use the existing [hashmod](https://prometheus.io/do
 Once a Prometheus instance has been drained and no longer has targets to scrape we will wish to scale down and remove the instance. However, we will need to ensure that the data that is currently in the WAL block but not uploaded to object storage is flushed before we can remove the instance. Failing to do so will mean that any data in the WAL is lost when the Prometheus node is terminated. During this flush period until it is confirmed that the WAL has been uploaded we should still have the Prometheus instance serve requests for the data in the WAL.

-See [prometheus/tsdb - Issue 346](https://github.com/prometheus/tsdb/issues/346) for more information.
+See [prometheus/tsdb - Issue 346](https://github.com/prometheus-junkyard/tsdb/issues/346) for more information.
@@ -23,6 +23,9 @@ Release shepherd responsibilities:

 | Release | Time of first RC | Shepherd (GitHub handle) |
 |---------|------------------|-------------------------------|
+| v0.39.0 | 2025.05.29       | `@GiedriusS`                  |
+| v0.38.0 | 2025.03.25       | `@MichaHoffmann`              |
+| v0.37.0 | 2024.11.19       | `@saswatamcode`               |
 | v0.36.0 | 2024.06.26       | `@MichaHoffmann`              |
 | v0.35.0 | 2024.04.09       | `@saswatamcode`               |
 | v0.34.0 | 2024.01.14       | `@MichaHoffmann`              |
@@ -23,6 +23,7 @@ import (
 	"gopkg.in/yaml.v2"

 	"github.com/efficientgo/core/testutil"
+	"github.com/thanos-io/objstore"
 	"github.com/thanos-io/objstore/client"
 	"github.com/thanos-io/objstore/providers/s3"
 	tracingclient "github.com/thanos-io/thanos/pkg/tracing/client"
@@ -176,7 +177,7 @@ func TestReadOnlyThanosSetup(t *testing.T) {
	//                  │          │
	//                  └───────────┘
 	bkt1Config, err := yaml.Marshal(client.BucketConfig{
-		Type: client.S3,
+		Type: objstore.S3,
 		Config: s3.Config{
 			Bucket:    "bkt1",
 			AccessKey: e2edb.MinioAccessKey,
@@ -198,7 +199,7 @@ func TestReadOnlyThanosSetup(t *testing.T) {
 	)

 	bkt2Config, err := yaml.Marshal(client.BucketConfig{
-		Type: client.S3,
+		Type: objstore.S3,
 		Config: s3.Config{
 			Bucket:    "bkt2",
 			AccessKey: e2edb.MinioAccessKey,
go.mod
@ -1,285 +1,319 @@
module github.com/thanos-io/thanos module github.com/thanos-io/thanos
go 1.24 go 1.24.0
require ( require (
capnproto.org/go/capnp/v3 v3.0.0-alpha.30 capnproto.org/go/capnp/v3 v3.1.0-alpha.1
cloud.google.com/go/trace v1.10.12 cloud.google.com/go/trace v1.11.6
github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace v1.8.3 github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace v1.27.0
github.com/KimMachineGun/automemlimit v0.6.1 github.com/KimMachineGun/automemlimit v0.7.3
github.com/alecthomas/units v0.0.0-20240927000941-0f3dac36c52b github.com/alecthomas/units v0.0.0-20240927000941-0f3dac36c52b
github.com/alicebob/miniredis/v2 v2.22.0 github.com/alicebob/miniredis/v2 v2.35.0
github.com/blang/semver/v4 v4.0.0 github.com/blang/semver/v4 v4.0.0
github.com/bradfitz/gomemcache v0.0.0-20190913173617-a41fca850d0b github.com/bradfitz/gomemcache v0.0.0-20250403215159-8d39553ac7cf
github.com/caio/go-tdigest v3.1.0+incompatible github.com/caio/go-tdigest v3.1.0+incompatible
github.com/cespare/xxhash/v2 v2.3.0 github.com/cespare/xxhash/v2 v2.3.0
github.com/chromedp/cdproto v0.0.0-20230802225258-3cf4e6d46a89 github.com/chromedp/cdproto v0.0.0-20230802225258-3cf4e6d46a89
github.com/chromedp/chromedp v0.9.2 github.com/chromedp/chromedp v0.9.2
github.com/cortexproject/promqlsmith v0.0.0-20240506042652-6cfdd9739a5e github.com/cortexproject/promqlsmith v0.0.0-20250407233056-90db95b1a4e4
github.com/cristalhq/hedgedhttp v0.9.1 github.com/cristalhq/hedgedhttp v0.9.1
github.com/dustin/go-humanize v1.0.1 github.com/dustin/go-humanize v1.0.1
github.com/efficientgo/core v1.0.0-rc.3 github.com/efficientgo/core v1.0.0-rc.3
github.com/efficientgo/e2e v0.14.1-0.20230710114240-c316eb95ae5b github.com/efficientgo/e2e v0.14.1-0.20230710114240-c316eb95ae5b
github.com/efficientgo/tools/extkingpin v0.0.0-20220817170617-6c25e3b627dd github.com/efficientgo/tools/extkingpin v0.0.0-20230505153745-6b7392939a60
github.com/facette/natsort v0.0.0-20181210072756-2cd4dd1e2dcb github.com/facette/natsort v0.0.0-20181210072756-2cd4dd1e2dcb
github.com/fatih/structtag v1.2.0 github.com/fatih/structtag v1.2.0
github.com/felixge/fgprof v0.9.5 github.com/felixge/fgprof v0.9.5
github.com/fortytw2/leaktest v1.3.0 github.com/fortytw2/leaktest v1.3.0
github.com/fsnotify/fsnotify v1.8.0 github.com/fsnotify/fsnotify v1.9.0
github.com/go-kit/log v0.2.1 github.com/go-kit/log v0.2.1
github.com/go-openapi/strfmt v0.23.0 github.com/go-openapi/strfmt v0.23.0
github.com/gogo/protobuf v1.3.2 github.com/gogo/protobuf v1.3.2
github.com/gogo/status v1.1.1 github.com/gogo/status v1.1.1
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da github.com/golang/groupcache v0.0.0-20241129210726-2c02b8208cf8
github.com/golang/protobuf v1.5.4 github.com/golang/protobuf v1.5.4
github.com/golang/snappy v0.0.4 github.com/golang/snappy v1.0.0
github.com/google/go-cmp v0.7.0 github.com/google/go-cmp v0.7.0
github.com/google/uuid v1.6.0 github.com/google/uuid v1.6.0
github.com/googleapis/gax-go v2.0.2+incompatible github.com/googleapis/gax-go v2.0.2+incompatible
github.com/grpc-ecosystem/go-grpc-middleware/providers/prometheus v1.0.1 github.com/grpc-ecosystem/go-grpc-middleware/providers/prometheus v1.0.1
github.com/grpc-ecosystem/go-grpc-middleware/v2 v2.1.0 github.com/grpc-ecosystem/go-grpc-middleware/v2 v2.3.2
github.com/hashicorp/golang-lru/v2 v2.0.7 github.com/hashicorp/golang-lru/v2 v2.0.7
github.com/jpillora/backoff v1.0.0 github.com/jpillora/backoff v1.0.0
github.com/json-iterator/go v1.1.12 github.com/json-iterator/go v1.1.12
github.com/klauspost/compress v1.17.11 github.com/klauspost/compress v1.18.0
github.com/leanovate/gopter v0.2.9 github.com/leanovate/gopter v0.2.9
github.com/lightstep/lightstep-tracer-go v0.25.0 github.com/lightstep/lightstep-tracer-go v0.26.0
github.com/lovoo/gcloud-opentracing v0.3.0 github.com/lovoo/gcloud-opentracing v0.3.0
github.com/miekg/dns v1.1.62 github.com/miekg/dns v1.1.66
github.com/minio/sha256-simd v1.0.1 github.com/minio/sha256-simd v1.0.1
github.com/mitchellh/go-ps v1.0.0 github.com/mitchellh/go-ps v1.0.0
github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f github.com/mwitkow/go-conntrack v0.0.0-20190716064945-2f068394615f
github.com/oklog/run v1.1.0 github.com/oklog/run v1.2.0
github.com/oklog/ulid v1.3.1 github.com/oklog/ulid v1.3.1 // indirect
github.com/olekukonko/tablewriter v0.0.5 github.com/olekukonko/tablewriter v0.0.5
github.com/onsi/gomega v1.34.0 github.com/onsi/gomega v1.36.2
github.com/opentracing/basictracer-go v1.1.0 github.com/opentracing/basictracer-go v1.1.0
github.com/opentracing/opentracing-go v1.2.0 github.com/opentracing/opentracing-go v1.2.0
github.com/pkg/errors v0.9.1 github.com/pkg/errors v0.9.1
github.com/prometheus-community/prom-label-proxy v0.8.1-0.20240127162815-c1195f9aabc0 github.com/prometheus-community/prom-label-proxy v0.11.1
github.com/prometheus/alertmanager v0.27.0 github.com/prometheus/alertmanager v0.28.1
github.com/prometheus/client_golang v1.21.1 github.com/prometheus/client_golang v1.22.0
github.com/prometheus/client_model v0.6.1 github.com/prometheus/client_model v0.6.2
github.com/prometheus/common v0.62.0 github.com/prometheus/common v0.65.1-0.20250703115700-7f8b2a0d32d3
github.com/prometheus/exporter-toolkit v0.13.2 github.com/prometheus/exporter-toolkit v0.14.0
// Prometheus maps version 3.x.y to tags v0.30x.y. // Prometheus maps version 3.x.y to tags v0.30x.y.
github.com/prometheus/prometheus v0.301.0 github.com/prometheus/prometheus v0.304.3-0.20250708181613-d8c921804e87
github.com/redis/rueidis v1.0.45-alpha.1 github.com/redis/rueidis v1.0.61
github.com/seiflotfy/cuckoofilter v0.0.0-20240715131351-a2f2c23f1771 github.com/seiflotfy/cuckoofilter v0.0.0-20240715131351-a2f2c23f1771
github.com/sony/gobreaker v0.5.0 github.com/sony/gobreaker v1.0.0
github.com/stretchr/testify v1.10.0 github.com/stretchr/testify v1.10.0
github.com/thanos-io/objstore v0.0.0-20241111205755-d1dd89d41f97 github.com/thanos-io/objstore v0.0.0-20250722142242-922b22272ee3
github.com/thanos-io/promql-engine v0.0.0-20250302135832-accbf0891a16 github.com/thanos-io/promql-engine v0.0.0-20250711160436-eb186b2cf537
github.com/uber/jaeger-client-go v2.30.0+incompatible github.com/uber/jaeger-client-go v2.30.0+incompatible
github.com/vimeo/galaxycache v0.0.0-20210323154928-b7e5d71c067a github.com/vimeo/galaxycache v1.3.1
github.com/weaveworks/common v0.0.0-20230728070032-dd9e68f319d5 github.com/weaveworks/common v0.0.0-20230728070032-dd9e68f319d5
go.elastic.co/apm v1.15.0 go.elastic.co/apm v1.15.0
go.elastic.co/apm/module/apmot v1.15.0 go.elastic.co/apm/module/apmot v1.15.0
go.opentelemetry.io/contrib/propagators/autoprop v0.54.0 go.opentelemetry.io/contrib/propagators/autoprop v0.61.0
go.opentelemetry.io/contrib/samplers/jaegerremote v0.23.0 go.opentelemetry.io/contrib/samplers/jaegerremote v0.30.0
go.opentelemetry.io/otel v1.35.0 go.opentelemetry.io/otel v1.36.0
go.opentelemetry.io/otel/bridge/opentracing v1.31.0 go.opentelemetry.io/otel/bridge/opentracing v1.36.0
go.opentelemetry.io/otel/exporters/jaeger v1.16.0 go.opentelemetry.io/otel/exporters/jaeger v1.17.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.34.0 go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.36.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.33.0 go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.36.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.34.0 go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.36.0
go.opentelemetry.io/otel/sdk v1.35.0 go.opentelemetry.io/otel/sdk v1.36.0
go.opentelemetry.io/otel/trace v1.35.0 go.opentelemetry.io/otel/trace v1.36.0
go.uber.org/atomic v1.11.0 go.uber.org/atomic v1.11.0
go.uber.org/automaxprocs v1.6.0 go.uber.org/automaxprocs v1.6.0
go.uber.org/goleak v1.3.0 go.uber.org/goleak v1.3.0
go4.org/intern v0.0.0-20230525184215-6c62f75575cb go4.org/intern v0.0.0-20230525184215-6c62f75575cb
golang.org/x/crypto v0.32.0 golang.org/x/crypto v0.39.0
golang.org/x/net v0.34.0 golang.org/x/net v0.41.0
golang.org/x/sync v0.10.0 golang.org/x/sync v0.15.0
golang.org/x/text v0.21.0 golang.org/x/text v0.26.0
golang.org/x/time v0.8.0 golang.org/x/time v0.12.0
google.golang.org/grpc v1.69.4 google.golang.org/grpc v1.73.0
google.golang.org/grpc/examples v0.0.0-20211119005141-f45e61797429 google.golang.org/grpc/examples v0.0.0-20230224211313-3775f633ce20
google.golang.org/protobuf v1.36.3 google.golang.org/protobuf v1.36.6
gopkg.in/alecthomas/kingpin.v2 v2.2.6
gopkg.in/yaml.v2 v2.4.0 gopkg.in/yaml.v2 v2.4.0
gopkg.in/yaml.v3 v3.0.1 gopkg.in/yaml.v3 v3.0.1
) )
require ( require (
cloud.google.com/go v0.115.1 // indirect cloud.google.com/go v0.120.0 // indirect
cloud.google.com/go/auth v0.13.0 // indirect cloud.google.com/go/auth v0.16.2 // indirect
cloud.google.com/go/auth/oauth2adapt v0.2.6 // indirect cloud.google.com/go/auth/oauth2adapt v0.2.8 // indirect
cloud.google.com/go/compute/metadata v0.6.0 // indirect cloud.google.com/go/compute/metadata v0.7.0 // indirect
cloud.google.com/go/iam v1.1.13 // indirect cloud.google.com/go/iam v1.5.2 // indirect
cloud.google.com/go/storage v1.43.0 // indirect cloud.google.com/go/storage v1.50.0 // indirect
github.com/Azure/azure-sdk-for-go/sdk/azcore v1.16.0 // indirect github.com/Azure/azure-sdk-for-go/sdk/azcore v1.18.1 // indirect
github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.8.0 // indirect github.com/Azure/azure-sdk-for-go/sdk/azidentity v1.10.1 // indirect
github.com/Azure/azure-sdk-for-go/sdk/internal v1.10.0 // indirect github.com/Azure/azure-sdk-for-go/sdk/internal v1.11.1 // indirect
github.com/Azure/azure-sdk-for-go/sdk/storage/azblob v1.3.0 // indirect github.com/Azure/azure-sdk-for-go/sdk/storage/azblob v1.6.1 // indirect
github.com/AzureAD/microsoft-authentication-library-for-go v1.2.2 // indirect github.com/AzureAD/microsoft-authentication-library-for-go v1.4.2 // indirect
) )
require ( require (
github.com/tjhop/slog-gokit v0.1.3 github.com/alecthomas/kingpin/v2 v2.4.0
go.opentelemetry.io/collector/pdata v1.22.0 github.com/oklog/ulid/v2 v2.1.1
go.opentelemetry.io/collector/semconv v0.116.0 github.com/prometheus/otlptranslator v0.0.0-20250620074007-94f535e0c588
github.com/tjhop/slog-gokit v0.1.4
go.opentelemetry.io/collector/pdata v1.34.0
go.opentelemetry.io/collector/semconv v0.128.0
) )
require github.com/dgryski/go-metro v0.0.0-20200812162917-85c65e2d0165 // indirect require github.com/dgryski/go-metro v0.0.0-20250106013310-edb8663e5e33 // indirect
require ( require (
github.com/HdrHistogram/hdrhistogram-go v1.1.2 // indirect github.com/HdrHistogram/hdrhistogram-go v1.1.2 // indirect
github.com/bboreham/go-loser v0.0.0-20230920113527-fcc2c21820a3 // indirect github.com/bboreham/go-loser v0.0.0-20230920113527-fcc2c21820a3 // indirect
github.com/cilium/ebpf v0.11.0 // indirect github.com/elastic/go-licenser v0.4.2 // indirect
github.com/containerd/cgroups/v3 v3.0.3 // indirect
github.com/docker/go-units v0.5.0 // indirect
github.com/elastic/go-licenser v0.3.1 // indirect
github.com/go-ini/ini v1.67.0 // indirect github.com/go-ini/ini v1.67.0 // indirect
github.com/go-openapi/runtime v0.27.1 // indirect github.com/go-openapi/runtime v0.28.0 // indirect
github.com/goccy/go-json v0.10.3 // indirect github.com/goccy/go-json v0.10.5 // indirect
github.com/godbus/dbus/v5 v5.0.4 // indirect github.com/golang-jwt/jwt/v5 v5.2.2 // indirect
github.com/golang-jwt/jwt/v5 v5.2.1 // indirect github.com/google/s2a-go v0.1.9 // indirect
github.com/google/s2a-go v0.1.8 // indirect github.com/huaweicloud/huaweicloud-sdk-go-obs v3.25.4+incompatible // indirect
github.com/huaweicloud/huaweicloud-sdk-go-obs v3.23.3+incompatible // indirect github.com/jcchavezs/porto v0.7.0 // indirect
github.com/jcchavezs/porto v0.1.0 // indirect
github.com/leesper/go_rng v0.0.0-20190531154944-a612b043e353 // indirect github.com/leesper/go_rng v0.0.0-20190531154944-a612b043e353 // indirect
github.com/mdlayher/socket v0.4.1 // indirect github.com/mdlayher/socket v0.5.1 // indirect
github.com/mdlayher/vsock v1.2.1 // indirect github.com/mdlayher/vsock v1.2.1 // indirect
github.com/metalmatze/signal v0.0.0-20210307161603-1c9aa721a97a // indirect github.com/metalmatze/signal v0.0.0-20210307161603-1c9aa721a97a // indirect
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
github.com/onsi/ginkgo v1.16.5 // indirect github.com/onsi/ginkgo v1.16.5 // indirect
github.com/opencontainers/runtime-spec v1.0.2 // indirect
github.com/pbnjay/memory v0.0.0-20210728143218-7b4eea64cf58 // indirect github.com/pbnjay/memory v0.0.0-20210728143218-7b4eea64cf58 // indirect
github.com/sercand/kuberesolver/v4 v4.0.0 // indirect github.com/sercand/kuberesolver/v4 v4.0.0 // indirect
github.com/zhangyunhao116/umap v0.0.0-20221211160557-cb7705fafa39 // indirect go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.61.0 // indirect
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.54.0 // indirect go.opentelemetry.io/contrib/propagators/ot v1.36.0 // indirect
go.opentelemetry.io/contrib/propagators/ot v1.29.0 // indirect go4.org/unsafe/assume-no-moving-gc v0.0.0-20231121144256-b99613f794b6 // indirect
go4.org/unsafe/assume-no-moving-gc v0.0.0-20230525183740-e7c30c78aeb2 // indirect golang.org/x/lint v0.0.0-20241112194109-818c5a804067 // indirect
golang.org/x/lint v0.0.0-20210508222113-6edffad5e616 // indirect google.golang.org/genproto/googleapis/api v0.0.0-20250603155806-513f23925822 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20250115164207-1a7da9e5054f // indirect google.golang.org/genproto/googleapis/rpc v0.0.0-20250603155806-513f23925822 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20250115164207-1a7da9e5054f // indirect k8s.io/apimachinery v0.33.1 // indirect
k8s.io/apimachinery v0.31.3 // indirect k8s.io/client-go v0.33.1 // indirect
k8s.io/client-go v0.31.3 // indirect
k8s.io/klog/v2 v2.130.1 // indirect k8s.io/klog/v2 v2.130.1 // indirect
k8s.io/utils v0.0.0-20240711033017-18e509b52bc8 // indirect k8s.io/utils v0.0.0-20250604170112-4c0f3b243397 // indirect
zenhack.net/go/util v0.0.0-20230414204917-531d38494cf5 // indirect
) )
require ( require (
-	github.com/GoogleCloudPlatform/opentelemetry-operations-go/internal/resourcemapping v0.32.3 // indirect
-	github.com/alecthomas/template v0.0.0-20190718012654-fb15b899a751 // indirect
-	github.com/alicebob/gopher-json v0.0.0-20200520072559-a9ecdc9d1d3a // indirect
-	github.com/aliyun/aliyun-oss-go-sdk v2.2.2+incompatible // indirect
+	cel.dev/expr v0.23.1 // indirect
+	cloud.google.com/go/monitoring v1.24.2 // indirect
+	github.com/GoogleCloudPlatform/opentelemetry-operations-go/detectors/gcp v1.27.0 // indirect
+	github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/metric v0.50.0 // indirect
+	github.com/GoogleCloudPlatform/opentelemetry-operations-go/internal/resourcemapping v0.52.0 // indirect
+	github.com/aliyun/aliyun-oss-go-sdk v3.0.2+incompatible // indirect
 	github.com/armon/go-radix v1.0.0 // indirect
 	github.com/asaskevich/govalidator v0.0.0-20230301143203-a9d515a09cc2 // indirect
-	github.com/aws/aws-sdk-go v1.55.5 // indirect
-	github.com/aws/aws-sdk-go-v2 v1.16.0 // indirect
-	github.com/aws/aws-sdk-go-v2/config v1.15.1 // indirect
-	github.com/aws/aws-sdk-go-v2/credentials v1.11.0 // indirect
-	github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.12.1 // indirect
-	github.com/aws/aws-sdk-go-v2/internal/configsources v1.1.7 // indirect
-	github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.4.1 // indirect
-	github.com/aws/aws-sdk-go-v2/internal/ini v1.3.8 // indirect
-	github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.9.1 // indirect
-	github.com/aws/aws-sdk-go-v2/service/sso v1.11.1 // indirect
-	github.com/aws/aws-sdk-go-v2/service/sts v1.16.1 // indirect
-	github.com/aws/smithy-go v1.11.1 // indirect
-	github.com/baidubce/bce-sdk-go v0.9.111 // indirect
+	github.com/aws/aws-sdk-go-v2 v1.36.3 // indirect
+	github.com/aws/aws-sdk-go-v2/config v1.29.15 // indirect
+	github.com/aws/aws-sdk-go-v2/credentials v1.17.68 // indirect
+	github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.16.30 // indirect
+	github.com/aws/aws-sdk-go-v2/internal/configsources v1.3.34 // indirect
+	github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.6.34 // indirect
+	github.com/aws/aws-sdk-go-v2/internal/ini v1.8.3 // indirect
+	github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.12.3 // indirect
+	github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.12.15 // indirect
+	github.com/aws/aws-sdk-go-v2/service/sso v1.25.3 // indirect
+	github.com/aws/aws-sdk-go-v2/service/ssooidc v1.30.1 // indirect
+	github.com/aws/aws-sdk-go-v2/service/sts v1.33.20 // indirect
+	github.com/aws/smithy-go v1.22.3 // indirect
+	github.com/baidubce/bce-sdk-go v0.9.230 // indirect
 	github.com/beorn7/perks v1.0.1 // indirect
-	github.com/cenkalti/backoff/v4 v4.3.0 // indirect
+	github.com/cenkalti/backoff/v5 v5.0.2 // indirect
 	github.com/chromedp/sysutil v1.0.0 // indirect
 	github.com/clbanning/mxj v1.8.4 // indirect
+	github.com/cncf/xds/go v0.0.0-20250501225837-2ac532fd4443 // indirect
 	github.com/coreos/go-systemd/v22 v22.5.0 // indirect
 	github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
 	github.com/dennwc/varint v1.0.0 // indirect
 	github.com/edsrzf/mmap-go v1.2.0 // indirect
-	github.com/elastic/go-sysinfo v1.8.1 // indirect
+	github.com/elastic/go-sysinfo v1.15.3 // indirect
-	github.com/elastic/go-windows v1.0.1 // indirect
+	github.com/elastic/go-windows v1.0.2 // indirect
+	github.com/envoyproxy/go-control-plane/envoy v1.32.4 // indirect
+	github.com/envoyproxy/protoc-gen-validate v1.2.1 // indirect
+	github.com/fatih/color v1.18.0 // indirect
 	github.com/felixge/httpsnoop v1.0.4 // indirect
+	github.com/go-jose/go-jose/v4 v4.0.5 // indirect
 	github.com/go-logfmt/logfmt v0.6.0 // indirect
-	github.com/go-logr/logr v1.4.2 // indirect
+	github.com/go-logr/logr v1.4.3 // indirect
 	github.com/go-logr/stdr v1.2.2 // indirect
-	github.com/go-ole/go-ole v1.2.6 // indirect
-	github.com/go-openapi/analysis v0.22.2 // indirect
-	github.com/go-openapi/errors v0.22.0 // indirect
-	github.com/go-openapi/jsonpointer v0.21.0 // indirect
-	github.com/go-openapi/jsonreference v0.20.4 // indirect
-	github.com/go-openapi/loads v0.21.5 // indirect
-	github.com/go-openapi/spec v0.20.14 // indirect
-	github.com/go-openapi/swag v0.23.0 // indirect
-	github.com/go-openapi/validate v0.23.0 // indirect
+	github.com/go-openapi/analysis v0.23.0 // indirect
+	github.com/go-openapi/errors v0.22.1 // indirect
+	github.com/go-openapi/jsonpointer v0.21.1 // indirect
+	github.com/go-openapi/jsonreference v0.21.0 // indirect
+	github.com/go-openapi/loads v0.22.0 // indirect
+	github.com/go-openapi/spec v0.21.0 // indirect
+	github.com/go-openapi/swag v0.23.1 // indirect
+	github.com/go-openapi/validate v0.24.0 // indirect
+	github.com/go-viper/mapstructure/v2 v2.2.1 // indirect
+	github.com/gobwas/glob v0.2.3 // indirect
 	github.com/gobwas/httphead v0.1.0 // indirect
 	github.com/gobwas/pool v0.2.1 // indirect
 	github.com/gobwas/ws v1.2.1 // indirect
-	github.com/gofrs/flock v0.8.1 // indirect
+	github.com/gofrs/flock v0.12.1 // indirect
-	github.com/gogo/googleapis v1.4.0 // indirect
+	github.com/gogo/googleapis v1.4.1 // indirect
 	github.com/google/go-querystring v1.1.0 // indirect
-	github.com/google/pprof v0.0.0-20241210010833-40e02aabc2ad // indirect
+	github.com/google/pprof v0.0.0-20250607225305-033d6d78b36a // indirect
-	github.com/googleapis/enterprise-certificate-proxy v0.3.4 // indirect
+	github.com/googleapis/enterprise-certificate-proxy v0.3.6 // indirect
-	github.com/googleapis/gax-go/v2 v2.14.0 // indirect
+	github.com/googleapis/gax-go/v2 v2.14.2 // indirect
-	github.com/gorilla/mux v1.8.0 // indirect
+	github.com/gorilla/mux v1.8.1 // indirect
 	github.com/grafana/regexp v0.0.0-20240518133315-a468a5bfb3bc // indirect
-	github.com/grpc-ecosystem/grpc-gateway/v2 v2.25.1 // indirect
+	github.com/grpc-ecosystem/grpc-gateway/v2 v2.26.3 // indirect
-	github.com/jmespath/go-jmespath v0.4.0 // indirect
-	github.com/joeshaw/multierror v0.0.0-20140124173710-69b34d4ec901 // indirect
+	github.com/hashicorp/go-version v1.7.0 // indirect
+	github.com/jaegertracing/jaeger-idl v0.6.0 // indirect
 	github.com/josharian/intern v1.0.0 // indirect
 	github.com/julienschmidt/httprouter v1.3.0 // indirect
-	github.com/klauspost/cpuid/v2 v2.2.8 // indirect
+	github.com/klauspost/cpuid/v2 v2.2.10 // indirect
+	github.com/knadh/koanf/maps v0.1.2 // indirect
+	github.com/knadh/koanf/providers/confmap v1.0.0 // indirect
+	github.com/knadh/koanf/v2 v2.2.1 // indirect
 	github.com/kylelemons/godebug v1.1.0 // indirect
 	github.com/lightstep/lightstep-tracer-common/golang/gogo v0.0.0-20210210170715-a8dfcb80d3a7 // indirect
-	github.com/lufia/plan9stats v0.0.0-20211012122336-39d0f177ccd0 // indirect
-	github.com/mailru/easyjson v0.7.7 // indirect
-	github.com/mattn/go-runewidth v0.0.13 // indirect
+	github.com/mailru/easyjson v0.9.0 // indirect
+	github.com/mattn/go-colorable v0.1.14 // indirect
+	github.com/mattn/go-runewidth v0.0.16 // indirect
+	github.com/minio/crc64nvme v1.0.1 // indirect
 	github.com/minio/md5-simd v1.1.2 // indirect
-	github.com/minio/minio-go/v7 v7.0.80 // indirect
+	github.com/minio/minio-go/v7 v7.0.93 // indirect
+	github.com/mitchellh/copystructure v1.2.0 // indirect
 	github.com/mitchellh/mapstructure v1.5.0 // indirect
+	github.com/mitchellh/reflectwalk v1.0.2 // indirect
 	github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
 	github.com/modern-go/reflect2 v1.0.2 // indirect
-	github.com/mozillazg/go-httpheader v0.2.1 // indirect
+	github.com/mozillazg/go-httpheader v0.4.0 // indirect
 	github.com/ncw/swift v1.0.53 // indirect
-	github.com/opentracing-contrib/go-grpc v0.0.0-20210225150812-73cb765af46e // indirect
-	github.com/opentracing-contrib/go-stdlib v1.0.0 // indirect
-	github.com/oracle/oci-go-sdk/v65 v65.41.1 // indirect
+	github.com/open-telemetry/opentelemetry-collector-contrib/internal/exp/metrics v0.128.0 // indirect
+	github.com/open-telemetry/opentelemetry-collector-contrib/pkg/pdatautil v0.128.0 // indirect
+	github.com/open-telemetry/opentelemetry-collector-contrib/processor/deltatocumulativeprocessor v0.128.0 // indirect
+	github.com/opentracing-contrib/go-grpc v0.1.2 // indirect
+	github.com/opentracing-contrib/go-stdlib v1.1.0 // indirect
+	github.com/oracle/oci-go-sdk/v65 v65.93.1 // indirect
+	github.com/philhofer/fwd v1.1.3-0.20240916144458-20a13a1f6b7c // indirect
 	github.com/pkg/browser v0.0.0-20240102092130-5ac0b6a4141c // indirect
+	github.com/planetscale/vtprotobuf v0.6.1-0.20240319094008-0393e58bdf10 // indirect
 	github.com/pmezard/go-difflib v1.0.1-0.20181226105442-5d4384ee4fb2 // indirect
-	github.com/power-devops/perfstat v0.0.0-20210106213030-5aafc221ea8c // indirect
-	github.com/prometheus/procfs v0.15.1 // indirect
-	github.com/prometheus/sigv4 v0.1.0 // indirect
+	github.com/prometheus/procfs v0.16.1 // indirect
+	github.com/prometheus/sigv4 v0.2.0 // indirect
+	github.com/puzpuzpuz/xsync/v3 v3.5.1 // indirect
-	github.com/rivo/uniseg v0.2.0 // indirect
+	github.com/rivo/uniseg v0.4.7 // indirect
 	github.com/rs/xid v1.6.0 // indirect
 	github.com/santhosh-tekuri/jsonschema v1.2.4 // indirect
-	github.com/shirou/gopsutil/v3 v3.22.9 // indirect
 	github.com/sirupsen/logrus v1.9.3 // indirect
+	github.com/spiffe/go-spiffe/v2 v2.5.0 // indirect
 	github.com/stretchr/objx v0.5.2 // indirect
-	github.com/tencentyun/cos-go-sdk-v5 v0.7.40 // indirect
+	github.com/tencentyun/cos-go-sdk-v5 v0.7.66 // indirect
-	github.com/tklauser/go-sysconf v0.3.10 // indirect
-	github.com/tklauser/numcpus v0.4.0 // indirect
+	github.com/tinylib/msgp v1.3.0 // indirect
 	github.com/uber/jaeger-lib v2.4.1+incompatible // indirect
 	github.com/weaveworks/promrus v1.2.0 // indirect
-	github.com/yuin/gopher-lua v0.0.0-20210529063254-f4c35e4016d9 // indirect
-	github.com/yusufpapurcu/wmi v1.2.2 // indirect
+	github.com/xhit/go-str2duration/v2 v2.1.0 // indirect
+	github.com/youmark/pkcs8 v0.0.0-20240726163527-a2c0da244d78 // indirect
+	github.com/yuin/gopher-lua v1.1.1 // indirect
+	github.com/zeebo/errs v1.4.0 // indirect
 	go.elastic.co/apm/module/apmhttp v1.15.0 // indirect
-	go.elastic.co/fastjson v1.1.0 // indirect
+	go.elastic.co/fastjson v1.5.1 // indirect
-	go.mongodb.org/mongo-driver v1.14.0 // indirect
+	go.mongodb.org/mongo-driver v1.17.4 // indirect
 	go.opencensus.io v0.24.0 // indirect
 	go.opentelemetry.io/auto/sdk v1.1.0 // indirect
-	go.opentelemetry.io/contrib/instrumentation/net/http/httptrace/otelhttptrace v0.58.0 // indirect
-	go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.58.0 // indirect
-	go.opentelemetry.io/contrib/propagators/aws v1.29.0 // indirect
-	go.opentelemetry.io/contrib/propagators/b3 v1.29.0 // indirect
-	go.opentelemetry.io/contrib/propagators/jaeger v1.29.0 // indirect
-	go.opentelemetry.io/otel/metric v1.35.0 // indirect
-	go.opentelemetry.io/proto/otlp v1.5.0 // indirect
+	go.opentelemetry.io/collector/component v1.34.0 // indirect
+	go.opentelemetry.io/collector/confmap v1.34.0 // indirect
+	go.opentelemetry.io/collector/confmap/xconfmap v0.128.0 // indirect
+	go.opentelemetry.io/collector/consumer v1.34.0 // indirect
+	go.opentelemetry.io/collector/featuregate v1.34.0 // indirect
+	go.opentelemetry.io/collector/internal/telemetry v0.128.0 // indirect
+	go.opentelemetry.io/collector/pipeline v0.128.0 // indirect
+	go.opentelemetry.io/collector/processor v1.34.0 // indirect
+	go.opentelemetry.io/contrib/bridges/otelzap v0.11.0 // indirect
+	go.opentelemetry.io/contrib/detectors/gcp v1.35.0 // indirect
+	go.opentelemetry.io/contrib/instrumentation/net/http/httptrace/otelhttptrace v0.61.0 // indirect
+	go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.61.0 // indirect
+	go.opentelemetry.io/contrib/propagators/aws v1.36.0 // indirect
+	go.opentelemetry.io/contrib/propagators/b3 v1.36.0 // indirect
+	go.opentelemetry.io/contrib/propagators/jaeger v1.36.0 // indirect
+	go.opentelemetry.io/otel/log v0.12.2 // indirect
+	go.opentelemetry.io/otel/metric v1.36.0 // indirect
+	go.opentelemetry.io/otel/sdk/metric v1.36.0 // indirect
+	go.opentelemetry.io/proto/otlp v1.7.0 // indirect
 	go.uber.org/multierr v1.11.0 // indirect
-	golang.org/x/exp v0.0.0-20240613232115-7f521ea00fb8 // indirect
-	golang.org/x/mod v0.22.0 // indirect
-	golang.org/x/oauth2 v0.24.0 // indirect
-	golang.org/x/sys v0.30.0 // indirect
-	golang.org/x/tools v0.28.0 // indirect
-	gonum.org/v1/gonum v0.15.0 // indirect
-	google.golang.org/api v0.213.0 // indirect
-	google.golang.org/genproto v0.0.0-20240823204242-4ba0660f739c // indirect
-	howett.net/plist v0.0.0-20181124034731-591f970eefbb // indirect
+	go.uber.org/zap v1.27.0 // indirect
+	golang.org/x/exp v0.0.0-20250606033433-dcc06ee1d476 // indirect
+	golang.org/x/mod v0.25.0 // indirect
+	golang.org/x/oauth2 v0.30.0 // indirect
+	golang.org/x/sys v0.33.0 // indirect
+	golang.org/x/tools v0.34.0 // indirect
+	gonum.org/v1/gonum v0.16.0 // indirect
+	google.golang.org/api v0.238.0 // indirect
+	google.golang.org/genproto v0.0.0-20250505200425-f936aa4a68b2 // indirect
+	howett.net/plist v1.0.1 // indirect
+	sigs.k8s.io/yaml v1.4.0 // indirect
+	zenhack.net/go/util v0.0.0-20230414204917-531d38494cf5 // indirect
 )
 replace (
+	// Pinnning capnp due to https://github.com/thanos-io/thanos/issues/7944
+	capnproto.org/go/capnp/v3 => capnproto.org/go/capnp/v3 v3.0.0-alpha.30
 	// Using a 3rd-party branch for custom dialer - see https://github.com/bradfitz/gomemcache/pull/86.
 	// Required by Cortex https://github.com/cortexproject/cortex/pull/3051.
 	github.com/bradfitz/gomemcache => github.com/themihai/gomemcache v0.0.0-20180902122335-24332e2d58ab
@@ -289,9 +323,9 @@ replace (
 	github.com/vimeo/galaxycache => github.com/thanos-community/galaxycache v0.0.0-20211122094458-3a32041a1f1e
-	// Pinning grpc due https://github.com/grpc/grpc-go/issues/7314
-	google.golang.org/grpc => google.golang.org/grpc v1.63.2
 	// Overriding to use latest commit.
 	gopkg.in/alecthomas/kingpin.v2 => github.com/alecthomas/kingpin v1.3.8-0.20210301060133-17f40c25f497
+	// The domain `zenhack.net` expired.
+	zenhack.net/go/util => github.com/zenhack/go-util v0.0.0-20231005031245-66f5419c2aea
 )

go.sum: 2058 changed lines (diff suppressed because it is too large)


@@ -334,7 +334,12 @@ func (s resultsCache) isAtModifierCachable(r Request, maxCacheTime int64) bool {
 	}
 	// This resolves the start() and end() used with the @ modifier.
-	expr = promql.PreprocessExpr(expr, timestamp.Time(r.GetStart()), timestamp.Time(r.GetEnd()))
+	expr, err = promql.PreprocessExpr(expr, timestamp.Time(r.GetStart()), timestamp.Time(r.GetEnd()), time.Duration(r.GetStep())*time.Millisecond)
+	if err != nil {
+		// We are being pessimistic in such cases.
+		level.Warn(s.logger).Log("msg", "failed to preprocess expr", "query", query, "err", err)
+		return false
+	}
 	end := r.GetEnd()
 	atModCachable := true


@@ -9,7 +9,8 @@ import (
 	"time"
 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/opentracing/opentracing-go"
 	"github.com/pkg/errors"
 	"github.com/prometheus/client_golang/prometheus"


@@ -17,7 +17,8 @@ import (
 	"time"
 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/prometheus/common/route"
 	"github.com/prometheus/prometheus/model/labels"
 	"github.com/thanos-io/objstore"


@@ -74,7 +74,8 @@ type queryCreator interface {
 }
 type QueryFactory struct {
 	mode PromqlQueryMode
+	disableFallback bool
 	prometheus *promql.Engine
 	thanosLocal *engine.Engine
@@ -90,6 +91,7 @@ func NewQueryFactory(
 	enableXFunctions bool,
 	activeQueryTracker *promql.ActiveQueryTracker,
 	mode PromqlQueryMode,
+	disableFallback bool,
 ) *QueryFactory {
 	makeOpts := func(registry prometheus.Registerer) engine.Opts {
 		opts := engine.Opts{
@@ -134,6 +136,7 @@ func NewQueryFactory(
 		prometheus: promEngine,
 		thanosLocal: thanosLocal,
 		thanosDistributed: thanosDistributed,
+		disableFallback: disableFallback,
 	}
 }
@@ -159,7 +162,7 @@ func (f *QueryFactory) makeInstantQuery(
 		res, err = f.thanosLocal.MakeInstantQuery(ctx, q, opts, qry.query, ts)
 	}
 	if err != nil {
-		if engine.IsUnimplemented(err) {
+		if engine.IsUnimplemented(err) && !f.disableFallback {
 			// fallback to prometheus
 			return f.prometheus.NewInstantQuery(ctx, q, opts, qry.query, ts)
 		}


@@ -120,7 +120,7 @@ func (g *GRPCAPI) Query(request *querypb.QueryRequest, server querypb.Query_Quer
 	})
 	if result.Err != nil {
 		if request.EnablePartialResponse {
-			if err := server.Send(querypb.NewQueryWarningsResponse(err)); err != nil {
+			if err := server.Send(querypb.NewQueryWarningsResponse(result.Err)); err != nil {
 				return err
 			}
 			return nil
@@ -273,6 +273,9 @@ func extractQueryStats(qry promql.Query) *querypb.QueryStats {
 	}
 	if explQry, ok := qry.(engine.ExplainableQuery); ok {
 		analyze := explQry.Analyze()
+		if analyze == nil {
+			return stats
+		}
 		stats.SamplesTotal = analyze.TotalSamples()
 		stats.PeakSamples = analyze.PeakSamples()
 	}


@@ -12,6 +12,7 @@ import (
 	"github.com/efficientgo/core/testutil"
 	"github.com/go-kit/log"
 	"github.com/prometheus/client_golang/prometheus"
+	"github.com/prometheus/prometheus/model/labels"
 	"github.com/prometheus/prometheus/promql"
 	"github.com/prometheus/prometheus/storage"
 	"github.com/prometheus/prometheus/util/annotations"
@@ -31,7 +32,7 @@ import (
 func TestGRPCQueryAPIWithQueryPlan(t *testing.T) {
 	logger := log.NewNopLogger()
 	reg := prometheus.NewRegistry()
-	proxy := store.NewProxyStore(logger, reg, func() []store.Client { return nil }, component.Store, nil, 1*time.Minute, store.LazyRetrieval)
+	proxy := store.NewProxyStore(logger, reg, func() []store.Client { return nil }, component.Store, labels.EmptyLabels(), 1*time.Minute, store.LazyRetrieval)
 	queryableCreator := query.NewQueryableCreator(logger, reg, proxy, 1, 1*time.Minute, dedup.AlgorithmPenalty)
 	remoteEndpointsCreator := query.NewRemoteEndpointsCreator(logger, func() []query.Client { return nil }, nil, 1*time.Minute, true, true)
 	lookbackDeltaFunc := func(i int64) time.Duration { return 5 * time.Minute }
@@ -39,7 +40,7 @@ func TestGRPCQueryAPIWithQueryPlan(t *testing.T) {
 	expr, err := extpromql.ParseExpr("metric")
 	testutil.Ok(t, err)
-	lplan := logicalplan.NewFromAST(expr, &equery.Options{}, logicalplan.PlanOptions{})
+	lplan, err := logicalplan.NewFromAST(expr, &equery.Options{}, logicalplan.PlanOptions{})
 	testutil.Ok(t, err)
 	// Create a mock query plan.
 	planBytes, err := logicalplan.Marshal(lplan.Root())
@@ -75,7 +76,7 @@ func TestGRPCQueryAPIWithQueryPlan(t *testing.T) {
 func TestGRPCQueryAPIErrorHandling(t *testing.T) {
 	logger := log.NewNopLogger()
 	reg := prometheus.NewRegistry()
-	proxy := store.NewProxyStore(logger, reg, func() []store.Client { return nil }, component.Store, nil, 1*time.Minute, store.LazyRetrieval)
+	proxy := store.NewProxyStore(logger, reg, func() []store.Client { return nil }, component.Store, labels.EmptyLabels(), 1*time.Minute, store.LazyRetrieval)
 	queryableCreator := query.NewQueryableCreator(logger, reg, proxy, 1, 1*time.Minute, dedup.AlgorithmPenalty)
 	remoteEndpointsCreator := query.NewRemoteEndpointsCreator(logger, func() []query.Client { return nil }, nil, 1*time.Minute, true, true)
 	lookbackDeltaFunc := func(i int64) time.Duration { return 5 * time.Minute }


@@ -43,6 +43,7 @@ import (
 	"github.com/prometheus/prometheus/storage"
 	"github.com/prometheus/prometheus/util/annotations"
 	"github.com/prometheus/prometheus/util/stats"
+	v1 "github.com/prometheus/prometheus/web/api/v1"
 	"github.com/thanos-io/promql-engine/engine"
 	"github.com/thanos-io/thanos/pkg/api"
@@ -110,6 +111,7 @@ type QueryAPI struct {
 	replicaLabels []string
 	endpointStatus func() []query.EndpointStatus
+	tsdbSelector *store.TSDBSelector
 	defaultRangeQueryStep time.Duration
 	defaultInstantQueryMaxSourceResolution time.Duration
@@ -159,6 +161,7 @@ func NewQueryAPI(
 	tenantCertField string,
 	enforceTenancy bool,
 	tenantLabel string,
+	tsdbSelector *store.TSDBSelector,
 ) *QueryAPI {
 	if statsAggregatorFactory == nil {
 		statsAggregatorFactory = &store.NoopSeriesStatsAggregatorFactory{}
@@ -194,6 +197,7 @@ func NewQueryAPI(
 		tenantCertField: tenantCertField,
 		enforceTenancy: enforceTenancy,
 		tenantLabel: tenantLabel,
+		tsdbSelector: tsdbSelector,
 	queryRangeHist: promauto.With(reg).NewHistogram(prometheus.HistogramOpts{
 		Name: "thanos_query_range_requested_timespan_duration_seconds",
@@ -407,17 +411,29 @@ func (qapi *QueryAPI) getQueryExplain(query promql.Query) (*engine.ExplainOutput
 		return eq.Explain(), nil
 	}
 	return nil, &api.ApiError{Typ: api.ErrorBadData, Err: errors.Errorf("Query not explainable")}
 }
-func (qapi *QueryAPI) parseQueryAnalyzeParam(r *http.Request, query promql.Query) (queryTelemetry, error) {
-	if r.FormValue(QueryAnalyzeParam) == "true" || r.FormValue(QueryAnalyzeParam) == "1" {
-		if eq, ok := query.(engine.ExplainableQuery); ok {
-			return processAnalysis(eq.Analyze()), nil
-		}
-		return queryTelemetry{}, errors.Errorf("Query not analyzable; change engine to 'thanos'")
-	}
-	return queryTelemetry{}, nil
+func (qapi *QueryAPI) parseQueryAnalyzeParam(r *http.Request) bool {
+	return (r.FormValue(QueryAnalyzeParam) == "true" || r.FormValue(QueryAnalyzeParam) == "1")
+}
+
+func analyzeQueryOutput(query promql.Query, engineType PromqlEngineType) (queryTelemetry, error) {
+	if eq, ok := query.(engine.ExplainableQuery); ok {
+		if analyze := eq.Analyze(); analyze != nil {
+			return processAnalysis(analyze), nil
+		} else {
+			return queryTelemetry{}, errors.Errorf("Query: %v not analyzable", query)
+		}
+	}
+
+	var warning error
+	if engineType == PromqlEngineThanos {
+		warning = errors.New("Query fallback to prometheus engine; not analyzable.")
+	} else {
+		warning = errors.New("Query not analyzable; change engine to 'thanos'.")
+	}
+	return queryTelemetry{}, warning
 }
 func processAnalysis(a *engine.AnalyzeOutputNode) queryTelemetry {
@@ -530,7 +546,6 @@ func (qapi *QueryAPI) queryExplain(r *http.Request) (interface{}, []error, *api.
 		var qErr error
 		qry, qErr = qapi.queryCreate.makeInstantQuery(ctx, engineParam, queryable, remoteEndpoints, planOrQuery{query: queryStr}, queryOpts, ts)
 		return qErr
-
 	}); err != nil {
 		return nil, nil, &api.ApiError{Typ: api.ErrorBadData, Err: err}, func() {}
 	}
@@ -614,6 +629,7 @@ func (qapi *QueryAPI) query(r *http.Request) (interface{}, []error, *api.ApiErro
 	var (
 		qry promql.Query
 		seriesStats []storepb.SeriesStatsCounter
+		warnings []error
 	)
 	if err := tracing.DoInSpanWithErr(ctx, "instant_query_create", func(ctx context.Context) error {
@@ -638,16 +654,10 @@ func (qapi *QueryAPI) query(r *http.Request) (interface{}, []error, *api.ApiErro
 		var qErr error
 		qry, qErr = qapi.queryCreate.makeInstantQuery(ctx, engineParam, queryable, remoteEndpoints, planOrQuery{query: queryStr}, queryOpts, ts)
 		return qErr
 	}); err != nil {
 		return nil, nil, &api.ApiError{Typ: api.ErrorBadData, Err: err}, func() {}
 	}
-	analysis, err := qapi.parseQueryAnalyzeParam(r, qry)
-	if err != nil {
-		return nil, nil, apiErr, func() {}
-	}
 	if err := tracing.DoInSpanWithErr(ctx, "query_gate_ismyturn", qapi.gate.Start); err != nil {
 		return nil, nil, &api.ApiError{Typ: api.ErrorExec, Err: err}, qry.Close
 	}
@@ -669,6 +679,17 @@ func (qapi *QueryAPI) query(r *http.Request) (interface{}, []error, *api.ApiErro
 		}
 		return nil, nil, &api.ApiError{Typ: api.ErrorExec, Err: res.Err}, qry.Close
 	}
+	// this prevents a panic when annotations are concurrently accessed
+	safeWarnings := annotations.New().Merge(res.Warnings)
+	warnings = append(warnings, safeWarnings.AsErrors()...)
+
+	var analysis queryTelemetry
+	if qapi.parseQueryAnalyzeParam(r) {
+		analysis, err = analyzeQueryOutput(qry, engineParam)
+		if err != nil {
+			warnings = append(warnings, err)
+		}
+	}
 	aggregator := qapi.seriesStatsAggregatorFactory.NewAggregator(tenant)
 	for i := range seriesStats {
@@ -686,7 +707,7 @@ func (qapi *QueryAPI) query(r *http.Request) (interface{}, []error, *api.ApiErro
 		Result: res.Value,
 		Stats: qs,
 		QueryAnalysis: analysis,
-	}, res.Warnings.AsErrors(), nil, qry.Close
+	}, warnings, nil, qry.Close
 }
 func (qapi *QueryAPI) queryRangeExplain(r *http.Request) (interface{}, []error, *api.ApiError, func()) {
@@ -813,7 +834,6 @@ func (qapi *QueryAPI) queryRangeExplain(r *http.Request) (interface{}, []error,
 		var qErr error
 		qry, qErr = qapi.queryCreate.makeRangeQuery(ctx, engineParam, queryable, remoteEndpoints, planOrQuery{query: queryStr}, queryOpts, start, end, step)
 		return qErr
-
 	}); err != nil {
 		return nil, nil, &api.ApiError{Typ: api.ErrorBadData, Err: err}, func() {}
 	}
@@ -923,6 +943,7 @@ func (qapi *QueryAPI) queryRange(r *http.Request) (interface{}, []error, *api.Ap
 	var (
 		qry promql.Query
 		seriesStats []storepb.SeriesStatsCounter
+		warnings []error
 	)
 	if err := tracing.DoInSpanWithErr(ctx, "range_query_create", func(ctx context.Context) error {
 		queryable := qapi.queryableCreate(
@@ -946,16 +967,10 @@ func (qapi *QueryAPI) queryRange(r *http.Request) (interface{}, []error, *api.Ap
 		var qErr error
 		qry, qErr = qapi.queryCreate.makeRangeQuery(ctx, engineParam, queryable, remoteEndpoints, planOrQuery{query: queryStr}, queryOpts, start, end, step)
 		return qErr
 	}); err != nil {
 		return nil, nil, &api.ApiError{Typ: api.ErrorBadData, Err: err}, func() {}
 	}
-	analysis, err := qapi.parseQueryAnalyzeParam(r, qry)
-	if err != nil {
-		return nil, nil, apiErr, func() {}
-	}
 	if err := tracing.DoInSpanWithErr(ctx, "query_gate_ismyturn", qapi.gate.Start); err != nil {
 		return nil, nil, &api.ApiError{Typ: api.ErrorExec, Err: err}, qry.Close
 	}
@@ -964,7 +979,6 @@ func (qapi *QueryAPI) queryRange(r *http.Request) (interface{}, []error, *api.Ap
 	var res *promql.Result
 	tracing.DoInSpan(ctx, "range_query_exec", func(ctx context.Context) {
 		res = qry.Exec(ctx)
 	})
-
 	beforeRange := time.Now()
 	if res.Err != nil {
@@ -976,6 +990,18 @@ func (qapi *QueryAPI) queryRange(r *http.Request) (interface{}, []error, *api.Ap
 	}
 		return nil, nil, &api.ApiError{Typ: api.ErrorExec, Err: res.Err}, qry.Close
 	}
+	// this prevents a panic when annotations are concurrently accessed
+	safeWarnings := annotations.New().Merge(res.Warnings)
+	warnings = append(warnings, safeWarnings.AsErrors()...)
+
+	var analysis queryTelemetry
+	if qapi.parseQueryAnalyzeParam(r) {
+		analysis, err = analyzeQueryOutput(qry, engineParam)
+		if err != nil {
+			warnings = append(warnings, err)
+		}
+	}
 	aggregator := qapi.seriesStatsAggregatorFactory.NewAggregator(tenant)
 	for i := range seriesStats {
 		aggregator.Aggregate(seriesStats[i])
@@ -992,7 +1018,7 @@ func (qapi *QueryAPI) queryRange(r *http.Request) (interface{}, []error, *api.Ap
 		Result: res.Value,
 		Stats: qs,
 		QueryAnalysis: analysis,
-	}, res.Warnings.AsErrors(), nil, qry.Close
+	}, warnings, nil, qry.Close
 }
 func (qapi *QueryAPI) labelValues(r *http.Request) (interface{}, []error, *api.ApiError, func()) {
@ -1145,7 +1171,6 @@ func (qapi *QueryAPI) series(r *http.Request) (interface{}, []error, *api.ApiErr
nil, nil,
query.NoopSeriesStatsReporter, query.NoopSeriesStatsReporter,
).Querier(timestamp.FromTime(start), timestamp.FromTime(end)) ).Querier(timestamp.FromTime(start), timestamp.FromTime(end))
if err != nil { if err != nil {
return nil, nil, &api.ApiError{Typ: api.ErrorExec, Err: err}, func() {} return nil, nil, &api.ApiError{Typ: api.ErrorExec, Err: err}, func() {}
} }
@@ -1278,7 +1303,20 @@ func (qapi *QueryAPI) stores(_ *http.Request) (interface{}, []error, *api.ApiErr
 		if status.ComponentType == nil {
 			continue
 		}
-		statuses[status.ComponentType.String()] = append(statuses[status.ComponentType.String()], status)
+		// Apply TSDBSelector filtering to LabelSets if selector is configured
+		filteredStatus := status
+		if qapi.tsdbSelector != nil && len(status.LabelSets) > 0 {
+			matches, filteredLabelSets := qapi.tsdbSelector.MatchLabelSets(status.LabelSets...)
+			if !matches {
+				continue
+			}
+			if filteredLabelSets != nil {
+				filteredStatus.LabelSets = filteredLabelSets
+			}
+		}
+		statuses[status.ComponentType.String()] = append(statuses[status.ComponentType.String()], filteredStatus)
 	}
 	return statuses, nil, nil, func() {}
 }
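The `stores()` hunk above keeps an endpoint only if its advertised label sets pass the selector, and narrows the advertised sets to the matching ones. A minimal standalone sketch of that match-then-narrow pattern; `labelSet` and `matchLabelSets` are illustrative stand-ins, not the Thanos `TSDBSelector` API:

```go
package main

import "fmt"

// labelSet is a simplified stand-in for an endpoint's advertised label set.
type labelSet map[string]string

// matchLabelSets reports whether any set matches the predicate and returns
// only the matching sets, mirroring the "drop or narrow" behavior above.
func matchLabelSets(match func(labelSet) bool, sets ...labelSet) (bool, []labelSet) {
	var kept []labelSet
	for _, ls := range sets {
		if match(ls) {
			kept = append(kept, ls)
		}
	}
	return len(kept) > 0, kept
}

func main() {
	prodOnly := func(ls labelSet) bool { return ls["env"] == "prod" }
	ok, kept := matchLabelSets(prodOnly,
		labelSet{"env": "prod", "replica": "a"},
		labelSet{"env": "dev", "replica": "a"},
	)
	fmt.Println(ok, len(kept)) // true 1
}
```

An endpoint whose sets all fail the predicate returns `false` and is skipped entirely, which is what the `continue` in the diff does.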
@@ -1429,11 +1467,11 @@ func NewExemplarsHandler(client exemplars.UnaryClient, enablePartialResponse boo
 		err   error
 	)
-	start, err := parseTimeParam(r, "start", infMinTime)
+	start, err := parseTimeParam(r, "start", v1.MinTime)
 	if err != nil {
 		return nil, nil, &api.ApiError{Typ: api.ErrorBadData, Err: err}, func() {}
 	}
-	end, err := parseTimeParam(r, "end", infMaxTime)
+	end, err := parseTimeParam(r, "end", v1.MaxTime)
 	if err != nil {
 		return nil, nil, &api.ApiError{Typ: api.ErrorBadData, Err: err}, func() {}
 	}
@@ -1456,17 +1494,12 @@ func NewExemplarsHandler(client exemplars.UnaryClient, enablePartialResponse boo
 	}
 }

-var (
-	infMinTime = time.Unix(math.MinInt64/1000+62135596801, 0)
-	infMaxTime = time.Unix(math.MaxInt64/1000-62135596801, 999999999)
-)
-
 func parseMetadataTimeRange(r *http.Request, defaultMetadataTimeRange time.Duration) (time.Time, time.Time, error) {
 	// If start and end time not specified as query parameter, we get the range from the beginning of time by default.
 	var defaultStartTime, defaultEndTime time.Time
 	if defaultMetadataTimeRange == 0 {
-		defaultStartTime = infMinTime
-		defaultEndTime = infMaxTime
+		defaultStartTime = v1.MinTime
+		defaultEndTime = v1.MaxTime
 	} else {
 		now := time.Now()
 		defaultStartTime = now.Add(-defaultMetadataTimeRange)


@@ -99,6 +99,7 @@ var (
 		true,
 		nil,
 		PromqlQueryModeLocal,
+		false,
 	)

 	emptyRemoteEndpointsCreate = query.NewRemoteEndpointsCreator(
@@ -162,38 +163,13 @@ func testEndpoint(t *testing.T, test endpointTestCase, name string, responseComp
 func TestQueryEndpoints(t *testing.T) {
 	lbls := []labels.Labels{
-		{
-			labels.Label{Name: "__name__", Value: "test_metric1"},
-			labels.Label{Name: "foo", Value: "bar"},
-		},
-		{
-			labels.Label{Name: "__name__", Value: "test_metric1"},
-			labels.Label{Name: "foo", Value: "boo"},
-		},
-		{
-			labels.Label{Name: "__name__", Value: "test_metric2"},
-			labels.Label{Name: "foo", Value: "boo"},
-		},
-		{
-			labels.Label{Name: "__name__", Value: "test_metric_replica1"},
-			labels.Label{Name: "foo", Value: "bar"},
-			labels.Label{Name: "replica", Value: "a"},
-		},
-		{
-			labels.Label{Name: "__name__", Value: "test_metric_replica1"},
-			labels.Label{Name: "foo", Value: "boo"},
-			labels.Label{Name: "replica", Value: "a"},
-		},
-		{
-			labels.Label{Name: "__name__", Value: "test_metric_replica1"},
-			labels.Label{Name: "foo", Value: "boo"},
-			labels.Label{Name: "replica", Value: "b"},
-		},
-		{
-			labels.Label{Name: "__name__", Value: "test_metric_replica1"},
-			labels.Label{Name: "foo", Value: "boo"},
-			labels.Label{Name: "replica1", Value: "a"},
-		},
+		labels.FromStrings("__name__", "test_metric1", "foo", "bar"),
+		labels.FromStrings("__name__", "test_metric1", "foo", "boo"),
+		labels.FromStrings("__name__", "test_metric2", "foo", "boo"),
+		labels.FromStrings("__name__", "test_metric_replica1", "foo", "bar", "replica", "a"),
+		labels.FromStrings("__name__", "test_metric_replica1", "foo", "boo", "replica", "a"),
+		labels.FromStrings("__name__", "test_metric_replica1", "foo", "boo", "replica", "b"),
+		labels.FromStrings("__name__", "test_metric_replica1", "foo", "boo", "replica1", "a"),
 	}
 	db, err := e2eutil.NewTSDB()
@@ -285,76 +261,24 @@ func TestQueryEndpoints(t *testing.T) {
 				ResultType: parser.ValueTypeVector,
 				Result: promql.Vector{
 					{
-						Metric: labels.Labels{
-							{
-								Name:  "__name__",
-								Value: "test_metric_replica1",
-							},
-							{
-								Name:  "foo",
-								Value: "bar",
-							},
-							{
-								Name:  "replica",
-								Value: "a",
-							},
-						},
-						T: 123000,
-						F: 2,
+						Metric: labels.FromStrings("__name__", "test_metric_replica1", "foo", "bar", "replica", "a"),
+						T:      123000,
+						F:      2,
 					},
 					{
-						Metric: labels.Labels{
-							{
-								Name:  "__name__",
-								Value: "test_metric_replica1",
-							},
-							{
-								Name:  "foo",
-								Value: "boo",
-							},
-							{
-								Name:  "replica",
-								Value: "a",
-							},
-						},
-						T: 123000,
-						F: 2,
+						Metric: labels.FromStrings("__name__", "test_metric_replica1", "foo", "boo", "replica", "a"),
+						T:      123000,
+						F:      2,
 					},
 					{
-						Metric: labels.Labels{
-							{
-								Name:  "__name__",
-								Value: "test_metric_replica1",
-							},
-							{
-								Name:  "foo",
-								Value: "boo",
-							},
-							{
-								Name:  "replica",
-								Value: "b",
-							},
-						},
-						T: 123000,
-						F: 2,
+						Metric: labels.FromStrings("__name__", "test_metric_replica1", "foo", "boo", "replica", "b"),
+						T:      123000,
+						F:      2,
 					},
 					{
-						Metric: labels.Labels{
-							{
-								Name:  "__name__",
-								Value: "test_metric_replica1",
-							},
-							{
-								Name:  "foo",
-								Value: "boo",
-							},
-							{
-								Name:  "replica1",
-								Value: "a",
-							},
-						},
-						T: 123000,
-						F: 2,
+						Metric: labels.FromStrings("__name__", "test_metric_replica1", "foo", "boo", "replica1", "a"),
+						T:      123000,
+						F:      2,
 					},
 				},
 			},
@@ -372,50 +296,19 @@ func TestQueryEndpoints(t *testing.T) {
 				ResultType: parser.ValueTypeVector,
 				Result: promql.Vector{
 					{
-						Metric: labels.Labels{
-							{
-								Name:  "__name__",
-								Value: "test_metric_replica1",
-							},
-							{
-								Name:  "foo",
-								Value: "bar",
-							},
-						},
-						T: 123000,
-						F: 2,
+						Metric: labels.FromStrings("__name__", "test_metric_replica1", "foo", "bar"),
+						T:      123000,
+						F:      2,
 					},
 					{
-						Metric: labels.Labels{
-							{
-								Name:  "__name__",
-								Value: "test_metric_replica1",
-							},
-							{
-								Name:  "foo",
-								Value: "boo",
-							},
-						},
-						T: 123000,
-						F: 2,
+						Metric: labels.FromStrings("__name__", "test_metric_replica1", "foo", "boo"),
+						T:      123000,
+						F:      2,
 					},
 					{
-						Metric: labels.Labels{
-							{
-								Name:  "__name__",
-								Value: "test_metric_replica1",
-							},
-							{
-								Name:  "foo",
-								Value: "boo",
-							},
-							{
-								Name:  "replica1",
-								Value: "a",
-							},
-						},
-						T: 123000,
-						F: 2,
+						Metric: labels.FromStrings("__name__", "test_metric_replica1", "foo", "boo", "replica1", "a"),
+						T:      123000,
+						F:      2,
 					},
 				},
 			},
@@ -432,32 +325,14 @@ func TestQueryEndpoints(t *testing.T) {
 				ResultType: parser.ValueTypeVector,
 				Result: promql.Vector{
 					{
-						Metric: labels.Labels{
-							{
-								Name:  "__name__",
-								Value: "test_metric_replica1",
-							},
-							{
-								Name:  "foo",
-								Value: "bar",
-							},
-						},
-						T: 123000,
-						F: 2,
+						Metric: labels.FromStrings("__name__", "test_metric_replica1", "foo", "bar"),
+						T:      123000,
+						F:      2,
 					},
 					{
-						Metric: labels.Labels{
-							{
-								Name:  "__name__",
-								Value: "test_metric_replica1",
-							},
-							{
-								Name:  "foo",
-								Value: "boo",
-							},
-						},
-						T: 123000,
-						F: 2,
+						Metric: labels.FromStrings("__name__", "test_metric_replica1", "foo", "boo"),
+						T:      123000,
+						F:      2,
 					},
 				},
 			},
@@ -503,7 +378,6 @@ func TestQueryEndpoints(t *testing.T) {
 						}
 						return res
 					}(500, 1),
-					Metric: nil,
 				},
 			},
 		},
@@ -525,7 +399,6 @@ func TestQueryEndpoints(t *testing.T) {
 						{F: 1, T: timestamp.FromTime(start.Add(1 * time.Second))},
 						{F: 2, T: timestamp.FromTime(start.Add(2 * time.Second))},
 					},
-					Metric: nil,
 				},
 			},
 		},
@@ -778,7 +651,6 @@ func TestQueryAnalyzeEndpoints(t *testing.T) {
 						}
 						return res
 					}(500, 1),
-					Metric: nil,
 				},
 			},
 			QueryAnalysis: queryTelemetry{},
@@ -795,7 +667,7 @@ func newProxyStoreWithTSDBStore(db store.TSDBReader) *store.ProxyStore {
 func newProxyStoreWithTSDBStore(db store.TSDBReader) *store.ProxyStore {
 	c := &storetestutil.TestClient{
 		Name:        "1",
-		StoreClient: storepb.ServerAsClient(store.NewTSDBStore(nil, db, component.Query, nil)),
+		StoreClient: storepb.ServerAsClient(store.NewTSDBStore(nil, db, component.Query, labels.EmptyLabels())),
 		MinTime:     math.MinInt64, MaxTime: math.MaxInt64,
 	}
@@ -804,7 +676,7 @@ func newProxyStoreWithTSDBStore(db store.TSDBReader) *store.ProxyStore {
 		nil,
 		func() []store.Client { return []store.Client{c} },
 		component.Query,
-		nil,
+		labels.EmptyLabels(),
 		0,
 		store.EagerRetrieval,
 	)
@@ -812,41 +684,16 @@ func newProxyStoreWithTSDBStore(db store.TSDBReader) *store.ProxyStore {
 func TestMetadataEndpoints(t *testing.T) {
 	var old = []labels.Labels{
-		{
-			labels.Label{Name: "__name__", Value: "test_metric1"},
-			labels.Label{Name: "foo", Value: "bar"},
-		},
-		{
-			labels.Label{Name: "__name__", Value: "test_metric1"},
-			labels.Label{Name: "foo", Value: "boo"},
-		},
-		{
-			labels.Label{Name: "__name__", Value: "test_metric2"},
-			labels.Label{Name: "foo", Value: "boo"},
-		},
+		labels.FromStrings("__name__", "test_metric1", "foo", "bar"),
+		labels.FromStrings("__name__", "test_metric1", "foo", "boo"),
+		labels.FromStrings("__name__", "test_metric2", "foo", "boo"),
 	}

 	var recent = []labels.Labels{
-		{
-			labels.Label{Name: "__name__", Value: "test_metric_replica1"},
-			labels.Label{Name: "foo", Value: "bar"},
-			labels.Label{Name: "replica", Value: "a"},
-		},
-		{
-			labels.Label{Name: "__name__", Value: "test_metric_replica1"},
-			labels.Label{Name: "foo", Value: "boo"},
-			labels.Label{Name: "replica", Value: "a"},
-		},
-		{
-			labels.Label{Name: "__name__", Value: "test_metric_replica1"},
-			labels.Label{Name: "foo", Value: "boo"},
-			labels.Label{Name: "replica", Value: "b"},
-		},
-		{
-			labels.Label{Name: "__name__", Value: "test_metric_replica2"},
-			labels.Label{Name: "foo", Value: "boo"},
-			labels.Label{Name: "replica1", Value: "a"},
-		},
+		labels.FromStrings("__name__", "test_metric_replica1", "foo", "bar", "replica", "a"),
+		labels.FromStrings("__name__", "test_metric_replica1", "foo", "boo", "replica", "a"),
+		labels.FromStrings("__name__", "test_metric_replica1", "foo", "boo", "replica", "b"),
+		labels.FromStrings("__name__", "test_metric_replica2", "foo", "boo", "replica1", "a"),
 	}

 	dir := t.TempDir()
@@ -2002,7 +1849,7 @@ func TestRulesHandler(t *testing.T) {
 				EvaluationTime: all[3].GetAlert().EvaluationDurationSeconds,
 				Duration:       all[3].GetAlert().DurationSeconds,
 				KeepFiringFor:  all[3].GetAlert().KeepFiringForSeconds,
-				Annotations:    nil,
+				Annotations:    labels.EmptyLabels(),
 				Alerts:         []*testpromcompatibility.Alert{},
 				Type:           "alerting",
 			},
@@ -2017,7 +1864,7 @@ func TestRulesHandler(t *testing.T) {
 				EvaluationTime: all[4].GetAlert().EvaluationDurationSeconds,
 				Duration:       all[4].GetAlert().DurationSeconds,
 				KeepFiringFor:  all[4].GetAlert().KeepFiringForSeconds,
-				Annotations:    nil,
+				Annotations:    labels.EmptyLabels(),
 				Alerts:         []*testpromcompatibility.Alert{},
 				Type:           "alerting",
 			},


@@ -19,7 +19,7 @@ import (
 	"github.com/go-kit/log"
 	"github.com/go-kit/log/level"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/thanos-io/objstore"


@@ -18,7 +18,8 @@ import (
 	"github.com/thanos-io/thanos/pkg/extprom"

 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/prometheus/client_golang/prometheus/promauto"
@@ -144,7 +145,7 @@ func TestUpload(t *testing.T) {
 		testutil.Equals(t, 3, len(bkt.Objects()))
 		testutil.Equals(t, 3727, len(bkt.Objects()[path.Join(b1.String(), ChunksDirname, "000001")]))
 		testutil.Equals(t, 401, len(bkt.Objects()[path.Join(b1.String(), IndexFilename)]))
-		testutil.Equals(t, 595, len(bkt.Objects()[path.Join(b1.String(), MetaFilename)]))
+		testutil.Equals(t, 621, len(bkt.Objects()[path.Join(b1.String(), MetaFilename)]))

 		// File stats are gathered.
 		testutil.Equals(t, fmt.Sprintf(`{
@@ -153,6 +154,7 @@ func TestUpload(t *testing.T) {
 	"maxTime": 1000,
 	"stats": {
 		"numSamples": 500,
+		"numFloatSamples": 500,
 		"numSeries": 5,
 		"numChunks": 5
 	},
@@ -197,7 +199,7 @@ func TestUpload(t *testing.T) {
 		testutil.Equals(t, 3, len(bkt.Objects()))
 		testutil.Equals(t, 3727, len(bkt.Objects()[path.Join(b1.String(), ChunksDirname, "000001")]))
 		testutil.Equals(t, 401, len(bkt.Objects()[path.Join(b1.String(), IndexFilename)]))
-		testutil.Equals(t, 595, len(bkt.Objects()[path.Join(b1.String(), MetaFilename)]))
+		testutil.Equals(t, 621, len(bkt.Objects()[path.Join(b1.String(), MetaFilename)]))
 	}
 	{
 		// Upload with no external labels should be blocked.
@@ -229,7 +231,7 @@ func TestUpload(t *testing.T) {
 		testutil.Equals(t, 6, len(bkt.Objects()))
 		testutil.Equals(t, 3727, len(bkt.Objects()[path.Join(b2.String(), ChunksDirname, "000001")]))
 		testutil.Equals(t, 401, len(bkt.Objects()[path.Join(b2.String(), IndexFilename)]))
-		testutil.Equals(t, 574, len(bkt.Objects()[path.Join(b2.String(), MetaFilename)]))
+		testutil.Equals(t, 600, len(bkt.Objects()[path.Join(b2.String(), MetaFilename)]))
 	}
 }
@@ -587,8 +589,8 @@ type errBucket struct {
 	failSuffix string
 }

-func (eb errBucket) Upload(ctx context.Context, name string, r io.Reader) error {
-	err := eb.Bucket.Upload(ctx, name, r)
+func (eb errBucket) Upload(ctx context.Context, name string, r io.Reader, opts ...objstore.ObjectUploadOption) error {
+	err := eb.Bucket.Upload(ctx, name, r, opts...)
 	if err != nil {
 		return err
 	}


@@ -18,7 +18,8 @@ import (
 	"github.com/go-kit/log"
 	"github.com/go-kit/log/level"
 	"github.com/golang/groupcache/singleflight"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/prometheus/client_golang/prometheus/promauto"
@@ -41,7 +42,8 @@ const FetcherConcurrency = 32
 // to allow depending projects (eg. Cortex) to implement their own custom metadata fetcher while tracking
 // compatible metrics.
 type BaseFetcherMetrics struct {
 	Syncs      prometheus.Counter
+	CacheBusts prometheus.Counter
 }

 // FetcherMetrics holds metrics tracked by the metadata fetcher. This struct and its fields are exported
@@ -91,6 +93,9 @@ const (
 	// MarkedForNoDownsampleMeta is label for blocks which are loaded but also marked for no downsample. This label is also counted in `loaded` label metric.
 	MarkedForNoDownsampleMeta = "marked-for-no-downsample"

+	// ParquetMigratedMeta is label for blocks which are marked as migrated to parquet format.
+	ParquetMigratedMeta = "parquet-migrated"
+
 	// Modified label values.
 	replicaRemovedMeta = "replica-label-removed"
 )
@@ -103,6 +108,11 @@ func NewBaseFetcherMetrics(reg prometheus.Registerer) *BaseFetcherMetrics {
 		Name:      "base_syncs_total",
 		Help:      "Total blocks metadata synchronization attempts by base Fetcher",
 	})
+	m.CacheBusts = promauto.With(reg).NewCounter(prometheus.CounterOpts{
+		Subsystem: FetcherSubSys,
+		Name:      "base_cache_busts_total",
+		Help:      "Total blocks metadata cache busts by base Fetcher",
+	})

 	return &m
 }
@@ -161,6 +171,7 @@ func DefaultSyncedStateLabelValues() [][]string {
 		{duplicateMeta},
 		{MarkedForDeletionMeta},
 		{MarkedForNoCompactionMeta},
+		{ParquetMigratedMeta},
 	}
 }
@@ -174,7 +185,7 @@ func DefaultModifiedLabelValues() [][]string {
 type Lister interface {
 	// GetActiveAndPartialBlockIDs GetActiveBlocksIDs returning it via channel (streaming) and response.
 	// Active blocks are blocks which contain meta.json, while partial blocks are blocks without meta.json
-	GetActiveAndPartialBlockIDs(ctx context.Context, ch chan<- ulid.ULID) (partialBlocks map[ulid.ULID]bool, err error)
+	GetActiveAndPartialBlockIDs(ctx context.Context, activeBlocks chan<- ActiveBlockFetchData) (partialBlocks map[ulid.ULID]bool, err error)
 }

 // RecursiveLister lists block IDs by recursively iterating through a bucket.
@@ -190,9 +201,17 @@ func NewRecursiveLister(logger log.Logger, bkt objstore.InstrumentedBucketReader
 	}
 }

-func (f *RecursiveLister) GetActiveAndPartialBlockIDs(ctx context.Context, ch chan<- ulid.ULID) (partialBlocks map[ulid.ULID]bool, err error) {
+type ActiveBlockFetchData struct {
+	lastModified time.Time
+	ulid.ULID
+}
+
+func (f *RecursiveLister) GetActiveAndPartialBlockIDs(ctx context.Context, activeBlocks chan<- ActiveBlockFetchData) (partialBlocks map[ulid.ULID]bool, err error) {
 	partialBlocks = make(map[ulid.ULID]bool)
-	err = f.bkt.Iter(ctx, "", func(name string) error {
+
+	err = f.bkt.IterWithAttributes(ctx, "", func(attrs objstore.IterObjectAttributes) error {
+		name := attrs.Name
 		parts := strings.Split(name, "/")
 		dir, file := parts[0], parts[len(parts)-1]
 		id, ok := IsBlockDir(dir)
@@ -205,15 +224,20 @@ func (f *RecursiveLister) GetActiveAndPartialBlockIDs(ctx context.Context, ch ch
 		if !IsBlockMetaFile(file) {
 			return nil
 		}
-		partialBlocks[id] = false
+		lastModified, _ := attrs.LastModified()
+		delete(partialBlocks, id)
 		select {
 		case <-ctx.Done():
 			return ctx.Err()
-		case ch <- id:
+		case activeBlocks <- ActiveBlockFetchData{
+			ULID:         id,
+			lastModified: lastModified,
+		}:
 		}
 		return nil
-	}, objstore.WithRecursiveIter())
+	}, objstore.WithUpdatedAt(), objstore.WithRecursiveIter())
 	return partialBlocks, err
 }
@@ -231,10 +255,11 @@ func NewConcurrentLister(logger log.Logger, bkt objstore.InstrumentedBucketReade
 	}
 }

-func (f *ConcurrentLister) GetActiveAndPartialBlockIDs(ctx context.Context, ch chan<- ulid.ULID) (partialBlocks map[ulid.ULID]bool, err error) {
+func (f *ConcurrentLister) GetActiveAndPartialBlockIDs(ctx context.Context, activeBlocks chan<- ActiveBlockFetchData) (partialBlocks map[ulid.ULID]bool, err error) {
 	const concurrency = 64

 	partialBlocks = make(map[ulid.ULID]bool)
 	var (
 		metaChan = make(chan ulid.ULID, concurrency)
 		eg, gCtx = errgroup.WithContext(ctx)
@@ -257,10 +282,14 @@ func (f *ConcurrentLister) GetActiveAndPartialBlockIDs(ctx context.Context, ch c
 				mu.Unlock()
 				continue
 			}
+
 			select {
 			case <-gCtx.Done():
 				return gCtx.Err()
-			case ch <- uid:
+			case activeBlocks <- ActiveBlockFetchData{
+				ULID:         uid,
+				lastModified: time.Time{}, // Not used, cache busting is only implemented by the recursive lister because otherwise we would have to call Attributes() (one extra call).
+			}:
 			}
 		}
 		return nil
@@ -313,12 +342,16 @@ type BaseFetcher struct {
 	blockIDsLister Lister

 	// Optional local directory to cache meta.json files.
 	cacheDir   string
 	syncs      prometheus.Counter
+	cacheBusts prometheus.Counter
 	g          singleflight.Group

 	mtx    sync.Mutex
-	cached map[ulid.ULID]*metadata.Meta
+	cached             *sync.Map
+	modifiedTimestamps map[ulid.ULID]time.Time
 }

 // NewBaseFetcher constructs BaseFetcher.
@@ -346,8 +379,9 @@ func NewBaseFetcherWithMetrics(logger log.Logger, concurrency int, bkt objstore.
 		bkt:            bkt,
 		blockIDsLister: blockIDsLister,
 		cacheDir:       cacheDir,
-		cached:         map[ulid.ULID]*metadata.Meta{},
+		cached:         &sync.Map{},
 		syncs:          metrics.Syncs,
+		cacheBusts:     metrics.CacheBusts,
 	}, nil
 }
@@ -390,6 +424,22 @@ var (
 	ErrorSyncMetaCorrupted = errors.New("meta.json corrupted")
 )

+func (f *BaseFetcher) metaUpdated(id ulid.ULID, modified time.Time) bool {
+	if f.modifiedTimestamps[id].IsZero() {
+		return false
+	}
+	return !f.modifiedTimestamps[id].Equal(modified)
+}
+
+func (f *BaseFetcher) bustCacheForID(id ulid.ULID) {
+	f.cacheBusts.Inc()
+
+	f.cached.Delete(id)
+	if err := os.RemoveAll(filepath.Join(f.cacheDir, id.String())); err != nil {
+		level.Warn(f.logger).Log("msg", "failed to remove cached meta.json dir", "dir", filepath.Join(f.cacheDir, id.String()), "err", err)
+	}
+}
 // loadMeta returns metadata from object storage or error.
 // It returns `ErrorSyncMetaNotFound` and `ErrorSyncMetaCorrupted` sentinel errors in those cases.
 func (f *BaseFetcher) loadMeta(ctx context.Context, id ulid.ULID) (*metadata.Meta, error) {
@@ -398,8 +448,8 @@ func (f *BaseFetcher) loadMeta(ctx context.Context, id ulid.ULID) (*metadata.Met
 		cachedBlockDir = filepath.Join(f.cacheDir, id.String())
 	)

-	if m, seen := f.cached[id]; seen {
-		return m, nil
+	if m, seen := f.cached.Load(id); seen {
+		return m.(*metadata.Meta), nil
 	}

 	// Best effort load from local dir.
@@ -456,8 +506,9 @@ func (f *BaseFetcher) loadMeta(ctx context.Context, id ulid.ULID) (*metadata.Met
 type response struct {
 	metas              map[ulid.ULID]*metadata.Meta
 	partial            map[ulid.ULID]error
+	modifiedTimestamps map[ulid.ULID]time.Time
 	// If metaErr > 0 it means incomplete view, so some metas, failed to be loaded.
 	metaErrs errutil.MultiError
@@ -470,21 +521,29 @@ func (f *BaseFetcher) fetchMetadata(ctx context.Context) (interface{}, error) {
 	var (
 		resp = response{
 			metas:              make(map[ulid.ULID]*metadata.Meta),
 			partial:            make(map[ulid.ULID]error),
+			modifiedTimestamps: make(map[ulid.ULID]time.Time),
 		}
-		eg  errgroup.Group
-		ch  = make(chan ulid.ULID, f.concurrency)
-		mtx sync.Mutex
+		eg             errgroup.Group
+		activeBlocksCh = make(chan ActiveBlockFetchData, f.concurrency)
+		mtx            sync.Mutex
 	)
 	level.Debug(f.logger).Log("msg", "fetching meta data", "concurrency", f.concurrency)
 	for i := 0; i < f.concurrency; i++ {
 		eg.Go(func() error {
-			for id := range ch {
+			for activeBlockFetchMD := range activeBlocksCh {
+				id := activeBlockFetchMD.ULID
+
+				if f.metaUpdated(id, activeBlockFetchMD.lastModified) {
+					f.bustCacheForID(id)
+				}
+
 				meta, err := f.loadMeta(ctx, id)
 				if err == nil {
 					mtx.Lock()
 					resp.metas[id] = meta
+					resp.modifiedTimestamps[id] = activeBlockFetchMD.lastModified
 					mtx.Unlock()
 					continue
 				}
@@ -517,8 +576,8 @@ func (f *BaseFetcher) fetchMetadata(ctx context.Context) (interface{}, error) {
 	var err error
 	// Workers scheduled, distribute blocks.
 	eg.Go(func() error {
-		defer close(ch)
-		partialBlocks, err = f.blockIDsLister.GetActiveAndPartialBlockIDs(ctx, ch)
+		defer close(activeBlocksCh)
+		partialBlocks, err = f.blockIDsLister.GetActiveAndPartialBlockIDs(ctx, activeBlocksCh)
 		return err
 	})
@@ -540,13 +599,20 @@ func (f *BaseFetcher) fetchMetadata(ctx context.Context) (interface{}, error) {
 	}

 	// Only for complete view of blocks update the cache.
-	cached := make(map[ulid.ULID]*metadata.Meta, len(resp.metas))
+	cached := &sync.Map{}
 	for id, m := range resp.metas {
-		cached[id] = m
+		cached.Store(id, m)
+	}
+
+	modifiedTimestamps := make(map[ulid.ULID]time.Time, len(resp.modifiedTimestamps))
+	for id, ts := range resp.modifiedTimestamps {
+		modifiedTimestamps[id] = ts
 	}

 	f.mtx.Lock()
 	f.cached = cached
+	f.modifiedTimestamps = modifiedTimestamps
 	f.mtx.Unlock()

 	// Best effort cleanup of disk-cached metas.
@@ -631,8 +697,12 @@ func (f *BaseFetcher) fetch(ctx context.Context, metrics *FetcherMetrics, filter
 func (f *BaseFetcher) countCached() int {
 	f.mtx.Lock()
 	defer f.mtx.Unlock()

-	return len(f.cached)
+	var i int
+	f.cached.Range(func(_, _ interface{}) bool {
+		i++
+		return true
+	})
+	return i
 }
 type MetaFetcher struct {
@@ -1085,3 +1155,46 @@ func ParseRelabelConfig(contentYaml []byte, supportedActions map[relabel.Action]
 	return relabelConfig, nil
 }
+
+var _ MetadataFilter = &ParquetMigratedMetaFilter{}
+
+// ParquetMigratedMetaFilter is a metadata filter that filters out blocks that have been
+// migrated to parquet format. The filter checks for the presence of the parquet_migrated
+// extension key with a value of true.
+// Not go-routine safe.
+type ParquetMigratedMetaFilter struct {
+	logger log.Logger
+}
+
+// NewParquetMigratedMetaFilter creates a new ParquetMigratedMetaFilter.
+func NewParquetMigratedMetaFilter(logger log.Logger) *ParquetMigratedMetaFilter {
+	return &ParquetMigratedMetaFilter{
+		logger: logger,
+	}
+}
+
+// Filter filters out blocks that have been marked as migrated to parquet format.
+func (f *ParquetMigratedMetaFilter) Filter(_ context.Context, metas map[ulid.ULID]*metadata.Meta, synced GaugeVec, modified GaugeVec) error {
+	for id, meta := range metas {
+		if meta.Thanos.Extensions == nil {
+			continue
+		}
+		extensionsMap, ok := meta.Thanos.Extensions.(map[string]interface{})
+		if !ok {
+			continue
+		}
+		parquetMigrated, exists := extensionsMap[metadata.ParquetMigratedExtensionKey]
+		if !exists {
+			continue
+		}
+		if migratedBool, ok := parquetMigrated.(bool); ok && migratedBool {
+			level.Debug(f.logger).Log("msg", "filtering out parquet migrated block", "block", id)
+			synced.WithLabelValues(ParquetMigratedMeta).Inc()
+			delete(metas, id)
+		}
+	}
+	return nil
+}
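The filter above has to dig a boolean out of `meta.Thanos.Extensions`, which is decoded from JSON as `interface{}`: first assert the map, then the key's value type, bailing out (keeping the block) on any mismatch. A self-contained sketch of that defensive lookup; `isParquetMigrated` and its `key` parameter are illustrative helpers, not the Thanos API:

```go
package main

import "fmt"

// isParquetMigrated returns true only when extensions is a JSON-style map
// holding the given key with a literal boolean true. Any shape mismatch
// (nil, non-map, missing key, non-bool value) means "not migrated".
func isParquetMigrated(extensions interface{}, key string) bool {
	if extensions == nil {
		return false
	}
	m, ok := extensions.(map[string]interface{})
	if !ok {
		return false
	}
	v, ok := m[key]
	if !ok {
		return false
	}
	b, ok := v.(bool)
	return ok && b
}

func main() {
	ext := map[string]interface{}{"parquet_migrated": true}
	fmt.Println(isParquetMigrated(ext, "parquet_migrated")) // true
	fmt.Println(isParquetMigrated(nil, "parquet_migrated")) // false
}
```

The double type assertion is the important part: a string `"true"` or a missing key silently leaves the block loaded, which errs on the safe side for a filter that drops blocks from the store's view.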


@ -18,9 +18,11 @@ import (
"time" "time"
"github.com/go-kit/log" "github.com/go-kit/log"
"github.com/oklog/ulid" "github.com/oklog/ulid/v2"
"github.com/pkg/errors" "github.com/pkg/errors"
"github.com/prometheus/client_golang/prometheus" "github.com/prometheus/client_golang/prometheus"
"github.com/prometheus/client_golang/prometheus/promauto"
promtest "github.com/prometheus/client_golang/prometheus/testutil" promtest "github.com/prometheus/client_golang/prometheus/testutil"
"github.com/prometheus/prometheus/tsdb" "github.com/prometheus/prometheus/tsdb"
"github.com/thanos-io/objstore" "github.com/thanos-io/objstore"
@@ -71,30 +73,38 @@ func TestMetaFetcher_Fetch(t *testing.T) {
 	dir := t.TempDir()
 	var ulidToDelete ulid.ULID
-	r := prometheus.NewRegistry()
 	noopLogger := log.NewNopLogger()
 	insBkt := objstore.WithNoopInstr(bkt)
-	baseBlockIDsFetcher := NewConcurrentLister(noopLogger, insBkt)
-	baseFetcher, err := NewBaseFetcher(noopLogger, 20, insBkt, baseBlockIDsFetcher, dir, r)
+	r := prometheus.NewRegistry()
+	recursiveLister := NewRecursiveLister(noopLogger, insBkt)
+	recursiveBaseFetcher, err := NewBaseFetcher(noopLogger, 20, insBkt, recursiveLister, dir, r)
 	testutil.Ok(t, err)
-	fetcher := baseFetcher.NewMetaFetcher(r, []MetadataFilter{
+	recursiveFetcher := recursiveBaseFetcher.NewMetaFetcher(r, []MetadataFilter{
 		&ulidFilter{ulidToDelete: &ulidToDelete},
 	}, nil)
-	for i, tcase := range []struct {
+	for _, tcase := range []struct {
 		name                  string
-		do                    func()
+		do                    func(cleanCache func())
 		filterULID            ulid.ULID
 		expectedMetas         []ulid.ULID
 		expectedCorruptedMeta []ulid.ULID
 		expectedNoMeta        []ulid.ULID
 		expectedFiltered      int
 		expectedMetaErr       error
+		expectedCacheBusts    int
+		expectedSyncs         int
+
+		// If this is set then use it.
+		fetcher     *MetaFetcher
+		baseFetcher *BaseFetcher
 	}{
 		{
 			name:                  "empty bucket",
-			do:                    func() {},
+			do:                    func(_ func()) {},
 			expectedMetas:         ULIDs(),
 			expectedCorruptedMeta: ULIDs(),
@@ -102,7 +112,7 @@ func TestMetaFetcher_Fetch(t *testing.T) {
 		},
 		{
 			name: "3 metas in bucket",
-			do: func() {
+			do: func(_ func()) {
 				var meta metadata.Meta
 				meta.Version = 1
 				meta.ULID = ULID(1)
@@ -125,28 +135,8 @@ func TestMetaFetcher_Fetch(t *testing.T) {
 			expectedNoMeta:        ULIDs(),
 		},
 		{
-			name:                  "nothing changed",
-			do:                    func() {},
-			expectedMetas:         ULIDs(1, 2, 3),
-			expectedCorruptedMeta: ULIDs(),
-			expectedNoMeta:        ULIDs(),
-		},
-		{
-			name: "fresh cache",
-			do: func() {
-				baseFetcher.cached = map[ulid.ULID]*metadata.Meta{}
-			},
-			expectedMetas:         ULIDs(1, 2, 3),
-			expectedCorruptedMeta: ULIDs(),
-			expectedNoMeta:        ULIDs(),
-		},
-		{
-			name: "fresh cache: meta 2 and 3 have corrupted data on disk ",
-			do: func() {
-				baseFetcher.cached = map[ulid.ULID]*metadata.Meta{}
+			name: "meta 2 and 3 have corrupted data on disk ",
+			do: func(cleanCache func()) {
 				testutil.Ok(t, os.Remove(filepath.Join(dir, "meta-syncer", ULID(2).String(), MetaFilename)))
 				f, err := os.OpenFile(filepath.Join(dir, "meta-syncer", ULID(3).String(), MetaFilename), os.O_WRONLY, os.ModePerm)
@@ -163,7 +153,7 @@ func TestMetaFetcher_Fetch(t *testing.T) {
 		},
 		{
 			name: "block without meta",
-			do: func() {
+			do: func(_ func()) {
 				testutil.Ok(t, bkt.Upload(ctx, path.Join(ULID(4).String(), "some-file"), bytes.NewBuffer([]byte("something"))))
 			},
@@ -173,7 +163,7 @@ func TestMetaFetcher_Fetch(t *testing.T) {
 		},
 		{
 			name: "corrupted meta.json",
-			do: func() {
+			do: func(_ func()) {
 				testutil.Ok(t, bkt.Upload(ctx, path.Join(ULID(5).String(), MetaFilename), bytes.NewBuffer([]byte("{ not a json"))))
 			},
@@ -181,46 +171,71 @@ func TestMetaFetcher_Fetch(t *testing.T) {
 			expectedCorruptedMeta: ULIDs(5),
 			expectedNoMeta:        ULIDs(4),
 		},
-		{
-			name: "some added some deleted",
-			do: func() {
-				testutil.Ok(t, Delete(ctx, log.NewNopLogger(), bkt, ULID(2)))
+		{
+			name:                  "filter not existing ulid",
+			do:                    func(_ func()) {},
+			filterULID:            ULID(10),
+			expectedMetas:         ULIDs(1, 2, 3),
+			expectedCorruptedMeta: ULIDs(5),
+			expectedNoMeta:        ULIDs(4),
+		},
+		{
+			name: "filter ulid 1",
+			do: func(_ func()) {
 				var meta metadata.Meta
 				meta.Version = 1
-				meta.ULID = ULID(6)
+				meta.ULID = ULID(1)
 				var buf bytes.Buffer
 				testutil.Ok(t, json.NewEncoder(&buf).Encode(&meta))
 				testutil.Ok(t, bkt.Upload(ctx, path.Join(meta.ULID.String(), metadata.MetaFilename), &buf))
 			},
-			expectedMetas:         ULIDs(1, 3, 6),
-			expectedCorruptedMeta: ULIDs(5),
-			expectedNoMeta:        ULIDs(4),
-		},
-		{
-			name:                  "filter not existing ulid",
-			do:                    func() {},
-			filterULID:            ULID(10),
-			expectedMetas:         ULIDs(1, 3, 6),
-			expectedCorruptedMeta: ULIDs(5),
-			expectedNoMeta:        ULIDs(4),
-		},
-		{
-			name:       "filter ulid 1",
-			do:         func() {},
 			filterULID: ULID(1),
-			expectedMetas:         ULIDs(3, 6),
+			expectedMetas:         ULIDs(2, 3),
 			expectedCorruptedMeta: ULIDs(5),
 			expectedNoMeta:        ULIDs(4),
 			expectedFiltered:      1,
 		},
+		{
+			name: "use recursive lister",
+			do: func(cleanCache func()) {
+				cleanCache()
+			},
+			fetcher:               recursiveFetcher,
+			baseFetcher:           recursiveBaseFetcher,
+			expectedMetas:         ULIDs(1, 2, 3),
+			expectedCorruptedMeta: ULIDs(5),
+			expectedNoMeta:        ULIDs(4),
+		},
+		{
+			name: "update timestamp, expect a cache bust",
+			do: func(_ func()) {
+				var meta metadata.Meta
+				meta.Version = 1
+				meta.MaxTime = 123456
+				meta.ULID = ULID(1)
+				var buf bytes.Buffer
+				testutil.Ok(t, json.NewEncoder(&buf).Encode(&meta))
+				testutil.Ok(t, bkt.Upload(ctx, path.Join(meta.ULID.String(), metadata.MetaFilename), &buf))
+			},
+			fetcher:               recursiveFetcher,
+			baseFetcher:           recursiveBaseFetcher,
+			expectedMetas:         ULIDs(1, 2, 3),
+			expectedCorruptedMeta: ULIDs(5),
+			expectedNoMeta:        ULIDs(4),
+			expectedFiltered:      0,
+			expectedCacheBusts:    1,
+			expectedSyncs:         2,
+		},
 		{
 			name: "error: not supported meta version",
-			do: func() {
+			do: func(_ func()) {
 				var meta metadata.Meta
 				meta.Version = 20
 				meta.ULID = ULID(7)
@@ -230,14 +245,40 @@ func TestMetaFetcher_Fetch(t *testing.T) {
 				testutil.Ok(t, bkt.Upload(ctx, path.Join(meta.ULID.String(), metadata.MetaFilename), &buf))
 			},
-			expectedMetas:         ULIDs(1, 3, 6),
+			expectedMetas:         ULIDs(1, 2, 3),
 			expectedCorruptedMeta: ULIDs(5),
 			expectedNoMeta:        ULIDs(4),
 			expectedMetaErr:       errors.New("incomplete view: unexpected meta file: 00000000070000000000000000/meta.json version: 20"),
 		},
 	} {
 		if ok := t.Run(tcase.name, func(t *testing.T) {
-			tcase.do()
+			r := prometheus.NewRegistry()
+
+			var fetcher *MetaFetcher
+			var baseFetcher *BaseFetcher
+			if tcase.baseFetcher != nil {
+				baseFetcher = tcase.baseFetcher
+			} else {
+				lister := NewConcurrentLister(noopLogger, insBkt)
+				bf, err := NewBaseFetcher(noopLogger, 20, insBkt, lister, dir, r)
+				testutil.Ok(t, err)
+				baseFetcher = bf
+			}
+
+			if tcase.fetcher != nil {
+				fetcher = tcase.fetcher
+			} else {
+				fetcher = baseFetcher.NewMetaFetcher(r, []MetadataFilter{
+					&ulidFilter{ulidToDelete: &ulidToDelete},
+				}, nil)
+			}
+
+			tcase.do(func() {
+				baseFetcher.cached.Clear()
+				testutil.Ok(t, os.RemoveAll(filepath.Join(dir, "meta-syncer")))
+			})
 			ulidToDelete = tcase.filterULID
 			metas, partial, err := fetcher.Fetch(ctx)
@@ -281,8 +322,10 @@ func TestMetaFetcher_Fetch(t *testing.T) {
 			if tcase.expectedMetaErr != nil {
 				expectedFailures = 1
 			}
-			testutil.Equals(t, float64(i+1), promtest.ToFloat64(baseFetcher.syncs))
-			testutil.Equals(t, float64(i+1), promtest.ToFloat64(fetcher.metrics.Syncs))
+			testutil.Equals(t, float64(max(1, tcase.expectedSyncs)), promtest.ToFloat64(baseFetcher.syncs))
+			testutil.Equals(t, float64(tcase.expectedCacheBusts), promtest.ToFloat64(baseFetcher.cacheBusts))
+			testutil.Equals(t, float64(max(1, tcase.expectedSyncs)), promtest.ToFloat64(fetcher.metrics.Syncs))
 			testutil.Equals(t, float64(len(tcase.expectedMetas)), promtest.ToFloat64(fetcher.metrics.Synced.WithLabelValues(LoadedMeta)))
 			testutil.Equals(t, float64(len(tcase.expectedNoMeta)), promtest.ToFloat64(fetcher.metrics.Synced.WithLabelValues(NoMeta)))
 			testutil.Equals(t, float64(tcase.expectedFiltered), promtest.ToFloat64(fetcher.metrics.Synced.WithLabelValues("filtered")))
@@ -1211,3 +1254,157 @@ func Test_ParseRelabelConfig(t *testing.T) {
 	testutil.NotOk(t, err)
 	testutil.Equals(t, "unsupported relabel action: labelmap", err.Error())
 }
func TestParquetMigratedMetaFilter_Filter(t *testing.T) {
logger := log.NewNopLogger()
filter := NewParquetMigratedMetaFilter(logger)
// Simulate what might happen when extensions are loaded from JSON
extensions := struct {
ParquetMigrated bool `json:"parquet_migrated"`
}{
ParquetMigrated: true,
}
for _, c := range []struct {
name string
metas map[ulid.ULID]*metadata.Meta
check func(t *testing.T, metas map[ulid.ULID]*metadata.Meta, err error)
}{
{
name: "block with other extensions",
metas: map[ulid.ULID]*metadata.Meta{
ulid.MustNew(2, nil): {
Thanos: metadata.Thanos{
Extensions: map[string]interface{}{
"other_key": "other_value",
},
},
},
},
check: func(t *testing.T, metas map[ulid.ULID]*metadata.Meta, err error) {
testutil.Ok(t, err)
testutil.Equals(t, 1, len(metas))
},
},
{
name: "no extensions",
metas: map[ulid.ULID]*metadata.Meta{
ulid.MustNew(1, nil): {
Thanos: metadata.Thanos{
Extensions: nil,
},
},
},
check: func(t *testing.T, metas map[ulid.ULID]*metadata.Meta, err error) {
testutil.Equals(t, 1, len(metas))
testutil.Ok(t, err)
},
},
{
name: "block with parquet_migrated=false",
metas: map[ulid.ULID]*metadata.Meta{
ulid.MustNew(3, nil): {
Thanos: metadata.Thanos{
Extensions: map[string]interface{}{
metadata.ParquetMigratedExtensionKey: false,
},
},
},
},
check: func(t *testing.T, metas map[ulid.ULID]*metadata.Meta, err error) {
testutil.Equals(t, 1, len(metas))
testutil.Ok(t, err)
},
},
{
name: "block with parquet_migrated=true",
metas: map[ulid.ULID]*metadata.Meta{
ulid.MustNew(4, nil): {
Thanos: metadata.Thanos{
Extensions: map[string]interface{}{
metadata.ParquetMigratedExtensionKey: true,
},
},
},
},
check: func(t *testing.T, metas map[ulid.ULID]*metadata.Meta, err error) {
testutil.Equals(t, 0, len(metas))
testutil.Ok(t, err)
},
},
{
name: "mixed blocks with parquet_migrated",
metas: map[ulid.ULID]*metadata.Meta{
ulid.MustNew(5, nil): {
Thanos: metadata.Thanos{
Extensions: map[string]interface{}{
metadata.ParquetMigratedExtensionKey: true,
},
},
},
ulid.MustNew(6, nil): {
Thanos: metadata.Thanos{
Extensions: map[string]interface{}{
metadata.ParquetMigratedExtensionKey: false,
},
},
},
ulid.MustNew(7, nil): {
Thanos: metadata.Thanos{
Extensions: nil,
},
},
},
check: func(t *testing.T, metas map[ulid.ULID]*metadata.Meta, err error) {
testutil.Equals(t, 2, len(metas))
testutil.Ok(t, err)
testutil.Assert(t, metas[ulid.MustNew(6, nil)] != nil, "Expected block with parquet_migrated=false to remain")
testutil.Assert(t, metas[ulid.MustNew(7, nil)] != nil, "Expected block without extensions to remain")
},
},
{
name: "block with serialized extensions",
metas: map[ulid.ULID]*metadata.Meta{
ulid.MustNew(8, nil): {
Thanos: metadata.Thanos{
Extensions: extensions,
},
},
},
check: func(t *testing.T, metas map[ulid.ULID]*metadata.Meta, err error) {
testutil.Equals(t, 0, len(metas))
testutil.Ok(t, err)
},
},
} {
t.Run(c.name, func(t *testing.T) {
r := prometheus.NewRegistry()
synced := promauto.With(r).NewGaugeVec(
prometheus.GaugeOpts{
Name: "test_synced",
Help: "Test synced metric",
},
[]string{"state"},
)
modified := promauto.With(r).NewGaugeVec(
prometheus.GaugeOpts{
Name: "test_modified",
Help: "Test modified metric",
},
[]string{"state"},
)
ctx := context.Background()
m, err := json.Marshal(c.metas)
testutil.Ok(t, err)
var outmetas map[ulid.ULID]*metadata.Meta
testutil.Ok(t, json.Unmarshal(m, &outmetas))
err = filter.Filter(ctx, outmetas, synced, modified)
c.check(t, outmetas, err)
})
}
}

View File

@@ -16,7 +16,8 @@ import (
 	"github.com/go-kit/log"
 	"github.com/go-kit/log/level"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/prometheus/prometheus/model/labels"
 	"github.com/prometheus/prometheus/storage"
@@ -247,7 +248,7 @@ func GatherIndexHealthStats(ctx context.Context, logger log.Logger, fn string, m
 	}
 	stats.LabelNamesCount = int64(len(lnames))
-	lvals, err := r.LabelValues(ctx, "__name__")
+	lvals, err := r.LabelValues(ctx, "__name__", nil)
 	if err != nil {
 		return stats, errors.Wrap(err, "metric label values")
 	}

View File

@@ -21,7 +21,7 @@ import (
 	"github.com/go-kit/log"
 	"github.com/go-kit/log/level"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/prometheus/client_golang/prometheus/promauto"

View File

@@ -15,7 +15,8 @@ import (
 	"time"
 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/prometheus/prometheus/model/labels"
@@ -306,7 +307,7 @@ func compareIndexToHeader(t *testing.T, indexByteSlice index.ByteSlice, headerRe
 	minStart := int64(math.MaxInt64)
 	maxEnd := int64(math.MinInt64)
 	for il, lname := range expLabelNames {
-		expectedLabelVals, err := indexReader.SortedLabelValues(ctx, lname)
+		expectedLabelVals, err := indexReader.SortedLabelValues(ctx, lname, nil)
 		testutil.Ok(t, err)
 		vals, err := headerReader.LabelValues(lname)

View File

@@ -12,7 +12,8 @@ import (
 	"github.com/go-kit/log"
 	"github.com/go-kit/log/level"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/prometheus/client_golang/prometheus/promauto"

View File

@@ -13,7 +13,8 @@ import (
 	"time"
 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	promtestutil "github.com/prometheus/client_golang/prometheus/testutil"
 	"github.com/prometheus/prometheus/model/labels"
 	"github.com/thanos-io/objstore/providers/filesystem"

View File

@@ -7,10 +7,14 @@ import (
 	"context"
 	"sync"
 	"time"
+	"unsafe"
 	"github.com/go-kit/log"
 	"github.com/go-kit/log/level"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
+	xsync "golang.org/x/sync/singleflight"
 	"github.com/pkg/errors"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/thanos-io/objstore"
@@ -48,6 +52,7 @@ type ReaderPool struct {
 	// Keep track of all readers managed by the pool.
 	lazyReadersMx    sync.Mutex
 	lazyReaders      map[*LazyBinaryReader]struct{}
+	lazyReadersSF    xsync.Group
 	lazyDownloadFunc LazyDownloadIndexHeaderFunc
 }
@@ -122,18 +127,16 @@ func NewReaderPool(logger log.Logger, lazyReaderEnabled bool, lazyReaderIdleTime
 // with lazy reader enabled, this function will return a lazy reader. The returned lazy reader
 // is tracked by the pool and automatically closed once the idle timeout expires.
 func (p *ReaderPool) NewBinaryReader(ctx context.Context, logger log.Logger, bkt objstore.BucketReader, dir string, id ulid.ULID, postingOffsetsInMemSampling int, meta *metadata.Meta) (Reader, error) {
-	var reader Reader
-	var err error
-	if p.lazyReaderEnabled {
-		reader, err = NewLazyBinaryReader(ctx, logger, bkt, dir, id, postingOffsetsInMemSampling, p.metrics.lazyReader, p.metrics.binaryReader, p.onLazyReaderClosed, p.lazyDownloadFunc(meta))
-	} else {
-		reader, err = NewBinaryReader(ctx, logger, bkt, dir, id, postingOffsetsInMemSampling, p.metrics.binaryReader)
-	}
-	if err != nil {
-		return nil, err
-	}
+	if !p.lazyReaderEnabled {
+		return NewBinaryReader(ctx, logger, bkt, dir, id, postingOffsetsInMemSampling, p.metrics.binaryReader)
+	}
+
+	idBytes := id.Bytes()
+	lazyReader, err, _ := p.lazyReadersSF.Do(*(*string)(unsafe.Pointer(&idBytes)), func() (interface{}, error) {
+		return NewLazyBinaryReader(ctx, logger, bkt, dir, id, postingOffsetsInMemSampling, p.metrics.lazyReader, p.metrics.binaryReader, p.onLazyReaderClosed, p.lazyDownloadFunc(meta))
+	})
+	reader := lazyReader.(Reader)

 	// Keep track of lazy readers only if required.
 	if p.lazyReaderEnabled && p.lazyReaderIdleTimeout > 0 {

View File

@@ -6,12 +6,16 @@ package indexheader
 import (
 	"context"
 	"path/filepath"
+	"sync"
 	"testing"
 	"time"
 	"github.com/go-kit/log"
+	"github.com/prometheus/client_golang/prometheus"
 	promtestutil "github.com/prometheus/client_golang/prometheus/testutil"
 	"github.com/prometheus/prometheus/model/labels"
+	"github.com/stretchr/testify/require"
+	"github.com/thanos-io/objstore"
 	"github.com/thanos-io/objstore/providers/filesystem"
 	"github.com/efficientgo/core/testutil"
@@ -132,3 +136,60 @@ func TestReaderPool_ShouldCloseIdleLazyReaders(t *testing.T) {
 	testutil.Equals(t, float64(2), promtestutil.ToFloat64(metrics.lazyReader.loadCount))
 	testutil.Equals(t, float64(2), promtestutil.ToFloat64(metrics.lazyReader.unloadCount))
 }
func TestReaderPool_MultipleReaders(t *testing.T) {
ctx := context.Background()
blkDir := t.TempDir()
bkt := objstore.NewInMemBucket()
b1, err := e2eutil.CreateBlock(ctx, blkDir, []labels.Labels{
labels.New(labels.Label{Name: "a", Value: "1"}),
labels.New(labels.Label{Name: "a", Value: "2"}),
labels.New(labels.Label{Name: "a", Value: "3"}),
labels.New(labels.Label{Name: "a", Value: "4"}),
labels.New(labels.Label{Name: "b", Value: "1"}),
}, 100, 0, 1000, labels.New(labels.Label{Name: "ext1", Value: "val1"}), 124, metadata.NoneFunc, nil)
testutil.Ok(t, err)
require.NoError(t, block.Upload(ctx, log.NewNopLogger(), bkt, filepath.Join(blkDir, b1.String()), metadata.NoneFunc))
readerPool := NewReaderPool(
log.NewNopLogger(),
true,
time.Minute,
NewReaderPoolMetrics(prometheus.NewRegistry()),
AlwaysEagerDownloadIndexHeader,
)
dlDir := t.TempDir()
m, err := metadata.ReadFromDir(filepath.Join(blkDir, b1.String()))
testutil.Ok(t, err)
startWg := &sync.WaitGroup{}
startWg.Add(1)
waitWg := &sync.WaitGroup{}
const readersCount = 10
waitWg.Add(readersCount)
for i := 0; i < readersCount; i++ {
go func() {
defer waitWg.Done()
t.Logf("waiting")
startWg.Wait()
t.Logf("starting")
br, err := readerPool.NewBinaryReader(ctx, log.NewNopLogger(), bkt, dlDir, b1, 32, m)
testutil.Ok(t, err)
t.Cleanup(func() {
testutil.Ok(t, br.Close())
})
}()
}
startWg.Done()
waitWg.Wait()
}

View File

@@ -10,7 +10,7 @@ import (
 	"path"
 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/thanos-io/objstore"

View File

@@ -12,7 +12,8 @@ import (
 	"time"
 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/thanos-io/objstore"
 	"github.com/thanos-io/thanos/pkg/testutil/custom"

View File

@@ -16,7 +16,8 @@ import (
 	"path/filepath"
 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/prometheus/prometheus/model/labels"
 	"github.com/prometheus/prometheus/model/relabel"
@@ -52,6 +53,11 @@ const (
 	TSDBVersion1 = 1
 	// ThanosVersion1 is a enumeration of Thanos section of TSDB meta supported by Thanos.
 	ThanosVersion1 = 1
+
+	// ParquetMigratedExtensionKey is the key used in block extensions to indicate
+	// that the block has been migrated to parquet format and can be safely ignored
+	// by store gateways.
+	ParquetMigratedExtensionKey = "parquet_migrated"
 )
@@ -209,6 +215,11 @@ func (m Meta) WriteToDir(logger log.Logger, dir string) error {
 		runutil.CloseWithLogOnErr(logger, f, "close meta")
 		return err
 	}
+
+	// Force the kernel to persist the file on disk to avoid data loss if the host crashes.
+	if err := f.Sync(); err != nil {
+		return err
+	}
+
 	if err := f.Close(); err != nil {
 		return err
 	}

View File

@@ -9,7 +9,8 @@ import (
 	"testing"
 	"github.com/efficientgo/core/testutil"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/prometheus/prometheus/tsdb"
 )

View File

@@ -9,7 +9,8 @@ import (
 	"github.com/go-kit/log"
 	"github.com/go-kit/log/level"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/thanos-io/objstore"

View File

@@ -12,7 +12,8 @@ import (
 	"time"
 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/prometheus/client_golang/prometheus/promauto"
 	promtest "github.com/prometheus/client_golang/prometheus/testutil"

View File

@@ -17,7 +17,7 @@ import (
 	"github.com/go-kit/log"
 	"github.com/go-kit/log/level"
 	"github.com/golang/groupcache/singleflight"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/opentracing/opentracing-go"
 	"github.com/pkg/errors"
 	"github.com/prometheus/client_golang/prometheus"

View File

@@ -16,7 +16,8 @@ import (
 	"time"
 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/prometheus/client_golang/prometheus/promauto"
 	promtest "github.com/prometheus/client_golang/prometheus/testutil"
@@ -236,25 +237,25 @@ func testGroupCompactE2e(t *testing.T, mergeFunc storage.VerticalChunkSeriesMerg
 	testutil.Assert(t, os.IsNotExist(err), "dir %s should be remove after compaction.", dir)
 	// Test label name with slash, regression: https://github.com/thanos-io/thanos/issues/1661.
-	extLabels := labels.Labels{{Name: "e1", Value: "1/weird"}}
-	extLabels2 := labels.Labels{{Name: "e1", Value: "1"}}
+	extLabels := labels.FromStrings("e1", "1/weird")
+	extLabels2 := labels.FromStrings("e1", "1")
 	metas := createAndUpload(t, bkt, []blockgenSpec{
 		{
 			numSamples: 100, mint: 500, maxt: 1000, extLset: extLabels, res: 124,
 			series: []labels.Labels{
-				{{Name: "a", Value: "1"}},
-				{{Name: "a", Value: "2"}, {Name: "b", Value: "2"}},
-				{{Name: "a", Value: "3"}},
-				{{Name: "a", Value: "4"}},
+				labels.FromStrings("a", "1"),
+				labels.FromStrings("a", "2", "b", "2"),
+				labels.FromStrings("a", "3"),
+				labels.FromStrings("a", "4"),
 			},
 		},
 		{
 			numSamples: 100, mint: 2000, maxt: 3000, extLset: extLabels, res: 124,
 			series: []labels.Labels{
-				{{Name: "a", Value: "3"}},
-				{{Name: "a", Value: "4"}},
-				{{Name: "a", Value: "5"}},
-				{{Name: "a", Value: "6"}},
+				labels.FromStrings("a", "3"),
+				labels.FromStrings("a", "4"),
+				labels.FromStrings("a", "5"),
+				labels.FromStrings("a", "6"),
 			},
 		},
 		// Mix order to make sure compactor is able to deduct min time / max time.
@@ -267,48 +268,40 @@ func testGroupCompactE2e(t *testing.T, mergeFunc storage.VerticalChunkSeriesMerg
 		// Due to TSDB compaction delay (not compacting fresh block), we need one more block to be pushed to trigger compaction.
 		{
 			numSamples: 100, mint: 3000, maxt: 4000, extLset: extLabels, res: 124,
-			series: []labels.Labels{
-				{{Name: "a", Value: "7"}},
-			},
+			series:     []labels.Labels{labels.FromStrings("a", "7")},
 		},
 		// Extra block for "distraction" for different resolution and one for different labels.
 		{
-			numSamples: 100, mint: 5000, maxt: 6000, extLset: labels.Labels{{Name: "e1", Value: "2"}}, res: 124,
-			series: []labels.Labels{
-				{{Name: "a", Value: "7"}},
-			},
+			numSamples: 100, mint: 5000, maxt: 6000, extLset: labels.FromStrings("e1", "2"), res: 124,
+			series:     []labels.Labels{labels.FromStrings("a", "7")},
 		},
 		// Extra block for "distraction" for different resolution and one for different labels.
 		{
 			numSamples: 100, mint: 4000, maxt: 5000, extLset: extLabels, res: 0,
-			series: []labels.Labels{
-				{{Name: "a", Value: "7"}},
-			},
+			series:     []labels.Labels{labels.FromStrings("a", "7")},
 		},
 		// Second group (extLabels2).
 		{
 			numSamples: 100, mint: 2000, maxt: 3000, extLset: extLabels2, res: 124,
 			series: []labels.Labels{
-				{{Name: "a", Value: "3"}},
-				{{Name: "a", Value: "4"}},
-				{{Name: "a", Value: "6"}},
+				labels.FromStrings("a", "3"),
+				labels.FromStrings("a", "4"),
+				labels.FromStrings("a", "6"),
 			},
 		},
 		{
 			numSamples: 100, mint: 0, maxt: 1000, extLset: extLabels2, res: 124,
 			series: []labels.Labels{
-				{{Name: "a", Value: "1"}},
-				{{Name: "a", Value: "2"}, {Name: "b", Value: "2"}},
-				{{Name: "a", Value: "3"}},
-				{{Name: "a", Value: "4"}},
+				labels.FromStrings("a", "1"),
+				labels.FromStrings("a", "2", "b", "2"),
+				labels.FromStrings("a", "3"),
+				labels.FromStrings("a", "4"),
 			},
 		},
 		// Due to TSDB compaction delay (not compacting fresh block), we need one more block to be pushed to trigger compaction.
 		{
 			numSamples: 100, mint: 3000, maxt: 4000, extLset: extLabels2, res: 124,
-			series: []labels.Labels{
-				{{Name: "a", Value: "7"}},
-			},
+			series:     []labels.Labels{labels.FromStrings("a", "7")},
 		},
 	})

View File

@@ -15,7 +15,8 @@ import (
 	"github.com/go-kit/log"
 	"github.com/oklog/run"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/prometheus/client_golang/prometheus/promauto"

View File

@@ -15,7 +15,8 @@ import (
 	"github.com/go-kit/log"
 	"github.com/go-kit/log/level"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/prometheus/prometheus/model/histogram"
 	"github.com/prometheus/prometheus/model/labels"

View File

@@ -2192,10 +2192,10 @@ func (b *memBlock) addSeries(s *series) {
 	b.postings = append(b.postings, sid)
 	b.series = append(b.series, s)
-	for _, l := range s.lset {
+	s.lset.Range(func(l labels.Label) {
 		b.symbols[l.Name] = struct{}{}
 		b.symbols[l.Value] = struct{}{}
-	}
+	})
 	for i, cm := range s.chunks {
 		if b.minTime == -1 || cm.MinTime < b.minTime {

View File

@@ -9,7 +9,8 @@ import (
 	"testing"
 	"github.com/efficientgo/core/testutil"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/prometheus/prometheus/model/labels"
 	"github.com/prometheus/prometheus/storage"
 	"github.com/prometheus/prometheus/tsdb/chunkenc"


@@ -10,7 +10,7 @@ import (
 	"path/filepath"
 
 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/thanos-io/objstore"


@@ -12,7 +12,7 @@ import (
 	"testing"
 
 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/prometheus/client_golang/prometheus/promauto"


@@ -10,7 +10,8 @@ import (
 	"github.com/go-kit/log"
 	"github.com/go-kit/log/level"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/thanos-io/objstore"


@@ -13,7 +13,8 @@ import (
 	"time"
 
 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/prometheus/client_golang/prometheus"
 	"github.com/prometheus/client_golang/prometheus/promauto"
 	promtest "github.com/prometheus/client_golang/prometheus/testutil"


@@ -14,7 +14,8 @@ import (
 	"time"
 
 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/pkg/errors"
 	"github.com/prometheus/common/model"
 	"github.com/prometheus/prometheus/model/labels"

pkg/compressutil/util.go (new file, 14 lines)

@@ -0,0 +1,14 @@
+// Copyright (c) The Thanos Authors.
+// Licensed under the Apache License 2.0.
+
+package compressutil
+
+import "github.com/prometheus/prometheus/util/compression"
+
+// ParseCompressionType parses the two compression-related configuration values and returns the CompressionType.
+func ParseCompressionType(compress bool, compressType compression.Type) compression.Type {
+	if compress {
+		return compressType
+	}
+	return compression.None
+}


@@ -4,7 +4,7 @@
 package extflag
 
 import (
-	"gopkg.in/alecthomas/kingpin.v2"
+	"github.com/alecthomas/kingpin/v2"
 )
 
 type FlagClause interface {


@@ -24,8 +24,9 @@ import (
 // EndpointGroupGRPCOpts creates gRPC dial options for connecting to endpoint groups.
 // For details on retry capabilities, see https://github.com/grpc/proposal/blob/master/A6-client-retries.md#retry-policy-capabilities
-func EndpointGroupGRPCOpts() []grpc.DialOption {
-	serviceConfig := `
+func EndpointGroupGRPCOpts(serviceConfig string) []grpc.DialOption {
+	if serviceConfig == "" {
+		serviceConfig = `
 {
 	"loadBalancingPolicy":"round_robin",
 	"retryPolicy": {
@@ -37,6 +38,7 @@ func EndpointGroupGRPCOpts() []grpc.DialOption {
 	]
 	}
 }`
+	}
 
 	return []grpc.DialOption{
 		grpc.WithDefaultServiceConfig(serviceConfig),


@@ -9,60 +9,14 @@ import (
 	"sort"
 	"text/template"
 
+	"github.com/alecthomas/kingpin/v2"
 	"github.com/go-kit/log"
 	"github.com/oklog/run"
 	"github.com/opentracing/opentracing-go"
 	"github.com/pkg/errors"
 	"github.com/prometheus/client_golang/prometheus"
-	"gopkg.in/alecthomas/kingpin.v2"
 )
 
-const UsageTemplate = `{{define "FormatCommand"}}\
-{{if .FlagSummary}} {{.FlagSummary}}{{end}}\
-{{range .Args}} {{if not .Required}}[{{end}}<{{.Name}}>{{if .Value|IsCumulative}}...{{end}}{{if not .Required}}]{{end}}{{end}}\
-{{end}}\
-{{define "FormatCommands"}}\
-{{range .FlattenedCommands}}\
-{{if not .Hidden}}\
-  {{.FullCommand}}{{if .Default}}*{{end}}{{template "FormatCommand" .}}
-{{.Help|Wrap 4}}
-{{end}}\
-{{end}}\
-{{end}}\
-{{define "FormatUsage"}}\
-{{template "FormatCommand" .}}{{if .Commands}} <command> [<args> ...]{{end}}
-{{if .Help}}
-{{.Help|Wrap 0}}\
-{{end}}\
-{{end}}\
-{{if .Context.SelectedCommand}}\
-usage: {{.App.Name}} {{.Context.SelectedCommand}}{{template "FormatUsage" .Context.SelectedCommand}}
-{{else}}\
-usage: {{.App.Name}}{{template "FormatUsage" .App}}
-{{end}}\
-{{if .Context.Flags}}\
-Flags:
-{{alphabeticalSort .Context.Flags|FlagsToTwoColumns|FormatTwoColumns}}
-{{end}}\
-{{if .Context.Args}}\
-Args:
-{{.Context.Args|ArgsToTwoColumns|FormatTwoColumns}}
-{{end}}\
-{{if .Context.SelectedCommand}}\
-{{if len .Context.SelectedCommand.Commands}}\
-Subcommands:
-{{template "FormatCommands" .Context.SelectedCommand}}
-{{end}}\
-{{else if .App.Commands}}\
-Commands:
-{{template "FormatCommands" .App}}
-{{end}}\
-`
-
 type FlagClause interface {
 	Flag(name, help string) *kingpin.FlagClause
 }
@@ -87,7 +41,6 @@ type App struct {
 // NewApp returns new App.
 func NewApp(app *kingpin.Application) *App {
 	app.HelpFlag.Short('h')
-	app.UsageTemplate(UsageTemplate)
 	app.UsageFuncs(template.FuncMap{
 		"alphabeticalSort": func(data []*kingpin.FlagModel) []*kingpin.FlagModel {
 			sort.Slice(data, func(i, j int) bool { return data[i].Name < data[j].Name })


@@ -7,10 +7,10 @@ import (
 	"fmt"
 	"strings"
 
+	"github.com/alecthomas/kingpin/v2"
 	extflag "github.com/efficientgo/tools/extkingpin"
 	"github.com/pkg/errors"
 	"github.com/prometheus/common/model"
-	"gopkg.in/alecthomas/kingpin.v2"
 )
 
 func ModelDuration(flags *kingpin.FlagClause) *model.Duration {


@@ -483,14 +483,18 @@ func sortDtoMessages(msgs []proto.Message) {
 		m1 := msgs[i].(*dto.Metric)
 		m2 := msgs[j].(*dto.Metric)
 
-		lbls1 := labels.Labels{}
+		builder := labels.NewBuilder(labels.EmptyLabels())
 		for _, p := range m1.GetLabel() {
-			lbls1 = append(lbls1, labels.Label{Name: *p.Name, Value: *p.Value})
+			builder.Set(p.GetName(), p.GetValue())
 		}
-		lbls2 := labels.Labels{}
+		lbls1 := builder.Labels()
+		builder.Reset(labels.EmptyLabels())
 		for _, p := range m2.GetLabel() {
-			lbls2 = append(lbls2, labels.Label{Name: *p.Name, Value: *p.Value})
+			builder.Set(p.GetName(), p.GetValue())
 		}
+		lbls2 := builder.Labels()
 
 		return labels.Compare(lbls1, lbls2) < 0
 	})


@@ -11,16 +11,16 @@ import (
 	"github.com/prometheus/prometheus/model/labels"
 	"github.com/prometheus/prometheus/promql/parser"
 
-	"github.com/thanos-io/promql-engine/execution/function"
+	"github.com/thanos-io/promql-engine/execution/parse"
 )
 
 // ParseExpr parses the input PromQL expression and returns the parsed representation.
 func ParseExpr(input string) (parser.Expr, error) {
-	allFuncs := make(map[string]*parser.Function, len(function.XFunctions)+len(parser.Functions))
+	allFuncs := make(map[string]*parser.Function, len(parse.XFunctions)+len(parser.Functions))
 	for k, v := range parser.Functions {
 		allFuncs[k] = v
 	}
-	for k, v := range function.XFunctions {
+	for k, v := range parse.XFunctions {
 		allFuncs[k] = v
 	}
 	p := parser.NewParser(input, parser.WithFunctions(allFuncs))


@@ -9,7 +9,8 @@ import (
 	"math/rand"
 	"time"
 
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/opentracing/opentracing-go"
 	"go.opentelemetry.io/otel/trace"


@@ -6,9 +6,9 @@ package model
 import (
 	"time"
 
+	"github.com/alecthomas/kingpin/v2"
 	"github.com/prometheus/common/model"
 	"github.com/prometheus/prometheus/model/timestamp"
-	"gopkg.in/alecthomas/kingpin.v2"
 )
 
 // TimeOrDurationValue is a custom kingping parser for time in RFC3339


@@ -7,8 +7,8 @@ import (
 	"testing"
 	"time"
 
+	"github.com/alecthomas/kingpin/v2"
 	"github.com/prometheus/prometheus/model/timestamp"
-	"gopkg.in/alecthomas/kingpin.v2"
 
 	"github.com/efficientgo/core/testutil"
 	"github.com/thanos-io/thanos/pkg/model"


@@ -14,7 +14,8 @@ import (
 	"time"
 
 	"github.com/go-kit/log"
-	"github.com/oklog/ulid"
+	"github.com/oklog/ulid/v2"
 	"github.com/prometheus/common/model"
 	"github.com/prometheus/prometheus/config"
 	"github.com/prometheus/prometheus/model/labels"


@@ -466,9 +466,12 @@ func TestEndpointSetUpdate_EndpointComingOnline(t *testing.T) {
 func TestEndpointSetUpdate_StrictEndpointMetadata(t *testing.T) {
 	t.Parallel()
 
-	info := sidecarInfo
-	info.Store.MinTime = 111
-	info.Store.MaxTime = 222
+	infoCopy := *sidecarInfo
+	infoCopy.Store = &infopb.StoreInfo{
+		MinTime: 111,
+		MaxTime: 222,
+	}
+	info := &infoCopy
 	endpoints, err := startTestEndpoints([]testEndpointMeta{
 		{
 			err: fmt.Errorf("endpoint unavailable"),
@@ -700,10 +703,8 @@ func TestEndpointSetUpdate_AvailabilityScenarios(t *testing.T) {
 		lset := e.LabelSets()
 		testutil.Equals(t, 2, len(lset))
-		testutil.Equals(t, "addr", lset[0][0].Name)
-		testutil.Equals(t, addr, lset[0][0].Value)
-		testutil.Equals(t, "a", lset[1][0].Name)
-		testutil.Equals(t, "b", lset[1][0].Value)
+		testutil.Equals(t, addr, lset[0].Get("addr"))
+		testutil.Equals(t, "b", lset[1].Get("a"))
 
 		assertRegisteredAPIs(t, endpoints.exposedAPIs[addr], e)
 	}
@@ -735,10 +736,8 @@ func TestEndpointSetUpdate_AvailabilityScenarios(t *testing.T) {
 		lset := st.LabelSets()
 		testutil.Equals(t, 2, len(lset))
-		testutil.Equals(t, "addr", lset[0][0].Name)
-		testutil.Equals(t, addr, lset[0][0].Value)
-		testutil.Equals(t, "a", lset[1][0].Name)
-		testutil.Equals(t, "b", lset[1][0].Value)
+		testutil.Equals(t, addr, lset[0].Get("addr"))
+		testutil.Equals(t, "b", lset[1].Get("a"))
 
 		testutil.Equals(t, expected, endpointSet.endpointsMetric.storeNodes)
 
 		// New big batch of endpoints.


@@ -18,7 +18,6 @@ import (
 	"github.com/efficientgo/core/testutil"
 	"github.com/go-kit/log"
-	"github.com/google/go-cmp/cmp"
 	"github.com/pkg/errors"
 	"github.com/prometheus/common/model"
 	"github.com/prometheus/prometheus/model/histogram"
@@ -362,7 +361,7 @@ func TestQuerier_Select_AfterPromQL(t *testing.T) {
 			// Regression test 1 against https://github.com/thanos-io/thanos/issues/2890.
 			name: "when switching replicas don't miss samples when set with a big enough lookback delta",
 			storeAPI: newProxyStore(func() storepb.StoreServer {
-				s, err := store.NewLocalStoreFromJSONMmappableFile(logger, component.Debug, nil, "./testdata/issue2890-seriesresponses.json", store.ScanGRPCCurlProtoStreamMessages)
+				s, err := store.NewLocalStoreFromJSONMmappableFile(logger, component.Debug, labels.EmptyLabels(), "./testdata/issue2890-seriesresponses.json", store.ScanGRPCCurlProtoStreamMessages)
 				testutil.Ok(t, err)
 				return s
 			}()),
@@ -488,7 +487,7 @@ func TestQuerier_Select(t *testing.T) {
 				},
 			},
 			expectedAfterDedup: []series{{
-				lset: nil,
+				lset: labels.EmptyLabels(),
 				// We don't expect correctness here, it's just random non-replica data.
 				samples: []sample{{1, 1}, {2, 2}, {3, 3}, {5, 5}, {6, 6}, {7, 7}},
 			}},
@@ -497,7 +496,7 @@ func TestQuerier_Select(t *testing.T) {
 		{
 			name: "realistic data with stale marker",
 			storeEndpoints: []storepb.StoreServer{func() storepb.StoreServer {
-				s, err := store.NewLocalStoreFromJSONMmappableFile(logger, component.Debug, nil, "./testdata/issue2401-seriesresponses.json", store.ScanGRPCCurlProtoStreamMessages)
+				s, err := store.NewLocalStoreFromJSONMmappableFile(logger, component.Debug, labels.EmptyLabels(), "./testdata/issue2401-seriesresponses.json", store.ScanGRPCCurlProtoStreamMessages)
 				testutil.Ok(t, err)
 				return s
 			}()},
@@ -541,7 +540,7 @@ func TestQuerier_Select(t *testing.T) {
 		{
 			name: "realistic data with stale marker with 100000 step",
 			storeEndpoints: []storepb.StoreServer{func() storepb.StoreServer {
-				s, err := store.NewLocalStoreFromJSONMmappableFile(logger, component.Debug, nil, "./testdata/issue2401-seriesresponses.json", store.ScanGRPCCurlProtoStreamMessages)
+				s, err := store.NewLocalStoreFromJSONMmappableFile(logger, component.Debug, labels.EmptyLabels(), "./testdata/issue2401-seriesresponses.json", store.ScanGRPCCurlProtoStreamMessages)
 				testutil.Ok(t, err)
 				return s
 			}()},
@@ -592,7 +591,7 @@ func TestQuerier_Select(t *testing.T) {
 			// Thanks to @Superq and GitLab for real data reproducing this.
 			name: "realistic data with stale marker with hints rate function",
 			storeEndpoints: []storepb.StoreServer{func() storepb.StoreServer {
-				s, err := store.NewLocalStoreFromJSONMmappableFile(logger, component.Debug, nil, "./testdata/issue2401-seriesresponses.json", store.ScanGRPCCurlProtoStreamMessages)
+				s, err := store.NewLocalStoreFromJSONMmappableFile(logger, component.Debug, labels.EmptyLabels(), "./testdata/issue2401-seriesresponses.json", store.ScanGRPCCurlProtoStreamMessages)
 				testutil.Ok(t, err)
 				return s
 			}()},
@@ -860,19 +859,12 @@ func newProxyStore(storeAPIs ...storepb.StoreServer) *store.ProxyStore {
 		nil,
 		func() []store.Client { return cls },
 		component.Query,
-		nil,
+		labels.EmptyLabels(),
 		0,
 		store.EagerRetrieval,
 	)
 }
 
-var emptyLabelsSameAsNotAllocatedLabels = cmp.Transformer("", func(l labels.Labels) labels.Labels {
-	if len(l) == 0 {
-		return labels.Labels(nil)
-	}
-	return l
-})
-
 func testSelectResponse(t *testing.T, expected []series, res storage.SeriesSet) {
 	var series []storage.Series
 	// Use it as PromQL would do, first gather all series.
@@ -889,7 +881,7 @@ func testSelectResponse(t *testing.T, expected []series, res storage.SeriesSet)
 	}())
 
 	for i, s := range series {
-		testutil.WithGoCmp(emptyLabelsSameAsNotAllocatedLabels).Equals(t, expected[i].lset, s.Labels())
+		testutil.Assert(t, labels.Equal(expected[i].Labels(), s.Labels()))
 		samples := expandSeries(t, s.Iterator(nil))
 		expectedCpy := make([]sample, 0, len(expected[i].samples))
 		for _, s := range expected[i].samples {
@@ -914,15 +906,10 @@ func jsonToSeries(t *testing.T, filename string) []series {
 	var ss []series
 	for _, ser := range data.Data.Results {
-		var lbls labels.Labels
+		builder := labels.NewBuilder(labels.EmptyLabels())
 		for n, v := range ser.Metric {
-			lbls = append(lbls, labels.Label{
-				Name:  string(n),
-				Value: string(v),
-			})
+			builder.Set(string(n), string(v))
 		}
-		// Label names need to be sorted.
-		sort.Sort(lbls)
 
 		var smpls []sample
 		for _, smp := range ser.Values {
@@ -933,7 +920,7 @@ func jsonToSeries(t *testing.T, filename string) []series {
 		}
 
 		ss = append(ss, series{
-			lset:    lbls,
+			lset:    builder.Labels(),
 			samples: smpls,
 		})
 	}
@@ -1073,7 +1060,7 @@ func TestQuerierWithDedupUnderstoodByPromQL_Rate(t *testing.T) {
 	logger := log.NewLogfmtLogger(os.Stderr)
 
-	s, err := store.NewLocalStoreFromJSONMmappableFile(logger, component.Debug, nil, "./testdata/issue2401-seriesresponses.json", store.ScanGRPCCurlProtoStreamMessages)
+	s, err := store.NewLocalStoreFromJSONMmappableFile(logger, component.Debug, labels.EmptyLabels(), "./testdata/issue2401-seriesresponses.json", store.ScanGRPCCurlProtoStreamMessages)
 	testutil.Ok(t, err)
 
 	t.Run("dedup=false", func(t *testing.T) {
@@ -1260,9 +1247,9 @@ func (s *testStoreServer) Series(r *storepb.SeriesRequest, srv storepb.Store_Ser
 func storeSeriesResponse(t testing.TB, lset labels.Labels, smplChunks ...[]sample) *storepb.SeriesResponse {
 	var s storepb.Series
 
-	for _, l := range lset {
+	lset.Range(func(l labels.Label) {
 		s.Labels = append(s.Labels, labelpb.ZLabel{Name: l.Name, Value: l.Value})
-	}
+	})
 
 	for _, smpls := range smplChunks {
 		c := chunkenc.NewXORChunk()


@@ -75,7 +75,7 @@ func benchQuerySelect(t testutil.TB, totalSamples, totalSeries int, dedup bool)
 			if !dedup || j == 0 {
 				lset := labelpb.ZLabelsToPromLabels(created[i].Labels).Copy()
 				if dedup {
-					lset = lset[1:]
+					lset = lset.MatchLabels(false, "a_replica")
 				}
 				expectedSeries = append(expectedSeries, lset)
 			}


@@ -13,6 +13,7 @@ import (
 	"github.com/efficientgo/core/testutil"
 	"github.com/go-kit/log"
+	"github.com/prometheus/prometheus/model/labels"
 	"github.com/prometheus/prometheus/promql/parser"
 	"github.com/prometheus/prometheus/storage"
@@ -71,7 +72,7 @@ func TestQuerier_Proxy(t *testing.T) {
 		logger,
 		nil,
 		store.NewProxyStore(logger, nil, func() []store.Client { return sc.get() },
-			component.Debug, nil, 5*time.Minute, store.EagerRetrieval, store.WithMatcherCache(cache)),
+			component.Debug, labels.EmptyLabels(), 5*time.Minute, store.EagerRetrieval, store.WithMatcherCache(cache)),
 		1000000,
 		5*time.Minute,
 		dedup.AlgorithmPenalty,
@@ -85,7 +86,7 @@ func TestQuerier_Proxy(t *testing.T) {
 		// TODO(bwplotka): Parse external labels.
 		sc.append(&storetestutil.TestClient{
 			Name:        fmt.Sprintf("store number %v", i),
-			StoreClient: storepb.ServerAsClient(selectedStore(store.NewTSDBStore(logger, st.storage.DB, component.Debug, nil), m, st.mint, st.maxt)),
+			StoreClient: storepb.ServerAsClient(selectedStore(store.NewTSDBStore(logger, st.storage.DB, component.Debug, labels.EmptyLabels()), m, st.mint, st.maxt)),
 			MinTime: st.mint,
 			MaxTime: st.maxt,
 		})


@@ -183,6 +183,13 @@ func (r *remoteEngine) MaxT() int64 {
 	return r.maxt
 }
 
+func (r *remoteEngine) PartitionLabelSets() []labels.Labels {
+	r.labelSetsOnce.Do(func() {
+		r.labelSets = r.adjustedInfos().LabelSets()
+	})
+	return r.labelSets
+}
+
 func (r *remoteEngine) LabelSets() []labels.Labels {
 	r.labelSetsOnce.Do(func() {
 		r.labelSets = r.adjustedInfos().LabelSets()


@@ -39,10 +39,11 @@ func TestRemoteEngine_Warnings(t *testing.T) {
 	qryExpr, err := extpromql.ParseExpr("up")
 	testutil.Ok(t, err)
 
-	plan := logicalplan.NewFromAST(qryExpr, &query.Options{
+	plan, err := logicalplan.NewFromAST(qryExpr, &query.Options{
 		Start: time.Now(),
 		End:   time.Now().Add(2 * time.Hour),
 	}, logicalplan.PlanOptions{})
+	testutil.Ok(t, err)
 
 	t.Run("instant_query", func(t *testing.T) {
 		qry, err := engine.NewInstantQuery(context.Background(), nil, plan.Root(), start)
@@ -77,10 +78,11 @@ func TestRemoteEngine_PartialResponse(t *testing.T) {
 	qryExpr, err := extpromql.ParseExpr("up")
 	testutil.Ok(t, err)
 
-	plan := logicalplan.NewFromAST(qryExpr, &query.Options{
+	plan, err := logicalplan.NewFromAST(qryExpr, &query.Options{
 		Start: time.Now(),
 		End:   time.Now().Add(2 * time.Hour),
 	}, logicalplan.PlanOptions{})
+	testutil.Ok(t, err)
 
 	t.Run("instant_query", func(t *testing.T) {
 		qry, err := engine.NewInstantQuery(context.Background(), nil, plan.Root(), start)


@@ -358,7 +358,7 @@ func ParseEval(lines []string, i int) (int, *evalCmd, error) {
 			break
 		}
 		if f, err := parseNumber(defLine); err == nil {
-			cmd.expect(0, nil, parser.SequenceValue{Value: f})
+			cmd.expect(0, parser.SequenceValue{Value: f})
 			break
 		}
 		metric, vals, err := parser.ParseSeriesDesc(defLine)
@@ -373,7 +373,7 @@ func ParseEval(lines []string, i int) (int, *evalCmd, error) {
 		if len(vals) > 1 {
 			return i, nil, raise(i, "expecting multiple values in instant evaluation not allowed")
 		}
-		cmd.expect(j, metric, vals...)
+		cmd.expectMetric(j, metric, vals...)
 	}
 	return i, cmd, nil
 }
@@ -480,13 +480,15 @@ func (ev *evalCmd) String() string {
 	return "eval"
 }
 
-// expect adds a new metric with a sequence of values to the set of expected
+// expect adds a sequence of values to the set of expected
 // results for the query.
-func (ev *evalCmd) expect(pos int, m labels.Labels, vals ...parser.SequenceValue) {
-	if m == nil {
-		ev.expected[0] = entry{pos: pos, vals: vals}
-		return
-	}
+func (ev *evalCmd) expect(pos int, vals ...parser.SequenceValue) {
+	ev.expected[0] = entry{pos: pos, vals: vals}
+}
+
+// expectMetric adds a new metric with a sequence of values to the set of expected
+// results for the query.
+func (ev *evalCmd) expectMetric(pos int, m labels.Labels, vals ...parser.SequenceValue) {
 	h := m.Hash()
 	ev.metrics[h] = m
 	ev.expected[h] = entry{pos: pos, vals: vals}


@@ -265,9 +265,6 @@ eval instant at 0m label_replace(testmetric, "dst", "", "dst", ".*")
 # label_replace fails when the regex is invalid.
 eval_fail instant at 0m label_replace(testmetric, "dst", "value-$1", "src", "(.*")
 
-# label_replace fails when the destination label name is not a valid Prometheus label name.
-eval_fail instant at 0m label_replace(testmetric, "invalid-label-name", "", "src", "(.*)")
-
 # label_replace fails when there would be duplicated identical output label sets.
 eval_fail instant at 0m label_replace(testmetric, "src", "", "", "")


@@ -213,6 +213,7 @@ type Config struct {
 	DefaultTenant    string
 	TenantCertField  string
 	EnableXFunctions bool
+	EnableFeatures   []string
 }
 
 // QueryRangeConfig holds the config for query range tripperware.


@@ -8,7 +8,6 @@ import (
 	"context"
 	"encoding/json"
 	io "io"
-	"math"
 	"net/http"
 	"net/url"
 	"sort"
@@ -19,6 +18,7 @@ import (
 	"github.com/opentracing/opentracing-go"
 	otlog "github.com/opentracing/opentracing-go/log"
 	"github.com/prometheus/prometheus/model/timestamp"
+	v1 "github.com/prometheus/prometheus/web/api/v1"
 	"github.com/weaveworks/common/httpgrpc"
 
 	"github.com/thanos-io/thanos/internal/cortex/querier/queryrange"
@@ -28,11 +28,6 @@ import (
 	"github.com/thanos-io/thanos/pkg/store/labelpb"
 )
 
-var (
-	infMinTime = time.Unix(math.MinInt64/1000+62135596801, 0)
-	infMaxTime = time.Unix(math.MaxInt64/1000-62135596801, 999999999)
-)
-
 // labelsCodec is used to encode/decode Thanos labels and series requests and responses.
 type labelsCodec struct {
 	partialResponse bool
@@ -400,8 +395,8 @@ func parseMetadataTimeRange(r *http.Request, defaultMetadataTimeRange time.Durat
 	// If start and end time not specified as query parameter, we get the range from the beginning of time by default.
 	var defaultStartTime, defaultEndTime time.Time
 	if defaultMetadataTimeRange == 0 {
-		defaultStartTime = v1.MinTime
-		defaultEndTime = v1.MaxTime
+		defaultStartTime = v1.MinTime
+		defaultEndTime = v1.MaxTime
 	} else {
 		now := time.Now()
 		defaultStartTime = now.Add(-defaultMetadataTimeRange)
