Support Explicit Code Default Values in flagd Configuration
This ADR proposes adding support for explicitly configuring flagd to use code-defined default values by allowing null as a valid default variant. This change addresses the current limitation where users cannot differentiate between "use the code's default" and "use this configured default" without resorting to workarounds like misconfigured rulesets.
Background
Currently, flagd requires a default variant to be specified in flag configurations. This creates a fundamental mismatch with the OpenFeature specification and common feature flag usage patterns where code-defined defaults serve as the ultimate fallback.
The current behavior leads to confusion and operational challenges:
- Two Sources of Truth: Applications have default values defined in code (as per OpenFeature best practices), while flagd configurations require their own default variants. This dual-default pattern violates the principle of single source of truth.
- State Transition Issues: When transitioning a flag from DISABLED to ENABLED state, the behavior changes unexpectedly:
  - DISABLED state: Flag evaluation falls through to code defaults
  - ENABLED state: Flag evaluation uses the configured default variant
- Workarounds: Users resort to misconfiguring rulesets (e.g., returning invalid variants) to force fallback to code defaults, which generates confusing error states and complicates debugging.
- OpenFeature Alignment: The OpenFeature specification emphasizes that code defaults should be the ultimate fallback, but flagd's current design doesn't provide a clean way to express this intent.
Related discussions and context can be found in the OpenFeature specification and flagd flag definitions reference.
Requirements
- Explicit Code Default Support: Users must be able to explicitly configure a flag to use the code-defined default value as its resolution
- Backward Compatibility: Existing flag configurations must continue to work without modification
- Clear Semantics: The configuration must clearly indicate when code defaults are being used versus configured defaults
- Appropriate Reason Codes: Resolution details must include appropriate reason codes when code defaults are used (e.g., DEFAULT or a new specific reason)
- Schema Validation: JSON schema must support and validate the new configuration options
- Provider Compatibility: All OpenFeature providers must handle the new behavior correctly
- Testbed Coverage: flagd testbed must include test cases for the new functionality
Considered Options
- Option 1: Allow null as Default Variant - Modify the schema to accept null as a valid value for defaultVariant, signaling "use code default"
- Option 2: Make Default Variant Optional - Remove the requirement for defaultVariant entirely, with absence meaning "use code default"
- Option 3: Special Variant Value - Define a reserved variant name (e.g., "__CODE_DEFAULT__") that signals code default usage
- Option 4: New Configuration Property - Add a new property like useCodeDefault: true alongside or instead of defaultVariant
- Option 5: Status Quo with Documentation - Keep current behavior but improve documentation about workarounds
Proposal
We propose implementing Option 1: Allow null as Default Variant, potentially combined with Option 2: Make Default Variant Optional for maximum flexibility.
The implementation leverages field presence in evaluation responses across all protocols (in-process, RPC, and OFREP). When a flag configuration has defaultVariant: null, the evaluation response omits the value field entirely, which serves as a programmatic signal to the client to use its code-defined default value.
This approach offers several key advantages:
- No Protocol Changes: RPC and OFREP protocols remain unchanged
- Clear Semantics: Omitted value field = "use your code default"
- Backward Compatible: Existing clients and servers continue to work
- Universal Pattern: Works consistently across all evaluation modes
The absence of a value field provides an unambiguous signal that distinguishes between "the server evaluated to null/false/empty" (value field present) and "the server delegates to your code default" (value field absent).
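The distinction between a present-but-falsy value and an absent value can be illustrated with a minimal sketch. It assumes the evaluation response has already been decoded into a dictionary; the function name resolve_value is hypothetical and not part of any flagd provider API.

```python
from typing import Any, Mapping


def resolve_value(response: Mapping[str, Any], code_default: Any) -> Any:
    """Return the server's value if the field is present, else the code default.

    Hypothetical helper: the check is on key presence, not truthiness,
    so None, False, 0 and "" remain valid server-evaluated values.
    """
    if "value" in response:
        return response["value"]
    return code_default


# Value present, even though falsy: the server's evaluation is used.
served = resolve_value({"value": False, "variant": "off"}, code_default=True)

# Value absent: the server delegates to the code default.
delegated = resolve_value({"metadata": {}}, code_default=True)
```

Checking for the key rather than testing the value's truthiness is what keeps "evaluated to false" separate from "delegated to code".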
Implementation Details
- Schema Changes:

  {
    "defaultVariant": {
      "oneOf": [
        { "type": "string" },
        { "type": "null" }
      ],
      "description": "Default variant to use. Set to null to use code-defined default."
    }
  }

- Evaluation Behavior:
  - When a flag has defaultVariant: null and targeting returns no match:
    - The server responds with reason "ERROR" and error code "FLAG_NOT_FOUND"
    - The client detects this reason and the absent value field, and uses its code-defined default
    - The same pattern works across all evaluation modes
- Provider Implementation:
  - No changes to existing providers
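The evaluation behavior described above can be sketched as follows. This is an illustration under stated assumptions, not flagd's actual implementation: the evaluate function and the targeting callback are hypothetical stand-ins (flagd evaluates JSONLogic rules), and state handling and variant validation are left out.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical stand-in for targeting-rule evaluation:
# returns a variant name, or None when no rule matches.
Targeting = Callable[[Dict[str, Any]], Optional[str]]


def evaluate(flag: Dict[str, Any], context: Dict[str, Any],
             targeting: Targeting) -> Dict[str, Any]:
    """Resolve a flag: targeting first, then defaultVariant, then delegation."""
    variant = targeting(context)
    reason = "TARGETING_MATCH"
    if variant is None:
        # .get() yields None for both a null and an absent defaultVariant.
        variant = flag.get("defaultVariant")
        reason = "DEFAULT"
    if variant is None:
        # No value or variant field: the client falls back to its code default.
        return {"reason": "ERROR", "errorCode": "FLAG_NOT_FOUND", "metadata": {}}
    return {
        "value": flag["variants"][variant],
        "variant": variant,
        "reason": reason,
        "metadata": {},
    }


def beta_only(context: Dict[str, Any]) -> Optional[str]:
    """Example rule: beta users get the 'on' variant, everyone else no match."""
    return "on" if context.get("user-type") == "beta" else None


flag = {"state": "ENABLED", "defaultVariant": None,
        "variants": {"on": True, "off": False}}
matched = evaluate(flag, {"user-type": "beta"}, beta_only)
delegated = evaluate(flag, {"user-type": "internal"}, beta_only)
```

Here matched carries a value and variant, while delegated carries only the reason, error code, and metadata.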
Design Rationale
Using "ERROR" reason: We intentionally reuse the existing "ERROR" reason code rather than introducing a new one (like "CODE_DEFAULT"). This retains the current behavior of a disabled flag and allows for progressive enablement of a flag without unexpected variations in flag evaluation behavior.
Advantages of this approach:
- The "ERROR" reason is already used for cases where the flag is not found or misconfigured, so it aligns with the intent of using code defaults.
- This approach avoids introducing new reason codes that would require additional handling in providers and clients.
API changes
Flag Configuration:
flags:
  my-feature:
    state: ENABLED
    defaultVariant: null # Explicitly use code default
    variants:
      on: true
      off: false
    targeting:
      if:
        - "===":
            - var: user-type
            - "beta"
        - on
OFREP Response when code default is indicated:
Single flag evaluation response
A single flag evaluation returns a 404 status code.
{
  "key": "my-feature",
  "errorCode": "FLAG_NOT_FOUND",
  // Optional error details
  "errorDetails": "Targeting not matched, using code default",
  "metadata": {}
}
Bulk flag evaluation response
{
  "flags": [
    // Flag is omitted from bulk response
  ]
}
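In the bulk case, the signal is the flag's absence from the flags array. A minimal sketch of the client-side lookup, assuming each entry carries key and value fields; the function name value_from_bulk is hypothetical:

```python
from typing import Any, Dict


def value_from_bulk(bulk: Dict[str, Any], key: str, code_default: Any) -> Any:
    """Look up a flag in a decoded bulk response; fall back when it is omitted."""
    for entry in bulk.get("flags", []):
        if entry.get("key") == key and "value" in entry:
            return entry["value"]
    # The flag was omitted from the response: use the code-defined default.
    return code_default


bulk = {"flags": [{"key": "other-flag", "value": "blue",
                   "variant": "b", "reason": "STATIC"}]}
present = value_from_bulk(bulk, "other-flag", "red")
omitted = value_from_bulk(bulk, "my-feature", False)
```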
flagd RPC Response (ResolveBooleanResponse):
{
  "reason": "ERROR",
  "errorCode": "FLAG_NOT_FOUND",
  "metadata": {}
}
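A provider could turn such a response into resolution details roughly as sketched below. The BoolDetails type and details_from_rpc function are hypothetical and only illustrate the mapping; real providers use the resolution-details types of their OpenFeature SDK.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class BoolDetails:
    """Hypothetical resolution details for a boolean flag."""
    value: bool
    reason: str
    variant: Optional[str] = None
    error_code: Optional[str] = None


def details_from_rpc(response: Dict[str, Any], code_default: bool) -> BoolDetails:
    """Map a decoded response to details, delegating when no value is present."""
    if "value" in response:
        return BoolDetails(value=response["value"],
                           reason=response.get("reason", "UNKNOWN"),
                           variant=response.get("variant"))
    # No value field: report the code default and leave the variant unset.
    return BoolDetails(value=code_default,
                       reason=response.get("reason", "ERROR"),
                       error_code=response.get("errorCode"))


details = details_from_rpc(
    {"reason": "ERROR", "errorCode": "FLAG_NOT_FOUND", "metadata": {}},
    code_default=True,
)
```

The resulting details hold the code default as the value and no variant, so telemetry built on them would record that no configured variant was served.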
Consequences
- Good, because it eliminates the confusion between code and configuration defaults
- Good, because it provides explicit control over default behavior without workarounds
- Good, because it aligns flagd more closely with OpenFeature specification principles
- Good, because it supports gradual flag rollout patterns more naturally
- Good, because it provides the ability to delegate to whatever is defined in code
- Good, because it requires no changes to existing RPC or protocol signatures
- Good, because it uses established patterns (field presence) for clear semantics
- Good, because it maintains full backward compatibility
- Bad, because it requires updates across multiple components (flagd, providers, testbed)
- Bad, because it introduces a new concept that users need to understand
- Neutral, because existing configurations continue to work unchanged
Implementation Plan
- Update flagd-schemas with new JSON schema supporting null default variants
- Update flagd-testbed with comprehensive test cases for all evaluation modes
- Implement core logic in flagd to handle null defaults and omit value/variant fields
- Update OpenFeature providers with the latest schema and test harness to ensure they handle the new behavior correctly
- Documentation updates, migration guides, and playground examples to demonstrate the new configuration options
Testing Considerations
To ensure correct implementation across all components:
- Provider Tests: Each component (flagd, providers) must have unit tests verifying the handling of null as a default variant
- Integration Tests: End-to-end tests across different language combinations (e.g., Go flagd with Java provider)
- OFREP Tests: Verify that JSON responses correctly omit flags with a null default variant
- Backward Compatibility Tests: Ensure old providers handle new responses gracefully
- Consistency Tests: Verify identical behavior across in-process, RPC, and OFREP modes
Open questions
- How should providers handle responses with missing value fields in strongly-typed languages?
  - We'll handle this the same way as optional fields, using language-specific patterns (e.g., pointers in Go, hasValue() in Java).
- Should we support both null and absent defaultVariant fields, or choose one approach?
  - Yes, we'll support both null and absent fields to maximize flexibility. An absent defaultVariant will be the equivalent of null.
- What migration path should we recommend for users currently using workarounds?
  - Update the flag configurations to use defaultVariant: null and remove any misconfigured rulesets that force code defaults.
- Should this feature be gated behind a configuration flag during initial rollout?
  - We'll avoid public-facing documentation until the feature is fully implemented and tested.
- How do we ensure consistent behavior across all provider implementations?
  - Gherkin tests will be added to the flagd testbed to ensure all providers handle the new behavior consistently.
- Should providers validate that the reason is "DEFAULT" when value is omitted, or accept any omitted value as delegation?
  - Providers should accept any omitted value as delegation.
- How do we handle edge cases where network protocols might strip empty fields?
  - This would behave as expected, because the absence of fields is the intended signal.
- When the client uses its code default after receiving a delegation response, what variant should be reported in telemetry/analytics?
  - The variant will be omitted, indicating that the code default was used.
- Should we add explicit proto comments documenting the field omission behavior?
  - Leave this to the implementers, but it would be beneficial to add comments in the proto files to clarify this behavior for future maintainers.
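The answer on null versus absent defaultVariant fields can be illustrated with a small sketch that treats both forms identically when reading a flag definition. The function name delegates_to_code_default is hypothetical, and the check that a configured default names an existing variant is an assumption about validation, not something this ADR specifies.

```python
import json
from typing import Any, Dict


def delegates_to_code_default(flag: Dict[str, Any]) -> bool:
    """Return True when the flag has no configured default (null or absent)."""
    # dict.get returns None for a missing key and for an explicit JSON null.
    default = flag.get("defaultVariant")
    if default is None:
        return True
    if not isinstance(default, str):
        raise ValueError("defaultVariant must be a string or null")
    # Assumed validation: a configured default must name a defined variant.
    if default not in flag.get("variants", {}):
        raise ValueError(f"defaultVariant {default!r} is not a defined variant")
    return False


explicit_null = json.loads(
    '{"defaultVariant": null, "variants": {"on": true, "off": false}}')
absent = json.loads('{"variants": {"on": true, "off": false}}')
configured = json.loads(
    '{"defaultVariant": "off", "variants": {"on": true, "off": false}}')

results = [delegates_to_code_default(f)
           for f in (explicit_null, absent, configured)]
```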