Restrict LIST and SCAN to only accessible paths
Summary
Add a new keyword, list_scan_response_keys_filter_path
, to ACL policies to support filtering well-formed list
and scan
responses to only subpaths accessible by the token.
Problem Statement
Administrators grant broad permissions via the list
endpoint. Notably, if a user only has read
or list
access to select subkeys, but is granted list
or scan
on a parent, all results are visible to the user even if they are not otherwise accessible. Complicating this is that certain paths are unauthenticated (such as PKI's certs/
listing, which is itself authenticated but any given entry under cert/:serial
is not), or similarly, that a list handler may not necessarily be a prefix of the single entry read handler (again, in the case of PKI's certificate or issuers lists).
This makes automatic, backend-agnostic filtering hard to achieve without some guidance from the plugin or policy author.
We opt to do the latter and push this mapping requirement onto the policy author.
This also gives us an indication at the system level that filtering should be applied, as it is a potentially expensive operation and different callers may have differing requirements (e.g., in KVv2, some users may wish to filter based on metadata/{{ key }}
access whereas others may wish to filter on data/{{ key }}
access).
If necessary, this parameter could be made to accept a list of paths. However, semantics would then also have to be defined (OR
versus AND
) to decide how many need to match to show the entry in the result.
User-facing Description
Policy authors may make a breaking (from a consuming application's PoV) change by introducing a list_scan_response_keys_filter_path
argument inside a given path
grant in an ACL policy. This modifies the specific list
response indicated by removing any paths whose corresponding item path (generated by evaluating {{ key }}
in list_scan_response_keys_filter_path
with each item in the response's "keys"
data field) which the token does not have access to. This is relatively expensive, requiring a policy check for every response key, so should likely be combined with required_parameters=limit
on list operations to prevent having to evaluate thousands or potentially millions (in the case of PKI certificates or recursive K/V listing over the entire mount) of policy checks.
Consuming applications will see a given key in the keys
response field if one of two conditions holds:
- The key ends in
/
and the application has thelist
capability on the resulting path, or - The key does not end in
/
and the application has theread
on the resulting path.
Example policy:
path "secret/metadata" {
capabilities = ["list"]
list_scan_response_keys_filter_path = "secret/metadata/{{ .key }}"
# list_scan_response_keys_filter_path = "secret/data/{{ .key }}"
}
path "secret/metadata/a" { # show up
capabilities = ["read"]
}
path "secret/data/b" { # not show up
capabilities = ["read"]
}
# "c" would never show up.
Technical Description
Permissions are currently governed only by a single type of policy, ACLs. Individual policies are compiled into an ACL (which is subsequently cached for potential reuse), which walks each policy and computes combined permission grants. We opt to say that if any list
operation (for a given path) sets list_scan_response_keys_filter_path
, a value will be present, and the first such policy (in list order) determines the resulting filter.
A token's is thus known at request handling time, which means we can, for LIST
or SCAN
operations where filtering is desired, re-evaluate (without decrementing the token's num_usage
!) the ACL against other paths without incurring additional storage accesses. Because such filtering is opt-in on a path-by-path basis, we can require plugins conform to the required response field structure (i.e., not be a secret
or auth
response and have only keys
and optionally key_info
).
As above, we know the two types of results from list entries (though only concrete keys are returned from SCAN
results usually) and can apply policy evaluation against the simulated operation. Notably, the underlying policy which granted this request is only returned by name. We modify the ACLResults
object to include the ResponseKeysFilterPath
value (as chosen above) for later evaluation.
Notably, we need to support advanced templating already: granted paths may match one or more wildcards so injection of the actual list or scan path must be possible, so that .key
can be made relative to them, as necessary. For instance, in the KVv2 engine, if one wanted to allow list globally, the request path would likely be in a path grant for secret/metadata/*
; e.g., secret/metadata/my/dir
, but if this were to be filtered by data access, one would need a filter path like secret/data/{{ .key }}
but for the my/dir
example, .key
would not contain my/dir
and so this would need to be dynamically added and the final filter path would thus be something like {{ .path | replace "secret/metadata/" "secret/data/" }}{{ .key }}
.
Rationale and Alternatives
One alternative is globally enforcing this today. However, as this is technically a breaking change (insofar as applications could be acting on complete list results, without necessarily having read permissions on any entries), this isn't desirable without opting in. However, as different applications may have different behavior, it would be hard to opt-in globally and thus per-application (or, by proxy, per-policy) opt-in makes sense.
Additionally, one could consider pushing filter path identification to the plugin author for each list response. However this isn't ideal as shown above: KV has multiple potentially valid paths which could independently or dependently determine whether an entry should be visible in a list response. Thus, having list_scan_response_keys_filter_path
take a string (and not simply be a boolean) seems to make the most sense to allow plugin authors to determine paths appropriately.
An enhancement for later would be to allow small CEL programs to filter results, which could interact with the ACL or broader policy system for determining whether to allow or deny entries in a list response. This would be a much broader change under the CEL strategy and should be under a follow-up RFC if desired.
Downsides
This does add some complexity to the already complex routing and response mechanism. However, it is almost entirely opt-in and thus performance impact is minimal unless this filtering is desired by an operator.
Parameter Interaction
This initial proposal lacks support for certain confounding parameters:
- Policies may require wrapping by setting either
min_wrapping_ttl
ormax_wrapping_ttl
; because our simulated request does not ask for wrapping, policy will deny this entry though the user will have access to it if requested with wrapping. - Policies may also require certain parameters be present in the request (via
required_parameters
) and these will also not be present in our simulated request.
This could either be resolved by adding additional parameters (which may not be sufficiently flexible) or by iterating through the attached ACL on the templated path to consider required parameters and wrapping information for each policy for each path.
This can be done later if users utilizing these features wish to have support.
Cross-Mount/Invalid Path Confusion
One additional downside is that, because policy authors set list_scan_response_keys_filter_path
, they could set this to a mount which doesn't exist or isn't the same as the list. That is, there is no common prefix requirement between the list and the templated path for filtering.
Wildcard Matching Difficulties
Lastly, one downside is that there are difficulties in supporting a wildcard match: if any path within a given prefix is allowed by any policy, show the key in the list result. For instance, if the list path was sys/mounts
, one could envision a list_scan_response_keys_filter_path
value of {{ .key }}/*
to indicate a matching mount should only be returned if the token's policy grants at least one operation on said mount. This is hard as the underlying trie structures do not allow reverse wildcard lookup.
This may require careful subtree iteration with backtracking, rooted at a certain concrete path and thus is left for the future if someone desires this behavior. This has parallels to an earlier dev-wg
mailing list post around glob-based alias matches.
Support for this is not added in this proposal and warrants another RFC in the future to address this shortcoming if desired.
Security Implications
In addition to the downsides and complexity noted above, this probably has net-positive security benefit: when desired, users who shouldn't see certain list
entries, because they could not read or list them, will no longer be able to see them if an operator applies relevant filtering. Because this is a breaking change and incurs some non-trivial performance impact, this cannot be done by default.
User/Developer Experience
For application developers, this may make results change based on a policy author's actions. On the whole, assuming the application was reading list results and already handling permission denied errors, this will likely improve an application's performance with fewer denied requests.
For policy authors, this is another variable to consider but may make quality of life better on certain endpoints.
However, the greatest improvement comes from web UI users: list results will now only contain relevant paths that a policy author sets and thus.
Unresolved Questions
Interactions with Pagination
When implementing pagination, the design was given as two request keys (after
and limit
), returning the same response structure. The semantics were implicitly that an empty result meant no further data was present. As a user may not have access to a given range of keys (say, there are four-digit numbers encoded in a K/V entry and the user only has access to the one 2024
, list
called with after=1000
and limit=100
would return 100 results from the plugin call which would be filtered to the empty list as the allowed 2024
is not in the returned [1000, 1100)
). This would look to the client like they should finish iteration (as there are no results) instead of having to increment their after
value.
However, the client may not know after
values that are useful for iteration. These may be random UUIDs (where the probability, with sufficient entries, of discovering a useful after=
value is rather small) or some other information-leaking construct. Thus returning a next
parameter would not be ideal, as the caller would also have to repeat calls and gains information about present-but-unaccessible keys.
This may mean OpenBao should add a warning (that all results were pruned) or itself iterate the list handler until at least one entry is returned if a non-empty list-page
result occurs. The latter adds overhead and auditing concern:
- Would we re-emit audit records for each subsequent internal? Part of me says no, because from an audit perspective we only care about the external request and its response, not internal work occurring as a result.
- Would they count towards quotas? As above.
- Would we allow policy authors to control behavior? Limiting the number of re-entries would be ideal.
- Would we iterate until the entire request was satisfied (up to
limit
, if there were sufficient ACL-permitted entries) or only a single value?
My current stance leads me to believe that ideally we'd fill the entire response as much as we could, letting the policy author set a recursion limit, not warn, and not have them count towards quotas or auditing.
Related Issues
Upstream:
Proof of Concept
https://github.com/cipherboy/openbao/pull/new/list-filter-by-readable