Fix MDQ Endpoint Behavior for EntityIDs with .xml or Trailing Slash #301
Problem
In PyFF 2.x, the MDQ handler attempts to parse the URL path and remove extensions (like .xml or .json) under the assumption that these are used to indicate the desired response format. However, in some cases, clients request metadata using fully encoded entityIDs like:
/entities/https%3A%2F%2Fidp.example.org.xml
In this case, the .xml is part of the actual entityID. PyFF would remove this suffix and attempt to resolve https://idp.example.org, which does not exist in the metadata. The result is an empty EntitiesDescriptor in XML responses or an empty list in JSON.
Solution
This patch modifies the _d() function to:
Only strip .xml or .json suffixes if the remaining path does not appear to be a percent-encoded entityID or a hash-based entityID ({sha1}, {sha256}, {md5}).
Treat a trailing .xml or .json as part of the true entityID, and preserve it during lookup, if the entityID appears to be encoded or hashed.
Additional fix: Preserves a trailing / if present in the request path.
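For reviewers, the rule described above can be sketched roughly as follows. This is an illustrative stand-alone helper, not the actual body of _d(): the function name split_format_suffix, the regular expressions, and the choice to operate on a single raw (still percent-encoded) path segment are all assumptions made for the example.

```python
import re
from urllib.parse import unquote

# Hypothetical names for illustration; this is not pyFF's _d() implementation.
FORMAT_SUFFIXES = (".xml", ".json")
HASH_PREFIX = re.compile(r"^\{(sha1|sha256|md5)\}")
PERCENT_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def split_format_suffix(raw_segment):
    """Return (identifier, extension) for one raw path segment.

    The extension is None when the suffix is kept as part of the identifier.
    """
    # Assumption: a segment "looks like" an entityID if it carries a hash
    # prefix or contains at least one percent escape.
    looks_like_entity_id = bool(
        HASH_PREFIX.match(raw_segment) or PERCENT_ESCAPE.search(raw_segment)
    )
    if not looks_like_entity_id:
        for suffix in FORMAT_SUFFIXES:
            if raw_segment.endswith(suffix):
                # Plain name: the suffix selects the response format.
                return raw_segment[: -len(suffix)], suffix[1:]
    # Encoded or hashed identifier: keep any .xml/.json suffix and any
    # trailing slash, and only decode the percent escapes.
    return unquote(raw_segment), None
```

With this sketch, the segment https%3A%2F%2Fidp.example.org.xml resolves to the entityID https://idp.example.org.xml with no format extension, while a plain name such as example-federation.json still yields the name example-federation and the extension json.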