Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve xml hierarchy test methods, refactor test_trigger.py tests to use them #570

Merged
merged 8 commits into from
Nov 17, 2021

Conversation

lindsay-stevens
Copy link
Contributor

@lindsay-stevens lindsay-stevens commented Nov 1, 2021

Closes #563

Why is this the best possible solution? Were any other approaches considered?

This follows on from #548 to add the discussed xml__xpath_match kwarg for PyxformTestCase and to switch to the lxml package from ElementTree for the purposes of test XPath evaluation. The benefit of lxml is that it's XPath 1.0 processor can do a lot more than ElementTree's XPath subset. Gains include being able to write compound predicates (with and) in one predicate instead of like .//bind[1=1][1=1], the use of axes, and numerous XPath functions. Also it allows the old code for switching XML tokenizers for Python 3.7 to be removed. More thorough tests for the xml__xpath* matchers are added as well.

The way around the default namespace limitation in lxml (mentioned in #548) was to tell the XPath processor about the same namespace URI/URL but assign a non-empty and not-None prefix to it. The lxml docs say the namespace match is actually on the URI not the prefix, so it's a bit like SQL aliases in different SELECT clauses. Since the default namespace is http://www.w3.org/2002/xforms I went with a prefix of x for this purpose. The assignment happens in pyxform_test_case.py on L231 which is finally used at L556. This means that in any usages of the xml__xpath* test matchers, XPath expressions referring to elements in the default namespace need to be written like .//x:bind instead of .//bind. Attributes without a prefix don't require this x prefix. There are numerous examples in test_pyxform_test_case.py.

This PR also delivers on the main purpose of #563 to refactor the test_trigger.py tests to use the new xml__xpath* matchers. With above improvements and addition of xml__xpath_match, it was all possible with xml__xpath_match.

Further details in the commit messages.

What are the regression risks?

This is a new testing feature so there would be no regressions unless other branches had already started using it.

It's possible that the refactors of test_trigger.py test are not perfect, despite being careful. Each of these tests passes. The XPath results can be checked by setting debug=True, which prints out the XML document, each XPath expression and what it matched.

Does this change require updates to documentation? If so, please file an issue here and include the link below.

Not applicable.

Before submitting this PR, please make sure you have:

  • included test cases for core behavior and edge cases in tests
  • run nosetests and verified all tests pass
  • run black pyxform tests to format code
  • verified that any code or assets from external sources are properly credited in comments

- adds xpath_match test matcher, which is a convenience on xpath_count
  where exactly one match is required.
- adds more tests for xpath_* matchers. Previous test cases were
  minimal, the new suite covers 1 or n XPaths with 1 or n matches, for
  each of the matchers. The test data fed to these test cases cover
  a range of common XPath patterns, including a search for an attribute
  value (see s1c5).
- changes the XPath engine. The previous approach in XLSForm#548 used stdlib
  elementtree xpath, which is like XPath but has limited expressions
  capabilities. The use of lxml makes available a wider range of XPath
  capabilities.
- changes XPath requirements. lxml is a bit more strict on the XPath
  spec so elements in the default namespace must be prefixed with
  something (x), since the document contains other namespaces.
- now used by tests/pxform_test_case.py
- fix n_match tests: cases weren't using the suite2 cases because of
  typo in the convenience lists. For xpath_match, skip
  n_xpath_n_match_pass scenario and rework n_xpath_n_match_fail.
- fix s2c1: incorrect count
- fix s2c2: incorrect exact match strings
- add docs: describe suites and cases. Clarification in check_content.
- add validation: check the tuple types for xpath_exact and xpath_count.
- move matcher code: to facilitate formalised func signatures / types /
  docstrings, and to allow neater / less indented code (vs keeping it
  all inline with the check_content func).
- remove tokenizer funcs: not needed anymore since change to using lxml.
- update invocation (L337-L346) and docstring: usage of xpath matchers
  intended and tested with whole XML document only, not the
  instance/model/itext fragments.
- update xml__xpath_exact input args: for the expected strings, now
  takes a set instead of list. XPath processes in document order but
  document order not necessarily stable between runs.
- Equivalent for model__contains, xml__contains is generally a XPath
  for the element in the string with attribute predicates to check the
  content is the same. The improvement though is the check for that
  element in a precise document location. So in most cases the full
  path was used rather than a full search (e.g. ".//x:bind") or axis.
- Equivalent for xml__excludes is a XPath for the root element
  ("/h:html") so that there is 1 match, but with a predicate asserting
  that no descendant elements have the name and attributes of interest.
@lindsay-stevens lindsay-stevens changed the title Pyxform 563 Improve xml hierarchy test methods, refactor test_trigger.py tests to use them Nov 1, 2021
Copy link
Contributor

@lognaturel lognaturel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I focused mostly on the updated tests which look good to me. I agree with the rationale for introducing lxml. The approach for the default namespace makes sense.

I'm wondering whether we should strongly recommend using XPath matchers for new tests now that all this investment has gone into making them nice? The biggest benefit I see is that it removes the attribute order issue entirely and makes tests more robust against changes to things like whitespace in the XForms output. If you agree, @lindsay-stevens, would be great to add to the README.

@codecov-commenter
Copy link

codecov-commenter commented Nov 4, 2021

Codecov Report

Merging #570 (161c3e9) into master (6ca3484) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #570   +/-   ##
=======================================
  Coverage   84.20%   84.20%           
=======================================
  Files          25       25           
  Lines        3672     3672           
  Branches      865      865           
=======================================
  Hits         3092     3092           
  Misses        440      440           
  Partials      140      140           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6ca3484...161c3e9. Read the comment docs.

@lindsay-stevens
Copy link
Contributor Author

Thanks @lognaturel I agree and have added a section to the README for this. I've focussed on xml__xpath_match because after having used it I think that will probably be all that's needed in >95% of cases.

@lindsay-stevens
Copy link
Contributor Author

Appveyor build error seems to be random pip error, I only added to the Readme since the passing build in baec985

@lindsay-stevens lindsay-stevens merged commit 66c0d5b into XLSForm:master Nov 17, 2021
@lindsay-stevens lindsay-stevens deleted the pyxform-563 branch November 17, 2021 09:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor test_trigger.py tests to use xml hierarchy test methods
3 participants