-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve xml hierarchy test methods, refactor test_trigger.py tests to use them #570
Conversation
- adds xpath_match test matcher, which is a convenience on xpath_count where exactly one match is required. - adds more tests for xpath_* matchers. Previous test cases were minimal, the new suite covers 1 or n XPaths with 1 or n matches, for each of the matchers. The test data fed to these test cases cover a range of common XPath patterns, including a search for an attribute value (see s1c5). - changes the XPath engine. The previous approach in XLSForm#548 used stdlib elementtree xpath, which is like XPath but has limited expressions capabilities. The use of lxml makes available a wider range of XPath capabilities. - changes XPath requirements. lxml is a bit more strict on the XPath spec so elements in the default namespace must be prefixed with something (x), since the document contains other namespaces.
- now used by tests/pxform_test_case.py
- fix n_match tests: cases weren't using the suite2 cases because of typo in the convenience lists. For xpath_match, skip n_xpath_n_match_pass scenario and rework n_xpath_n_match_fail. - fix s2c1: incorrect count - fix s2c2: incorrect exact match strings - add docs: describe suites and cases. Clarification in check_content. - add validation: check the tuple types for xpath_exact and xpath_count.
- move matcher code: to facilitate formalised func signatures / types / docstrings, and to allow neater / less indented code (vs keeping it all inline with the check_content func). - remove tokenizer funcs: not needed anymore since change to using lxml. - update invocation (L337-L346) and docstring: usage of xpath matchers intended and tested with whole XML document only, not the instance/model/itext fragments. - update xml__xpath_exact input args: for the expected strings, now takes a set instead of list. XPath processes in document order but document order not necessarily stable between runs.
- Equivalent for model__contains, xml__contains is generally a XPath for the element in the string with attribute predicates to check the content is the same. The improvement though is the check for that element in a precise document location. So in most cases the full path was used rather than a full search (e.g. ".//x:bind") or axis. - Equivalent for xml__excludes is a XPath for the root element ("/h:html") so that there is 1 match, but with a predicate asserting that no descendant elements have the name and attributes of interest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I focused mostly on the updated tests which look good to me. I agree with the rationale for introducing lxml
. The approach for the default namespace makes sense.
I'm wondering whether we should strongly recommend using XPath matchers for new tests now that all this investment has gone into making them nice? The biggest benefit I see is that it removes the attribute order issue entirely and makes tests more robust against changes to things like whitespace in the XForms output. If you agree, @lindsay-stevens, would be great to add to the README.
Codecov Report
@@ Coverage Diff @@
## master #570 +/- ##
=======================================
Coverage 84.20% 84.20%
=======================================
Files 25 25
Lines 3672 3672
Branches 865 865
=======================================
Hits 3092 3092
Misses 440 440
Partials 140 140 Continue to review full report at Codecov.
|
Thanks @lognaturel I agree and have added a section to the README for this. I've focussed on |
Appveyor build error seems to be random pip error, I only added to the Readme since the passing build in baec985 |
Closes #563
Why is this the best possible solution? Were any other approaches considered?
This follows on from #548 to add the discussed
xml__xpath_match
kwarg forPyxformTestCase
and to switch to the lxml package from ElementTree for the purposes of test XPath evaluation. The benefit of lxml is that it's XPath 1.0 processor can do a lot more than ElementTree's XPath subset. Gains include being able to write compound predicates (withand
) in one predicate instead of like.//bind[1=1][1=1]
, the use of axes, and numerous XPath functions. Also it allows the old code for switching XML tokenizers for Python 3.7 to be removed. More thorough tests for thexml__xpath*
matchers are added as well.The way around the default namespace limitation in lxml (mentioned in #548) was to tell the XPath processor about the same namespace URI/URL but assign a non-empty and not-None prefix to it. The lxml docs say the namespace match is actually on the URI not the prefix, so it's a bit like SQL aliases in different SELECT clauses. Since the default namespace is
http://www.w3.org/2002/xforms
I went with a prefix ofx
for this purpose. The assignment happens inpyxform_test_case.py
on L231 which is finally used at L556. This means that in any usages of thexml__xpath*
test matchers, XPath expressions referring to elements in the default namespace need to be written like.//x:bind
instead of.//bind
. Attributes without a prefix don't require thisx
prefix. There are numerous examples intest_pyxform_test_case.py
.This PR also delivers on the main purpose of #563 to refactor the
test_trigger.py
tests to use the newxml__xpath*
matchers. With above improvements and addition ofxml__xpath_match
, it was all possible withxml__xpath_match
.Further details in the commit messages.
What are the regression risks?
This is a new testing feature so there would be no regressions unless other branches had already started using it.
It's possible that the refactors of
test_trigger.py
test are not perfect, despite being careful. Each of these tests passes. The XPath results can be checked by settingdebug=True
, which prints out the XML document, each XPath expression and what it matched.Does this change require updates to documentation? If so, please file an issue here and include the link below.
Not applicable.
Before submitting this PR, please make sure you have:
tests
nosetests
and verified all tests passblack pyxform tests
to format code