-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
157 missing translations warning #571
Conversation
- added tests have a couple of todo's to follow up - will also need to add: - test cases for happy path and other translatable columns - the same sort of check for the translatable choices sheet columns - also added some obvious type annotations and moved repeated usages of "default" for the default language to the constants module.
- in the tests in the prior commit it was becoming difficult to discern the correct / normal behaviour is for various translation scenarios - includes numerous todo's for suspect behaviour that may be bugs - added guidance_hint to translatable columns list
- simplified representation of issue 157 source case where label + hint are in French, with a default lang image column, and no default set. - media columns were not being detected correctly since the survey_sheet object nests audio, image and video in a "media" dict. - updated constants list with comment on other possible translatable columns in case pyxform supports them in future. - general tidy up of test xpaths and usages of them, such as to remove combination/shortcut xpaths, specify absence of ref or form attributes where relevant, condense guidance/audio/image/video xpaths. - remove duplicate test todo's for the same issue in a similar context.
Codecov Report
@@ Coverage Diff @@
## master #571 +/- ##
==========================================
+ Coverage 83.59% 83.79% +0.19%
==========================================
Files 25 26 +1
Lines 3761 3825 +64
Branches 930 952 +22
==========================================
+ Hits 3144 3205 +61
Misses 474 474
- Partials 143 146 +3
Continue to review full report at Codecov.
|
- moved TranslationTest.test_missing_media_itext to TestTranslations - moved new TranslationTest tests to TestTranslationsSurvey - added TestTranslationsChoices - similar to survey tests, added to clarify existing behaviour - includes some initial cases, for with/out a lang and choice_filter - some overlap with module test_secondary_instance_translations.py which should be resolved once these tests are complete. - added a XPathHelper class to try and make the choices tests more legible and easier to document, vs. the constants used for survey tests. Could adapt / extend helper for survey tests as well.
- Update TestTranslations and TestTranslationsSurvey that were using string constants to use XPathHelper so that the assertions being made are easier to follow. - Add XPathHelper method docstrings to describe what they assert.
- Previously just showing the XForm made it a mystery which of the (potentially many) XPath expressions failed to match.
- Add to existing warning for missing translations where the affected columns are constraint_message or required_message - Due to a difference in the internally used name and supported external aliases, the TRANSLATABLE_SURVEY_COLUMNS object is now a dict and moved to aliases.py so it is closer to the existing aliases dicts to help keep them in sync.
- debug=True causes the XForm and other test info to be printed out but this shouldn't be present on committed tests. - If warnings__* parameters are misspelled then they don't do anything.
- checks columns shown in aliases.py - check done alongside survey check so there's just warning message about missing translations - add choices warning test cases for same set of scenarios as for survey - update existing survey tests to use warnings__contains or not_contains instead of passing in list and doing assertIn / assertNotIn
- the three related functions for this check are parts of a validation, and they otherwise clutter up xls2json.py a bit. - may make sense to move other xls2json checks into the same package so it's clearer what validations pyxform does.
- combination of existing tests with same name pattern but in one test case that has both warning and no-warning assertions.
Pretty clean approach! I am still worried that The added testing is pretty helpful! |
- to check whether the missing translation check incurs a reasonable time cost, considering that it traverses the full XLSForm dict available in xls2json rather than earlier in the pipeline to just check sheet headers. It seems there is very little difference for a giant form with 500 questions and 2 choices per question.
# Conflicts: # pyxform/xls2json.py
Hi @KeynesYouDigIt thanks for your review! Please let me know if any changes or further clarification are needed. |
@lindsay-stevens no im good, this all seems fine! I'm afraid I don't have official review permissions so someone else will have to do the official review. |
Hi @lognaturel would you be able to review? |
Will do! I believe dashes are filled in for labels because Validate requires a label for every language. |
Hi @lognaturel just a reminder about this - no worries if it's til the new year. Thanks! |
Hi @lognaturel happy new year! Just a reminder about this PR ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As seems to be the case for things you tackle, @lindsay-stevens, there's some mind-bending stuff here! I'm happy with this and we can always come back if we want to warn more aggressively as I mention. I'll leave this as a comment for now in case any of my inline thoughts spark any ideas, particularly for another way to arrange the warning.
- missing translations check can still work while only looking at first row, which is a little faster and still passes the tests. - while profiling, found an inefficient unique check in section.py. The benefit is only apparent with forms >1000 questions, but it is significant: q=5000 28s vs. 11s, q=10000 103s vs. 21s.
Thanks for reviewing @lognaturel! I've replied to each comment, please let me know if any changes are needed at this stage (or via separate tickets). |
- default_language setting shouldn't influence the warning since the underlying issue is missing languages, and the setting just happens to be a way around one of the problems that causes. - updated the warning format to be one line per lang/sheet, with column names listed in the line. - includes corresponding test changes for the above, and re-configures how the check func processes data to better match the warning format.
@lognaturel thanks for the feedback / discussion above. The outstanding items should all now be addressed, either by new commits or new tickets for follow up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What a journey!
Closes #157
Adds a warning if it looks like a XLSForm is missing a translation on one or more translatable columns.
The originating issue was a test case concerning only survey
label
andmedia::image
. This PR covers all translatable columns (per the Reference Table. On the survey sheet: label, hint, guidance_hint, media::image, media::audio, media::video, constraint_message, and required_message. On the choices sheet: label, media::image, media::audio, and media::video.The warning message specifies the columns and languages which appear to be missing. The (unwrapped) template is:
Where
{cols_msg}
lists the columns (for either or both ofsurvey
andchoices
) separated by a semicolon and the languages separated by a comma, e.g.'hint': 'English', 'French'; 'label': 'English'
. It can be a bit difficult to read when there are a lot of missing columns / languages but I'm not sure of a better alternative, such as using list notation (square brackets) etc. However, presenting the warning like this seems preferable to emitting a separate warning for each column. Example:Why is this the best possible solution? Were any other approaches considered?
It would have been a good outcome if there was clearly a way that pyxform could produce an XForm that avoided the issue that the warning is for. However there were not many tests for translations previously. This made it difficult to tell whether aspects of and differences in XLSForm input and XForm output were symptoms of the originating issue, or normal behaviour, or undiscovered bugs.
This PR includes a raft of relatively pedantic XPath tests for translations behaviour. There are probably still more permutations that don't have tests but this should be a good boost to test coverage. In
test_translations.py
there are some TODO comments which should be reviewed and converted to tickets if they represent genuine bugs, or removed if they note normal behaviour. Once the TODO's are resolved, maybe it would be possible for pyxform to emit an XForm such that the warning is not required.What are the regression risks?
Some users may expect no warning for certain XLSForms with missing translation columns. Depending on how warnings are handled it may or may not be a problem. The changes in this PR are otherwise additive.
Does this change require updates to documentation? If so, please file an issue here and include the link below.
The
guidance_hint
column was missing from the XLSForm Reference Table but I've submitted a PR for that already.Before submitting this PR, please make sure you have:
tests
nosetests
and verified all tests passblack pyxform tests
to format code