Messy but working branch used for training models in parameter scan #54
Conversation
von Mises fix; also fix size/shape mistake; just plot two points for uniform pdf
* major overhaul - two point and top observables implemented
* update example runcards
* clean up a bit, add docstrings, add 2m corr length
* correct bug in autocorrelation: one dim at a time!
* log info acceptance fraction, not log debug
* make correlators connected
* take sample interval into account in plots
* docstrings on observables actions
* fix bug in plotting known pdfs
* von Mises fix; also fix size/shape mistake; just plot two points for uniform pdf
* correct typo

Co-authored-by: marshrossney <[email protected]>
* circle projection with nn.Parameters; fix ncp circle
* add docstring and example runcard
* upgrade ncp to coupling layer; correct single line
* add option to specify final activation
* remove duplicate action

Co-authored-by: marshrossney <[email protected]>
* generic MixtureDist class
* simplify class - takes list of dist objects

Co-authored-by: marshrossney <[email protected]>
Add rational quadratic spline
Tagged commit for the version which I used to generate results for our 2020 phi four paper.
_sample_training_output = collect("sample", ("training_context",))
I don't like this 'collect over input card' because the values in the input card are used directly rather than being parsed by the ConfigParser (production functions work, however). Apart from skipping all the checks, this means we cannot return anything other than the input from a parse function. For example, if I let `def produce_target_dist(self, target: str)` become `def parse_target(self, target: str)` and then use `target` in place of `target_dist` throughout the code (which is much nicer), then the training works but the sampling does not, since `_sample_training_output` sees the string version of `target` provided in the training input card, not the object returned by `parse_target`.
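To make the distinction concrete, here is a minimal standalone sketch; `ExampleConfig` and `PhiFourTarget` are invented names that just mirror the parse_/produce_ convention discussed above, not the project's real config class:

```python
# Illustrative only: invented names, not taken from the actual code.
class PhiFourTarget:
    """Stand-in for a target distribution object built from a runcard key."""
    def __init__(self, name: str):
        self.name = name


class ExampleConfig:
    # Production rule: builds an object from a key that has already been read
    # from the runcard, so downstream actions receive a PhiFourTarget.
    def produce_target_dist(self, target: str) -> PhiFourTarget:
        return PhiFourTarget(target)

    # The nicer alternative: parse the key itself, so `target` is an object
    # everywhere in the code. Training then works, but anything gathered via
    # collect("sample", ("training_context",)) still sees the raw string from
    # the training input card rather than the object returned here.
    def parse_target(self, target: str) -> PhiFourTarget:
        return PhiFourTarget(target)
```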
So moving forwards I am looking at different ways to ensure consistency between the training, sampling and report-making stages without using 'collect over input card'.
Not that you would know this, but reportengine has a `parse_from_` mechanism, which means in the runcard I can do:

sample: {from_: training_context}

but it also means inside a production rule you can do something like

self.parse_from_(namespace, key)

I'd have to see the exact requirement to give a better example, but I think this would solve your problem, because `parse_from_` would return the output of the parsing function acting on the key, so I think you could get your consistency.
The first example there is assuming `sample` isn't a production rule. It's all a bit complicated, but I think it would be best for me to show you a proper example of this in action.
Ah excellent, yes I think I understand vaguely how this could work, based on the 'Validphys 2 guide'
The documentation really is quite poor for some of these features. Soon the nnpdf code will be public, so you will be able to see examples of using reportengine, including using `parse_from_`. Since `training_output` has an `as_input()` method you can actually write:

self.parse_from_(training_output, <key>)

which is what I really should have done in the `training_context`, but for the selection of keys we wanted. I was incredibly lazy so just dumped the unprocessed config instead, which is really rubbish. Sorry
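For what it's worth, a rough sketch of how that might look; the key names are hypothetical, and `parse_from_` / `as_input()` are used only as described in this thread, not checked against reportengine's actual signatures:

```python
# Rough sketch only: follows the parse_from_ / as_input() usage described
# above; the selection of keys and the exact signatures are guesses.
class ExampleConfig:  # would inherit from reportengine's Config in practice
    def produce_training_context(self, training_output):
        # training_output.as_input() exposes the training runcard as an input,
        # so parse_from_ can run the proper parse_* function on each key we
        # want, instead of dumping the unprocessed config.
        wanted_keys = ("target", "base", "model_params")  # hypothetical
        return {key: self.parse_from_(training_output, key)
                for key in wanted_keys}
```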
I guess it was because we were taking so many of the keys. But a bit of verbosity is a small price to pay for this thing to work properly.
No need to apologise! It's often been the case that I've had funny ideas about how to do things and you've had to quickly morph them into something that works with reportengine. Anyway, this was always gonna be a case of iterative improvement and it's getting there.
I'll take your lead here. But consider the alternative: we merge this and accept that some things might not be optimal, and then you draft me in to help with the cleanup? I can, for example, do some of the reportengine-related things much quicker in the first instance, and then the example will be there for you to play around with.
Yeah, I do see the advantage in doing things that way. There are too many things I'm not happy with though, and the only reason I've not fixed them yet is to focus on the actual output, so I'm really quite keen to have a stab at redoing things in a way that I think works. For me it's not wasted time even if the process is slightly slower than it could be or if we end up undoing some of the changes.
Ok! Let me know if you need a hand
Thanks. Essentially what I want to do is make it easier to use drop-in replacements for things like the training loop, coupling layers etc. without having to worry about adding a load of junk to the ReportEngine graph and without hard-coding a bunch of stuff into the core code. These are the two strategies I've been working with up to now, and it's less than ideal! I've totally changed my mind about dropping ReportEngine for the training though. My personal feeling is that we should keep it but just allow the user to play with new ideas that haven't been carefully integrated into the core code by us. For example, the user defines their own training class that inherits from the core one but adds a step in the training loop that anneals the temperature. I imagine it would be fairly easy to implement something generic for input cards along these lines.
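As a toy illustration of that kind of extension point (all class and method names here are invented for the sketch, not taken from the actual code), a user-defined trainer might look something like:

```python
# Toy illustration of the 'drop-in replacement' idea: invented names only.
class BaseTrainer:
    def __init__(self, model, n_steps: int):
        self.model = model
        self.n_steps = n_steps

    def pre_step(self, step: int):
        """Hook for subclasses; does nothing by default."""

    def training_step(self, step: int):
        ...  # loss evaluation and optimiser update would live here

    def train(self):
        for step in range(self.n_steps):
            self.pre_step(step)
            self.training_step(step)


class AnnealedTrainer(BaseTrainer):
    """User-defined variant: anneals a temperature during training."""

    def __init__(self, model, n_steps: int,
                 t_initial: float = 2.0, t_final: float = 1.0):
        super().__init__(model, n_steps)
        self.t_initial, self.t_final = t_initial, t_final
        self.temperature = t_initial

    def pre_step(self, step: int):
        # Linear annealing schedule from t_initial to t_final.
        frac = step / max(self.n_steps - 1, 1)
        self.temperature = self.t_initial + frac * (self.t_final - self.t_initial)
```

The core code would only need to know about the hook, while the user's class lives outside it; how such a class would be selected from the input card is left open here.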
This is the very messy branch I've been using to train models over the last couple of months, with some O(3)-related features (more options for lattice partitionings etc.) missing and some phi^4-related features thrown in without a care (e.g. splitting the inputs into positive and negative 'magnetisation').
I would prefer not to merge this PR, since I'm currently working on a major clean-up and restructure, so putting any effort into merging it is probably not worthwhile. This is more to facilitate discussion (and spot bugs!) while I'm still working on cleaning things up. This might also be useful for flagging up things that aren't working very well, so I can think about more sensible ways of doing them at the same time as doing this restructuring.