- ethminer does not support any CUDA version that requires the obsolete shuffle without sync.
- Copying mix to shared mem is expensive, do it only for actual solutions.
@@ -114,11 +92,15 @@ __device__ __forceinline__ uint64_t compute_hash(
}
}
}
This is the actual optimization. The above is just cleanup.
@@ -86,16 +74,6 @@ __device__ __forceinline__ uint64_t compute_hash(
uint32_t thread_mix = fnv_reduce(mix[p]);

// update mix accross threads
#if CUDA_VERSION < SHUFFLE_DEPRECATED
This is no longer needed?
Did I miss one?
Allows CUDA to use shifts instead of multiplies, and sequential access to the mix. Assumes CUDA arch >= 3 and CUDA toolkit >= 9, and removes deprecated code and definitions.
Also deleted 2 unused CUDA files.
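For readers following along, here is a rough host-side C++ sketch of why a power-of-two sized, aligned result record lets the index computation become a shift. The struct `SearchResult`, its fields, and the helper names are hypothetical and are not ethminer's actual definitions; the real layout in this PR may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result record; field names and sizes are assumptions,
// not ethminer's actual definition.
struct alignas(64) SearchResult {
    std::uint32_t gid;     // index of the thread that found the candidate
    std::uint32_t pad[7];  // padding so the record is exactly 64 bytes
    std::uint32_t mix[8];  // 256-bit mix digest, stored at sequential addresses
};

constexpr std::size_t kResultShift = 6;  // log2(64)

constexpr bool is_power_of_two(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

static_assert(is_power_of_two(sizeof(SearchResult)),
              "record size must be a power of two for shift indexing");
static_assert(sizeof(SearchResult) == (std::size_t{1} << kResultShift),
              "shift constant must match the record size");

// Byte offset of record `index`, computed with a shift.
// Equivalent to index * sizeof(SearchResult) because the size is 2^6.
constexpr std::size_t byte_offset(std::size_t index) {
    return index << kResultShift;
}

// Returns record `index` of a buffer that holds `capacity` records.
inline SearchResult& result_at(SearchResult* buffer, std::size_t capacity,
                               std::size_t index) {
    assert(index < capacity);
    return buffer[index];
}
```

With a record size that is not a power of two (say 36 bytes), the same offset would need a multiply; padding to 64 bytes trades a little memory for simpler address arithmetic.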
@chfast Should I squash all of this? It might actually be easier to inspect one commit at a time?
I wonder if we still need to support the sm 30 & 35 Kepler architectures? Other than the Tesla K40 & K80, those were all GPUs with 2GB or less. I think the K40/80 have long since become unprofitable. Even the Maxwell architecture is barely profitable, and it's a later technology.
Aren't the K40 & K80 the ones some Macs have?
This is quite a lot of changes, so let's keep all the commits.
I'd like to see another review.
I think we should still support 3.0 and 3.5, since those are still pretty common in big datacenters.
@@ -338,10 +338,10 @@ class Farm: public FarmFace
* @param _wp The WorkPackage that the Solution is for.
* @return true iff the solution was good (implying that mining should be .
This comment is no longer accurate.
Ah yes, thank you.
ethminer does not support any CUDA architecture
that requires the obsolete shuffle without sync.
Copying mix to shared mem is expensive, do it
only for actual solutions.
Include whole work package in solution. Works out
to about the same as including all of its components
individually. Shortens construction of solution.
Redefine, resize, and realign the results buffer so that
CUDA can use a shift instead of a multiply, and
sequential addresses for the mix copy.
Delete unused cuda files, previously needed for
pre-shuffle architectures.
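
The "copy mix only for actual solutions" idea from the commits above can be sketched on the host side roughly as follows. This is only an analogy in plain C++, not the CUDA kernel from this PR: `toy_hash`, `Solution`, and `search` are hypothetical names, and the hash is a stand-in rather than Ethash.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Mix = std::array<std::uint32_t, 8>;

// Hypothetical solution record: the nonce plus its mix digest.
struct Solution {
    std::uint64_t nonce;
    Mix mix;
};

// Toy stand-in for the real hash (NOT Ethash). Fills `mix` as a side
// effect and returns a 64-bit value to compare against the target.
inline std::uint64_t toy_hash(std::uint64_t nonce, Mix& mix) {
    std::uint64_t h = nonce * 0x9E3779B97F4A7C15ULL;
    for (std::size_t i = 0; i < mix.size(); ++i) {
        h ^= h >> 29;
        h *= 0xBF58476D1CE4E5B9ULL;
        mix[i] = static_cast<std::uint32_t>(h >> 32);
    }
    return h;
}

// Scans `count` nonces starting at `start`. The mix stays in a local
// working variable and is copied out only when the hash meets the target.
inline std::vector<Solution> search(std::uint64_t start, std::uint64_t count,
                                    std::uint64_t target) {
    std::vector<Solution> found;
    Mix mix{};  // working copy, overwritten for every nonce
    for (std::uint64_t n = start; n < start + count; ++n) {
        const std::uint64_t h = toy_hash(n, mix);
        if (h > target) {
            continue;  // common case: not a solution, so no copy is made
        }
        found.push_back(Solution{n, mix});  // rare case: pay for the copy
    }
    return found;
}
```

Since almost every nonce fails the target check, deferring the copy until after the comparison removes the copy cost from the common path.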