wiki:KeySearcher OpenCL

Version 3 (modified by Sven Rech, 11 years ago)


There is a serious, driver-related problem when using the OpenCL capabilities of KeySearcher with Nvidia GPUs:

When KeySearcher runs with OpenCL and a RegEx cost function, its behaviour is erratic and so far unexplained. Whether the code behaves correctly depends heavily on which regex you choose. When it misbehaves, you get -1.0 as the result for every brute-forced key.

This is not an error in the regex algorithm itself, because the very same algorithm works perfectly in CPU mode. The same OpenCL code also works perfectly when executed on ATI's OpenCL CPU emulation driver.
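The comparison against CPU mode can be turned into a small automated check. The following is only a sketch under assumptions: the function name `diagnose`, the list-of-costs inputs and the tolerance are hypothetical and not part of KeySearcher. It only assumes that the cost values for the same keys are available once from the CPU run and once from the OpenCL run.

```python
def diagnose(cpu_costs, gpu_costs, tolerance=1e-6):
    """Compare cost values from CPU mode with those from the OpenCL run.

    Hypothetical helper, not KeySearcher code. Both lists are assumed to
    hold the cost value for the same keys in the same order.
    """
    if len(cpu_costs) != len(gpu_costs):
        raise ValueError("both runs must cover the same keys")

    # Symptom described on this page: every key comes back as -1.0,
    # although the CPU reference produced other values.
    if (gpu_costs
            and all(c == -1.0 for c in gpu_costs)
            and any(c != -1.0 for c in cpu_costs)):
        return "all results are -1.0"

    # Otherwise look for individual deviations from the CPU reference.
    mismatches = [i for i, (cpu, gpu) in enumerate(zip(cpu_costs, gpu_costs))
                  if abs(cpu - gpu) > tolerance]
    if mismatches:
        return "mismatch at key indices " + ", ".join(map(str, mismatches))
    return "ok"


if __name__ == "__main__":
    print(diagnose([0.5, 1.0, 0.0], [0.5, 1.0, 0.0]))
    print(diagnose([0.5, 1.0, 0.0], [-1.0, -1.0, -1.0]))
```

Running such a check for each regex makes it easier to record which expressions are affected under which driver version.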

Nor does it seem to be some kind of "out of memory" condition, which would be a plausible suspect because the regex algorithm uses large transition tables: I already tried reorganizing the OpenCL code so that high memory usage is avoided, and it still did not work. At the moment it is hard, if not next to impossible, to tell what causes the problem, or even what the actual underlying problem is. The only thing that is clear is that the OpenCL code is not executed the way it should be.

Additionally, the problems change with newer versions of the Nvidia driver. For instance, after observing the problems described above, I installed the newest Nvidia drivers. The problem was still observable afterwards, but less often: some regular expressions that were problematic before the driver update now work normally, while others still do not.

Another hint that the Nvidia OpenCL implementation is in a bad state is that the newest driver always opens a console window for a very short time when the OpenCL code compiles in the background. This was not the case with older drivers. This behaviour is presumably not intended, which suggests that the implementation is still at an early stage of development.

The described problem applies to the Windows drivers as well as the Linux drivers (32 and 64 bit).

This problem was solved by disabling optimization during compilation. It should be reviewed in the future, however, whether Nvidia has finally got this right...
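This page does not say how optimization was switched off. One way that the OpenCL specification provides is the compiler option `-cl-opt-disable`, passed in the options string when the program is built (the `options` argument of `clBuildProgram`). The sketch below only assembles such an options string. The helper `build_options` and the `TABLE_SIZE` define are hypothetical, and it is an assumption that KeySearcher uses this particular option.

```python
# Standard OpenCL compiler option that turns off all optimizations.
OPT_DISABLE = "-cl-opt-disable"


def build_options(extra=(), disable_optimization=True):
    """Return the options string for an OpenCL program build.

    Hypothetical helper, not KeySearcher code. `extra` holds any other
    compiler options, for example preprocessor defines.
    """
    options = list(extra)
    # Append the option only once, even if the caller already passed it.
    if disable_optimization and OPT_DISABLE not in options:
        options.append(OPT_DISABLE)
    return " ".join(options)


if __name__ == "__main__":
    # TABLE_SIZE is an illustrative define, not taken from KeySearcher.
    print(build_options(["-D", "TABLE_SIZE=256"]))
```

Keeping the switch in one place makes it easy to re-enable optimization later and test whether a newer driver still shows the problem.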