
Tuesday, July 1, 2014

Mining CryptoNote Coins with NVIDIA - Settings and Performance

[Update: You can also just use Nicehash, with their auto-switching NiceHashMiner. That's what I'm doing these days, and it pays out in BTC.]

I'm going to assume most people are already aware of the AMD GPU miner for CryptoNote coins, so I'm not going to discuss that for now -- though it's probably worth its own post in the near future. Instead, I want to look at mining CryptoNote coins with NVIDIA hardware, since we now have an open source (thanks TSIV!) version of ccMiner for just that purpose. First things first, grab the Windows binaries (or if you're doing Linux, use the source code and compile it). But now that you have the executable, how do you get it running?

The good news is that all of the CryptoNote coins that I've looked at where you can use a pool just rely on your wallet's address. As Monero is currently the most profitable of the coins (AFAICT), I'm going to use that as my example, but other CryptoNote coins should be similar in practice. Before you can get mining, you'll need the wallet, and you'll want to download and sync with the blockchain. For Monero, that means grabbing the latest files and then running them. Assuming this is your first time running, do the following:

  1. Download the wallet and daemon files and extract them to an appropriate folder (e.g. C:\Mining\Monero).
  2. You'll save yourself a lot of time by downloading the partial blockchain that's available on the main thread -- get the appropriate chain for Windows, OS X, or Linux (64-bit only), or else you'll just need to run the daemon and wait for it to finish syncing (which can take hours the first time). For Windows, put the blockchain.bin file in %AppData%/bitmonero; on Linux/OS X put the blockchain.bin in ~/.bitmonero.
  3. Now start up the bitmonerod.exe daemon and wallet; I use the batch file below to accomplish this. If you want this to auto-start when you boot up the PC, create a shortcut to the batch file and put it in C:\ProgramData\Microsoft\Windows\Start Menu\Programs\StartUp
@echo off
rem Start the daemon unless it is already running.
tasklist /FI "IMAGENAME eq bitmonerod.exe" 2>NUL | find /I /N "bitmonerod.exe">NUL
if not %ERRORLEVEL% == 0 (
  echo Starting node...
  start /MIN /LOW bitmonerod.exe
) else (
  echo Node already started.
)

rem Start the wallet unless it is already running; generate one on first run.
tasklist /FI "IMAGENAME eq simplewallet.exe" 2>NUL | find /I /N "simplewallet.exe">NUL
if not %ERRORLEVEL% == 0 (
  if exist wallet.bin.keys (
    echo Starting previous wallet...
    start simplewallet.exe --wallet wallet.bin
  ) else (
    echo Starting new wallet...
    start simplewallet.exe --generate-new-wallet wallet.bin
  )
) else (
  echo Wallet already started.
)
At this point, you should have a wallet, which really is just an EXE that talks to the daemon and can send certain commands. The daemon is in charge of downloading the blockchain and staying in sync with the network, while the wallet holds your coins and has a key (which is password protected, so you need to input the password each time you start the wallet).

To get mining, you need to know your Monero address, so go to the wallet window and type in "address" (without the quotes). Other commands you can use are available by typing "help", and thankfully there are only a few commands you need to know. Now take that address and go to one of the Monero pools -- I'd suggest as the best candidate, though you're free to use others. With that address in hand, and a pool (or two or three) selected, it's time to start mining.

For mining, you now have three options: CPU, AMD GPU, or NVIDIA GPU. The basic settings are similar for each -- you use the miner with the pool address and your wallet address. For CPUs, use the number of CPU cores you have (real cores without Hyper-Threading is usually best on Intel, while for AMD you can use all available cores). For AMD, there's no real configuration to speak of right now -- fire and forget, with the knowledge that you're donating 5% of coins to the developer (Claymore), at least until an open source version becomes available. For NVIDIA, you can just run with the defaults as well, but my whole reason for this post is to tell you not to do that! The default syntax is as follows:
ccminer.exe -l 8x64 -o stratum+tcp:// -u 48JM22E3ZfPSoFCukcizpSR2hCsBnAExT4ACvrpYx5czFgEyR12LWwK9JpgYRZKjsRHp8ynDcQegbhCspvjHd7gaL8qbzYy -p x
Okay, the addresses are really long, but the real item you want to pay attention to is the "-l [threads]x[blocks]" setting. Here's what I know: the default setting is 8x40, and on most NVIDIA GPUs that I've tried it absolutely sucks. That may be a bit harsh, but basically it's not doing any sort of tuning so you're going to get anywhere from decent to mediocre to terrible performance, but very likely not optimal performance. On one GPU (GTX 860M), the defaults gave me 50 H/s, but with fine tuning I got up to nearly four times that hash rate. So how do you tune the settings?

I suggest starting by trying to find a good thread setting; anything between 6 and 12 is potentially good, so start with 6x64 and then try 7x64, 8x64, etc. up to 12x64. Find whichever setting gives the best starting performance. Then start trying different values for the blocks setting. I started at 32 and then tried 48, 64, 80, 96, 112, and 128 -- so basically increments of 16. You'll likely find that many of these result in similar performance, but somewhere in that range you should see better results. Choose the two highest results and then try a blocks setting half-way between those, and continue narrowing things down until you find what appears to be an optimal setting. It doesn't need to be an even number either, so just give it a whirl. Note that the first score you get will probably be a bit lower than your average hash rate, but the gap is consistent: if you start at 150 H/s and things level off at 180 H/s, then a starting score of 180 H/s will likely reach an average speed of 210 H/s (give or take).
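If you'd rather script the search than do it by hand, here is a rough Python sketch of the procedure above. It assumes you supply a measure(threads, blocks) function that returns the hash rate for one "-l" setting; that function is hypothetical (for instance, you could run ccMiner with the setting and read off the reported H/s yourself), and the fake_measure curve below is made up purely to exercise the logic, not real benchmark data.

```python
from typing import Callable, Tuple


def tune_launch_config(
    measure: Callable[[int, int], float],
    threads_range=range(6, 13),
    coarse_blocks=(32, 48, 64, 80, 96, 112, 128),
) -> Tuple[int, int, float]:
    """Return (threads, blocks, hash_rate) for the best setting found."""
    # Step 1: sweep threads from 6 to 12 at a fixed block count of 64.
    best_threads = max(threads_range, key=lambda t: measure(t, 64))

    # Step 2: coarse sweep over the blocks setting in increments of 16.
    results = {b: measure(best_threads, b) for b in coarse_blocks}

    # Step 3: keep testing the midpoint between the two best block counts
    # until that midpoint has already been measured.
    while True:
        first, second = sorted(results, key=results.get, reverse=True)[:2]
        mid = (first + second) // 2
        if mid in results:
            break
        results[mid] = measure(best_threads, mid)

    best_blocks = max(results, key=results.get)
    return best_threads, best_blocks, results[best_blocks]


def fake_measure(threads: int, blocks: int) -> float:
    # Made-up curve that peaks at 8x72; a stand-in for a real benchmark.
    return 200.0 - 5 * abs(threads - 8) - 0.5 * abs(blocks - 72)


threads, blocks, rate = tune_launch_config(fake_measure)
print(f"-l {threads}x{blocks}")  # -l 8x72
```

Real hash rates are noisy and not necessarily single-peaked, so treat the result as a starting point and spot-check a few neighboring settings by hand.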

So far everything is simple enough, but there are a few final items to note. First, on most systems running the ccMiner for CryptoNote makes the system useless for doing much else -- it becomes extremely laggy. The exception is if you have a laptop with NVIDIA Optimus, as the Intel iGPU can still happily run the Windows code and stay responsive while your GPU gets pounded. But laptop GPUs are slower, so I don't really recommend that route -- I suppose a desktop with the display connected to the Intel port would also work similarly.

The second item to note is that on most systems I've tried, the NVIDIA drivers will "time out" and give you a crash message. This is Windows basically detecting that the drivers haven't responded properly for a while, and so it stops them and of course your mining quits as well. You can usually get around this via a registry edit, but in some cases even that may not work all the time so be prepared to fiddle around a bit. The registry hack is easy enough:

  1. Run "regedit.exe" from the Start Menu.
  2. Navigate to "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
  3. Right-click on the right panel and create a new 32-bit DWORD value.
  4. Name the value "TdrDelay" and assign it anywhere from 10 to 30 (decimal -- 0A to 1E hex). This is the number of seconds Windows waits before deciding the driver has hung.
  5. Reboot and you should be set.
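If you have several machines to set up, the same change can be packaged as a .reg file instead of clicking through regedit. The Python sketch below only builds the file's text; the function name tdr_delay_reg and the idea of saving the result as something like tdrdelay.reg are my own choices for illustration. Importing the file still needs administrator rights, and you still need to reboot afterwards.

```python
def tdr_delay_reg(seconds: int = 10) -> str:
    """Build .reg file text that sets the TdrDelay DWORD to `seconds`."""
    # Stick to the 10-30 range suggested in the steps above.
    if not 10 <= seconds <= 30:
        raise ValueError("expected a value between 10 and 30")
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        "[HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Control\\GraphicsDrivers]\r\n"
        # .reg files store DWORDs as eight hex digits, e.g. 10 -> 0000000a.
        f'"TdrDelay"=dword:{seconds:08x}\r\n'
    )


print(tdr_delay_reg(10))
```

Write the returned text to a file with a .reg extension and double-click it (or use regedit's File > Import) to apply it.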
I've found that not all systems give the same hash rates, even with the same clocks, and there's certainly more investigating to be done. Here's what I'm running so far along with the approximate hashing rate, if you're interested. This is listed in order of increasing performance.
  1. GT 750M: -l 8x64 = ~56 H/s
  2. GTX 860M: -l 8x32 = ~200 H/s
  3. GTX 870M: -l 8x63 = ~205 H/s
  4. GTX 780M: -l 7x71 = ~210 H/s
  5. GTX 780M #2: -l 8x72 = ~225 H/s
  6. GTX 880M: -l 8x64 = ~245 H/s
  7. GTX 770: -l 8x72 = ~280 H/s (probably can do better with more tuning)
  8. GTX 780: -l 8x96 = ~380 H/s
As you can see, there's no clear rhyme or reason to what settings will work best, so you'll need to use some trial and error to figure it out. There are also some real oddities, for example the GTX 780 is rocking along at 380 H/s but I can't get anywhere close to that with the GTX 770 -- in theory, the 780 is only about 25% faster, but here I'm seeing a 35% increase. Maxwell also does reasonably well on the laptop side of things, bringing in 200 H/s despite being in theory quite a bit slower than GTX 870M and 780M.
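For anyone who wants to check the 35% figure or compare their own cards, the arithmetic is just a ratio of hash rates. A minimal Python sketch using the two desktop numbers from the list above (the helper name percent_faster is mine):

```python
def percent_faster(fast_rate: float, slow_rate: float) -> float:
    """Return how much faster fast_rate is than slow_rate, in percent."""
    return (fast_rate / slow_rate - 1) * 100


# GTX 780 at ~380 H/s versus GTX 770 at ~280 H/s.
print(f"{percent_faster(380, 280):.1f}%")  # 35.7%
```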

What's interesting is that for desktop GPUs, most people are talking about getting around 260-280 H/s with the GTX 750 Ti, a Maxwell GPU with 640 cores running at 1020 MHz. The GTX 860M by comparison is the same 640 cores at a similar clock speed, but I can't get the same level of performance. The faster desktop VRAM (5.4GHz vs. 5GHz) might help some, but unless people are doing some decent overclocking to hit 260+ H/s I don't know why there's such a gap.

Obviously, TSIV and others are going to continue to improve performance over time, so I don't expect things to end at the above results, and I also suspect we'll see some auto-tuning implemented at some point. In the meantime, just know that for many of the GPUs I've played around with, specifying a launch configuration isn't just helpful, it's absolutely necessary.

Donations are welcome:
XMR: 48JM22E3ZfPSoFCukcizpSR2hCsBnAExT4ACvrpYx5czFgEyR12LWwK9JpgYRZKjsRHp8ynDcQegbhCspvjHd7gaL8qbzYy
BTC: 153qS9Ze32hnV3fwirZLWNka4wBAowc21E


1 comment:

  1. Thank you for this! getting just under 200 H/s on my 860M