Wednesday, September 17, 2014

BitFury Group's Upcoming ASIC: More Efficient SHA256 Hashing

Earlier this week, the BitFury Group issued a technology roadmap update for their ASICs. What's interesting about BitFury is that unlike many of their competitors, they've apparently put a lot more work into designing a custom ASIC. I'll get to what this means in a moment, but first we need to take a step back and discuss general microprocessor design principles.

Designing a CPU, GPU, SoC, ASIC, etc. can be done in many ways, but from a high level there are two general approaches. One is to use machine algorithms to optimize and lay out the transistors, and the other is to basically design the logic circuits and do the layout "by hand". There are pros and cons to either approach, of course.

The machine algorithms can do a great job of testing and validating the design and basically get you up and running a lot faster than if you were to have a human (many humans) perform the same work. What's more, there are many companies that now sell functional blocks of compute logic, and you can integrate these blocks into your chip a lot more easily if you let the machines do the legwork. The drawbacks to machine layouts are that they typically use more die area and tend to be less power efficient. These aren't inviolable rules, but that's the basic idea.

Doing the layout by hand is basically the reverse of the above: it can take much longer to complete the design, validation, testing, etc. However, a human can generally see the big picture better than a machine and so they can optimize better for die area as well as power. This can in turn lead to potentially higher performance, which can be very important for high performance (or low power) microprocessors.

The above is a high-level discussion of processor design, but one of the interesting ideas is the use of ready-made functional blocks. If you've ever wondered why it took a while to see the first SHA256 (Bitcoin) ASICs and then suddenly there was an explosion of competing designs from several companies, it's because once there was a tested and validated solution available, many other companies were able to license/buy the basic design and then just place more chips on a board to improve performance.

There are still multiple Bitcoin ASIC designs of course. The earliest ASICs were built on 90nm or even 130nm process technology (because it was cheap, mature, readily available, and easier to use), but as the competition heated up things shifted to newer and smaller process nodes. Today, the fastest and most efficient ASICs are manufactured on 28nm process technology, and 20nm designs will probably come out within the next year (the 20nm fabrication facilities are busy making things like Apple's A8 and the new Qualcomm Snapdragon cores, so they would cost a lot more to use). However, even 55nm ASICs can still be efficient enough to earn a profit -- or at least, the power cost of running them is lower than the value of BTC they generate.

BitFury is a prime example of this last case, as up until now they have been using 55nm process technology. The key to staying competitive even with an older process node is that BitFury uses their own custom logic (i.e. it's not licensed from another company), and they apparently put a bit more effort into optimizing for power and efficiency. Or more likely, they run at lower clocks and they're not really performance competitive right now -- the current BitFury ASICs can hit 3.5 TH/s at 2800W (give or take), but the power use is likely the limiting factor.

The latest announcement is basically BitFury Group saying that their 28nm custom logic ASIC is nearly ready. With the smaller process node and additional time spent optimizing for power efficiency, BitFury is claiming that they will have ASICs capable of running at 0.2 J/GH by the end of 2014, most likely late December. (A joule per gigahash is the same thing as a watt per GH/s, which I'll abbreviate as W/GH, so that's 0.2 W/GH.) They're also working on an even more efficient design that will use 0.1 J/GH (0.1 W/GH) in mid-2015. That doesn't really tell us a lot in a vacuum, though, so let's compare those power numbers with some existing ASICs.

BitFury's own 3500BF I mentioned above delivers 3500 GH/s at 2800W, so it's doing about 0.8 W/GH. It also uses 1320 BF864C55 chips, which can run at a voltage range of 0.5V to 1.2V depending on your desired efficiency, with 0.5 J/GH being the maximum efficiency while 3.8 GH/s is the maximum performance -- but you have to choose one or the other. (Ever wonder why you can overclock ASICs? It's because you're just trading efficiency for higher performance, so if you have an older ASIC that's pulling 420W at the wall and you drop the clocks 10%, you'll likely end up improving efficiency by more than 10%.)
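
To put a rough number on that clocks-versus-efficiency trade-off, here's a minimal sketch using the textbook CMOS dynamic power relation (power scales with voltage squared times clock). The model is my assumption, not something from BitFury's datasheet: it ignores leakage and fixed overhead like fans and power supply losses, and it assumes you can lower the voltage along with the clock. The function name is made up for illustration.

```python
def relative_energy_per_hash(clock_scale, voltage_scale):
    """Energy per hash relative to stock settings (1.0 = unchanged).

    Assumed model: dynamic power ~ V^2 * f and hash rate ~ f, so the
    clock term cancels and energy per hash follows V^2. Leakage and
    fixed overhead are ignored.
    """
    power_scale = voltage_scale ** 2 * clock_scale
    hashrate_scale = clock_scale
    return power_scale / hashrate_scale


if __name__ == "__main__":
    # Drop the clock 10% and (hypothetically) the voltage 10% with it.
    both = relative_energy_per_hash(0.9, 0.9)
    # Drop the clock 10% but leave the voltage alone.
    clock_only = relative_energy_per_hash(0.9, 1.0)
    print(f"clock and voltage down 10%: {both:.2f}x energy per hash")
    print(f"clock only down 10%: {clock_only:.2f}x energy per hash")
```

Under this simplified model the gain comes from the voltage drop that a lower clock makes possible; lowering the clock alone doesn't change the energy per hash.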

The KnC Neptune is currently targeting 3500 GH/s at 1950W, which works out to 0.56 W/GH, so it's more efficient than the 3500BF, but it's already on 28nm. Butterfly Labs' Monarch is more like existing chips, as it's capable of 700 GH/s at 490W (0.7 W/GH). Bitmain has their AntMiner S3 that's also around 0.78 W/GH, though the AntMiner S4 is "coming soon". Looking at a list of other ASICs, those figures are pretty similar to the other "state of the art" designs.
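
If you want to check the W/GH figures yourself, the math is just power divided by hash rate. Here's a small sketch using only the hash rate and power numbers quoted above; the function and variable names are my own.

```python
# Figures quoted in this post: (hash rate in GH/s, power in watts).
MINERS = {
    "BitFury 3500BF": (3500, 2800),
    "KnC Neptune": (3500, 1950),
    "Butterfly Labs Monarch": (700, 490),
}


def watts_per_gh(hash_rate_ghs, power_w):
    """Watts per GH/s, which is numerically the same as J/GH."""
    return power_w / hash_rate_ghs


def efficiency_table(miners):
    """Map each miner name to its W/GH figure, rounded to two decimals."""
    return {
        name: round(watts_per_gh(ghs, watts), 2)
        for name, (ghs, watts) in miners.items()
    }


if __name__ == "__main__":
    # Print from most efficient to least efficient.
    ranked = sorted(efficiency_table(MINERS).items(), key=lambda kv: kv[1])
    for name, eff in ranked:
        print(f"{name}: {eff:.2f} W/GH")
```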

Since I've been on a Hashlet kick, I might as well toss out the Hashlet Genesis as well. GAW isn't saying how much power the Genesis actually uses, but the cost to run it (hosting included) is $0.02 per 10 GH per day. Doing the math at $0.10 per kWh, that would mean GAW is basically charging you at a rate equivalent to roughly 0.83 W/GH.
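
Here's that conversion spelled out as a quick sketch. It assumes the $0.02 fee is charged per day (that's the only reading that produces the 0.83 W/GH figure) and that the entire fee is treated as if it were electricity; the function name is hypothetical.

```python
def fee_to_watts_per_gh(fee_usd_per_day, gh_covered, usd_per_kwh):
    """Convert a daily hosting fee into an equivalent W/GH figure.

    Assumes the whole fee is spent on electricity at the given price.
    """
    fee_per_gh_day = fee_usd_per_day / gh_covered    # USD per GH/s per day
    kwh_per_gh_day = fee_per_gh_day / usd_per_kwh    # kWh that fee would buy
    return kwh_per_gh_day * 1000 / 24                # average watts over 24 hours


if __name__ == "__main__":
    equivalent = fee_to_watts_per_gh(0.02, 10, 0.10)
    print(f"Equivalent draw: {equivalent:.2f} W/GH")
```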

Basically, if we look at most of the currently shipping Bitcoin ASICs, the best you might get out of them is 0.5 W/GH, so BitFury Group is claiming they will more than double the hashing efficiency, and by the middle of next year they'll double it again. It's not just about efficiency of course -- the initial price will also largely determine whether or not a new ASIC is worth buying, so keep that in mind.

As I've said in the past, the real money makers in the whole Bitcoin Gold Rush are the people selling the mining hardware -- or other services as the case may be. You can't expect them to give the stuff away, obviously, but they're taking a healthy profit in most cases and hitting ROI is sometimes difficult (especially when the manufacturers mine with the hardware for a month or two before shipping to customers). Hopefully the 28nm BitFury parts get to end customers sooner rather than later, as we're getting close to the point where many of the current ASICs are going to have to be retired. At $0.10 per kWh, anything at about 3 W/GH is now only breaking even on power (and anything worse is losing money), but really you'd want to be below 1.5 W/GH to keep mining viable -- and if you pay more for electricity (like $0.30 per kWh), you'd be breaking even at just 1 W/GH!
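
The break-even numbers scale inversely with your electricity price, which a short sketch makes clear. It assumes the 3 W/GH break-even point corresponds to $0.10 per kWh (the rate used elsewhere in this post) and backs out the implied mining revenue from that; the names are my own, and real revenue moves with difficulty and the BTC price.

```python
HOURS_PER_DAY = 24


def daily_power_cost(watts_per_gh, usd_per_kwh):
    """Electricity cost in USD per GH/s per day."""
    return watts_per_gh * HOURS_PER_DAY / 1000 * usd_per_kwh


def break_even_watts_per_gh(revenue_usd_per_gh_day, usd_per_kwh):
    """Efficiency at which the power cost exactly equals mining revenue."""
    return revenue_usd_per_gh_day * 1000 / (HOURS_PER_DAY * usd_per_kwh)


# Assumption: 3 W/GH breaks even at $0.10 per kWh, so revenue per GH/s
# per day equals the power cost at that point.
IMPLIED_REVENUE = daily_power_cost(3.0, 0.10)

if __name__ == "__main__":
    for price in (0.10, 0.20, 0.30):
        limit = break_even_watts_per_gh(IMPLIED_REVENUE, price)
        print(f"${price:.2f}/kWh: break even at {limit:.2f} W/GH")
```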

Anyway, for those thinking the current levels of efficiency were the end of the road for Bitcoin ASICs, there's still plenty of room left for optimizations. The first wave is now over (and probably the second and third as well), and the focus is now on refining designs rather than just getting them out the door. It's going to be interesting to see what sort of pricing we get on the next generation of ASICs, but it looks like we're still a few months away.
