Ethash: how it works when mining Ethereum

Today, mining Ethereum on video cards is the norm, and so far miners have not made a strong move to running the Ethash mining algorithm on specialized hardware (for example, FPGAs and ASICs). There are ASICs on the network, but they do not give as big an advantage as they do on Bitcoin.

Many articles and forum posts explain this by saying that developing ASICs for Ethash is a memory problem.

Here we look at where Ethereum’s tight dependence on memory comes from and what the next generation of custom ETH mining devices might look like.

For a more technical, programmer-oriented explanation of the Ethereum mining algorithm called Ethash, please refer to the Ethash page in the Ethereum GitHub repository….

A quick explanation of Proof-of-Work

When mining with Proof-of-Work, miners look for a solution (a one-time number called a “nonce”) which, when hashed, gives an output value below a predefined target threshold.

Because the hash function of each currency is cryptographic, there is no way to reverse-engineer or back-calculate a nonce that satisfies the target threshold.

Instead, miners must “guess and verify” hashes as quickly as possible and hope to be the first miner in the entire cryptocurrency network to find a valid nonce. Whoever does so has found a new block.
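The “guess and verify” loop can be sketched in a few lines of Python. This is only a toy illustration: it uses SHA-256 from the standard library rather than any real coin's hash function, and the header bytes, the `TARGET` value and the helper names `pow_hash` and `find_nonce` are assumptions made up for the example.

```python
import hashlib


def pow_hash(header: bytes, nonce: int) -> int:
    """Hash the header together with a nonce and return the digest as an integer."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def find_nonce(header: bytes, target: int, max_tries: int = 1_000_000):
    """Try nonces 0, 1, 2, ... and return the first one whose hash is at or below the target."""
    for nonce in range(max_tries):
        if pow_hash(header, nonce) <= target:
            return nonce
    return None  # no valid nonce within max_tries


# Hypothetical easy target: on average about 1 in 2**12 hashes qualifies.
TARGET = 2 ** (256 - 12)
nonce = find_nonce(b"example block header", TARGET)
print("valid nonce:", nonce)
```

Lowering `TARGET` makes valid nonces rarer, which is how a network can make finding a block harder.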

How the Ethash algorithm works

DAG file

The Ethash algorithm relies on a pseudo-random ^(https://crypto-mining.club/redirect/https://en.wikipedia.org/wiki/Pseudorandomness) dataset initialized from the current length of the blockchain.

This dataset is called the DAG file and is regenerated every 30,000 blocks (roughly every 5 days). As of September 2019, the DAG is ~3.22 GB, and it will keep growing as the blockchain grows.
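As a rough sketch of how the DAG grows with block height, the snippet below derives the epoch from the 30,000-block interval mentioned above. The size constants (1 GiB initial size, 8 MiB growth per epoch) are assumptions taken from the commonly cited Ethash parameters, not from this article, the real dataset is trimmed slightly below this estimate, and the block height is a hypothetical value for around September 2019.

```python
EPOCH_LENGTH = 30_000            # blocks per DAG epoch (from the text above)
DATASET_BYTES_INIT = 2 ** 30     # assumption: 1 GiB initial DAG size
DATASET_BYTES_GROWTH = 2 ** 23   # assumption: 8 MiB of growth per epoch


def epoch_of(block_number: int) -> int:
    """Return the DAG epoch that a given block belongs to."""
    return block_number // EPOCH_LENGTH


def approx_dag_bytes(block_number: int) -> int:
    """Estimate the DAG size in bytes (slightly above the real, trimmed size)."""
    return DATASET_BYTES_INIT + DATASET_BYTES_GROWTH * epoch_of(block_number)


block = 8_500_000  # hypothetical block height around September 2019
size_gib = approx_dag_bytes(block) / 2 ** 30
print(f"epoch {epoch_of(block)}, DAG of about {size_gib:.2f} GiB")
```

Under these assumptions the estimate lands close to the ~3.22 GB figure quoted above.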

The details of DAG generation are not important for this article, but you can read more about it here ^(https://crypto-mining.club/redirect/http://ethereum.stackexchange.com/questions/1993/what-actually-is-a-dag).

The flow of the Ethash hashing algorithm can be summarized as follows:

Figure 1. Working principle of the Ethash hashing algorithm ^(https://crypto-mining.club/redirect/https://bytwork.com/sites/default/files/inline/images/ethash_algorithm.png)

  1. The Preprocessed Header (derived from the latest block) and the Current Nonce (the current one-time number ^(https://crypto-mining.club/redirect/https://ru.wikipedia.org/wiki/Nonce)) are combined using a SHA-3-like algorithm to create the initial 128-byte mix, called Mix 0 here.
  2. The mix is used to compute which 128-byte page of the DAG to fetch, represented by the “Get DAG Page” block.
  3. The mix is combined with the fetched DAG page. This is done with an Ethereum-specific mixing function that generates the next mix, called Mix 1 here.
  4. Steps 2 and 3 are repeated 64 times, resulting in Mix 64.
  5. Mix 64 is post-processed to produce a shorter, 32-byte Mix Digest.
  6. The Mix Digest is compared with the predefined 32-byte Target Threshold. If the Mix Digest is less than or equal to the Target Threshold, the Current Nonce is considered successful and is broadcast to the Ethereum network. Otherwise, the Current Nonce is considered invalid, and the algorithm is rerun with a different nonce (either by incrementing the current nonce or by picking a new one at random).
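The six steps above can be sketched as a toy Python program. This is not the real Ethash implementation: the actual algorithm uses its own hash variants and mixing function, which are replaced here by SHAKE-256 and SHA3-256 from the standard library, and the DAG is a tiny stand-in of 1,024 pages. All function names, the seed, the header bytes and the target are assumptions made for the example.

```python
import hashlib

PAGE_BYTES = 128   # size of one DAG page (from the steps above)
MIX_ROUNDS = 64    # number of DAG fetches per hash (from the steps above)


def make_toy_dag(num_pages: int, seed: bytes = b"toy-seed") -> list:
    """Build a small pseudo-random stand-in for the DAG as a list of 128-byte pages."""
    return [
        hashlib.shake_256(seed + i.to_bytes(4, "big")).digest(PAGE_BYTES)
        for i in range(num_pages)
    ]


def toy_ethash(header: bytes, nonce: int, dag: list) -> int:
    """Return the toy mix digest for (header, nonce) as an integer."""
    # Step 1: header + nonce -> initial 128-byte mix (Mix 0).
    mix = hashlib.shake_256(header + nonce.to_bytes(8, "big")).digest(PAGE_BYTES)
    for _ in range(MIX_ROUNDS):
        # Step 2: the current mix decides which DAG page to fetch.
        index = int.from_bytes(mix[:4], "big") % len(dag)
        page = dag[index]
        # Step 3: combine mix and page (stand-in for the real mixing function).
        mix = hashlib.shake_256(mix + page).digest(PAGE_BYTES)
    # Step 5: compress the final mix into a 32-byte digest.
    return int.from_bytes(hashlib.sha3_256(mix).digest(), "big")


def mine(header: bytes, dag: list, target: int, max_tries: int = 100_000):
    """Step 6: increment the nonce until the digest is at or below the target."""
    for nonce in range(max_tries):
        if toy_ethash(header, nonce, dag) <= target:
            return nonce
    return None


dag = make_toy_dag(1024)
target = 2 ** (256 - 8)  # hypothetical easy target
nonce = mine(b"toy header", dag, target)
print("valid nonce:", nonce)
```

Note that each round's page index depends on the previous mix, so the 64 fetches for one nonce cannot be issued ahead of time.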

Why is Ethash tied to memory?

Each mixing operation requires a 128-byte read from the DAG (see Figure 1, step 2).

Hashing a single nonce requires 64 mixes, resulting in (128 bytes x 64) = 8 KB of memory reads. The reads are random access (each 128-byte page is selected pseudo-randomly by the mixing function), so keeping a small fragment of the DAG in the L1 or L2 cache will not help much, since the next DAG fetch is very likely to result in a cache miss.
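To put a rough number on “very likely”, the sketch below estimates the chance that a randomly chosen DAG page is already in cache. It assumes page selection is uniform over the whole DAG and uses a hypothetical 4 MiB cache; the DAG size is the ~3.22 GB figure quoted earlier.

```python
PAGE_BYTES = 128             # bytes read per mix (from the text above)
FETCHES_PER_HASH = 64        # mixes per nonce (from the text above)
DAG_BYTES = 3.22e9           # DAG size quoted earlier in the article
CACHE_BYTES = 4 * 2 ** 20    # assumption: a hypothetical 4 MiB cache

bytes_per_hash = PAGE_BYTES * FETCHES_PER_HASH   # 8,192 bytes, i.e. 8 KB
dag_pages = DAG_BYTES / PAGE_BYTES               # number of pages in the DAG
cache_pages = CACHE_BYTES / PAGE_BYTES           # pages that fit in the cache

# With uniform random selection, the hit chance is the cached share of the DAG.
hit_probability = cache_pages / dag_pages
expected_misses = FETCHES_PER_HASH * (1 - hit_probability)

print(f"bytes per hash: {bytes_per_hash}")
print(f"cache hit probability: {hit_probability:.4%}")
print(f"expected misses per hash: {expected_misses:.2f} of {FETCHES_PER_HASH}")
```

Under these assumptions the cache holds only a fraction of a percent of the DAG, so nearly all 64 fetches go to main memory.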

Since fetching DAG pages from memory is much slower than the mix computation, we will see almost no performance improvement from speeding up the mix computation.

The best way to speed up the Ethash hashing algorithm is to speed up fetching 128-byte DAG pages from memory.

Thus, we consider the Ethash algorithm to be memory-hard, or memory-bound ^(https://crypto-mining.club/redirect/https://en.wikipedia.org/wiki/Memory_bound_function), since the system’s memory bandwidth limits our performance.

Reaching memory bandwidth limit in real hardware

As an example of how memory bandwidth limitations affect real hardware, let’s take a closer look at the mining performance of a commonly used video card: RX 590.

Image: XFX Radeon RX 590 Fatboy (box) ^(https://crypto-mining.club/redirect/https://bytwork.com/sites/default/files/inline/images/xfx_radeon_rx_590_fatboy_korobka.jpg)

If Ethash hashing really is memory-bound, we expect the actual mining speed of this hardware to be very close to the maximum theoretical hashrate, assuming that fetching DAG pages is the only step performed.

We can calculate this maximum theoretical hashrate as follows:

(Memory bandwidth) / (DAG memory fetched per hash) = maximum theoretical hashrate

(256 gigabytes / sec) / (8 kilobytes / hash) = 32 megahashes / sec.

The measured hashrate of the RX 590 in actual operation is ~31 megahashes / sec.

This small gap can easily be explained by memory latency or by other quick operations in the system. Thus, the performance of this video card is what we would expect if Ethash hashing is memory-bound and fetching DAG pages is the rate-limiting step.
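The same calculation can be written as a small Python helper. The function name `max_hashrate` is an assumption for this example. Treating 8 KB as 8,000 bytes reproduces the round 32 megahashes / sec figure, while counting it as exactly 128 x 64 = 8,192 bytes gives a slightly lower bound.

```python
def max_hashrate(bandwidth_bytes_per_s: float, bytes_per_hash: float) -> float:
    """Upper bound on hashes per second if DAG fetches are the only cost."""
    return bandwidth_bytes_per_s / bytes_per_hash


RX590_BANDWIDTH = 256e9   # 256 GB/s, as quoted in the article
MEASURED = 31e6           # ~31 megahashes / sec, as quoted in the article

rounded_bound = max_hashrate(RX590_BANDWIDTH, 8_000)     # 8 KB taken as 8,000 bytes
exact_bound = max_hashrate(RX590_BANDWIDTH, 128 * 64)    # 8 KB taken as 8,192 bytes
utilisation = MEASURED / exact_bound                     # share of the bound achieved

print(f"rounded bound: {rounded_bound / 1e6:.2f} MH/s")
print(f"exact bound:   {exact_bound / 1e6:.2f} MH/s")
print(f"measured / exact bound: {utilisation:.1%}")
```

Either way, the measured rate sits within a few percent of the bandwidth limit.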

Beating video cards: the next generation of ETH mining devices

The only way custom Ethereum mining hardware can be worthwhile is if it provides memory bandwidth more cheaply or more energy-efficiently (fewer $ / (GB / s) or fewer W / (GB / s)).

Option 1: High Memory Bandwidth FPGA / ASICs

Looking at the RX 590, a bit of arithmetic ($245 per card / (256 GB / s)) shows that its memory bandwidth costs about $0.96 / (GB / s).

By comparison, a single GDDR5 chip (e.g. Micron EDW4032BABG ^(https://crypto-mining.club/redirect/http://www.digikey.com/product-detail/en/micron-technology-inc/EDW4032BABG-60-F-R-TR/EDW4032BABG-60-F-R-TR-ND/6136217)), which costs $6.83 and has a bandwidth of 24 GB / s, does better: $0.28 / (GB / s).

Thus, if we can create our own chip (either an ASIC or an FPGA) that interfaces with 9 GDDR5 chips, we will have 216 GB / s of memory bandwidth at a memory cost of $61.47.

However, this is not a complete device, since we still need the FPGA or ASIC that acts as the memory controller, a printed circuit board and supporting electronics.

If the finished, shipped assembly (including additional parts, manufacturing, testing and logistics) costs less than the RX 590 (only $245), then the custom board will beat the video card.
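The cost comparison can be reproduced with a few lines of Python using the prices and bandwidths quoted above. The last two values are one possible reading of the break-even point and are an assumption of this sketch: since the 9-chip board offers 216 GB / s rather than 256 GB / s, they show how much could be spent on everything except memory while matching the RX 590 in $ / (GB / s).

```python
def dollars_per_gbps(price_usd: float, bandwidth_gb_s: float) -> float:
    """Cost of memory bandwidth in $ per (GB/s)."""
    return price_usd / bandwidth_gb_s


RX590_PRICE, RX590_BW = 245.0, 256.0   # figures quoted in the article
CHIP_PRICE, CHIP_BW = 6.83, 24.0       # single GDDR5 chip, as quoted
NUM_CHIPS = 9

rx590_rate = dollars_per_gbps(RX590_PRICE, RX590_BW)   # about 0.96
chip_rate = dollars_per_gbps(CHIP_PRICE, CHIP_BW)      # about 0.28

board_bw = NUM_CHIPS * CHIP_BW          # total bandwidth of the custom board
memory_cost = NUM_CHIPS * CHIP_PRICE    # cost of the memory chips alone

# Assumption: break-even means matching the RX 590's $ / (GB/s) at 216 GB/s.
break_even_total = rx590_rate * board_bw
budget_for_rest = break_even_total - memory_cost

print(f"RX 590: ${rx590_rate:.2f} per GB/s, GDDR5 chip: ${chip_rate:.2f} per GB/s")
print(f"custom board: {board_bw:.0f} GB/s for ${memory_cost:.2f} of memory")
print(f"budget for controller, PCB and the rest: ${budget_for_rest:.2f}")
```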

That is, until a faster, more efficient and cheaper video card appears on the market.

For example, graphics cards with HBM ^(https://crypto-mining.club/redirect/https://en.wikipedia.org/wiki/High_Bandwidth_Memory) are already available. But if you can find inexpensive off-the-shelf FPGA or ASIC chips with 5-10 DDR or HBM memory controllers, or your company has experience building custom ASICs with high memory bandwidth, you could build custom mining hardware of this kind.

However, in that situation you should probably change your business model and make video cards instead, since that is already a huge market.

Option 2: Use Next Generation Mobile Chipsets

As the use of smartphones and mobile 3D graphics grows, we will see more mobile-friendly chips with high memory bandwidth.

These could be mobile systems-on-a-chip with an integrated graphics processor (for example, NVidia Tegra X1 ^(https://crypto-mining.club/redirect/http://www.nvidia.com/object/tegra-x1-processor.html)), standalone mobile graphics processors (for example, PowerVR Series 8XE ^(https://crypto-mining.club/redirect/https://www.imgtec.com/powervr/graphics/series8xe-plus/)), or specialized high-bandwidth or neural-network-oriented processors with integrated memory (for example, Movidius Myriad 2 ^(https://crypto-mining.club/redirect/https://www.movidius.com/solutions/vision-processing-unit)).

These classes of devices will continue to evolve, and if cost, power and memory bandwidth reach the right point, we may well see custom Ethereum miners with 10-20 mobile graphics processors or VPUs on a single board.

Conclusions

The sequential DAG page fetches in the Ethash hashing algorithm hit the memory bandwidth limits of modern hardware.

This caps the theoretical maximum hashrate of that hardware.

What will future Ethereum miners look like? They probably won’t be based on ASICs or FPGAs. Most likely, they will be built from off-the-shelf chips (mobile GPUs or VPUs), because those are better tuned for memory bandwidth, and they will not use the traditional video card form factor that we are so used to seeing in modern computers.

This article is about the Ethash protocol, based on Proof-of-Work, which is used to mine Ethereum. In systems based on Proof-of-Work, like this one, miners perform significant amounts of computation to find new blocks, and receive cash rewards.

As soon as the Ethereum network switches to the Proof-of-Stake ^(https://crypto-mining.club/redirect/https://bytwork.com/articles/chto-takoe-pow-i-pos-i-chem-oni-otlichayutsya#sect2) system (presumably after 2020 with the Ethereum 2.0 or Serenity phase ^(https://crypto-mining.club/redirect/https://bytwork.com/events/zapusk-ethereum-20)), cash rewards will be given to Ethereum currency holders, not miners, which is likely to make Ethereum mining obsolete.

It is not yet clear when this transition will occur, but the first phase is expected to launch on January 3, 2020.
