I’ve been reading up on Ethereum for the last couple of days. Apparently, the initial sync is one of the major issues people run into (at least with geth). That includes me. I first tried syncing on an HDD, and that didn’t work. I then used a mediocre machine with an SSD, but it kept on running with no apparent end in sight. So I decided to use a ridiculously large machine on Azure and sync there. Turns out that with this machine I was able to do a --fast sync in a little under 8 hours.
Specs
I used an Azure Standard_L16s storage optimized VM. This beast has 16 cores, 128 gigs of memory, and 80,000 IOPS and 800 MBps throughput on its temporary storage disk. Ought to be enough, you’d say. I started geth with: ./geth --maxpeers 25 --cache 64000 --verbosity 4 >> geth.log 2>&1
Overview
Azure VM Instance | Standard_L16s |
OS | Ubuntu 16.04.4 LTS |
CPU | 16 cores |
Memory | 128GB |
Disk IOPS (spec) | 80,000 |
Disk throughput (spec) | 800 MBps |
Geth version | geth-linux-amd64-1.8.3-329ac18e |
Geth maxpeers | 25 |
Geth cache | 64,000MB |
Results
Start time | 2 Apr 2018 20:46:43 UTC |
End time * | 3 Apr 2018 04:27:15 UTC |
Total duration | 7h 40m 32s |
Imported blocks at catch up time | 5,369,956 |
Blocks caught up | 3 Apr 2018 00:11:08 UTC (3h 24m 25s) |
Total imported state trie entries | 114,566,252 |
State caught up | 3 Apr 2018 04:24:07 UTC (7h 37m 24s) |
du -s ~/.ethereum | 77,948,852 (1 KiB blocks, about 74 GiB) |
* End time is defined as the first single-block “Imported new chain segment” log message.
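For anyone who wants to apply the same end-time definition to their own log, here is a minimal Python sketch. It assumes geth 1.8-style log lines with a `[MM-DD|HH:MM:SS]` timestamp and a `blocks=N` field; the sample lines are hypothetical and only illustrate that assumed layout, so adjust the patterns if your log looks different.

```python
import re
from typing import Iterable, Optional

# Assumed log layout (not verified against every geth version), e.g.:
#   INFO [04-03|04:27:15] Imported new chain segment  blocks=1 txs=97 ...
TIMESTAMP = re.compile(r"\[(\d{2}-\d{2})\|(\d{2}:\d{2}:\d{2})(?:\.\d+)?\]")
BLOCKS = re.compile(r"\bblocks=(\d+)\b")


def first_single_block_import(lines: Iterable[str]) -> Optional[str]:
    """Return 'MM-DD HH:MM:SS' of the first single-block import, or None."""
    for line in lines:
        if "Imported new chain segment" not in line:
            continue
        blocks = BLOCKS.search(line)
        stamp = TIMESTAMP.search(line)
        if blocks and stamp and int(blocks.group(1)) == 1:
            return f"{stamp.group(1)} {stamp.group(2)}"
    return None


# Hypothetical sample lines, made up to match the assumed layout.
sample_lines = [
    "INFO [04-03|04:26:50] Imported new chain segment  blocks=12 txs=1500",
    "INFO [04-03|04:27:15] Imported new chain segment  blocks=1 txs=97",
    "INFO [04-03|04:27:30] Imported new chain segment  blocks=1 txs=80",
]
print(first_single_block_import(sample_lines))
```

To scan a real log, pass an open file instead of the sample list, for example `first_single_block_import(open("geth.log", errors="replace"))`.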
[Graphs not reproduced here: CPU/Load/Memory, Disk, Network, Peers, Blocks, State trie]
Notes
- Firewall needs to be open for port 30303 (I opened both UDP and TCP). Otherwise you won’t get enough peers.
- Syncing actually seems to take more time with more peers. I settled on the default of 25. With 100 peers it was much slower.
- Importing the chain segments did not take significant time, contrary to the comment in the GitHub issue.
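A quick way to check the firewall note is to test the TCP side of port 30303 from a machine outside the VM’s network while geth is running. The sketch below uses only Python’s standard library. It cannot check UDP, which is connectionless, so a successful TCP connect is only a partial check. The helper name and the address in the usage comment are placeholders.

```python
import socket


def tcp_port_open(host: str, port: int = 30303, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Usage (placeholder address, replace with your VM's public IP):
#   tcp_port_open("203.0.113.10")
```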
Conclusions
Disk IO is mostly used while fetching the blocks. After that, the system’s resources are barely used, which makes me think the bottleneck is the network. Even during block syncing, though, the resources are not maxed out, so the process is probably constrained by the network the entire time. I’m not familiar enough with Geth/Ethereum to say this for sure. As stated above, increasing the number of peers didn’t improve the situation; it made it worse.
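As a rough back-of-envelope check, the numbers from the Results table can be turned into average rates. This sketch assumes `du -s` reported 1 KiB blocks (the GNU default). It also uses the final on-disk size as a crude stand-in for the data written, which understates the real disk traffic, so treat the output as an order-of-magnitude figure only.

```python
from datetime import datetime, timezone

# Values copied from the Results table.
START = datetime(2018, 4, 2, 20, 46, 43, tzinfo=timezone.utc)
BLOCKS_DONE = datetime(2018, 4, 3, 0, 11, 8, tzinfo=timezone.utc)
STATE_DONE = datetime(2018, 4, 3, 4, 24, 7, tzinfo=timezone.utc)
END = datetime(2018, 4, 3, 4, 27, 15, tzinfo=timezone.utc)

BLOCKS = 5_369_956
STATE_ENTRIES = 114_566_252
DU_KIB = 77_948_852    # assumption: `du -s` output is in 1 KiB blocks
DISK_SPEC_MB_S = 800   # spec throughput of the temporary disk


def seconds(a: datetime, b: datetime) -> float:
    """Elapsed seconds between two timestamps."""
    return (b - a).total_seconds()


blocks_per_s = BLOCKS / seconds(START, BLOCKS_DONE)
state_per_s = STATE_ENTRIES / seconds(START, STATE_DONE)
# Average growth of the data directory in MB/s (1 MB = 10**6 bytes).
disk_mb_per_s = DU_KIB * 1024 / 1e6 / seconds(START, END)

print(f"blocks/s (average):        {blocks_per_s:.0f}")
print(f"state entries/s (average): {state_per_s:.0f}")
print(f"disk growth MB/s:          {disk_mb_per_s:.2f}")
print(f"share of disk spec:        {disk_mb_per_s / DISK_SPEC_MB_S:.2%}")
```

Under these assumptions, the data directory grew at an average of only a few MB per second. That is a tiny fraction of the disk’s 800 MBps spec, which fits the impression that the disk was not the limiting factor.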
Hope this post was of some help. If you have results to share, please let me know.