1 2018-06-09 01:28:37	0|fanquake|Empact Am alive and well :p
 2 2018-06-09 05:36:42	0|murrayn|Is there a reason -O2 is specifically enabled in configure with --enable-debug? Should this not be -Og?
 3 2018-06-09 16:33:00	0|gmaxwell|sipa: linked on your SHANI pr is an implementation where someone else noticed the throughput/latency relationship that I noticed.. they also do a 4way and it's faster (by a small amount) than 2-way.
 4 2018-06-09 16:33:34	0|gmaxwell|they get 18% speedup for 2way over 1way, and 21% for 4-way over two-way.
 5 2018-06-09 16:34:06	0|gmaxwell|I'm not sure if that difference is even worth it, though perhaps throughput might increase for later cpus.
 6 2018-06-09 16:36:14	0|sipa|interesting, i'll try that too
 7 2018-06-09 16:36:38	0|gmaxwell|Their implementation might be interesting to look at to see if they had some smarter way of dealing with register pressure.
 8 2018-06-09 16:36:58	0|sipa|another remarkable thing i noticed: the speedup of 64-specialized shani over variable length shani was close to 2x
 9 2018-06-09 16:37:11	0|sipa|far higher than the ratio observed elsewhere
10 2018-06-09 16:37:25	0|sipa|gmaxwell: from what i can see it's just interleaving
11 2018-06-09 16:38:08	0|gmaxwell|(presumably register churn is why their attempt at 8-way was slower than 2/4 way)
12 2018-06-09 16:39:17	0|gmaxwell|sipa: The 64-specialized saves expander work, which I guess isn't as fast with shani?  or maybe it's just that shani is faster so calling overhead (which the specialized reduces) matters more?
13 2018-06-09 16:40:03	0|provoostenator|Memory management is a pain. I have a device with 1 GB RAM, trying to squeeze as much as possible out of it during IBD. Without swap, if I set it slightly too high, it crashes when dbcache gets too large. With swap, it starts using the swap, which presumably defeats the purpose. Is there any way to _have_ swap but prevent dbcache from using it?
14 2018-06-09 16:40:57	0|gmaxwell|provoostenator: I doubt swapping is actually defeating the purpose, at least if it isn't doing it heavily.
15 2018-06-09 16:41:16	0|gmaxwell|The data that gets swapped is infrequently used stuff first...
16 2018-06-09 16:44:06	0|sipa|gmaxwell: SHANI has special instructions both for expansion and transform
17 2018-06-09 16:44:10	0|provoostenator|It indeed didn't seem very slow, so maybe it's not too bad in practice then. 450 MB dbcache (with maxmempool=5) seems about the max without swap.
18 2018-06-09 17:08:34	0|sipa|gmaxwell: 4-way seems a bit slower here, but that may be due to less than perfectly interleaved code being emitted
19 2018-06-09 21:03:44	0|provoostenator|I have a new theory as to why my aggressive pruning IBD branch is _slower_ than master. Namely that dirty CCoinsCacheEntry read/write doesn't perform well for very large cache sizes. See also https://github.com/bitcoin/bitcoin/pull/12404#issuecomment-395998702
20 2018-06-09 21:03:55	0|provoostenator|(theory, still have to measure this)
21 2018-06-09 21:26:20	0|phantomcircuit|provoostenator, aggressive pruning?
22 2018-06-09 21:26:37	0|sipa|phantomcircuit: #12404
23 2018-06-09 21:26:39	0|gribble|https://github.com/bitcoin/bitcoin/issues/12404 | Prune more aggressively during IBD by Sjors · Pull Request #12404 · bitcoin/bitcoin · GitHub
24 2018-06-09 21:27:43	0|phantomcircuit|oh
25 2018-06-09 21:33:29	0|phantomcircuit|sipa, does flushing the cache still remove everything?
26 2018-06-09 21:33:51	0|sipa|yes
27 2018-06-09 21:36:30	0|phantomcircuit|sipa, and there's no way to flush "upto block x" right?
28 2018-06-09 21:37:50	0|sipa|phantomcircuit: indeed, because there may have been entries created before x, but spent after x, which wouldn't be present on disk
29 2018-06-09 21:38:29	0|sipa|it is possible with the non-atomic flushing since 0.15 (which writes to disk a range of blocks rather than a single up-to-x point)
30 2018-06-09 21:38:40	0|sipa|though it's pretty complicated to reason about
31 2018-06-09 21:58:16	0|phantomcircuit|sipa, so to enable that you'd need to keep around entries that are a record of an entry being deleted?
32 2018-06-09 22:00:28	0|sipa|phantomcircuit: you actually don't
33 2018-06-09 22:01:18	0|sipa|you just need to accurately keep track of (a) the block up to which you've flushed everything and (b) the block up to which effects may be present on disk, and at startup replay the blocks' UTXO effects between those 2
34 2018-06-09 22:01:25	0|sipa|that's already implemented even
35 2018-06-09 22:01:44	0|sipa|however, once you introduce partial flushing during reorgs which may overlap etc... it becomes far more complicated
36 2018-06-09 22:02:26	0|phantomcircuit|yeah wasn't thinking about reorgs
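[editor's note] The two-marker recovery sipa describes above can be illustrated with a toy model. This is a minimal sketch, not Bitcoin Core's code: the names `apply_block`, `replay`, `flushed_upto` and `maybe_upto` are hypothetical, blocks are reduced to ordered lists of add/spend operations, and reorgs (the hard part mentioned above) are ignored entirely. The point it shows is that if every operation is idempotent, replaying the blocks between the two markers repairs a partially written on-disk set.

```python
def apply_block(utxos, block):
    """Apply one block's UTXO effects in order; every step is idempotent."""
    for op in block:
        if op[0] == "add":
            _, outpoint, value = op
            utxos[outpoint] = value      # rewriting the same value is harmless
        else:  # "spend"
            _, outpoint = op
            utxos.pop(outpoint, None)    # deleting an absent entry is harmless


def replay(disk_utxos, blocks, flushed_upto, maybe_upto):
    """Replay blocks in (flushed_upto, maybe_upto] on top of the disk state.

    flushed_upto: height up to which everything is known to be on disk.
    maybe_upto:   height up to which effects MAY already be on disk.
    """
    for height in range(flushed_upto + 1, maybe_upto + 1):
        apply_block(disk_utxos, blocks[height])
    return disk_utxos


# Hypothetical three-block chain, keyed by height.
blocks = {
    1: [("add", "a", 1), ("add", "b", 2)],
    2: [("spend", "a"), ("add", "c", 3)],
    3: [("spend", "c"), ("add", "d", 4)],
}

# Reference result: apply every block to an empty set.
expected = {}
for h in sorted(blocks):
    apply_block(expected, blocks[h])

# Simulated crash: block 1 was fully flushed, then a flush covering blocks
# 2..3 wrote "d" but died before deleting "a".
disk = {"a": 1, "b": 2, "d": 4}
recovered = replay(disk, blocks, flushed_upto=1, maybe_upto=3)
print(recovered == expected)
```

Replaying works here because, for any entry touched in the replayed range, the last operation in block order decides its final state, and untouched entries were already correct on disk.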
37 2018-06-09 22:03:18	0|sipa|all of this is doable, and i think i know all the algorithms necessary to implement it
38 2018-06-09 22:03:56	0|sipa|with the goal of being able to have a background process that just periodically (and asynchronously) flushes the oldest dirty UTXO entries (and wipes the oldest non-dirty ones)
39 2018-06-09 22:04:18	0|sipa|but it's a pretty big amount of work without knowing if it'll actually speed things up :)
40 2018-06-09 22:05:08	0|phantomcircuit|sipa, i had a patch which did this, but broke consensus across reorgs
41 2018-06-09 22:05:13	0|phantomcircuit|it was a substantial speed up
42 2018-06-09 22:05:24	0|phantomcircuit|but that was a while ago, so possibly it wouldn't be as large anymore?
43 2018-06-09 22:05:45	0|sipa|since per-txout in 0.15 performance profiles of such things may have shifted drastically
44 2018-06-09 22:05:52	0|sipa|it could be less or more of a speedup now :)
45 2018-06-09 22:10:28	0|phantomcircuit|yeah
46 2018-06-09 22:10:38	0|phantomcircuit|iirc it was really simple to do
47 2018-06-09 23:38:51	0|phantomcircuit|sipa, the FRESH flag looks a bit confusing
48 2018-06-09 23:39:55	0|phantomcircuit|the idea is that if an entry is added and spent before a flush it's effectively a noop ?
49 2018-06-09 23:40:13	0|sipa|it just means "this entry does not exist in the parent cache, so if it is spent, we can just forget about it"
50 2018-06-09 23:40:45	0|sipa|phantomcircuit: it's *the* major performance gain our cache gives
51 2018-06-09 23:40:46	0|phantomcircuit|ok i get that
52 2018-06-09 23:41:06	0|phantomcircuit|yeah
53 2018-06-09 23:42:27	0|sipa|because it avoids entries ever hitting disk at all
54 2018-06-09 23:46:18	0|phantomcircuit|sipa, yup i definitely get it
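[editor's note] The FRESH optimisation discussed above can be illustrated with a toy cache layered over a dict that stands in for the on-disk set. This is a simplified sketch under stated assumptions, not the actual CCoinsViewCache logic: the class `ToyCache` and its methods are hypothetical, and it decides freshness by looking in the parent directly, which a real cache would avoid because that lookup is itself a disk read.

```python
DIRTY = 1  # entry differs from the parent and must be written on flush
FRESH = 2  # parent has no copy of this entry


class ToyCache:
    def __init__(self, parent):
        self.parent = parent        # dict standing in for the on-disk set
        self.entries = {}           # outpoint -> [value or None, flags]
        self.parent_writes = 0      # how many writes/deletes hit the parent

    def add(self, outpoint, value):
        flags = DIRTY
        if outpoint not in self.parent:
            flags |= FRESH          # parent never saw this entry
        self.entries[outpoint] = [value, flags]

    def spend(self, outpoint):
        entry = self.entries.get(outpoint)
        if entry is not None and entry[1] & FRESH:
            # Created and spent before any flush: just forget it.
            del self.entries[outpoint]
            return
        if entry is None and outpoint not in self.parent:
            raise KeyError(outpoint)
        # Parent has a copy, so keep a tombstone to delete it on flush.
        self.entries[outpoint] = [None, DIRTY]

    def flush(self):
        for outpoint, (value, flags) in self.entries.items():
            if not flags & DIRTY:
                continue
            if value is None:
                self.parent.pop(outpoint, None)
            else:
                self.parent[outpoint] = value
            self.parent_writes += 1
        self.entries.clear()        # flushing empties the whole cache


disk = {"old:0": 50}
cache = ToyCache(disk)
cache.add("tx1:0", 10)    # FRESH
cache.spend("tx1:0")      # dropped from the cache, never reaches disk
cache.add("tx2:0", 7)     # FRESH, survives until the flush
cache.spend("old:0")      # exists on disk, so a tombstone is needed
cache.flush()
print(disk, cache.parent_writes)
```

In this run only two operations reach the parent (writing `tx2:0` and deleting `old:0`); the short-lived `tx1:0` never touches it, which is the saving sipa describes.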