1 2016-10-05 02:03:39	0|wumpus|MSG_FILTERED_WITNESS_BLOCK defined in protocol.h is neither used in our source code nor defined in any BIP, should be in BIP144 I guess?
  2 2016-10-05 02:18:25	0|wumpus|(looking up these numbers for documentation in protocol.h for #8880)
  3 2016-10-05 02:19:47	0|luke-jr|if it's not used in code, why define it at all?
  4 2016-10-05 02:20:36	0|wumpus|I'd say to reserve the bit pattern, though this should happen in the BIP too
  5 2016-10-05 02:21:08	0|luke-jr|ah, so we use the value, just via setting a bit instead of the enum key
  6 2016-10-05 02:21:46	0|wumpus|that's my assumption. I haven't checked
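For reference, the values under discussion look roughly like this in protocol.h of that era (a sketch; exact comments and ordering may differ). The MSG_WITNESS_* names are just the base types with a reserved flag bit OR'd in, which is the "setting a bit instead of the enum key" usage wumpus describes:

    const uint32_t MSG_WITNESS_FLAG = 1 << 30;   // reserved bit for witness variants

    enum GetDataMsg
    {
        UNDEFINED = 0,
        MSG_TX = 1,
        MSG_BLOCK = 2,
        // The following can only occur in getdata. Invs always use TX or BLOCK.
        MSG_FILTERED_BLOCK = 3,                               //!< Defined in BIP37
        MSG_CMPCT_BLOCK = 4,                                  //!< Defined in BIP152
        MSG_WITNESS_BLOCK = MSG_BLOCK | MSG_WITNESS_FLAG,     //!< Defined in BIP144
        MSG_WITNESS_TX = MSG_TX | MSG_WITNESS_FLAG,           //!< Defined in BIP144
        MSG_FILTERED_WITNESS_BLOCK = MSG_FILTERED_BLOCK | MSG_WITNESS_FLAG,
    };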
7 2016-10-05 02:36:29	0|GitHub60|[bitcoin] EvertonMelo opened pull request #8886: Update build-unix.md (master...patch-1) https://github.com/bitcoin/bitcoin/pull/8886
  8 2016-10-05 02:49:49	0|xinxi|sipa: are you there?
  9 2016-10-05 02:50:17	0|xinxi|I am wondering why Bitcoin Core does not store the blockchain in a distributed way to save space?
 10 2016-10-05 02:52:12	0|wumpus|there are some ideas for adding range negotiation to the protocol, so that nodes can store different ranges of blocks
 11 2016-10-05 02:52:17	0|wumpus|see e.g. last meeting notes
 12 2016-10-05 02:53:00	0|wumpus|the answer to 'why' is always: because no one has designed/implemented this yet
 13 2016-10-05 02:53:15	0|xinxi|So theoretically, it is possible.
 14 2016-10-05 02:53:16	0|wumpus|in any case if you are short in space you can use pruning
 15 2016-10-05 02:53:26	0|xinxi|And blocksize won't be a problem seriously.
 16 2016-10-05 02:53:50	0|wumpus|you're confounding issues
 17 2016-10-05 02:54:21	0|xinxi|Sure blocksize increase will cause other issues like network latency.
 18 2016-10-05 02:54:48	0|wumpus|increasing block size is a problem for utxo set growth, for the time it takes to initially sync, for bandwidth requirements of nodes and miners, for CPU validation overhead. Storing the block chain is the least of worries really
 19 2016-10-05 02:55:07	0|wumpus|anyhow move this to #bitcoin, this is off topic here
 20 2016-10-05 02:55:12	0|xinxi|OK
 21 2016-10-05 02:56:05	0|xinxi|Thank you.
 22 2016-10-05 02:59:51	0|xinxi|Another question directly about Bitcoin's current protocol.
 23 2016-10-05 03:00:18	0|xinxi|Is it possible to increase OP_RETURN's data size with a softfork?
 24 2016-10-05 03:00:35	0|wumpus|this topic is about development, discussion about changing the source code, use #bitcoin for general questions
 25 2016-10-05 03:00:40	0|wumpus|see topic ^^
26 2016-10-05 03:07:53	0|GitHub168|bitcoin/master eeeebdd MarcoFalke: [doc] Rework docs...
27 2016-10-05 03:07:53	0|GitHub168|[bitcoin] laanwj pushed 2 new commits to master: https://github.com/bitcoin/bitcoin/compare/d7615af34e8e...f92805025d5b
28 2016-10-05 03:07:54	0|GitHub168|bitcoin/master f928050 Wladimir J. van der Laan: Merge #8879: [doc] Rework docs...
29 2016-10-05 03:08:03	0|GitHub181|[bitcoin] laanwj closed pull request #8879: [doc] Rework docs (master...Mf1610-docCleanup) https://github.com/bitcoin/bitcoin/pull/8879
 30 2016-10-05 03:12:30	0|wumpus|sigh... https://github.com/bitcoin/bitcoin/pull/8886#discussion_r81892323
31 2016-10-05 03:15:20	0|GitHub173|[bitcoin] fanquake opened pull request #8887: [Doc] Improve GitHub issue template (master...link-stackexchange) https://github.com/bitcoin/bitcoin/pull/8887
32 2016-10-05 03:26:05	0|GitHub91|[bitcoin] theuni opened pull request #8888: CConnman: Remove global usage in wallet and some gui (master...no-global-connman) https://github.com/bitcoin/bitcoin/pull/8888
 33 2016-10-05 03:33:22	0|achow101|the values for MSG_WITNESS_BLOCK and MSG_WITNESS_TX should probably be defined in BIP144...
 34 2016-10-05 03:34:42	0|wumpus|good point achow101
 35 2016-10-05 04:00:00	0|luke-jr|wumpus: wtf @ 8886
36 2016-10-05 04:47:24	0|GitHub22|[bitcoin] luke-jr opened pull request #8889: Qt/ModalOverlay: Use theme tooltip colours (master...overlay_theme) https://github.com/bitcoin/bitcoin/pull/8889
 37 2016-10-05 04:55:29	0|NicolasDorier|sipa: about PrecomputedTransactionData, I noticed you calculate it even if the Transaction has no witness https://github.com/bitcoin/bitcoin/blob/d93f0c61843f9da2a662f54ea1794ae0d263b196/src/main.cpp#L2458 I don't know if it has big impact on performance during IBT. Just dropping it here in case you overlooked it
 38 2016-10-05 04:55:44	0|NicolasDorier|IBD
39 2016-10-05 05:07:20	0|GitHub117|[bitcoin] fanquake opened pull request #8890: [Doc] Update Doxygen configuration file (master...update-doxyfile-1-8-12) https://github.com/bitcoin/bitcoin/pull/8890
 40 2016-10-05 05:17:54	0|sipa|wumpus: i guess MSG_FILTERED_WITNESS_BLOCK is what we'd need for a bip37 extension with segwit data
 41 2016-10-05 05:18:35	0|sipa|NicolasDorier: yes, i'm aware
 42 2016-10-05 06:58:39	0|wumpus|sipa: makes sense
 43 2016-10-05 07:01:53	0|wumpus|luke-jr: I suspect it's a very young person and someone that doesn't know English very well, and he's just trying to be helpful. But indeed, wtf
 44 2016-10-05 07:11:32	0|wumpus|and indeed, in fact most of build-unix.md is about linux distros, and the only UNIX that gets mentioned separately has its own document (openbsd), but no, adding kernel versions will not clarify that in any way :)
 45 2016-10-05 08:14:52	0|dcousens|btcdrak: "uncompressed keys which are disallowed under segwit.", interesting,  no issue but can't find that in the BIP?
46 2016-10-05 08:16:38	0|btcdrak|dcousens: because it is too late to change the consensus layer without causing delays, so it is implemented as policy for now and considered for a future sf.
47 2016-10-05 08:16:56	0|btcdrak|see last meeting notes
 48 2016-10-05 08:17:00	0|dcousens|btcdrak: ok, sweet, will do
 49 2016-10-05 08:18:46	0|dcousens|btcdrak: seen in notes, cheers ;)
50 2016-10-05 08:22:02	0|btcdrak|I think we will make a note in the BIP now, thanks.
 51 2016-10-05 08:23:46	0|wumpus|maybe, though it should be very clear that it is policy and not consensus then
 52 2016-10-05 08:24:45	0|gmaxwell|My suggestion is that it should state that it's policy created with the intent of making it consensus in a future softfork.
 53 2016-10-05 08:30:15	0|wumpus|or already create a BIP for such a future softfork, although that may be premature if it's to be combined with other things that may still come to light using this in the wild
 54 2016-10-05 08:31:06	0|gmaxwell|Well we can do like we did with CSV, we had three BIPs triggered on one bit.
 55 2016-10-05 08:31:14	0|wumpus|indeed. true.
 56 2016-10-05 08:31:21	0|gmaxwell|The thing I'd combine it with would be low-S enforcement.
 57 2016-10-05 08:32:43	0|dcousens|gmaxwell: agreed
 58 2016-10-05 08:32:46	0|wumpus|yes that would make sense, it's a comparable thing
 59 2016-10-05 08:33:39	0|gmaxwell|(low-S doesn't have the blowing up anyone's pubkey risk, though, and already IsStandard for non-segwit, so no need to warn about it.)
 60 2016-10-05 08:34:34	0|dcousens|btcdrak: that was uncompressed rejection only in witnesses,  or even in regular scriptSigs?
 61 2016-10-05 08:35:51	0|luke-jr|indeed, node policy isn't something BIPs should cover, except in the case of it being preparation for a softfork
 62 2016-10-05 08:36:29	0|btcdrak|dcousens: only for segwit
 63 2016-10-05 08:36:31	0|dcousens|or I suppose the question is targeted at the idea in general,  would it be a soft-fork to reject all uncompressed keys? Or only uncompressed keys in witness scripts
 64 2016-10-05 08:36:34	0|luke-jr|dcousens: only in witnesses; it'd break compatibility otherwise
 65 2016-10-05 08:37:01	0|dcousens|ok
66 2016-10-05 08:52:14	0|gmaxwell|dcousens: It would be reasonable for a library to strongly discourage uncompressed keys in whatever ways it can do so without breaking things.
 67 2016-10-05 08:53:12	0|dcousens|gmaxwell: indeed,  defaults are almost all against it now anyway... but I think by the time that soft-fork came around, it'd be something I'd probably move into the "hard enough to do you have to do it manually" camp
 68 2016-10-05 08:53:24	0|dcousens|rather than just flick a switch
 69 2016-10-05 08:55:35	0|gmaxwell|we can't,  unfortunately, get rid of uncompressed keys for non-segwit, as that would confiscate coins. But for segwit we can so long as it's clearly non-kosher from the start.
70 2016-10-05 08:56:42	0|dcousens|gmaxwell: hmmm, I suppose that is risky,  but I can see some actors aggravating the issue for agendas
71 2016-10-05 08:58:16	0|GitHub50|[bitcoin] fanquake opened pull request #8891: [Doc] Update bips.md for Segregated Witness (master...segwit-bips-md) https://github.com/bitcoin/bitcoin/pull/8891
72 2016-10-05 08:58:57	0|gmaxwell|dcousens: changing things for existing keys is just completely impractical ... people with random uncompressed keys printed out... coins paid to them.. no idea that the keys are uncompressed.
 73 2016-10-05 08:59:49	0|dcousens|gmaxwell: I 100% agree its a good move,  I just wonder if its worth delaying instead of policy for a short time... probably already discussed in the meeting,  I'll look up the logs more in depth
 74 2016-10-05 09:00:03	0|dcousens|delaying segwit* (and then putting it in that SF)
 75 2016-10-05 09:00:38	0|gmaxwell|policy makes it safer.
76 2016-10-05 09:00:45	0|GitHub135|[bitcoin] laanwj opened pull request #8892: doc: Add build instructions for FreeBSD (master...2016_10_freebsd_build) https://github.com/bitcoin/bitcoin/pull/8892
77 2016-10-05 09:00:59	0|gmaxwell|even if it's outright barred in SW some genius might still manage to send infinity coins to an invalid script
 78 2016-10-05 09:01:17	0|gmaxwell|with it non-standard, they could coordinate with miners to rescue them.
79 2016-10-05 09:02:00	0|dcousens|gmaxwell: true, problematic from the start.  I suppose it's just one of those things that might need some big bold writing and dates
 80 2016-10-05 09:18:56	0|wumpus|no, it's not worth delaying segwit
 81 2016-10-05 09:19:31	0|wumpus|there will always be some last-minutes things that could have been slightly better, don't let them delay what is a great improvement
82 2016-10-05 10:36:14	0|GitHub188|[bitcoin] laanwj closed pull request #8886: Add Arch Linux instructions to build-unix.md (master...patch-1) https://github.com/bitcoin/bitcoin/pull/8886
 83 2016-10-05 11:46:26	0|wumpus|achow101: https://github.com/bitcoin/bips/pull/457
84 2016-10-05 12:44:32	0|GitHub79|[bitcoin] laanwj pushed 2 new commits to master: https://github.com/bitcoin/bitcoin/compare/f92805025d5b...223f4c2dd5fa
85 2016-10-05 12:44:33	0|GitHub79|bitcoin/master 223f4c2 Wladimir J. van der Laan: Merge #8884: Bugfix: Trivial: RPC: getblockchaininfo help: pruneheight is the lowest, not highest, block...
86 2016-10-05 12:44:33	0|GitHub79|bitcoin/master a78e542 Luke Dashjr: Bugfix: Trivial: RPC: getblockchaininfo help: pruneheight is the lowest, not highest, block
87 2016-10-05 12:44:47	0|GitHub78|[bitcoin] laanwj closed pull request #8884: Bugfix: Trivial: RPC: getblockchaininfo help: pruneheight is the lowest, not highest, block (master...trivial_pruneheight_help) https://github.com/bitcoin/bitcoin/pull/8884
 88 2016-10-05 15:54:22	0|Lauda|Any estimate for 0.13.1 yet?
 89 2016-10-05 15:56:31	0|sipa|soon.
 90 2016-10-05 15:56:46	0|instagibbs|Two Weeks(TM)
 91 2016-10-05 15:57:23	0|sipa|TW
 92 2016-10-05 15:57:31	0|sipa|not TM, i hope
 93 2016-10-05 15:57:32	0|Lauda|I dislike the word soon as it can mean anywhere between now and sometime in the distant future. :P
94 2016-10-05 15:57:45	0|sipa|Lauda: that is exactly what it means
 95 2016-10-05 15:57:57	0|sipa|there are very few remaining issues
 96 2016-10-05 15:58:28	0|achow101|Lauda: as soon as everything in the milestone is taken care of
 97 2016-10-05 15:58:48	0|Lauda|tl;dr: Soon.
 98 2016-10-05 15:59:04	0|Lauda|I've checked the milestone not long ago.
 99 2016-10-05 15:59:16	0|achow101|they keep adding new stuff
100 2016-10-05 16:16:26	0|cdecker|Soon is still better than 'eventually' :-)
101 2016-10-05 16:34:37	0|instagibbs|achow101, people can ask for milestone tags, may not happen tho
102 2016-10-05 16:34:44	0|instagibbs|a number have been pruned off already
103 2016-10-05 18:01:54	0|cfields_|jonasschnelli: ping
104 2016-10-05 18:31:31	0|GitHub16|[bitcoin] jnewbery opened pull request #8894: [Testing] Include fRelay in mininode version messages (master...mininode-relay-field) https://github.com/bitcoin/bitcoin/pull/8894
105 2016-10-05 21:51:32	0|GitHub134|[bitcoin] JeremyRubin opened pull request #8895: Better SigCache Implementation (master...cuckoocache-pull-request) https://github.com/bitcoin/bitcoin/pull/8895
106 2016-10-05 21:52:41	0|jeremyrubin|\me \o/
107 2016-10-05 21:53:57	0|sipa|this is not \LaTeX
108 2016-10-05 21:54:34	0|gmaxwell|oh good. someone implemented a Cuckoo Hash Table... now we need a lossy version and can get rid of those cache-destroying huge bloom filters we're using all over the place.
109 2016-10-05 21:55:40	0|jeremyrubin|sipa: latex irc plugin would be dope
110 2016-10-05 21:56:05	0|sipa|jeremyrubin:
111 2016-10-05 21:56:11	0|jeremyrubin|$\mu$'s everywhere
112 2016-10-05 21:56:32	0|sipa|ỉţ ẅờữľđ çëřŧẩîńļŷ șïḿpľìƒỹ ẁřîŧīňğ ťẽxŧ ŵíţĥ ĺốţš ôƒ ďīäćŗìťĩčș
113 2016-10-05 21:56:56	0|jeremyrubin|gmaxwell: this is a lossy version
114 2016-10-05 21:58:28	0|gmaxwell|whats lossy about it? looks like if there is a collision you ripple, like an ordinary cuckoo hash.
115 2016-10-05 21:58:44	0|jeremyrubin|gmaxwell: once depth is reached, last thing touched goes bye-bye
116 2016-10-05 21:58:56	0|jeremyrubin|gmaxwell: no rehashing
117 2016-10-05 21:59:32	0|gmaxwell|Did you test if that optimization mattered? I think if it makes a difference at all, your table is over populated and the performance will be poor.
118 2016-10-05 21:59:48	0|bsm117532|https://sourceforge.net/projects/pidgin-latex/
119 2016-10-05 22:00:05	0|gmaxwell|(because if it makes a difference it means you are hitting the limit often, and if you're doing that you're probably doing it for almost every insert.)
120 2016-10-05 22:01:58	0|jeremyrubin|gmaxwell_: I tested many many optimizations... not that one because I have a fixed size table so rehashing can't help
121 2016-10-05 22:02:20	0|jeremyrubin|gmaxwell: ^^^
122 2016-10-05 22:02:33	0|gmaxwell|jeremyrubin: what you need to do if you hit the maximum then is go delete entries at random.
123 2016-10-05 22:02:55	0|jeremyrubin|gmaxwell: I did however test with more hashes per entry (10)
124 2016-10-05 22:03:06	0|kanzure|is it correct that 8393 should be marked "needs backport" ? https://github.com/bitcoin/bitcoin/pull/8393#issuecomment-238783829
125 2016-10-05 22:03:19	0|gmaxwell|If you look at a cuckoo hash's expected number of accesses as a function of table fullness, you get this behavior where it's basically 2 until the table is half full and then it shoots up like a rocket.
126 2016-10-05 22:03:30	0|jeremyrubin|Yeah
127 2016-10-05 22:03:41	0|jeremyrubin|so fortunately insert performance isn't a large priority
128 2016-10-05 22:03:50	0|sipa|with 3 hashes you can get to 90% full :)
129 2016-10-05 22:03:51	0|gmaxwell|So your depth limit means that you'll never hit it until you're too full, then you'll often hit it.
130 2016-10-05 22:04:17	0|jeremyrubin|and we'd rather spend the compute poking around the table looking for things that have been deleted than clear something that hasn't been deleted
131 2016-10-05 22:04:34	0|jeremyrubin|So the "eager-delete" strategy is a loser
132 2016-10-05 22:04:45	0|jeremyrubin|sipa: yes, but on a non-fixed memory system
133 2016-10-05 22:04:53	0|sipa|?
134 2016-10-05 22:05:04	0|jeremyrubin|*non-fixed
135 2016-10-05 22:05:14	0|jeremyrubin|eg, you can get to 90% before needing a re-hash
136 2016-10-05 22:05:21	0|jeremyrubin|but we never increase memory size
137 2016-10-05 22:05:29	0|jeremyrubin|so we always get to 100% full
138 2016-10-05 22:05:37	0|jeremyrubin|it's just a matter of how fast
139 2016-10-05 22:06:19	0|jeremyrubin|There are gains to be made with more hashes/other things I agree. I'm pretty excited about some of them, but this is minimal & those all add some unneeded complexity
140 2016-10-05 22:06:46	0|gmaxwell|jeremyrubin: what is the depth limit you're using?
141 2016-10-05 22:06:54	0|jeremyrubin|log(size)
142 2016-10-05 22:07:26	0|gmaxwell|yea, so you're going to end up doing log() accesses on insert basically every time once its run long enough.
143 2016-10-05 22:07:56	0|jeremyrubin|Yeah that's fine
144 2016-10-05 22:08:09	0|jeremyrubin|We don't really care about insert performance (non-hot path)
145 2016-10-05 22:08:43	0|jeremyrubin|And we also do care about deleting things that have been erased preferentially (rather than randomly)
146 2016-10-05 22:08:56	0|jeremyrubin|so it's worth it.
147 2016-10-05 22:10:09	0|sipa|jeremyrubin: i see
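A minimal sketch of the insert strategy jeremyrubin describes, assuming a fixed-size table, two hash locations per key, a ~log2(size) depth limit, and delete flags; names and the hash mixing here are illustrative, not the actual code in #8895:

    #include <cmath>
    #include <cstdint>
    #include <utility>
    #include <vector>

    struct Slot {
        uint64_t key = 0;
        bool occupied = false;
        bool deleted = false;   // "garbage": used for a block already, safe to overwrite
    };

    class LossyCuckooCache {
        std::vector<Slot> table;
        int max_depth;

        // Two hash positions per key (toy mixing for illustration only).
        size_t Loc(uint64_t key, int which) const {
            return which == 0 ? key % table.size()
                              : (key * 0x9e3779b97f4a7c15ULL) % table.size();
        }

    public:
        explicit LossyCuckooCache(size_t slots)
            : table(slots), max_depth(static_cast<int>(std::log2(slots))) {}

        void Insert(uint64_t key) {
            for (int depth = 0; depth < max_depth; ++depth) {
                for (int which = 0; which < 2; ++which) {
                    Slot& s = table[Loc(key, which)];
                    if (!s.occupied || s.deleted) {   // hole or garbage found: reuse it
                        s.key = key;
                        s.occupied = true;
                        s.deleted = false;
                        return;
                    }
                }
                // Both positions hold live entries: kick one out and try to
                // re-home the displaced key on the next iteration (the ripple).
                std::swap(key, table[Loc(key, depth % 2)].key);
            }
            // Depth limit reached: whatever key is left in hand is dropped.
            // Lossy by design; the table is never rehashed or resized.
        }
    };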
148 2016-10-05 22:14:41	0|jeremyrubin|besides, log2(40mb/32bytes per entry) ~~ 21 which is not trivial, but not huge by any means
149 2016-10-05 22:15:26	0|gmaxwell|it's a lot of random memory accesses.
150 2016-10-05 22:16:29	0|jeremyrubin|gmaxwell: I tested designs that did less random mem accesses and it wasn't terribly better for much added complexity
151 2016-10-05 22:16:52	0|jeremyrubin|gmaxwell: I'll float those later if this gets through review
152 2016-10-05 22:17:30	0|jeremyrubin|gmaxwell: for instance, one can split keys into an 8-bit tag and 248-bit main value, and do initial lookups on the tags.
153 2016-10-05 22:18:06	0|gmaxwell|what matters is the access, 1-64 bytes (and likely 1-128 bytes) will all end up being basically the same speed.
154 2016-10-05 22:18:22	0|jeremyrubin|gmaxwell: you can also do buckets of two hashes so that k can be stored at h0(k) & ~1 and h0(k) | 1.
155 2016-10-05 22:18:33	0|jeremyrubin|gmaxwell: wrong. l1/l2 matters at that size
156 2016-10-05 22:19:07	0|jeremyrubin|gmaxwell: you can also do linear probing style stuff with 64 entries per bucket (and say, 2 buckets per entry allowed) and 1 byte tags
157 2016-10-05 22:19:44	0|jeremyrubin|(I implemented and tested all of the above, none are a clear winner over the K.I.S.S. solution PR'd)
158 2016-10-05 22:21:55	0|sipa|(h0(k) & 1, h0(k) | 1) won't result in a cuckoo style behaviour, i think
159 2016-10-05 22:22:05	0|sipa|as the collisions are perfectly correlated
160 2016-10-05 22:22:11	0|gmaxwell|no, it won't.
161 2016-10-05 22:22:37	0|jeremyrubin|sipa: that's why you have two hashes still
162 2016-10-05 22:22:42	0|jeremyrubin|sorry if that was unclear
163 2016-10-05 22:22:48	0|sipa|ah, you mean in addition
164 2016-10-05 22:22:50	0|sipa|right
165 2016-10-05 22:22:55	0|jeremyrubin|yes
166 2016-10-05 22:23:04	0|gmaxwell|but this table doesn't really make use of cuckoo style behavior once it fills.
167 2016-10-05 22:23:11	0|sipa|you're turning each entry into a 2-way associative cache
168 2016-10-05 22:23:13	0|gmaxwell|it just becomes a 2-way set assc cache.
169 2016-10-05 22:23:23	0|jeremyrubin|sure
170 2016-10-05 22:24:20	0|jeremyrubin|gmaxwell: more or less the same thing
171 2016-10-05 22:24:38	0|sipa|what would be the effect if you greatly reduced the max iterations?
172 2016-10-05 22:24:48	0|jeremyrubin|less likely to find garbage on insert
173 2016-10-05 22:24:48	0|sipa|(in the otherwise kiss design)
174 2016-10-05 22:25:24	0|sipa|so why bother doing 21 iterations?
175 2016-10-05 22:25:35	0|gmaxwell|Well I'm saying this cuckoo table is not one; at least once filled it's a two way set assc cache.  I think you can probably make your insert behavior stop at the first collision once you're over some fullness threshold (like 90%) and you'll see performance improve.
176 2016-10-05 22:26:18	0|BlueMatt|remember: we really give 0 shits about insert performance....ATMP isnt our hot-path, and it just did sigchecks, so a bunch of lookups (even hitting memory) isnt all that bad
177 2016-10-05 22:26:18	0|gmaxwell|sorry if I didn't state it clearly above: this data structure is a cuckoo hash table that turns into a 2way set assc cache once it becomes overfull
178 2016-10-05 22:26:33	0|gmaxwell|BlueMatt: I'm also thinking about other places where we should use this.
179 2016-10-05 22:26:40	0|BlueMatt|ahh, ok, fair enough
180 2016-10-05 22:26:45	0|gmaxwell|And I care about the overall cpu usage of bitcoin, not just the block race. :)
181 2016-10-05 22:27:04	0|gmaxwell|it looks like great work too, I'm not being negative, just thinking it through.
182 2016-10-05 22:27:09	0|BlueMatt|ehh, doing a few more memory hits after doing ATMP checksigs isn't all that much to notice :p
183 2016-10-05 22:27:40	0|jeremyrubin|sipa: Finding garbage means finding a signature you don't need again (unless reorg)
184 2016-10-05 22:27:46	0|sipa|yes, in case it isn't clear, i'm just playing devil's advocate to understand
185 2016-10-05 22:28:05	0|jeremyrubin|sipa: finding !garbage means you delete something maybe valuable
186 2016-10-05 22:28:23	0|jeremyrubin|sipa: (yes :) )
187 2016-10-05 22:28:32	0|jeremyrubin|If i math'd right: (1-((5000/((float(40<<20)/(32.0+1.0/8))))))**21
188 2016-10-05 22:28:42	0|gmaxwell|though I wonder if its code complexity wouldn't be reduced by dropping the cuckoo behavior entirely... thats a question that depends on how full the cache is typically.
189 2016-10-05 22:28:43	0|jeremyrubin|== ~92%
190 2016-10-05 22:28:59	0|jeremyrubin|should be chance of deleting with this design
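Unpacking that expression: 40 MB of slots at 32 bytes plus a one-bit flag each gives (40<<20)/32.125 ≈ 1.3 million slots; a block that clears ~5000 of them leaves each probe roughly a 5000/1,305,000 ≈ 0.38% chance of landing on garbage, so the chance that all 21 probes land on live entries (forcing eviction of something live) is (1 − 0.0038)^21 ≈ 0.92, the ~92% figure above.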
191 2016-10-05 22:29:15	0|jeremyrubin|gmaxwell: what is the cuckoo behavior? Eg, one hash location?
192 2016-10-05 22:29:27	0|gmaxwell|jeremyrubin: no, cuckoo behavior is the rippling insertion.
193 2016-10-05 22:29:42	0|jeremyrubin|gmaxwell: so not iterating at all?
194 2016-10-05 22:29:49	0|gmaxwell|In a plain 2way set assc cache, you'll probe both locations on insert and if both are full you'll evict one (at random or whatever).
195 2016-10-05 22:29:54	0|BlueMatt|gmaxwell: i think, here, that's mostly there so that you can find entries marked deleted while rippling
196 2016-10-05 22:30:03	0|BlueMatt|if you didnt have the delete flags, probably
197 2016-10-05 22:30:19	0|gmaxwell|okay, I wasn't thinking about the delete flags, hows that work?
198 2016-10-05 22:30:34	0|BlueMatt|its the sigcache - we used to delete things while checking sigs in blocks
199 2016-10-05 22:30:40	0|BlueMatt|now they're marked for deletion in an atomic
200 2016-10-05 22:30:55	0|BlueMatt|so jeremyrubin's thing goes through and treats slots as empty if they have a delete flag on insert
201 2016-10-05 22:30:56	0|gmaxwell|I don't think that matters actually.
202 2016-10-05 22:31:08	0|BlueMatt|this has an interesting benefit of making it so that reorgs at the tip might use sigcache
203 2016-10-05 22:31:10	0|gmaxwell|When you go to insert, if either are free or marked delete, overwrite.
204 2016-10-05 22:31:23	0|gmaxwell|otherwise pick one and overwrite.
205 2016-10-05 22:31:30	0|gmaxwell|rippling at that point doesn't do you any good.
206 2016-10-05 22:31:41	0|gmaxwell|the other entries you would delete will get deleted when something tries to insert in them.
207 2016-10-05 22:31:46	0|jeremyrubin|yes it does? the next one might find trash
208 2016-10-05 22:32:15	0|sipa|jeremyrubin: i see!
209 2016-10-05 22:32:20	0|gmaxwell|I don't.
210 2016-10-05 22:32:24	0|jeremyrubin|k1 and k2 sharing location X are highly unlikely to have another in common
211 2016-10-05 22:32:54	0|gmaxwell|If I insert k1  and one of its locations contains trash, I'll use that location.
212 2016-10-05 22:33:06	0|sipa|gmaxwell: if you iterate 21 times, you've had 21 independent chances of finding an empty/deleted position, so you don't need to evict anything potentially valuable
213 2016-10-05 22:33:11	0|gmaxwell|I don't need someone to have come in front of me and taken out the trash for me, I can do it then.
214 2016-10-05 22:33:18	0|jeremyrubin|eg, if h1(k1) == h1(k2), it's unlikely that h2(k1) == h2(k2)
215 2016-10-05 22:33:43	0|gmaxwell|sipa: yes; thats equal to saying that cuckoo is helpful if the table is not overfull.
216 2016-10-05 22:34:16	0|sipa|well, yes, but it won't be if many entries are marked deleted
217 2016-10-05 22:34:17	0|gmaxwell|which it is, but if the table is normally more than 50% full, it doesn't help.
218 2016-10-05 22:34:36	0|sipa|i think it can be way more than 50% full
219 2016-10-05 22:34:57	0|jeremyrubin|will always go to 100% full
220 2016-10-05 22:35:00	0|gmaxwell|which is what I said above... depending on how full the table is (and by full I am counting trash as deleted), this could just be a set assc cache with no loss in performance.
221 2016-10-05 22:35:00	0|jeremyrubin|except after a block
222 2016-10-05 22:35:21	0|sipa|if there are no deleted positions (or very few), agree
223 2016-10-05 22:35:40	0|jeremyrubin|there are n_sigops(last_block) empty slots
224 2016-10-05 22:35:44	0|BlueMatt|gmaxwell: by recursing you're less likely to throw out non-trash if there are delete positions available, this is esp true if you eg have blocks come in quickly in succession
225 2016-10-05 22:35:46	0|gmaxwell|sipa: not just 'very few'. The benefit of cuckoo drops off exponentially after 50%.
226 2016-10-05 22:36:27	0|gmaxwell|BlueMatt: not if the table is well over 50% full (deleted entries are not full in this measure).
227 2016-10-05 22:36:37	0|sipa|gmaxwell: i believe that if you're 90% full, you have a very high chance that a new insert will end up with you consuming an empty position
228 2016-10-05 22:36:45	0|jeremyrubin|This can be addressed by increasing hashes to 3 or 4 (or 10!)
229 2016-10-05 22:36:57	0|jeremyrubin|not 10 factorial, just 10 plus excitement
230 2016-10-05 22:37:14	0|BlueMatt|gmaxwell: indeed, though if you have quick blocks it means you're less full...
231 2016-10-05 22:37:15	0|gmaxwell|yes, the table can run fuller if there are more probes, but with a linear slowdown for your lookup.
232 2016-10-05 22:37:37	0|BlueMatt|as an aside: we could also add a few bits to indicate feerate for each entry, which would make the probing actually really matter
233 2016-10-05 22:37:53	0|sipa|BlueMatt: or age
234 2016-10-05 22:37:58	0|BlueMatt|or age
235 2016-10-05 22:38:02	0|jeremyrubin|Yep
236 2016-10-05 22:38:05	0|sipa|add generation numbers like the bloom filter cache
237 2016-10-05 22:38:07	0|jeremyrubin|Yep
238 2016-10-05 22:38:10	0|gmaxwell|sipa: I think you're overestimating it, when you're 90% full most entries end up in the position they're in because their sibling was already full.
239 2016-10-05 22:38:24	0|jeremyrubin|Or like.. clear them on eviction
240 2016-10-05 22:38:31	0|sipa|gmaxwell: 0.9^21, no?
241 2016-10-05 22:39:32	0|gmaxwell|sipa: no, because when you consider entry X the current member is more likely to be in X because its alternate Y is full.  The sequence of tests is no longer independent.
242 2016-10-05 22:39:59	0|jeremyrubin|Wait what?
243 2016-10-05 22:40:13	0|gmaxwell|this is why the performance falls off so dramatically at over 50% full.
244 2016-10-05 22:40:13	0|jeremyrubin|That's only true if we prefer h0 over h1?
245 2016-10-05 22:40:14	0|BlueMatt|this is only true if your table is small compared to your iteration count?
246 2016-10-05 22:41:15	0|jeremyrubin|Also... so what! If it's really such a big deal we can just clear 50% of things out from this one
247 2016-10-05 22:41:23	0|jeremyrubin|and still be on par with existing :P
248 2016-10-05 22:41:25	0|gmaxwell|BlueMatt: I'm responding to sipa saying that the odds you fail to find a free entry in 21 hops is 0.9^21
249 2016-10-05 22:41:35	0|BlueMatt|gmaxwell: i mean it should be close
250 2016-10-05 22:41:42	0|sipa|gmaxwell: i don't understand
251 2016-10-05 22:41:53	0|sipa|gmaxwell: maybe we have different assumptions
252 2016-10-05 22:42:02	0|jeremyrubin|I agree with sipa
253 2016-10-05 22:42:16	0|jeremyrubin|my math was: ((1-((5000/((float(40<<20)/(32.0+1.0/8))))))**1)
254 2016-10-05 22:42:23	0|jeremyrubin|oops last one should be a 21
255 2016-10-05 22:42:31	0|BlueMatt|anyway, headed out, y'all figure it out and report back :P
256 2016-10-05 22:42:33	0|jeremyrubin|for a block clearing 5k things
257 2016-10-05 22:43:13	0|jeremyrubin|Gonna head out for some dinner; looking forward to hearing more of your thoughts.
258 2016-10-05 22:45:13	0|jeremyrubin|gmaxwell: if you want I have a super templated version where you can enable/disable a ton of things... but it's quite unsightly and needs some bugfixes.
259 2016-10-05 22:45:26	0|gmaxwell|sipa: as a cuckoo hashtable fills, everything initially starts out in its h0 position. then you start getting collisions; when the table is way overfull and you hit a collision (p=.9), you look at that entry and go to put it in its alternate... but the reason it was in the entry you looked at was often because the alternate was already full. So the comparison you now make on the alternate
260 2016-10-05 22:45:32	0|gmaxwell|is >p=.9 even though the table is only 90% full overall.
261 2016-10-05 22:48:19	0|sipa|if evictions are independent from insertions (and evictions are responsible for all of the 10% empties), then i think any chain of entries up to 21 deep will have 0.9^21 chance to not have any evicted entries
262 2016-10-05 22:48:50	0|sipa|you're talking about a model where there are only insertions
263 2016-10-05 22:49:50	0|gmaxwell|okay, if we assume that the table is completely full, then evictions happen, then on your next insert I agree with you.
264 2016-10-05 22:50:06	0|gmaxwell|but on the insert after that the performance will be worse than 0.899^21
265 2016-10-05 22:50:13	0|sipa|right.
266 2016-10-05 22:51:21	0|sipa|the iterations are, at fullness, just a search for entries that were deleted after the chain being traversed was created
267 2016-10-05 22:51:52	0|sipa|and increasing the iteration count means you're willing to spend more time looking for those
268 2016-10-05 22:52:19	0|sipa|i'm not convinced that making this choice depend on the table size is the best choice
269 2016-10-05 22:52:42	0|sipa|but it looks simple and it works better than what we have
270 2016-10-05 22:53:18	0|gmaxwell|But a 2-way set assc cache would be even simpler.
271 2016-10-05 22:54:08	0|jeremyrubin|sipa: you could make the choice based on the number of sigops in the last block
272 2016-10-05 22:54:43	0|jeremyrubin|eg, st (1-empty%)^iter > .50
273 2016-10-05 22:54:45	0|gmaxwell|and at very full levels it will perform no worse, you give an argument which I can buy that the complexity does improve performance, e.g. at the 90% full level. I dunno what level of fullness would be typical.
274 2016-10-05 22:56:04	0|gmaxwell|if the cache has 40MB of 32 byte entries, and a single block empties 4000 and before a new block arrives 4000 are added, then the low watermark would be 99.6%
275 2016-10-05 22:56:36	0|sipa|yeah, i think in optimal cases you want to artificially keep the fullness low
276 2016-10-05 22:56:42	0|gmaxwell|and 21 probes would be pretty unlikely to find a deleted entry.
277 2016-10-05 22:57:20	0|gmaxwell|meaning a plain 2way set assc would have performed just as well but with simpler code and 40x faster insertion.
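Spelling out that estimate: 40 MB / 32 B ≈ 1.31 million slots; a block that clears ~4000 entries frees only 4000/1,310,720 ≈ 0.3% of them, so even at its emptiest the table sits around 99.6–99.7% full, and a 21-probe chain then still has roughly (1 − 0.003)^21 ≈ 94% chance of finding nothing freed.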
278 2016-10-05 22:57:37	0|sipa|find some function of the fullness for the maximum number of iterations
279 2016-10-05 22:57:48	0|sipa|which is 1 when completely
280 2016-10-05 22:57:50	0|sipa|full
281 2016-10-05 22:58:27	0|sipa|jeremyrubin: can you efficiently keep track of fullness?
282 2016-10-05 22:58:40	0|gmaxwell|well what you want is something that is an insertion time tolerance, and when that tolerance would not be likely to yield an improvement, you just go 1 deep.
283 2016-10-05 22:59:22	0|sipa|it's a balance between chance of evicting a live entry and increasing fullness
284 2016-10-05 23:00:00	0|sipa|perhaps the max iterations should be 1 already at 90%
285 2016-10-05 23:00:19	0|sipa|and $TOLERANCE at 0%
286 2016-10-05 23:00:30	0|sipa|and some nifty fitted function in between
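One illustrative shape for such a function, not something settled in the discussion, just a linear interpolation between sipa's two endpoints ($TOLERANCE at 0% full, 1 at 90% and above):

    #include <algorithm>
    #include <cmath>

    // Illustrative only: scale the max insert iterations from `tolerance` when
    // the table is empty down to 1 at (and beyond) 90% fullness.
    int MaxIterations(double fullness, int tolerance)
    {
        const double cutoff = 0.9;
        if (fullness >= cutoff) return 1;
        return std::max(1, static_cast<int>(std::lround(tolerance * (1.0 - fullness / cutoff))));
    }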
287 2016-10-05 23:01:00	0|gmaxwell|well the question you have is the marginal rate of return for each probe.
288 2016-10-05 23:01:41	0|gmaxwell|for low full levels the tolerance exists to prevent cycles.
289 2016-10-05 23:02:08	0|gmaxwell|the probability of going more than N times when you are M full is negligible unless there is a cycle.
290 2016-10-05 23:02:37	0|gmaxwell|so thats normally where the max probe count (which then triggers a resize and/or rehashing of the whole time) in an ordinary cuckoo hash.
291 2016-10-05 23:03:10	0|gmaxwell|I'm sure tromp could tell us all about those probablities. :P
292 2016-10-05 23:05:56	0|sipa|there is an annoying collision in my brain's cuckoo table between john trump and donald tromp.
293 2016-10-05 23:06:51	0|gmaxwell|... just move one to its alternative location?
294 2016-10-05 23:07:16	0|gmaxwell|s/whole time/whole table/
295 2016-10-05 23:08:17	0|gmaxwell|In any case, unless there is something surprising about fullness going on, with our current tables, I believe that setting the max depth to 1 (turning this into a 2way set assc cache) would not hurt block validation performance.
296 2016-10-05 23:14:29	0|gmaxwell|but a generation count could make it be effectively 50% free at all time, and then that argument goes out the window.
297 2016-10-05 23:15:53	0|jeremyrubin|you can cheaply do generations for cost of one bit
298 2016-10-05 23:16:12	0|gmaxwell|it would be fine to reduce the key size by a couple bits.
299 2016-10-05 23:16:34	0|jeremyrubin|hell, you can even replace marking garbage at all and just do generations
300 2016-10-05 23:16:41	0|jeremyrubin|gmaxwell: yeah I built a version like that
301 2016-10-05 23:16:50	0|jeremyrubin|you use the last bit/byte for flags
302 2016-10-05 23:17:10	0|gmaxwell|to be honest the keys could be 16 bytes, since the hash is privately salted... but at least with the old datastructure there wouldn't have been a huge performance improvement from that.
303 2016-10-05 23:17:11	0|jeremyrubin|If memory overhead is really a must
304 2016-10-05 23:17:58	0|jeremyrubin|gmaxwell: agree 100%, but I thought that wasn't reviewable
305 2016-10-05 23:18:12	0|gmaxwell|If we wanted the same code to work as a replacement for the bloom filter elsewhere we'd want to use the location xor trick, where other_index =  this_index^H(key)   ... this lets you use very small keys.. the original insert tries to go into index = H(bigger_key).
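A sketch of that xor trick, assuming a power-of-two table of N slots; FullKey, H1 and H2 here are placeholders, not functions in the codebase. The first location comes from hashing the full key, while the alternate is recoverable from whichever index the entry currently occupies plus a hash of only the short stored key, so the full key never has to be stored:

    // Illustrative xor-location scheme; N is a power of two.
    // H1 hashes the full (large) key, H2 hashes only the short stored key.
    uint32_t FirstIndex(const FullKey& key, uint32_t N) {
        return H1(key) & (N - 1);
    }
    uint32_t SiblingIndex(uint32_t index, uint16_t short_key, uint32_t N) {
        // Involution: SiblingIndex(SiblingIndex(i, k, N), k, N) == i.
        return index ^ (H2(short_key) & (N - 1));
    }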
306 2016-10-05 23:18:41	0|gmaxwell|jeremyrubin: yea, I wouldn't bother but it would be good to know what performance we were leaving on the floor, if any. If it's a lot it would be worth considering.
307 2016-10-05 23:20:21	0|jeremyrubin|yeah; the nice thing is I think this patch is a necessary first step to trying out those designs; so my feeling is we should try to review this then such improvements can be measured
308 2016-10-05 23:23:08	0|jeremyrubin|I didn't try a version with half sized keys FYI.. figured proposing that would be DOA
309 2016-10-05 23:23:54	0|gmaxwell|it would only be an interesting proposal if it showed a big speedup, otherwise it's not worth trying to reason about the security.
310 2016-10-05 23:24:38	0|jeremyrubin|Also if we're going to 128 bit may as well do md5 or something if you really want things to go faster
311 2016-10-05 23:24:50	0|jeremyrubin|md5+salt should be just as secure
312 2016-10-05 23:24:55	0|jeremyrubin|(iirc)
313 2016-10-05 23:25:42	0|sipa|SipHash128.
314 2016-10-05 23:25:45	0|sipa|me ducks
315 2016-10-05 23:26:28	0|jeremyrubin|haha isn't that experimental tho
316 2016-10-05 23:27:09	0|jeremyrubin|but yeah computing the hashes is a hot path thing, so if one really wants to rice this sucker would make things faster
317 2016-10-05 23:27:30	0|jeremyrubin|(ComputeEntry called once per lookup)
318 2016-10-05 23:31:03	0|GitHub138|[bitcoin] randy-waterhouse opened pull request #8896: Update INSTALL landing redirection notice for build instructions. (master...master) https://github.com/bitcoin/bitcoin/pull/8896
319 2016-10-05 23:31:05	0|tunafizz|needs moar power al *grunt*grunt*grunt*
320 2016-10-05 23:31:30	0|gmaxwell|well if you want faster, using blake2b would be the same speed as md5, and keeping the ~32 byte output I wouldn't really have much to fuss about the security.
321 2016-10-05 23:35:22	0|tunafizz|needs the binford2000 hash al *grunt*grunt*grunt*
322 2016-10-05 23:38:39	0|sipa|gmaxwell: 3 cycles per byte, damn...