1 2016-10-06 02:06:07	0|Dabs|uh. md5? isn't that "broken" ?
 2 2016-10-06 02:26:45	0|luke-jr|Dabs: not because of its speed
 3 2016-10-06 02:58:52	0|jeremyrubin|Dabs: probably not if you use a unique secret salt
 4 2016-10-06 03:12:04	0|Dabs|yeah, but ... you'll have to always justify using a "broken" primitive. ... anyway. I guess it depends on the purpose.. hmac md5 or something.
 5 2016-10-06 03:18:20	0|jeremyrubin|Dabs: I'm not seriously advocating using it... but it's really not broken in this use case.
 6 2016-10-06 03:18:32	0|jeremyrubin|Dabs: For the same reason RIPEMD-160 is OK
 7 2016-10-06 03:19:49	0|jeremyrubin|oops RIPEMD != RIPEMD-160
 8 2016-10-06 03:20:20	0|jeremyrubin|Anyways, most of those collisions require the attacker to fully control both documents being hashed
 9 2016-10-06 03:20:26	0|jeremyrubin|to collide their hashes
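As an illustrative aside (not Bitcoin Core code; the salt handling and function name are hypothetical), the keyed-hash point can be sketched in Python: MD5's known collision attacks require the attacker to choose both hashed inputs, and a secret per-node salt takes that choice away.

```python
import hmac
import os

# Hypothetical sketch: a per-node secret salt keys the hash, so an attacker
# cannot search for colliding inputs offline. MD5's practical collisions
# require full control of both messages; with an unknown key, that is gone.
SECRET_SALT = os.urandom(16)  # fresh random salt chosen at node startup

def cache_key(sig_data: bytes) -> bytes:
    """Derive a 128-bit cache key from signature data via HMAC-MD5."""
    return hmac.new(SECRET_SALT, sig_data, "md5").digest()
```

Because the salt is never revealed, an attacker cannot precompute two distinct entries mapping to the same key, which is the property the cache actually needs.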
10 2016-10-06 03:48:28	0|luke-jr|Dabs: MD5 is not a primitive of BLAKE2b, they just have similar speeds
11 2016-10-06 06:19:14	0|wumpus|please don't use md5 in new code
12 2016-10-06 06:19:55	0|wumpus|it's, at its current level of brokenness, still fine for some continuing legacy usages, but for new designs it should be avoided
13 2016-10-06 06:22:48	0|wumpus|there are modern fast and secure hash functions (indeed as BLAKE), and if speed trumps cryptographic security there are tons of other options
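For illustration (a hypothetical sketch, not anything from the codebase), a modern keyed hash like BLAKE2b is available natively in Python's hashlib and needs no HMAC wrapper:

```python
import hashlib
import os

# Sketch: BLAKE2b is fast, cryptographically secure, and supports built-in
# keying, so a salted cache key needs no separate HMAC construction.
KEY = os.urandom(32)  # node-local secret chosen at startup

def entry_key(entry: bytes) -> bytes:
    """Derive a 256-bit keyed digest for a cache entry."""
    return hashlib.blake2b(entry, key=KEY, digest_size=32).digest()
```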
15 2016-10-06 07:28:10	0|GitHub95|bitcoin/master 7d8afb4 fanquake: [Doc] Improve GitHub issue template
16 2016-10-06 07:28:10	0|GitHub95|[bitcoin] MarcoFalke pushed 2 new commits to master: https://github.com/bitcoin/bitcoin/compare/223f4c2dd5fa...61d191fbf953
17 2016-10-06 07:28:11	0|GitHub95|bitcoin/master 61d191f MarcoFalke: Merge #8887: [Doc] Improve GitHub issue template...
18 2016-10-06 07:28:25	0|GitHub95|[bitcoin] MarcoFalke closed pull request #8887: [Doc] Improve GitHub issue template (master...link-stackexchange) https://github.com/bitcoin/bitcoin/pull/8887
19 2016-10-06 09:58:16	0|NicolasDorier|roasbeef: You can now measure your size https://testnet.smartbit.com.au/block/0000000000001280a3d758cb956e565daffd5af0e6b6723f28e1cbcda0da8652 ;)
20 2016-10-06 10:00:39	0|btcdrak|ah, great it's been made segwitty
21 2016-10-06 14:24:23	0|morcos|gmaxwell: sorry i missed discussion yesterday on the cuckoo sigcache.  jeremyrubin and i did put a LOT of work into different designs.   but i do think this design is far better than the existing sig cache.
22 2016-10-06 14:24:41	0|sipa|morcos: i don't think there is doubt that it's better :)
23 2016-10-06 14:25:06	0|morcos|it might be fine to reduce the depth limit or untie it from the size of the table.  but i seriously doubt it impacts the performance of ATMP.  (once we saw how inefficient the old deleting behavior was anyway)
24 2016-10-06 14:25:57	0|jeremyrubin|wumpus: it wasn't a serious suggestion ;)
25 2016-10-06 14:26:06	0|morcos|one thing to keep in mind is that with a 40MB sigcache, it probably isn't going to get full except in the event of an attack.  i ran a simulation over 6 months, and i was getting very close to the maximal possible hit rate on a 40MB cache.
26 2016-10-06 14:26:19	0|morcos|the fact that we delete sigs that were in blocks makes a HUGE difference
27 2016-10-06 14:26:34	0|sipa|morcos: yes... but we should know how it behaves in case of an attack as well
28 2016-10-06 14:26:47	0|sipa|without attack, i expect the size of the cache to not even matter that much
29 2016-10-06 14:26:53	0|sipa|as we'll only ever have live entries in it
30 2016-10-06 14:27:08	0|morcos|my idea for how to make it even more foolproof against ever getting full would be to implement a Rolling version where you have 2, insert and check in both of them, and then alternately clear them after some amount of time
31 2016-10-06 14:27:33	0|sipa|morcos: ha, the old rolling bloom filter design :)
32 2016-10-06 14:27:42	0|morcos|sipa: in the absence of attack there are still txs that get generated but are never mined (replaced, doublespent, too low fee)
33 2016-10-06 14:27:48	0|sipa|right
34 2016-10-06 14:27:55	0|morcos|sipa: yeah i just didn't think it was worth the complication for the first PR
35 2016-10-06 14:28:02	0|sipa|morcos: completely agreed
36 2016-10-06 14:28:44	0|jeremyrubin|generations are better than rolling I think
37 2016-10-06 14:28:53	0|sipa|jeremyrubin: yes, i think so
38 2016-10-06 14:29:18	0|sipa|especially given your entries are already 128-256 bits, adding a few bits for a generation number doesn't add much
39 2016-10-06 14:29:46	0|sipa|using two staggered sets effectively makes your entries twice the size
40 2016-10-06 14:29:55	0|sipa|or even 4 times
41 2016-10-06 14:30:14	0|morcos|sure, sounds good to me too
42 2016-10-06 14:30:28	0|sipa|as in the worst case time, you just emptied one, and the other one is half full, so you have 25% utilization of your entire set
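The staggered two-set ("rolling") scheme being discussed can be sketched as follows (a simplification, not the actual sigcache code): inserts go into the newer set, lookups check both, and a periodic roll discards the older set wholesale. sipa's worst case is the moment just after a roll: one set is empty and the other is half full, i.e. 25% utilization of the whole budget.

```python
class RollingCache:
    """Two staggered sets: insert into the new one, look up in both,
    and periodically drop the old one wholesale."""

    def __init__(self):
        self.old = set()
        self.new = set()

    def insert(self, item):
        self.new.add(item)

    def __contains__(self, item):
        return item in self.new or item in self.old

    def roll(self):
        # The newest generation survives; everything older is discarded.
        self.old = self.new
        self.new = set()
```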
43 2016-10-06 14:30:36	0|jeremyrubin|Also fee is maybe even better than generations
44 2016-10-06 14:31:00	0|sipa|use an ARC :)
45 2016-10-06 14:31:09	0|jeremyrubin|You can also emulate generations
46 2016-10-06 14:31:34	0|jeremyrubin|by createnewblocking a 10mb block or something from mempool
47 2016-10-06 14:31:40	0|jeremyrubin|after deleting a bunch of things randomly
48 2016-10-06 14:31:52	0|jeremyrubin|if you want 0 memory overhead
49 2016-10-06 14:31:55	0|sipa|https://en.wikipedia.org/wiki/Adaptive_replacement_cache
50 2016-10-06 14:32:10	0|jeremyrubin|sipa: ah, not automatic reference counting
51 2016-10-06 14:32:43	0|jeremyrubin|ARC seems not to work for this use case?
52 2016-10-06 14:32:50	0|jeremyrubin|Things are write once read once?
53 2016-10-06 14:33:10	0|sipa|ah, i'm confusing with the coin cache
54 2016-10-06 17:06:56	0|gmaxwell|morcos: part of the reason I was asking questions is that I wasn't sure if we knew how it would perform once someone sent in a huge set of junk transactions. But have no doubt, I'm sure it's much better than what we currently have.
55 2016-10-06 17:07:17	0|gmaxwell|I'm surprised it didn't benchmark out as faster with only a single thread.
56 2016-10-06 17:08:30	0|morcos|gmaxwell: it was ever so slightly faster (1%) for a single thread, but that's within noise.  i just think the amount of time taken in either case is small, it was only the lock contention that was a true problem.
57 2016-10-06 17:09:41	0|morcos|but now you inspired a hopefully efficient generation keeping design that will never get full, but only delete old generations if it needs to..   :)
58 2016-10-06 17:11:44	0|gmaxwell|sipa: re generation number, I had just assumed the entries would be changed to 252 bits, so you'd only lose at most 6% at a time, and would only delete things if it had to.
59 2016-10-06 17:15:16	0|morcos|the idea jeremyrubin and i want to try is actually a separate bit vector that just keeps track of whether each entry is in the current generation or not.  that's only touched during insert, so it's lock free.  and then use a simple heuristic to trigger an occasional testforcleanup.
60 2016-10-06 17:16:33	0|morcos|when doing that, you just loop over that vector and the garbage collection vector, and if the current generation not marked for deletion is > 25% of capacity, then mark the old generation for deletion and increase the generation.
61 2016-10-06 17:17:13	0|morcos|that has the nice property that you don't actually delete old things unless you have to, but under an attack you'll delete them often enough to stop yourself getting full.
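That generation heuristic can be sketched like this (a hypothetical simplification: a dict stands in for the cuckoo table plus its side bit vector, and older generations are evicted eagerly here, where the design described above would only mark them for lazy deletion):

```python
class GenerationCache:
    """Entries are tagged with the generation they were inserted in.
    When the current generation exceeds 25% of capacity, all older
    generations are dropped and the generation counter advances."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}   # key -> generation number at insertion
        self.generation = 0

    def insert(self, key):
        self.entries[key] = self.generation
        current = sum(1 for g in self.entries.values()
                      if g == self.generation)
        if current > self.capacity // 4:
            # Old entries are only deleted when we have to, so under
            # normal load they linger; under a flood they age out fast.
            self.entries = {k: g for k, g in self.entries.items()
                            if g == self.generation}
            self.generation += 1

    def __contains__(self, key):
        return key in self.entries
```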
62 2016-10-06 17:45:13	0|gmaxwell|I wish we could easily delete based on an entry being in the top of the mempool or not.
63 2016-10-06 17:45:38	0|gmaxwell|hm. I suppose that when we evict something from the mempool, we could revalidate it again with the cache in delete mode.
64 2016-10-06 17:45:50	0|gmaxwell|that would be kinda like that.
65 2016-10-06 17:46:32	0|gmaxwell|or insert things with a lower feerate 1 or two generations behind the current generation. (assuming many generations)
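The "revalidate in delete mode" idea reduces to erasing the evicted transaction's entries by key. A minimal sketch with hypothetical names (the real flow would walk the tx's scripts once more to recover each signature's cache key, without re-verifying any signature):

```python
def evict_tx_from_sigcache(sig_cache: set, tx_sig_keys):
    """On mempool eviction, drop the tx's signature-cache entries.
    tx_sig_keys: cache keys recomputed from the tx's signatures --
    only the keys are recomputed, no signature is re-verified."""
    for key in tx_sig_keys:
        sig_cache.discard(key)
```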
66 2016-10-06 17:50:39	0|michagogo|Today's lesson in my programming lessons: threading is hard
67 2016-10-06 17:51:09	0|sdaftuar|gmaxwell: i was thinking about doing that (revalidate in delete mode) for things that we don't accept to our mempool, as well
68 2016-10-06 17:51:38	0|morcos|gmaxwell: with a 40MB cache, it's just not necessary to make further optimizations.  someone would have to flood you with 250k excess signatures in order for you to start marking for deletion things older than when the flood started.
69 2016-10-06 17:51:42	0|michagogo|(Also, each iteration of a for loop has its own scope for local variables, and those variables stick around even after the iteration is done and the name is reused for something else)
70 2016-10-06 17:52:16	0|morcos|if that time frame is recent enough that it's causing a problem for your cache hit rate, then i think that implies your bigger problem is the tx flood and the DoS on checking all those sigs in the first place
71 2016-10-06 17:52:19	0|michagogo|(or rather, reused for the next iteration)
72 2016-10-06 17:54:47	0|gmaxwell|morcos: okay, for some reason I thought sipa's measurements had shown that we still had a needlessly low hitrate.
73 2016-10-06 17:56:17	0|morcos|gmaxwell: hmm, i'd be interested to see that, but i think any low hit rates we have are probably due to txs we never accepted to our mempool in the first place, not sigs that we accidentally evicted..
74 2016-10-06 17:57:17	0|gmaxwell|yea, I may be conflating things. it might be that right now its from things we never accept, and when sipa tried validating everything, we found it tainted the cache.
75 2016-10-06 17:58:21	0|morcos|gmaxwell: i attempted an approximation by inserting 10 random signatures for every tx my node saw and deleting those same 10 if the tx appeared in a block
76 2016-10-06 17:59:18	0|morcos|i ran that for 6 months and the hit rate was 98.35% on average, whereas perfect would have been 98.38% and the existing algo was 97.99%.  with 40MB
77 2016-10-06 18:00:06	0|gmaxwell|hm. why was the existing algo lower?
78 2016-10-06 18:00:08	0|morcos|in reality many of those txs probably wouldn't have passed ATMP and made it into the cache, leading to a lower hit rate in practice, but not due to cache filling
79 2016-10-06 18:00:28	0|morcos|gmaxwell: primarily because you can cram more signatures in 40MB with the new design
80 2016-10-06 18:00:46	0|morcos|and partially because on a reorg the old algo doesn't have them, but the new one does with high probability
81 2016-10-06 18:00:52	0|gmaxwell|Makes sense.
82 2016-10-06 18:01:30	0|gmaxwell|yea, I'm really happy about the delete flag. I really didn't like the current code's behavior under reorg.
83 2016-10-06 18:01:43	0|gmaxwell|not good for network convergence for reorg to be much slower.
84 2016-10-06 18:02:01	0|morcos|gmaxwell: heh, probably MANY things to improve there
85 2016-10-06 18:02:19	0|gmaxwell|yes. but any improvement is good. :)
86 2016-10-06 18:03:03	0|gmaxwell|I'd take a wag that the validation performance limiter is now the additional sha256 used in the lookup.
87 2016-10-06 18:08:09	0|morcos|gmaxwell: as far as jeremyrubin and i can tell, the biggest limiter now is how much you blow out your various machine caches with different permutations of coinsviewcache size and flushing algo, sig cache lookup pattern, etc..  although this tends to affect performance AFTER validation has finished. i.e. slows down removeForBlock
88 2016-10-06 18:09:32	0|morcos|to really get better performance we'll need to switch data representations to eliminate so much allocating, deallocating and copying of vectors.
89 2016-10-06 18:10:09	0|morcos|but the next lowest hanging fruit is sdaftuar's idea: just speed up the path from validation finished to actually relaying to new peers, which isn't optimized at all right now
90 2016-10-06 18:10:39	0|gmaxwell|well BIP152 allows you to send the block before its verified.
91 2016-10-06 18:10:52	0|gmaxwell|so that would be an obvious thing to actually do.
92 2016-10-06 18:11:47	0|morcos|sure.  same issue still applies though.  you're just deciding to send it earlier.
93 2016-10-06 18:13:01	0|sdaftuar|i think gmaxwell's point is that the way you'd do that would be in the handling of eg ProcessNewBlock, so it'd actually go out earlier (rather than in SendMessages)
94 2016-10-06 18:13:07	0|sdaftuar|that's basically what i was planning to do
95 2016-10-06 18:13:37	0|gmaxwell|well it gets the whole validation process out of that critical path.
96 2016-10-06 19:01:46	0|michagogo|(Go figure... After a bunch of weeks of having something else every time, this week I'm around but the meeting is cancelled)
97 2016-10-06 19:20:54	0|jeremyrubin|gmaxwell: note you don't actually have to re-evaluate the signature, just see if its hash is present on eviction (but you still need to eval script to extract signatures)