full-disclosure-uk August 2008 archive


Date: Fri Aug 08 2008 - 20:33:18 GMT

To: "Leichter, Jerry" <leichter_jerrold@emc.com>

At Fri, 8 Aug 2008 15:52:07 -0400 (EDT),
Leichter, Jerry wrote:

> | > > Funnily enough I was just working on this -- and found that we'd
> | > > end up adding a couple megabytes to every browser. #DEFINE
> | > > NONSTARTER. I am curious about the feasibility of a large bloom
> | > > filter that falls back to online checking though. This has side
> | > > effects but perhaps they can be made statistically very unlikely,
> | > > without blowing out the size of a browser.
> | > Why do you say a couple of megabytes? 99% of the value would be
> | > 1024-bit RSA keys. There are ~32,000 such keys. If you devote an
> | > 80-bit hash to each one (which is easily large enough to give you a
> | > vanishingly small false positive probability; you could probably get
> | > away with 64 bits), that's 320KB. Given that the smallest Firefox
> | > [...]
>
> You can get by with a lot less than 64 bits. People see problems like
> this and immediately think "birthday paradox", but there is no "birthday
> paradox" here: You aren't looking for pairs in an ever-growing set,
> you're looking for matches against a fixed set. If you use 30-bit
> hashes - giving you about a 120KB table - the chance that any given
> key happens to hash to something in the table is one in a billion,
> now and forever. (Of course, if you use a given key repeatedly, and
> it happens to be that 1 in a billion, it will hit every time. So an
> additional table of "known good keys that happen to collide" is worth
> maintaining. Even if you somehow built and maintained that table for
> all the keys across all the systems in the world - how big would it
> get, if only 1 in a billion keys world-wide got entered?)

I don't believe your math is correct here. Or rather, it would be correct if there were only one bad key.

Remember, there are N bad keys and you're using a b-bit hash, which has 2^b distinct values. If you put N' entries in the hash table, the probability that a new key will have the same digest as one of them is N'/(2^b). If b is large enough to make collisions among the bad keys themselves rare, then N' ≈ N, so the probability is about N/(2^b).
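A minimal Python sketch of the kind of truncated-digest table being discussed. The key names, the choice of SHA-256, and the 45-bit width are all illustrative assumptions, not anything specified in the thread; the point is only that a fresh key false-positives with probability about N/2^b:

```python
import hashlib

B = 45  # bits per table entry (illustrative width)

def truncated_digest(key: bytes, bits: int = B) -> int:
    """Top `bits` bits of a SHA-256 digest of the key."""
    full = int.from_bytes(hashlib.sha256(key).digest(), "big")
    return full >> (256 - bits)

# Hypothetical blocklist: N "bad" keys stored as B-bit truncated digests.
bad_keys = [b"bad-key-%d" % i for i in range(2**10)]
table = {truncated_digest(k) for k in bad_keys}

def is_blocked(key: bytes) -> bool:
    return truncated_digest(key) in table

# A listed key always matches; a fresh key collides with probability
# roughly N / 2**B (here about 2**-35).
fp_prob = len(table) / 2**B
```

Note that the table stores only the truncated digests, so its size grows with b, not with the key size, which is why the byte counts in the quoted estimates are so small.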

To be concrete, we have 2^15 distinct keys, so the probability of a false positive becomes (2^15)/(2^b) = 2^(15-b). To get that probability below one in a billion (about 2^-30), we need b - 15 >= 30, so you need about 45 bits. I chose 64 because it seemed to me that a false positive probability of about 2^-49 was better.
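The arithmetic above can be checked directly; this is just a restatement of the inequality N/2^b <= target, with no assumptions beyond the figures already in the paragraph:

```python
from math import ceil, log2

N = 2**15           # distinct bad keys
target = 2**-30     # roughly one in a billion

# N / 2**b <= target  <=>  b >= log2(N) - log2(target)
b = ceil(log2(N) - log2(target))
print(b)  # 45 bits suffice

# With a 64-bit hash instead, the false positive probability drops to:
print(log2(N / 2**64))  # -49.0, i.e. 2**-49
```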

-Ekr

Full-Disclosure - We believe in it.

Charter: http://lists.grok.org.uk/full-disclosure-charter.html Hosted and sponsored by Secunia - http://secunia.com/
