For those of you going to SPAA this coming week, I'll see you there. I'll be giving the last two talks at the conference, to what I expect (based on the timing) will be a nearly empty room. That just means there will be no pressure.
If you want to hear more about the papers, you can go to Abstract Talk, where I discuss them both. Here is the link for the paper Balanced Allocations and Double Hashing, and the link for the paper Parallel Peeling Algorithms. I haven't done podcasts for Abstract Talk before, so be forgiving if you go to listen. It seems like a cool idea; what do people think of it in practice?
For those who expect to be sightseeing in Prague during the final session (or who just aren't going to SPAA), here's the brief overview.
For Balanced Allocations and Double Hashing:
In the well-known balanced allocations paradigm, balls are hashed sequentially into bins, where each ball gets d random choices from the hash functions and is then placed in the least loaded of its chosen bins. With double hashing, we replace the d independent random choices with d choices of the form a, a+b, a+2b, a+3b, ..., a+(d-1)b, where a and b are random values (determined by hashing). That is, we build the d choices from 2 random numbers instead of using d random numbers. (The numbers are taken mod the size of the hash table, and b should be relatively prime to the hash table size... let's stop worrying about details.) We find empirically that this makes no difference, in a very strong sense: the fraction of bins with load j appears the same for every value of j in both systems, so you can't really tell them apart. We provide a theoretical explanation, based on fluid limit models, for why this happens.
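If you want to see the effect for yourself, here's a minimal Python sketch of the two schemes. This is not our experimental code; the parameters are illustrative, and I pick a prime table size so that any nonzero b is relatively prime to it.

```python
import random

def simulate(n, m, d, double_hashing):
    """Throw m balls into n bins; each ball picks the least loaded of d
    candidate bins. Candidates are either d independent random bins, or
    d bins of the form a, a+b, ..., a+(d-1)b mod n (double hashing)."""
    loads = [0] * n
    for _ in range(m):
        if double_hashing:
            a = random.randrange(n)
            b = random.randrange(1, n)  # n prime, so b is relatively prime to n
            choices = [(a + i * b) % n for i in range(d)]
        else:
            choices = [random.randrange(n) for _ in range(d)]
        best = min(choices, key=lambda c: loads[c])
        loads[best] += 1
    return loads

# Illustrative run: n balls into n prime bins, d = 3 choices.
n = 10007
for dh in (False, True):
    loads = simulate(n, n, d=3, double_hashing=dh)
    dist = {}
    for l in loads:
        dist[l] = dist.get(l, 0) + 1
    print("double hashing:" if dh else "fully random:  ", sorted(dist.items()))
```

Running this, the two printed load distributions come out essentially indistinguishable, which is the empirical finding the paper explains.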
For Parallel Peeling Algorithms:
The analysis of several algorithms and data structures can be framed as a peeling process on a random hypergraph: vertices with degree less than k are removed until no vertices of degree less than k remain. The remaining hypergraph is known as the k-core. We analyze parallel peeling processes, where in each round, all vertices of degree less than k are removed simultaneously. It is known that, below a specific edge density threshold, the k-core is empty with high probability. We show that, with high probability, below this threshold, only O(log log n) rounds of peeling are needed to obtain the empty k-core for r-uniform hypergraphs. Interestingly, above this threshold, Ω(log n) rounds of peeling are required to find the non-empty k-core. Since most algorithms and data structures aim to peel to an empty k-core, this asymmetry appears fortunate; nature is on our side. We verify the theoretical results both with simulation and with a parallel implementation using graphics processing units (GPUs). Our implementation provides insights into how to structure parallel peeling algorithms for efficiency in practice.
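Here's a small sequential Python sketch of the round-by-round process (a simulation of the parallel rounds, not the GPU implementation). The 3-uniform example and the edge density of 0.7 are illustrative assumptions, chosen below the 2-core threshold (roughly 0.818 for r = 3, k = 2), so the peeling should terminate with an empty core after very few rounds.

```python
import random

def parallel_peel(n, edges, k):
    """Repeatedly remove, in parallel rounds, every vertex whose degree
    (number of surviving incident hyperedges) is less than k.
    Returns (number of rounds, surviving vertices = the k-core)."""
    incident = [[] for _ in range(n)]
    for idx, e in enumerate(edges):
        for v in e:
            incident[v].append(idx)
    degree = [len(incident[v]) for v in range(n)]
    edge_alive = [True] * len(edges)
    alive = [True] * n
    rounds = 0
    while True:
        # One parallel round: all currently low-degree vertices go at once.
        to_remove = [v for v in range(n) if alive[v] and degree[v] < k]
        if not to_remove:
            break
        rounds += 1
        for v in to_remove:
            alive[v] = False
            for idx in incident[v]:
                if edge_alive[idx]:
                    edge_alive[idx] = False
                    for u in edges[idx]:
                        if alive[u]:
                            degree[u] -= 1
    return rounds, [v for v in range(n) if alive[v]]

# Illustrative run: random 3-uniform hypergraph, edge density 0.7 < ~0.818.
random.seed(1)
n = 100000
edges = [tuple(random.sample(range(n), 3)) for _ in range(int(0.7 * n))]
rounds, core = parallel_peel(n, edges, k=2)
print("rounds:", rounds, "core size:", len(core))
```

Below the threshold the printed round count stays tiny even for large n, which is the O(log log n) behavior; pushing the density above the threshold makes the process grind through many more rounds before stopping at a non-empty core.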
The Parallel Peeling Algorithms paper was, to my surprise, awarded Best Paper. Maybe I'll write up more about the surprise at some point, but I'd certainly like to thank the committee for the honor, and pass the credit to where it is due, with my co-authors Jiayang Jiang and Justin Thaler.