Minimal Perfect Hashing

This research line investigates the design of minimal perfect hash functions that enable constant-time, collision-free access to large static datasets while compressing storage requirements near the information-theoretic lower bound.

People
Giulio Ermanno Pibiri

Description

Given a set S of n distinct keys, a function f that bijectively maps the keys of S into the first n natural numbers is called a minimal perfect hash function (MPHF) for S. Algorithms that find such functions when n is large and retain constant evaluation time are of practical interest. For instance, search engines and databases typically use minimal perfect hash functions to quickly assign identifiers to static sets of variable-length keys such as strings.

The challenge is to design an algorithm which is efficient in three different aspects: time to find f (construction time), time to evaluate f on a key of S (lookup time), and space of representation for f.

The main, general-purpose, MPHF developed and maintained by RAVEN is PTHash. PTHash combines very fast construction with good space effectiveness and very fast lookup time. (Interestingly, PTHash inspired subsequent development, such as PHOBIC, PtrHash, and PHast.)

Main Collaborations

Peter Sanders’ group (KIT, Germany).

Software

PTHash: PTHash is a fast and compact minimal perfect hash function.
LPHash: LPHash is a fast and compact locality-preserving minimal perfect hashing for k-mer sets.

Publications

PTHash: Revisiting FCH Minimal Perfect Hashing International ACM Conference on Research and Development in Information Retrieval (SIGIR), 2021.
Parallel and External-Memory Construction of Minimal Perfect Hash Functions with PTHash IEEE Transactions on Knowledge and Data Engineering (TKDE), 2023.
Locality-Preserving Minimal Perfect Hashing of k-mers Bioinformatics, 2023.
PHOBIC: Perfect Hashing with Optimized Bucket Sizes and Interleaved Coding Annual European Symposium on Algorithms (ESA), 2024.
Modern Minimal Perfect Hashing: A Survey ACM Computing Surveys (CSUR), 2026

RAVEN — Research on Algorithms VENice

Minimal Perfect Hashing

Description

Main Collaborations

Software

Publications