Rabin fingerprint

The Rabin fingerprinting scheme is a method for implementing fingerprints using polynomials over a finite field. It was proposed by Michael O. Rabin.[1]

Scheme

Given an n-bit message m₀,...,m_n-1, we view it as a polynomial of degree n-1 over the finite field GF(2).

f(x)=m_{0}+m_{1}x+\ldots +m_{n-1}x^{n-1}

We then pick a random irreducible polynomial $p(x)$ of degree k over GF(2), and we define the fingerprint of the message m to be the remainder $r(x)$ after division of $f(x)$ by $p(x)$ over GF(2) which can be viewed as a polynomial of degree k-1 or as a k-bit number.

Applications

Many implementations of the Rabin–Karp algorithm internally use Rabin fingerprints.

The Low Bandwidth Network Filesystem (LBFS) from MIT uses Rabin fingerprints to implement variable size shift-resistant blocks.[2] The basic idea is that the filesystem computes the cryptographic hash of each block in a file. To save on transfers between the client and server, they compare their checksums and only transfer blocks whose checksums differ. But one problem with this scheme is that a single insertion at the beginning of the file will cause every checksum to change if fixed-sized (e.g. 4 KB) blocks are used. So the idea is to select blocks not based on a specific offset but rather by some property of the block contents. LBFS does this by sliding a 48 byte window over the file and computing the Rabin fingerprint of each window. When the low 13 bits of the fingerprint are zero LBFS calls those 48 bytes a breakpoint and ends the current block and begins a new one. Since the output of Rabin fingerprints are pseudo-random the probability of any given 48 bytes being a breakpoint is $2^{-13}$ (1 in 8192). This has the effect of shift-resistant variable size blocks. Any hash function could be used to divide a long file into blocks (as long as a cryptographic hash function is then used to find the checksum of each block): but the Rabin fingerprint is an efficient rolling hash, since the computation of the Rabin fingerprint of region B can reuse some of the computation of the Rabin fingerprint of region A when regions A and B overlap.

Note that this is a problem similar to that faced by rsync.

gollark: The nuclearcraft ones are just too slow.

gollark: Copy in a known-good reactor constantly to avert meltdown issues, replace all cooling with moderators and cells packed as densely as possible, figure out how to automate all components from raw resources, feed most power-producing fuel, repeat.

gollark: Oh yeah, copy in a known-good reactor constantly.

gollark: Powered by a single electrolytic separator!

gollark: With enough, I don't know, formation planes and an internal ME network, or turtles or something, self-repairing repeatedly-meltdowning reactors could become the power source of the future.

References

Michael O. Rabin (1981). "Fingerprinting by Random Polynomials" (PDF). Center for Research in Computing Technology, Harvard University. Tech Report TR-CSE-03-01. Retrieved 2007-03-22. Cite journal requires |journal= (help)
Athicha Muthitacharoen, Benjie Chen, and David Mazières "A Low-bandwidth Network File System"

External links

Andrei Z. Broder (1993). "Some applications of Rabin's fingerprinting method". Retrieved 2011-09-12. Cite journal requires |journal= (help)
David Andersen (2007). "Exploiting Similarity for Multi-Source Downloads using File Handprints". Retrieved 2007-04-12. Cite journal requires |journal= (help)
Ross N. Williams (1993). "A painless guide to CRC Error detection algorithms".

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.

[1] Michael O. Rabin (1981). "Fingerprinting by Random Polynomials" (PDF). Center for Research in Computing Technology, Harvard University. Tech Report TR-CSE-03-01. Retrieved 2007-03-22. Cite journal requires |journal= (help)

[2] Athicha Muthitacharoen, Benjie Chen, and David Mazières "A Low-bandwidth Network File System"

Rabin fingerprint

Scheme

Applications

See also

References

External links