Differentiable Fuzzy Cryptographic Hashes
- Differentiable fuzzy cryptographic hashes are continuous relaxations of classical hash functions, replacing bit-wise operations with smooth, gradient-friendly analogs.
- They employ fuzzy primitives (e.g., fuzzy AND, OR, XOR, and ADD) that mimic Boolean logic on binary inputs while allowing backpropagation for optimization.
- Neural network inversion experiments demonstrate effective recovery in low-round settings, though adding rounds leads to vanishing gradients and performance no better than random guessing.
Differentiable fuzzy cryptographic hashes (“fuzzy hashes”) generalize classical cryptographic hash functions (CHFs) to real-valued inputs via continuous relaxations of bit-wise operations. By replacing discrete logic with differentiable “fuzzy” primitives, the resulting hashes become almost everywhere smooth functions of the input, enabling gradient propagation and optimization-based analysis. This framework forms the basis for neural network-based inversion studies and opens new avenues for differentiable cryptanalytical techniques, as detailed in Goncharov (2019) (Goncharov, 2019).
1. Continuous Relaxation of Cryptographic Hashes
Standard hash functions are piecewise constant when the message bits are considered over $\mathbb{R}$: infinitesimal changes to the input typically have zero effect, precluding optimization via gradient methods. Fuzzy hashes address this limitation by letting each “bit” take values in a continuum, either the interval $[0,1]$ (linear) or the unit circle (circular):
- Standard fuzzy bit: a value $a \in [0,1]$, with the endpoints $0$ and $1$ recovering the Boolean values.
- Circular fuzzy bit: an angle $\varphi \in [0, 2\pi)$, mapped onto the unit circle so that the two Boolean values correspond to antipodal points.
Logical and arithmetic primitives (NOT, AND, OR, XOR, ADD) are replaced with real-valued, differentiable analogs that coincide with their Boolean forms on $\{0,1\}$. This construction yields a hash $H : [0,1]^n \to [0,1]^m$, a smooth (or piecewise smooth) mapping suitable for gradient-based methods.
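To make the contrast concrete, the minimal Python sketch below compares a Boolean XOR applied to rounded real inputs (piecewise constant, hence zero gradient almost everywhere) with the fuzzy XOR relaxation introduced in the next section. The finite-difference helper and the sample inputs are illustrative choices, not part of the original construction.

```python
# Minimal sketch: a Boolean operation is piecewise constant in its real-valued
# inputs (zero gradient almost everywhere), while the fuzzy relaxation is smooth.

def boolean_xor(a: float, b: float) -> float:
    """Round the real inputs to bits first, as a classical hash effectively does."""
    return float(round(a) ^ round(b))

def fuzzy_xor(a: float, b: float) -> float:
    """Continuous relaxation that agrees with XOR at {0, 1}."""
    return a * (1.0 - b) + b * (1.0 - a)

def finite_difference(f, a, b, eps=1e-4):
    """Numerical derivative with respect to the first argument."""
    return (f(a + eps, b) - f(a - eps, b)) / (2 * eps)

if __name__ == "__main__":
    a, b = 0.3, 0.8
    print(finite_difference(boolean_xor, a, b))  # 0.0 -- no gradient signal
    print(finite_difference(fuzzy_xor, a, b))    # approx 1 - 2*b = -0.6
```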
2. Fuzzy Bitwise and Arithmetic Operations
Fuzzy primitives are defined to preserve Boolean correspondence, output range, and differentiability. The main operations are:
| Operation | Formula | Gradient (w.r.t. first argument) |
|---|---|---|
| NOT | $\lnot a = 1 - a$ | $-1$ |
| AND | $a \land b = ab$ | $b$ |
| OR | $a \lor b = a + b - ab$ | $1 - b$ |
| XOR | $a \oplus b = a(1 - b) + b(1 - a)$ | $1 - 2b$ |
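The table translates directly into code. A minimal Python rendering of the four primitives, with a sanity check confirming agreement with Boolean logic on $\{0,1\}$:

```python
# Sketch of the fuzzy primitives from the table above; each coincides with its
# Boolean counterpart on {0, 1} and is differentiable in between.
from itertools import product

def f_not(a):    return 1.0 - a
def f_and(a, b): return a * b
def f_or(a, b):  return a + b - a * b
def f_xor(a, b): return a * (1.0 - b) + b * (1.0 - a)

# Agreement with Boolean logic at the vertices of the unit square.
for a, b in product((0.0, 1.0), repeat=2):
    assert f_and(a, b) == float(int(a) & int(b))
    assert f_or(a, b)  == float(int(a) | int(b))
    assert f_xor(a, b) == float(int(a) ^ int(b))

# Hand-coded gradients w.r.t. the first argument (matching the table):
#   d f_not / da = -1,  d f_and / da = b,  d f_or / da = 1 - b,  d f_xor / da = 1 - 2b
```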
Fuzzy ADD generalizes $n$-bit addition with carries. For words $a = (a_{n-1}, \dots, a_0)$ and $b = (b_{n-1}, \dots, b_0)$, define sum bits $s_i$ and carries $c_i$ (with $c_0 = 0$) recursively:
- $s_i = D(a_i, b_i, c_i)$, where $D(a, b, c) = (a \oplus b) \oplus c = a(1-b)(1-c) + b(1-a)(1-c) + c(1-a)(1-b) + abc$
This construction ensures exact agreement with binary addition modulo $2^n$ at the Boolean vertices $\{0,1\}^n$.
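A sketch of the resulting ripple-carry fuzzy adder follows. The LSB-first word layout and the fuzzy majority used for the carry bit are assumptions made for illustration (any carry relaxation agreeing with the Boolean carry on $\{0,1\}^3$ would do); the sum bit uses the function $D$ defined above.

```python
# Sketch of fuzzy n-bit addition with carries (ripple-carry adder built from
# fuzzy primitives). Word layout and the carry relaxation are illustrative choices.

def f_xor(a, b):
    return a * (1.0 - b) + b * (1.0 - a)

def f_xor3(a, b, c):
    # D(a, b, c) = (a XOR b) XOR c, which expands to the polynomial given in the text.
    return f_xor(f_xor(a, b), c)

def f_carry(a, b, c):
    # Fuzzy majority: agrees with the Boolean carry bit at the vertices {0,1}^3.
    return a * b + a * c + b * c - 2.0 * a * b * c

def fuzzy_add(x, y):
    """Add two fuzzy words given as lists of fuzzy bits, LSB first; drops the final carry (mod 2^n)."""
    carry, out = 0.0, []
    for a, b in zip(x, y):
        out.append(f_xor3(a, b, carry))
        carry = f_carry(a, b, carry)
    return out

# At Boolean vertices this reproduces ordinary addition modulo 2^n:
x, y = [1.0, 1.0, 0.0], [1.0, 0.0, 1.0]   # 3 (binary 011) + 5 (binary 101), LSB first
print(fuzzy_add(x, y))                    # [0.0, 0.0, 0.0] -> 8 mod 8 = 0
```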
3. Construction of Differentiable Cryptographic Hash Functions
The overall structure of major hash algorithms (e.g., MD5, SHA-1, SHA-2, SHA-3) remains unchanged, except every operation is replaced by its fuzzy counterpart. Bitwise permutations, rotations, and shifts remain as linear reindexings, while core non-linear layers are constructed with fuzzy logic and arithmetic.
- Merkle–Damgård-based hashes (MD5, SHA-1, SHA-256):
- Message padding, block structure, and state initialization follow reference specifications.
- Internal operations are recomputed using fuzzy primitives.
- Example: SHA-2 round functions use fuzzy XOR, AND, OR, and ADD in state updates.
- SHA-3/Keccak (sponge function):
- The 1600-bit state is updated with fuzzy operations in the nonlinear $\chi$-step and the other logic-based steps, while the reshuffling steps ($\rho$, $\pi$) are applied without modification.
This yields a fully differentiable mapping $H : [0,1]^n \to [0,1]^m$, whose total input-output gradient is computable by contemporary backpropagation frameworks.
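As an illustration of the substitution, the sketch below applies fuzzy primitives to the nonlinear Keccak $\chi$-step on a single row of five bits ($a_x \leftarrow a_x \oplus (\lnot a_{x+1} \land a_{x+2})$, indices mod 5). Treating one row in isolation, rather than the full 1600-bit state, is a simplification for illustration.

```python
# Sketch: "fuzzifying" one nonlinear Keccak step. The chi step maps each row bit
# as a[x] <- a[x] XOR ((NOT a[x+1]) AND a[x+2]) (indices mod 5); here every
# Boolean operator is replaced by its fuzzy counterpart.

def f_not(a):    return 1.0 - a
def f_and(a, b): return a * b
def f_xor(a, b): return a * (1.0 - b) + b * (1.0 - a)

def fuzzy_chi_row(row):
    """Apply the chi step to one row of 5 fuzzy bits."""
    n = len(row)
    return [f_xor(row[x], f_and(f_not(row[(x + 1) % n]), row[(x + 2) % n]))
            for x in range(n)]

print(fuzzy_chi_row([1.0, 0.0, 1.0, 0.0, 0.0]))  # Boolean inputs give Boolean outputs
```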
4. Neural Network Inversion Experiments
Given the differentiable fuzzy hash $H$, neural networks are trained to invert the hash for short message lengths and few hash rounds. The primary objective is, for a target (possibly fuzzy) hash $h$, to construct a neural network $\mathcal{N}$ such that $H(\mathcal{N}(h)) \approx h$.
Neural network architecture and training:
- Fully connected multilayer perceptron, with
- ELU activation in hidden layers,
- Batch normalization after every layer,
- Sigmoid/hard sigmoid output activation.
- Loss: elementwise difference or binary cross-entropy between the predicted and target hash.
- Optimizer: Adam or Nadam, typical learning rate $0.002$.
- Modes:
- General inverter: train on many (message, hash) pairs to generalize inversion, as sketched below.
- Single-hash inverter: fix a single target hash $h$ and iteratively refine the candidate preimage.
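A hedged PyTorch sketch of the general-inverter mode: an MLP with ELU activations, batch normalization, and a sigmoid output maps a target hash to a candidate fuzzy message, and a differentiable `fuzzy_hash` closes the loop so that the binary cross-entropy between the re-hashed candidate and the target can be minimized with Adam at learning rate 0.002. The `fuzzy_hash` stand-in, the layer widths, and the message/hash sizes are placeholders, not the configuration of the original experiments.

```python
import torch
import torch.nn as nn

MSG_BITS, HASH_BITS = 32, 160

torch.manual_seed(0)
W = 0.5 * torch.randn(MSG_BITS, HASH_BITS)  # fixed random mixing as a stand-in hash core

def fuzzy_hash(msg: torch.Tensor) -> torch.Tensor:
    """Placeholder for a differentiable fuzzy hash H: [0,1]^n -> [0,1]^m."""
    return torch.sigmoid(msg @ W)

inverter = nn.Sequential(
    nn.Linear(HASH_BITS, 256), nn.BatchNorm1d(256), nn.ELU(),
    nn.Linear(256, 256), nn.BatchNorm1d(256), nn.ELU(),
    nn.Linear(256, MSG_BITS), nn.Sigmoid(),          # fuzzy message bits in [0, 1]
)

optimizer = torch.optim.Adam(inverter.parameters(), lr=0.002)
bce = nn.BCELoss()

for step in range(200):
    msgs = torch.randint(0, 2, (64, MSG_BITS)).float()  # random Boolean messages
    targets = fuzzy_hash(msgs).detach()                 # their (stand-in) fuzzy hashes
    optimizer.zero_grad()
    candidates = inverter(targets)                      # candidate fuzzy preimages
    loss = bce(fuzzy_hash(candidates), targets)         # re-hash and compare to target
    loss.backward()
    optimizer.step()
```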
Performance Summary:
| Hash (rounds, message length) | Avg. bit misses | Random-guess baseline |
|---|---|---|
| SHA-1 (1 round, 32-bit) | 2.9 | 16.1 |
| SHA-2 (1 round, 32-bit) | 6.1 | 32.0 |
| MD5 (1 round, 32-bit) | 5.5 | 16.0 |
| SHA-3 (1 round, 64-bit) | 0 (perfect) | 64.0+ |
As the number of rounds and message length increase, inversion accuracy approaches that of a random guess, indicating effective diffusion. Perfect or near-perfect inversion is possible only for severely weakened hash settings (a single round, reduced diffusion). For Keccak, inversion succeeds for the $\chi$-step alone but fails for more complete step compositions.
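For reference, the bit-miss metric reported above can be computed as follows (assumed definition: round the recovered fuzzy message to $\{0,1\}$ and count mismatches against the true preimage, so a random guess misses roughly half of the bits on average):

```python
# Sketch of the (assumed) evaluation metric: Hamming distance after rounding.

def bit_misses(recovered, true_bits):
    return sum(int(round(r)) != int(t) for r, t in zip(recovered, true_bits))

print(bit_misses([0.9, 0.2, 0.6, 0.1], [1, 0, 0, 0]))  # -> 1
```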
5. Gradient Flow and Analytical Properties
Each fuzzy primitive exhibits a nonzero gradient almost everywhere on its domain, ensuring that, in principle, gradient signals can propagate back from the hash output to every input bit. Concrete formulas such as $\partial(a \land b)/\partial a = b$ and $\partial(a \oplus b)/\partial a = 1 - 2b$ are available to autodiff systems. The overall input-hash Jacobian is assembled by the chain rule through the hash function’s computation graph.
Repeated compositions, particularly of fuzzy ADD, drive intermediate values toward $0.5$, where derivatives diminish, leading to vanishing gradients. This effect is less severe for circular representations, which, however, increase non-linearity and create their own optimization challenges.
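The contraction toward $0.5$ and the accompanying gradient decay can be seen in a toy numeric experiment (an assumed setup using a chain of fuzzy XORs with random operands, not the hash itself):

```python
# Toy numeric sketch: chaining fuzzy XORs with random fuzzy operands drives the
# running value toward 0.5, while the chain-rule gradient
# d x_k / d x_0 = prod_i (1 - 2 r_i) shrinks toward zero.
import random

random.seed(0)

def f_xor(a, b):
    return a * (1.0 - b) + b * (1.0 - a)

x, grad = 0.9, 1.0
for k in range(1, 21):
    r = random.random()          # a random fuzzy operand in (0, 1)
    x = f_xor(x, r)              # value contracts toward 0.5
    grad *= (1.0 - 2.0 * r)      # d f_xor(a, b) / d a = 1 - 2b
    if k % 5 == 0:
        print(f"step {k:2d}: value = {x:.4f}, |gradient| = {abs(grad):.2e}")
```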
6. Limitations, Comparison, and Open Directions
While differentiable fuzzy hashes constitute a proof-of-concept for gradient-based cryptanalytics, their current practical utility is limited:
- Neural network inversion outperforms random guessing only for weakened (low-round) hashes.
- No guarantee exists of finding an exact preimage; results are evaluated by the count of message bit mismatches after rounding to $\{0,1\}$.
- Classical preimage attack methods (SAT solvers, meet-in-the-middle, etc.) remain more effective for nontrivial hash settings.
- The NN needs to be retrained for each message length.
- Strong diffusion (beyond a few rounds for SHA-1/SHA-2 and $2$ rounds for SHA-3) causes inversion attempts to stall at random-guess levels.
Potential directions for further investigation include:
- Mixed use of several fuzzy operation types (product t-norm, min-max, circular per-bit transformations).
- Hybrid pipelines combining NN-based outputs as “warm starts” for SAT solvers.
- Deeper NN architectures (residual, attention, convolution layers).
- Training to invert intermediate states or single rounds sequentially.
- Ensemble inference to mitigate prediction variance.
7. Significance and Future Perspectives
Differentiable fuzzy hashing provides a mathematical framework for continuous, back-propagatable analogs of classical hash functions. While current results do not offer cryptanalytical breakthroughs, the method demonstrates a direct path for applying neural models to hash inversion and suggests new connections between cryptographic function analysis, optimization, and neural network architectures (Goncharov, 2019). Plausible implications include exploring fuzzy hashes as differentiable surrogates for cryptographic primitives in theoretical research, gradient-based inversion heuristics, and the construction of hybrid attack algorithms. Further advances in fuzzy primitive design, network architectures, and optimization strategies may widen the applicability or impact of this technique.