Algorithm for license codes




Hello,

I know that posting here the details of a new algorithm would irk most of us, but this is the only place I know where I can ask for a reliable advice. Also, I attempted to document myself the best I can, but I eventually have to land here in the hope to find someone with better skills in math than me. :-)


I'm studying the feasibility of an algorithm based on public key cryptography that is suited for generating strong license codes for product activation.

The idea is that the encoding algorithm along with some private information are kept by the vendor and used to generate short alphanumerical codes. Upon installation (or possibly later on) the product requires one valid code to become operational.


My goals are the following:

(1) prevent, or at least make it difficult for an attacker to create a keygenerator. When I say difficult I mean that the cost of producing an illegal code should be higher than the cost of buying one. Although this clearly depends on the price of the license and the cost of the computing power available to the attacker, a rough estimate of 2^64 in the search space is quite satisfactory to me.

(2) have short codes (roughly 30 alphanumerical characters) which can be easily printed on a label, typed on a keyboard or spoken over a phoneline.


The algorithm however cannot and does not attempt to prevent an attacker analyzing the bytecode and altering the instruction flow, to disable/circumvent the check.

Although this will likely result in a unauthorized working copy of the product, this doesn't cover future upgrades (which an illegal keygenerator would) and gives some added value to my algorithm if compared to most in use today which have either:

- poor security (the generating function can be easily inverted by a skilled reverse engineer)
- license codes too long to be practical.


Major public key algorithms like RSA and ECC would meet the secuirty requirement but produce codes that are too long.

Viceversa symmetric algorithms would be fine for code lenght but fail to meet the security requirement because the key has to be at disposal of the decoding routine and would be therefore easy to recover.

I therefore shifted my attention to public key algoritms based on multivariate cryptosystems which are believed to have lower security if compared to RSA and ECC, but operate with shorter bit vectors.

HFE and related are however hevaily encumbered with patents and are unlikely to be usable on a royality free basis, so I decided to play a bit with the underlaying math and devise my own logic.

Well, to tell the truth this is not something new, it's already known and probably discarded (see below) but maybe - and that's why I'm asking here - discarded for general use, while under the contraints I set may be worthwhile.

Anyway, the idea is to generate a second degree system of equations defined in F_2 in order to make it easy to compute:

y = F( X )

where:

Y = [ y_1,y_2,...,y_m ] is a m bit length vector
X = [ x_1,x_2,...,x_n ] is a n bit length vector

with m <= n and F expressed as a set P_1,...,P_m of polynomials of 2nd degree over F_2 such that:

y_1 = P_1(x_1,...,x_m)
....
y_m = P_m(X_1,...,x_m)


The public key part are the polynomials, the private part are an equivalent representation of the polynomials such that it's easy, given Y, to compute the reverse X = F^-1( Y ).

So far it's not different from HFE (for example the core function of Quartz).

However my trapdoor function is simpler (although certainly weaker) than that.

My idea is to use an auxiliary system like that:

given:

n_1 < n_2 < ... < n_k < n_(k+1) = n
m_1 < m_2 < ... < m_k < m_(k+1) = m

and the set of randomly generated 2nd degree polynomials (defined in F_2) Q_1,..., Q_m such that:

z_1 = Q_1( w_1, ..., w_n1 )
z_2 = Q_2( w_1, ..., w_n1 )
....
z_m1= Q_m1( w_1, ..., w_n1 )
-------------------------------
z_m1+1 = Q_m1+1( w_1, ..., w_n2 )
z_m1+2 = Q_m1+2( w_1, ..., w_n2 )
....
z_m2 = Q_m2( w_1, ..., w_n2 )
-------------------------------
....
-------------------------------
z_mk+1 = Q_mk( w_1, ..., w_nk )
z_mk+2 = Q_mk( w_1, ..., w_nk )
....
z_m = Q_mk( w_1, ..., w_nk )


In practice it works in blocks, the first m1 equations are polynomials that have only the fist n1 terms as input and consistute the first block.

The second block is defind by m2-m1 equations in the n2 unknowns and so on.

If we choose n_1, ... n_k, m_1, ..., m_k properly we can meet the following useful constraints:

- redundancy: if the number of polynomials is less than the number of unknowns the system is underdefined and we can increase at wish the chances of having at least one solution.

- solvability: if the number of unknowns for each block of polynomials is not too high, we can bruteforce a solution for the randomly defined equations.


Basically, given z_1, ..., z_m1 we can bruteforce a solution for w_1, ...., w_n1 (first block).
Then we substitute the values w_1, ..., w_n1 in the second block and bruteforce a solution for it:
z_m1+1, ..., z_m2 with only w_n1+1, ..., w_n2 as unknowns.

Proceding this way, one block a time, we can solve the whole system.

The next step is to hide the system weak structure.

My idea is to apply two linear transformations L1 and L2 such that:

w_i = L1_i ( x_1, ..., x_n )

and

Y_j = L2_j ( z_1, ..., z_m )


Being transfomration L1 and L2 linear the whole system Y = L2( Z ( L1 ( X ) ) ) reamins of 2nd degree, but unless you can decompose the function, it should not be easy to invert.


This scheme is very similar to the one round scheme described by Patarin in his paper: "Asymetric Crypto with S-Boxes"

http://www.ussrback.com/cryptopapers/1997/www.smartcard.bull.com/sct/fr/fichier/icics97b.ps

He claims that his scheme is broken, beacuse variables can be easily separated.

However I believe that if S-box are large, say for example: n_i = 16*i, m_j = 15*j, with n=128 and m=120 the four attacks he presents cannot be easily exploited. Also, his scheme is slightly different because I use bruteforce on every block.

I would be anyway happy if "easily broken" would however require resources above the low threshold I set.

I've posted six months ago a challenge on a reverse engineering site with a working example of my second degree decoder (full source C inclued, not very long to read and very easy to understand (but not to solve, I hope)).

http://www.crackmes.de/users/gbe32241/sddecoder/

No one has yet solved it (the posted solution is a fake).
While this doesn't mean much, at least has encouraged me to ask here for your opinion.

Any comments ?

Kind regards,
Giuliano Bertoletti.














.



Relevant Pages

  • Re: A Revolutionary Breakthrough in Data Compression!
    ... What I have got is a minor improvement of the classic Huffman ... This is usually addressed by modifying the algorithm to arrange the ... even require rearrangement of the codes: ...
    (comp.compression)
  • Re: Generating a large sequence of unique, random numbers
    ... > Generate n unique codes of length l so that they are non-predictable. ... > whatsoever which algorithm to use or how to start in general. ... Luby-Rackoff construct, using F1..F4, producing L' and R'. ... Use that to encrypt the ...
    (sci.crypt)
  • Re: Adaptive Huffman algorithm with syllable compression?
    ... I had now a closer look at LZW algorithm and no it isn't LZW. ... 12 Bit long codes, and so on. ... I don't understand which element make you thinking about Adaptive Huffman. ... don't understand how the syllable compression is integrated in this. ...
    (comp.compression)
  • Re: sparse polynomial arithmetic
    ... Remaining within the boundaries of this representation (which are ... do arithmetics directly on codes ... perfect hash values in a hash set ... The algorithm first checks if the exponents of the polynomials to be ...
    (sci.math.symbolic)
  • MACIS 2007 - CALL FOR PARTICIPATION
    ... Sciences (MACIS) is a new series of conferences where foundational research on theoretical and practical problems of mathematics for computing and information processing may be presented and discussed. ... Computing Roots of Polynomials using Bivariate Quadratic Clipping ... An algorithm for checking whether the toric ideal of an affine monomial curve is a complete intersection ...
    (comp.specification.z)