RE: ascii output encryption needed as well

From: Michael Wojcik (Michael.Wojcik@merant.com)
Date: 10/30/01


Message-ID: <27B17B8B25A3D411B45800805FA7F01C0160E498@mtvmail.merant.com>
From: Michael Wojcik <Michael.Wojcik@merant.com>
To: secprog@securityfocus.com
Subject: RE: ascii output encryption needed as well
Date: Tue, 30 Oct 2001 13:06:39 -0800


> From: Joshua P. Luben [mailto:josh@ebard.net]
> Sent: Tuesday, October 30, 2001 1:03 PM

> I was recently delegated a project to build a secure serial
> number/registration number system. Our serial numbers are 18
> chars long. We would like the output to be a-z, A-Z, 0-9
>
> One approach we have tried is simply XORing the serial string with a
> salt/key. The output, however, falls within 0-255. Not all of
> those values are representable in a string, some are control values,
> etc.

Plus it probably has little to no security. Your description is too vague
to be sure, but I'd be very surprised if you're securing your data in any
meaningful way. Simple linear combination of plaintext with key material
that gets reused - either within a message or in multiple messages - is easy
to defeat.

Don't take this wrong, but "salt/key" suggests an author who doesn't know
much about cryptography. There's nothing wrong with that - but secure
systems aren't built by people who don't understand the technology they're
working with, any more than good OO software is built by programmers with a
strictly procedural background, for example.

> An option to fix this is binhexing the string, but that doubles the
> length of the string ...

As John Viega noted in another message, it's possible to build a
cryptosystem that enciphers from one arbitrary alphabet to another
(including from one arbitrary alphabet to itself). However, I would not
recommend you do this if you're concerned about the secrecy of your data.
(You may not be. Often serial number software licensing systems, for
example, have only minimal protection; they only need to raise the bar until
the effort to defeat them is roughly the same as the effort to just patch
the software to skip the licensing check. And if you're using symmetric
encryption then the key has to be stored in the software, which means an
attacker could ferret it out.) Cryptosystems should be designed and
implemented by experts. Getting them right is non-trivial.

If you want to explore this route, consider the following:

A typical bytewise stream cipher enciphers to and from the alphabet [0,255]
- i.e., the range of eight-bit values, considered as unsigned pure binary
numbers. After the key setup phase, it loops over the plaintext, taking it
one symbol (byte) at a time. It combines each plaintext symbol with a
symbol from the keystream generator to produce a ciphertext symbol.

The combining operator has to be a reversible function of two values from
the alphabet yielding another value from the alphabet. RC4 uses bitwise
exclusive-or because that's a handy operator that fits the criteria and is
fast and pretty much universally available. There are other operators that
would work just as well. Addition modulo 256 is one: to invert it, add the
additive inverse of the keystream symbol (256 minus that symbol), again
modulo 256.
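
To make the two combiners concrete, here is a minimal Python sketch. The
function names are made up for illustration, and the keystream is simply
passed in by the caller; it is not RC4 or any other real generator, so
nothing here says anything about security.

```python
def combine_xor(symbol: int, key: int) -> int:
    """Bitwise exclusive-or combiner; applying it twice undoes it."""
    return symbol ^ key


def combine_add(symbol: int, key: int, size: int = 256) -> int:
    """Addition modulo the alphabet size."""
    return (symbol + key) % size


def invert_add(cipher: int, key: int, size: int = 256) -> int:
    """Undo combine_add by adding the additive inverse of the key symbol."""
    return (cipher + (size - key)) % size


def encipher(plaintext: bytes, keystream: bytes) -> bytes:
    """Combine each plaintext byte with one keystream byte."""
    if len(keystream) < len(plaintext):
        raise ValueError("keystream is shorter than the plaintext")
    return bytes(combine_add(p, k) for p, k in zip(plaintext, keystream))


def decipher(ciphertext: bytes, keystream: bytes) -> bytes:
    """Reverse encipher using the same keystream."""
    if len(keystream) < len(ciphertext):
        raise ValueError("keystream is shorter than the ciphertext")
    return bytes(invert_add(c, k) for c, k in zip(ciphertext, keystream))
```

Both combiners map a pair of values in [0,255] back into [0,255], which is
the property the cipher loop depends on.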

Now consider adapting RC4 to an alphabet that has only 5 symbols - the
vowels a, e, i, o, and u. First, map the alphabet to [0,4] so that it's
composed of contiguous values:

   a = 0
   e = 1
   i = 2
   o = 3
   u = 4

It's probably easiest to perform this transformation on input and reverse it
on output, but it could be done right around the cipher, or even within the
cipher.

Next, find a combiner. Bitwise exclusive-or won't work, because 1 xor 4
(combine e with u) will give you 5, which is out of range. If your alphabet
size isn't a power of 2, exclusive-or is out. So use addition modulo the
size of the alphabet (5). Ciphertext will be in the alphabet [0,4], which
can be transformed back into {a,e,i,o,u}.
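
The vowel example can be sketched in a few lines of Python. The names are
hypothetical and the keystream is just a list of integers in [0,4] supplied
by the caller; producing a good keystream is a separate problem that this
sketch does not address.

```python
VOWELS = "aeiou"
# Map each symbol to a contiguous index: a = 0, e = 1, i = 2, o = 3, u = 4.
TO_INDEX = {ch: i for i, ch in enumerate(VOWELS)}
SIZE = len(VOWELS)

# XOR can leave the alphabet: e (1) xor u (4) is 5, which is out of range.
XOR_E_U = TO_INDEX["e"] ^ TO_INDEX["u"]


def encipher(text: str, keystream: list[int]) -> str:
    """Add each keystream value to the symbol index, modulo 5."""
    if len(keystream) < len(text):
        raise ValueError("keystream is shorter than the text")
    return "".join(
        VOWELS[(TO_INDEX[ch] + k) % SIZE] for ch, k in zip(text, keystream)
    )


def decipher(text: str, keystream: list[int]) -> str:
    """Subtract each keystream value from the symbol index, modulo 5."""
    if len(keystream) < len(text):
        raise ValueError("keystream is shorter than the text")
    return "".join(
        VOWELS[(TO_INDEX[ch] - k) % SIZE] for ch, k in zip(text, keystream)
    )
```

For instance, enciphering "eu" with the keystream [4, 1] gives "aa": both
sums come to 5, which wraps around to 0.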

Note that I'm not recommending converting RC4 to use this system. RC4's
internal entropy is in the permutation of the keystream pool. Reduce the
size of the pool and you reduce the number of permutations (factorially!)
and hence the amount of entropy. (I think - I'm by no means an expert, and
that's just off the top of my head.) It's just an example. Based on it,
you should be able to convert your existing system into one that enciphers
(however weakly) your 18-character string within the same character set.
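
Applied to the serial-number case, the same idea might look like the
following Python sketch. It assumes the 62-symbol alphabet a-z, A-Z, 0-9
from the original question; the function names and the keystream argument
are hypothetical. The keystream must already consist of integers in [0,61],
and how it is generated (and whether it is ever reused) decides whether
this offers any protection at all.

```python
import string

# 62-symbol alphabet: a-z, A-Z, 0-9.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}
SIZE = len(ALPHABET)


def _check(text: str, keystream: list[int]) -> None:
    """Reject short keystreams and keystream values outside the alphabet."""
    if len(keystream) < len(text):
        raise ValueError("keystream is shorter than the text")
    if any(not 0 <= k < SIZE for k in keystream):
        raise ValueError("keystream values must be in [0, 61]")


def encipher_serial(serial: str, keystream: list[int]) -> str:
    """Addition modulo 62; the output stays within a-z, A-Z, 0-9."""
    _check(serial, keystream)
    return "".join(
        ALPHABET[(INDEX[ch] + k) % SIZE] for ch, k in zip(serial, keystream)
    )


def decipher_serial(cipher: str, keystream: list[int]) -> str:
    """Subtraction modulo 62 reverses encipher_serial."""
    _check(cipher, keystream)
    return "".join(
        ALPHABET[(INDEX[ch] - k) % SIZE] for ch, k in zip(cipher, keystream)
    )
```

The output has the same length as the input, so an 18-character serial
stays 18 characters, unlike the hex-encoding approach.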

Michael Wojcik
Principal Software Systems Developer, Micro Focus
Department of English, Miami University


