Quasi-Algoruthms / Quasi-Functions

Quasi-Algoruthms / Quasi-Functions
Polymorph Encryption
[Joined Thesis v.0.9]

Kostadin Bajalcaliev
http://home.cyberarmy.com/kbajalc
Kbajalc@freemail.org.mk

Abstract

Quasi-Algorithms are new theoretical approach in understanding computation and can have a profound practical usage in many sciences. Redefinition of term "Algorithm" is presented. Practical effects on cryptography are discussed, especially block cipher design trough the theory of Quasi Algorithms.

Introduction

In my prior works on Cryptography, especially Block cipher design I have introduce the idea of Polymorph Encryption. [To note that there is no similarity between my theory of Polymorph Encryption and similar named Virus design technique]. Since ANIGMA cipher published in 1997 Polymorph Encryption is the building block in my Block-Cipher designs and a center of my research work. Acknowledging the need to generalize the concept and to make precise definition as well to establish the terminology exhaustive research has been done. This thesis is the summary of the results of this research work.

Polymorph Encryption is a strategy of designing a secure Block-Cipher. The main idea is to design dynamic algorithms using data-depend operations, data-depend operand chousing etc. As a good illustration here is the F function used in SQ2:

unsigned long F(unsigned long x)

{

int i; unsigned long r;

for(i=0;i<RP;i++)

{

r=(x*(x+11));

switch((int)(r>>30))

{

case 0: x+=S[(int)((r>>26)%16)]; break;

case 1: x*=S[(int)((r>>26)%16)]; break;

case 2: x^=S[(int)((r>>26)%16)]; break;

case 3: x-=S[(int)((r>>26)%16)]; break;

}

return x;

}

Here S[] is the key array. The two most-significant bits of x are used to chouse the operation, and the next four the operand. This is recurrent function i.e. the previous value of the variable is used to compute the next. The effects of this design principle can be summarized as: No Key = No Algorithm. If we know F(x) and S can we find x? Generally No, because some of the bits from x_k-1are used to chouse the operation and the operand participating in computation of x_k, if k is some small number we can run a brute force trying to figure out the input. This is the same as trying to find inverse function. But generally we have no idea is it F bijective at all. Later this will be explained in more details.

RC5 is one of the most innovative block ciphers, for the first time there is something called data-depend rotations. In Blowfish we have the idea of Key-Depend S-boxes. In DES two bits from the block are used to chouse the S-box to be used. CRAB introduces using of more F functions. The BIG question is what is in common between them, what is common between secure designs we already have? This thesis objective is to answer that question.

Polymorph Encryption (more math)

Let get serious and more maths to be boring enough. The cornerstone of mathematics is the idea of function. Function is a nothing more than a strict receipt how to calculate something (hoping to be useful). Later the properties of this ‘receipt’ will be discussed. For now a couple of word about Polymorph Encryption and the example from the Introduction, the SQ2 function. This function written in mathematical stile is:

Here o is one of the basic arithmetical operations, and S is the key array. If you look in your math books there is nothing like this, there are recurrent functions but functions with variable operations are out of the question. The first reason for this situation is the obvious difficulty to spell this king of ‘equation’, "x_k {plus, minus, by, divided} depending on the value of x_k, S depending on the value of x_k" it is very frustrating. But in cryptography this king of ‘equations’ are more than welcomed. Let we discuss the properties of the hypothetical encryption system using this kind of functions (let it be SQ2). The first question any cryptographer will ask is this function reversible? I am sorry to say, I don know, it depends. If F_k(x) and S are known can we found x? Let k=2 which means:

As implemented in SQ2 the two most significant bits are used to chouse the operation and the next 4 to chouse the operand. We know the value of F(x)=x₂, in order to find x we need as first to found x₁ but here comes the problem, the 6 most significant bits of x₁ were used to calculate x₂ and we don’t know x₁. In SQ2 the operation set is {+, -, *, XOR}. The only solution is to try to guess what those 6 bits were. After that we need to test our assumption. And we do, if the assumption was for example 101011 that also mean that we assume that our function was:

Assuming that our function was F’ and that 6 MSB of x₁ are 101011 we test the assumption inverting the last step.

The only thing left is to check are the 6 MSB of x₁, are they really 101011, if they are not we try again and again and we have reinvented the brute force attack. If the 6 MSB of x₁’ are not the same with our assumption we are certain that x₁ is not the same with x₁’, but if bits are the same there is no guarantee that x₁’ is the correct value of x₁. We can chouse S so that two distinct values of x result with the same value of F(x). Mathematically if F(x)=F(y) does not means x = y than F is not a bijective function and such functions are not reversible. It is very likely that randomly chosen values of S are going to result is a non bijective function. Let we assume that most of the values of S result in bijective functions (even I can not prove that there is at least one such an array S). Since the only way to reverse F is the brute force we should estimate the needed work to reverse F_k(x), it is 2^6k. This not a small problem for k=4 of bigger. Further discussion is pointless since knowing the result and the key by definition means knowledge of x, in cryptography there is decryption function. The other case, x and F(x) are known and we need to find S. We go the same way, first an assumption is made for the 6 MSB of x₁, but here comes the BIG problem, now we must make an assumption for S that after subtracting it from x₂ will result in x₁ with the 6 MSB as previously assumed. Using the same 101011 pattern we make an assumption for S[11], there is 2²⁶ such values of S[11]. This is simply too mach for brute force attack. So finding S having enough (x, F(x)) pairs is VERY VERY hard task. This is of great cryptographic value, since S is the Key.

General model for Polymorph Encryption F-function

Since SQ2 is not the only encryption algorithm that has [or going to] implement the Polymorph Encryption we need a definition of general model for Polymorph F-function [R-function]. The simplest mathematical notation for this model will be:

Here c(x) is a condition function and H is our set of primitive/elementary functions. There is an example. Using this template the Sq2 R-function is given below.

The idea is very simple to describe in math world. We divide the domain set into n subsets according to C – function and to each of them we assign different transformation rule. Instead of chousing simple functions to form H we can use entire algorithms as given below:

And again there is R-function. By incorporating this function in a standard Feistel Network we have our new Block-Cipher.

Cryptoanalysis of Polymorph Encryption

The mathematical model for Polymorph Encryption is discussed within Quasi Algorithms section. Today there is a lot of different attacks develop to extract the key form the knowledge of (x, E(x,k)) where E(x,k) is encryption under key k. Differential and Linear Cryptoanalysis are certainly the most commonly used. Differential Cryptoanalysis examine the difference propagation between pears of plaintexts and ciphertexts. In order to discover the key the encryption function is carefully examined for defects. The goal is to find some linear part of the function or a defect that cause some part of the encryption algorithm in significant number of cases to act linearly. [Linearity functions are those which can be inverted in relatively simple manner, all arithmetic operations are linear functions, in differential cryptoanalysis that mean that small difference in the plaintexts is causing a proportionally small difference in the ciphertexts] However we try to find a defect in a function that will leak an information about the key. There is a significantly different environment in Polymorph Encryption. Because of the nature of the F function we can not examine it for defects since every value of S make different impact on the difference propagation. Data-depend operations incorporated in Polymorph Encryption cause the algorithm to be ideterminated. Without S we can not predict what king of difference propagation the algorithm will show. Instead of designing strong non-linear functions a.k.a. S-box Polymorph Encryption is making a totally different approach. The goal is not to make a resistant system to the given attack but to make a system incompatible to the attack [plant virus is not going to cause any damage on the ants, but the biggest tree can die from the infection and vise versa]. The same stand for Linear Cryptoanalysis, we can not discover the eventual linear dependencies between the plaintext, the ciphertext and the key when the key is the algorithm itself. Shortly Polymorph Encryption stands for Encryption Algorithm = KEY. Different keys determine different encryption algorithms.

Now a small review of some well known design that incorporate something that can be considered as part of Polymorph Encryption parental tree. The first is the oldest, DES, one of design solutions in DES is especially worth of mention. Before entering the F function the block is expanded from 32 to 48 bits and it is split in 4 sub-blocks each 6 bits. The MSB and LSB of those sub-blocks are used to chouse one of 4 S-boxes available trough that the other 4 bits are going to be processed. This is simplest implementation of data dependent-operation but as the ‘history’ has shown just enough to defeat Differential Cryptoanalysis. In RC5 Ron Rivest use data-depend rotations which is first explicit Polymorph-like design solution. As we know RC5 had survive a lot of analysis and still stand secure. Blowfish is an example of well-realized key-depend S-boxes. Key-depend S-boxes are nothing more that realization of Polymorph Encryption credo Encryption Algorithm = Key. Can you analyze Blowfish without fixing the S-boxes?

Every well informed cryptographer will notice that Polymorph Encryption is not a new genius work even no one has ever formulated it as separate theory. The building blocks of the security of some of widest used ciphers are the principles here described under the name of Polymorph Encryption.

As an illustration the real live situation [not so real, let say form some American psihology institute]. You are seated in a new European car, an exclusive model, and the doctor tells you "You are part of our new experiment, there is a Russian nuclear bomb in your car, if you don’t find the way to the Nuclear Energy Faculty in Iakoato, you are going to be …. Now you are in Sibiria and you have 120 hours to reach the goal" You the lack lab mouse, have never been in Russia and Russian is not you field of best interest, and here we go, what you are going to do? Start the car quickly, but which way to go, somehow you chouse one and the next crossways bring the same dilemma. You try to examine the environment there is a cow crossing one of your potential directions and the other one have a sleeping elephant on the road. You chouse one, the cow is more attractive. And so on, cossway after crossway. After 119 hours and 50 minutes you decide to stop and to wait for your destiny, there is no Iaokato city in russia. Bum. You are awake this was just a simulation. Your shrink seem more than happy he have diagnosed you syndrome. How the hell that happened. Your choices on the crossways have shown what you fair and what you prefer. Your personality have driven you and you have drive the car just as intimidator.

The same thing happens when a block is entering the Polymorph Encryption Algorithm. There are crossways, the F-function, something must be done and block values determine what it is going to be. After that the block came out of the function changed. The next F-function execution produces the same must-to-do-something situation. The output is a mix from the personality of the patient (the plaintext) and environment (the key).

Quasi Algorithms / Quasi Functions

Now it is time to say couple of words about the mathematical model behind Polymorph Encryption. Even this part of the thesis has a little practical value it is important to define the rules of the game. The general model has been already described. The starting point in theoretical definition of what have been presented as Polymorph Encryption is the definition of Algorithm. As the functions are the building block of mathematics Algorithms are the building block of everyday activity of ordinary people.

Algorithm is a calculation procedure which have these properties: It is consisted of finite number of steps / primitive operations [usually arithmetical and logical]. Each step is fully defined, the order of execution of steps is strictly determined [there is begging and end]. There is a finite need of space and time in order to get the work done [complexity of the algorithm].

It is well known that each algorithm is a function and vise versa, in math we are going to use term ‘quasi function’ and for everybody else it is ‘quasi algorithm’. In order to begin, a simple function as an example, just a simple polynom.

The first question, what is the algorithm to calculate this equation? The people will have a no problem to do it but how to explain the situation to your dearest computer? As the algorithm was defined there is something called steps and order of steps and definition of every step. This means that our elegant mathematical notation should be substituted with a more rough form of expression. The computer no matter how powerful is usually very stupid, so we the humans must explain to him what he should do first, second, third … in order to calculate the polynom. And here it is the algorithm:

If you don’t get why this simplification is needed just try to write a calculator in your favorite C that will compute human friendly expressions.

However our problem the polynom is finally reborn in algorithm. Each step is one of the basic arithmetical operations that your favorite processor producer has somehow trained your CPU how to perform. The essence of the algorithm is the step order, if you mix the steps you will gain something that maybe nothing. Because of this the algorithm can be defined as Order-of-steps + definition-of-steps = algorithm. [We also need data structures but since they are only semantical category there is no need to interfere with them now]. About the step, in each step there is information about several things, the operands, the operation and where to store the result. [This is a clear situation to every assembler programmer]. One more abstraction can be done, we can extract the skeleton of the algorithm by abstracting the operations. The other case, abstracting the operands is absurd. Instead of addition, multiplication, subtraction and division we can write just ‘o’ – same operation. As presented above. What it this now? It is not an algorithm, we have no idea what should happened in those steps, but we do know the order of steps, we know which data is affected in each step. This generalized algorithm or skeleton is just a template telling us everything about the conditions and environment in which we should calculate something but on our free will is to chouse some meaningful set of operations.

If we write the example function in prefix notation we have:

With an abstraction of the operations substituting tham with the general operation o, we get:

In order to turn this skeleton in a function we need to substitute those o_j with some of the available binary operations, which our famous mathematicians have canonized thousands of years ago {+,-,*,/,^,q}. After this substitution is done our skeleton is turned into algorithm / function. Our example function have 10 o, operation places where we can insert the operations we chouse. Our set of operations is very small only 6 of them. However on each operation place we can put any of them which result in having 6¹⁰ different functions with the given skeleton. The function we started with is just one of them. Having this short discussion in mind we can define Quasi Algorithm:

Quasi Algorithm is a calculation procedure which is determined by the order of step of primitive arithmetical transformations, for each step it is uniquely determined which data [operands] are involved in the transformation and where the result is stored. Actual primitive operations taking place in every step are not specified.

The difference between algorithm and quasi algorithm is obvious. As an input to the quasi algorithm besides the argument x [maybe more of them] we pass an information about the primitive operations which should be executed in each step. The quasi function conjugated to our example is:

Because this is not a function concerning our mathematical heritage we need to make a little expansion. We define a general overlay function, ?, which play the role of quasi algorithm processor. The quasi function, the arguments and the information about the primitive operation to be executed in each step are the arguments passed to the overlay function.

Here sigma is a variation over the set of available primitive operations {+,-,*,/,^,q} [q is notation for n-th root function, ^ is power function] For each possible value of sigma ? present a function that have the F skeleton. Our example function is writen in this notation as: ?(F,(-,+,+,+,^,*,^,*,^,/),x). Just to note that the variation sigma is of n-th degree where n is the number of operation places in the skeleton F. Expanding the notation our f is now:

For a given value of sigma ? is a function in a form we are used to see in our math books. Further generalization is possible, instead of sigma we can place any variation-generator-function which as arguments accept the set of possible operation, the degree of the variation and a number, the result is a variation in form we need.

The set A can be the default set of operations {+,-,*,/,^,q} or any other set of binary operations which can be user defined. Binary functions can also be included in the set if we chouse to accept them as primitive operations. The second argument is |F| or absolute value of the template or degree of freedom which is defined as number of operation places in the template need to be filed.

The overlay function ? is a family of functions in respect to the different values of sigma parameter. Here is example number 2.

Using this template the overlay function is:

Mathematics has developed a wide specter of methods and boring theorems how to analyze the behavior of functions. It is now my intention to show how useless that grandiose theory can be. Just a simple task to find:

Certainly sigma can have any of the possible values. How to calculate this limit? The easiest solution to run over all the possible values of sigma, filling the set of possible limits for the given skeleton. The solution is:

But brute-force approach is exactly what the math wants to avoid, instead of empirical conclusions about something we usually tend to develop a model to calculate the result instead of guess-and-check approach. Simply Brute force is out of question, just imagine the example 1 function and the quest for integral.

Conclusion about Quasi Algorithms

Even here nothing more than basic idea and little math notation have been presented I am sure Quasi Algorithm are worth of interest and research. Quasi Algorithms Theory is very brave term for what is here described but future research are certainly going to show the importance of this ‘theory’. A quick summary:

2. Quasi Algorithms are real thing they exist.
4. Every algorithm belong to a class of functions determined by its skeleton
6. Algorithm = Quasi Algorithm + Operations-to-be-inserted
8. Today stage of development of math can not cope with Quasi Algorithms
10. We hope that will be the case in near future
12. Cryptography is number one benefic of this theory

I am not so sure about other fields of science but cryptography can certainly get the most form Quasi Algorithms. Polymorph Encryption described in this thesis is just a example how to use the theory of Quasi Algorithms. If we figure out how to analyze Quasi Functions that will be a plus for every possible field of science [at least those involving math]. Fining templates and patterns is what we are trying so hard to teach our computers to do so. Quasi Algorithm is theory that separates the pattern from the actual implementation, and that is really a simple task. Analyzing those patterns will enable to investigate the behavior of whole class of problems instead of separate specimens. Dividing problems in classes is certainly the main objective of every science. There so very interesting question that can be derived from the theory of Quasi Algorithms:

2. If there are inverse functions for all or part of the function having skeleton F, are these inverse functions share a common skeleton?
4. Is it possible to reconstruct the function if we have its skeleton and finite number of pairs (x,F_s(x)) [there is similar problem when we have the n-th epitome of the function and some finite number of pairs (x,f(x)) ]?
6. Can we expand the available mathematical theory in order to include Quasi Algorithms?
8. Are there classes of Quasi Algorithms that show some special properties?

This thesis not tend to be complete nor present some class-one-importance discovery, in contrary I am trying to formulate the idea and to scratch the potential use of it.

Cryptoanalysis of Polymorph Encryption trough Quasi Algorithms

The revolutionary concept introduced by Quasi Algorithms in cryptography is not just a change in notation but a deep movement in understanding the process of encryption at first place. Standard model of encryption using S-box or some other non-linear structure tend to use fixed functions incorporated in Feistel Network and here is a new ‘secure’ block cipher. But ‘secure’ was certainly not a characteristic for more than 99% of block cipher proposed until today. Polymorph Encryption introduces something totally different. Using the equation of general overlay function ?:

The sigma function use the same argument as C. The direct consequence is that every value of x and k make the decision which function from the family C [the encryption Quasi Algorithm] they are going to be processed trough. This is similar to the situation to encrypt a plaintext under the plaintext itself as a key. As stated before the main strategy against the cryptoanalysis is to make cipher incompatible to them. This certainly does not imply that there will be no attack discovered against Polymorph Encryption but our current mathematical understanding of things is just not enough to even start a basic analysis of Polymorph Encryption and Quasi Functions.

In order to understand Quasi functions as implemented in Polymorph Encryption here is a little illustration. The set of input values is divide into subsets according to the function C. To each subset different function from H is ascribed. Since F function is executed more than once we there is hoping from one subset to other. Let x_k=?(F,S (A,|F|,(x_k-1,K),(x_k-1,K)) naturally x_k-1 belong to some subset according to the division in respect of C. After the execution of the corresponding function x_k may not belong to the same subset any more. If this is the case, and in practice this effect is exactly what make the cipher secure, then calculating x_k+1from x_k is going to be according to a different rule than x_k-1to x_k. All the functions in H should be distinct and all of them equally probable to be chosen. Also they must have different difference propagation, clearly it is not the same to multiply x with other number or just to add them or xor them. The effect from this subset hoping produces the uncertainty in deference propagation shown by the Polymorph F-function.

Supporting info and contact

This thesis make a extensive reference to algorithm and other thesis, since it will be very long and boring reading to include all of them in one place here is a index of my prior works somehow referenced in this thesis:

2. ANIGMA cipher, there are three separate documents an outcome from that project: ANIGMA, MEX128 and Variable Function Technology, in the last one is the original description of what is now Polymorph Encryption
4. A2 cipher, containing some discussion how to implement Polymorph Encryption into the F-function
6. SQ1 there an interesting survey on implementing Polymorph Encryption in Stream Cipher Design even the thesis major objective is information loss theory.
8. SQ2 used here as an example of Polymorph encryption. This one of my major works implementing all the theory explained here.
10. SQ3 a.k.a. Z876 is the newest design also using Polymorph encryption but in some alternative approach than discerned here. I recommend you to read the documentation about this cipher together with SQ2 they are typical specimens of Polymorph Encryption. SQ3 have no F-function as usually used instead there is change in Feistel Network.

All this document are publicly available trough my web sites in research section:

http://kbajalc.8m.com

http://eon.pmf.ukim.edu.mk

http://home.cyberarmy.com/kbajalc

I hope you are going to find this interesting, any help is welcome in order to speed up the research. Contact information is listed below:

Kostadin Bajalcaliev

Ul. April 26^th Street #14

1480 Gevgelija

Macedonia

Europe

Kbajalc@freemail.org.mk

Kbajalc@eon.pmf.ukim.edu.mk

Please write a message for any suggestion you may have.