Stream Cipher design postulates – SQ model

 

Kostadin Bajalcaliev

April 26th Street num: 14,

91480 Gevgelija, Macedonia, Europe

Kbajalc@vmacedonia.com

1. Introduction

In this thesis I discuss Stream Cipher (SC) design postulates. For more than a year I have been trying to define a basic strategy for SC design, and this thesis is a distillation of the results of that research. Today there are many SCs, but there is still almost no literature about their design. Block Ciphers are a different story: a great deal of research has been done to define the rules of the game, many designs are publicly available, and some of them are even standardized. SCs are the oldest cryptographic primitives ever used, yet today they do not get the attention they deserve, for reasons both political and technical. Most SC designs are still based on LFSRs, but there are some alternative designs such as SEAL and RC4. This thesis is devoted to those "alternative" approaches to SC design.

However, things are changing and the "east wind" has finally started to blow. In the last couple of years more SC designs have been published than in the decades before. A SC is nothing more than a secure pseudo-random bit generator (SPRBG). A lot of work has been done in the PRBG field, but most of it aims at producing sequences with given statistical properties; progress on secure PRBGs is not even close to that on ordinary PRBGs. Still, today we do have SPRBGs, and the question is: what makes them secure? I am not going to discuss the basic design solutions in this field; before reading this thesis you should be familiar with the problem and the mathematics behind it.

These questions are to be answered here:

    1. What are the basic rules for designing a statistically good SC?
    2. What are the basic requirements for a given SC to be secure?
    3. Are there any methods that can be proved to be resistant to any breaking attempt?
 

2. Basic assumptions

Here are some basic assumptions that are important for understanding this thesis.

Throughout this thesis the discussion revolves around random bits, random sequences and so on. What does it mean for something to be random? This question has no definitive answer; philosophically nothing is random, everything has a cause, nothing happens by itself. Randomness is a characteristic exhibited by processes that act in such a manner that there is no way to predict the outcome with probability greater than the true one. When a coin is thrown, can the outcome be predicted with probability greater than ½? No. Every physical process is "random", because nothing can happen in exactly the same manner twice. RANDOM in this thesis means that a given process cannot be predicted unless some finite quantity of information is available.

When a coin is thrown, in order to predict the outcome with probability greater than 50% one would have to know the exact force used to throw the coin, the air resistance, the intensity of any magnetic field, and so on. Even if all the factors that can influence the outcome are isolated, there is no way to measure their exact intensity. That is why there is no way to predict the behavior of any physical process with 100% certainty. Because computers are discrete systems and SCs are deterministic algorithms, some information used by the algorithm must serve as the KEY, and without knowing the KEY the algorithm should look like a natural, unpredictable process. The KEY is the only unknown of the SC.

Now let me present the assumptions needed by this thesis:

  1. If A is a random sequence, then any subsequence of A is random too.
  2. If A is a random sequence and B may or may not be a random sequence, then C = A*B (where * is any entropy-1 operation) is also a random sequence.
  3. If A is a random sequence and A’ is a sequence formed from randomly (but not necessarily sequentially) chosen elements of A, then A’ is a random sequence.
  4. Discarding some positions of a random sequence A by a fixed rule results in a sequence A’ that is also random.

The problem with randomness is how to measure it. There is no exact formula to compute how random something is. Many tests have been designed to check whether a given sequence is random; the best known is the DIEHARD battery of tests. In this thesis the statement "it passes the randomness tests" means that a given sequence did not fail the DIEHARD tests. Any randomness test measures how much a given sequence deviates in its characteristics from a truly random sequence of the same length. Because there is no "truly random" sequence at hand, the needed "expected values" are derived arithmetically and are practically unattainable.

The second important term used in this thesis is ENTROPY. Entropy is a measure of information quantity: the minimum number of bits needed to represent some information. Every piece of information has entropy, and no information has zero entropy. The entropy coefficient (ECF) is the minimum number of bits needed to represent the information divided by the number of bits actually used to represent it. Ordinary English text has an entropy of about 1.3 bits per character, which means that 1 MB of English text may be represented in about 171 KB. All compression algorithms are based on the fact that "natural" information usually has an ECF significantly less than 1. Here are some basic entropy assumptions:

    1. Information A is Entropy-1 (E-1) if it is represented with the minimum number of bits required, i.e. ECF = 1.
    2. A process W is an E-1 transformation if the information A’’ produced by processing the information A has an ECF that is not less than the ECF of A.
       An example of an entropy-1 transformation is modular addition, Y = (X + C) mod 256: if it is evaluated for all possible pairs (X, C) in the range 0..255, the output is uniformly distributed, 256 of the results are 0, 256 of the results are 1, and so on. Adding the same value to all information blocks does not change the entropy coefficient of the processed information. Any structure that satisfies this assumption is an entropy-1 transformation; the simplest one is Y = X. S-boxes are usually entropy-1 structures because in most cases they are permutations.
    3. If C is a composed transformation C = AB, y = C(x) = A(B(x)), then ECF(C) ≤ min(ECF(A), ECF(B)).
       This assumption explains why an S-box should be a permutation (an E-1 object); otherwise an algorithm using a non-E-1 S-box is weakened by the S-box defect. To confirm that some complex transformation is an E-1 object, it is enough to show that the 2nd assumption holds: for F(A, B, ..., X) it suffices to show that over all possible arguments A, B, ..., X the output is uniformly distributed (a brute-force sketch of such a check is given below).
    4. Every "statistically random" self-mutating process can be viewed as an E-1 object, a Quasi-E-1 object.

This last assumption will be used later to show how non-E-1 objects can be used in SC design. By definition SCs are self-mutating processes, and this assumption simply states that a statistically strong SC can be viewed as a Quasi-E-1 object because of its uniform output distribution.
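As a concrete illustration of the kind of distribution check described above (my own sketch, not part of the original text), the following code tallies the outputs of the modular-addition example over all input pairs and confirms that every output value occurs equally often:

    from collections import Counter

    def output_distribution(f, inputs):
        """Tally how often each output value occurs over all given inputs."""
        return Counter(f(*args) for args in inputs)

    # Entropy-1 example from the text: Y = (X + C) mod 256 over all byte pairs.
    pairs = [(x, c) for x in range(256) for c in range(256)]
    dist = output_distribution(lambda x, c: (x + c) % 256, pairs)

    assert set(dist.values()) == {256}   # every output value occurs exactly 256 times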

 

3. Universal SC model

In every field of science there are some general rules of the game. In cryptography there are general concepts for almost all primitives: Block Ciphers have the Feistel Network, Public Key Cryptography rests on the intractability of number-theoretic problems such as integer factorization, all hash functions are built in a similar way, and so on. The Stream Cipher field is chaotic. The LFSR is still the dominant concept in SC design, but the LFSR is clearly the "Stone Age of cryptography". For political reasons SCs are still used mostly by the military, very weak constructions are in use, and that is why public debate about SC design is not welcomed. There are many Block Ciphers, some of them standardized and available for public use, but no Stream Cipher has been standardized in any form; in truth there are very few SCs at all.

In order to define a general approach, more design solutions would have to be analyzed to find out what they have in common. That analysis is left to the reader; it is beyond the scope of this thesis to criticize particular designs or to grade their quality. This thesis is limited to examining the theoretical basis of SC. With the "general SC model" I do not claim to have found a formula that can express every possible design.

Mathematically, every SC is an iterative function x_k = F(x_{k-1}). This form is common to all SC designs. The simplest one is the linear congruential generator x_i = (x_{i-1} * m + n) mod g, where m and n are KEY values and x_0 is the seed. The general model presented here is a template of what every SC should have in its design. It is clear that a SC cannot be just a mathematical function; the congruential generators are purely mathematical, and they can also be broken using the same mathematics used to construct them. A stream cipher should be a structure that is inexpressible in a mathematical model; breaking a SC must not be a problem that reduces to solving some mathematical equation.
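To make the iterative form concrete, here is the congruential generator written as such an iteration; the constants are arbitrary illustration values and do not come from the thesis.

    def lcg(seed, m=1103515245, n=12345, g=2**31):
        """Linear congruential generator: x_i = (x_{i-1} * m + n) mod g."""
        x = seed
        while True:
            x = (x * m + n) % g
            yield x

    gen = lcg(seed=42)
    print([next(gen) for _ in range(5)])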

Now, what should a SC be? In order to avoid the problems listed above, a SC should be a function that can mutate, so that every value is generated from the previous one using a different rule. The problem is finding a function that can mutate. Every function F: A -> B is a mapping, injective or bijective. If those sets are "small" the function can be substituted with a look-up table. The simplest look-up table is the S-box. Any function over a discrete set A translates it into a set B of the same range and the same number of elements; {A, B} is the look-up table for the given function. But there are look-up tables that cannot be reduced to some mathematical operation, yet they have the property we need: they can mutate according to some rule. Here is the simplest example of a mutating function:

  • Input: in

    Output: out

    Permutation: P[0..n]

    Swap(P[in],P[in+1]);

    Out=P[in];

  • This function is trivial, but it can be used to "connect" the values with those before them. Let me now explain the general model of a SC using a mutating function. Every SC (according to the SQ model) has three parts.

    Here is a symbolic representation of the SQ model:

  • Input: in

    Output: out

    Set of Transformations: ST(function,value)

    Output produce routine: OPR(function,value)

    Default Transformation: DT(function)

    Mutating Function: F(value) (This is the permutation)

    ST(F,in);

    DT(F);

    Out=OPR(F,in);

  • The Set of Transformations (ST) is a set of permutations which are fixed or key-dependent. This set can be constructed arithmetically, or each transformation can be an independent construction. An example of the arithmetical type is Swap(P[in],P[in+1]); an example of the independent type is: if In=0 then Swap(P[3],P[5]); if In=1 then Swap(P[7],P[13]); if In=2 then Swap(P[4],P[9]) and Swap(P[11],P[0]); etc. It is very important to construct this set very carefully.

    OPR (written ORG in the example ciphers below) should be a "one-way" function, that is, a function which is E-1 but for which knowing the output value and the Permutation is not enough to find the input with 100% certainty. This step is responsible for extracting a fixed number of bits from the state values and forming the output.

    The last part is the counter structure, some state change that advances in every iteration; in a stream cipher something must be in continuous movement. I suggest DT to be an ideal permutation of 1st degree (this means that after the permutation no element is in its previous place, i.e. a derangement). This is a better approach than using a classic counter as in RC4.

    It is a matter of security analysis which step should come first, second and third; this question will be investigated later. In the next chapter I show that extremely simple constructions following this model exhibit remarkable statistical properties.
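    A literal reading of the template above can be written down as a tiny driver loop; the concrete ST, DT and ORG bodies are left as parameters here, and the assumption that each iteration's output becomes the next iteration's input is mine (the thesis does not fix how the iteration is driven).

        def sq_generator(state, st, dt, org, seed):
            """Generic SQ-model iteration: mutate the state (ST), apply the
            Default Transformation (DT), then produce the output (ORG).
            The output of one iteration is assumed to drive the next one."""
            value = seed
            while True:
                st(state, value)           # Set of Transformations, driven by the input
                dt(state)                  # Default Transformation (the moving "counter")
                value = org(state, value)  # Output Produce Routine
                yield value

    EC-1 in the next chapter is a concrete instance of this template: its ST is a neighbour swap, its DT a one-place rotation of the permutation, and its ORG a table lookup.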

     

4. Simple Constructions

    Here is a simple cipher constructed using the theory described above. This construction produces an 8-bit output per iteration.

  • EC-1

    Input: in

    Output: out

    Permutation: P[0..255]

  • Swap P[in],P[in+1] ST

    P<<<1 DT

    Out=P[P[in] XOR P[in+1]] ORG
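    Below is a minimal Python sketch of how I read the EC-1 pseudocode. The interpretation involves three assumptions not spelled out above: all indexes are reduced mod 256, P<<<1 is a cyclic shift of the whole table one place to the left, and the output of an iteration is fed back as the next input.

        def ec1_stream(p, seed, rounds):
            """Literal reading of EC-1: neighbour swap (ST), rotate left by
            one place (DT), table lookup (ORG)."""
            p = list(p)                         # working copy of the permutation P[0..255]
            x = seed % 256
            out = []
            for _ in range(rounds):
                i, j = x, (x + 1) % 256
                p[i], p[j] = p[j], p[i]         # ST: Swap P[in], P[in+1]
                p = p[1:] + p[:1]               # DT: P <<< 1
                x = p[p[x] ^ p[(x + 1) % 256]]  # ORG: Out = P[P[in] XOR P[in+1]]
                out.append(x)
            return out

        # Example run from the identity permutation; a real keying step would make
        # the initial permutation and the seed depend on the key.
        print(ec1_stream(range(256), seed=0, rounds=16))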

  • This is an extremely simple algorithm, but it passes the statistical tests of randomness. The first step, ST, is a simple swap that is different for every possible input. The second step is an ideal permutation of 1st degree; this permutation is just a cyclic shift one place to the left. It is the counter structure in this design: without this element the algorithm would get stuck, because in general there are input values which cause no change, or a cycle of transformations would be established. The ORG part is here to ensure that every iteration is a one-way transformation. If P and Out are known, is it possible to find the previous state of the algorithm? Yes and no. Having Out, the most that can be disclosed about the previous state is (P[in] XOR P[in+1]). The permutation can then be searched to find which two neighboring elements XORed give Out, but there can be more than one such pair. So it is a problem to find In for a given Out and P, yet it is not impossible in this particular design. The quest for the correct value of In (going backward) requires some quantity of information that is not carried by Out. If the ST step is Swap(P[in], P[P[in+1]]), it becomes harder to break the cipher, but it is still possible. So we have the second version of our simple cipher.

  • EC-2

    Input: in

    Output: out

    Permutation: P[0..255]

  • Swap P[in],P[P[in+1]] ST

    P<<<1 DT

    Out=P[P[in] XOR P[in+1]] ORG

  • This version can be improved in the ORG step: instead of P[in] XOR P[in+1] we can use P[in] XOR P[in+1] + P[in+2], but the problem is not entirely solved by this change.

  • EC-3

    Input: in

    Output: out

    Permutation: P[0..255]

  • Swap P[in],P[P[in+1]] ST

    P<<<1 DT

    Out=P[P[in] XOR P[in+1] + P[in+2]] ORG

  • It is still possible, if the attacker knows the permutation P and a long enough output sequence, to discover the complete previous state of the algorithm by trying every value by brute force. We could keep making changes of this kind, but constructing an ideal one-way transformation is impossible, because the output carries most of the information needed to invert the transformation. Later a simple strategy will be described for solving this problem without changing the algorithm. For now, this section proposes simple algorithms which satisfy the statistical tests of randomness; to make them usable, all that remains is to make them secure too.

    Now let me present one statistically weak construction.

    EC-4

  • Input: in

    Output: out

    Permutation: P[0..255]

  • Swap P[in],P[P[in+1]] ST

    P<<<1 DT

    Out=P[P[in] XOR P[in+1] +in] ORG

  • This version does not pass DIEHARD, and it cannot be made into a secure PRBG, because by definition a secure PRBG must be statistically strong too.

     

5. Statistical Analysis

    The important question is: what makes a SQ-model cipher statistically strong? I am going to show that it is possible to construct statistically strong ciphers using the general template. Example cipher EC-4 is statistically weak. Why? Let me explain the problem by comparing EC-3 and EC-4; the only difference is in the ORG step, and because all EC ciphers are recursive functions, this is where the main problem lies.

  • EC-3: Out=P[P[in] XOR P[in+1] + P[in+2]]
  • EC-4: Out=P[P[in] XOR P[in+1] + in]

    First, the ORG steps of EC-3 and EC-4 are tested to see whether they are E-1 objects. Let the test be done over the finite set {0..63}. Both are uniformly distributed E-1 objects: in EC-3 every value occurs 63*62 times, in EC-4 every value occurs 64*63 times. Defining the problem precisely is a hard task, but something is certainly wrong. One explanation is that EC-4 does not depend entirely on the P function (the permutation), so any weakness in the input distribution results in a weakness of the cipher. If x = EC-4(x) (Out = In), the next iteration has a problem because of P[in] XOR P[in+1] + in; this "error" is preserved because there is one "free" argument, [in], which is not affected by the previous mutation of P. EC-3 has no such problem and is the best construction compared to EC-1, EC-2 and EC-4. EC-1 and EC-2 have ORG(in) = P[P[in] XOR P[in+1]], but P[in] XOR P[in+1] can never be 0, while all other values are equally distributed. DT corrects this problem, but it is not recommended to use constructions that clearly do not act as E-1 objects. To prevent the problem in the general case, ORG should be an E-1 object.
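    These occurrence counts can be confirmed by brute force. The sketch below (my own illustrative code, not part of the original thesis) enumerates, over the set {0..63}, all ordered triples of distinct permutation values for the EC-3 index and all ordered pairs of distinct values together with the free argument in for the EC-4 index; since P is a permutation, the final table lookup does not change these counts.

        from collections import Counter
        from itertools import permutations

        N = 64  # test over the finite set {0..63}, as in the text

        # EC-3 index: (P[in] XOR P[in+1] + P[in+2]) mod N over distinct table values
        ec3 = Counter(((a ^ b) + c) % N for a, b, c in permutations(range(N), 3))

        # EC-4 index: (P[in] XOR P[in+1] + in) mod N; "in" is a free argument
        ec4 = Counter(((a ^ b) + i) % N
                      for a, b in permutations(range(N), 2)
                      for i in range(N))

        print(set(ec3.values()))  # {3906}, i.e. 63*62 occurrences of every value
        print(set(ec4.values()))  # {4032}, i.e. 64*63 occurrences of every value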

    There is no general template that is good no matter how it is implemented. EC-1 through EC-4 are examples that the SQ model can be statistically strong and secure, but each particular implementation must be tested to evaluate its behavior. The problem of SC design can be summarized as: making a self-mutating function that preserves the entropy. Every design must act as an E-1 object, and to test whether a given design is an E-1 object, the behavior of every step should be tested. EC-4 is an E-1 object, but its ORG step does not respect the self-mutating principle, and because of this it is a weak construction.

    The second rule is that no value from the previous iteration may be used unaffected by the mutation. To obey this rule, the P function should be used to filter the values inherited from the previous iteration. But structures like P(P(P(x))) are also dangerous, because P may have cycles, and such a recursive formation will cancel itself out in some number of cases, producing a loss of entropy. For an effective design, deeply recursive use of P should be avoided. When P is referenced more than once in ST and ORG, the references should be made "independently"; [in] and [in+1] are obviously not independent at all. Using [in] and [counter] solves the problem. Here is a changed version of EC-1:

  • EC-5

    Input: in

    Output: out

    Counter: C

    Permutation: P[0..255]

  • Swap P[in],P[C] ST

    P<<<1 DT

    Out=P[P[in] XOR P[C]] ORG
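    For comparison, here is the EC-1 sketch adapted to EC-5. The counter update is not specified above, so the assumption that C starts at 0 and increments by one mod 256 at the end of every iteration is mine, as is the feedback of the output into the next input.

        def ec5_stream(p, seed, rounds):
            """Literal reading of EC-5: the second index is an independent counter C."""
            p = list(p)
            x, c = seed % 256, 0
            out = []
            for _ in range(rounds):
                p[x], p[c] = p[c], p[x]   # ST: Swap P[in], P[C]
                p = p[1:] + p[:1]         # DT: P <<< 1
                x = p[p[x] ^ p[c]]        # ORG: Out = P[P[in] XOR P[C]]
                out.append(x)
                c = (c + 1) % 256         # counter update (assumed)
            return out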

  • Having the counter enhances the algorithm, making all elements behave as E-1 objects. This construction passes DIEHARD. Now let me summarize the basic rules of SC design concerning statistical properties:

    1. The SC must be (or act as) an E-1 object, and all its steps must be (or act as) E-1 objects.
    2. All values used by the algorithm must be affected by the self-mutation; none of them should enter the next iteration unaffected.
    3. Wherever the P function is used more than once in a step, it should be referenced by independent indexes.
    4. No recursive reference to P of degree greater than 2 is allowed.
    5. No function of the form G(x) = F(x)*x should be used (where * is an E-1 operation).

    The goal of all these rules is to preserve entropy: if even one step produces an entropy loss, the behavior of the whole algorithm is threatened. Good statistical behavior is the main precondition for a secure algorithm.

     

6. Security Problem

    In all classic books about cryptography you can read that there is no secure algorithm, only algorithms that are more or less complex to break. I am going to show that it is possible to design provably secure algorithms. The output sequence and all details of the algorithm are unconditionally available to the attacker. The attacker does not know the state of the algorithm; his objective is to reconstruct the generator state so he can produce any further output. Because setting the initial state is in fact the keying of the SC, this means the attacker does not know the key. This is the standard security assumption: the only unknown is the key, and the whole security of the algorithm relies on the secrecy of the key.

    The classical approach to this problem is to make the algorithm a complex transformation, so that having pairs x, F(x) (recall that every SC is x_k = F(x_{k-1})) is not enough to reconstruct F at a given point. But using only this approach we are always forced to make assumptions about security, resting the security of the design on the hardness of solving or inverting some mathematical construction.

    Before I define my idea of how to make a provably secure SC, here is a little illustrative story.

  • Old Puzzle Problem

    There is a puzzle, a very old one, and someone is trying to solve it. But the pieces are old and damaged, so it is hard to distinguish their shapes with certainty. Two originally different pieces now look almost the same, and there is no way to recover their original shapes. We can try to guess the original shape of a piece, but there are so many of them, and the colors are damaged as well. Some pieces are in better condition, so we have a general idea of what the puzzle depicted, but we can never say it for sure.

  • I am using this museum story to introduce the use of the same idea in SC design. For example, take a PRBG with an 8-bit output per iteration; if we discard one of the bits we get a 7-bit value (I call this the "secondary output"), so the constructed output sequence does not carry the whole information about the generated values. With x_k = F(x_{k-1}) as an 8-bit generator, a generated value with its first bit discarded looks like ZYYYYYYY, where Z is the discarded bit. Let Z’’XXXXXXX and Z’YYYYYYY be two successive values connected by Z’YYYYYYY = F(Z’’XXXXXXX); but Z’ and Z’’ are unknown, so there are 4 different equations and only one of them is true:

  • 0YYYYYYY=F(0XXXXXXX);

    0YYYYYYY=F(1XXXXXXX);

    1YYYYYYY=F(0XXXXXXX);

    1YYYYYYY=F(1XXXXXXX);

  • It is impossible to determine mathematically which one is true if Z is unknown. A prediction can be made, but that is equivalent to guessing the original shape of a puzzle piece. If the generator output is 32 bits but only one bit of each generated value is used to form the output sequence, then guessing the 31 missing bits is practically impossible.

    This is the "information loss" approach to the SC security problem. Because the output available to the attacker does not carry all the information needed to mount an attack, he cannot do so, even if he has found an efficient way to break the generator in its pure form. This approach relies on the 4th assumption about randomness from the beginning of this thesis:

  • Discarding some positions in a random sequence A by a fixed rule results in a sequence A’ that is also random.
  • The principle of Information Loss can be formulated as follows:

  • In order to design a provably secure SC, only two conditions must be satisfied:
      1. The generator passes the statistical tests of randomness.
      2. The generated output sequence does not carry the whole information contained in the output values used to construct it.

    The Information Loss technique is just one realization of the main principle of SC security:

  • Knowing a finite number of Input-Output pairs is not enough to reconstruct the state of the algorithm at a given point.
  • If you examine the DIEHARD tests carefully, you will see that most of them operate on particular bit groups taken from sequential 32-bit blocks. This is because the 4th assumption must be satisfied by random sequences. Linear congruential generators are mathematically insecure and breakable, but they have excellent statistical behavior. If the Information Loss approach is applied and, for example, only the 30 most significant bits are used to form the output of the generator, can any mathematician break it? This strategy is a simple way to design an absolutely secure SC: any statistically strong generator can be upgraded to a secure one using the Information Loss approach.
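    As a toy illustration of this upgrade, the sketch below takes the same kind of linear congruential generator as in section 3 and publishes only the top bits of every internal value; the constants and the number of retained bits are arbitrary illustration choices, not a recommendation.

        def truncated_lcg(seed, keep_bits=8, m=1664525, n=1013904223, g=2**32):
            """LCG whose published output keeps only the `keep_bits` most
            significant bits of each 32-bit internal value; the discarded
            low bits are the information an attacker would have to guess."""
            x = seed
            while True:
                x = (x * m + n) % g
                yield x >> (32 - keep_bits)   # publish only the most significant bits

        gen = truncated_lcg(seed=2024)
        print([next(gen) for _ in range(8)])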

    Another security problem is keying. Given a SC procedure, how do we put the key into the design? The usual approach is to make the initial state of the generator key-dependent; there are standard solutions for building an initial permutation from a given key (see the sketch below). Another way is to use the key as a plaintext, switching the SC function from X_k = F(X_{k-1}) (used in encryption, C_k = P_k * X_k) to X_k = F(C_{k-1}) as a keying mode: "over-encrypt" the key a couple of times until the inner state of the SC is sufficiently affected by it, after which the generation process can continue as usual. There are many keying methods, each suitable for a given design; what matters is to define which part(s) of the inner state of the SC can be keyed and to choose the best available method.
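    As an assumed example of the first approach (it is not the keying method prescribed by this thesis), one widely known standard solution for deriving an initial permutation from a key is an RC4-style key schedule:

        def key_schedule(key, size=256):
            """RC4-style key scheduling: derive an initial permutation from a key.
            `key` is a byte string, `size` is the table size of the cipher."""
            p = list(range(size))
            j = 0
            for i in range(size):
                j = (j + p[i] + key[i % len(key)]) % size
                p[i], p[j] = p[j], p[i]
            return p

        p0 = key_schedule(b"an example key")   # initial permutation for, e.g., EC-5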

    The last security dilemma discussed here is the order of the steps; it is hard to tell which order is ideal. There are two approaches: (1. self-mutation, 2. output generation) or the reverse. If there is no entropy problem, the ST-DT-ORG order should be used: the output of the iteration is then affected by the mutation, which makes the process harder to reverse. If the ORG-ST-DT or ORG-DT-ST order is used, the output of the iteration is the value according to which ST is done, so the attacker gets two pieces of information: the input to ORG and its output, which is also the input to ST. This is enough to attack the ORG step, and that potentially leaks information about the state of P. With the ST-DT-ORG order the attacker knows only the input and the output of the whole chain; to reverse ST he must first reverse ORG. In this situation the attacker is forced to make predictions about the behavior of the whole chain, which is more complex than predicting just one step of the algorithm (as in the ORG-(ST-DT) chain).

     

7. Quasi E-1 transformations

    This part of the thesis answers the question: are E-1 transformations the only ones usable in SC design? Here is an example cipher:

  • EC-6

    Input: in

    Output: out

    Counter: C

    Variation: V[0..8191]

  • V[in]=(V[in]+V[C]) mod 8192 ST

    V<<<7 DT

    Out=V[V[in] + V[C]] ORG

  • {* The output sequence is formed from the 8 LSB of the generated values *}

  • If we analyze this cipher carefully, we conclude that no step is an E-1 transformation, yet this SC does not fail DIEHARD. Why? The ST step cannot be analyzed statically; we must test statistically how the transformation behaves. If it did not behave like an E-1 object, the whole cipher would be biased. This algorithm does not get stuck, and it is quite random and secure. The conclusion from this example is:

  • Transformations used in SC design do not need to be arithmetically E-1 objects; it is enough that they produce an approximately uniform output distribution, i.e. that they act as E-1 (Quasi-E-1, QE-1) objects.
  • This is the 4th entropy assumption: instead of E-1 objects, Quasi-E-1 objects can be used. Security actually increases when QE-1 objects are used. This can be explained by the unpredictability of QE-1 functions: at a given point they do not have an exactly uniform distribution, but they behave as if they did. Anyone analyzing a SC designed with QE-1 transformations is forced to make many more predictions about the algorithm's behavior than when analyzing a SC built from E-1 transformations, and this increased amount of required prediction makes the analysis attempt nearly impossible.

    When QE-1 transformations are used, the Information Loss must be increased, because not all the bits produced by this kind of generator are secure. If EC-6 is implemented over base 256 it is statistically weak.

     

8. SQ1 Algorithm

    SQ1 is an algorithm of my own design which implements all the theory described in this thesis. It is a hybrid design using both E-1 and QE-1 transformations, secure and statistically strong, and a very simple and fast solution for generating pseudo-random bytes wherever they are needed.

  • Algorithm: SQ-1

    Sum Register: Sr (This is the input value In)

    Output: Out

    Variation: V[MS]

    Permutation: P[MS]

    Other variables: a,b,c

  • a=P[Sr]; b=V[a]; XP

    V[a]=b; VT

    Swap P[a], P[b]; ST

    P<<<1; DT

    Out=P[P[a]+P[b]]; ORG

    Sr<<1;

    Sr=(Sr+Out) mod MS;

  • Earlier in this thesis I defined some basic rules which must be respected in SC design for the SC to have good statistical behavior. Here is an explanation of how SQ1 complies with each of them:

    The SC must be (or act as) an E-1 object, and all its steps must be (or act as) E-1 objects.

    This rule is respected because every step (ST, DT, ORG) in the algorithm is an E-1 object. The first two steps (XP, VT) are not part of the general template; they are QE-1 objects.

    All values used by the algorithm must be affected by the self-mutation; none of them should enter the next iteration unaffected.

    The only value inherited from the previous iteration is Sr (the sum register, which absorbs the last output value). At the beginning of the iteration it is expanded into two values in such a manner that it is affected by the last transformation of P and V.

    Wherever the P function is used more than once in a step, it should be referenced by independent indexes.

    The first two steps are there to ensure that this rule is respected. The XP step expands Sr into the two "independent" indexes required by the later steps. The VT step transforms the Variation according to the last transformation of the Permutation, ensuring that XP in the next iteration is affected by the self-mutation.

    No recursive reference to P of degree greater than 2 is allowed.

    The only recursive reference to P is in the ORG step, and it is of degree 2.

    No function of the form G(x) = F(x)*x should be used.

    There is no function of the forbidden form.

    The Variation is responsible for making all attempts to solve the relation X_k = F(X_{k-1}) impossible. In the XP step the previously generated value is expanded into two values using V and P; a and b are expected to act as independent values. In fact they are not, but in order to produce a from b the attacker must know both P and V. Because V is as unpredictable as P, there is no way to reverse the iteration and reconstruct the previous state from a finite number of successive values produced by SQ1. Here is the maximum an attacker can obtain when trying to reverse an iteration:

    According to this, predicting the behavior of a given iteration reduces to finding P and V, and the complexity of going backward is MS (this means that at least log2(MS) bits of information must be predicted).

    The Variation V is a realization of the keying method proposed in the "Security Problem" section of this thesis. Finding V is as difficult as finding the key. The Variation is used to "encrypt" the input to the algorithm, and because of this there is no mathematically expressible relation between In and Out.

    Another design element that increases the complexity of the algorithm is the Sum Register. SQ1 is a totally recursive cipher of the form X_n = F(X_0 + X_1 + ... + X_{n-1}). Cracking this construction is much harder than cracking the basic concept X_n = F(X_{n-1}). Interacting with the Information Loss, this technique increases the amount of prediction required to break the security of SQ1.

    Summarizing the statistical behavior of SQ1, the conclusion is that every structure used is an E-1 or QE-1 structure. The Variation V is a QE-1 structure; because its transformation is always done according to the mutation of P in the previous iteration, V can be taken to exhibit approximately the same statistical properties as P. This is a realization of the 3rd assumption about randomness from the beginning of this thesis. I have shown above that a self-mutating Variation cipher (EC-6) can also be secure under some assumptions. Even though Variation ciphers are not theoretically as secure as Permutation ciphers, they are much more unpredictable.

    The security of SQ1 is guaranteed by the Information Loss technique; see the source code at the end of this thesis for details. As shown previously, the complexity of predicting an iteration is equal to the range of the data structures used. If V and P have range 0-255 (char), then every output requires at least 8 bits of predicted information to establish the link between In and Out. This equals the output entropy, so we have the paradox that in order to guess the iteration transform we need more information than the output carries about the process that produced it. Adding the Information Loss technique, it turns out that with a 10-bit range for the algorithm's data structures and an 8-bit secondary output, the attacker has to make at least 12 bits of prediction in order to reconstruct the Variation at a given point, even knowing the Permutation.

     

9. Conclusion

    In this thesis I have tried to explain my ideas about Stream Cipher design, and at the end I have proposed a SC of my own design. SQ1 is a secure and statistically strong algorithm. I hope this thesis will join the attempt to initiate an open academic discussion about SC design principles; just as there is a great deal of analysis of Block Ciphers, we should pay the same attention to SCs.

    At the beginning of the thesis three questions were posed as the main subject of analysis; let me now summarize the answers.

     

    What are the basic rules for designing a statistically good SC?

    In part 5 I summarized the basic rules for constructing a statistically strong SC:

      1. The SC must be (or act as) an E-1 object, and all its steps must be (or act as) E-1 objects.
      2. All values used by the algorithm must be affected by the self-mutation; none of them should enter the next iteration unaffected.
      3. Wherever the P function is used more than once in a step, it should be referenced by independent indexes.
      4. No recursive reference to P of degree greater than 2 is allowed.
      5. No function of the form G(x) = F(x)*x should be used.

     

    What are the basic requirements for a given SC to be secure?

    In parts 6 and 7 I have shown that the main requirement for a given SC design to be secure is:

  • Knowing a finite number of Input-Output pairs is not enough to reconstruct the state of the algorithm at a given point.

    Are there any methods that can be proved to be resistant to any breaking attempt?

    This question is answered under the assumption that the attacker has finite (realistic) computing power. In the discussion of SQ1 I have shown that for every Input-Output pair we need at least MS + InfoLost bits of additional information in order to guess the right state of the Variation knowing the Permutation. Because in practice the Permutation is never available to an attacker in any state, this attack is impossible. The Information Loss Technique is thus a general approach to making a given SC secure, because it forces the SC to obey the basic security requirement stated above.

    The thesis in front of you has thus answered the questions it set out to address. Most of the terms used here are not standardized, because few standard terms exist in this area. This thesis has shown that there is a general approach to SC design and has presented some of the basic rules of the game. Nothing in this thesis is a recipe for cooking up secure SCs; the rules, assumptions and proposed solutions here are merely conditions that every SC must satisfy in order to be secure. Every design must still be statistically tested to see whether it is sound, and for every design all possible reversal attempts should be analyzed in order to determine the minimum quantity of information that is missing for an attacker to predict the values of the algorithm's data structures at a given point from Input-Output pairs. A much deeper analysis of SQ1 is required. I will be happy if you find this subject worth your interest and time.