Security in Computing • Chapter 12, Cryptography Explained • Part 1 • Summary created by Kirk Scott
This set of overheads corresponds to the first portion of section 12.1 in the book • The overheads for Chapter 12 roughly track the topics in the chapter • Keep this in mind though: • On some topics I simply go over the book’s material • On other topics I expand on the book’s material in a significant way • You are responsible not just for what’s in the book, but also what’s in the overheads that’s not in the book
1. Overview • There are several sources of tension in cryptography • A system should be easy to use • It should be difficult to break • Cryptography is the most important security tool • It is only valuable within a context of other protocols and technologies
Development of cryptographic systems should be left to the experts • It’s too difficult for amateurs • Improperly implemented, cryptography only gives a false sense of security • Application of cryptographic systems should be within the reach of the average user • If they’re too difficult, they will be avoided
2. Hard Problems • Cryptographic systems are based on so-called hard problems • Sound systems have their foundation in advanced math and theoretical computer science • The goal of this chapter is to explain, generally, the mathematical approaches involved • The mechanics of how these systems are incorporated into the Web, for example, will not be covered
3. Mathematics for Cryptography • These are some of the topics covered in this part of the book • Complexity • NP-Completeness • Examples of NP-Complete Problems • P, NP, and EXP Problems
These topics relate to the question, “What is a hard problem?” • In cryptography, “hard” has an additional characteristic • Not only should the system be based on a hard problem • The hard problem should not be susceptible to an easy solution.
4. Complexity • The basis for this discussion comes from theoretical computer science • What does the phrase “NP-complete” signify? • The book gives an intuitive explanation • I will just follow along • Since this isn’t a class on theory, it’s not necessary to master these concepts in order to continue considering cryptographic systems
5. Three Examples of NP-Completeness • The satisfiability problem • The knapsack problem • The clique problem • “Easy to state” • “Not hard to understand” • “Straightforward to solve” (using brute force to check all possible solutions) • No other (apparent) solutions
6. Satisfiability • Let logical expressions be created using these rules: • They contain logical variables v1, v2, …, vn, which can take on a value of T or F • The variables can be negated • The variables are combined into clauses using logical OR • The clauses are combined into an expression using logical AND
The expression is satisfiable if there exists a set of T/F values for the vi that cause the expression to evaluate to T overall • The brute force approach: • Test all different possible combinations of T/F for the vi, checking to see whether any cause the expression to evaluate to true
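The brute force approach to satisfiability can be sketched in a few lines of Python. This is only an illustration, not something taken from the book: the encoding of a clause as a list of signed integers (+i for vi, -i for NOT vi) and the names `evaluate` and `brute_force_sat` are assumptions made for this example.

```python
from itertools import product

def evaluate(cnf, assignment):
    """Return True if every clause contains at least one true literal.

    Assumed encoding: a literal is a nonzero int, +i meaning v_i and
    -i meaning NOT v_i. assignment[i - 1] holds the T/F value of v_i.
    """
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in cnf
    )

def brute_force_sat(cnf, n):
    """Try all 2**n assignments; return the first satisfying one, or None."""
    for assignment in product([True, False], repeat=n):
        if evaluate(cnf, assignment):
            return assignment
    return None

# (v1 OR NOT v2) AND (v2 OR v3) AND (NOT v1 OR NOT v3)
example = [[1, -2], [2, 3], [-1, -3]]
solution = brute_force_sat(example, 3)
print(solution)
```

The loop in `brute_force_sat` is where the exponential cost lives: each extra variable doubles the number of assignments that may have to be tried.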
7. Knapsack • Let a set of non-negative integers, S = {a1, a2, …, an} be given • Let a target value, integer T, be given • Is there a subset of S such that the sum of its elements = T? • I.e., if each vi can only be 1 or 0, is there a set of vi such that: $\sum_{i=1}^{n} v_i a_i = T$
In other words, T is the knapsack • The ai are objects to be put into it • Is there some set of objects that will fill the knapsack exactly? • The brute force approach: • Test all different possible combinations of 1 or 0 for the vi, checking to see whether the dot product of a and v equals T
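The brute force knapsack check can be sketched as follows. This is a minimal illustration under assumed names (`brute_force_knapsack`, and the sample values in `a`), not code from the book.

```python
from itertools import product

def brute_force_knapsack(a, target):
    """Try every 0/1 vector v; return the first one whose dot product
    with a equals target, or None if no subset fills the knapsack."""
    for v in product([0, 1], repeat=len(a)):
        if sum(vi * ai for vi, ai in zip(v, a)) == target:
            return v
    return None

# Hypothetical instance: can some of 3, 5, 8, 11 add up to exactly 16?
a = [3, 5, 8, 11]
print(brute_force_knapsack(a, 16))  # selects 5 and 11
print(brute_force_knapsack(a, 2))   # no subset works
```

As with satisfiability, the number of candidate vectors is 2 to the power of the number of objects.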
8. Clique • Let a graph G of p nodes be given • To be a graph, each node has to be connected to at least one other node • Let n <= p be given • Is there a subset of n fully interconnected nodes (a clique) in G? • On the following overhead, a clique of size 4, (v1, v2, v7, v8) is shown
The brute force approach: • Test all different possible subsets of G of size n, checking to see whether any are fully interconnected • This problem is not exactly like the other ones • This will be pursued shortly
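A brute force clique search might look like the sketch below. The small graph used here is made up for the example (it is not the graph on the overhead), and the function name `brute_force_clique` is an assumption.

```python
from itertools import combinations

def brute_force_clique(nodes, edges, n):
    """Test every subset of n nodes; return the first subset in which
    every pair of nodes is connected, or None if there is no such subset."""
    edge_set = {frozenset(e) for e in edges}  # undirected edges
    for subset in combinations(nodes, n):
        if all(frozenset(pair) in edge_set
               for pair in combinations(subset, 2)):
            return subset
    return None

# Hypothetical graph: a triangle 1-2-3 with a tail 3-4-5
nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
print(brute_force_clique(nodes, edges, 3))
print(brute_force_clique(nodes, edges, 4))
```

Note the two nested uses of `combinations`: the outer one chooses the n nodes, the inner one checks every pair among them.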
9. Characteristics of NP-Complete Problems • 1. Solvable by checking all possibilities • Either you find a solution or you find that one doesn’t exist • 2. There are $2^n$ possibilities • Each can be checked in some bounded time quantum, so the time complexity overall is $2^n$, exponential in the size of the problem • 3. Observe that problems come from different areas: logic, math, graph theory • However, in a sense they are all the same
4. If you could guess perfectly, checking the proposed solution is quick • The book makes this more specific: checking one possibility is of polynomial complexity • On the surface the idea of guessing may seem irrelevant • It can be related to cryptography as follows: • Knowing the algorithm/key means the system can be used in polynomial time • It’s the attacker that’s condemned to exponential time
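To make the contrast concrete, here is a small sketch using the knapsack problem as the example. Checking one proposed answer takes a single pass over the objects, while the number of candidate answers doubles with each added object. The function names are assumptions made for this illustration.

```python
def check_guess(a, v, target):
    """Verify one proposed 0/1 vector v: a single pass, linear in len(a)."""
    return len(a) == len(v) and sum(vi * ai for vi, ai in zip(v, a)) == target

def number_of_candidates(n):
    """How many 0/1 vectors someone without the answer may have to try."""
    return 2 ** n

a = [3, 5, 8, 11]
print(check_guess(a, (0, 1, 0, 1), 16))  # the holder of the "key" checks quickly
print(number_of_candidates(64))          # the attacker's search space for n = 64
```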
10. Side Note on the Clique Problem • It seems clear that solving satisfiability and the knapsack problem are $2^n$ problems • The clique problem is different • It depends first on how many different ways there are to choose n distinct nodes out of p total • It then depends on checking for every possible connection among the n
Choosing n from p, “p choose n”, is the binomial coefficient: $\binom{p}{n} = \frac{p!}{n!\,(p-n)!}$ • The number of edges in a fully interconnected graph with n nodes is n(n - 1)/2
Then an expression for the complexity would be the product of these two: $\binom{p}{n} \cdot \frac{n(n-1)}{2}$ • Strictly speaking, this is factorial, not exponential • However, from a computational perspective, factorial is as impractical as exponential
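The product of the two counts can be evaluated directly. The sketch below uses Python's standard `math.comb`; the function name `clique_checks` and the sample sizes are assumptions for illustration.

```python
from math import comb

def clique_checks(p, n):
    """Worst-case number of edge checks: (ways to choose n nodes out of p)
    times (number of edges among n fully interconnected nodes)."""
    return comb(p, n) * (n * (n - 1) // 2)

print(clique_checks(8, 4))    # 70 subsets times 6 edges each
print(clique_checks(40, 20))  # grows very quickly with p and n
```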
11. A Mathematical Interlude on Binomial Coefficients and Fully Interconnected Graphs • In order to understand the general discussion so far, it would not be necessary to go further into the math • However, serious math will be coming, and this is a good time to start getting used to it • Derivations of the formulas for binomial coefficients and fully interconnected graphs will be presented
12. Binomial Coefficients • Consider the question of how many different ways there are of choosing k elements out of a set of n without repetition: • There are n choices for the first, (n – 1) remaining for the second, (n – 2) for the third, down to (n – k + 1) for the kth • The choices are independent of each other, so the total number of choices for all k is the product of these factors: • n(n – 1)(n – 2)…(n – k + 1)
There is no repetition among the individual k elements of the set • But some element x could be chosen first, second, third, …, or last • In other words, we are interested in the number of different arrangements of a set of k elements • This would be k!
The binomial coefficient, the number of different ways of choosing k elements from n, is the total number of ways of choosing elements without repetition, divided by the number of different possible orderings of the k: $\binom{n}{k} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k!}$
The numerator can be expressed differently if the denominator is changed as well • Or put another way, the appearance of the formula can be changed by multiplying both the numerator and the denominator by (n – k)!: $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$
Note that for n and k integer we expect the binomial coefficient to be an integer • In other words, k!(n – k)! should go into n! evenly • This could be proven, but even the mathematicians tend to wave their hands at this • It may be intuitively apparent
It’s also possible to give a rather circular verbal argument • For each of the subsets of size k that we’re trying to find, n(n – 1)(n – 2)…(n – k + 1) has to include all of the arrangements of the k elements • Therefore, it should be divisible by k!, the number of arrangements
In other words, if (n choose k) is what we say it is • And k! is what we say it is, the circular argument which shows that n(n – 1)(n – 2)…(n – k + 1) is an integer can be expressed this way: $\binom{n}{k} \cdot k! = n(n-1)(n-2)\cdots(n-k+1)$ • The product of two integers should give an integer
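The two forms of the binomial coefficient, and the claim that k! divides the product n(n – 1)…(n – k + 1) evenly, can be checked numerically. This is a sketch with assumed function names; a numerical check over a few values is not a proof.

```python
from math import factorial

def falling_product(n, k):
    """n(n - 1)(n - 2)...(n - k + 1): ordered choices of k out of n."""
    result = 1
    for i in range(k):
        result *= n - i
    return result

def choose_by_product(n, k):
    """Ordered choices divided by the k! orderings of each subset."""
    return falling_product(n, k) // factorial(k)

def choose_by_factorials(n, k):
    """The same value written as n! / (k! (n - k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))

for n in range(1, 8):
    for k in range(0, n + 1):
        # k! goes into the falling product evenly
        assert falling_product(n, k) % factorial(k) == 0
        # both forms agree
        assert choose_by_product(n, k) == choose_by_factorials(n, k)

print(choose_by_product(5, 2))
```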
13. Fully Interconnected Networks • You can draw a set of nodes and start drawing the connections between them and by counting, come to the following conclusion for n nodes: • The number of connections for the first node is (n – 1), the number for the second node is (n – 2), for the third is (n – 3), and so on down, with the (n – 1)st node having 1 connection
Reversing the pairing of node number and connection count would give the same count, only in a more convenient form: • 1st node, 1 connection; 2nd node, 2 connections; …; (n – 1)st node, (n – 1) connections • Then the total number of connections is: $1 + 2 + \cdots + (n-1) = \sum_{i=1}^{n-1} i$
The point is to derive an algebraic summation for this expression • This will be done by an inductive proof • The reason for doing this is to introduce you to or refresh your memory of inductive proofs • They will come up again later when explaining some of the math for encryption • It’s nice to get a relatively simple, understandable example early on
It is easier to deal with this expression, the sum of the first n integers: $\sum_{i=1}^{n} i$ • Inductive proofs are like recursion for mathematicians • You have a base case which can clearly be shown • You also have a hypothesis about what the result should be
If you can show that the hypothesis holding for the (k – 1)st case implies that it holds for the kth case, then the hypothesis works for all cases from the base case on up • The hypothesis will be: $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$ • This is what we want to show
For n = 1, the base case, this is easy: • n(n + 1)/2 = 1(1 + 1)/2 = 2/2 = 1 • The sum of the integers from 1 to 1 is clearly 1 • Now suppose that this holds for (k – 1), that is, $\sum_{i=1}^{k-1} i = \frac{(k-1)k}{2}$
The sum of the first k would be the previous expression plus k: • $\sum_{i=1}^{k} i = \frac{(k-1)k}{2} + k$ • $= \frac{(k-1)k}{2} + \frac{2k}{2}$ • $= \frac{(k-1)k + 2k}{2}$ • $= \frac{k^2 - k + 2k}{2}$ • $= \frac{k^2 + k}{2}$ • $= \frac{k(k+1)}{2}$, QED
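We showed that it worked for n = 1 • We then showed that if it worked for k – 1, it worked for k • This means that if it worked for 1, it works for 2 • If it works for 2, it works for 3 • If it works for 3, it works for 4 • Ad infinitum • Replacing n with n – 1 in n(n + 1)/2 gives (n – 1)n/2, the number of connections among n fully interconnected nodes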
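The closed form for the sum, and its use for counting connections, can be spot-checked numerically. This is only a sanity check under assumed function names. It complements the inductive proof but does not replace it.

```python
def sum_first(n):
    """1 + 2 + ... + n, added up term by term."""
    return sum(range(1, n + 1))

def closed_form(n):
    """The hypothesis: n(n + 1)/2."""
    return n * (n + 1) // 2

def complete_graph_edges(n):
    """Connections among n fully interconnected nodes: 1 + 2 + ... + (n - 1)."""
    return sum(range(1, n))

for n in range(1, 50):
    assert sum_first(n) == closed_form(n)
    assert complete_graph_edges(n) == n * (n - 1) // 2

print(sum_first(10), closed_form(10), complete_graph_edges(4))
```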
14. Where Were We? • The interlude just showed that some of the math that we’re using can be derived • The derivations in this case were only important as an introduction to the math we’ll be doing later on • The topic at hand is still the question of complexity and the meaning of NP-Completeness
15. The Classes P, NP, and EXP • Let P stand for the set of all problems that can be solved by an algorithm with complexity bounded by a polynomial • Without having a supercomputer or a network of parallel, cooperating computers, polynomial algorithms tend to be at the limit of practicality for implementation
Formally, NP stands for “non-deterministic polynomial”, that is, the set of problems that can be solved in polynomial time by a non-deterministic algorithm, one that is allowed to guess • It is understandable, but misleading, to think of NP as meaning simply “not polynomial”
An NP problem is one that would have a polynomial solution if you could guess perfectly • You could restate the meaning of P by saying that such problems/algorithms are deterministically polynomial • NP problems are not known to be bounded by polynomial complexity in the way P problems are
EXP stands for the set of all problems that can be solved by an algorithm with complexity bounded by an exponential • NP problems are not exactly the same as EXP • On the other hand, judging from the examples given earlier, if you’re reduced to guessing, in the worst case you’ll have to guess an exponential (or factorial) number of times
Practically speaking, for our purposes, an NP problem can be thought of as one with an exponential solution • In any case, NP or EXP problems are hard problems • They might serve as the basis for a cryptosystem
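A rough feel for why exponential problems count as hard comes from tabulating a polynomial against an exponential for a few problem sizes. The cubic used here is an arbitrary choice for illustration, and `growth_table` is an assumed name.

```python
def growth_table(sizes, degree=3):
    """For each problem size n, return (n, n**degree, 2**n)."""
    return [(n, n ** degree, 2 ** n) for n in sizes]

for n, poly, expo in growth_table([10, 20, 40, 80]):
    print(f"n = {n:3d}   n^3 = {poly:10d}   2^n = {expo}")
```

At n = 10 the two columns are close. By n = 80 the exponential column has 25 digits while the cubic is still about half a million.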
16. NP-Completeness • There is one more theoretical/terminological thing to consider • An NP-complete problem is an NP problem that captures the difficulty of all other NP problems: any NP problem can be converted into it in polynomial time • The NP-complete problems are those that really do break down into (exponential) guessing, where perfect guessing is polynomial
It doesn’t matter what domain the problem comes from, logic, math, graph theory… • The problems are equivalent • If one of them is NP, they all are • If by chance one could be shown to have a polynomial solution, then they could all be solved using the same algorithm
In a sense, NP problems are in a gray area • It is hypothesized that no better solutions exist than exponential guessing • Whether or not they have an easier solution is of interest if you build a cryptosystem on them
17. Karmarkar’s Algorithm • This is just a side note • The logicians and mathematicians claim that they have established various things • Progress in science sometimes shows previous certainties to be false • Up until 1984, no polynomial algorithm that was also efficient in practice had been found to solve linear programming problems (Khachiyan’s 1979 ellipsoid method is polynomial in theory but slow in practice)
Linear programming refers to a set of m equations in n unknowns where you would like to optimize the value obtained by picking the right values for the unknowns • Optimization can be difficult
At first glance, this looks like a problem where you would try to find an answer by checking all possibilities • In a way, it’s worse • With numerical variables, testing all of the possibilities doesn’t make much sense when there is no limit on the value an unknown can take on or it can take on a range of values within the reals
In 1984, building on previous research in the area, Karmarkar devised an optimization algorithm that is categorized as “weakly polynomial” • Anyone who had previously believed that optimization in linear programming was not polynomial was shown to be wrong • Since that time the search has been on for an algorithm that is “strongly polynomial”