Advanced PHP and MySQL

Some Adventures and Experiments
DIG 4104c – Spring 2013
J. M. Moshell


Midterm Exam Results
• Cumulative grades are not available yet: not all presentations are done (we finish those today).
• BUT most projects & presentations scored 85% to 95%, "orbiting around A/B", so the Midterm, Final & Project 3 are 75% of the overall score.
• MTX: 225-250: like an A. 200-224: like a B. 150-199: C-ish.

The Rest of the Semester (by popular request):
• PHP and MySQL
• SOAP and Web Services
• Evaluating Web Services: Classroom Feedback Systems
• Commercial Payment Systems & E-Commerce
• Security Adventures and PCI


Context: Registration Systems Lab
• Custom-coded PHP registration system
• MySQL database (one per conference)
• Uses several credit card gateways (client owned) as well as RSL's own authorize.net gateway

25 to 30 conferences/year
• We charge $9 to $14 per registrant
• We had 26 conferences in 2012
• Employees:
  Carole Mann, President
  David Mann, IT Manager
  Mandy Mann, Conference Manager
  +2 ladies and one part-time professor/designer

Context: Registration Systems Lab
• Specialized feature: multiple gateways for one (complex) conference.
• Problem: Hackers are growing more sophisticated.
• PCI (Payment Card Industry) compliance is getting tougher.

System Architecture
[Diagram. Column headings: conf. code, core code, gateways. Labels shown: core011, cURL, ICSE13, MJAA13, PLDI13, ... etc; ieee, rsl, mjaa, mjmembers, icse13, mjaa13.]

System Manager: moma
moma ("Moshell's Manager") is built with Drupal.

System Manager: moma
[Screenshot of moma, passwords obscured]

Today's problem: Insider Attack
Assume a hacker wants to capture our clients' credit card info.
Assume they're already inside our system and can modify code. (We consider how to keep 'em out ... later.)
What can we do to stop these bandits?
Idea 1: Don't keep cc info in the database.
  - This is a basic rule for PCI* compliance.
* PCI: Payment Card Industry (the compliance standard is PCI DSS, maintained by the PCI Security Standards Council)

Today's problem: Insider Attack
Idea 2: Develop a system to detect any changes to your code. A kind of 'burglar alarm'.
Design constraints:
* must run whenever the code runs, to prevent use when contaminated.
* must not impair the system's functionality: no loss of speed, no frequent interruptions of service.

Developing a Burglar Alarm
Attacking the Burglar Alarm idea:
1) What if the bandito replaces 100% of your code?
   * Must have a periodic external 'audit' to detect this ploy.
   * Unless this audit runs frequently, SOME data will be lost.
2) What if the bandito scans your code and deactivates the alarm?
   * Don't make this easy for them.

Developing a Burglar Alarm
Some axioms of computer security:
1) Nothing is going to work ALL the time. You need layers.
2) Humans are the weakest point in the system. Automate it!
3) Security by Obscurity is a weak basis for a design. But you must start somewhere.
== >> ACT NOW << ==

Digital Signatures
Why don't we just make a duplicate copy of the software, and compare for modifications?
[Diagram: core011 =?= core011a; ICSE13 =?= ICSE13a]

Easy solutions that don't work
1) Comparing a hundred files over & over ... inefficient.
2) The bandit could simply modify BOTH copies.
[Diagram: core011 =?= core011a; ICSE13 =?= ICSE13a]


What about a signature?
Can we design a unique shadow of some kind which is (a) fast to compute, (b) unique, (c) informative?
Fast: Something built into PHP's system, not a PHP loop across 100+ files, 150,000 lines of code.
Unique: every different code-set has a different shadow.
Informative: if shadow1 != shadow2, what does that tell us?

Comes the HASH CODE:
A hash code is produced by an algorithm.
Input: a body of data (e.g. a text file).
Output: a big integer or string.
Properties:
1) Same input tomorrow yields same output.
2) Different inputs are very unlikely to yield same output.
3) Process is not reversible.

Really dumb HASH CODE:
Take in all the characters, convert to numbers, add 'em up. Throw away high order digits.
Text: "This is a text for which we want the hash code."
  84 104 105 etc...
  ---------
  453664
Now the 4 digit hash is 3664.
Change any text letter ... hashcode (probably) changes.
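The "really dumb" recipe above can be sketched in a few lines of PHP. This is only an illustration: the function name dumb_hash and the $digits parameter are made up for this sketch, not taken from the course code.

```php
<?php
// Toy checksum: add up the character codes, keep only the low-order digits.
// dumb_hash() is a hypothetical name used for illustration only.
function dumb_hash(string $text, int $digits = 4): int
{
    $sum = 0;
    $len = strlen($text);
    for ($i = 0; $i < $len; $i++) {
        $sum += ord($text[$i]);        // character -> number
    }
    return $sum % (10 ** $digits);     // throw away the high-order digits
}

echo dumb_hash("abc"), "\n";                     // 97 + 98 + 99 = 294
var_dump(dumb_hash("ab") === dumb_hash("ba"));   // bool(true): same letters, same sum
```

The last line shows why this hash is dumb: reordering the characters leaves the sum unchanged, so a bandit could shuffle code without tripping the alarm.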

Really smart HASH CODE:
sha1 is a hash algorithm built into PHP
- widely used for cryptographic purposes
- used for creating unique keys in git
- input: any file of up to 2^64 bits (a LARGE number)
- it's quite fast, because it's widely used & needed
- produces something like this: 4b5437055d8adaeb9b47c7dfda18f400907cc146
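A minimal sketch of calling PHP's built-in sha1() and sha1_file(). The temporary file is created only for the demonstration.

```php
<?php
// sha1() hashes a string and returns 40 hexadecimal characters (160 bits).
$digest = sha1("abc");
echo $digest, "\n";          // a9993e364706816aba3e25717850c26c9cd0d89d
echo strlen($digest), "\n";  // 40

// sha1_file() hashes a file's contents directly from disk.
$tmp = tempnam(sys_get_temp_dir(), 'sha');
file_put_contents($tmp, "abc");
$fileDigest = sha1_file($tmp);
unlink($tmp);

var_dump($fileDigest === $digest);  // bool(true): same bytes, same hash
```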

The architectural concept of the Alarm:
First line of defense: self-checking against a stored signature. (Hidden, somewhere in our file hierarchy.)
[Diagram: core011 and ICSE13 each run a self-check against hidden signature files (core011a signature, ICSE13a signature).]
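One way the first line of defense might look for a single file, as a sketch under assumptions: the function name self_check and the file names below are hypothetical and are not the production code.

```php
<?php
// Returns true only when the code file's SHA-1 matches the stored signature.
// self_check() and the file names in the demo are hypothetical.
function self_check(string $codeFile, string $signatureFile): bool
{
    if (!is_file($codeFile) || !is_file($signatureFile)) {
        return false;                        // a missing file counts as an alarm
    }
    $stored  = trim(file_get_contents($signatureFile));
    $current = sha1_file($codeFile);
    return hash_equals($stored, $current);   // string comparison of the two hashes
}

// Demo with throwaway files.
$dir = sys_get_temp_dir() . '/alarm_' . uniqid();
mkdir($dir);
$code = $dir . '/regsystem_demo.php';
$sig  = $dir . '/hidden.sig';
file_put_contents($code, "echo 'register';");
file_put_contents($sig, sha1_file($code));   // record the "clean" signature

$before = self_check($code, $sig);           // untouched code
file_put_contents($code, "echo 'register'; /* tampered */");
$after = self_check($code, $sig);            // modified code

var_dump($before, $after);                   // bool(true) bool(false)

unlink($code);
unlink($sig);
rmdir($dir);
```

In a real system the caller would refuse to process registrations when the check returns false.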

The architectural concept of the Alarm:
Second line of defense: periodic audit checks against signatures on a DIFFERENT computer.
[Diagram: a remote audit manager and a local audit agent; core011 and ICSE13 are checked against remote signature files (core011a signature, ICSE13a signature).]


Focus on the first line of defense:
How would you attack this system?
1) find the hidden signature files
2) find the self-check code in ICSE13 or in core011
Why that's hard:
1) * our system has 11 gb in 17,000 files
   * filenames not known
2) * our system has lots of places to look (and what are you looking for?)

Here's a partial list of the code modules:
[Screenshot: list of code modules]
and it's not going to be called "security scan" ... !

A common tactic: Trigger an error message and then search the code base for that error message.
* Defenses:
  1) generate your error messages from a database
  2) scramble the source code so it's unsearchable.
* But remember ... Security by Obscurity is a weak defense!

Another common tactic: run an image of the code
The bandit copies our code and runs it in his own WAMP environment. He looks for file accesses, and for error messages when files are not found.
* Remedy: use file_exists to check for files, and only write to files already found.
* Turn off error messages, so no squawks if files are not found.
* A VERY GOOD hacker will get you anyway, by hacking PHP itself. But maybe we're too much trouble ....
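A sketch of the file_exists remedy, assuming two hypothetical helpers (quiet_read and quiet_update). On a copied system where the hidden files are absent, these helpers neither complain nor create the files, so they leave no trace for the bandit to follow.

```php
<?php
// In production one would also turn off displayed errors, e.g.
// ini_set('display_errors', '0');  (left as a comment in this sketch)

// Read a file only if it already exists; never emit a warning.
function quiet_read(string $path): ?string
{
    if (!file_exists($path)) {
        return null;                       // silently give up
    }
    $data = @file_get_contents($path);
    return $data === false ? null : $data;
}

// Write only to a file that already exists; never create a new one.
function quiet_update(string $path, string $data): bool
{
    if (!file_exists($path)) {
        return false;                      // do not reveal the path by creating it
    }
    return @file_put_contents($path, $data) !== false;
}

// Demo with throwaway files.
$dir = sys_get_temp_dir() . '/quiet_' . uniqid();
mkdir($dir);
$known   = $dir . '/known.sig';
$missing = $dir . '/missing.sig';
file_put_contents($known, 'old');

$r1      = quiet_read($missing);           // null
$r2      = quiet_update($missing, 'x');    // false
$created = file_exists($missing);          // false: nothing was created
$r3      = quiet_update($known, 'new');    // true
$r4      = quiet_read($known);             // "new"

unlink($known);
rmdir($dir);
```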

Designing our Burglar Alarm
Criterion 3: Informative: if shadow1 != shadow2, what does that tell us?
We want our signature to not only holler BURGLAR! but to tell us which "room" he's in, so that we can examine the attack.


An idea: An XML signature
[Diagram: directory tree with directory1, directory2, file1, file2, file3, file4]
<rsl>
  <dir>
    <name>directory1</name>
    <file>
      <name>file1</name>
      <sha>3f4eaa7843...</sha>
    </file>
    <file>
      <name>file2</name>
      <sha>a7844afed...</sha>
    </file>
  </dir>
  etc
</rsl>
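A sketch of how such a signature string could be assembled. Assumptions: the function name xml_signature is hypothetical, and the input is a plain array of directory name => (file name => contents) standing in for real files on disk. It is not the lecture's xemit function.

```php
<?php
// Build an <rsl> signature string in the layout shown above.
// $tree maps directory name => (file name => file contents).
function xml_signature(array $tree): string
{
    $esc = function ($s) {
        return htmlspecialchars((string) $s, ENT_XML1 | ENT_QUOTES);
    };
    $xml = '<rsl>';
    foreach ($tree as $dirName => $files) {
        $xml .= '<dir><name>' . $esc($dirName) . '</name>';
        foreach ($files as $fileName => $contents) {
            $xml .= '<file><name>' . $esc($fileName) . '</name>'
                  . '<sha>' . sha1($contents) . '</sha></file>';
        }
        $xml .= '</dir>';
    }
    return $xml . '</rsl>';
}

$signature = xml_signature([
    'directory1' => ['file1' => 'abc'],
]);
echo $signature, "\n";
// <rsl><dir><name>directory1</name><file><name>file1</name><sha>a9993e364706816aba3e25717850c26c9cd0d89d</sha></file></dir></rsl>
```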

Compare two signatures.
<rsl>
  <dir>
    <name>directory1</name>
    <file>
      <name>file1</name>
      <sha>3f4eaa7843...</sha>
    </file>
    <file>
      <name>file2</name>
      <sha>a7844afed...</sha>
    </file>
  </dir>
  etc
</rsl>
Where the sha values don't match, retrieve the filename and report it.
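The comparison idea, sketched under a simplifying assumption: each signature has already been reduced to a map of file name => sha. The function name changed_files and the sample paths are hypothetical.

```php
<?php
// Report every file whose hash differs between an old and a new signature.
// Both arguments map file name => sha1 hex string.
function changed_files(array $old, array $new): array
{
    $report = [];
    foreach ($new as $name => $sha) {
        if (!array_key_exists($name, $old)) {
            $report[$name] = 'added';
        } elseif ($old[$name] !== $sha) {
            $report[$name] = 'modified';
        }
    }
    foreach ($old as $name => $sha) {
        if (!array_key_exists($name, $new)) {
            $report[$name] = 'removed';
        }
    }
    ksort($report);                 // stable, readable order
    return $report;
}

$old = [
    'directory1/file1' => sha1('a'),
    'directory1/file2' => sha1('b'),
    'directory2/file3' => sha1('c'),
];
$new = [
    'directory1/file1' => sha1('a'),          // unchanged
    'directory1/file2' => sha1('B'),          // contents changed
    'directory2/file4' => sha1('d'),          // new file
];
$report = changed_files($old, $new);
print_r($report);
// directory1/file2 => modified, directory2/file3 => removed, directory2/file4 => added
```

Reporting added and removed files as well as modified ones is a choice made for this sketch; the slide only asks for mismatched sha values.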

So now I know what my tasks are.
1) read the directory structure
2) construct an XML representation, with sha for each file
3) construct a comparator that can report the file with a difference
4) build Level 1 (self-test) into a conference (both for core and for conference-specific code)
5) build Level 2 (auditor test) into moma, across all conferences

Step 1: Prototype directory reading
prototype code hmm1.php
Key PHP functions:
  $d = dir($path);
  $entry = $d->read();
  file_exists($filepath);
  $fs = filesize($filepath);
  $fstuff = implode('', file($filepath));
  $fsha = sha1($fstuff);
YOUR job: understand, investigate or ASK! You need to know WHAT each one does, and WHY I used it.
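A sketch of a recursive directory walk built from the functions listed above. It is not hmm1.php itself: the name dirget_sketch and the result layout are assumptions, and sha1_file() is swapped in for sha1(implode('', file(...))), which hashes the same bytes.

```php
<?php
// Walk $path recursively; return file path => ['size' => ..., 'sha' => ...].
function dirget_sketch(string $path): array
{
    $found = [];
    $d = dir($path);
    if ($d === false) {
        return $found;                          // unreadable directory
    }
    while (($entry = $d->read()) !== false) {
        if ($entry === '.' || $entry === '..') {
            continue;                           // these would recurse forever
        }
        $filepath = $path . '/' . $entry;
        if (is_dir($filepath)) {
            $found += dirget_sketch($filepath); // recursion: the function calls itself
        } elseif (file_exists($filepath)) {
            $found[$filepath] = [
                'size' => filesize($filepath),
                'sha'  => sha1_file($filepath),
            ];
        }
    }
    $d->close();
    ksort($found);
    return $found;
}

// Demo on a throwaway tree: test1/file1.php and test2/file3.php.
$root = sys_get_temp_dir() . '/dirget_' . uniqid();
mkdir($root . '/test1', 0777, true);
mkdir($root . '/test2', 0777, true);
file_put_contents($root . '/test1/file1.php', 'one');
file_put_contents($root . '/test2/file3.php', 'three');

$result = dirget_sketch($root);
print_r($result);

unlink($root . '/test1/file1.php');
unlink($root . '/test2/file3.php');
rmdir($root . '/test1');
rmdir($root . '/test2');
rmdir($root);
```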

Step 1: Prototype directory reading
prototype code hmm1.php
Key programming techniques:
1) Show your results in detail (with <table>) to make it easier to diagnose and debug
2) Recursion: dirget CALLS ITSELF!
3) Limiting recursion. Why do we exclude path '.' ?


PRACTICE PROBLEM #1
Note: There will be several Practice Problems through this lecture.
If you want an A on the final exam, WORK MOST OR ALL OF THEM.
If you want to not get a demerit for next Monday's lecture, WORK AT LEAST ONE OF THEM.
Your entire team can work the same one, as long as you can demonstrate and explain your results.

PRACTICE PROBLEM #1
Take the demo program hmm1.php and modify it so that it simply prints out a nice looking, hierarchical listing of the contents of the directory to which it is pointed.
Example:
  test1
    file1.php
    file2.php
  test2
    file3.php
    file4.php

Step 2: XML
Prototype hmmXML2.php
Goal: create an XML text file that stores the results of the directory traverse from prototype 1.
Method: Find a working XML example, and "steal" elements of it. The example function 'xemit' is my "resource mine".

Step 2: XML
Prototype hmmXML2.php
Examine 'xemit'. Note how it wants an XML string as a 'seed'.
I discover that the example's XML string seed setup requires a specific syntax (left over from VERY EARLY PHP).
Analyze hmmXML2.php. Identify the key new commands.

PRACTICE PROBLEM #2
"Retrograde" the example hmmXML2.php. That is, make it write the file 'xout.html' from the movie example, rather than from the directory system.
Note: at this point I'm using an old function 'textsaver' that was designed to write out arrays of text. But I have only one 'line' of text (i.e. one string variable), so I put it into an array cell, $text[0].

Step 3: Read a stored file & compare
Skipping forward to prototype hmmXML6.php. Read MAIN to see what's happening:
1. Load a file named xdata.xml (the previous scan).
2. Store this text in $xtext1.
3. Do the dirget magic to create new $xtext2.
4. Write this as the NEW xdata.xml file.
5. Now we scan for a mismatch, using substr.
   If no mismatch, print "no mismatch found";
   else try to find the <file> tag and say WHERE!
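Step 5 might be sketched as follows. This is not the code from hmmXML6.php: the function names are hypothetical, and the scan compares characters by index rather than with substr.

```php
<?php
// Position of the first differing character, or -1 if the strings are identical.
function first_mismatch(string $a, string $b): int
{
    $n = min(strlen($a), strlen($b));
    for ($i = 0; $i < $n; $i++) {
        if ($a[$i] !== $b[$i]) {
            return $i;
        }
    }
    return strlen($a) === strlen($b) ? -1 : $n;
}

// Name of the <file> entry in $new that contains the first mismatch,
// null when the signatures match, '(unknown)' when no <file> tag precedes it.
function mismatched_file(string $old, string $new): ?string
{
    $pos = first_mismatch($old, $new);
    if ($pos === -1) {
        return null;                                   // no mismatch found
    }
    $start = strrpos(substr($new, 0, $pos), '<file>'); // nearest <file> before the mismatch
    if ($start === false) {
        return '(unknown)';
    }
    $open  = strpos($new, '<name>', $start);
    $close = strpos($new, '</name>', $start);
    if ($open === false || $close === false) {
        return '(unknown)';
    }
    $open += strlen('<name>');
    return substr($new, $open, $close - $open);
}

$xtext1 = '<rsl><dir><name>directory1</name>'
        . '<file><name>file1</name><sha>aaaa</sha></file>'
        . '<file><name>file2</name><sha>bbbb</sha></file></dir></rsl>';
$xtext2 = str_replace('bbbb', 'bbcb', $xtext1);        // simulate a changed file2

$where = mismatched_file($xtext1, $xtext2);
echo $where === null ? "no mismatch found" : "mismatch in: $where", "\n";
// mismatch in: file2
```

This scan reports only the first changed file; a full report needs a per-file comparison.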

Step 4: Production Code
I have replaced critical information with xyz in the fourth ("production") version, as it's embedded in live commercial code.
Demonstrate with localhost:icse13 control=xyz; then modify regsystem.php, try xyzcheck, then try control=regtest.
Examine the function 'unspooger'.
Discuss how vulnerable this code REALLY is ... Dreamweaver can seek out the word 'correct' in <1 second.

Part 2: MySQL Extended Example
In 3134 we do 'toy' problems with small tables.
In RSL we have real-world databases (complex, but small).
Table structure:
[Figure: table structure]

Part 2: MySQL Extended Example
Objectives of the system:
1) Flexibility: each conference has different data needs, but we DO NOT want a unique database structure for each.
2) Historical record: We need to know all additions, deletions, errors and corrections. This is accounting for big bucks.
So we analyzed Drupal's table structure and stole (much of) it.

Part 2: MySQL Extended Example
users: attendee number, login ID, password (encrypted), salt
  (We'll discuss 'salt' in the security lecture.)
transactions: transaction ID, attendee number, date, time, worker
So a given user can have any number of transactions, each identified by 'tid' (transaction ID), an integer.


Part 2: MySQL Extended Example
A transaction tracks 4 kinds of information:
transtrings: any data that is not financial, e.g. names, addresses.
  tid, fieldname, fieldvalue (up to 50 characters)
trantexts: like transtrings, but can have text of ANY size.
  tid, fieldname, fieldvalue (any size)
tranumbers: how many of something the person buys.
  tid, fieldname, value, attendee type, paywhen, annotation
tranmoney: payments, refunds, balances due.
  tid, fieldname, amount, payclass, ..when, .. etc
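To see why this layout is flexible, here is a sketch in plain PHP arrays with no database involved. The rows, field names such as 'banquet_tickets', and the function transaction_fields are all invented for illustration; only the table names and the tid, fieldname, fieldvalue, value and amount columns come from the slides.

```php
<?php
// Each table holds one row per (tid, fieldname); sample rows are made up.
$tables = [
    'transtrings' => [
        ['tid' => 101, 'fieldname' => 'lastname', 'fieldvalue' => 'Lovelace'],
        ['tid' => 102, 'fieldname' => 'lastname', 'fieldvalue' => 'Hopper'],
    ],
    'trantexts' => [
        ['tid' => 101, 'fieldname' => 'dietary_notes', 'fieldvalue' => 'Vegetarian, no nuts.'],
    ],
    'tranumbers' => [
        ['tid' => 101, 'fieldname' => 'banquet_tickets', 'value' => 2],
    ],
    'tranmoney' => [
        ['tid' => 101, 'fieldname' => 'payment', 'amount' => '250.00'],
    ],
];

// Gather every field recorded for one transaction, whatever table it lives in.
function transaction_fields(array $tables, int $tid): array
{
    $fields = [];
    foreach ($tables as $rows) {
        foreach ($rows as $row) {
            if ($row['tid'] !== $tid) {
                continue;
            }
            // The value column is named differently in each table.
            $fields[$row['fieldname']] =
                $row['fieldvalue'] ?? $row['value'] ?? $row['amount'] ?? null;
        }
    }
    return $fields;
}

$fields = transaction_fields($tables, 101);
print_r($fields);
// lastname => Lovelace, dietary_notes => Vegetarian, no nuts.,
// banquet_tickets => 2, payment => 250.00
```

A conference that needs a new field adds rows with a new fieldname; no table has to change shape.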
