
Hide and Seek: An Introduction to Steganography


Niels Provos and Peter Honeyman, University of Michigan

IEEE Security and Privacy Journal, May-June 2003 (Vol. 1, No. 3)

Sweety Chauhan

October 24, 2005

CMSC 691I

Clandestine Channels

- New and Significant
- What is Steganography?
- Previous Work
- Steganographic systems for JPEG images
- Steganography Detection on the Internet
- Results

- Detection of Steganographic systems via statistical steganalysis
- Practical application of detection algorithms

- Art and Science of hiding communication
- A steganographic system embeds hidden content in unremarkable cover media
- A steganographic system consists of:
- Identifying the cover medium’s redundant bits
- An embedding process that creates a stego medium by replacing the redundant bits with hidden message data

- Modern steganography’s goal is to keep its mere presence undetectable
- But steganographic systems leave behind detectable traces in the cover medium
- Although the secret content is not revealed, its existence can be detected
- Modifying the cover medium changes its statistical properties
- Eavesdroppers can detect the distortions in the resulting stego medium’s statistical properties

The process of finding these distortions is called statistical steganalysis

- Three different aspects in information-hiding systems contend with each other:
- Capacity – amount of information that can be hidden in the cover medium
- Security – an eavesdropper’s inability to detect hidden information
- Robustness – amount of modification the stego medium can withstand before an adversary can destroy hidden information

- Watermarking system – high level of robustness
- Steganography – high security and capacity
- Hidden information is fragile

- Classical Steganography system
- Security relies on the encoding system’s secrecy
- e.g., a Roman general shaved a slave’s head and tattooed a message on it. After the hair grew back, the slave was sent to deliver the hidden message

- Modern Steganography
- Attempts to be detectable only if secret information is known (secret key)
- Similar to Kerckhoffs’ Principle of cryptography which holds that “a cryptographic system’s security should rely solely on the key material”

- Senders and receivers of steganographic communication agree on:
- a steganographic system
- a shared secret key, which determines how the message is encoded in the cover medium

- To send a hidden message, for example,
- Alice creates a new image with digital camera
- Alice supplies the steganographic system with her shared secret and message
- The steganographic system uses the shared secret to determine how the hidden message should be encoded in the redundant bits
- The result is the stego image that Alice sends to Bob
- When Bob receives the image, he uses the shared secret and the agreed steganographic system to retrieve the hidden message

- Why steganographic systems for JPEG format?
- Systems operate in a transform space
- Not affected by visual attacks (as in BMP images)
- Modifications are in the frequency domain instead of the spatial domain

- Neil F. Johnson and Sushil Jajodia showed steganographic systems for palette-based images leave easily detected distortions

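For each color component, the JPEG image format uses a discrete cosine transform (DCT) to transform successive 8 × 8 pixel blocks of the image into 64 DCT coefficients each

The DCT coefficients F(u, v) of an 8 × 8 block of image pixels f(x, y) are given by

$$F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)\, \cos\frac{(2x+1)u\pi}{16}\, \cos\frac{(2y+1)v\pi}{16}$$

where C(k) = 1/√2 for k = 0 and C(k) = 1 otherwise

The following operation quantizes the coefficients:

$$F^{Q}(u,v) = \operatorname{round}\!\left(\frac{F(u,v)}{Q(u,v)}\right)$$

where Q(u,v) is a 64-element quantization table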
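The transform and quantization steps can be sketched in plain Python. This is a minimal illustration that assumes the standard JPEG forward DCT (normalization factor 1/√2 for index 0) and rounding to the nearest integer; the flat test block and the uniform quantization table are made-up inputs, and the function names are hypothetical.

```python
import math


def dct_8x8(block):
    """Forward 2-D DCT of one 8x8 block (list of 8 rows of 8 samples)."""
    def c(k):
        # Normalization factor: 1/sqrt(2) for index 0, 1 otherwise.
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[u][v] = 0.25 * c(u) * c(v) * total
    return coeffs


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(coeffs[u][v] / table[u][v]) for v in range(8)]
            for u in range(8)]


# Illustrative inputs only: a flat gray block and a uniform table.
flat_block = [[128] * 8 for _ in range(8)]
uniform_table = [[16] * 8 for _ in range(8)]
quantized = quantize(dct_8x8(flat_block), uniform_table)
print(quantized[0])  # first row of quantized coefficients
```

A flat block puts all of its energy into the single coefficient F(0, 0), which is why most quantized coefficients of smooth image regions are zero and cannot be used for embedding.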

- Sequential – for example: JSteg
- Pseudo-random – for example: OutGuess 0.1
- Subtraction – for example: F5
- Statistics-aware embedding

The least-significant bits of the quantized DCT coefficients are used as redundant bits to embed the hidden message

- Derek Upham’s JSteg algorithm does not require a shared secret

Input: message, cover image
Output: stego image
while data left to embed do
    get next DCT coefficient from cover image
    if DCT ≠ 0 and DCT ≠ 1 then
        get next LSB from message
        replace DCT LSB with message LSB
    end if
    insert DCT into stego image
end while
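The loop above can be sketched in Python on a flat list of quantized coefficients. This is an illustration only: it assumes the coefficients have already been extracted from the JPEG file, and it passes the message length to the extractor explicitly, which is a simplification of this sketch. The function names are hypothetical.

```python
def jsteg_embed(coeffs, bits):
    """Sequentially overwrite the LSB of every coefficient that is not 0 or 1."""
    stego = list(coeffs)
    pending = iter(bits)
    bit = next(pending, None)
    for i, c in enumerate(stego):
        if bit is None:
            break  # whole message embedded; remaining coefficients stay as-is
        if c not in (0, 1):
            # Clearing and setting the lowest bit keeps the value inside its
            # pair (2,3), (-2,-1), ..., so it never becomes 0 or 1.
            stego[i] = (c & ~1) | bit
            bit = next(pending, None)
    if bit is not None:
        raise ValueError("cover has too few usable coefficients")
    return stego


def jsteg_extract(stego, n_bits):
    """Read back the LSBs of the usable coefficients, in order."""
    return [c & 1 for c in stego if c not in (0, 1)][:n_bits]


cover = [5, 0, -3, 1, 2, 8, -1, 4]
stego = jsteg_embed(cover, [1, 0, 1, 1])
print(stego, jsteg_extract(stego, 4))
```

Because no key is involved, `jsteg_extract` needs nothing beyond the stego coefficients.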

- As a result anyone who knows the steganographic system can retrieve the message hidden by JSteg

- Andreas Westfeld and Andreas Pfitzmann noticed that
- steganographic systems that change least-significant bits sequentially cause distortions detectable by steganalysis
- for a given image, the embedding of high-entropy data (often due to encryption) changed the histogram of color frequencies in a predictable way.

- Embedding uniformly distributed message bits reduces the frequency difference between adjacent DCT coefficients
- By observing differences in the DCT coefficients’ frequencies, embedding can be detected

Histogram before (a) and after (b) a hidden message is embedded in a JPEG image

Sequential changes to the least-significant bit of discrete cosine transform (DCT) coefficients tend to equalize the frequency of adjacent DCT coefficients, as the histograms of the (a) original and (b) modified image show

- Westfeld and Pfitzmann’s χ²-test
- determines whether the observed frequency distribution in an image matches a distribution that shows distortion from embedding hidden data

- The probability of embedding is determined by calculating p for a sample of the DCT coefficients
- The samples start at the beginning of the image, and the sample size is increased for each measurement

- A high probability of embedding indicates that the image contains steganographic content
- For content hidden sequentially, as with JSteg, the test can also estimate the hidden message’s length
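A rough Python sketch of such a test follows. It assumes the usual pairs-of-values formulation: coefficients 2i and 2i+1 form a pair, the expected count under embedding is the pair's mean, and p is the upper tail of the χ² distribution. Skipping sparse pairs instead of merging them, the threshold of 5, and all function names are simplifications of this sketch.

```python
import math
from collections import Counter


def chi2_sf(x, dof):
    """Upper-tail probability of a chi-square variable (1 - CDF)."""
    a, x = dof / 2.0, x / 2.0
    if x <= 0:
        return 1.0
    log_prefix = a * math.log(x) - x - math.lgamma(a)
    if x < a + 1:
        # Series expansion of the lower regularized incomplete gamma function.
        term = total = 1.0 / a
        n = a
        for _ in range(1000):
            n += 1
            term *= x / n
            total += term
            if abs(term) < abs(total) * 1e-14:
                break
        return 1.0 - total * math.exp(log_prefix)
    # Continued fraction (modified Lentz) for the upper incomplete gamma.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 1000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return math.exp(log_prefix) * h


def embedding_probability(coeffs, min_expected=5):
    """High p: pair frequencies look equalized, as after LSB embedding."""
    hist = Counter(c for c in coeffs if c not in (0, 1))
    chi2, categories = 0.0, 0
    for even in {c & ~1 for c in hist}:
        expected = (hist[even] + hist[even + 1]) / 2.0
        if expected >= min_expected:
            chi2 += (hist[even] - expected) ** 2 / expected
            categories += 1
    if categories < 2:
        return 0.0  # not enough data to say anything
    return chi2_sf(chi2, categories - 1)
```

Calling `embedding_probability` on growing prefixes of the coefficient list reproduces the increasing-sample-size measurement; for a sequentially embedded message, p stays high until the prefix runs past the end of the embedded data.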

- Niels Provos’s OutGuess 0.1 steganographic system
- Improves the encoding step by using a pseudo-random generator to select DCT coefficients at random
- The LSB of a selected DCT coefficient is replaced with encrypted message data

The algorithm replaces the least-significant bit of pseudo-randomly selected discrete cosine transform (DCT) coefficients with message data

- The OutGuess 0.1 algorithm:

Input: message, shared secret, cover image
Output: stego image
initialize PRNG with shared secret
while data left to embed do
    get pseudo-random DCT coefficient from cover image
    if DCT ≠ 0 and DCT ≠ 1 then
        get next LSB from message
        replace DCT LSB with message LSB
    end if
    insert DCT into stego image
end while
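A hedged Python sketch of keyed, pseudo-random LSB embedding in the spirit of the algorithm above. It is not OutGuess's actual selection scheme: here the shared secret seeds Python's `random.Random`, which shuffles the usable positions, and the message length is passed to the extractor directly. All names are hypothetical.

```python
import random


def _keyed_positions(coeffs, secret):
    """Usable coefficient positions (not 0 or 1) in a secret-dependent order."""
    positions = [i for i, c in enumerate(coeffs) if c not in (0, 1)]
    random.Random(secret).shuffle(positions)
    return positions


def outguess01_embed(coeffs, bits, secret):
    positions = _keyed_positions(coeffs, secret)
    if len(bits) > len(positions):
        raise ValueError("cover has too few usable coefficients")
    stego = list(coeffs)
    for pos, bit in zip(positions, bits):
        stego[pos] = (stego[pos] & ~1) | bit  # replace the LSB only
    return stego


def outguess01_extract(stego, n_bits, secret):
    # Embedding never turns a usable coefficient into 0 or 1 (or vice versa),
    # so the receiver derives the same positions in the same order.
    positions = _keyed_positions(stego, secret)
    return [stego[pos] & 1 for pos in positions[:n_bits]]


cover = [5, 0, -3, 1, 2, 8, -1, 4, 0, 7, -6, 3]
stego = outguess01_embed(cover, [1, 0, 1, 1, 0, 1], "shared secret")
print(outguess01_extract(stego, 6, "shared secret"))
```

Spreading the changes over the whole image is what defeats a test that only samples from the beginning of the coefficient stream.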

- The χ²-test can be extended to detect local distortions in an image
- Two identical distributions produce about the same χ² values in any part of the distribution
- Instead of increasing the sample size and applying the test at a constant position, a constant sample size is used and the sample position is slid across the image
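The sliding-window idea can be sketched as follows. For brevity this sketch reports only the pairs-of-values χ² statistic per window rather than a probability; a low value means the pair frequencies in that window look equalized. The window and step sizes and the function names are illustrative assumptions.

```python
from collections import Counter


def pair_chi2(sample):
    """Chi-square distance between pair counts and their means (0 = equalized)."""
    hist = Counter(c for c in sample if c not in (0, 1))
    chi2 = 0.0
    for even in {c & ~1 for c in hist}:
        expected = (hist[even] + hist[even + 1]) / 2.0
        chi2 += (hist[even] - expected) ** 2 / expected
    return chi2


def sliding_chi2(coeffs, window, step):
    """Apply the statistic to fixed-size windows at increasing positions."""
    return [(start, pair_chi2(coeffs[start:start + window]))
            for start in range(0, len(coeffs) - window + 1, step)]


# Toy data: an untouched region followed by a region with equalized pairs.
coeffs = [2] * 90 + [3] * 10 + [2, 3] * 50
for start, value in sliding_chi2(coeffs, window=100, step=100):
    print(start, value)
```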

- The extended χ²-test detects pseudo-randomly embedded messages in JPEG images
- The detection rate depends on
- the hidden message’s size
- the number of DCT coefficients in an image
- The detection rate can be improved by applying a heuristic that eliminates coefficients likely to lead to false negatives

The graph shows the detection rates for three different false-positive rates

The change rate refers to the fraction of discrete cosine transform (DCT) coefficients available for embedding a hidden message that have been modified

- Andreas Westfeld’s steganographic system, F5
- Instead of replacing the least-significant bit of a DCT coefficient with message data, F5 decrements its absolute value in a process called matrix encoding

- There is no coupling of any fixed pair of DCT coefficients
- χ2-test cannot detect F5

- Matrix encoding computes an appropriate (1, 2^k − 1, k) Hamming code by calculating the message block size k from
- the message length and
- the number of nonzero non-DC coefficients

- The Hamming code (1, 2^k − 1, k) encodes a k-bit message word m into an n-bit code word a with n = 2^k − 1
- It can recover from a single bit error in the code word
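A small Python sketch of matrix encoding on bare bits shows why at most one of the n = 2^k − 1 code word bits has to change to carry k message bits. It assumes the common hash that XORs together the 1-based indices of all set bits; the function names are hypothetical.

```python
def lsb_hash(bits):
    """XOR of the 1-based positions of all set bits (a k-bit value)."""
    h = 0
    for i, b in enumerate(bits, start=1):
        if b:
            h ^= i
    return h


def matrix_embed(bits, message, k):
    """Change at most one of n = 2**k - 1 bits so that lsb_hash == message."""
    n = 2 ** k - 1
    if len(bits) != n or not 0 <= message < 2 ** k:
        raise ValueError("need exactly 2**k - 1 bits and a k-bit message")
    s = lsb_hash(bits) ^ message
    out = list(bits)
    if s:
        out[s - 1] ^= 1  # flipping bit s changes the hash by exactly s
    return out


code_word = [1, 0, 1, 1, 0, 0, 1]       # n = 7, so k = 3
changed = matrix_embed(code_word, 5, 3)  # embed the 3-bit message 5
print(changed, lsb_hash(changed))
```

With k = 3, three message bits cost at most one change among seven coefficients, compared with 1.5 changes on average when each bit is written to its own coefficient.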

Input: message, shared secret, cover image
Output: stego image
initialize PRNG with shared secret
permute DCT coefficients with PRNG
determine k from image capacity
calculate code word length n ← 2^k − 1
while data left to embed do
    get next k-bit message block
    repeat
        G ← {n non-zero AC coefficients}
        s ← k-bit hash f of LSB in G
        s ← s ⊕ k-bit message block
        if s ≠ 0 then
            decrement absolute value of DCT coefficient G_s
            insert G_s into stego image
        end if
    until s = 0 or G_s ≠ 0
    insert DCT coefficients from G into stego image
end while
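The embedding loop, including the retry when a coefficient shrinks to zero, can be sketched in Python. This is a simplified illustration, not F5 itself: it takes k as a parameter instead of deriving it from capacity, treats the input as a flat list of AC coefficients, uses `abs(c) & 1` as the LSB, and uses Python's `random.Random` for the keyed permutation. All names are hypothetical.

```python
import random


def _hash(values):
    """XOR of the 1-based positions whose coefficient has an odd magnitude."""
    h = 0
    for i, c in enumerate(values, start=1):
        if abs(c) & 1:
            h ^= i
    return h


def _permutation(length, secret):
    order = list(range(length))
    random.Random(secret).shuffle(order)
    return order


def f5_embed(coeffs, blocks, k, secret):
    """Embed k-bit message blocks, decrementing at most one magnitude per try."""
    n = 2 ** k - 1
    order = _permutation(len(coeffs), secret)
    stego = list(coeffs)
    cursor = 0

    def next_nonzero():
        nonlocal cursor
        while cursor < len(order):
            idx = order[cursor]
            cursor += 1
            if stego[idx] != 0:
                return idx
        raise ValueError("cover has too few nonzero coefficients")

    for block in blocks:
        group = [next_nonzero() for _ in range(n)]
        while True:
            s = _hash([stego[i] for i in group]) ^ block
            if s == 0:
                break  # group already encodes the block
            idx = group[s - 1]
            stego[idx] -= 1 if stego[idx] > 0 else -1
            if stego[idx] != 0:
                break  # one change was enough
            # Shrinkage: the receiver skips zeros, so drop this coefficient,
            # pull in the next nonzero one, and embed the block again.
            group.pop(s - 1)
            group.append(next_nonzero())
    return stego


def f5_extract(stego, n_blocks, k, secret):
    n = 2 ** k - 1
    order = _permutation(len(stego), secret)
    nonzero = [stego[i] for i in order if stego[i] != 0]
    return [_hash(nonzero[j * n:(j + 1) * n]) for j in range(n_blocks)]
```

Because magnitudes are only ever decremented, values no longer swap within fixed pairs of adjacent coefficients, which is the property the χ²-test relies on.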

- Embedding information with F5 leads to double compression
- Most images are already stored in JPEG format, so F5 recompresses them when it embeds, and this double compression could confuse the detection algorithm

- Fridrich and her group proposed a method for eliminating the effects of double compression by estimating the quality factor used to compress the cover image

- The previously discussed algorithms overwrite image data without directly considering the distortions that the embedding will cause
- To embed a single bit,
- a DCT coefficient’s value can be either incremented or decremented, so its least-significant bit can be changed in two different ways
- groups of DCT coefficients can be created and the parity of their least-significant bits used as message bits

- For every DCT block, the space of all possible changes is searched to find a configuration that minimizes the change to image statistics

- Two Different classes of algorithms:
- Based on inherent statistical properties
- no need to find a representative training set
- estimate an embedded message’s length

- Based on class discrimination
- Creating a representative training set is often difficult
- Do not provide an estimate of the hidden message’s length


- How can the previously discussed steganalytic methods be used in a real-world setting?
- Created a steganography detection framework that
- gets JPEG images off the Internet and
- uses steganalysis to identify subsets of the images likely to contain steganographic content

- JSteg and JSteg-Shell
- JSteg-Shell supports content encryption and compression before JSteg embeds the data
- It uses the RC4 stream cipher for encryption

- JPHide
- uses Blowfish as a PRNG
- Version 0.5 supports additional compression of the hidden message
- uses slightly different headers to store embedding information
- Before the content is embedded, it is Blowfish-encrypted with a user-supplied pass phrase

- OutGuess
- All use some form of least-significant bit embedding and are detectable with statistical analysis

- Stegdetect is an automated utility that can analyze JPEG images that have content hidden with JSteg, JPHide, and OutGuess 0.13b
- Stegdetect’s output lists
- the steganographic systems it finds in each image or
- writes “negative” if it couldn’t detect any

- Stegdetect’s false-negative rate depends on:
- The steganographic system and the embedded message’s size
- The smaller the message, the harder it is to detect by statistical means.

- Stegdetect is very reliable in finding images that have content embedded with JSteg
- For JPHide, detection depends also on the size and the compression quality of the JPEG images

Using Stegdetect over the Internet. (a) JPHide and (b) JSteg produce different detection results for different test images and message sizes

- Images were collected from eBay auctions and discussion groups in the Usenet archive for analysis
- Developed Crawl, a simple, efficient Web crawler that makes a local copy of any JPEG images it encounters on a Web page
- Crawl performs a depth-first search and has two key features:
- Images and Web pages can be matched against regular expressions
- Hence, include or exclude Web pages in the search

- Minimum and maximum image size can be specified
- Hence exclude images that are too small to contain hidden messages
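The two filtering features can be illustrated with a short Python sketch. It is not Crawl's actual interface: the function name, parameters, patterns, and size limits below are all made up for illustration.

```python
import re


def make_filter(include=None, exclude=None, min_bytes=0, max_bytes=None):
    """Build a predicate that decides whether a URL (and image size) is wanted."""
    include_re = re.compile(include) if include else None
    exclude_re = re.compile(exclude) if exclude else None

    def wanted(url, size_bytes=None):
        if include_re and not include_re.search(url):
            return False
        if exclude_re and exclude_re.search(url):
            return False
        if size_bytes is not None:
            if size_bytes < min_bytes:
                return False  # too small to carry a useful hidden message
            if max_bytes is not None and size_bytes > max_bytes:
                return False
        return True

    return wanted


# Hypothetical configuration: JPEG files only, no thumbnails, at least 20 kB.
wanted = make_filter(include=r"\.jpe?g$", exclude=r"/thumbs/", min_bytes=20_000)
print(wanted("http://example.com/a/photo.jpg", 50_000))
print(wanted("http://example.com/thumbs/photo.jpg", 50_000))
```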

- Calculation of true positive rate – the probability that an image detected by Stegdetect really has steganographic content
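One standard way to compute such a probability is Bayes' rule, combining the prior probability that an image carries hidden content with the detector's detection and false-positive rates. The sketch below assumes that formulation; the function name and the numbers in the example are hypothetical, not measurements from the study.

```python
def true_positive_rate(prior, detection_rate, false_positive_rate):
    """P(stego | detected) via Bayes' rule.

    prior               -- P(stego): fraction of images with hidden content
    detection_rate      -- P(detected | stego)
    false_positive_rate -- P(detected | no stego)
    """
    hit = prior * detection_rate
    false_alarm = (1.0 - prior) * false_positive_rate
    return hit / (hit + false_alarm)


# Hypothetical inputs: 1% of images carry content, 90% of those are caught,
# and 1% of clean images are flagged by mistake.
print(true_positive_rate(0.01, 0.9, 0.01))
```

When hidden content is rare, even a small false-positive rate dominates the result, so most flagged images are clean.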

Percentages of (false) positives for analyzed images

| Test | eBay (%) | Usenet (%) |
|---|---|---|
| JSteg | 0.003 | 0.007 |
| JPHide | 1 | 2.1 |
| OutGuess | 0.1 | 0.14 |

- After processing 2 million eBay images with Stegdetect
- Over 1% of all the images seemed to contain hidden content
- JPHide was detected most often

- Stegdetect cannot guarantee a hidden message’s existence
- To verify the hidden content, Stegbreak must launch a dictionary attack against the JPEG files
- JSteg-Shell, JPHide, and OutGuess all hide content based on a user-supplied password
- An attacker can try to guess the password by taking a large dictionary and trying every single word in it to retrieve the hidden message
- All three systems embed header information, so attackers can verify a guessed password against it
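The attack can be illustrated with a toy Python sketch. It does not reproduce Stegbreak or any real system's header format: `toy_hide` is a made-up scheme that XORs a fixed magic header plus payload with a password-derived key, which is just enough to show how a recognizable header lets an attacker confirm a guess.

```python
import hashlib

MAGIC = b"HDR1"  # hypothetical header; real systems use their own formats


def _key(password):
    return hashlib.sha256(password.encode()).digest()


def toy_hide(password, payload):
    """Toy stand-in for a password-keyed embedding (illustration only)."""
    key = _key(password)
    data = MAGIC + payload
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def header_matches(blob, password):
    """A guess is confirmed when decoding yields the expected header."""
    key = _key(password)
    return bytes(b ^ key[i] for i, b in enumerate(blob[:len(MAGIC)])) == MAGIC


def dictionary_attack(blob, words):
    """Try every word; return the first password that verifies, else None."""
    for word in words:
        if header_matches(blob, word):
            return word
    return None


blob = toy_hide("tiger", b"meet at noon")
print(dictionary_attack(blob, ["apple", "tiger", "zebra"]))
```

The attack only succeeds if the password is in the word list, which is why a failed attack does not prove that an image is clean.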

Stegbreak performance on a 1,200-MHz Pentium III

| System | One image (words/second) | Fifty images (words/second) |
|---|---|---|
| JPHide | 4,500 | 8,700 |
| OutGuess | 18,000 | 34,000 |
| JSteg | 36,000 | 47,000 |

- From eBay and Usenet research
- No single hidden message was found

- Explanations for inability to find steganographic content on the Internet:
- All steganographic system users carefully choose passwords that are not susceptible to dictionary attacks
- Images from sources that were not analyzed may carry steganographic content
- Nobody uses steganographic systems that researchers could find
- All messages are too small for analysis to detect

Either they are looking in the wrong place or there is no widespread use of steganography on the Internet

- Today, computer and network technologies provide easy-to-use communication channels for steganography
- Research work
- Provides an overview of existing steganographic systems
- presents methods for detecting them via statistical steganalysis

- Research new algorithms to
- Hide information
- Improve Steganalysis

- Niels Provos and Peter Honeyman, "Hide and Seek: An Introduction to Steganography," IEEE Security and Privacy Journal, May-June 2003
- Huaiqing Wang and Shuozhong Wang, "Cyber Warfare: Steganography vs. Steganalysis," Communications of the ACM, Volume 47, Issue 10, October 2004
- http://www.outguess.org/detection.php
- http://www.jjtc.com/Security/stegtools.htm
- http://www.stack.nl/~galactus/remailers/index-stego.html

Thank you for your presence and patience

Any questions?

Presentation Slides and Research Papers are available at :

www.umbc.edu/~chauhan2/CMSC691I/