Create Presentation
Download Presentation

Download Presentation

Introduction to Fractals and Fractal Dimension

Download Presentation
## Introduction to Fractals and Fractal Dimension

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -

**Introduction to Fractalsand Fractal Dimension**Christos Faloutsos**Intro to Fractals - Outline**• Motivation – 3 problems / case studies • Definition of fractals and power laws • Solutions to posed problems • More examples • Discussion - putting fractals to work! • Conclusions – practitioner’s guide • Appendix: gory details - boxcounting plots Copyright: C. Faloutsos (2001)**Problem #1: GIS - points**• Road end-points of Montgomery county: • Q : distribution? • not uniform • not Gaussian • ??? • no rules/pattern at all!? Copyright: C. Faloutsos (2001)**Problem #2 - spatial d.m.**Galaxies (Sloan Digital Sky Survey w/ B. Nichol) - ‘spiral’ and ‘elliptical’ galaxies (stores and households ...) - patterns? - attraction/repulsion? - how many ‘spi’ within r from an ‘ell’? Copyright: C. Faloutsos (2001)**Poisson**Problem #3: traffic • disk trace (from HP - J. Wilkes); Web traffic #bytes - how many explosions to expect? - queue length distr.? time Copyright: C. Faloutsos (2001)**Common answer for all three:**• Fractals / self-similarities / power laws • Seminal works from Hilbert, Minkowski, Cantor, Mandelbrot, (Hausdorff, Lyapunov, Ken Wilson, …) Copyright: C. Faloutsos (2001)**Road map**• Motivation – 3 problems / case studies • Definition of fractals and power laws • Solutions to posed problems • More examples and tools • Discussion - putting fractals to work! • Conclusions – practitioner’s guide • Appendix: gory details - boxcounting plots Copyright: C. Faloutsos (2001)**Definitions (cont’d)**• Simple 2-D figure has finite perimeter and finite area • area ≈ perimeter2 • Do all 2-D figures have finite perimeter, finite area, and area ≈ perimeter2? Copyright: C. Faloutsos (2001)**What is a fractal?**= self-similar point set, e.g., Sierpinski triangle: zero area; infinite length! ... Copyright: C. Faloutsos (2001)**Definitions (cont’d)**• Paradox: Infinite perimeter ; Zero area! • ‘dimensionality’: between 1 and 2 • actually: Log(3)/Log(2) = 1.58… Copyright: C. Faloutsos (2001)**Definitions (cont’d)**• “self similar” • pieces look like whole independent of scale • “dimension” dim = log(# self-similar pieces) / log(scale) • dim(line) = log(2)/log(2) = 1 • dim(square) = log(4)/log(2) = 2 • dim(cube) = log(8)/log(2) = 3 Copyright: C. Faloutsos (2001)**Q: fractal dimension of a line?**A: 1 (= log(2)/log(2)) Intrinsic (‘fractal’) dimension Copyright: C. Faloutsos (2001)**Q: dfn for a given set of points?**x y 5 1 4 2 3 3 2 4 Intrinsic (‘fractal’) dimension Copyright: C. Faloutsos (2001)**Q: fractal dimension of a line?**A: nn ( <= r ) ~ r^1 (‘power law’: y=x^a) Q: fd of a plane? A: nn ( <= r ) ~ r^2 Fd = slope of (log(nn) vs log(r) ) Intrinsic (‘fractal’) dimension Copyright: C. Faloutsos (2001)**Algorithm, to estimate it?**Notice avg nn(<=r) is exactly tot#pairs(<=r) / (2*N) Intrinsic (‘fractal’) dimension including ‘mirror’ pairs Copyright: C. Faloutsos (2001)**Dfn of fd:**ONLY for a perfectly self-similar point set: zero area; infinite length! ... =log(n)/log(f) = log(3)/log(2) = 1.58 Copyright: C. Faloutsos (2001)**log(#pairs**within <=r ) 1.58 log( r ) Sierpinsky triangle == ‘correlation integral’ Copyright: C. Faloutsos (2001)**Observations:**• Euclidean objects have integer fractal dimensions • point: 0 • lines and smooth curves: 1 • smooth surfaces: 2 • fractal dimension -> roughness of the periphery Copyright: C. Faloutsos (2001)**fd = embedding dimension -> uniform pointset**a point set may have several fd, depending on scale Important properties Copyright: C. Faloutsos (2001)**Road map**• Motivation – 3 problems / case studies • Definition of fractals and power laws • Solutions to posed problems • More examples and tools • Discussion - putting fractals to work! • Conclusions – practitioner’s guide • Appendix: gory details - boxcounting plots Copyright: C. Faloutsos (2001)**Problem #1: GIS points**• Cross-roads of Montgomery county: • any rules? Copyright: C. Faloutsos (2001)**1.51**Solution #1 A: self-similarity -> • <=> fractals • <=> scale-free • <=> power-laws (y=x^a, F=C*r^(-2)) • avg#neighbors(<= r ) = r^D log(#pairs(within <= r)) log( r ) Copyright: C. Faloutsos (2001)**1.51**Solution #1 A: self-similarity • avg#neighbors(<= r ) ~ r^(1.51) log(#pairs(within <= r)) log( r ) Copyright: C. Faloutsos (2001)**Examples:MG county**• Montgomery County of MD (road end-points) Copyright: C. Faloutsos (2001)**Examples:LB county**• Long Beach county of CA (road end-points) Copyright: C. Faloutsos (2001)**Solution#2: spatial d.m.**Galaxies ( ‘BOPS’ plot - [sigmod2000]) log(#pairs) log(r) Copyright: C. Faloutsos (2001)**Solution#2: spatial d.m.**log(#pairs within <=r ) - 1.8 slope - plateau! - repulsion! ell-ell spi-spi spi-ell log(r) Copyright: C. Faloutsos (2001)**spatial d.m.**log(#pairs within <=r ) - 1.8 slope - plateau! - repulsion! ell-ell spi-spi spi-ell log(r) Copyright: C. Faloutsos (2001)**r1**r2 r2 r1 spatial d.m. Heuristic on choosing # of clusters Copyright: C. Faloutsos (2001)**spatial d.m.**log(#pairs within <=r ) - 1.8 slope - plateau! - repulsion! ell-ell spi-spi spi-ell log(r) Copyright: C. Faloutsos (2001)**spatial d.m.**log(#pairs within <=r ) • - 1.8 slope • - plateau! • repulsion!! ell-ell spi-spi -duplicates spi-ell log(r) Copyright: C. Faloutsos (2001)**#bytes**time Solution #3: traffic • disk traces: self-similar: Copyright: C. Faloutsos (2001)**20%**80% Solution #3: traffic • disk traces (80-20 ‘law’ = ‘multifractal’) #bytes time Copyright: C. Faloutsos (2001)**Solution#3: traffic**Clarification: • fractal: a set of points that is self-similar • multifractal: a probability density function that is self-similar Many other time-sequences are bursty/clustered: (such as?) Copyright: C. Faloutsos (2001)**Tape#1**Tape# N time Tape accesses # tapes needed, to retrieve n records? (# days down, due to failures / hurricanes / communication noise...) Copyright: C. Faloutsos (2001)**Tape#1**Tape# N time Tape accesses 50-50 = Poisson # tapes retrieved real # qual. records Copyright: C. Faloutsos (2001)**Fast estimation of fd(s):**• How, for the (correlation) fractal dimension? • A: Box-counting plot: log(sum(pi ^2)) pi r log( r ) Copyright: C. Faloutsos (2001)**Definitions**• pi : the percentage (or count) of points in the i-th cell • r: the side of the grid Copyright: C. Faloutsos (2001)**Fast estimation of fd(s):**• compute sum(pi^2) for another grid side, r’ log(sum(pi ^2)) pi’ r’ log( r ) Copyright: C. Faloutsos (2001)**Fast estimation of fd(s):**• etc; if the resulting plot has a linear part, its slope is the correlation fractal dimension D2 log(sum(pi ^2)) log( r ) Copyright: C. Faloutsos (2001)**Hausdorff or box-counting fd:**• Box counting plot: Log( N ( r ) ) vs Log ( r) • r: grid side • N (r ): count of non-empty cells • (Hausdorff) fractal dimension D0: Copyright: C. Faloutsos (2001)**Definitions (cont’d)**• Hausdorff fd: r log(#non-empty cells) D0 log(r) Copyright: C. Faloutsos (2001)**Observations**• q=0: Hausdorff fractal dimension • q=2: Correlation fractal dimension (identical to the exponent of the number of neighbors vs radius) • q=1: Information fractal dimension Copyright: C. Faloutsos (2001)**Observations, cont’d**• in general, the Dq’s take similar, but not identical, values. • except for perfectly self-similar point-sets, where Dq=Dq’ for any q, q’ Copyright: C. Faloutsos (2001)**Examples:MG county**• Montgomery County of MD (road end-points) Copyright: C. Faloutsos (2001)**Examples:LB county**• Long Beach county of CA (road end-points) Copyright: C. Faloutsos (2001)**Summary So Far**• many fractal dimensions, with nearby values • can be computed quickly (O(N) or O(N log(N)) • (code: on the web) Copyright: C. Faloutsos (2001)**Dimensionality Reduction with FD**Problem definition: ‘Feature selection’ • given N points, with E dimensions • keep the k most ‘informative’ dimensions [Traina+,SBBD’00] Copyright: C. Faloutsos (2001)**Dim. reduction - w/ fractals**not informative Copyright: C. Faloutsos (2001)**Dim. reduction**Problem definition: ‘Feature selection’ • given N points, with E dimensions • keep the k most ‘informative’ dimensions Re-phrased: spot and drop attributes with strong (non-)linear correlations Q: how do we do that? Copyright: C. Faloutsos (2001)