1 / 80

Blobby Land Gordon J. Pace Chris Porter Christian Colombo March 2010

Blobby Land Gordon J. Pace Chris Porter Christian Colombo March 2010. Who killed Mr Blobby?. Mathematics and (partly) Computer Science are really all about one question: What can we say about objects which can be described in a finite manner?

neci
Download Presentation

Blobby Land Gordon J. Pace Chris Porter Christian Colombo March 2010

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Blobby Land Gordon J. Pace Chris Porter Christian Colombo March 2010

  2. Who killed Mr Blobby? • Mathematics and (partly) Computer Science are really all about one question: What can we say about objects which can be described in a finite manner? • Compression is usually done by trying to find regularities in the source: • Finding concise rules which generate part of the data bigger than the rule. • Find common subparts and describe them once

  3. Mr Blobby is innocent • Applications of compression are numerous: • Smaller data for transmission and storage e.g. fit a whole movie at high resolution on a single DVD. • Less resources if the description we are compressing is of a real life process: e.g. process chain optimisation. • Allow you to win the 2010 ICTSA Programming Challenge

  4. Mr Blobby woz ’ere • Given: • an image with a 3-bit depth (8 colours) • a number n • To produce: • not more than n blobs of paint approximating the original image • in less than one minute. • using as little as possible paint

  5. From Mr Ducky…

  6. … to Mr Blobby

  7. … to Mr Blobby In 16 blobs

  8. … to Mr Blobby … and no, I did not produce this automatically

  9. Some Interesting Notes • The problem of finding an optimal solution is hard: • There’s an infinite search space (allowing fractions, and blobs with centres outside the grid) • But could be reduced to a finite one if we ignore the less paint is better constraint. • Someone asked “is it NP-complete?” It is in NP (so certainly not worse than NPC), and seems to be in NPC, but am not sure. • A brute force approach is out of the question

  10. Here’s an evaluation image (given with 5, 50, 100, 1000 blobs)

  11. Here’s what one team produced (1000 blobs)

  12. and another team (1000 blobs)

  13. and yet another (1000 blobs?!)

  14. The Teams

  15. The Teams: Industry-Based • CasaSoft: Karl Cassar, Mark Cassar • Exigy Ltd: Christian Tabone, Clint Spiteri, Bernard Cachia, Jean-Paul Sultana • Ixaris: JP Ebejer, Jimmy Borg, Brian Vella, Brian Attard Bason • Gigi tal-MITA: Daniel D'Agostino • CCBill EU: Emerson Farrugia, Christopher Torpuland, Briun, Wallace Wadge, Edward Mallia • Holistic: Ruben Gatt, Matthew Falzon, Shaun Camilleri, Luke Rocco • Atlas Insurance: Karl Fenech

  16. The Student Teams • Iš Vilniaus: Michael Debono, Andrew Borg Cardona • papa charlie: Ian Agius, James Scicluna • WM\{James}: Ingram Bondin, Kevin Falzon, Andrew Gauci • daemon: Kurt Farrugia, Stefan Lia • AJaZ: Andrew J Said, Julian Zammit • C^D: Claire Dimech, Daniela Cauchi • y.a.t.n.: Darren Demicoli • TDGS: Kyle Pullicino, Adrian Duca, Kurt Portelli, Paul Felice • B3ANi3S: David Borg, Duncan Borg, Thorsten Callus

  17. The Teams and their Solutions

  18. 01: CasaSoft • Implemented using C# (.Net Framework 3.5) • Based on a genetic algorithm. • Reduced search space by specifying a boundary in which blobs can be placed and using bounded accuracy. • Each generation of the genetic algorithm results in the best found blob according to the population which gives the best score. The genetic algorithm continues if there is still time available. • The population size and the number of epochs per generation are calculated as a function of the bitmap size. • An optimization to automatically calculate the largest square that can fit in the circle, enabling easy flagging of pixels to be painted. • The fitness function does not to check for similar pixels, so as not to slow the speed of the algorithm.

  19. An Evaluation Problem (50, 100, 1000 blobs)

  20. And their solution (50 blobs)

  21. 02: Holistic • Implementation language C# • Used a basic Genetic Algorithm to approximate the result. • No fine tuning and optimizations were done on this algorithm. • Alas, all problems timed out 

  22. 03: Ixaris • Approach: • Based on genetic algorithms (where genes are blobs) • Initial population based on each colour layer being considered as one blob (and 25 random mutants generated on this) • Genetic Operators – split (gives us two blobs), swap blobs, forceful remove, remove (if the blob has little effect), tweak radius, tweak centre position. • Changing probability distribution for the above • Multi-Threaded – reproduction of chromosomes done in parallel because of inherent independence from each other • Issues  • Loads of magic variables (e.g. children count, max fit population, noise filter threshold, initial population, convergence limit, probability distributions for genetic operators) • Interesting note: how to find good approximations for these (in a few hours)? (Neural Networks, Approximate Bayesian Computing for parameter estimation seem too onerous) • As it very so often happens, last minute improvement fixes actually broke the solution (image reflected along x-axis – damn cartesian coordinate system) • Don’t laugh.  (Fixed at 8:20pm – too late).

  23. An Evaluation Problem (5, 50, 100 blobs)

  24. And their solution (50 blobs)

  25. 04: Gigi tal-MITA • Implementation language: C • Using a floodfill approach, identify connected areas of the same colour. • Count the pixels in each area, order them in descending order of number of pixels, and calculate the centroid of each group. • Fit circles into the image: • Start at the centroid with a circle of maximum radius for that area. • Decrease the radius until it has a pretty damn high level of inclusion (defined by a threshold variable) • Inclusion is good if it includes that particular colour or even any other - as long as it's not white (background colour) • Blob the circle with the best fit • Repeat these steps, renumbering pixels which still have not been blobbed. • On Sunday morning I realised I did not have enough time to complete the algorithm as I would have wanted to. Realising that submitting an average solution is better than submitting nothing at all, I went for a compromise - I simply used a circle of average radius. Although this leaves room for a good deal if inaccurate pixels especially if the area is non-circular, overall it has given me an accuracy of 50-69%, which I guess is reasonable for an approximate solution.

  26. An Evaluation Problem (5, 50, 100 blobs)

  27. Their Solution (100 blobs)

  28. 05: Pr0nsters Inc. CCBill EU • Our solution has the following main stages: • Edge detection • Voronoi generation • Clustering/filtering • Start by detecting colour edges. • Next, using the edges, generate a Voronoi tessellation diagram to create a number of polygons giving the locations where cells meet. • These are transformed into circles. • Clustering/filtering mechanism: • Take the most frequent colour and use that as our base blob to cover the image. • Order the circles by radius • Calculate circle overlaps - anything more than a fixed percentage gets dropped (changing the threshold until the number of blobs is reduced as required). • Stuff we didn’t manage to complete in time: • A better clustering mechanism by use of the DBScan algorithm to detect “noise” circles.  This was implemented but was too late to be merged in the main program

  29. An Evaluation Problem (5, 50, 100, 1000 blobs)

  30. Their Solution (1000 blobs)

  31. 07: Atlas Insurance Technologies: C#, .NET Framework 3.5, multithreading. • Heuristic driven using image resampling: • Bitmap is downsampled, by a factor of 2, to a minimum resample area of 1×1. • And also upsampled, by a factor of 3, to a maximum area of 2048×2048. • From the smallest downsample containing as-yet-unmatched pixels, randomly pick one unmatched pixel, and use it as the bounding square for a blob which would set it to the correct color. • Repeat the following until the largest upsample has been processed: • Compute the respective similarity gains obtained from transforming the bounding square by 1 pixel (only) in all directions, and pick the best transformation as the improved bounding square. • Multiply the bounding square’s dimensions and bottom-left coordinate by the stepwise resample factor (2 or 3) so as to translate it onto the corresponding pixels in the next (larger) sample. • Add the blob and go to the next unmatched pixel… • Downsamples guarantee convergence of optimal bounding square for next blob in O (log2n) time. • Upsamples help by achieving sub-pixel precision on the original bitmap.

  32. An Evaluation Problem (5, 50, 100, 1000 blobs)

  33. Their Solution (1000 blobs)

  34. 08: Iš Vilniaus Implementation Language: C# • Heuristics and Algorithms • Images larger than 100x100 pixels area were resized down to fit an area of 100x100. • The image is split into single colour components using a floodfill algorithm. • Components are considered largest first. • A modified maximum number of blobs is used in generation. • For performance, a simpler similarity function to the original bitmap was employed that only considered the number of absolute correct pixels. • Method 1: To generate timer-safe blobs, the center of the rectangular area defined by each component is used as the centre of a blob. • Method 2: A Genetic algorithm • Chromosomes were configurations of a random number of components up to the limit, and the number of splits for each component. • Each split of the components became a blob and the Center of Gravity calculated from the area of the split sub-component. The radius was calculated from the width and height of the split and also from the difference to the center of gravity. • The similarity function is used as a fitness function. • Termination upon timeout or convergance. • The best solution is used

  35. Evaluation Problem (5, 20, 100 blobs)

  36. Their Solution (5 blobs)

  37. 09: Papa Charlie • Language of choice: Java • The algorithm: • Divide the bitmap into components, each holding a colour area; • The similarity calculator was used to find the best result on a component; • Until the maximum number of blobs is reached, repeat the following procedure: • Remove components with a small number of correct pixels • Subdivide components with the most wrong pixels • Heuristics to cater for special cases were also created.

  38. Another Evaluation Problem (5, 50, 100 blobs)

  39. Their solution with 100 blobs

  40. 10: WM\{James} • A suite of 3 algorithms, run separately, and the best result chosen. • Colour Based Overapproximation: For each colour (starting with the most frequent one), the centroid of the extremal points of the points where the colour appears is calculated, and a blob drawn covering the whole area. Very fast but crude algorithm. • Cluster Based Overapproximation: Contiguous clusters of colours are ordered by size, largest first. The algorithm then assigns blobs of colour to each cluster until the blobs are exhausted, using the centroid approach. Increased cost to generate the clusters. • Cluster Based Overapproximation and Merging: Build upon previous by also merging clusters which have not been assigned a blob to an existing blob. • Result is also checked with a different blob ordering (biggest first). • There are also minor trigonometric optimisations.

  41. Another Evaluation Problem (with 5, 20, 100 blobs)

  42. Their solution with 20 blobs

  43. 11: daemon • “We used java to implement our blobify.  The approach we took was a genetic algorithm.” • All the problems timed out 

  44. 12: AJaZ Tackled the problem in C# on .NET Framework 3.5 • If there are enough blobs, put a blob of radius 0.0001 (we liked the way such a blob can cover a pixel!) at the centre of each pixel. • If not, use a Genetic Algorithm • Large images (>2000 pixels), are scaled down to around 1000 pixels. • Places blobs randomly (with no colour) • Colour is calculated to be the most common color of the visible part of the circle • Mutation: • Blobs with low amount of correct pixels are removed • Blobs that overlap on a blob of the same color are removed • Other Mutations: Change position, Change Radius, Change Blob order • chances for each one are variable and change depending on which produces the most "better" mutations. (Sort of tailor making the mutation for every different bitmap) • The last 5 seconds spend mutating the best solution • Stopping condition: 57 seconds (excluding input and output)

  45. Guess the original image (1000 blobs)

  46. Che Guevara (5, 50, 100, 1000 blobs)

  47. 13: C^D • The blobmap is divided into monochrome blobmaps. • With each monochrome blobmap: • Extract a rectangle from this map such that all colored pixels are within it. • Create a blob with radius half the smallest side of the rectangle. • If half the pixels in the rectangle are coloured then use the blob • Otherwise, split the monochrome map and repeat. Different splitting mechanisms were used. • The blobs used are the biggest ones generated.

  48. The smiley faces ended up as…

  49. this using 100 blobs

  50. 14: y.a.t.n. • Implementation language: C • Randomly place blobs in areas of the image which use the same colour. • The best blob (in terms of correct pixels and paint used) is kept. • The remaining blobs are randomly refined by adding blobs in positions which enclose the same colour of the blob being put and again assessing the similarity. • Repeat until the algorithm converges or until 50 seconds have elapsed.

More Related