
CoMMA Update



Presentation Transcript


  1. CoMMA Update By Juveria & Muhammad

  2. Past Work-1 • Started with the basic concept of mining images and text and combining the two to get naïve rules. • Dataset: 780 images, all from different domains (a poor choice for a first attempt). • Features used: R, G, B, Y, edge orientations, intensities. • Used a C++ implementation of Apriori association rule mining to get rules from the text and from the images, then combined them simply by matching image ID, features and annotations. • Result: Not good. Each image got many keywords, and the keywords were not refined at all.
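The Apriori step described on this slide can be sketched roughly as follows. This is a minimal Python illustration, not the C++ implementation the slide refers to; the item names (`R=high`, `kw:sunset`, and so on), the image IDs and the support threshold are all hypothetical, and it assumes each image's discretised features and keywords have already been merged into one transaction per image ID.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        # Join frequent (k-1)-itemsets into k-item candidates, then prune any
        # candidate that has an infrequent (k-1)-subset (the Apriori property).
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        # One pass over the data to count the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result

# Hypothetical data: one transaction per image, features and keywords merged.
images = {
    "img1": {"R=high", "edge=horizontal", "kw:sunset", "kw:sky"},
    "img2": {"R=high", "edge=horizontal", "kw:sunset"},
    "img3": {"B=high", "edge=vertical", "kw:sky"},
    "img4": {"R=high", "kw:sunset", "kw:sky"},
}
frequent_itemsets = apriori(images.values(), min_support=3)
```

With these toy transactions, the only frequent pair is a feature together with a keyword (`R=high` with `kw:sunset`), which is the kind of naïve image-to-text rule the slide describes.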

  3. Past Work-2 • Used Decentralised Apriori. • Dataset: 680 images belonging only to the flowers, landscapes and nature domains (better to restrict the domains). • Features used: R, G, B, Y, edge orientations, intensities. • Got rules from the two domains separately, then used them to derive combined rules with a Decentralised Apriori implementation written in C# (a fun language with lambdas). • Results: Definitely better than the previous attempt, but with some noise here and there and some efficiency issues.
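The idea of mining each side separately and then combining the results can be illustrated with a deliberately simplified sketch. It is not the Decentralised Apriori algorithm itself: it assumes the two sides are image features and keywords, keeps only single frequent items per side, and pairs them through the image IDs they share. All names, data and thresholds below are hypothetical.

```python
def frequent_items(table, min_support):
    """Map each item to the set of image ids containing it; keep frequent items."""
    tids = {}
    for image_id, items in table.items():
        for item in items:
            tids.setdefault(item, set()).add(image_id)
    return {item: ids for item, ids in tids.items() if len(ids) >= min_support}

def combine(feature_tids, keyword_tids, min_support, min_confidence):
    """Pair frequent features with frequent keywords via shared image ids."""
    rules = []
    for feature, f_ids in feature_tids.items():
        for keyword, k_ids in keyword_tids.items():
            support = len(f_ids & k_ids)          # images having both
            confidence = support / len(f_ids)     # P(keyword | feature)
            if support >= min_support and confidence >= min_confidence:
                rules.append((feature, keyword, support, confidence))
    return sorted(rules)

# Hypothetical per-side tables, both keyed by image id.
features = {
    "img1": {"G=high", "edge=diag"},
    "img2": {"G=high"},
    "img3": {"G=high", "Y=high"},
    "img4": {"Y=high"},
}
keywords = {
    "img1": {"leaf"},
    "img2": {"leaf", "flower"},
    "img3": {"leaf"},
    "img4": {"flower"},
}

rules = combine(
    frequent_items(features, 2),
    frequent_items(keywords, 2),
    min_support=2,
    min_confidence=0.8,
)
```

Filtering each side first keeps the cross-side pairing small, which is one plausible way to reduce the noise and cost mentioned in the results.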

  4. Present Work • The Big Picture: Application • Dataset • Features • Programming • Algorithm

  5. The Big Picture: CoMMA [Architecture diagram; recoverable labels:] • Front end: allows a user to add images with or without annotations (initially with). • Hard disk: images. • Feature Extractor. • Database: image path, annotations, features, rules. • Auto Annotation System: annotations generated for images. • Image mining: image retrieval using a query image, returning similar annotated images. • Text mining: image retrieval using keyword search, returning annotated images with related keywords. • Technologies named in the diagram: Java, Oracle 9i, Matlab, PHP & HTML.

  6. Dataset • Presently we have 670 images; the target is 3000. Images are available on the web. With your help in annotating these images, we should be ready with a bigger dataset. • http://reddwarf.cs.rit.edu/~dmrg/CoMMA/annotate.php • For the multi-relational aspect we’re trying to get more than one annotation for each image. This will give us 1:n multi-relational data.
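One way to picture the 1:n relation between an image and its annotations is as rows of (image, annotator, keyword). The sketch below is only an illustration under that assumption; the row layout, the annotator names and the `keyword_votes` helper are hypothetical and do not describe the project's actual database schema.

```python
from collections import defaultdict

# Hypothetical rows: (image_id, annotator, keyword). One image has many rows.
annotation_rows = [
    ("img1", "alice", "rose"),
    ("img1", "bob", "flower"),
    ("img1", "bob", "rose"),
    ("img2", "alice", "lake"),
]

def keyword_votes(rows):
    """Count, per image, how many distinct annotators used each keyword."""
    votes = defaultdict(lambda: defaultdict(int))
    # set() drops exact duplicate rows so an annotator is counted once per keyword.
    for image_id, _annotator, keyword in set(rows):
        votes[image_id][keyword] += 1
    return votes

votes = keyword_votes(annotation_rows)
```

Keywords that several annotators agree on (here `rose` for `img1`) could then be weighted more heavily than keywords supplied only once.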

  7. Image Features - Common Approaches in CBIR • Subsections of images: each image is divided into 5 sections, including the upper-left, upper-right, lower-left and lower-right corners. • HSV color space: the Hue, Saturation, Value color space excludes errors of the form Y=0 means R=1, B=1, G=1. We are taking 10-bin histograms for H, S and V. (Exploring Matlab for such histograms reveals that one gets 3x10x100 blocks of arrays; there has to be a sensible way of storing only the relevant portions, and we have to find out about that.)
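As a rough illustration of per-region 10-bin H, S and V histograms, the sketch below stores just 3 x 10 counts per region. It uses Python's standard `colorsys` module rather than Matlab, and the image representation (nested lists of RGB tuples), the region boxes and the function names are assumptions made for the example.

```python
import colorsys

BINS = 10

def hsv_histograms(pixels):
    """pixels: iterable of (r, g, b) in 0..255. Returns [H, S, V] 10-bin counts."""
    hists = [[0] * BINS for _ in range(3)]
    for r, g, b in pixels:
        hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        for channel, value in enumerate(hsv):
            # Each value lies in [0, 1]; a value of exactly 1.0 goes in the last bin.
            index = min(int(value * BINS), BINS - 1)
            hists[channel][index] += 1
    return hists

def region_features(image, regions):
    """image: list of rows of RGB tuples.
    regions: {name: (top, bottom, left, right)} with half-open bounds."""
    features = {}
    for name, (top, bottom, left, right) in regions.items():
        pixels = [px for row in image[top:bottom] for px in row[left:right]]
        features[name] = hsv_histograms(pixels)
    return features

# Toy 4x4 image: left half pure red, right half pure blue.
image = [[(255, 0, 0)] * 2 + [(0, 0, 255)] * 2 for _ in range(4)]
regions = {"upper_left": (0, 2, 0, 2), "upper_right": (0, 2, 2, 4)}
features = region_features(image, regions)
```

Each region ends up as three short lists, so an image with 5 regions needs only 5 x 3 x 10 counts.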

  8. Multi-relational Association Rule Mining • Modifying existing association rule mining algorithms to handle multiple relations: a partial proof is already done. More thought is needed on how to reuse the present data structures for this purpose. • Designing an algorithm from scratch: this needs a lot of thought to come up with something genuinely new rather than a modified version of an existing algorithm. A formal proof will be needed to show that such an algorithm works.

  9. ARMS • Apriori: We’ve already looked at decentralized Apriori and WARMR. The basic drawback of Apriori is that it uses a horizontal data layout, which is inefficient compared to the vertical layout. • FP Tree: We have to explore this algorithm. • Eclat: Uses the vertical layout (per-item lists of transaction IDs), not the horizontal one, so Apriori's drawback does not apply to it. • COFI: We have to explore this.
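The difference between the horizontal and vertical layouts mentioned on this slide can be shown with a small sketch. The transaction data and function names are hypothetical; the point is only that the horizontal layout needs a scan over every transaction to count support, while the vertical layout intersects per-item sets of transaction IDs.

```python
def to_vertical(horizontal):
    """Convert {tid: items} (horizontal) into {item: set of tids} (vertical)."""
    vertical = {}
    for tid, items in horizontal.items():
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical

def support_horizontal(horizontal, itemset):
    """Count support by scanning every transaction."""
    itemset = set(itemset)
    return sum(1 for items in horizontal.values() if itemset <= set(items))

def support_vertical(vertical, itemset):
    """Count support by intersecting the tid-sets of the items."""
    tid_sets = [vertical.get(item, set()) for item in itemset]
    return len(set.intersection(*tid_sets)) if tid_sets else 0

# Hypothetical transactions keyed by transaction id.
horizontal = {
    1: {"sky", "blue"},
    2: {"sky", "cloud"},
    3: {"sky", "blue", "cloud"},
    4: {"grass"},
}
vertical = to_vertical(horizontal)
```

Both functions give the same support for any itemset; they differ only in how much of the data has to be touched to get it.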

  10. Conclusion • That’s all for now. • We’re exploring further and posting our discoveries on our webpage • Thank you!

  11. References • Distributed Multimedia Databases: Techniques & Applications by Timothy K. Shih. • Multimedia Databases and Mining by B. Thuraisingham. • Multi-relational Data Mining.
