[GISCUP2013] Mailing List Q&A + Project Discussion

Presentation Transcript


  1. [GISCUP2013] Mailing List Q&A + Project Discussion Ashok Dahal

  2. Overview • Discussion of questions asked by registered members • Responses of GISCUP2013 team • Discussion of GISCUP updates • Project Discussion

  3. Q&A - Deadline • Q: When is the submission deadline? (01/17) • A: The deadline is August 1st, 2013 (01/21)

  4. GISCUP2013 Update • Dataset changed (02/27) • Summary of changes: • Some polygon/point pairs were incorrectly omitted from the result set. • In the new version, the polygons (i.e., those stored in poly10.txt and poly15.txt) are sorted by sequence number in ascending order. From now on, you can assume that all polygon data is given to you sorted that way.

  5. Q&A – Data Size Limit • How many points and polygons? • Each object (point/polygon) will have a number of instances (a point/polygon with a timestamp). • According to the problem statement, the maximum number of points and polygons will be no more than 1M and 500, respectively. • The question is: does that limit count objects or instances?

  6. Q&A – Data Size Limit contd. • The sample data provided has point files with 500 and 1000 points and polygon files with 10 and 15 polygons. • Point500.txt = 39,289 lines (instances) • Point1000.txt = 69,619 lines (instances) • Poly10.txt = 30 lines (instances) • Poly15.txt = 40 lines (instances)

  7. Q&A – Data Size Limit [Response] • The size limit applies to the number of instances of the points and polygons. • That is, the total number of points in the points file will be less than 1M and the total number of polygons in the polygon file will be less than 500. • To be more specific, the number of lines in the points file will be less than 1M and the number of lines in the polygon file will be less than 500.

  8. Q&A – Data Size Limit arguments • Argument: Determining whether points are in a single polygon that gets redefined 500 times is a much easier problem than having 500 distinct polygons defined at the same time, the whole time. In the real world this may not happen. • Response: We have to restrict certain dimensions of the problem to make it practical for a contest.

  9. Q&A – Data Size Limit arguments contd. • Argument: Do we care how many lines are in the polygon input file, or do we care how many polygons can be defined at a given time? • Response: • The maximum number of polygons that can be defined at a given time is 500. In this case, none of the polygons would move; only the points would move. • The minimum number of polygons that can be defined at a given time is 1. In this case, the polygon can move 499 times.

  10. Q&A – Defining Polygons • Question: In the sample files, all of the polygons are defined at once with the first several timestamps, before any points are defined. Will all of the polygons be defined at the start? Or is it possible that new polygons will appear later on? • Response: All the initial polygons will be defined before any points are defined, as we did in the sample files.

  11. Q&A – Defining Polygons [Arguments] • Argument: Actually, the sample files do not agree with that statement. The sample file poly15.txt has a polygon with ID 0 which does not get defined until timestamp 124106, and sample file poly10.txt has a polygon with ID 0 which does not get defined until timestamp 403047. • Response: This is a data error. We will fix it and redo the test files. There should not be any polygons with an ID less than 1.

  12. Q&A – Evaluation Machine • Question: Can we assume a Java Runtime Environment installation to be present on the evaluation machine? • Response: Yes, you can assume JDK version 1.6.

  13. Your Own Questions • Do you want to ask specific questions of your own? https://wwws.cs.umn.edu/mm-cs/listinfo/GISCup2013 • Follow this link and register so that you can send questions to the GISCUP team. You will also get an email whenever somebody asks a question and whenever the GISCUP team responds.

  14. Project Discussion • The data set used for evaluation will be much larger than the sample files provided on the CUP website. • Example: the number of lines in the points500.txt file can go up to 1M from the current 39,289 lines. Similarly, the number of lines in poly10.txt can go up to 500 from the current 30 lines. • You need to work on speeding up your program, since a large dataset can take a long time to process and to validate your output against (Why?)
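
One way to approach the "Why?" is to count comparisons. The sketch below assumes an exhaustive approach that tests every point line against every polygon line, which is an upper bound rather than the exact workload. `pair_count` is a hypothetical helper introduced only for illustration.

```python
def pair_count(point_lines, polygon_lines):
    """Number of point/polygon tests if every point line meets every polygon line."""
    return point_lines * polygon_lines


# Line counts quoted in the slides: current sample files vs. the stated limits.
sample = pair_count(39_289, 30)
full = pair_count(1_000_000, 500)

# How many times more tests the full-size input needs than the sample.
growth = full / sample

print(sample, full, round(growth))
```

Under that assumption the full-size input needs roughly 424 times as many tests as the sample, so a runtime measured on the sample files would grow by about the same factor if the cost per test stays constant.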

  15. Project Discussion contd. • My experience: • I am using two methods to check whether a point is INSIDE a polygon (ray casting and the winding number method). Both are exhaustive, which means no speed-up has been applied yet. • Programming language: Perl. • Speed- and accuracy-wise, both methods seem similar. • It takes around 800 s to process points500.txt (39,289 instances) with poly10.txt (30 instances). • How long will it take for points500.txt (1M instances) and poly10.txt (500 instances)?
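
For reference, a minimal sketch of the ray-casting (even-odd) test named above. It is written in Python rather than the presenter's Perl and is not the presenter's code. The function name and the vertex-list representation are assumptions, and points lying exactly on an edge are not handled in any particular way.

```python
def point_in_polygon(x, y, vertices):
    """Even-odd ray casting.

    vertices is a list of (x, y) tuples describing the ring; the last vertex
    is implicitly connected back to the first.
    """
    inside = False
    j = len(vertices) - 1
    for i in range(len(vertices)):
        xi, yi = vertices[i]
        xj, yj = vertices[j]
        # Does this edge straddle the horizontal line through the point?
        if (yi > y) != (yj > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            # Count only crossings to the right of the point.
            if x < x_cross:
                inside = not inside
        j = i
    return inside


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon(2, 2, square))
print(point_in_polygon(5, 2, square))
```

Each call visits every edge once, so the cost of one test grows with the number of vertices in the polygon.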

  16. Project Discussion contd. • So speeding up is a MAJOR factor. • Accuracy: • All 10,366 pairs match. • Initially, I had more than 20,000 pairs in my output, which means I had more than 9,600 extra pairs. • If there are extra pairs, your score will go down, because each extra pair decrements the score by 1. • So accuracy is another MAJOR factor. • HINTS for accuracy: • Remove all the extra pairs based on the problem definition, i.e., compare the timestamp of the point with that of the polygon and check whether the polygon has already expired.
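
The expiry check in the hint could be sketched as follows. It assumes that a polygon instance stays valid from its own timestamp until the next instance with the same ID appears, and that a point is compared only against the instance valid at the point's timestamp. That reading should be verified against the problem statement. `active_instance` is a hypothetical helper, and it relies on the instances being sorted in ascending order, as the updated dataset promises.

```python
from bisect import bisect_right


def active_instance(timestamps, point_time):
    """Index of the latest polygon instance defined at or before point_time.

    timestamps holds the ascending timestamps of all instances of ONE polygon
    ID. Returns None when the polygon has not been defined yet.
    """
    i = bisect_right(timestamps, point_time)
    return i - 1 if i > 0 else None


# Hypothetical timestamps for three instances of a single polygon ID.
instances = [10, 50, 90]
print(active_instance(instances, 60))
```

Every earlier instance of the same ID counts as expired for that point, so pairs reported against it would be extra pairs.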

  17. Project Discussion contd. • Remember that you also have to handle WITHIN, not only INSIDE the polygon. • Things you need to consider: • Start early! • Work on the speed. • Apply filtering as discussed in class. • If you can utilize a multi-core CPU, that is even better. • Work on accuracy.
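
The slides do not say which filtering was discussed in class. A bounding-box pre-check is one common choice and is shown here purely as an assumption. `bounding_box` and `may_contain` are hypothetical names, and the `margin` parameter assumes that WITHIN means "within a given distance of the polygon", which should be checked against the problem statement.

```python
def bounding_box(vertices):
    """Axis-aligned bounding box (min_x, min_y, max_x, max_y) of a vertex list."""
    xs = [vx for vx, _ in vertices]
    ys = [vy for _, vy in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def may_contain(box, x, y, margin=0.0):
    """Cheap rejection test: False means the exact test can be skipped.

    margin expands the box on every side, e.g. by the WITHIN distance.
    """
    min_x, min_y, max_x, max_y = box
    return (min_x - margin <= x <= max_x + margin
            and min_y - margin <= y <= max_y + margin)


triangle = [(1, 1), (5, 2), (3, 6)]
box = bounding_box(triangle)
print(may_contain(box, 3, 3), may_contain(box, 0, 0))
```

The box is computed once per polygon instance. A point that fails this check cannot be inside the polygon, so the more expensive per-edge test is only needed for the points that pass.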
