
Enhancing Data Handling with DataContainer: Best Practices

Learn how to optimize data management with DataContainer, improve code readability, avoid errors, and automate metadata tracking for robust programming. Explore new syntaxes and dynamic operators for efficient data manipulation.

abel-black


Presentation Transcript


  1. Spot & DataContainer. Daniel (me), Henryk, Thomas & Siavash

  2. Agenda • The way we work with data and metadata NOW and how we want to improve it • Make code shorter and easier to read • Basic clean-up using DataContainer • Two new operator constructor syntaxes • Avoid/detect programming mistakes -> more robust code • Some things to keep in mind • Children operator dimension • Operation history • Dynamic operator

  3. What is Metadata? Data: the image itself (image source: USGS). Metadata: the unit on the axis, how far apart the samples are, the label on the axis, <etc>.

  4. How We Treat Metadata. Why is keeping track of metadata ourselves error-prone? Data and metadata are separated. We have to change the metadata ourselves. What if we load different data and the unit is different?

  5. How We Treat Metadata. Why is keeping track of metadata ourselves error-prone? How do we know what actually happened in the function?

  6. How We Treat Metadata. Why is keeping track of metadata ourselves error-prone? The data may be structured differently than we thought: what if the data we get actually has the space domain on the 1st dimension and the time domain on the 2nd dimension?

  7. How We Treat Metadata. Why is keeping track of metadata ourselves error-prone? Keeping track of metadata is a total pain for higher-dimensional data. Remember that there are other metadata too, not just the unit.

  8. What if We Want to Keep Track of Metadata Automatically. Originally: the operator(s) modify the data, and we modify the metadata. The operator (not us) should modify BOTH the data and the metadata at the same time.

  9. What if We Want to Keep Track of Metadata Automatically. What we want: the operator(s) act on a DataContainer that holds both the data and the metadata. So what does a DataContainer look like?

  10. Store Data and MetaData in one Object
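The idea of keeping data and metadata in one object can be sketched in a few lines. This is only a toy illustration in Python, not the deck's MATLAB implementation: the class name `ToyContainer` and its field names are hypothetical, chosen to mirror the metadata mentioned earlier (unit, sample spacing, origin, label).

```python
from dataclasses import dataclass
import numpy as np


@dataclass
class ToyContainer:
    """Hypothetical stand-in for a DataContainer: data plus per-axis metadata."""
    data: np.ndarray
    unit: list
    delta: list
    origin: list
    label: list

    def __post_init__(self):
        # Every metadata list needs exactly one entry per data dimension.
        for name in ("unit", "delta", "origin", "label"):
            if len(getattr(self, name)) != self.data.ndim:
                raise ValueError(f"{name} needs {self.data.ndim} entries")


# A 2-D gather: time along the first axis, offset along the second.
C = ToyContainer(
    data=np.zeros((4, 3)),
    unit=["s", "m"],
    delta=[0.004, 10.0],
    origin=[0.0, 0.0],
    label=["time", "offset"],
)
print(C.label[0], C.unit[0], C.delta[0])
```

Because the metadata travels with the array, a function that receives `C` never has to be told separately what each axis means.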

  11. Same Vector Operations as a MATLAB Array. DataContainer works like a MATLAB array.
      DataContainer: >> C = iCon(randn(3,5)); >> C1 = vec(C); >> C2 = C'; >> C3 = norm(C); >> C4 = 2*C + 3*C - 4*C >> % and more
      MATLAB array: >> x = randn(3,5); >> x1 = vec(x); >> x2 = x'; >> x3 = norm(x); >> x4 = 2*x + 3*x - 4*x >> % and more

  12. Like RSF at File Level. Saving to file: metadata (variable: model) and data (variable: m).

  13. Spot Operators Apply Changes to the DataContainer's Metadata
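A rough sketch of what "the operator changes the metadata too" could look like, using a Fourier transform along the first axis as the example. Everything here is an assumption for illustration: the dictionary layout, the function name `toy_dft`, and the choice to relabel seconds as Hz. Only the frequency spacing 1/(n*dt) is standard DFT bookkeeping.

```python
import numpy as np


def toy_dft(container):
    """Apply a DFT along axis 0 and update that axis's metadata in the same step."""
    n = container["data"].shape[0]
    dt = container["delta"][0]
    out = dict(container)  # shallow copy; the lists below are replaced, not mutated
    out["data"] = np.fft.fft(container["data"], axis=0)
    out["unit"] = ["Hz"] + container["unit"][1:]              # assumes the input axis was in seconds
    out["delta"] = [1.0 / (n * dt)] + container["delta"][1:]  # DFT frequency spacing
    out["label"] = ["frequency"] + container["label"][1:]
    return out


C = {
    "data": np.random.randn(8, 3),
    "unit": ["s", "m"],
    "delta": [0.5, 10.0],
    "label": ["time", "offset"],
}
F = toy_dft(C)
print(F["unit"], F["delta"], F["label"])
```

The caller never edits `unit` or `delta` by hand; the input container `C` is left unchanged.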

  14. What DataContainer Enables Us to Do. What if we load a different set of data and the unit is different?

  15. What DataContainer Enables Us to Do. Keeping track of metadata is a total pain.

  16. What DataContainer Enables Us to Do. Originally, we don't know if the data is structured the way we expected, and we don't know if the function is doing things properly. With a DataContainer, we can add a check on the metadata. Here we check the unit as an example.
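The unit check mentioned on this slide might look roughly like the following. The slide's own code is not in the transcript, so this is a hypothetical Python sketch: `require_unit` and `time_taper` are made-up names, and the container is modelled as a plain dictionary.

```python
import numpy as np


def require_unit(container, axis, expected):
    """Raise if the metadata for `axis` does not carry the expected unit."""
    actual = container["unit"][axis]
    if actual != expected:
        raise ValueError(f"axis {axis}: expected unit {expected!r}, got {actual!r}")


def time_taper(container):
    # Hypothetical processing step that only makes sense along a time axis.
    require_unit(container, 0, "s")
    n = container["data"].shape[0]
    window = np.linspace(0.0, 1.0, n)[:, None]
    return {**container, "data": container["data"] * window}


good = {"data": np.ones((4, 2)), "unit": ["s", "m"]}
bad = {"data": np.ones((4, 2)), "unit": ["m", "s"]}  # axes arrived in the other order

tapered = time_taper(good)
try:
    time_taper(bad)
    caught = False
except ValueError as err:
    caught = True
    print("caught:", err)
```

With a bare array, both calls would run and the second would quietly taper the wrong axis.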

  17. Agenda • The way we work with data and metadata NOW and how we want to improve it • Make code shorter and easier to read • Basic clean-up using DataContainer • Two new operator constructor syntaxes • Avoid/detect programming mistakes -> more robust code • Some things to keep in mind • Children operator dimension • Operation history • Dynamic operator

  18. How to Use DataContainer in Your Code. Example code: L1 recovery

  19. Clean-up Stage 0: Put data and metadata into DataContainer. [Example code, with the metadata and the data marked.] Let's clean up the code!

  20. Clean-up Stage 0: Put data and metadata into DataContainer

  21. Clean-up Stage 0: Put data and metadata into DataContainer

  22. Clean-up Stage 1: Using the New Syntax to Clean Up and Give More Info to the Spot Operator. New operator constructor syntaxes: A = opDFT(model.n(1)); Version 0: A = opDFT(size(C,1)); Version 1: A = opDFT({C,1}); Version 2: A = opDFT

  23. Clean-up Stage 1: Using the New Syntax to Clean Up and Give More Info to the Spot Operator
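The difference between passing a size and passing a {container, dimension} pair can be illustrated with a toy constructor. This is a Python sketch with hypothetical names (`ToyDFT`, `spec`), not Spot's constructor; a NumPy array stands in for the container, and axes are 0-based here where MATLAB counts from 1.

```python
import numpy as np


class ToyDFT:
    """Hypothetical operator sized from an integer or from a (container, axis) pair."""

    def __init__(self, spec):
        if isinstance(spec, tuple):
            container, axis = spec
            self.n = container.shape[axis]
            self.axis = axis      # the operator now knows which dimension it belongs to
        else:
            self.n = int(spec)
            self.axis = None      # a bare size carries no information about the dimension


C = np.zeros((6, 4))
A0 = ToyDFT(C.shape[0])   # in the spirit of opDFT(size(C,1))
A1 = ToyDFT((C, 0))       # in the spirit of opDFT({C,1})
print(A0.n, A0.axis, A1.n, A1.axis)
```

Both operators end up with the same size, but only the second keeps the link to the container's dimension, which is the extra information the slide title refers to.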

  24. Clean-up Stage 2: Using Dynamic Operator. New operator constructor syntaxes: A = opDFT(model.n(1)); Version 0: A = opDFT(size(C,1)); Version 1: A = opDFT({C,1}); Version 2: A = opDFT (dynamic operator)

  25. Clean-up Stage 2: Using Dynamic Operator. Simple dynamic operator example: opDFT

  26. Clean-up Stage 2: Using Dynamic Operator. Simple dynamic operator example: opDFT. [Figure: opDFT( ) applied to a DataContainer of dimension 3 x 1; labels: opDFT( ), opDFT( ) *, 3.]

  27. Clean-up Stage 2: Using Dynamic Operator
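A dynamic operator, as described in Stage 2, is created without a size and picks one up from the data it is applied to. The sketch below shows that idea in Python under stated assumptions: `LazyDFT` is a made-up class, it stores the size on the object itself, and it uses `@` in place of MATLAB's `*`. How the real dynamic operators store their state is not shown in the transcript.

```python
import numpy as np


class LazyDFT:
    """Hypothetical dynamic operator: its size is fixed on first use."""

    def __init__(self):
        self.n = None  # unknown until the operator meets some data

    def __matmul__(self, x):
        x = np.asarray(x)
        if self.n is None:
            self.n = x.shape[0]   # initialize from the data
        if x.shape[0] != self.n:
            raise ValueError(f"operator has size {self.n}, data has {x.shape[0]} rows")
        return np.fft.fft(x, axis=0)


A = LazyDFT()           # no size given, like A = opDFT
y = A @ np.ones(3)      # first use: A takes its size from the 3 x 1 input
print(A.n, y)
```

The same script then works unchanged for data of a different length, as long as a fresh operator is created for it.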

  28. Agenda • The way we work with data and metadata NOW and how we want to improve it • Make code shorter and easier to read • Basic clean-up using DataContainer • Two new operator constructor syntaxes • Avoid/detect programming mistakes -> more robust code • Some things to keep in mind • Children operator dimension • Operation history • Dynamic operator

  29. Example - Catching Errors. Having Spot operate on a MATLAB array: correct.

  30. Example - Catching Errors. Having Spot operate on a MATLAB array: wrong!!

  31. Example - Catching Errors. Having Spot operate on a DataContainer.
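The slides' own example is not in the transcript, so the following is a hypothetical Python illustration of one kind of mistake a plain vector cannot reveal: once data is vectorized, only its length is left, and a data set with its dimensions in the wrong order has the same length. `ToyVec`, `apply_plain`, and `apply_checked` are made-up names, and an identity stands in for the real operator.

```python
import numpy as np


class ToyVec:
    """Hypothetical vectorized container that remembers its original dimensions."""

    def __init__(self, array):
        self.dims = array.shape
        self.data = array.flatten(order="F")  # column-major, as MATLAB stores arrays


def apply_plain(op_dims, x):
    # A plain vector only has to have the right length.
    if x.size != op_dims[0] * op_dims[1]:
        raise ValueError("length mismatch")
    return x  # identity stands in for the real operator


def apply_checked(op_dims, c):
    # A container can also be checked against the dimensions the operator was built for.
    if c.dims != op_dims:
        raise ValueError(f"operator built for {op_dims}, data is {c.dims}")
    return c.data


op_dims = (3, 5)
swapped = np.zeros((5, 3))  # same number of samples, dimensions in the wrong order

silent = apply_plain(op_dims, swapped.flatten(order="F"))  # passes without complaint
try:
    apply_checked(op_dims, ToyVec(swapped))
    caught = False
except ValueError as err:
    caught = True
    print("caught:", err)
```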

  32. Agenda • The way we work with data and metadata NOW and how we want to improve it • Make code shorter and easier to read • Basic clean-up using DataContainer • Two new operator constructor syntaxes • Avoid/detect programming mistakes -> more robust code • Some things to keep in mind • Children operator dimension • Operation history • Dynamic operator

  33. Keep in Mind - Operator. [Figure: opKron( , ) operating on a MATLAB array versus operating on a DataContainer. Labels: opKron (m x n); MATLAB array (n x 1); DataContainer, size: dim1 x dim2 x dim3.]

  34. Keep in Mind - Operator. To properly define the operator:
      opTaper = opKron(opDirac,opDirac, opLinMute({C,1},0.1,0.2));
      opTaper = opKron(opDirac(size(C,3)),opDirac(size(C,2)),opLinMute(size(C,1),0.1*size(C,1),0.2*size(C,1)));
      opTaper = opKron(opDirac({C,2:3}),opLinMute({C,1},0.1,0.2));
      Slide annotations: ({<Container>, <selected dimensions>}); dynamic operator.
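Why each child of a Kronecker operator has to match one dimension of the data, and why the last child is the one acting on dimension 1, follows from the identity kron(B, A) vec(X) = vec(A X B^T) for column-major vec. The NumPy check below is a sketch, not Spot code; it assumes opDirac plays the role of an identity and uses a random matrix in place of opLinMute.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))  # dimension 1 has 4 samples, dimension 2 has 3
A = rng.standard_normal((4, 4))  # child acting on dimension 1 (stand-in for opLinMute)
B = np.eye(3)                    # child acting on dimension 2 (identity, the assumed role of opDirac)


def vec(M):
    """Column-major vectorization, matching MATLAB's X(:)."""
    return M.flatten(order="F")


lhs = np.kron(B, A) @ vec(X)     # Kronecker operator applied to the vectorized data
rhs = vec(A @ X @ B.T)           # each child applied along its own dimension
print(np.allclose(lhs, rhs))
```

If the child sizes were listed in the wrong order, the product would either fail on a size mismatch or, for equal sizes, act on the wrong dimension without any error.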

  35. Keep in Mind - DataContainer In our example, we must initialize the container as 3-dimensional THEN vectorize it.

  36. Keep in Mind - DataContainer Keeps Track of Metadata History. res2 should have the same metadata as container C.

  37. How operating on a DataContainer helps recover information that we usually can't recover. History stack of res1 = A * C: [A.ID, Mode = 1; Delta, Origin, Unit,…, etc]. History stack of res2 = A' * A * C: [A.ID, Mode = 2] on top of [A.ID, Mode = 1; Delta, Origin, Unit,…, etc].

  38. Keep in Mind - DataContainer • When doing addition or subtraction between 2 DataContainers, we keep the history of the one on the left: Container 1 (C1_history) + Container 2 (C2_history) = Result Container (C1_history).
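The history stack and the keep-the-left-history rule can be sketched as follows. This Python toy is an assumption-laden illustration, not the real mechanism: `ToyHist`, `forward`, and `adjoint` are hypothetical, the forward step simply relabels the unit, and the adjoint restores metadata by looking up the matching Mode 1 entry.

```python
class ToyHist:
    """Hypothetical container with a value, metadata, and a history stack."""

    def __init__(self, value, meta, history=None):
        self.value = value
        self.meta = dict(meta)
        self.history = list(history or [])

    def __add__(self, other):
        # Addition keeps the metadata and history of the left operand.
        return ToyHist(self.value + other.value, self.meta, self.history)


def forward(op_id, c):
    # Push the operator ID, the mode, and the metadata as it was before the operator ran.
    entry = {"id": op_id, "mode": 1, "meta": dict(c.meta)}
    return ToyHist(c.value, {"unit": "Hz"}, c.history + [entry])


def adjoint(op_id, c):
    # Restore the metadata saved by the most recent forward use of the same operator.
    for entry in reversed(c.history):
        if entry["id"] == op_id and entry["mode"] == 1:
            restored = entry["meta"]
            break
    else:
        raise RuntimeError("no forward entry for this operator in the history")
    return ToyHist(c.value, restored, c.history + [{"id": op_id, "mode": 2}])


C = ToyHist(1.0, {"unit": "s"})
res1 = forward("A", C)       # like res1 = A * C
res2 = adjoint("A", res1)    # like res2 = A' * A * C
total = res1 + C             # history comes from res1, the left operand
print(res2.meta, [e["mode"] for e in res2.history], len(total.history))
```

Here `res2` ends up with the same metadata as `C`, which is the behaviour slide 36 asks for.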

  39. Keep in Mind - Dynamic Operator • A dynamic operator is activated in the scope in which it is used.

  40. Keep in Mind - Dynamic Operator • A dynamic operator is activated in the scope in which it is used. Advantage? Disadvantage?

  41. Keep in Mind - Dynamic Operator • Some dynamic operators cannot activate in adjoint mode. • They must have been used previously in forward mode.

  42. A' * A * C Example. A = opCurvelet(?) versus A = opCurvelet(23,29) • Some dynamic operators must have been used previously in forward mode.

  43. History Stack, with C = iCon(randn(23,29)). After A * C: [A.ID, Mode = 1; Delta, Origin, Unit, info for A,…, etc]. After A' * A * C: [A.ID, Mode = 2] on top of [A.ID, Mode = 1; Delta, Origin, Unit, info for A,…, etc].
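Why some dynamic operators cannot start in adjoint mode can be seen with a much simpler operator than a curvelet transform. The sketch below swaps in zero-padding to the next power of two, purely as a stand-in: the output size does not determine the input size, so the adjoint cannot work out its own dimensions. `LazyPad` is hypothetical, and it keeps the size on the object rather than in a container's history stack.

```python
import numpy as np


class LazyPad:
    """Hypothetical dynamic operator: zero-pads a vector to the next power of two."""

    def __init__(self):
        self.n = None  # domain size, unknown until the first forward use

    def forward(self, x):
        if self.n is None:
            self.n = x.size  # the input itself reveals the domain size
        if x.size != self.n:
            raise ValueError(f"operator has domain size {self.n}, got {x.size}")
        m = 1 << (self.n - 1).bit_length()  # next power of two >= n
        out = np.zeros(m)
        out[: self.n] = x
        return out

    def adjoint(self, y):
        if self.n is None:
            # A length-8 input could belong to n = 5, 6, 7, or 8: not recoverable.
            raise RuntimeError("use the operator in forward mode first")
        return y[: self.n].copy()  # the adjoint of zero-padding is cropping


A = LazyPad()
try:
    A.adjoint(np.zeros(8))   # adjoint first: the operator cannot activate
    failed = False
except RuntimeError:
    failed = True

y = A.forward(np.arange(5.0))  # forward use fixes n = 5
x_back = A.adjoint(y)          # now the adjoint knows where to crop
print(failed, y.size, x_back)
```

In the deck, the "info for A" entry in the history stack appears to serve this purpose: the forward pass leaves behind what the adjoint needs.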

  44. Keep in Mind - Dynamic Operator • An operator gets a different ID each time it is created. A.ID Mode = 1 …

  45. Conclusion. Why consider using DataContainer? • Works just like a MATLAB array • Not too difficult to change over • Added features • Organizes your data and metadata • Can be saved to file: no more editing your script • Applies changes to your metadata • Shortens your code => easier to read • Dynamic operators • Automatically initialize the operator for you • Use the same script for data of different sizes => easily scale to larger data sets • Catches errors that a MATLAB array normally can't catch • The new Spot operators are backward compatible

  46. That's it! Questions?
