
1. **Enhancing Geotechnical Data Integrity with Python** 2. **Geological engineering student focusing on data integr

sarrado

Presentation Transcript


  1. ENHANCING GEOTECHNICAL DATA INTEGRITY WITH PYTHON
     Ezekiel Camacho

  2. OVERVIEW
     o Presenter’s Background
     o QAQC in Geotechnical Engineering
     o Code Development & Automation
     o GUI Integration

  3. ABOUT ME
     Geological Engineering Student, University of British Columbia, Vancouver
     o Current geotechnical engineering intern at Teck Resources Ltd.
     o Prior core logging experience in mineral exploration
     o Prior experience as a geochemistry laboratory technician
     o Involved in coding and research projects with UBC design teams
     Interests
     o Data science applications in geotechnical engineering
     o Rock mechanics
     o Geothermal exploration

  4. QAQC IN GEOTECHNICAL ENGINEERING Core logging, field tests, laboratory tests, and research data

  5. COMMON SOURCES OF ERROR IN DATA COLLECTION
     DATA ENTRY
     o Mislabeling information
     o Transcription errors
     o Incorrect units
     o Logging inconsistency
     INSTRUMENTATION
     o Equipment calibration
     o Sensor drift
     o Data logger malfunction
     SAMPLE DEGRADATION
     o Mechanical breaks from transport
     o Weathering

  6. COMMON SOURCES OF ERROR IN DATA COLLECTION
     Models are only as good as our data: robust QAQC workflows are paramount to successful geotechnical analyses.

  7. TRADITIONAL QAQC PROCESS
     o Data entry and transcription are typically verified through manual review, cross-checking against original records
     o Statistical data validation is conducted on one dataset at a time
     o Tracking anomalous spikes in time series data with increasing data quantity
     o Manual cross-validation of reports and logs against benchmarking data

  8. QAQC WITH PYTHON
     DATA COLLECTION
     o Forms can be modified to only contain selections for recognized codes from the logging manual and can flag incorrect entries
     o Fast retrieval and processing of multiple databases and digital archives
     VALIDATION
     o Detection of outliers and inconsistencies can be programmed and used in conjunction with data processing scripts
     o Automated cross-validation with benchmarking and research data
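
The programmed outlier detection described on this slide could look roughly like the sketch below. It uses Tukey's interquartile-range fences, which is one common choice and not necessarily the presenter's method. The function name `flag_outliers_iqr`, the column names, and the UCS values are all hypothetical and made up for illustration.

```python
import pandas as pd


def flag_outliers_iqr(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return True where a value lies outside the Tukey fences Q1 - k*IQR and Q3 + k*IQR."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


# Hypothetical UCS results in MPa; 850.0 mimics a unit or transcription error.
ucs = pd.DataFrame({
    "sample_id": ["S1", "S2", "S3", "S4", "S5", "S6"],
    "ucs_mpa": [82.0, 91.5, 78.3, 850.0, 88.1, 95.4],
})
ucs["flagged"] = flag_outliers_iqr(ucs["ucs_mpa"])

# Collect the sample IDs that need a manual check against the original record.
flagged_ids = ucs.loc[ucs["flagged"], "sample_id"].tolist()
print(flagged_ids)
```

A flag here only marks a value for review; whether it is a genuine error still has to be confirmed against the source log.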

  9. QAQC WITH PYTHON
     VISUALIZATION
     o Plots can be generated for each dataset to explore the relationships in the data
     REPORT GENERATION
     o Summary of analyses and plots can be exported as a report

  10. CODE DEVELOPMENT & AUTOMATION Tools and methods in Python code development

  11. ADVANTAGES OF AUTOMATION
     o Significant reduction in processing time
     o Minimal calculation errors
     o Combination with more complex scripts and modelling
     o Cross-functional use of packaged scripts
     o Replicability of plots for annual data

  12. PYTHON LIBRARIES
     Pandas/NumPy
     o Data cleaning
     o Data manipulation
     SciPy
     o Scientific computing
     o Model development
     Matplotlib/Seaborn/Plotly
     o Visualization
     Keras/scikit-learn
     o Machine learning, e.g. validation of RQD and TCR from core photos
     Groundhog
     o Soil dynamics
     o Pile calculations and site investigations
     o Shallow foundations
     PandasTable
     o Viewing dataframes in Python GUIs

  13. DATA VISUALIZATION
     Sources: https://www.researchgate.net/figure/Correlation-matrix-for-all-the-geotechnical-parameters_fig2_374231477, https://www.geeksforgeeks.org/python-seaborn-pairplot-method/
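
The slide images are not part of the transcript, but the first source is a correlation matrix of geotechnical parameters. As a minimal sketch, the numbers behind such a plot can be computed with pandas alone. The column names and laboratory values below are invented for illustration.

```python
import pandas as pd

# Hypothetical laboratory results; values are illustrative only.
lab = pd.DataFrame({
    "density_g_cm3": [2.55, 2.61, 2.68, 2.70, 2.75],
    "ucs_mpa": [60.0, 75.0, 92.0, 101.0, 118.0],
    "porosity_pct": [9.5, 7.8, 5.1, 4.4, 2.9],
})

# Pairwise Pearson correlation between every numeric column.
corr = lab.corr(method="pearson")
print(corr.round(2))
```

If Seaborn is installed, `seaborn.heatmap(corr, annot=True)` renders this matrix as a heatmap and `seaborn.pairplot(lab)` draws the pairwise scatter plots.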

  14. NON-LINEAR MODELS
     Source: http://geologyandpython.com/hoek-brown.html
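
The slide content itself is not in the transcript, but the cited source concerns the Hoek-Brown failure criterion. As a hedged sketch of fitting such a non-linear model, the example below fits the intact-rock form (s = 1, a = 0.5) with `scipy.optimize.curve_fit`. The triaxial data are synthetic and noise-free, generated from assumed parameters (100 MPa and m_i = 10), so the fit should simply recover them. The starting guess and bounds are arbitrary choices.

```python
import numpy as np
from scipy.optimize import curve_fit


def hoek_brown_intact(sigma3, sigma_ci, m_i):
    """Major principal stress at failure for intact rock (Hoek-Brown with s = 1, a = 0.5)."""
    return sigma3 + sigma_ci * np.sqrt(m_i * sigma3 / sigma_ci + 1.0)


# Synthetic triaxial data (MPa) generated from sigma_ci = 100 and m_i = 10.
sigma3 = np.array([0.0, 5.0, 10.0, 20.0, 30.0, 40.0])
sigma1 = hoek_brown_intact(sigma3, 100.0, 10.0)

# Bounds keep both parameters positive so the square root stays real.
params, _ = curve_fit(
    hoek_brown_intact,
    sigma3,
    sigma1,
    p0=(50.0, 5.0),
    bounds=([1.0, 1.0], [500.0, 50.0]),
)
sigma_ci_fit, m_i_fit = params
print(f"sigma_ci = {sigma_ci_fit:.1f} MPa, m_i = {m_i_fit:.2f}")
```

With real laboratory data the residuals would not vanish, and the covariance matrix returned by `curve_fit` (discarded here) gives a first estimate of parameter uncertainty.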

  15. AUTOMATION REQUIRES ONGOING DEVELOPMENT
     o Ensure the workflows apply to our current needs
     o Workflows must be sufficiently documented
     o Scripts must be regularly reviewed and updated
     o Reduce the risk of inefficiency and bugs in the code

  16. GUI INTEGRATION Converting Python scripts into independent apps for distribution and use

  17. GUI INTEGRATIONS WITH PYTHON
     o Intuitive interfaces bridge the gap between the programming and geotechnical worlds
     o Shifts our focus away from user training and toward adoption of efficient workflows powered by robust and tested scripts
     o Data analysis results and insights from visualizations are easily accessible and presentable to stakeholders

  18. GUI LIBRARIES
     Tkinter/CustomTkinter
     o Modern interface to run scripts
     o Applicable for non-programmers
     Django/Flask
     o Web development
     o Real-time monitoring
     o Dashboards
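
To illustrate what "an interface to run scripts" might mean in practice, here is a minimal Tkinter sketch that wraps a small check behind a button. Everything in it is hypothetical: the function names `run_qaqc` and `build_app`, the window layout, and the choice of check (RQD percentages must lie between 0 and 100).

```python
def run_qaqc(raw: str) -> str:
    """Count comma-separated RQD values (%) that fall outside the valid 0-100 range."""
    values = [float(v) for v in raw.split(",") if v.strip()]
    bad = [v for v in values if not 0.0 <= v <= 100.0]
    return f"{len(values)} values checked, {len(bad)} outside 0-100 %"


def build_app():
    """Build a small window with an entry box, a button, and a result label."""
    # Imported here so run_qaqc can be reused on machines without a display.
    import tkinter as tk

    root = tk.Tk()
    root.title("RQD range check")
    entry = tk.Entry(root, width=40)
    entry.pack(padx=10, pady=5)
    result = tk.Label(root, text="Enter RQD values separated by commas")
    result.pack(padx=10, pady=5)

    def on_click():
        # Show a friendly message instead of a traceback on bad input.
        try:
            result.config(text=run_qaqc(entry.get()))
        except ValueError:
            result.config(text="Enter numbers separated by commas")

    tk.Button(root, text="Run check", command=on_click).pack(padx=10, pady=5)
    return root


print(run_qaqc("85, 102, 40, -5"))
```

Calling `build_app().mainloop()` opens the window. Keeping the check in a plain function separate from the widgets means the same logic can be tested and reused without the GUI.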

  19. COMPLEX MULTI-USE GUI
     Image Source: https://codereview.stackexchange.com/questions/240378/tkinter-gui-for-running-hplc-pumps-real-time-data-visualization

  20. DATA PRE-PROCESSING
     Template Generation
     o Variables in data collection
     o Generated from core logging/lab testing manuals
     o Serves as control for which types of data are processed
     o Run initial diagnostics in the data which can be cross-validated with output from analysis
     Data Transformations
     o Standardization
     o Text normalization
     o Validation of categorical variables with accepted codes in data entry (e.g. Lithology, ISRM Strength)
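
The template and transformation steps above might be combined as in the sketch below: a template of accepted codes controls which columns are checked, text is normalized, and any entry outside the accepted set is reported. The template contents, column names, and helper names are assumptions for illustration. A real template would be generated from the site's own logging manual.

```python
import pandas as pd

# Hypothetical template: accepted codes per column, as might be derived from a logging manual.
TEMPLATE = {
    "lithology": {"GRANITE", "BASALT", "SANDSTONE"},
    "isrm_strength": {"R0", "R1", "R2", "R3", "R4", "R5", "R6"},
}


def normalize_text(col: pd.Series) -> pd.Series:
    """Strip surrounding whitespace and upper-case every entry."""
    return col.astype(str).str.strip().str.upper()


def validate_against_template(df: pd.DataFrame, template: dict) -> pd.DataFrame:
    """Return one row per entry whose normalized value is not an accepted code."""
    issues = []
    for column, accepted in template.items():
        cleaned = normalize_text(df[column])
        rejected = cleaned[~cleaned.isin(sorted(accepted))]
        for idx, value in rejected.items():
            issues.append({"row": idx, "column": column, "value": value})
    return pd.DataFrame(issues, columns=["row", "column", "value"])


# Made-up log entries with inconsistent case, stray spaces, and two invalid codes.
log = pd.DataFrame({
    "lithology": [" granite", "Basalt", "sandston"],
    "isrm_strength": ["r4", "R7", "R2 "],
})
issues = validate_against_template(log, TEMPLATE)
print(issues)
```

Only columns named in the template are inspected, which mirrors the slide's point that the template serves as the control for which types of data are processed.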

  21. PYTHON-INTEGRATED QAQC WORKFLOW

  22. THANK YOU Any questions?
