SPARK Search Engine - PowerPoint PPT Presentation

Presentation Transcript

  1. SPARK Search Engine

  2. Who am I? Martijn Harthoorn. Programmer at Furore. Implementer of the search engine of SPARK. http://spark.furore.com/fhir/patient?... The work after the question mark.

  3. The place of Search (diagram labels): REST Service, Storage, Spark, MongoDB, Index & Search

  4. Search Paradigm: A FHIR client should be easy. The FHIR server needs to solve the complex issues. Search has some…

  5. Search: First there was Storage. Then there was Search.

  6. Connectathon: To test a client, you must have a tested server. To test a server, you must have a tested client. “One fool can ask more questions than seven wise men can answer.”

  7. Connectathon: “But what if you are wrong?”

  8. History • Version 1. • A generics-based implementation • On top of the FHIR data model. • Programmed per search parameter. • No metadata available yet. • No indexing. • Slow.

  9. History • Version 2. • Data model independent • Metadata not available; added manually • Lucene.NET as indexer (index in Lucene, database in Mongo) • Fast • Standardised all parameter specifics into standard “modifiers” • All code based on search parameter types • Joins are client side

  10. History • Version 3. • Modified to store the Lucene index in Mongo • Index storage unreliable • Never saw the light of day

  11. History • Version 4. CURRENT • Index stored in a dedicated Mongo collection • Builds an expression tree from the parameters • Chained parameters have full functionality (modifiers, operators) • Joins are client side

  12. Indexing: Why indexing?

  13. Why indexing? http://spark.furore.com/fhir/patient?provider.name:partial=Health

  14. Why indexing? http://spark.furore.com/fhir/patient?provider.name:partial=Health
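The search URL on this slide packs three things into one parameter key: a chain (`provider.name`), a modifier (`partial`) and a value (`Health`). The following is a minimal Python sketch of pulling them apart. `Criterion` and `parse_search` are hypothetical names for illustration, not Spark's API, and the sketch assumes the modifier only appears at the end of the key.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlsplit, parse_qsl


@dataclass
class Criterion:
    chain: List[str]          # e.g. ["provider", "name"]
    modifier: Optional[str]   # e.g. "partial", or None when absent
    value: str                # e.g. "Health"


def parse_search(url: str) -> List[Criterion]:
    """Split every query parameter into chain, modifier and value."""
    criteria = []
    for key, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        # Assumption: the modifier, if any, follows the last ':' free path.
        path, _, modifier = key.partition(":")
        criteria.append(Criterion(path.split("."), modifier or None, value))
    return criteria
```

Answering such a criterion without an index would mean scanning every patient and every referenced provider, which is the motivation for the indexing steps that follow.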

  15. Indexing. HOW-TO • Harvest the resource • Determine the data type • Groom your data • Store the data in the index. You DO want the data de-serialized to an object with all values strongly typed. You DON’T want to spend time analyzing and interpreting JSON and/or XML.

  16. Indexing. 1. Harvesting • Resource: Patient • Search parameter: family • Searches for the family name and prefix of every HumanName that is registered with a Patient. • Usage: http://spark.furore.com/fhir/patient?family=White

  17. Indexing. 1. Harvesting • Using the Visitor pattern • Paths from metadata: "patient.Name.Prefix", "patient.Name.Family" • Resource: Patient • Search parameter: family • Diagram labels: Patient; List<Name>; Name (HumanName) with Given, Prefix, Family, Suffix
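Harvesting along a metadata path can be sketched as follows. This is a plain path walk standing in for a full Visitor implementation, written in Python rather than the engine's own language. The `Patient` and `HumanName` classes are simplified stand-ins for the de-serialized FHIR model, and `harvest` is a hypothetical helper name.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HumanName:
    Family: List[str] = field(default_factory=list)
    Prefix: List[str] = field(default_factory=list)


@dataclass
class Patient:
    Name: List[HumanName] = field(default_factory=list)


def harvest(resource, path: str) -> list:
    """Collect every value reachable via a dotted path, fanning out over lists."""
    nodes = [resource]
    # The first segment names the resource itself, so skip it.
    for segment in path.split(".")[1:]:
        found = []
        for node in nodes:
            child = getattr(node, segment, None)
            if child is None:
                continue
            found.extend(child if isinstance(child, list) else [child])
        nodes = found
    return nodes
```

Because the walk works on typed objects and attribute names, it never has to interpret raw JSON or XML.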

  18. Indexing. 2. Determine data type • > patient (Patient) • > Name (HumanName) • > LastName (string) • Data type: string • Search parameter type: string • Selected indexing method: • Single value – as string • More values – as string array

  19. Indexing. 2. Determine data type • > patient (Patient) • > Gender (Coding) • > Coding (List<Coding>) • > Code (CodeableConcept) • Data type: Code • Search parameter type: Token • Selected indexing method: • Store each codeable concept in an array • System (uri) • Code (string) • Display (string)
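The two indexing methods above (strings stored as a single value or an array, tokens stored as an array of system/code/display entries) can be sketched as a type dispatch. This is an illustrative Python sketch: `Coding` is a simplified stand-in class and `to_index_entry` is a hypothetical function name.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Coding:
    system: str
    code: str
    display: str


def to_index_entry(values: List[Union[str, Coding]]):
    """Pick the stored form based on the type of the harvested values."""
    if not values:
        return None
    if all(isinstance(v, str) for v in values):
        # Single value: as string. More values: as string array.
        return values[0] if len(values) == 1 else list(values)
    if all(isinstance(v, Coding) for v in values):
        # Token: one entry per coding, keeping system, code and display.
        return [{"system": v.system, "code": v.code, "display": v.display}
                for v in values]
    raise TypeError("mixed or unsupported value types")
```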

  20. Indexing. 3. Groom your data • Remove dashes, dots, slashes from dates etc. • If you implement a like search from the left side, you might want to split names at the dash into multiple hits.
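Both grooming rules are small string transformations. A minimal Python sketch follows; `groom_date` and `groom_name` are hypothetical names, and keeping the full hyphenated name next to its parts is an assumption made here, not something the slide states.

```python
import re
from typing import List


def groom_date(text: str) -> str:
    """Strip dashes, dots and slashes so date strings compare uniformly."""
    return re.sub(r"[-./]", "", text)


def groom_name(name: str) -> List[str]:
    """Split a hyphenated name so a left-anchored search can hit each part."""
    parts = [p for p in name.split("-") if p]
    # Assumption: keep the original name as a hit alongside its parts.
    return [name] + parts if len(parts) > 1 else [name]
```

With this, a left-side search for "Jones" can match a patient registered as "Smith-Jones".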

  21. Indexing. 4. Store in the index • Level: the patient is not a contained resource (level 0) • Family: in Mongo you can store an array that can be searched like a normal string.
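An index entry for the patient above might look like the dictionary built below. The Python sketch also emulates, in memory, the MongoDB matching rule the slide relies on: an equality filter on a field matches when the field equals the value or is an array containing it. The key names `id`, `level` and `family` and both function names are assumptions for illustration.

```python
def build_index_entry(resource_id: str, family_names, level: int = 0) -> dict:
    """Build one index document; level 0 means not a contained resource."""
    return {"id": resource_id, "level": level, "family": list(family_names)}


def matches(entry: dict, field: str, value) -> bool:
    """Emulate MongoDB equality matching on a scalar or array field."""
    stored = entry.get(field)
    if isinstance(stored, list):
        return value in stored
    return stored == value

# With pymongo, the equivalent filter would be passed to
# collection.find({"family": "White"}) on the index collection.
```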

  22. Future • Version 5. NEXT • All parameters based on FHIR data types? • Joins using Mongo Map-Reduce?

  23. Complexity: So what is the issue?

  24. Complexity • Include & chained parameters • Joining over references returns multiple resource types • Joins are done client side (not in the Mongo database)
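A client-side join for a chained parameter such as `provider.name:partial=Health` can be sketched in two steps: first find the referenced resources, then keep the resources that point at them. The Python sketch below works on plain lists of index entries. The entry shape, the `chained_search` name and the reading of `partial` as a case-insensitive substring match are all assumptions.

```python
def chained_search(patients, organizations, name_part: str) -> list:
    """Resolve a provider.name chain outside the database, in two passes."""
    wanted = name_part.lower()
    # Pass 1: ids of organizations whose name matches.
    org_ids = {o["id"] for o in organizations if wanted in o["name"].lower()}
    # Pass 2: patients whose provider reference points at one of those ids.
    return [p["id"] for p in patients if p.get("provider") in org_ids]
```

Each pass is a separate query result held in application memory, which is why the cost grows with the size of the intermediate set.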

  25. Complexity • Transactions • FHIR has bulk POST • Split between Indexing and storage

  26. Complexity • Multiple types • Some properties do not have a fixed type. • Example: observation.value • Can be a: • CodeableConcept • String • Quantity (number + unit)
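One way to handle a property without a fixed type is to tag each index entry with the type that was actually found. The Python sketch below shows this for the three cases listed. The classes are simplified stand-ins, and `index_observation_value` and the entry layout are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Quantity:
    value: float
    unit: str


@dataclass
class CodeableConcept:
    system: str
    code: str


def index_observation_value(value: Union[str, Quantity, CodeableConcept]) -> dict:
    """Return an index entry tagged with the runtime type of the value."""
    if isinstance(value, str):
        return {"type": "string", "value": value}
    if isinstance(value, Quantity):
        return {"type": "quantity", "value": value.value, "unit": value.unit}
    if isinstance(value, CodeableConcept):
        return {"type": "token", "system": value.system, "code": value.code}
    raise TypeError(f"unsupported value type: {type(value).__name__}")
```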