Crawling, Parsing and Semantic Matching of Vacancies and CVs. Semantic Recruitment Technology. Jakub Zavrel, Textkernel. InGRID Workshop, 11-2-2014. Textkernel: Spinoff from R&D in machine learning and language technology

Presentation Transcript


  1. Crawling, Parsing and Semantic Matching of Vacancies and CVs. Semantic Recruitment Technology. Jakub Zavrel, Textkernel. InGRID Workshop, 11-2-2014

  2. Textkernel:
     • Spinoff from R&D in machine learning and language technology
     • Founded 2001; offices in Amsterdam (HQ), Frankfurt, Paris; 45 employees; strong R&D focus
     • Deloitte Fast 50 2007, 2010; 30% YoY growth
     • Core technology: understanding unstructured text data. Multi-lingual
     Market:
     • Job boards, recruitment software, staffing and recruitment, mobility, large employers
     • Products:
       • Multi-lingual tools (15 languages) to extract CVs and jobs
       • Jobfeed: largest real-time DB for job market analysis
       • Search! & Match! to connect people and jobs
     • Customers: UWV, Pole Emploi, Adecco, Randstad, USG, Monster, Stepstone, XING, SAP, Unisys, Bosch, Axa, Philips, etc. (350 direct, 2000+ indirect)
     • Large partner network (HR & recruitment software)

  3. Language gap
     Employees:
     • "I like programming, but I'm interested to take on more project management responsibility."
     • "Is there a job in our organisation that better fits my degree?"
     • "I'd like to work on our mobile strategy. I've helped a friend develop a mobile app."
     • "I'd like to do more with my organisational talent."
     Job ad: We are looking to hire an experienced tech team lead
     • The ideal candidate has:
       • min. 5 yr of experience
       • Certified scrum master
       • Exp. w/ iOS, Android
     Completed academic studies: Computer Science or related
     30% travel for customer presentations

  4. The job ad searches directly in a database and identifies relevant candidates (or vice versa) …

  5. Extract! CV/Job Parsing: automatically convert each document into a complete record

  6. Extract!

  7. Extract!

  8. Extract!

  9. Extract!

  10. Extract! – Zero data entry job application

  11. Extract!

  12. Extract!
      • Time savings coding CVs and jobs
      • If you accept noise, 100% time savings
      • Structured data allows better search: semantic searching and matching
      • Coding enables reporting and statistics

  13. Occupation coding!
      • Coding follows extraction
      • Customer-specific or standard taxonomies
      • String-similarity-based normalization
      • Lots of synonyms per language
      • Distance = confidence
      • Problem cases: ambiguity, context, long tail
      • More complex models can help (classifiers, multivariate models)
      • Semantic matching works better (occupation coding errors are counterbalanced by other variables)
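
The string-similarity step on this slide can be sketched roughly as follows. This is a minimal illustration, not Textkernel's method: the taxonomy, the occupation codes, the `threshold` value and the function names are all hypothetical, and Python's standard `difflib.SequenceMatcher` ratio stands in for whatever distance measure is actually used. The similarity of the best-matching synonym is returned as the confidence.

```python
from difflib import SequenceMatcher

# Hypothetical toy taxonomy: occupation code -> known synonyms.
TAXONOMY = {
    "ADMIN_ASSISTANT": [
        "administrative assistant",
        "administrative employee",
        "office support",
    ],
    "JAVA_DEVELOPER": [
        "java developer",
        "java programmer",
        "java software engineer",
    ],
}


def normalize(text):
    """Lowercase and collapse whitespace before comparing strings."""
    return " ".join(text.lower().split())


def code_occupation(job_title, taxonomy=TAXONOMY, threshold=0.8):
    """Return (code, confidence) for the closest synonym.

    The code is None when the best similarity is below the
    (assumed) threshold, so the title stays uncoded.
    """
    query = normalize(job_title)
    best_code, best_score = None, 0.0
    for code, synonyms in taxonomy.items():
        for synonym in synonyms:
            score = SequenceMatcher(None, query, normalize(synonym)).ratio()
            if score > best_score:
                best_code, best_score = code, score
    if best_score < threshold:
        return None, best_score
    return best_code, best_score


print(code_occupation("Java  Develper"))  # misspelled title still matches
print(code_occupation("nurse"))           # no close synonym: code is None
```

A pure string measure like this cannot resolve the problem cases the slide lists (ambiguity, context, long tail), which is where the classifiers it mentions would come in.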

  14. Search! • Semantic search: "Lets you find what you mean, not what you type" Impression...

  15. Match! • Match!

  16. Semantic Matching Technology:
      • Natural Language Processing
      • Machine Learning
      • Semantic Analysis
      • Probabilistic Language Model
      • Search Engine
      • Multi-lingual taxonomies
      • Recruitment knowledge-bases

  17. Demo

  18. Jobfeed: search and analyse real-time online job ads as well as historical data

  19. Jobfeed

  20. Jobfeed! Knowledge of all demand for labour in European job market
      • Sales leads for recruitment and staffing companies
      • Real-time labour market analytics tools
      • Largest database of jobs for matching unemployed
      • Perfect data source for text mining

  21. Jobfeed!
      • Real-time collection of online job ads from any (unstructured) source
      • Available in NL, DE, FR, IT
      • Gradually rolling out in rest of Europe
      • Richly semantically structured data

  22. Jobfeed!

  23. Jobfeed: Multilingual Occupation Taxonomy
      • Occupations: >4000 codes
      • 4 languages
      • 3-layer hierarchy
      • >50K synonyms
      • Link to other concepts:
        - Skills
        - Education level
        - Sector
        - O*NET
      • UWV (Dutch Employment Agency)
      • ROME
      Example:
      NL: administratief medewerker, EN: administrative assistant, FR: employé administratif, DE: Verwaltungsassistent (m/w)
      Group: administrative personnel
      Class: Administration and Customer Service
      Synonyms: administrative employee, assistant clerk, office support
      Skills: ms office, excel, english language, etc.
      O*NET: 43-9199.00: Office and Administrative Support Workers, All Other
      UWV: 1000402563: Administratief medewerker secretariaat
      Based on millions of jobs, years of customer feedback and experience!
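
One way to picture a single taxonomy entry is as a small record. The sketch below only restates the slide's example values; the class name `OccupationEntry`, its field names and the `all_names` helper are assumptions made for illustration and say nothing about how Jobfeed actually stores its taxonomy.

```python
from dataclasses import dataclass, field


@dataclass
class OccupationEntry:
    """Hypothetical record for one occupation in a multilingual taxonomy."""
    labels: dict            # language code -> preferred label
    group: str              # middle layer of the hierarchy
    occupation_class: str   # top layer of the hierarchy
    synonyms: list = field(default_factory=list)
    skills: list = field(default_factory=list)
    external_codes: dict = field(default_factory=dict)  # scheme -> (code, label)

    def all_names(self):
        """Every surface form that should map to this entry, lowercased."""
        return {name.lower() for name in list(self.labels.values()) + self.synonyms}


# Values copied from the slide's example.
entry = OccupationEntry(
    labels={
        "NL": "administratief medewerker",
        "EN": "administrative assistant",
        "FR": "employé administratif",
        "DE": "Verwaltungsassistent (m/w)",
    },
    group="administrative personnel",
    occupation_class="Administration and Customer Service",
    synonyms=["administrative employee", "assistant clerk", "office support"],
    skills=["ms office", "excel", "english language"],
    external_codes={
        "O*NET": ("43-9199.00", "Office and Administrative Support Workers, All Other"),
        "UWV": ("1000402563", "Administratief medewerker secretariaat"),
    },
)

print(sorted(entry.all_names()))
```

Keeping the labels and synonyms in one lookup set is what lets a job title in any of the four languages resolve to the same occupation code.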

  24. Demo

  25. Jobfeed as material for Research

  26. Frequent words for "Java developer": Je op is voor te ervaring aan als and software en van de een je met in het Java of om team zijn kennis bij Ervaring die the naar a jaar jij bent Developer HBO hebt to werken werk

  27. Frequent words for all professions: voor te is of zijn aan bent naar bij om en van de een in het je met op Je als ervaring die Het hebt deze werken zoek De wij functie onze ben tot over werk opleiding uit and werkzaamheden dat binnen u Als Voor zelfstandig kennis ook s verantwoordelijk

  28. Solution: contrast frequencies
      Observed frequency of w: O(w) = A
      Expected frequency of w: E(w) = C * B / D
      Pick words with highest score: score(w) = (O(w) - E(w))^2 / E(w)
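
The scoring on this slide can be tried out with a few lines of Python. The transcript does not define A, B, C and D (they were presumably shown in a table on the slide), so the sketch assumes the usual reading: A is the count of w in the job ads for one occupation, B its count in all job ads, C the number of tokens in the occupation's ads and D the number of tokens in all ads. The function names and the toy data are invented for illustration.

```python
from collections import Counter


def contrast_scores(target_tokens, all_tokens):
    """Score words by (O - E)^2 / E, as on the slide.

    Assumed meaning of the slide's symbols (not defined in the transcript):
      A = count of the word in the target job ads
      B = count of the word in all job ads (target ads included)
      C = total tokens in the target job ads
      D = total tokens in all job ads
    """
    target_counts = Counter(target_tokens)
    all_counts = Counter(all_tokens)
    c = len(target_tokens)
    d = len(all_tokens)

    scores = {}
    for word, observed in target_counts.items():   # observed = A
        expected = c * all_counts[word] / d         # E(w) = C * B / D
        if expected > 0:                            # skip words missing from all_tokens
            scores[word] = (observed - expected) ** 2 / expected
    return scores


def top_words(scores, n=10):
    """Return the n highest-scoring words, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Toy data, invented for illustration only.
java_ads = ["java", "java", "de", "spring"]
all_ads = java_ads + ["de", "de", "de", "verpleegkundige"]
scores = contrast_scores(java_ads, all_ads)
print(top_words(scores, n=1))  # ['java']
```

Note that the score is symmetric: a word that is strongly under-represented in the target ads also scores high, which may be why a few function words still show up among the top words.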

  29. Top words for "Java developer": java developer software spring scrum agile hibernate ontwikkelaar u j2ee wij xml jee o javascript you kennis ontwikkelen oracle ontwikkeling development maven applicaties ervaring web de frameworks jboss mbo senior architectuur webservices informatica werkzaamheden technologie developers eclipse bezit het team wo rijbewijs technieken tomcat the vca zelfstandig architect werklocatie html
      Building rich skills profiles for thousands of occupations from millions of real-time jobs … new trends and occupations …

  30. Supply & Demand
      • Have: lots of data, technology, ideas
      • Want: labor market expertise, students, research

  31. Semantic Recruitment Technology. Thanks!
