Not just another Reaction Database

Not just another Reaction Database. Aileen Day 1 , Valery Tkachenko 1 , Alexey Pshenichnov 1 , Leah McEwen 2 , Simon Coles 3 , Richard Whitby 3. 1 Data Science, Royal Society of Chemistry 2 Physical Sciences Library, Cornell University


Presentation Transcript


  1. Not just another Reaction Database Aileen Day¹, Valery Tkachenko¹, Alexey Pshenichnov¹, Leah McEwen², Simon Coles³, Richard Whitby³ ¹Data Science, Royal Society of Chemistry ²Physical Sciences Library, Cornell University ³Department of Chemistry, University of Southampton

  2. RSC Archive – 480,000+ articles

  3. Digitally Enabling RSC Archive

  4. Article X-ray Compounds Reaction Analytical Data Text and References

  5. RSC data repository Properties Reactions Data sources Compounds The RSC data repository is under development, and is intended to contain chemical data which supports its publications. A first version has been written which captures compounds, data sources and properties domains. Reactions are next…

  6. RSC data repository – reactions There are already many reaction databases, several of them well established and holding many reactions. This reactions database aims to capture reactions: • in sufficient detail for someone else to reproduce • in ways analogous to those captured in an Electronic Lab Notebook • with fully recorded processes, parameters and equipment in S88 process recipe [1] style • with raw characterization data linked to products • which gave low yields or unintended products • multistep reactions • to fully record all reaction products (not just the target product) Guided by the aims of Dial-a-Molecule (slide labels: Detail, Scope)

  7. Dial-a-Molecule aim

  8. Dial-a-Molecule Roadmap To provide this In such a way that others can do this kind of analysis For these to be a potential source

  9. RSC data repository – reactions domain Reactions Substances Procedures Equipment Steps Reaction runs Compounds Parameters Solutions Mixtures Samples

  10. Reaction examples Reaction 1: Example of reaction text-mined from RSC archive by NextMove with S88-style procedure Reaction 2: Example from Will Dichtel’s research group via Leah McEwen (ELN-style reaction)

  11. Reaction 1: NextMove reaction text-mined from RSC archive – original article

  12. Reaction 1: NextMove reaction text-mined from RSC archive – CML output <?xml version="1.0" encoding="UTF-8"?> <reactionList xmlns="http://www.xml-cml.org/schema" xmlns:cmlDict="http://www.xml-cml.org/dictionary/cml/" xmlns:nameDict="http://www.xml-cml.org/dictionary/cml/name/" xmlns:unit="http://www.xml-cml.org/unit/" xmlns:cml="http://www.xml-cml.org/schema" xmlns:dl="http://bitbucket.org/dan2097"> <reaction> <dl:source> <dl:documentId>c3ra45871g</dl:documentId> <dl:paragraphText>Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C. The reaction mixture was stirred at −78 °C for another 2 h, warmed up to rt, quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL), concentrated. The residue was added with water (10 mL) and extracted with dichloromethane (12 mL × 3). The organic layers were combined, dried over Na2SO4, filtered and concentrated. The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid. [α]D20 −24.2 (c 1.1, CHCl3); 1H NMR (CDCl3, 300 MHz) δ 0.04 (s, 3H), 0.07 (s, 3H), 0.85 (s, 9H), 1.34 (s, 3H), 1.44 (s, 3H), 2.16 (br, 1H), 3.68–3.81 (m, 3H), 4.16 (t, J = 13.8 Hz, J = 13.8 Hz, 1H), 4.59 (t, J = 6.6 Hz, J = 6.6 Hz, 1H), 5.22 (d, J = 10.7 Hz, 1H), 5.34 (d, J = 17.1 Hz, 1H), 5.90 (ddd, J = 7.2 Hz, J = 10.2 Hz, J = 17.2 Hz, 1H); 13C NMR (CDCl3, 75 MHz) δ 134.1, 118.4, 108.5, 79.5, 78.8, 70.8, 65.0, 27.8, 25.9, 25.4, 18.1, −3.7, −4.4. 
HRMS (ESI) calcd for [M + Na]+ (C15H30O4SiNa) 325.1811, found 325.1807.</dl:paragraphText> </dl:source> <dl:reactionSmiles>[H-].C([Al+]CC(C)C)C(C)C.C([O:17][CH2:18][C@@H:19]([O:29][Si:30]([C:33]([CH3:36])([CH3:35])[CH3:34])([CH3:32])[CH3:31])[C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)(=O)C(C)(C)C&gt;ClCCl&gt;[C:33]([Si:30]([CH3:32])([CH3:31])[O:29][C@@H:19]([C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)[CH2:18][OH:17])([CH3:36])([CH3:35])[CH3:34] |f:0.1|</dl:reactionSmiles> <productList> <product role="product"> <molecule id="m0"> <name dictRef="nameDict:unknown">10</name> <dl:nameResolved>(R)-2-((tert-Butyldimethylsilyl)oxy)-2-((4S,5S)-2,2-dimethyl-5-vinyl-1,3-dioxolan-4-yl)ethanol</dl:nameResolved> </molecule> <amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00102">1.02 mmol</amount> <amount dl:propertyType="MASS" dl:normalizedValue="0.308">308 mg</amount> <amount dl:propertyType="PERCENTYIELD" dl:normalizedValue="79">79%</amount> <amount dl:propertyType="CALCULATEDPERCENTYIELD" dl:normalizedValue="79.1" units="unit:percentYield">79.1</amount> <identifier dictRef="cml:smiles" value="C(C)(C)(C)[Si](O[C@H](CO)[C@H]1OC(O[C@H]1C=C)(C)C)(C)C"/> <identifier dictRef="cml:inchi" value="InChI=1S/C15H30O4Si/c1-9-11-13(18-15(5,6)17-11)12(10-16)19-20(7,8)14(2,3)4/h9,11-13,16H,1,10H2,2-8H3/t11-,12+,13-/m0/s1"/> <dl:entityType>definiteReference</dl:entityType> <dl:appearance>colourless</dl:appearance> <dl:state>liquid</dl:state> </product> </productList> <reactantList> <reactant role="reactant"> <molecule id="m1"> <name dictRef="nameDict:unknown">Diisobutylaluminium hydride</name> </molecule> <amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00323">3.23 mmol</amount> <amount dl:propertyType="MOLARITY" dl:normalizedValue="1.1">1.1 M</amount> <amount dl:propertyType="VOLUME" dl:normalizedValue="0.00293">2.93 mL</amount> <identifier dictRef="cml:smiles" value="[H-].C(C(C)C)[Al+]CC(C)C"/> <identifier 
dictRef="cml:inchi" value="InChI=1S/2C4H9.Al.H/c2*1-4(2)3;;/h2*4H,1H2,2-3H3;;/q;;+1;-1"/> <dl:entityType>exact</dl:entityType> </reactant> <reactant role="reactant" count="1"> <molecule id="m2"> <name dictRef="nameDict:unknown">9</name> <dl:nameResolved>(R)-2-((tert-Butyldimethylsilyl)oxy)-2-((4S,5S)-2,2-dimethyl-5-vinyl-1,3-dioxolan-4-yl)ethyl pivalate</dl:nameResolved> </molecule> <amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00129">1.29 mmol</amount> <amount dl:propertyType="MASS" dl:normalizedValue="0.500">500 mg</amount> <identifier dictRef="cml:smiles" value="C(C(C)(C)C)(=O)OC[C@H]([C@H]1OC(O[C@H]1C=C)(C)C)O[Si](C)(C)C(C)(C)C"/> <identifier dictRef="cml:inchi" value="InChI=1S/C20H38O5Si/c1-12-14-16(24-20(8,9)23-14)15(13-22-17(21)18(2,3)4)25-26(10,11)19(5,6)7/h12,14-16H,1,13H2,2-11H3/t14-,15+,16-/m0/s1"/> <dl:entityType>definiteReference</dl:entityType> </reactant> </reactantList> <spectatorList> <spectator role="solvent"> <molecule id="m3"> <name dictRef="nameDict:unknown">dichloromethane</name> </molecule> <amount dl:propertyType="VOLUME" dl:normalizedValue="0.020">20 mL</amount> <identifier dictRef="cml:smiles" value="ClCCl"/> <identifier dictRef="cml:inchi" value="InChI=1S/CH2Cl2/c2-1-3/h1H2"/> <dl:entityType>exact</dl:entityType> </spectator> </spectatorList> <dl:reactionActionList> <dl:reactionAction action="Add"> <dl:phraseText>Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C</dl:phraseText> <dl:chemical ref="m1"/> <dl:chemical ref="m2"/> <dl:chemical ref="m3"/> <dl:parameter propertyType="Temperature" normalizedValue="-78">-78 °C.</dl:parameter> </dl:reactionAction> <dl:reactionAction action="Stir"> <dl:phraseText>The reaction mixture was stirred at −78 °C for another 2 h</dl:phraseText> <dl:parameter propertyType="Time" normalizedValue="7200">2 h</dl:parameter> <dl:parameter propertyType="Temperature" normalizedValue="-78">-78 
°C</dl:parameter> </dl:reactionAction> <dl:reactionAction action="Heat"> <dl:phraseText>warmed up to rt</dl:phraseText> <dl:parameter propertyType="Temperature" normalizedValue="room temperature">rt</dl:parameter> </dl:reactionAction> <dl:reactionAction action="Quench"> <dl:phraseText>quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL)</dl:phraseText> <chemical> <molecule id="m4"> <name dictRef="nameDict:unknown">methanol</name> </molecule> <amount dl:propertyType="VOLUME" dl:normalizedValue="0.003">3 mL</amount> <identifier dictRef="cml:smiles" value="CO"/> <identifier dictRef="cml:inchi" value="InChI=1S/CH4O/c1-2/h2H,1H3"/> <dl:entityType>exact</dl:entityType> </chemical> <chemical> <molecule id="m5"> <name dictRef="nameDict:unknown">citric acid</name> </molecule> <amount dl:propertyType="VOLUME" dl:normalizedValue="0.005">5 mL</amount> <identifier dictRef="cml:smiles" value="C(CC(O)(C(=O)O)CC(=O)O)(=O)O"/> <identifier dictRef="cml:inchi" value="InChI=1S/C6H8O7/c7-3(8)1-6(13,5(11)12)2-4(9)10/h13H,1-2H2,(H,7,8)(H,9,10)(H,11,12)"/> <dl:entityType>exact</dl:entityType> </chemical> </dl:reactionAction> <dl:reactionAction action="Concentrate"> <dl:phraseText>concentrated</dl:phraseText> </dl:reactionAction> <dl:reactionAction action="Add"> <dl:phraseText>The residue was added with water (10 mL)</dl:phraseText> <chemical> <molecule id="m6"> <name dictRef="nameDict:unknown">water</name> </molecule> <amount dl:propertyType="VOLUME" dl:normalizedValue="0.010">10 mL</amount> <identifier dictRef="cml:smiles" value="O"/> <identifier dictRef="cml:inchi" value="InChI=1S/H2O/h1H2"/> <dl:entityType>exact</dl:entityType> </chemical> </dl:reactionAction> <dl:reactionAction action="Extract"> <dl:phraseText>extracted with dichloromethane (12 mL × 3)</dl:phraseText> <chemical> <molecule id="m7"> <name dictRef="nameDict:unknown">dichloromethane</name> </molecule> <amount dl:propertyType="VOLUME" dl:normalizedValue="0.012">12 mL</amount> <identifier dictRef="cml:smiles" 
value="ClCCl"/> <identifier dictRef="cml:inchi" value="InChI=1S/CH2Cl2/c2-1-3/h1H2"/> <dl:entityType>exact</dl:entityType> </chemical> </dl:reactionAction> <dl:reactionAction action="Dry"> <dl:phraseText>dried over Na2SO4</dl:phraseText> <chemical> <molecule id="m8"> <name dictRef="nameDict:unknown">Na2SO4</name> </molecule> <identifier dictRef="cml:smiles" value="[Na+].[Na+].[O-]S(=O)(=O)[O-]"/> <identifier dictRef="cml:inchi" value="InChI=1S/2Na.H2O4S/c;;1-5(2,3)4/h;;(H2,1,2,3,4)/q2*+1;/p-2"/> <dl:entityType>exact</dl:entityType> </chemical> </dl:reactionAction> <dl:reactionAction action="Filter"> <dl:phraseText>filtered</dl:phraseText> </dl:reactionAction> <dl:reactionAction action="Concentrate"> <dl:phraseText>concentrated</dl:phraseText> </dl:reactionAction> <dl:reactionAction action="Purify"> <dl:phraseText>The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33)</dl:phraseText> <chemical> <molecule id="m9"> <name dictRef="nameDict:unknown">crude product</name> </molecule> <dl:entityType>definiteReference</dl:entityType> </chemical> <chemical> <molecule id="m10"> <name dictRef="nameDict:unknown">SiO2</name> </molecule> <dl:entityType>falsePositive</dl:entityType> </chemical> <chemical> <molecule id="m11"> <name dictRef="nameDict:unknown">EtOAc-hexanes</name> </molecule> <dl:entityType>exact</dl:entityType> </chemical> </dl:reactionAction> <dl:reactionAction action="Yield"> <dl:phraseText>to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid</dl:phraseText> <dl:chemical ref="m0"/> </dl:reactionAction> </dl:reactionActionList> </reaction> </reactionList>
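
The procedure steps in the CML output above can be read back with a standard XML parser. The following is a minimal sketch, assuming Python's standard-library `xml.etree.ElementTree`; the embedded `SAMPLE` is an abbreviated, whitespace-normalised fragment of the output above, and the function name `list_actions` is hypothetical rather than part of any NextMove or RSC tool.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the CML output above.
NS = {
    "cml": "http://www.xml-cml.org/schema",
    "dl": "http://bitbucket.org/dan2097",
}

# Abbreviated fragment of the output above (two actions only).
SAMPLE = """<reactionList xmlns="http://www.xml-cml.org/schema"
    xmlns:dl="http://bitbucket.org/dan2097">
  <reaction>
    <dl:reactionActionList>
      <dl:reactionAction action="Stir">
        <dl:phraseText>The reaction mixture was stirred for another 2 h</dl:phraseText>
        <dl:parameter propertyType="Time" normalizedValue="7200">2 h</dl:parameter>
        <dl:parameter propertyType="Temperature" normalizedValue="-78">-78 C</dl:parameter>
      </dl:reactionAction>
      <dl:reactionAction action="Heat">
        <dl:phraseText>warmed up to rt</dl:phraseText>
        <dl:parameter propertyType="Temperature" normalizedValue="room temperature">rt</dl:parameter>
      </dl:reactionAction>
    </dl:reactionActionList>
  </reaction>
</reactionList>"""


def list_actions(cml_text):
    """Return one dict per dl:reactionAction, in document order."""
    root = ET.fromstring(cml_text)
    actions = []
    for node in root.iterfind(".//dl:reactionAction", NS):
        # The parameter attributes are unprefixed in the source output.
        params = {
            p.get("propertyType"): p.get("normalizedValue")
            for p in node.findall("dl:parameter", NS)
        }
        actions.append({
            "action": node.get("action"),
            "text": node.findtext("dl:phraseText", default="", namespaces=NS),
            "parameters": params,
        })
    return actions


for step in list_actions(SAMPLE):
    print(step["action"], step["parameters"])
```

Note that normalised values arrive as strings, and not all of them are numeric ("room temperature" above), so any numeric conversion would need to be tolerant of that.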

  13. Reactions properties Reactions Substances Procedures Equipment Steps Reaction runs Compounds • Reaction is defined by: • Reaction SMILES from text-mining output • NextMove’s NameRXN program categorises reaction by: • Named Reaction ontology ID and name [1] • Reaction Class and name [2] Parameters Solutions Mixtures Samples [1] https://github.com/rsc-ontologies/rxno [2] Carey, Laffan, Thomson and Williams hierarchy: DOI: 10.1039/B602413K
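
Reaction class codes in this kind of hierarchy are written as dotted numbers (for example a code of the form 9.7.61). A minimal sketch of how such a code could be expanded into its chain of parent classes is given here; it assumes that each dotted prefix names a parent class, and `class_path` is a hypothetical helper, not a NameRXN function.

```python
def class_path(code):
    """Expand a dotted class code into the list of its ancestor codes.

    Assumption: every dotted prefix of a code identifies a parent class.
    """
    parts = code.split(".")
    if not all(p.isdigit() for p in parts):
        raise ValueError(f"not a dotted numeric code: {code!r}")
    return [".".join(parts[: i + 1]) for i in range(len(parts))]


print(class_path("9.7.61"))
```

Storing the expanded path alongside the code would let a query for a whole class also return every named reaction beneath it.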

  14. Reaction 1: Reaction Reactions • Reaction SMILES: [H-].C([Al+]CC(C)C)C(C)C.C([O:17][CH2:18][C@@H:19]([O:29][Si:30]([C:33]([CH3:36])([CH3:35])[CH3:34])([CH3:32])[CH3:31])[C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)(=O)C(C)(C)C>ClCCl>[C:33]([Si:30]([CH3:32])([CH3:31])[O:29][C@@H:19]([C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)[CH2:18][OH:17])([CH3:36])([CH3:35])[CH3:34] |f:0.1| • Reaction Class: “9.7 Other functional group interconversion” • Other Named Reaction: “9.7.61 Ester hydrolysis” From NextMove’s NameRXN reaction output (software source should be linked from Properties database). As well as reaction SMILES we can store Reaction RXN, RD and ChemDraw files.
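
A reaction SMILES has the form reactants>agents>products, with components separated by dots. The sketch below splits the string from this slide using plain string handling, with no cheminformatics toolkit. Reading the trailing `|f:0.1|` block as a list of component indices that belong together is an assumption about that extension, and both function names are hypothetical.

```python
import re

# Reaction SMILES for Reaction 1, copied from the slide above.
RXN = (
    "[H-].C([Al+]CC(C)C)C(C)C."
    "C([O:17][CH2:18][C@@H:19]([O:29][Si:30]([C:33]([CH3:36])([CH3:35])[CH3:34])"
    "([CH3:32])[CH3:31])[C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])"
    "([CH3:27])[O:21]1)(=O)C(C)(C)C"
    ">ClCCl>"
    "[C:33]([Si:30]([CH3:32])([CH3:31])[O:29][C@@H:19]([C@@H:20]1[C@H:24]"
    "([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)[CH2:18][OH:17])"
    "([CH3:36])([CH3:35])[CH3:34]"
    " |f:0.1|"
)


def fragment_groups(extension):
    """Parse an 'f:' block such as '|f:0.1|' into lists of component indices."""
    match = re.search(r"f:([0-9.,]+)", extension)
    if not match:
        return []
    return [
        [int(i) for i in group.split(".") if i]
        for group in match.group(1).split(",")
        if group
    ]


def split_reaction_smiles(rxn):
    """Split 'reactants>agents>products |ext|' into its parts."""
    core, _, extension = rxn.strip().partition(" ")
    fields = core.split(">")
    if len(fields) != 3:
        raise ValueError("expected exactly two '>' separators")
    reactants, agents, products = (
        [s for s in field.split(".") if s] for field in fields
    )
    return {
        "reactants": reactants,
        "agents": agents,
        "products": products,
        "groups": fragment_groups(extension),
    }


parts = split_reaction_smiles(RXN)
print(len(parts["reactants"]), parts["agents"], parts["groups"])
```

Here the hydride and the diisobutylaluminium cation are written as two separate components, and the grouping block marks them as parts of one reagent.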

  15. Reaction 1: Reaction reference Reference • URL: http://dx.doi.org/10.1039/c3ra45871g • Title: "Diastereoselective vinylalumination for the synthesis of pericosine A, B and C" • Description: Reaction text-mined by NextMove from RSC article with DOI: 10.1039/c3ra45871g • Authors: Long-Shiang Li; Duen-Ren Hou • Publication Date: 31/10/2013 • DOI: 10.1039/c3ra45871g • Journal: RSC Advances • Publication Type: Journal Article Reference Details • External Identifier: c3ra45871g: product 10 • Paragraph Text: Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise…

  16. RSC data repository – reaction components Reactions Substances Procedures Equipment Steps Reaction runs Compounds • Reaction components define each reaction and each component is: • Defined as a substance/compound/solution/mixture • Assigned a reaction role, which is stored and can take the values Reactant / Product / Solvent / Catalyst / Intermediate / ChiralAuxiliary Parameters Solutions Mixtures Samples Text-mining identifies all compounds and solutions (indicated by molarity) that play a role in each reaction and returns SMILES, InChI, reaction role, and amounts of each.
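
The component model described on this slide can be sketched as a small data structure. This is an illustration only: the role and kind values are taken from the slide, while the class names, field names and the `by_role` helper are hypothetical and do not reflect the repository's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class ReactionRole(Enum):
    REACTANT = "Reactant"
    PRODUCT = "Product"
    SOLVENT = "Solvent"
    CATALYST = "Catalyst"
    INTERMEDIATE = "Intermediate"
    CHIRAL_AUXILIARY = "ChiralAuxiliary"


class ComponentKind(Enum):
    SUBSTANCE = "Substance"
    COMPOUND = "Compound"
    SOLUTION = "Solution"
    MIXTURE = "Mixture"


@dataclass(frozen=True)
class ReactionComponent:
    label: str
    kind: ComponentKind
    role: ReactionRole


def by_role(components):
    """Group component labels by their reaction role."""
    grouped = defaultdict(list)
    for component in components:
        grouped[component.role].append(component.label)
    return dict(grouped)


# Components of Reaction 1 as identified by text-mining.
REACTION_1 = [
    ReactionComponent("Diisobutylaluminium hydride", ComponentKind.SOLUTION, ReactionRole.REACTANT),
    ReactionComponent("9", ComponentKind.COMPOUND, ReactionRole.REACTANT),
    ReactionComponent("dichloromethane", ComponentKind.COMPOUND, ReactionRole.SOLVENT),
    ReactionComponent("10", ComponentKind.COMPOUND, ReactionRole.PRODUCT),
]

print(by_role(REACTION_1))
```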

  17. Reaction 1: compounds and solutions Reaction components: reactant, solvent, product Other compound/substance used in procedure Solutions • Diisobutylaluminium hydride Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C. The reaction mixture was stirred at −78 °C for another 2 h, warmed up to rt, quenched with methanol (3 mL) and citric acid (aq) (w/w, 10%, 5 mL), concentrated. The residue was added with water (10 mL) and extracted with dichloromethane (12 mL × 3). The organic layers were combined, dried over Na2SO4, filtered and concentrated. The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid. Compounds • 9 • dichloromethane • methanol • citric acid • water • Dichloromethane • Na2SO4 • 10 Ignored for now (only the name was extracted in this pass) – in time “Substances” • SiO2 • EtOAc–hexanes

  18. Reaction 1: Reaction Components

  19. Reaction 1: Reaction rendering Reaction When you click on it • Solution: Diisobutylaluminium hydride • Components: • Solution Role: Solute; Molarity: 1.1 M; Compound: Diisobutylaluminium(1+) hydride • Solution Role: Solvent; Compound: cyclohexane

  20. RSC data repository – reaction runs Reactions Substances Procedures Equipment Steps Reaction runs Compounds • While the reaction information defines the overall reaction, the details about each specific instance of performing the reaction are stored in reaction runs: • stoichiometry table of each component • labels of components • amounts of components • links to specific samples and sources • results and yields of products. Parameters Solutions Mixtures Samples
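
As a worked example of what a stoichiometry table supports, the amounts text-mined for Reaction 1 (3.23 mmol of diisobutylaluminium hydride, 1.29 mmol of 9, 1.02 mmol of product 10) are enough to derive equivalents and a calculated yield. The sketch below assumes 1:1 stoichiometry between 9 and 10; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StoichiometryRow:
    label: str
    role: str
    amount_mol: float


def equivalents(rows, limiting_label):
    """Molar equivalents of each reactant relative to the limiting reactant."""
    limiting = next(r for r in rows if r.label == limiting_label)
    return {
        r.label: r.amount_mol / limiting.amount_mol
        for r in rows
        if r.role == "Reactant"
    }


def percent_yield(product, limiting):
    """Calculated yield, assuming one mole of product per mole of limiting reactant."""
    return 100.0 * product.amount_mol / limiting.amount_mol


# Normalised amounts (mol) from the text-mined output for Reaction 1.
ROWS = [
    StoichiometryRow("Diisobutylaluminium hydride", "Reactant", 0.00323),
    StoichiometryRow("9", "Reactant", 0.00129),
    StoichiometryRow("10", "Product", 0.00102),
]

eq = equivalents(ROWS, "9")
print(round(eq["Diisobutylaluminium hydride"], 2))
print(round(percent_yield(ROWS[2], ROWS[1]), 1))
```

The calculated yield rounds to 79.1%, which agrees with the CALCULATEDPERCENTYIELD value in the text-mined output and with the 79% reported in the article.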

  21. Reaction 1: Reaction Run Reaction Run • Label: Preparation of lithium acetylide (phenylethynyl)lithium; Experiment Stage: Executed • Stoichiometry Table Rows

  22. RSC data repository – procedure Reactions Substances Procedures Equipment Steps Reaction runs Compounds For reactions to be fully reproducible and queryable they are captured in a way analogous to S88 process recipes [1]: • Break process down into a series of steps (actions) • Define parameters at any level (for whole experiment or for particular action) • Define equipment at any level (for whole experiment or for particular action) [1] https://en.wikipedia.org/wiki/ISA-88 Parameters Solutions Mixtures Samples
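
One way to picture "parameters at any level" is a procedure that carries experiment-wide parameters plus per-step parameters. The sketch below is illustrative only: the rule that a step-level value overrides an experiment-level value of the same name is an assumption, and the class, field and parameter names are hypothetical. The example values (N2 atmosphere, 100 °C) come from the Reaction 2 notebook text later in the deck.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    ordinal: int
    action: str
    parameters: dict = field(default_factory=dict)
    equipment: list = field(default_factory=list)


@dataclass
class Procedure:
    title: str
    parameters: dict = field(default_factory=dict)  # apply to the whole experiment
    equipment: list = field(default_factory=list)
    steps: list = field(default_factory=list)

    def effective_parameters(self, ordinal):
        """Experiment-level parameters, overridden by the step's own (assumed rule)."""
        step = next(s for s in self.steps if s.ordinal == ordinal)
        return {**self.parameters, **step.parameters}


PROCEDURE = Procedure(
    title="SJH-01-227",
    parameters={"atmosphere": "N2"},
    equipment=["5 mL RBF", "reflux condenser"],
    steps=[
        Step(1, "Dose"),
        Step(2, "Heat", parameters={"temperature_C": 100}, equipment=["oil bath"]),
    ],
)

print(PROCEDURE.effective_parameters(2))
```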

  23. S88-style procedures Type of actions which can be assigned to procedure steps

  24. S88-style procedures Parameters that can be assigned to actions or experiments Substance Parameters Other Parameters Can be time dependent sample ID temperature pressure quantity time particle size weight speed pH volume rate

  25. Reaction 1: procedure steps Text mining breaks down procedure summary into steps: • <dl:reactionActionList/dl:reactionActions> dl:phraseTexts • action="Add": Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C • action="Stir": The reaction mixture was stirred at −78 °C for another 2 h • action="Heat": warmed up to rt • action="Quench": quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL) • action="Concentrate": concentrated • action="Add": The residue was added with water (10 mL) • action="Extract": extracted with dichloromethane (12 mL × 3) • action="Dry": dried over Na2SO4 • action="Filter": filtered • action="Concentrate": concentrated • action="Purify": The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) • action="Yield": to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C. The reaction mixture was stirred at −78 °C for another 2 h, warmed up to rt, quenched with methanol (3 mL) and citric acid (aq) (w/w, 10%, 5 mL), concentrated. The residue was added with water (10 mL) and extracted with dichloromethane (12 mL × 3). The organic layers were combined, dried over Na2SO4, filtered and concentrated. The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid.

  26. <dl:reactionAction action="Add"> <dl:phraseText>Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C</dl:phraseText> <dl:chemical ref="m1"/> <dl:chemical ref="m2"/><dl:chemical ref="m3"/> <dl:parameter propertyType="Temperature" normalizedValue="-78">-78 °C.</dl:parameter> </dl:reactionAction> Reaction 1: Example Reaction Step 1 Procedure Step • Ordinal: 1; Title: Add; Experiment Stage: Executed • Description: Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C • Type: “Add” • Parameters: • Substance: Stoichiometry Table Row for Diisobutylaluminium hydride • Substance: Stoichiometry Table Row for 9 • Substance: Stoichiometry Table Row for dichloromethane • Temperature: • Value: −78 °C Underlined values are retrieved from elsewhere in the repository (so that if e.g. amounts are updated, changes can be made in one place and be picked up

  27. <dl:reactionAction action="Stir"> <dl:phraseText>The reaction mixture was stirred at −78 °C for another 2 h</dl:phraseText> <dl:parameter propertyType="Time" normalizedValue="7200">2 h</dl:parameter> <dl:parameter propertyType="Temperature" normalizedValue="-78">-78 °C</dl:parameter> </dl:reactionAction> Reaction 1: Example Reaction Step 2 Procedure Step • Ordinal: 2; Title: Stir; Experiment Stage: Executed • Description: The reaction mixture was stirred at −78 °C for another 2 h • Type: “Stir” • Parameters: • Temperature: • Value: −78 °C • Time: 2 hours

  28. <dl:reactionAction action="Quench"> <dl:phraseText>quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL)</dl:phraseText> <chemical><molecule id="m4"> <name dictRef="nameDict:unknown">methanol</name></molecule> <amount dl:propertyType="VOLUME" dl:normalizedValue="0.003">3 mL</amount> <identifier dictRef="cml:smiles" value="CO"/> <identifier dictRef="cml:inchi" value="InChI=1S/CH4O/c1-2/h2H,1H3"/> <dl:entityType>exact</dl:entityType> </chemical> <chemical><molecule id="m5"> <name dictRef="nameDict:unknown">citric acid</name></molecule> …. </chemical> </dl:reactionAction> Reaction 1: Example Reaction Step 3 Procedure Step • Ordinal:3; Title: Quench; Experiment Stage: Executed • Description: quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL) • Type: “Quench” • Parameters: • Substance: • Label: methanol • Compound: • Volume: 0.003 L • Substance: • Label: citric acid • Compound: • Volume: 0.005 L
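
The normalised values in these steps follow a consistent pattern: volumes are given in litres (3 mL becomes 0.003), masses in grams, amounts in moles and times in seconds (2 h becomes 7200). A minimal sketch of that conversion is shown here. The unit table and the `normalize` function are hypothetical, cover only the units seen in this example, and use `Fraction` to avoid floating-point drift in the scaling step.

```python
from fractions import Fraction

# unit -> (property type, factor to the base unit observed in the output)
UNITS = {
    "mL": ("VOLUME", Fraction(1, 1000)),    # litres
    "L": ("VOLUME", Fraction(1)),
    "mg": ("MASS", Fraction(1, 1000)),      # grams
    "g": ("MASS", Fraction(1)),
    "mmol": ("AMOUNT", Fraction(1, 1000)),  # moles
    "mol": ("AMOUNT", Fraction(1)),
    "h": ("Time", Fraction(3600)),          # seconds
    "min": ("Time", Fraction(60)),
    "s": ("Time", Fraction(1)),
}


def normalize(quantity):
    """Convert text such as '3 mL' into (property type, value in base unit)."""
    value_text, unit = quantity.split()
    property_type, factor = UNITS[unit]
    return property_type, float(Fraction(value_text) * factor)


for text in ("3 mL", "308 mg", "1.02 mmol", "2 h"):
    print(text, "->", normalize(text))
```

Quantities that do not fit the "number unit" pattern, such as "12 mL × 3" or "rt", would need separate handling and are deliberately left out of this sketch.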

  29. Reaction 2: ELN-style reaction Example reaction from Cornell (Will Dichtel’s research group, via Leah McEwen): • Multiple “runs” of a reaction are performed, with different amounts, and under different conditions • Results, observations and product characterisations are stored for each • This allows the run which gives rise to the best yield to be identified • Currently the experiment data are stored in a number of files (see below), but this information is suitable to be stored in an Electronic Lab Notebook: • SJH-01-227_Enotebook.docx (“notebook” which shows the details of a particular run of a reaction: stoichiometry table (embedded Excel spreadsheet which does calculations), actual quantities, notes of conditions and results, and embedded TLC images) • WeeklyReport_5_01_2014.docx (logs all runs of all reactions done during a particular week – grouped by reaction, with reaction schema and observations noted) • spectra files

  30. SJH-01-227_Enotebook.docx SJH-01-223 and Cu(OTf)2 was transferred to a 5 mL RBF with a reflux condenser with a Schlenk adaptor and put under a N2 environment. The benzaldehyde was dissolved in dichloroethane and this solution was added via syringe to the RBF reaction flask. The flask was then placed in a 100 °C oil bath and TFA was added via Hamilton microsyringe. The reaction stirred for 30 min. Reaction database SJH-01-227 11/4/2014 Procedure, parameters, substance parameters, equipment Procedure - results When complete, the reaction was washed with sat. NaHCO3(aq) and extracted three times with DCM. The organic fractions were collected and dried with MgSO4, filtered and solvent was removed under vacuum. The product was purified on SiO2 column chromatography (3:7 DCM:Hexanes). The product was isolated as a light yellow solid 0.0963 g (68% Yield). Reaction run - stoichiometry table Procedure - results

  31. WeeklyReport_5_01_2014.docx Experiment observations mostly – stored in Procedure results

  32. Other spectra files Spectra database ultimately (but Procedure Results Files for now) • Files that would probably go into spectra bucket of data repository: • SJH-01-227.jdx or SJH-01-227_jcamp.jdx (IR spectrum files - same content) • SJH-01-227_22-145C.jdx (1H NMR spectrum) • SJH-01-227-RT-2D.jdx (2D 1H NMR spectrum) • Other files which might be processed (to extract e.g. store peak assignment values into the data repository so that they can be exported): • SJH-01-227_DCM_rsw.rsw or SJH-01-227_DCM_rtf.rtf (UV-VIS-NIR peaks in text file – nearly the same as each other) • Other files (we think duplicates of the above): • SJH-01-227.spa (binary file) • SJH-01-227_csv.csv (text, but with no headers) • SJH-01-227_grams.spc (binary file) • SJH-01-227_mattson.ras (binary file) • SJH-01-227_nicolet.nic (binary file) • SJH-01-227_pcir.ird (binary file) • SJH-01-227_spa.spa (binary file) • SJH-01-227_spectacle.irs (binary file) • SJH-01-227_tiff.tiff and SJH-01-227_wmf.wmf (image files of the same spectrum) • SJH-01-227_DCM_baseline.csw (UV-VIS-NIR, binary file) • SJH-01-227_DCM_bsw.bsw (UV-VIS-NIR spectrum, binary file) • SJH-01-227_DCM_csv.csv (might be able to do something with this – UV?) • SJH-01-227_DCM_dsw.dsw (UV-VIS-NIR spectrum, binary file) • SJH-01-227_DCM_grams.spc (UV-VIS-NIR spectrum, binary file) • SJH-01-227_DCM_gsw.gsw (UV-VIS-NIR spectrum, binary file) • SJH-01-227_DCM_msw.msw (UV-VIS-NIR spectrum, binary file) Use this as an interim example Procedure Results files
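
The triage of files described on this slide could be automated by file extension. The sketch below is one hedged way to do it: the three bucket names and the `bucket_for` function are hypothetical, and the extension sets simply mirror the grouping on the slide (JCAMP-DX `.jdx` files to the spectra bucket, `.rsw`/`.rtf` peak tables for processing, everything else held as procedure results files).

```python
from pathlib import PurePath

# Extension sets mirroring the grouping on the slide (bucket names are hypothetical).
BUCKETS = {
    "spectra": {".jdx"},
    "process": {".rsw", ".rtf"},
}


def bucket_for(filename):
    """Return the bucket a file would be routed to, based on its extension."""
    suffix = PurePath(filename).suffix.lower()
    for bucket, suffixes in BUCKETS.items():
        if suffix in suffixes:
            return bucket
    return "other"


for name in ("SJH-01-227_22-145C.jdx", "SJH-01-227_DCM_rsw.rsw", "SJH-01-227_grams.spc"):
    print(name, "->", bucket_for(name))
```

Extension alone cannot tell an IR spectrum from an NMR spectrum, so a real pipeline would still need to inspect file headers or naming conventions.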

  33. ESI docx example – synthetic procedure Reaction runs database - stoichiometry table, reaction results and procedure – S88 Synthesis of 17: 16 (0.101 g, 0.132 mmol) and Cu(OTf)2 (0.006 g, 0.01 mmol) were added to a round-bottom flask under a N2 atmosphere. In a separate vial, 2 (0.155 g, 0.753 mmol) was dissolved in C2H4Cl2 (1.3 mL) and transferred to the reaction flask. CF3CO2H (0.030 mL, 3 equiv) was added to the reaction mixture, which was refluxed at 100 °C for 1 h. The reaction mixture was washed with saturated NaHCO3 (15 mL) and extracted with C2H4Cl2 (3 x 5 mL). The organic fractions were collected, dried (MgSO4), and filtered to give a dark red solution. The solvent was removed, and the product was purified by column chromatography (SiO2, 30:70 CH2Cl2 : hexane) to yield 17 as a pale yellow powder (0.096 g, 68% yield). 17: 1H NMR (500 MHz, CDCl3): δ 8.15 (d, 2H), 8.13 (s, 1H), 7.98 (s, 1H), 7.95 (s, 2H), 7.92 (d, 2H), 7.88 (d, 1H), 7.87 (d, 1H), 7.84 (d, 1H), 7.80 (s, 1H), 7.69 (t, 2H), 7.64 (d, 2H), 7.57 (t, 2H), 7.56 (s, 2H), 7.54 (s, 2H), 7.54 (d, 2H), 7.45 (t, 1H), 7.44 (t, 2H), 7.40 (t, 2H), 7.39 (t, 1H), 7.38 (t, 1H), 7.34 (t, 1H), 6.88 (t, 4H), 6.88 (t, 2H), 6.80 (s, 2H), 6.77 (d, 4H), 6.70 (d, 1H), 6.50 (t, 1H), 6.39 (d, 2H), 6.24 (t, 2H), 6.22 (s, 1H), 6.11 (s, 2H), 6.04 (s, 1H). 13C NMR (125 MHz, CDCl3) δ 141.47, 141.10, 140.85, 140.42, 140.32, 140.20, 140.10, 139.60, 139.45, 139.37, 139.16, 139.03, 138.72, 138.28, 138.07, 133.28, 133.04, 132.96, 132.90, 132.64, 132.37, 131.60, 131.41, 131.19, 131.17, 130.72, 130.48, 130.28, 129.87, 129.85, 129.57, 129.30, 129.16, 129.11, 128.35, 128.21, 128.08, 128.04, 127.86, 127.72, 127.47, 126.85, 126.65, 126.50, 126.32, 126.25, 126.17, 126.08, 125.98, 125.84. IR (solid, ATR) 3051, 2925, 2131, 1947, 1590, 1488, 1444, 1415, 1318, 1274, 1180, 1133, 1074, 1018, 950, 882, 870, 809, 771, 743, 720, 697 cm-1. HRMS (DART) calcd for [C84H56+] 1064.4376, found 1064.4348.

  34. Reaction 2: Reaction, ReactionFile and Reference Reaction • Components: • ReactionFile: SJH-01-227.cdx • FileType: ReactionFileType.CDX • ReactionSMILES: C1(C#CC2=C(C3=CC=CC=C3)C=C(C=CC=C4)C4=C2)=CC(C#CC5=C(C6=CC=CC=C6)C=C(C=CC=C7)C7=C5)=CC(C#CC8=C(C9=CC=CC=C9)C=C(C=CC=C%10)C%10=C8)=C1.O=CC1=CC=CC=C1C#CC2=CC=CC=C2>[O-]S(=O)(C(F)(F)F)=O.[O-]S(=O)(C(F)(F)F)=O.[Cu+2].OC(C(F)(F)F)=O.ClCCCl>C%11(C%12=C(C=C(C=CC=C%13)C%13=C%12)C%14=C(C%15=CC=CC=C%15)C=C(C=CC=C%16)C%16=C%14)=CC(C%17=C(C=C(C=CC=C%18)C%18=C%17)C%19=C(C%20=CC=CC=C%20)C=C(C=CC=C%21)C%21=C%19)=CC(C%22=C(C=C(C=CC=C%23)C%23=C%22)C%24=C(C%25=CC=CC=C%25)C=C(C=CC=C%26)C%26=C%24)=C%11|f:3.4.5|

  35. Reaction 2: Reference Reference • ELN: Reaction SJH-01-227 • Authors: Sam Hein; William R. Dichtel; Leah McEwen • URL: http://www.eln.com/cornell/dichtel/SJH-01-227 • Publication date: 12th February 2014 • PublicationType: PublicationType.ELN • Reference Details: Reaction SJH-01-227

  36. Reaction Reaction 2: Planned reaction run • Reaction Run: Reaction SJH-01-227 dated 2/12/2014; FailedReaction: false; Experiment Stage: Planned • Stoichiometry Table:

  37. S88 process standard approach S88 allows procedure steps (process actions) to be grouped into “process operations”: Process (Experiment) > ProcessStage (Synthesis stage) > ProcessOperation (Preparation / Reaction / Work up / Isolation) > ProcessActions (Heat / Cool / Dose / Stir etc.). We allow “Procedure Steps” to be nested and have seeded the following procedure step types to assign to procedure steps for these parent operations: Preparation / Reaction / Work up / Isolation.
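
Nested procedure steps form a tree, with operations such as Reaction or Work up as parents and individual actions as leaves. A minimal sketch of such a tree and a depth-first walk over it follows; the class name, the `leaf_actions` helper and the particular grouping of actions under each operation are illustrative assumptions rather than the repository's model.

```python
from dataclasses import dataclass, field


@dataclass
class ProcedureStep:
    step_type: str
    children: list = field(default_factory=list)


def leaf_actions(step, path=()):
    """Yield the full path of every leaf action, depth first and in order."""
    here = path + (step.step_type,)
    if not step.children:
        yield " > ".join(here)
        return
    for child in step.children:
        yield from leaf_actions(child, here)


# Illustrative grouping only; the operations are the seeded parent step types.
EXPERIMENT = ProcedureStep("Experiment", [
    ProcedureStep("Reaction", [ProcedureStep("Dose"), ProcedureStep("Heat")]),
    ProcedureStep("Work up", [ProcedureStep("Extract"), ProcedureStep("Dry")]),
    ProcedureStep("Isolation", [ProcedureStep("Purify")]),
])

for line in leaf_actions(EXPERIMENT):
    print(line)
```

Keeping the operation as a parent node means a query such as "all work-up actions across runs" reduces to selecting the children of one step type.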

  38. Procedure Reaction 2: Planned procedure • Title: Reaction SJH-01-227 dated 2/12/2014; Failed Reaction: false; Experiment Stage: Planned; Link to ReactionRun • Procedure Steps:

  39. Reaction/Procedure Planned and Executed Experiment Stage If there are differences between the planned and executed reaction or procedure, then both versions of the following can be stored, each flagged with an ExperimentStage field of Planned or Executed: • Reaction run, and all corresponding stoichiometry table rows • Procedure, and all corresponding Procedure Steps, ParameterValues and ParameterTimes. Results and requested user inputs can be recorded and linked to the relevant procedure or step of the Executed Procedure.

  40. Reaction 2: Reaction run (Executed and Planned) Reaction • Reaction Run: Reaction SJH-01-227 dated 2/12/2014; FailedReaction: false; Experiment Stage: Executed; Link to Planned reaction run • Stoichiometry Table: links to actual amounts of reactants/reagents used. By default, the executed version is shown, but the planned version can be accessed by clicking on a link. Reaction • Reaction Run: Reaction SJH-01-227 dated 2/12/2014; FailedReaction: false; Experiment Stage: Planned • Stoichiometry Table: links to planned amounts of reactants/reagents used.
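
The planned/executed pairing can be sketched as two records that share a label and differ in an experiment stage flag, with the executed one preferred for display. Everything below is illustrative: the class and function names are hypothetical, and the planned amount is an invented placeholder, since the planned stoichiometry table is not reproduced in this transcript. Only the executed 0.101 g of 16 comes from the synthetic procedure text.

```python
from dataclasses import dataclass
from enum import Enum


class ExperimentStage(Enum):
    PLANNED = "Planned"
    EXECUTED = "Executed"


@dataclass
class ReactionRun:
    label: str
    stage: ExperimentStage
    amounts_g: dict


def default_view(runs):
    """Show the executed run when it exists, otherwise fall back to the planned one."""
    by_stage = {run.stage: run for run in runs}
    return by_stage.get(ExperimentStage.EXECUTED) or by_stage.get(ExperimentStage.PLANNED)


def differences(planned, executed):
    """Map each component whose amount changed to (planned, executed)."""
    labels = planned.amounts_g.keys() | executed.amounts_g.keys()
    return {
        label: (planned.amounts_g.get(label), executed.amounts_g.get(label))
        for label in sorted(labels)
        if planned.amounts_g.get(label) != executed.amounts_g.get(label)
    }


# Planned amount is a hypothetical placeholder; executed amount is from the ESI text.
PLANNED = ReactionRun("SJH-01-227", ExperimentStage.PLANNED, {"16": 0.100})
EXECUTED = ReactionRun("SJH-01-227", ExperimentStage.EXECUTED, {"16": 0.101})

print(default_view([PLANNED, EXECUTED]).stage.value)
print(differences(PLANNED, EXECUTED))
```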

  41. Reaction 2: Stoichiometry table (Executed and Planned) Click to see added sample information (see next slide)

  42. Reaction 2: Sample information of product (for executed version) Sample • Compound: • Characterisations: • Appearance: “light yellow solid” at DateMeasured: TimeStamp: 17:00:00 02/12/2014 • Label: SJH_01_227 • OriginalDateAcquired: 17:00:00 02/12/2014 • SubstanceState = Solid • SampleAmounts: • Mass: 0.0963 g at TimeStamp: 17:00:00 02/12/2014 • SubstanceSource: • Reaction: • Reaction Run: Reaction SJH-01-227 dated 2/12/2014 • Stoichiometry Table Row Product : SJH_01_227

  43. Reaction 2: Procedure (planned and executed values) Procedure • Title: Reaction SJH-01-227 dated 2/12/2014; FailedReaction: false; Experiment Stage: Executed; Link to Planned Procedure; Link to ReactionRun • Procedure Steps Links to executed ReactionRun and Procedure Steps Procedure • Title: Reaction SJH-01-227 dated 2/12/2014; FailedReaction: false; Experiment Stage: Planned; Link to ReactionRun • Procedure Steps Links to planned ReactionRun and Procedure Steps

  44. Reaction 2: Procedure Steps (Executed version) All values that are retrieved from stoichiometry table rows are automatically updated with Executed rather than Planned values

  45. Conclusions We have shown how this reactions database captures reactions: • in sufficient detail for someone else to reproduce • in ways analogous to those captured in an Electronic Lab Notebook • with fully recorded processes, parameters and equipment in S88 process recipe [1] style • with raw characterization data linked to products • which gave low yields or unintended products • multistep reactions • to fully record all reaction products (not just the target product)

  46. Because of all this being captured and linked… Reactions Substances Procedures Equipment Steps Reaction runs Compounds Parameters Solutions Mixtures Samples

  47. Future work We have shown 2 examples: • Reaction 1: example of a reaction text-mined from the RSC archive by NextMove with S88-style procedure (there are 31,000 more of these to be validated and imported) • Reaction 2: example from Will Dichtel’s research group via Leah McEwen (ELN-style reaction); consider a pipeline for population direct from ELNs • Develop reactions user interface, API, and import/validation platform

  48. Thank you Email: tkachenkov@rsc.org Slides: http://www.slideshare.net/valerytkachenko16