mgperson2002

Organic Chemistry web based application and synthesis search engine


Hi all,

I'm in the process of developing a web-based application intended to serve both as a tool for organic chemistry students and as a synthesis pathway search engine. I would greatly appreciate any feedback.

www.organicchemmaster.com

Of note, it has recently become capable of finding a pathway from benzene to aspirin (2-acetoxybenzoic acid): https://www.organicchemmaster.com/Molgen/Reaction/benzene/2-acetoxybenzoic%20acid?options=Calc

 

Thanks


It doesn’t seem to be well optimised for mobile devices would be my first comment. My second one is that I don’t really understand what the website is trying to achieve exactly? As a synthesis pathway engine it is not particularly useful as-is. From what I could see you only seem to be able to optimise from a very limited selection of molecules using biochemical pathways? Perhaps this is just because I am viewing it on my phone. The discover function again is very limited, and contains some typos and other errors. 


For a test, I made an ethane molecule in the reactant(s) area, then made chloroethane in the product area. Result:

Quote
 

Search Result Status: Failure. Unfortunately I could not find a pathway.
Start Molecule: ethane
End Molecule: 1-chloroethane
Search Result Steps: 1
Search Time: 0 min: 0 sec: 4 msec
Search Tools Used: Pathway Calculations, MolGen Reactions
Optimized For: FewestSteps

Then I changed ethane to ethene in the reactant(s) area by clicking on the bond to change it to CH2=CH2, and the result is:

Quote

Search Result Status: Great success!!
Start Molecule: eth-1-ene
End Molecule: 1-chloroethane
Search Result Steps: 1
Search Time: 0 min: 0 sec: 41 msec
Search Tools Used: Pathway Calculations, MolGen Reactions
Optimized For: FewestSteps

So, it's not calculating reaction on-the-fly but searching in database.. ?

That's a pretty basic example.

C2H6 + Cl2 -> C2H5Cl + HCl

 

Edited by Sensei

1 hour ago, hypervalent_iodine said:

It doesn’t seem to be well optimised for mobile devices would be my first comment. My second one is that I don’t really understand what the website is trying to achieve exactly? As a synthesis pathway engine it is not particularly useful as-is. From what I could see you only seem to be able to optimise from a very limited selection of molecules using biochemical pathways? Perhaps this is just because I am viewing it on my phone. The discover function again is very limited, and contains some typos and other errors. 

Thanks @hypervalent_iodine. Yes, it is INDEED not yet optimized for mobile devices, though of course that is a future plan. The discover function there is basically intended as a "teaser", a very limited version of the reaction application feature. The more full-fledged version can be found at: https://www.organicchemmaster.com/MolGen/Index . If you're able to describe the typos and other errors, I'm more than happy to address them.

Thanks @Sensei for testing. I think the answer to your question "So, it's not calculating reaction on-the-fly but searching in database.. ?" is that it actually IS calculating the reaction from ethene to chloroethane as soon as you click the reaction button, as opposed to doing any sort of database search along the lines of "Find a reaction that begins with ethene and produces chloroethane". It does not, however, run the calculation the moment you change the first molecule from ethane to ethene.

In the particular case you mentioned of searching for a path from ethene to chloroethane, the reaction it uses can be viewed here: https://www.organicchemmaster.com/Reaction/Rule/019 . This reaction is actually defined as a rule: convert a carbon-carbon double bond to a carbon-carbon single bond with a chlorine attached. That is as opposed to a more specific reaction like "Ethene becomes chloroethane". As such, this rule can also be applied to convert propene to 2-chloropropane.
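To illustrate the difference between a rule and a reactant-specific reaction, here is a minimal Python sketch of how such a rule might be applied. The `Molecule` representation (heavy atoms plus a dictionary of bond orders, with implicit hydrogens) and the function name `apply_hcl_addition` are assumptions made for this example and are not the site's actual model. The chlorine is placed on the more substituted carbon, which is consistent with propene giving 2-chloropropane.

```python
from dataclasses import dataclass, field


@dataclass
class Molecule:
    """Heavy-atom graph with implicit hydrogens (hypothetical representation)."""
    atoms: list                                  # element symbols, e.g. ["C", "C", "C"]
    bonds: dict = field(default_factory=dict)    # {(i, j): bond order} with i < j

    def neighbors(self, i):
        """Indices of the heavy atoms bonded to atom i."""
        return [b if a == i else a for (a, b) in self.bonds if i in (a, b)]


def apply_hcl_addition(mol):
    """Return a new Molecule with the first C=C turned into C-C plus one Cl.

    The Cl goes on the carbon with more heavy-atom neighbours. Returns None
    when the molecule has no C=C, i.e. when the rule does not apply.
    """
    for (i, j), order in mol.bonds.items():
        if order == 2 and mol.atoms[i] == "C" and mol.atoms[j] == "C":
            # Pick the more substituted carbon as the site for chlorine.
            target = i if len(mol.neighbors(i)) >= len(mol.neighbors(j)) else j
            atoms = mol.atoms + ["Cl"]
            bonds = dict(mol.bonds)
            bonds[(i, j)] = 1                    # double bond becomes a single bond
            bonds[(target, len(atoms) - 1)] = 1  # attach the new chlorine
            return Molecule(atoms, bonds)
    return None


ethane = Molecule(["C", "C"], {(0, 1): 1})
ethene = Molecule(["C", "C"], {(0, 1): 2})
propene = Molecule(["C", "C", "C"], {(0, 1): 2, (1, 2): 1})

chloroethane = apply_hcl_addition(ethene)
chloropropane = apply_hcl_addition(propene)   # Cl ends up on the middle carbon
no_match = apply_hcl_addition(ethane)         # None: ethane has no C=C
```

Because the rule only looks for a structural pattern, the same function handles ethene and propene, while ethane simply does not match. That matches the failed ethane search reported above.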

3 minutes ago, Sensei said:

If there is rule of addition of HCl, then there should be rules for Chlorination, Fluorination, Bromination, Iodination etc.

https://en.wikipedia.org/wiki/Halogenation#Chlorination

 

Agreed. This is a work in progress. Bromination actually is currently supported. https://www.organicchemmaster.com/Reaction/Rule/005 You can view a comprehensive list of the rules currently supported here: https://www.organicchemmaster.com/Reaction/# . Fluorination and Iodination are on the future list as well.

12 hours ago, hypervalent_iodine said:

So what does it offer over other websites like organic portal, SciFinder, or Reaxys?

The main features/goals of the site:

  1. A full-fledged drawing/design tool that lets the user build organic molecules following all proper bonding, hybridization, and steric spacing rules. The tool also allows the user to zoom in and out to examine specific areas of more complex molecules by viewing the hybridization, formal charge, electrons free to bond, and parent skeleton of each atom of the molecule. Furthermore, the tool provides an instantaneously generated (on the fly) IUPAC name for the molecule the user is creating, along with convenient links to Google and PubChem searches for that molecule. The user can also type in the IUPAC name to view the structure of the molecule. A future goal is to allow the user to attach custom designed/saved side chains (such as an acetoxy radical) for ease of design.
  2. A pathways page that lets the user explore common and custom organic pathways, including common metabolic pathways such as the Calvin cycle. Specific pathways the user has created or discovered will also be stored here under the "My Pathways" sub-navigation. Ultimately there is also a tool for a more "admin" level user to approve or reject proposed pathways.
  3. A reactions page that helps the user better understand specific organic reactions (e.g. Friedel-Crafts acylation and catalytic reduction with hydrogen and palladium). The user can view all associated rules for a given reaction and predict what applying a reaction to a particular molecule will yield. The two categories of reactions are 1) reactions by IUPAC name, which involve a reaction specific to a named reactant and product (e.g. the isocitrate dehydrogenase step of the citric acid cycle), and 2) reactions by rule, which involve a rule as described above (convert a carbon-carbon double bond to a carbon-carbon single bond with a chlorine attached).
  4. A reaction solver feature that lets the user find a synthesis pathway between a starting molecule and a target organic molecule (or molecules). This feature can be especially valuable for a student trying to tackle a difficult synthesis problem for which they can't readily find a solution with a web search, and ultimately it will be useful as a synthesis pathway search engine. An example solution that is calculated is the synthesis pathway from benzene to aspirin (2-acetoxybenzoic acid). This feature allows the user to optimize the synthesis pathway search for lowest pathway cost, fewest number of synthesis steps, or shortest reaction time. It also allows the user to specify which reaction sources to search through: the reaction rules, the reactions by IUPAC name, and external sources. Finally, it has an optional "smart search" feature that uses a heuristic to predict the most optimal pathway.
  5. A home page that lets the user explore a simplified set of common organic molecules and reaction pathways. This serves as a "lite" version of the site with an introduction to its features.

I believe this differs from the organic portal OSIRIS property explorer in that it: 1) provides on-the-fly IUPAC name generation for the molecule the user has currently drawn/designed (this is done locally on the client, so it does not require a server interaction); 2) is a web-based application as opposed to a Java app, allowing the benefits of cloud storage and easier accessibility (eventually, when optimized for mobile); 3) only allows the user to design molecules that obey proper bonding, hybridization, and steric spacing rules; 4) allows more detailed inspection of each atom in the molecule for use as a learning tool (including hybridization, free bonding electrons, and formal charge); and 5) (hopefully) provides a cleaner, easier-to-use tool, including custom attachments the user can define (such as a carboxylic acid or acetoxy radical).

From my understanding of both the other features of organic portal and SciFinder (please correct me if I'm wrong)  they serve as compendiums of reaction knowledge, research, information, and modeling. The goal of organicchemmaster is to utilize this information to create models of reactions that can be applied to an on the fly, user defined, controlled and initiated synthesis pathway search. 

As for Reaxys, the difference in this search engine is that the user can define a starting molecule and a goal molecule for the search, as opposed to requesting known synthesis pathways for a specific goal molecule. The hope is that this approach will ultimately allow for better discovery of synthesis pathways as opposed to viewing existing ones, but of course that is a very tall task and requires much further work and testing. The question goes from "What are some promising synthesis pathways of this molecule of interest" to "What is the most optimal way (cost, number of steps, or total reaction time) to produce this molecule of interest given that I have this (or these) materials available." In the meantime, the goal of the synthesis pathway search engine is also to provide the chemistry student with a tool to solve problems and learn about reactions (without needing to spend as much money as on a tool like Reaxys).

Again, this is a work in progress, so I greatly appreciate all feedback and clarification on any areas I may be mistaken about the other tools/sites you mentioned. I am also interested in potentially collaborating. Finally, I am maintaining a blog of the project at molecularpathwaygenerator.blogspot.com


Okay, I have looked at it briefly on my laptop. It is a good idea, but I think you need to invest a lot more time into it before it becomes useful.

Before I get too much into it, I have to say that I don't like the draw tool you use. It's very limited and not super intuitive. I tried to step through the tutorial to draw ethanol, but could not get it to add any heteroatoms. Could you not integrate something like JSME or ChemDoodle? These are more complete in terms of a draw tool, and they use the more common bond-line / zig-zag style of drawing. I think JSME is free to use as well.

You may be interested in a recent project launched by IBM. It essentially performs the task of your reaction solver but uses a data-driven AI approach, which I think is very elegant. Is this what your tool does as well, or do you use a different method? Out of curiosity, how do you calculate cost and reaction time? Does it take into account purification and work-up steps, or just the reagent cost (and if so, where do you source this and what country are you basing it from?). Reaction time is quite variable and doesn't always translate across different molecules very well. I think most chemists would be more interested in a limited number of steps, as this invariably leads to a reduction in cost and time. I don't think that trying to sort by some arbitrary cost or 'time' value would be that helpful.

Another question I have with the solver feature is how it accounts for possible by-products and incompatible functional groups. Does it incorporate protecting group chemistry? Does it allow for the user to refine the types of reactions used?

As for typos. Firstly, I would recommend sub-scripting in your chemical formulae where appropriate. You also use a lower-case k for K2Cr2O7. The wording you use is also very confusing in some places. I don't have time to go through it all, but I recommend hiring someone to proofread it.

For example:

Quote

Add Chloride one Carbon away from carbon-carbon double bond
Replace carbon-carbon double bond with Chloride

This is the action for Cl2 addition to a double bond. I honestly do not understand what it is saying. Is it saying you only add one chloride to the double bond? That seems to be the interpretation, but it is incorrect if this is the case. I also don't like the use of the word replace. You use it when describing oxidation of alcohols to aldehydes, but this isn't really a good way of describing it IMO. 

Another thing I noticed is that your LiAlH4 reduction rules do not seem to account for esters. 

Will you have an encyclopaedia of name reactions? 

On 1/31/2019 at 11:02 PM, hypervalent_iodine said:

Okay, I have looked at it briefly on my lap top. It is a good idea, but I think you need to invest a lot more time into it before it becomes useful.

Thanks again for the feedback, and I appreciate that you think it's a good idea. I absolutely agree there is MUCH more work to be done to make it as useful as it can be. In fact, I am looking to get as many resources/people working on the project as possible.

In general, a few of the items you mentioned are not supported yet, though of course the goal is to implement support for ALL types of organic molecules. Specifically, esters are not yet supported other than the ester linkage found in an acetoxy group (for example, in the aspirin (2-acetoxybenzoic acid) molecule). As a result, there is no defined reaction rule yet for the LiAlH4 reduction of esters. But yes, that will of course eventually be modeled, as will heteroatoms, which are not yet supported either. I keep a blog to update when new molecules are supported.

On 1/31/2019 at 11:02 PM, hypervalent_iodine said:

As for typos. Firstly, I would recommend sub-scripting in your chemical formulae where appropriate. You also use a lower-case k for K2Cr2O7

These typos are actually VERY easy to fix, fortunately. I went ahead and formatted subscripts where appropriate. 

On 1/31/2019 at 11:02 PM, hypervalent_iodine said:

Another question I have with the solver feature is how it accounts for possible by-products and incompatible functional groups. Does it incorporate protecting group chemistry? Does it allow for the user to refine the types of reactions used?

The search engine/solver feature DOES allow for protecting group chemistry, eventually. No protecting group reactions (such as tosylate reactions) are currently modeled, but the search engine WILL allow for those types of reactions. Without getting too much into the specific search algorithms, there is a rule in place such that the search will never consider applying the same reaction to the exact same molecule (under the same conditions) twice, as obviously this could lead to infinite loops (applying and unapplying tosylation to the same molecule without otherwise altering it).
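The loop-avoidance rule described above can be sketched as a breadth-first search that remembers every (reaction, molecule) pair it has already tried. This is only an illustration under stated assumptions: molecules are plain name strings, the three toy reactions are lookup tables invented for the example, and `find_pathway` is a hypothetical name, not the site's actual search code.

```python
from collections import deque


def find_pathway(start, goal, reactions):
    """Breadth-first search for a fewest-step pathway from start to goal.

    `reactions` maps a reaction name to a function that takes a molecule
    (here just a name string) and returns the product, or None if the
    reaction does not apply. Each (reaction, molecule) pair is tried once.
    """
    tried = set()                    # (reaction name, molecule) pairs already applied
    queue = deque([(start, [])])     # (current molecule, steps taken so far)
    while queue:
        molecule, steps = queue.popleft()
        if molecule == goal:
            return steps
        for name, react in reactions.items():
            if (name, molecule) in tried:
                continue             # same reaction on the same molecule: skip
            tried.add((name, molecule))
            product = react(molecule)
            if product is not None:
                queue.append((product, steps + [(name, product)]))
    return None                      # search space exhausted, no pathway found


# Toy lookup tables standing in for real reaction rules (illustrative only).
TOY_RULES = {
    "tosylation": {"ethanol": "ethyl tosylate"},
    "detosylation": {"ethyl tosylate": "ethanol"},
    "chloride substitution": {"ethyl tosylate": "chloroethane"},
}
reactions = {name: table.get for name, table in TOY_RULES.items()}

path = find_pathway("ethanol", "chloroethane", reactions)
no_path = find_pathway("ethanol", "benzene", reactions)
```

Without the `tried` set, the second call would bounce between ethanol and ethyl tosylate forever. With it, the search terminates and reports that no pathway exists.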

The goal is also to allow the user to control which reactions are considered. There will be logical groupings for reactions such as: toxins, carcinogens, etc, that the user can choose to exclude from consideration. The user will also eventually be able to create custom lists. This functionality can also be used in an educational context, say if an instructor only wants a student to work with a limited set of reactions.

Incompatible functional groups will be included in the rule that determines if a reaction can be applied to a molecule. I haven't yet spec'ed out by-products, but I think those can be tracked in the search.

Also, of note, the user will eventually be able to define their OWN rule reactions from scratch. 

On 1/31/2019 at 11:02 PM, hypervalent_iodine said:

Could you not integrate something like JSME or ChemDoodle?

I absolutely appreciate the feedback. Yes, per the interface I have indeed considered going with a drawing style more similar to those tools. The drag and drop approach is something I envisioned from the start as a unique approach that also lends well to the immediate, on the fly IUPAC molecule name generation and the hybridization and free bonding electrons of atoms immediate updates. That said, so far feedback has been preferring the JSME style of selecting an atom/bond-line/etc and then clicking the part of the molecule to add to. Moving to this approach will definitely be highly considered. I do think as an educational tool it is important to maintain the inspector feature of the interface so a user can view the properties of any atom in the molecule. This tool can be hidden as well.

I did look into the IBM project. From my understanding it is more of a predictive reaction tool. The question it answers as far as I can tell (please let me know if I'm mistaken, I haven't used it much) is "If I combine these chemicals or molecules, what will the reaction and the products be". The question that is asked by the pathway synthesis generator with this project is "What is the most optimal way (cost, number of steps, or total reaction time) to produce this molecule of interest given that I have this (or these) materials available." So it's more of a molecule A to molecule B search than a what happens if we combine molecules A and B.

 

On 1/31/2019 at 11:02 PM, hypervalent_iodine said:

Out of curiosity, how do you calculate cost and reaction time? Does it take into account purification and work-up steps, or just the reagent cost (and if so, where do you source this and what country are you basing it from?). Reaction time is quite variable and doesn't always translate across different molecules very well. I think most chemists would be more interested in a limited number of steps, as this invariably leads to a reduction in cost and time. I don't think that trying to sort by some arbitrary cost or 'time' value would be that helpful.

I appreciate your thoughts on number of steps being of most interest. Currently, cost IS calculated from reagent cost. The reagent costs are sourced generally from MolPort, Sigma Aldrich, ABBLIS and TCI chemicals in US dollars. If you click on the reagents in a pathway you can get a link to the source. Right now, these values are static. The goal is to have these sources updated dynamically at regular intervals. The goal is also, of course, to take other steps of the reaction into consideration.
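For readers wondering how a lowest-cost search differs from a fewest-steps search, here is a minimal sketch using Dijkstra's algorithm, with each reaction step weighted by its reagent cost. This is an assumption about how such a search could be done, not a description of the site's algorithm. The molecules "A" to "D" and their costs are placeholders in arbitrary units, and `cheapest_pathway` is a hypothetical name.

```python
import heapq


def cheapest_pathway(start, goal, steps):
    """Dijkstra's algorithm over reaction steps weighted by reagent cost.

    `steps` maps a molecule to a list of (product, reagent_cost) tuples.
    Returns (total_cost, [molecules along the pathway]) or None.
    """
    heap = [(0.0, start, [start])]
    best = {}                                # cheapest known cost per molecule
    while heap:
        cost, molecule, path = heapq.heappop(heap)
        if molecule == goal:
            return cost, path
        if molecule in best and best[molecule] <= cost:
            continue                         # already reached at least as cheaply
        best[molecule] = cost
        for product, reagent_cost in steps.get(molecule, []):
            heapq.heappush(heap, (cost + reagent_cost, product, path + [product]))
    return None


# Placeholder molecules and costs (arbitrary units), purely for illustration.
STEPS = {
    "A": [("B", 5.0), ("C", 1.0)],
    "B": [("D", 1.0)],
    "C": [("B", 1.0), ("D", 10.0)],
}
result = cheapest_pathway("A", "D", STEPS)
```

In this toy graph the cheapest route (A, C, B, D) takes three steps, while the shortest routes take two but cost more, so the two optimization targets can give different answers.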

There will absolutely be an encyclopedia of name reactions. In fact, all reactions in consideration are currently available to view here: http://organicchemmaster.com/Reaction/.

 

On 1/31/2019 at 11:02 PM, hypervalent_iodine said:

This is the action for Cl2 addition to a double bond. I honestly do not understand what it is saying. Is it saying you only add one chloride to the double bond?

What that rule is saying (in plain English) is: 1) Locate a molecule with a carbon-carbon double bond. 2) At the location of the carbon-carbon double bond, remove the double bond from the molecule and add two chloride functional groups, one at the first carbon of the double bond and one at the second.

The text you see is actually a computer generated translation of what is happening in the coded model of the reaction. So getting that to be readable AND chemically appropriate, as you described, is a work in progress :) To clarify though, as the reaction is modeled, TWO chlorides are added. One from the "Add" instruction and one from the "Replace". So but-2-ene will be chlorinated to form 2,3-dichlorobutane as predicted/solved for.

Thank you again for the feedback, I am absolutely happy to continue discussion.

On 1/31/2019 at 3:32 AM, Sensei said:

If there is rule of addition of HCl, then there should be rules for Chlorination, Fluorination, Bromination, Iodination etc.

https://en.wikipedia.org/wiki/Halogenation#Chlorination

@Sensei Support for Iodination and Fluorination reactions has been added.

2 hours ago, mgperson2002 said:

I appreciate your thoughts on number of steps being of most interest. Currently, cost IS calculated from reagent cost. The reagent costs are sourced generally from MolPort, Sigma Aldrich, ABBLIS and TCI chemicals in US dollars. If you click on the reagents in a pathway you can get a link to the source. Right now, these values are static. The goal is to have these sources updated dynamically at regular intervals. The goal is also, of course, to take other steps of the reaction into consideration.

 

So, with Sigma you can get wildly different prices depending on what quality / grade you are okay with. For synthesis,  you generally do not need analytical grade chemicals. I bring this up because when I went back to have a look at your reaction solver, I had a look at the suggested pathways for the example you linked to. This is what I get:

[Image: screenshot of the suggested pathways, Screen Shot 2019-02-06 at 6.47.44 pm.png]

 

The 2-hydroxybenzoic acid in the second example is also known as salicylic acid. When I saw the price listed, I immediately went and checked Sigma, because there was no way something as common as salicylic acid costs $1940/g. When I checked Sigma, I found I can buy 3 kg of the stuff for US$200. The only thing I could find close to the price you list was an analytical reference sample (50 mg for US$123), but no one is using an analytical standard to perform synthesis unless they have an absolutely wild disregard for budgets.
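The per-gram arithmetic behind this comparison can be checked with a few lines of Python. The figures are the ones quoted above; the helper name `price_per_gram` is just for illustration.

```python
def price_per_gram(price_usd, amount_grams):
    """Convert a pack price into a price per gram."""
    return price_usd / amount_grams


bulk = price_per_gram(200, 3000)         # 3 kg for US$200
reference = price_per_gram(123, 0.050)   # 50 mg analytical sample for US$123

print(f"bulk: US${bulk:.2f}/g, analytical reference: US${reference:.0f}/g")
```

This works out to roughly US$0.07 per gram for the bulk material versus US$2,460 per gram for the reference sample, which is the same order of magnitude as the $1940/g shown by the solver.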

One of the reasons I asked about where you were basing costs is because if you live somewhere like I do (Australia), you have to factor in a lot of other costs for delivery and tax. This is not really a problem if the intended users are students and not researchers of course. 

2 hours ago, mgperson2002 said:

I absolutely appreciate the feedback. Yes, per the interface I have indeed considered going with a drawing style more similar to those tools. The drag and drop approach is something I envisioned from the start as a unique approach that also lends well to the immediate, on the fly IUPAC molecule name generation and the hybridization and free bonding electrons of atoms immediate updates. That said, so far feedback has been preferring the JSME style of selecting an atom/bond-line/etc and then clicking the part of the molecule to add to. Moving to this approach will definitely be highly considered. I do think as an educational tool it is important to maintain the inspector feature of the interface so a user can view the properties of any atom in the molecule. This tool can be hidden as well.

Firstly, I don't understand the bold statement.

Unfortunately, the tool you have now is not very easy or intuitive to use (IMO), which I think should be a key focus over simply making something that is different. I think that the tool that you have with the on the fly IUPAC name generation would be good as a separate tool. As in, don't use it as the primary draw tool for your reaction solver or whatever, just use it as a specific tool for students to learn about IUPAC nomenclature. I think having something that creates a name as you build a molecule would be very good for that, but I can't think of a single time when I have been using SciFinder or similar to look up a reaction and have needed a IUPAC name to be generated as I construct a molecule.

Another comment I would make about it is that you should integrate a way for a user to freely move around atoms. I was using the tutorial just before to make aspirin, and when I added an ethane to an oxygen atom, it added it in such a way that it overlapped with the phenyl ring. Upon trying to add the carbonyl oxygen, I instead got a carbon that appeared triple bonded to my original oxygen, and there was no way for me to change it without deleting it. I would also recommend you have a function to convert other identifiers into structures, like CAS numbers, SMILES, or IUPAC names. Finally, more atoms! Boron chemistry is pretty common for example, as are molecules with phosphorus, but you have neither as an option. 

 

2 hours ago, mgperson2002 said:

I did look into the IBM project. From my understanding it is more of a predictive reaction tool. The question it answers as far as I can tell (please let me know if I'm mistaken, I haven't used it much) is "If I combine these chemicals or molecules, what will the reaction and the products be". The question that is asked by the pathway synthesis generator with this project is "What is the most optimal way (cost, number of steps, or total reaction time) to produce this molecule of interest given that I have this (or these) materials available." So it's more of a molecule A to molecule B search than a what happens if we combine molecules A and B.

 

Yes, you are correct. I half misremembered what it was and didn't look it up when I posted. Still an interesting project! 

From what I can tell, you are essentially automating a retrosynthetic analysis on a compound back to some pre-defined starting material, and then displaying that as the equivalent forward reaction steps. These forward reactions are shown as A --> B with the main reactants that one might use. Before I comment further, could you tell me if this is correct? What level of complexity do you anticipate you will be able to handle? As in, how many retrosynthetic steps could it do, and how complex can the molecule be?

 

2 hours ago, mgperson2002 said:

I appreciate your thoughts on number of steps being of most interest. Currently, cost IS calculated from reagent cost. The reagent costs are sourced generally from MolPort, Sigma Aldrich, ABBLIS and TCI chemicals in US dollars. If you click on the reagents in a pathway you can get a link to the source. Right now, these values are static. The goal is to have these sources updated dynamically at regular intervals. The goal is also, of course, to take other steps of the reaction into consideration.

 

I happen to work as an organic chemist in a group that does a lot of drug discovery and natural product total synthesis. In total synthesis, the things that we really aim for are (as I mentioned) shorter steps, higher yields, and fewer chromatography steps. The big thing is shorter routes and higher yields. Cost of the reagent is obviously critical also, particularly in industry, but having few steps, high yields, and easy purification are often the things you look at first. 

 

 

21 hours ago, hypervalent_iodine said:

So, with Sigma you can get wildly different prices depending on what quality / grade you are okay with. For synthesis,  you generally do not need analytical grade chemicals

Right, so I will say, the price points that are CURRENTLY listed were actually just sourced as a reasonable starting point to get the lowest cost pathway search function working. Of course the goal again is to make this a dynamically updated, robust system for determining the costs of each step in a synthesis. I do appreciate the point about delivery and tax costs, which I think can actually be incorporated into the search itself with a location parameter. In the particular case you screenshotted, the price is actually for the acetic anhydride used to acetylate the salicylic acid rather than for the salicylic acid itself (you can see the source by clicking on the reagent name). That said, the price per gram that suppliers list for acetic anhydride is definitely a lot lower, so I have updated the current price point. This price was sourced a while ago, and it may very well have been for an analytical standard. The pricing and cost generating feature is DEFINITELY one area where I want to work closely with an organic chemist or a team of chemists.

hybridization and free bonding electrons of atoms immediate updates

This refers to the feature of being able to inspect an individual atom in a molecule and view both the hybridization of that atom and its electrons that are free for bonding (currently bonded to a Hydrogen). If you hover over the Carbon in 2-acetoxybenzoic acid that the acetoxy group is attached to, you will see the hybridization as sp2 and free bonding electrons as 0.

[Image: screenshot of the atom inspector]

 

Thoughts are appreciated about the other areas of the interface.

21 hours ago, hypervalent_iodine said:

I would also recommend you have a function to convert other identifiers into structures, like CAS numbers, SMILES, or IUPAC names. Finally, more atoms! Boron chemistry is pretty common for example, as are molecules with phosphorus, but you have neither as an option. 

Sounds good! So IUPAC names ARE currently supported and generated on the fly. SMILES generation on the fly is definitely a feature to add. As CAS numbers are assigned rather than systematically generated from the molecule structure alone (correct me if I'm wrong), this feature will be slightly different, as it will require a lookup (either at a cache level or a server pull). In the meantime, the user can click on either the Google search or PubChem search icon to pull data about the molecule, including the CAS number.

Currently Boron chemistry is NOT supported, but absolutely that is another area to add. My M.O. for adding new types of chemistry (heteroatoms, Boron, ester linkages) is that I want the IUPAC name generator tool to support ALL reasonable chemicals the user might design first, and then introduce that chemistry as a design option. The reasoning is to make the app as robust as possible at each level before adding more. Phosphorus as an atom is not supported, but the Phosphate polyatomic ion is, for such applications as biochemical pathways, e.g. the Calvin cycle.

 

21 hours ago, hypervalent_iodine said:

Yes, you are correct. I half misremembered what it was and didn't look it up when I posted. Still an interesting project! 

From what I can tell, you are essentially automating a retrosynthetic analysis on a compound back to some pre-defined starting material, and then displaying that as the equivalent forward reaction steps. These forward reactions are shown as A --> B with the main reactants that one might use. Before I comment further, could you tell me if this is correct? What level of complexity do you anticipate you will be able to handle? As in, how many retrosynthetic steps could it do, and how complex can the molecule be?

Absolutely agreed, it's still a very interesting project they are doing! The goal of this app is actually to provide a more general approach than a retrosynthetic analysis. The app searches for a forward synthesis pathway from a user-defined starting molecule to a user-defined goal molecule. So the starting material is pre-defined, but it's not necessarily taken from a set of known starting materials, nor are any specifically looked for given the goal molecule. The analogy I have come up with is a Google Maps search to find a path from location A to location B, as opposed to finding a lot of known routes to location B from common starting locations.

One particular application, and please bear with me as this is purely hypothetical, is a group producing acetaminophen/paracetamol that is used to beginning production with a stock of phenol. For some reason that stock might have suddenly become more expensive, or they might have run out, but they find themselves readily with a large stock of toluene. As they're curious if instead they can produce the acetaminophen from the toluene stock, they enter toluene as the starting molecule and acetaminophen as the goal molecule. This is beneficial if the retrosynthesis might not have identified toluene otherwise as a starting molecule. Again, as this is hypothetical I am indeed making assumptions on availability of materials among other things, but it's meant to illustrate the approach. A further benefit of looking both forwards AND backwards is a different way to discover a new pathway or even a related molecule. 

So I think the best answer is it's more general in approach, more of a bidirectional search, yet of course can incorporate strategies and algorithms from a retrosynthetic analysis approach. The steps it can theoretically handle in a synthesis are not bounded, nor is the complexity of either of the molecules, but of course the challenge of resources and time becomes higher as these two climb. The search engine of the app is definitely another area that is built to evolve. Heuristics have been implemented to guide the search, as well. To comment on your point of fewer steps preferred, the default synthesis search IS to optimize for the fewest number of steps.
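To make the bidirectional, fewest-steps idea concrete, here is a minimal sketch of one way such a search could work. It is not the site's actual engine. The `REACTIONS` graph, the function names, and the choice of a level-by-level bidirectional breadth-first search are all assumptions for illustration, and the edges ignore reagents, conditions and yields.

```python
# Hypothetical toy reaction graph: each key can be converted into each
# listed product in one step. Edges are illustrative only.
REACTIONS = {
    "toluene": ["benzoic acid"],
    "benzoic acid": ["phenol"],
    "benzene": ["cumene"],
    "cumene": ["phenol"],
    "phenol": ["4-nitrophenol", "2-nitrophenol"],
    "4-nitrophenol": ["4-aminophenol"],
    "4-aminophenol": ["acetaminophen"],
}


def reverse_graph(graph):
    """Build product -> reactants edges for the backward half of the search."""
    rev = {}
    for reactant, products in graph.items():
        for product in products:
            rev.setdefault(product, []).append(reactant)
    return rev


def _expand_level(frontier, graph, parents, other_parents):
    """Expand one full BFS level; return (next_frontier, meeting_node_or_None)."""
    next_frontier = []
    for node in frontier:
        for neighbour in graph.get(node, []):
            if neighbour in parents:
                continue  # already reached from this side
            parents[neighbour] = node
            if neighbour in other_parents:
                return next_frontier, neighbour  # the two searches met
            next_frontier.append(neighbour)
    return next_frontier, None


def _join(meet, fwd_parents, bwd_parents):
    """Stitch start -> meet and meet -> goal into one pathway."""
    path, node = [], meet
    while node is not None:
        path.append(node)
        node = fwd_parents[node]
    path.reverse()
    node = bwd_parents[meet]
    while node is not None:
        path.append(node)
        node = bwd_parents[node]
    return path


def find_pathway(start, goal, graph=REACTIONS):
    """Return a fewest-step list of molecules from start to goal, or None."""
    if start == goal:
        return [start]
    reverse = reverse_graph(graph)
    fwd_parents, bwd_parents = {start: None}, {goal: None}
    fwd_frontier, bwd_frontier = [start], [goal]
    while fwd_frontier and bwd_frontier:
        # Always grow the smaller frontier, one whole level at a time.
        if len(fwd_frontier) <= len(bwd_frontier):
            fwd_frontier, meet = _expand_level(
                fwd_frontier, graph, fwd_parents, bwd_parents)
        else:
            bwd_frontier, meet = _expand_level(
                bwd_frontier, reverse, bwd_parents, fwd_parents)
        if meet is not None:
            return _join(meet, fwd_parents, bwd_parents)
    return None


print(find_pathway("toluene", "acetaminophen"))
```

Expanding whole levels from both ends keeps the fewest-steps guarantee while usually visiting fewer molecules than a forward-only search. A real engine would generate neighbours on the fly from reaction rules rather than read them from a fixed table.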

 

21 hours ago, hypervalent_iodine said:

I happen to work as an organic chemist in a group that does a lot of drug discovery and natural product total synthesis. In total synthesis, the things that we really aim for are (as I mentioned) shorter steps, higher yields, and fewer chromatography steps.

Very useful to know. The absolute home run goal of this entire project is ultimately to help discover a new potentially useful drug or even a more efficient synthesis pathway of an already existing useful drug. I absolutely am looking to work with a group of organic chemists.

One final note, I am not aware of the etiquette of this forum, but I am certainly interested in communicating over private messaging as well as this thread.

 

 

 

Edited by mgperson2002
Removing larger image

19 minutes ago, mgperson2002 said:

hybridization and free bonding electrons of atoms immediate updates

This refers to the feature of being able to inspect an individual atom in a molecule and view both the hybridization of that atom and its electrons that are free for bonding (currently bonded to a Hydrogen). If you hover over the Carbon in 2-acetoxybenzoic acid that the acetoxy group is attached to, you will see the hybridization as sp2 and free bonding electrons as 0.
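As a rough illustration of how those two values could be derived for a carbon atom, here is a minimal sketch. It assumes a neutral carbon with four bonds in total and takes only the bond orders to its non-hydrogen neighbours; the function name and input format are hypothetical and not taken from the site.

```python
def describe_carbon(heavy_bond_orders):
    """Return (hybridization, hydrogens) for a neutral carbon atom.

    heavy_bond_orders: bond orders to non-hydrogen neighbours,
    e.g. [2, 1, 1] for a ring carbon carrying a substituent
    (Kekule form: one double bond, two single bonds).
    """
    if any(order not in (1, 2, 3) for order in heavy_bond_orders):
        raise ValueError("bond orders must be 1, 2 or 3")
    used = sum(heavy_bond_orders)
    if used > 4:
        raise ValueError("carbon cannot form more than four bonds")
    # Valences not used by heavy neighbours are filled with hydrogens;
    # these are the "free bonding electrons" in the sense used above.
    hydrogens = 4 - used
    # Each neighbour (heavy atom or hydrogen) contributes one sigma bond.
    sigma_bonds = len(heavy_bond_orders) + hydrogens
    hybridization = {4: "sp3", 3: "sp2", 2: "sp"}[sigma_bonds]
    return hybridization, hydrogens


# Ring carbon bearing the acetoxy group in 2-acetoxybenzoic acid.
print(describe_carbon([2, 1, 1]))
```

The same call with `[1]` describes a carbon of ethane (sp3 with three hydrogens), which is the kind of atom a substitution reaction would act on.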

Right. I think this supports my suggestion of using your draw tool as a separate tool for student learning rather than the main one you use for all of your drawing. 

20 minutes ago, mgperson2002 said:

Sounds good! So IUPAC names ARE currently supported and generated on the fly. SMILES can definitely be generated on the fly as a feature to add. As CAS numbers are assigned as opposed to systematically generated from the molecule structure alone (correct me if I'm wrong) this feature will be slightly different as it will require a lookup (either at a cache level or a server pull). In the meantime, the user can click on either the google search or pubchem search icon to pull data about the molecule including the CAS number.

I think you might have misunderstood what I meant, or I wasn't very clear. I wasn't talking about generating those parameters on-the-fly, I meant that for your reaction solver or discover function it might be useful to be able to generate a structure from a substance identifier rather than having to draw it in every time. 

 

24 minutes ago, mgperson2002 said:

Absolutely agreed, it is still a very interesting project they are doing! The goal of this app is actually to provide a more general approach than a retrosynthetic analysis. The app searches for a forward synthesis pathway from a user-defined starting molecule to a user-defined goal molecule. So the starting material is pre-defined, but it's not necessarily taken from a set of known starting materials, nor are any specifically looked for given the goal molecule. The analogy I have come up with is a Google Maps search to find a path from location A to location B, as opposed to finding a lot of known routes to location B from common starting locations.

One particular application, and please bear with me as this is purely hypothetical, is a group producing acetaminophen/paracetamol that is used to beginning production with a stock of phenol. For some reason that stock might have suddenly become more expensive, or they might have run out, but they find themselves readily with a large stock of toluene. As they're curious if instead they can produce the acetaminophen from the toluene stock, they enter toluene as the starting molecule and acetaminophen as the goal molecule. This is beneficial if the retrosynthesis might not have identified toluene otherwise as a starting molecule. Again, as this is hypothetical I am indeed making assumptions on availability of materials among other things, but it's meant to illustrate the approach. A further benefit of looking both forwards AND backwards is a different way to discover a new pathway or even a related molecule. 

So I think the best answer is it's more general in approach, more of a bidirectional search, yet of course can incorporate strategies and algorithms from a retrosynthetic analysis approach. The steps it can theoretically handle in a synthesis are not bounded, nor is the complexity of either of the molecules, but of course the challenge of resources and time becomes higher as these two climb. The search engine of the app is definitely another area that is built to evolve. Heuristics have been implemented to guide the search, as well. To comment on your point of fewer steps preferred, the default synthesis search IS to optimize for the fewest number of steps.

 

I question how robust this will really be. What you are describing is automated forward reaction planning, which has to rely in some part on automated retrosynthesis and / or the ability of your program to deduce a way to build the right complexity to meet a defined end goal. Keep in mind that automated retrosynthesis is something that has only seen real-world success in the last year or two, and relies very much on AI / neural networks / deep learning / other buzzwords.

What I am a little confused about is the ultimate goal of this website. The above quote seems to suggest you want it to be used by both researchers and students. If that is the case, I think you might be biting off more than you can chew with this particular tool. A research chemist seeking out synthetic protocols would have any number of concerns that I think you might struggle to address. For instance, what if I wanted to make something enantioselectively (this is a very common requirement)? Can it handle only linear synthesis, or could it develop a convergent route (the latter being much more preferable)? I can definitely see how handy it would be to have what you are talking about for a researcher if it were fully fledged and capable of handling a lot of complexity and a lot of reactions, but at the end of the day I would really question whether or not this is a goal you can reasonably achieve within the frame of a free-to-use web-based service with a small team behind it (I assume this is the case, apologies if not). Moreover, I wonder if by pitching this for students and researchers, you are spreading yourself a touch too thin and venturing into the realms of doing too much at once. Everything else in your website to me seems like it is geared largely towards education, so the reaction solver / design function seems a bit...out of place I guess. That's just my 2 cents.

 

56 minutes ago, mgperson2002 said:

The absolute home run goal of this entire project is ultimately to help discover a new potentially useful drug

I haven't really seen anything in your website that would lend itself to this goal. Could you elaborate? 


Appreciated re: your comment of a different design tool for the student than the main one. Will fully take that into consideration.

1 hour ago, hypervalent_iodine said:

I think you might have misunderstood what I meant, or I wasn't very clear. I wasn't talking about generating those parameters on-the-fly, I meant that for your reaction solver or discover function it might be useful to be able to generate a structure from a substance identifier rather than having to draw it in every time. 

You actually can currently do this. If you click on the molecule name under molecule properties, then enter a different IUPAC name, the molecule structure will change to that name. The basic molecule tutorial describes this: https://www.organicchemmaster.com/MolGen/Tutorial/BasicMolecule . Apologies if I wasn't clear in this case! As long as the IUPAC name is obtainable the structure can be drawn, so this feature can eventually be used for SMILES and CAS numbers.
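For the CAS case, a minimal sketch of what the lookup step could look like is below. The standard CAS check-digit rule is used to tell a CAS number apart from a name before any lookup happens. The `CAS_TO_NAME` cache and the function names are hypothetical; a miss in the cache stands in for the server pull mentioned earlier.

```python
import re

# CAS Registry Number layout: 2-7 digits, 2 digits, 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")

# Hypothetical local cache mapping CAS numbers to names the drawing tool
# already understands. A real system would fall back to a server lookup.
CAS_TO_NAME = {
    "50-78-2": "2-acetoxybenzoic acid",
    "103-90-2": "N-(4-hydroxyphenyl)acetamide",
}


def is_valid_cas(identifier):
    """Check the format and check digit of a CAS Registry Number."""
    match = CAS_PATTERN.match(identifier)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... from right to left, then take mod 10.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def resolve_to_name(identifier):
    """Return a name to draw from, or None if a remote lookup is needed."""
    identifier = identifier.strip()
    if is_valid_cas(identifier):
        return CAS_TO_NAME.get(identifier)
    # Anything else is assumed to already be a name.
    return identifier


print(resolve_to_name("50-78-2"))
```

Validating the check digit first means a mistyped CAS number can be rejected locally instead of triggering a pointless lookup.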

 

1 hour ago, hypervalent_iodine said:

What I am a little confused about is the ultimate goal of this website. The above quote seems to suggest you want it to be used for both researchers as well as students. If that is the case, I think you might be biting off more than you can chew with this particular tool. A research chemist seeking out synthetic protocols would have any number of concerns that I think you might struggle to address. For instance, what if I wanted to make something enantioselectively (this is a very common requirement)? Can it handle only linear synthesis, or could it develop a convergent route (the latter being much more preferable)?

Full enantioselectivity of reactions is indeed something that is planned to be modeled. Currently, full stereochemistry support is implemented in the molecule modeling/naming. Some of the named reactions are already modeled as stereospecific, such as this glycolysis step. The plan is to include stereospecificity in the rules of a rule-based reaction. I had not yet considered convergent syntheses, but I see how those would be valuable. I have some rough ideas for how this might be accomplished using the search engine.
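One way a search engine could compare a convergent route with a linear one is to treat each route as a tree and score it by its longest linear sequence. The sketch below is only an illustration under simplifying assumptions: the `Node` class, the placeholder compound names, the uniform 80% step yield, and the rule that the lowest-yielding branch limits a coupling step are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    step_yield: float = 1.0  # fractional yield of the step forming `name`
    precursors: list = field(default_factory=list)  # empty = starting material


def longest_linear_sequence(node):
    """Number of steps on the longest branch leading to this node."""
    if not node.precursors:
        return 0
    return 1 + max(longest_linear_sequence(p) for p in node.precursors)


def limiting_yield(node):
    """Overall yield, assuming the weakest branch limits each coupling."""
    if not node.precursors:
        return 1.0
    return node.step_yield * min(limiting_yield(p) for p in node.precursors)


def linear_chain(names, step_yield):
    """Build a purely linear route: names[0] is the starting material."""
    node = Node(names[0])
    for name in names[1:]:
        node = Node(name, step_yield, [node])
    return node


# Six steps in a row versus six steps split into two fragments plus a coupling.
linear = linear_chain(["SM", "I1", "I2", "I3", "I4", "I5", "target"], 0.8)
fragment_a = linear_chain(["SM-A", "A1", "A2"], 0.8)        # 2 steps
fragment_b = linear_chain(["SM-B", "B1", "B2", "B3"], 0.8)  # 3 steps
convergent = Node("target", 0.8, [fragment_a, fragment_b])

for label, route in (("linear", linear), ("convergent", convergent)):
    print(label, longest_linear_sequence(route), round(limiting_yield(route), 3))
```

Both routes use six reactions in total, but the convergent one has a longest linear sequence of four steps instead of six, so under these assumptions its overall yield is roughly 41% against roughly 26%.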

1 hour ago, hypervalent_iodine said:

Moreover, I wonder if by pitching this for students and researchers, you are spreading yourself a touch too thin and venturing into the realms of doing too much at once. Everything else in your website to me seems like it is geared largely towards education

That is indeed fair. The process I most reasonably see this project/application taking is FIRST being useful as an educational tool, then using the knowledge gained and the improvement of the reaction modeling framework (the code backend) to make that framework useful for researchers. You are absolutely right that the second goal will require a much larger team and many more resources, and it is indeed more than I can chew. This is the main reason I am actually hoping to build a much larger team, eventually. I KNOW how hard this will be. But I want to see it happen. The first step is indeed to pitch primarily to students.

My current thoughts on aiding the discovery of a new drug are twofold. The first is the reaction predictor: clicking on the reaction edit pencil allows you to predict the result of applying that reaction to a given molecule (I might change this to a double click anywhere in the row). The second is a planned feature to specially mark any molecule in a synthesis pathway result that is either previously unknown or has other interesting values. I am absolutely open to more thoughts.

2 hours ago, hypervalent_iodine said:

I would really question whether or not this is a goal you can reasonably achieve within the frame of a free-to-use web-based service with a small team behind it (I assume this is the case, apologies if not).

I'll take the "I'm a small team" part as a compliment ;) For now it's a free-to-use web-based service because, again, I am looking to gather interest and find others interested in working on the project. A tool aimed specifically at researchers would absolutely not follow this model.


Hi all. 

An update for those who are following. I have gone ahead and overhauled the interface and specialized/optimized many parts of it for a mobile or small-screen device. Eventually the goal will also be to create an app for iOS and Android, but for now the site www.organicchemmaster.com will render as it has before on a desktop/large-screen device (specifically with a screen width greater than 992px) and will now render in a special mobile-optimized mode on a device with a smaller screen. @hypervalent_iodine I did take on board your suggestion that the draw tool be more intuitive/similar to JSME or ChemDoodle for the mobile version. As such, I switched away from a drag-and-drop approach for adding carbon chains/atoms and towards a click-to-select and click-to-add approach. It is my hope to add the bond-line/zig-zag style as an option as well in the near future.

@Sensei I did add support for free-radical chlorination. Your search for a pathway from ethane to chloroethane will now be successful. The reaction specifics can be viewed at https://www.organicchemmaster.com/Reaction/Rule/027.
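For readers curious what a rule of this kind might look like in code, here is a minimal sketch. It is not how Rule 027 is actually encoded. It assumes a toy model where a molecule is an unbranched carbon chain, each carbon records its hydrogens, chlorines and highest bond order, and the rule swaps one hydrogen for chlorine on any saturated carbon. All names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True, order=True)
class Carbon:
    hydrogens: int
    chlorines: int = 0
    max_bond_order: int = 1  # 1 = saturated (sp3), 2 = part of a C=C


def canonical(chain):
    """Treat a chain and its mirror image as the same molecule."""
    chain = tuple(chain)
    return min(chain, chain[::-1])


def chlorinate(chain):
    """Apply the toy substitution rule; return the set of distinct products."""
    chain = tuple(chain)
    products = set()
    for i, atom in enumerate(chain):
        # Rule precondition: a saturated carbon with at least one hydrogen.
        if atom.max_bond_order == 1 and atom.hydrogens > 0:
            swapped = replace(atom,
                              hydrogens=atom.hydrogens - 1,
                              chlorines=atom.chlorines + 1)
            products.add(canonical(chain[:i] + (swapped,) + chain[i + 1:]))
    return products


ETHANE = (Carbon(3), Carbon(3))
ETHENE = (Carbon(2, 0, 2), Carbon(2, 0, 2))
CHLOROETHANE = canonical((Carbon(2, 1), Carbon(3)))

print(CHLOROETHANE in chlorinate(ETHANE))  # the rule applies to ethane
print(chlorinate(ETHENE))                  # no saturated carbon, no product
```

In this toy model the rule matches ethane but not ethene, so a search from ethane to chloroethane only succeeds once a substitution rule like this exists, which is consistent with the earlier failed test.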

There is of course still much more work to be done. The latest updates can be viewed on the project blog at: http://molecularpathwaygenerator.blogspot.com/

Thanks to all who have responded.

 

Edited by mgperson2002


Hi again all.

A further update after the addition of some more features. I have added full support for esters in the interface. As such, I was able to expand the LiAlH4 reduction rules to INCLUDE reduction of an ester, which @hypervalent_iodine previously pointed out was missing. An example of such a reduction reaction can be viewed here: https://www.organicchemmaster.com/Molgen/Reaction/methyl ethanoate/ethan-1-ol?options=Calc,Reac. Also mentioned was the lack of support for heteroatoms. I have introduced support for nitrogen heteroatoms in particular for this update, with the motivating goal being to support all 5 DNA/RNA bases as well as the medicine allopurinol. A tutorial on how heteroatoms can be added to a molecule can be found here: https://www.organicchemmaster.com/MolGen/Tutorial/HeteroatomMolecule. I am particularly interested in feedback on this approach for adding heteroatoms. Future updates will include other heteroatoms as well. Also of note, supporting allopurinol required, in turn, supporting heterocyclic molecules.

Once again, there is still absolutely MUCH more work to be done, and I do sincerely appreciate all feedback. The project blog again can be viewed at http://molecularpathwaygenerator.blogspot.com/

Thanks all

 
