Using linear regression to calculate a quantitative structure-activity relationship (QSAR) between the anesthetic activity of ethers and the valence molecular connectivity index.
OBJECTIVES
Students will learn how to:
1. calculate simple molecular connectivity indices, including the valence molecular connectivity index.
2. use linear regression to determine a quantitative relationship between two sets of numbers. In this case the two sets of numbers are the molecular connectivity indices for a family of molecules and the biological activities of those molecules.
3. use molecular modeling software to visualize molecules.
MOLECULAR CONNECTIVITY INDICES
An important goal in applying chemistry to fight disease and pollution is to predict the biological activities of molecules from their molecular structures. A variety of numbers can be associated with molecular structures. Some arise from calculations of physicochemical properties, while others are so-called molecular connectivity (i.e., topological) indices that take into account factors such as branching and valence. Some connectivity indices are highly correlated with physicochemical properties, biological activities, or both. Quantitative structure-activity relationships between connectivity indices and biological activities have resulted in rapid, low-cost molecular design and screening procedures that help identify molecules that may be medically or environmentally useful.
VALENCE-MOLECULAR CONNECTIVITY ALGORITHM
Valence was introduced into molecular connectivity (i.e., topological) indices by Lemont B. Kier in 1975 (Wallace J. Murray, Lowell H. Hall, and Lemont B. Kier, Molecular Connectivity III: Relationship to Partition Coefficients, Journal of Pharmaceutical Sciences, Dec. 1975, 1978-1981). Like other connectivity indices, the valence molecular connectivity (VMC) is a sum of terms, one term for each chemical bond between two molecular groups. Each term is a fraction. The numerator of the fraction is always one. The denominator is the square root of the product of the valences associated with the two groups joined by the bond. In other words, a bond between groups with valences a and b contributes 1/sqrt(a*b) to the VMC. A table of valences follows.

Carbon Groups    Nitrogen Groups    Oxygen Groups
CH3   1          NH2   3            OH   5
CH2   2          NH    4            O    6
CH    3          N     5
C     4
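The sum described above can be sketched in a few lines of Python. This is only an illustration: the function name vmc_from_bonds and the idea of describing a molecule as a list of bonded group pairs are assumptions made here, not part of the valence molecular connectivity program used in this exercise. The example molecule is diethyl ether, CH3-CH2-O-CH2-CH3.

```python
from math import sqrt

# Valences copied from the table above.
VALENCE = {
    "CH3": 1, "CH2": 2, "CH": 3, "C": 4,   # carbon groups
    "NH2": 3, "NH": 4, "N": 5,             # nitrogen groups
    "OH": 5, "O": 6,                       # oxygen groups
}

def vmc_from_bonds(bonds):
    """Sum 1/sqrt(a*b) over all bonds, where a and b are the valences
    of the two groups joined by each bond (hypothetical helper)."""
    return sum(1.0 / sqrt(VALENCE[g1] * VALENCE[g2]) for g1, g2 in bonds)

# Diethyl ether has four bonds between its five groups.
diethyl_ether = [("CH3", "CH2"), ("CH2", "O"), ("O", "CH2"), ("CH2", "CH3")]
print(round(vmc_from_bonds(diethyl_ether), 3))  # prints 1.992
```

Each bond is listed exactly once, so the order of the pairs and the order of the two groups within a pair do not affect the result.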
Using MIMS Notation
While several inline methods of encoding molecular structures have recently become popular, especially SMILES, even SMILES was judged too difficult to learn in a short time for non-science majors who have not yet taken a college chemistry course (e.g., those enrolled in Physical Science). Therefore a simple inline molecular structure encoding method, named MIMS (Minimal Inline Molecular Structures), was developed for use in this project. MIMS uses only two symbols, "-" and ">", in addition to the standard symbols for chemical elements and the groups they form (e.g., CH3). A dash (-) represents a bond between chemical groups along the main skeleton of the molecule. A greater-than sign (>) represents a branch off the main skeleton.

For example,

              CH3
             /
CH3 - CH2 - CH
             \
              CH3

is conveniently written inline in MIMS notation as:

CH3-CH2-CH>CH3>CH3

As in SMILES, there can be more than one MIMS notation for some molecules. For example, the above molecule can also be encoded as:

CH3-CH2-CH>CH3-CH3

If a branch consists of more than one group, the molecule can be broken into several short MIMS statements: one for the main skeleton and one for each branch chain. To arrive at the total VMC for the molecule, one merely sums the VMC contributions from each MIMS statement.

For example,

           CH3    CH3
              \  /
               CH      CH3
              /       /
CH3 - CH2 - CH-CH2-CH2
              \       \
               CH      CH3
              /  \
           CH3    CH3

can be written in MIMS notation as:

CH3-CH2-CH>CH>CH-CH2-CH2>CH3>CH3    VMC = 4.104
CH>CH3>CH3                          VMC = 1.155
CH>CH3>CH3                          VMC = 1.155
                                    ===========
                              TOTAL VMC = 6.414

Note that the two CH groups attached to the third carbon of the main skeleton each appear twice in the MIMS encoding: first in the main skeleton statement and then a second time in the two branch statements. The number of times a group appears is irrelevant in the valence molecular connectivity index. The important consideration is that each bond between two groups is counted only once, which is the case here.
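The following Python sketch shows one way a MIMS statement could be read and converted to a VMC value. It is not the valence molecular connectivity program used in this exercise. The function names mims_bonds and vmc are hypothetical, and the parsing rule is an interpretation inferred from the examples above: every group after a "-" or ">" is bonded to the most recent main-skeleton group, and only a "-" moves the main skeleton forward.

```python
import re
from math import sqrt

# Valences for the groups used in this handout.
VALENCE = {"CH3": 1, "CH2": 2, "CH": 3, "C": 4,
           "NH2": 3, "NH": 4, "N": 5, "OH": 5, "O": 6}

def mims_bonds(statement):
    """Return the list of bonded group pairs encoded by one MIMS statement."""
    # Split into groups and separators, keeping the separators.
    tokens = re.split(r"([->])", statement.replace(" ", ""))
    main = tokens[0]              # current main-skeleton group
    bonds = []
    for sep, group in zip(tokens[1::2], tokens[2::2]):
        bonds.append((main, group))
        if sep == "-":            # a dash extends the main skeleton
            main = group          # a ">" leaves the skeleton position unchanged
    return bonds

def vmc(statement):
    """Sum 1/sqrt(a*b) over the bonds of one MIMS statement."""
    return sum(1.0 / sqrt(VALENCE[a] * VALENCE[b]) for a, b in mims_bonds(statement))

statements = ["CH3-CH2-CH>CH>CH-CH2-CH2>CH3>CH3", "CH>CH3>CH3", "CH>CH3>CH3"]
for s in statements:
    print(s, round(vmc(s), 3))                        # 4.104, 1.155, 1.155
print("TOTAL", round(sum(vmc(s) for s in statements), 3))  # 6.414
```

Under this reading, the two encodings CH3-CH2-CH>CH3>CH3 and CH3-CH2-CH>CH3-CH3 produce the same list of bonds and therefore the same VMC.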
Test Case
An example is the following molecule:

          CH3
         /
CH3-NH-CH-C-O-CH3
          ||
          O

Using MIMS, this molecule can be written inline as follows:

CH3-NH-CH>CH3-C>O-O-CH3

The VMC for this molecule is:

VMC = (1*4)^(-1/2) + (4*3)^(-1/2) + (3*1)^(-1/2) + (3*4)^(-1/2) + (4*6)^(-1/2) + (4*6)^(-1/2) + (6*1)^(-1/2) = 2.47
BIOLOGICAL ACTIVITY DATA
ANESTHETIC ACTIVITY OF ETHERS IN MICE

No.  Ether                    log(1/C)  MIMS
 1.  dimethyl ether           1.85      CH3-O-CH3
 2.  methyl ethyl ether       2.22      CH3-O-CH2-CH3
 3.  methyl isopropyl ether   2.70      CH3-O-CH>CH3-CH3
 4.  diethyl ether            2.75      CH3-CH2-O-CH2-CH3
 5.  methyl propyl ether      2.90      CH3-O-CH2-CH2-CH3
 6.  ethyl isopropyl ether    3.00      CH3-CH2-O-CH>CH3-CH3
 7.  ethyl propyl ether       3.10      CH3-CH2-O-CH2-CH2-CH3
 8.  methyl butyl ether       3.15      CH3-O-CH2-CH2-CH2-CH3
 9.  propyl isopropyl ether   3.26      CH3-CH2-CH2-O-CH>CH3-CH3
10.  methyl amyl ether        3.40      CH3-O-CH2-CH2-CH>CH2-CH3
11.  dipropyl ether           3.40      CH3-CH2-CH2-O-CH2-CH2-CH3
12.  ethyl butyl ether        3.30      CH3-CH2-O-CH2-CH2-CH2-CH3

REFERENCE
Marsh, D.F.; Leake, C.D. Anesthesiology 1950, 11, 455.
PROCEDURE
1. Enter the biological activity data into an Excel worksheet as log(1/C).
2. Calculate the valence molecular connectivity indices using the valence molecular connectivity program.
3. Enter the VMC results into the Excel worksheet you created in step 1.
4. Use linear regression to find the correlation between valence molecular connectivity and log(1/C) for anesthetic potency.
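For readers who want to cross-check the worksheet outside Excel, the procedure can be sketched in Python. This is a minimal sketch, not the program used in the exercise. The helper names vmc and linear_fit are hypothetical, the MIMS reading rule (each group bonds to the most recent main-skeleton group, and only "-" advances the skeleton) is an interpretation of the notation, and only the first five ethers from the data table are included for brevity. Extend the ETHERS list to all twelve compounds for the full exercise.

```python
import re
from math import sqrt

VALENCE = {"CH3": 1, "CH2": 2, "CH": 3, "C": 4,
           "NH2": 3, "NH": 4, "N": 5, "OH": 5, "O": 6}

def vmc(statement):
    """VMC of one MIMS statement: sum of 1/sqrt(a*b) over its bonds."""
    tokens = re.split(r"([->])", statement)
    main, total = tokens[0], 0.0
    for sep, group in zip(tokens[1::2], tokens[2::2]):
        total += 1.0 / sqrt(VALENCE[main] * VALENCE[group])
        if sep == "-":            # only a dash advances the main skeleton
            main = group
    return total

def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope*x + intercept, plus correlation r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / sqrt(sxx * syy)

# (name, MIMS statement, log(1/C)) for the first five ethers in the data table.
ETHERS = [
    ("dimethyl ether",         "CH3-O-CH3",         1.85),
    ("methyl ethyl ether",     "CH3-O-CH2-CH3",     2.22),
    ("methyl isopropyl ether", "CH3-O-CH>CH3-CH3",  2.70),
    ("diethyl ether",          "CH3-CH2-O-CH2-CH3", 2.75),
    ("methyl propyl ether",    "CH3-O-CH2-CH2-CH3", 2.90),
]

xs = [vmc(mims) for _, mims, _ in ETHERS]
ys = [activity for _, _, activity in ETHERS]
slope, intercept, r = linear_fit(xs, ys)
print(f"log(1/C) = {slope:.3f} * VMC + {intercept:.3f}   (r = {r:.3f})")
```

The slope and intercept correspond to what Excel reports for a linear trendline, and r squared corresponds to the R-squared value shown on the chart.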
This material is based upon work supported by the National Science
Foundation under Grant No. 9653672. Any opinions, findings, and conclusions or recommendations expressed in
this material are those of the author(s) and do not necessarily reflect the
views of the National Science Foundation.