Topological Index Calculator III [1]
Computation of the Hosoya Index using SMILES representation

Steven D. Granz, Departments of Mathematics & Computer Science, Gordon College, Wenham, MA 01984, sgranz@peace.gordon.edu
Irvin J. Levy, Departments of Chemistry & Computer Science, Gordon College, ijl@gordon.edu

Hosoya Index Regression:
[figure not recoverable]

Abstract:
Previously we introduced Topological Index Calculator as a freely available JavaScript application for the computation of QSPR descriptor indices for alkane molecules. The previous program supported the computation of the following indices: Balaban Index, Wiener Index, Randić Index, Odd-Even Index, Polarity Index, Vertex Degree Distance Index and Harary Index. We now report the addition of the Hosoya Index for acyclic alkanes. Further, we discuss the algorithm we have developed to compute the Hosoya Index directly from the SMILES representation of the molecule, and how to compute the Hosoya Index for cyclic alkanes.

Computing the Hosoya Index by hand is tedious, and the likelihood of human error grows with the number of bonds in the molecule. A computational tool such as Topological Index Calculator reduces that risk. In the process of extending Topological Index Calculator, we developed an algorithm that uses SMILES representation to compute the Hosoya Index.

Hosoya Index Algorithm Using SMILES Representation:
Sub-steps for the acyclic example, G = CC(C)C(C)CC (2,3-dimethylpentane):

Z(CC(C))
    Number of parent carbons: 2
    N = 2 / 2 = 1
    N = 1 + 1 = 2
    Split: C | C(C)
    Z(CC(C)) = Z(C) * Z(C(C)) + Π Z(fragments)
    Π Z(fragments) = Z((C))
    Z(CC(C)) = Z(C) * Z(C(C)) + Z((C)) = 1 * 2 + 1 = 3

Z(C(C)CC)
    Number of parent carbons: 3
    N = 3 / 2 = 1.5, truncated to N = 1
    N = 1 + 1 = 2
    Split: C(C) | CC
    Z(C(C)CC) = Z(C(C)) * Z(CC) + Π Z(fragments)
    Π Z(fragments) = Z((C)) * Z(C)
    Z(C(C)CC) = Z(C(C)) * Z(CC) + Z((C)) * Z(C) = 2 * 2 + 1 * 1 = 5

Z(G) = Z(CC(C)) * Z(C(C)CC) + Z((C)) * Z((C)) * Z(C) * Z(CC) = 3 * 5 + 1 * 1 * 1 * 2 = 17
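The values above, Z(CC(C)) = 3, Z(C(C)CC) = 5 and Z(G) = 17 for G = CC(C)C(C)CC, can be reproduced with a short Python sketch of the recursion that splits the molecule at its middle parent carbon. This is an illustration only, not the Topological Index Calculator's JavaScript source. The function names are hypothetical, the input is assumed to contain only C and parentheses, and a parent carbon is taken to be a C written outside all parentheses. The rewrite used when a fragment has a single parent carbon is an assumption added here so that the recursion always has a bond to split.

```python
from functools import lru_cache


def parent_positions(s):
    """Indices of parent carbons: every 'C' written outside all parentheses."""
    depth, positions = 0, []
    for i, ch in enumerate(s):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "C" and depth == 0:
            positions.append(i)
    return positions


def branch_contents(s):
    """Split a run of branches such as '(C)(CC)' into ['C', 'CC']."""
    groups, depth, start = [], 0, 0
    for i, ch in enumerate(s):
        if ch == "(":
            if depth == 0:
                start = i + 1
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                groups.append(s[start:i])
    return groups


@lru_cache(maxsize=None)
def hosoya(s):
    """Hosoya index Z of an acyclic alkane written with only C, ( and )."""
    n = s.count("C")
    if n <= 1:
        return 1                      # empty fragment or a single carbon
    if n <= 3:
        return n                      # two carbons -> 2, three carbons -> 3
    parents = parent_positions(s)
    if len(parents) == 1:
        # Assumption (not in the poster): promote the last branch to the main
        # chain, e.g. 'C(C)(C)(C)' -> 'C(C)(C)C', which is the same molecule.
        groups = branch_contents(s[1:])
        kept = "".join("(" + g + ")" for g in groups[:-1])
        return hosoya("C" + kept + groups[-1])
    k = len(parents) // 2             # 0-based index of B, i.e. N = floor(p/2) + 1
    a, b = parents[k - 1], parents[k]  # A is the parent carbon to the left of B
    after_b = parents[k + 1] if k + 1 < len(parents) else len(s)
    fragments = (
        [s[:a]]                               # chain to the left of A
        + branch_contents(s[a + 1:b])         # branches on A
        + branch_contents(s[b + 1:after_b])   # branches on B
        + [s[after_b:]]                       # chain to the right of B
    )
    product = 1
    for fragment in fragments:
        product *= hosoya(fragment)
    # Z(left of B) * Z(from B to end) + product of Z over the fragments
    return hosoya(s[:b]) * hosoya(s[b:]) + product


print(hosoya("CC(C)"), hosoya("C(C)CC"), hosoya("CC(C)C(C)CC"))  # 3 5 17
```

Slicing the string at B gives the two subgraphs directly, and the text between consecutive parent carbons holds exactly the branches that become separate fragments once A and B are removed.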
SMILES Representation Summary:
SMILES [3] (Simplified Molecular Input Line Entry Specification) is a simple yet comprehensive chemical nomenclature used to express the molecular graph of "normal" organic molecules.

Two Basic Rules of SMILES:
• Carbons are represented by the atomic symbol C
• Branching is indicated by parentheses

Hosoya Index Algorithm for Acyclic Alkanes:
Given that G is the SMILES representation of an acyclic alkane, Z(G) is computed as follows. A parent carbon is a carbon written outside all parentheses, so CC(C)C(C)CC has five.

Find the number of carbons in G
If the number of carbons = 1
    Z(G) = 1
Else if the number of carbons = 2
    Z(G) = 2
Else if the number of carbons = 3
    Z(G) = 3
Else
    Find the number of parent carbons in G
    Let N = (the number of parent carbons) / 2
    If the number of parent carbons is odd, truncate N to an integer
    Add one to N
    Find the Nth parent carbon in G's SMILES representation and call this atom B
    Z(G) = Z(subgraph left of B) * Z(subgraph from B to end) + Π Z(fragments)
    where Π Z(fragments) is the product of Z over each fragment created by removing B and the parent carbon to the left of B

Hosoya Index Algorithm for Cyclic Alkanes:
Given that G is the graph representation of a cyclic alkane, Z(G) can be computed from two other graphs whose Z values sum to Z(G). Ideally both graphs are acyclic. If one of them still contains a ring, the same step is applied to it recursively until Z(G) is expressed entirely in terms of Hosoya indices of acyclic alkanes. One algorithm for this is:

1. Let H be the subgraph obtained by removing one arbitrarily chosen edge, E, from a ring in G
2. Let I be the subgraph obtained by removing from G the edge E and all edges adjacent to E
3. If H or I is a disconnected graph, calculate Z(H) and/or Z(I) as the product of Z over the connected components of the disconnected graph(s)
4. Z(G) = Z(H) + Z(I)

http://www.math-cs.gordon.edu/courses/organic/topo/

Hosoya Index:
The Hosoya Index [2], Z = Z(G), was introduced by Hosoya in 1971 as the Z index. It is defined as

Z = \sum_{i=0}^{\lfloor N/2 \rfloor} p(G;i)

where p(G;i) is the number of ways to select i mutually non-adjacent edges in G, a chemical graph with N vertices. By definition, p(G;0) = 1, and p(G;1) is the number of edges in G.

Example: 2,3-dimethylpentane
p(G;0) = 1
p(G;1) = 6
p(G;2) = 8
p(G;3) = 2

The Hosoya Index of 2,3-dimethylpentane:
Z(G) = p(G;0) + p(G;1) + p(G;2) + p(G;3) = 1 + 6 + 8 + 2 = 17

Acyclic Alkane Example: 2,3-dimethylpentane
SMILES Representation: G = CC(C)C(C)CC

Z(G)
    Number of parent carbons: 5
    N = 5 / 2 = 2.5, truncated to N = 2
    N = 2 + 1 = 3
    Split: CC(C) | C(C)CC
    Z(G) = Z(CC(C)) * Z(C(C)CC) + Π Z(fragments)
    Π Z(fragments) = Z((C)) * Z((C)) * Z(C) * Z(CC)
    Z(G) = Z(CC(C)) * Z(C(C)CC) + Z((C)) * Z((C)) * Z(C) * Z(CC)

Future Directions:
• Have students use the tool to verify values found in the literature, since at least two journal articles [4] [5] have been found to have errors
• Incorporate other existing indices into Topological Index Calculator
• Develop new indices that predict boiling points more accurately

Cyclic Alkane Example: 1,2-dimethylcyclopropane
Z(G) = Z(H) + Z(I)
Z(G) = Z(CC(C)CC) + Z(CC) * Z(C) * Z(C) * Z(C)
Z(CC(C)CC) = Z(CC(C)) * Z(CC) + Π Z(fragments)
           = Z(CC(C)) * Z(CC) + Z(C) * Z((C)) * Z(C)
           = 3 * 2 + 1 * 1 * 1 = 7
Z(G) = 7 + 2 * 1 * 1 * 1 = 9

References:
[1] Granz, S. D.; Levy, I. J. "Topological Index Calculator II: Applications for the classroom and research." 228th ACS National Meeting, Philadelphia, PA, 2004.
[2] Mihalic, Z. "A Graph-Theoretical Approach to Structure-Property Relationships." J. Chem. Educ. 1992, 69, 701-712.
[3] Weininger, D. "SMILES, a Chemical Language and Information System." J. Chem. Inf. Comput. Sci. 1988, 28, 31-36.
[4] Cao, C. "Topological Indices Based on Vertex, Distance and Ring: On Boiling Points of Paraffins and Cycloalkanes." J. Chem. Inf. Comput. Sci. 2001, 41, 867-877.
[5] Mihalic, Z., et al. "The Detour Matrix and the Detour Index." In Topological Indices and Related Descriptors in QSAR and QSPR; Devillers, J., Balaban, A. T., Eds.; Gordon and Breach Science Publishers, 1999; pp. 297-299.