The kinetics of thrombin inhibition was determined from the hydrolysis

A manual database and literature search starting from those proteins to find the path between PMA and muscle contraction is clearly a daunting task. In this study, we performed a large-scale integration of a diverse set of bio-entities and their relationship information from both databases and literature, and built a network-based system, the integrated bio-entity network (IBN), for biological knowledge discovery. We aim to address three challenges faced by current knowledge discovery studies: data integration, relationship annotation, and hypothesis ranking. Although there is still considerable room for improvement in all three areas, the framework set up in this study presents a clear path toward effective automatic biological knowledge discovery. With the network data structure, graph-theoretic algorithms can be designed to search for high-probability indirect relationships in IBN. Hypotheses generated automatically from the current knowledge base can help researchers better understand their experimental results and design future experiments. A goal of future research is to implement a publicly accessible knowledge discovery system.

We point out several directions in which the current system can be further improved. First, some relationship information, such as protein-disease and protein-pathway relationships, is still poorly documented in current databases. These relationships can be extracted automatically from literature and added to IBN. Second, relationship information needs to be specific to the particular relationship type, and direction needs to be given where it is relevant. Such information can be obtained for interactions extracted automatically from literature. We recently developed a method, similar to protein interaction extraction, to predict the directionality of interactions and obtained very good accuracy. This method can be used to add directionality information to the edges in IBN. Third, the probabilities associated with the relationships in IBN have been very helpful in estimating the probabilities of indirectly related bio-entities and thus in ranking the generated hypotheses. Estimation of the probabilities of automatically generated hypotheses can be further improved by building more sophisticated models that use information about individual relationships. Finally, the protein naming system still needs to be improved.
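
The idea of searching the network for high-probability indirect relationships can be illustrated with a small sketch. The text does not specify which graph algorithm is used, so the following is only one possible approach under stated assumptions: edges are directed, each edge carries an independent probability, and the probability of an indirect relationship is taken as the product of the edge probabilities along a path. Under those assumptions, the most probable path can be found with Dijkstra's algorithm on the costs -log(p). The function name `most_probable_path`, the intermediate node names, and all probability values are hypothetical and chosen only for illustration.

```python
import heapq
import math


def most_probable_path(graph, source, target):
    """Return (path, probability) of the most probable directed path.

    graph maps each node to a dict {neighbor: edge probability in (0, 1]}.
    Assumption: edge probabilities are independent, so a path's
    probability is the product of its edge probabilities.
    """
    # Minimizing the sum of -log(p) is equivalent to maximizing the product of p.
    best_cost = {source: 0.0}
    previous = {}
    heap = [(0.0, source)]
    finished = set()

    while heap:
        cost, node = heapq.heappop(heap)
        if node in finished:
            continue  # stale heap entry
        finished.add(node)
        if node == target:
            break
        for neighbor, p in graph.get(node, {}).items():
            if p <= 0.0:
                continue  # an edge with zero probability cannot be used
            new_cost = cost - math.log(p)
            if new_cost < best_cost.get(neighbor, math.inf):
                best_cost[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))

    if target not in finished:
        return None, 0.0  # no path from source to target

    # Walk back from the target to reconstruct the path.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return path, math.exp(-best_cost[target])


# Toy network: intermediate nodes and probabilities are made up.
graph = {
    "PMA": {"protein_A": 0.9, "protein_B": 0.5},
    "protein_A": {"muscle contraction": 0.6},
    "protein_B": {"muscle contraction": 0.8},
}
path, probability = most_probable_path(graph, "PMA", "muscle contraction")
print(path, round(probability, 2))
```

In this toy network the route through `protein_A` (0.9 × 0.6) outranks the route through `protein_B` (0.5 × 0.8), so it would be proposed first as a hypothesis. A more sophisticated model, as the text suggests, could replace the simple product with a score that uses additional information about each relationship.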
