This tool locates gene and protein names in biomedical literature. The core of the system is a dictionary generated by semi-supervised learning from a large amount of unlabeled biomedical text. Two approaches are provided: (a) maximum matching against the dictionary, and (b) a combination of the dictionary and a conditional random field (CRF) model. You can test it with your own sentences on the demo page.
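As a rough illustration of dictionary-based maximum matching, the sketch below scans a tokenized sentence and, at each position, takes the longest dictionary entry that starts there. This is not the system's actual implementation: the function name `max_match`, the `max_len` parameter, the lowercase whitespace tokenization, and the toy dictionary are all assumptions made for the example.

```python
def max_match(tokens, dictionary, max_len=5):
    """Greedy forward maximum matching over a token list.

    Returns (start, end, text) spans, where end is exclusive.
    `dictionary` is assumed to hold lowercase, space-joined entries.
    """
    spans = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate starting at i first, then shrink it.
        for j in range(min(len(tokens), i + max_len), i, -1):
            candidate = " ".join(tokens[i:j]).lower()
            if candidate in dictionary:
                spans.append((i, j, " ".join(tokens[i:j])))
                i = j  # continue scanning after the matched span
                matched = True
                break
        if not matched:
            i += 1  # no entry starts here; move to the next token
    return spans


# Hypothetical toy dictionary, not the one distributed with the tool.
toy_dictionary = {"p53", "tumor necrosis factor", "tumor necrosis factor alpha"}

sentence = "Tumor necrosis factor alpha induces p53 expression"
print(max_match(sentence.split(), toy_dictionary))
# → [(0, 4, 'Tumor necrosis factor alpha'), (5, 6, 'p53')]
```

Preferring the longest entry means "Tumor necrosis factor alpha" is reported as a single name rather than as the shorter entry "tumor necrosis factor" followed by an unmatched token.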
Click here to download the dictionary used in the system.
Yanpeng Li, Hongfei Lin and Zhihao Yang. Incorporating Rich Background Knowledge for Gene Named Entity Classification and Recognition. BMC Bioinformatics, 2009, 10:223.
This page is maintained by Yanpeng Li.
Department of Computer Science and Engineering, Dalian University of Technology, Dalian, China.