PPIExtractor: A Protein-Protein Interaction Extractor for Biomedical Literature



Protein-protein interactions (PPIs) play a key role in various aspects of the structural and functional organization of the cell, and knowledge about them reveals the molecular mechanisms of biological processes. However, the amount of biomedical literature on protein interactions is increasing rapidly, and it is difficult for interaction database curators to detect and curate protein interaction information manually. We present a system, PPIExtractor, that automatically extracts protein-protein interactions from biomedical literature and constructs the PPI network. The system applies Feature Coupling Generalization to tag protein names, uses extended semantic similarity to normalize protein mentions, combines a feature-based kernel, a tree kernel and a graph kernel to extract PPIs, and finally visualizes the PPI network. Experimental evaluations show that our system achieves state-of-the-art performance on comparable evaluations.
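The kernel-combination and network-construction steps can be illustrated with a minimal sketch. It is not PPIExtractor's actual implementation: it assumes that each kernel produces a real-valued decision score for a candidate protein pair, that the scores are combined by a weighted sum, and that a pair is kept as an edge when the combined score exceeds a threshold. The function names, weights, threshold and scores below are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def combine_scores(scores, weights):
    """Weighted sum of per-kernel decision scores for one candidate pair.

    Assumption: a simple linear combination; the real system may differ.
    """
    return sum(weights[name] * scores[name] for name in weights)


def build_network(candidates, weights, threshold=0.0):
    """Build an undirected adjacency map from candidate pairs.

    A pair becomes an edge when its combined score exceeds `threshold`.
    """
    network = defaultdict(set)
    for (a, b), scores in candidates.items():
        if combine_scores(scores, weights) > threshold:
            network[a].add(b)
            network[b].add(a)
    # Sort neighbours so the output is deterministic.
    return {protein: sorted(neigh) for protein, neigh in network.items()}


# Hypothetical kernel weights and toy scores (not taken from the paper).
weights = {"feature": 0.4, "tree": 0.3, "graph": 0.3}
candidates = {
    ("EGFR", "GRB2"): {"feature": 0.9, "tree": 0.5, "graph": 0.7},
    ("TP53", "MDM2"): {"feature": 0.2, "tree": 0.6, "graph": 0.4},
    ("EGFR", "TP53"): {"feature": -0.8, "tree": -0.3, "graph": -0.5},
}

network = build_network(candidates, weights)
for protein, neighbours in sorted(network.items()):
    print(protein, "->", neighbours)
```

In this toy input the first two pairs receive positive combined scores and become edges, while the third is rejected. The resulting adjacency map is the kind of structure a visualization step could draw as a graph.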

PPIExtractor is free for academic research purposes only. It can be downloaded here.

On average, the system processes more than 100 abstracts per hour. For a large input file, processing may take several days. An example PPI network for lung squamous cell carcinoma is provided (the input file is .\File\LungSquamousCellCarcinoma.txt); you can view the result by opening .\results\LungSquamousCellCarcinoma\LungSquamousCellCarcinoma.txt directly in PPIExtractor. For more details, click here to download the help document.
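To plan a run, the throughput figure above can be turned into a rough time estimate. The sketch below assumes a constant rate of 100 abstracts per hour, so it gives an approximate upper bound, since the stated rate is "more than 100". The function name and the input size of 5,000 abstracts are hypothetical.

```python
def estimated_hours(num_abstracts, abstracts_per_hour=100):
    """Rough processing time in hours, assuming a constant throughput."""
    return num_abstracts / abstracts_per_hour


# Hypothetical input size, for illustration only.
hours = estimated_hours(5000)
days = hours / 24
print(f"About {hours:.0f} hours ({days:.1f} days) at 100 abstracts per hour")
```

Actual times will depend on the hardware and on the length of the abstracts.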


Zhihao Yang, Zhehuan Zhao, Yanpeng Li, Yuncui Hu, Hongfei Lin, PPIExtractor: A Protein Interaction Extraction and Visualization System for Biomedical Literature, IEEE Transactions on NanoBioscience, 2013, 12(3):173-181.

System requirements

    1. Platform: Windows

    2. RAM: 2 GB

    3. Other: JRE 1.6 must be installed.


    Unzip the *.rar file into a directory and double-click the *.bat file in that directory.

This page is maintained by Zhihao Yang.