Protein lysine β-hydroxybutyrylation (Kbhb) is a recently discovered post-translational modification associated with a wide range of biological processes. Identifying Kbhb sites is critical to understanding its mechanism of action. However, biochemical experiments for probing Kbhb sites are costly and time-consuming. Therefore, a feature embedding learning method based on the Transformer encoder was proposed to predict Kbhb sites. In this method, each amino acid residue was mapped to a numerical vector according to its amino acid class and its position, using a learnable feature embedding. The Transformer encoder was then used to extract discriminative features, and a bidirectional long short-term memory network (BiLSTM) was used to capture the correlations among these features. A benchmark dataset was constructed, and a Kbhb site predictor, AutoTF-Kbhb, was implemented based on the proposed method. Experimental results showed that the proposed feature embedding learning method could extract effective features. On the independent test set, AutoTF-Kbhb achieved an area under the curve (AUC) of 0.87 and a Matthews correlation coefficient (MCC) of 0.37, significantly outperforming the other methods compared. AutoTF-Kbhb can therefore serve as an auxiliary tool for identifying Kbhb sites.
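The embedding step described above can be illustrated with a minimal sketch. It is not the AutoTF-Kbhb implementation: the abstract does not give the vocabulary, window length, embedding width, or how class and position information are combined. The sketch assumes that each of the 20 standard amino acids is its own class, that `X` marks padding, and that the class and position embeddings are summed, as is common for Transformer inputs. All names and sizes below are hypothetical.

```python
import random

# Assumption: 20 standard amino acids plus "X" as a padding symbol.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "X"
VOCAB = {aa: i for i, aa in enumerate(AMINO_ACIDS + PAD)}


def make_table(rows, dim, rng):
    """Build a randomly initialised lookup table.

    In a trained model these values would be updated by gradient descent;
    here they stay at their random initial values.
    """
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


def embed_window(window, token_table, position_table):
    """Return one vector per residue: class embedding + position embedding."""
    if len(window) > len(position_table):
        raise ValueError("window is longer than the position table")
    vectors = []
    for pos, residue in enumerate(window):
        # Unknown symbols fall back to the padding entry (an assumption).
        token_vec = token_table[VOCAB.get(residue, VOCAB[PAD])]
        position_vec = position_table[pos]
        vectors.append([t + p for t, p in zip(token_vec, position_vec)])
    return vectors


rng = random.Random(0)
WINDOW_LEN = 11  # hypothetical window size centred on the candidate lysine
EMBED_DIM = 8    # hypothetical embedding width

token_table = make_table(len(VOCAB), EMBED_DIM, rng)
position_table = make_table(WINDOW_LEN, EMBED_DIM, rng)

window = "XXAGLKDERTS"  # hypothetical peptide window, lysine (K) at index 5
embedded = embed_window(window, token_table, position_table)
print(len(embedded), len(embedded[0]))  # 11 8
```

In a full model, the resulting sequence of vectors would be passed to the Transformer encoder and then to the BiLSTM, and both lookup tables would be trained jointly with those layers.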
Citation: WEI Zhisen, WANG Zhiwei, YU Jinyao, DENG Cheng, YU Dongjun. Prediction of protein Kbhb sites based on learnable feature embedding. Journal of Biomedical Engineering, 2025, 42(5): 1029-1035. doi: 10.7507/1001-5515.202401005
Copyright © the editorial department of Journal of Biomedical Engineering of West China Medical Publisher. All rights reserved