Because no complete non-histone crotonylation dataset is currently available, we used experimentally verified human Kcrot peptide data derived from two proteomic assays [1, 2] and collected additional datasets from papaya [3], rice [4] and tobacco (Nicotiana tabacum) [5]. To prepare high-confidence benchmark datasets for training and testing, we established the following procedure:
(1) Kcrot sites identified with high confidence were collected as positive sites, and the remaining lysine residues on the Kcrot-containing proteins were considered negative sites.
(2) To avoid over-estimation caused by similar protein sequences, Kcrot-containing proteins with sequence identity > 30% were clustered using CD-HIT.
(3) For each site in the dataset, we extracted a 7-residue peptide with the lysine site at the center. If a Kcrot-containing peptide (i.e., a positive peptide) was identical to a peptide with a negative site (i.e., a negative peptide), both peptides were removed.
(4) The representative proteins in the dataset were randomly divided into two groups: 4/5 (1651) for cross-validation and the remaining 1/5 (413) for an independent test.
The Kcrot sites on histone proteins were taken from the study by Qiu et al. [6]. Both datasets can be downloaded here.
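
Steps (1), (3) and (4) of the procedure can be illustrated with a minimal Python sketch. It is not the code used to build the datasets: the function names, the 0-based site indexing, the fixed random seed, and the use of "X" to pad lysines that lie near a protein terminus are all assumptions made for illustration, since the text does not specify them. The clustering in step (2) is left out because it relies on the external CD-HIT program.

```python
import random

WINDOW = 7          # peptide length stated in the text
HALF = WINDOW // 2  # residues on each side of the central lysine
PAD = "X"           # assumption: pad character for lysines near a terminus


def extract_peptides(sequence, kcrot_positions):
    """Return (positives, negatives) as sets of 7-residue peptides.

    kcrot_positions holds 0-based indices of crotonylated lysines (an
    assumed convention); every other lysine is treated as a negative site.
    """
    padded = PAD * HALF + sequence + PAD * HALF
    positives, negatives = set(), set()
    for i, residue in enumerate(sequence):
        if residue != "K":
            continue
        peptide = padded[i:i + WINDOW]  # window centred on position i
        (positives if i in kcrot_positions else negatives).add(peptide)
    return positives, negatives


def remove_conflicts(positives, negatives):
    """Drop peptides that occur with both a positive and a negative label."""
    shared = positives & negatives
    return positives - shared, negatives - shared


def split_proteins(protein_ids, test_fraction=0.2, seed=0):
    """Randomly split protein IDs into (cross-validation, independent test)."""
    ids = sorted(protein_ids)
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]
```

Splitting at the protein level rather than the peptide level keeps all sites of one protein in the same group. With 2064 protein IDs and a test fraction of 1/5, this split yields 1651 and 413 proteins, which matches the group sizes given in step (4).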

[1] Xu, W., et al., Global profiling of crotonylation on non-histone proteins. Cell Res, 2017. 27(7): p. 946-949.
[2] Huang, H., D.L. Wang, and Y. Zhao, Quantitative Crotonylome Analysis Expands the Roles of p300 in the Regulation of Lysine Crotonylation Pathway. Proteomics, 2018: p. e1700230.
[3] Liu, K., et al., A qualitative proteome-wide lysine crotonylation profiling of papaya (Carica papaya L.). Sci Rep, 2018. 8(1): p. 8230.
[4] Liu, S., et al., Global involvement of lysine crotonylation in protein modification and transcription regulation in rice. Mol Cell Proteomics, 2018.
[5] Sun, H., et al., First comprehensive proteome analysis of lysine crotonylation in seedling leaves of Nicotiana tabacum. Sci Rep, 2017. 7(1): p. 3013.
[6] Qiu, W.R., et al., iKcr-PseEns: Identify lysine crotonylation sites in histone proteins with pseudo components and ensemble classifier. Genomics, 2018. 110(5): p. 239-246.
