Supplementary web site for
Kevin Y. Yip and Mark Gerstein,
Training Set Expansion: An Approach to Improving the Reconstruction of Biological Networks from Limited and Uneven Reliable Interactions
Dataset S1: BioGRID-10
This dataset contains all BioGRID interactions of Saccharomyces cerevisiae (version 2.0.44) that satisfy the following criteria:
1) The interaction has one of the following physical interaction types: FRET, Protein-peptide, Co-crystal Structure, Co-fractionation, Co-purification, Reconstituted Complex, Biochemical Activity, Affinity Capture-Western, Two-hybrid, Affinity Capture-MS.
2) The interaction comes from a small-scale study, defined as a study that reports fewer than 10 physical interactions to BioGRID.
3) The proteins/genes involved have valid values for all the features used for learning.
The dataset contains 5,126 interactions that involve 2,328 yeast proteins.
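The selection criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used to build the dataset: the record layout `(protein_a, protein_b, interaction_type, study_id)`, the function name `filter_small_scale`, and the `has_features` callback are hypothetical, and the sketch assumes that study size is counted over records of the listed physical types.

```python
from collections import Counter

# Physical interaction types listed in the criteria for Dataset S1.
PHYSICAL_TYPES = {
    "FRET", "Protein-peptide", "Co-crystal Structure", "Co-fractionation",
    "Co-purification", "Reconstituted Complex", "Biochemical Activity",
    "Affinity Capture-Western", "Two-hybrid", "Affinity Capture-MS",
}


def filter_small_scale(records, max_per_study=10, has_features=lambda p: True):
    """Return the set of unordered protein pairs that pass all three criteria.

    records: iterable of (protein_a, protein_b, interaction_type, study_id)
             tuples (hypothetical layout, not the BioGRID file format).
    max_per_study: a study is "small-scale" if it reports fewer than this
                   many physical interactions (10 for S1, 200 for S2).
    has_features: hypothetical predicate telling whether a protein has valid
                  values for all learning features.
    """
    # Criterion 1: keep only the listed physical interaction types.
    physical = [r for r in records if r[2] in PHYSICAL_TYPES]

    # Criterion 2: count physical interactions per study (assumption:
    # every record counts once, duplicates included).
    per_study = Counter(study for _a, _b, _type, study in physical)

    kept = set()
    for a, b, _type, study in physical:
        if per_study[study] >= max_per_study:
            continue  # not a small-scale study
        # Criterion 3: both proteins need valid feature values.
        if not (has_features(a) and has_features(b)):
            continue
        kept.add(tuple(sorted((a, b))))  # treat interactions as undirected
    return kept
```

Calling the function with `max_per_study=200` would correspond to the looser small-scale definition, again under the same assumptions.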
Dataset S2: BioGRID-200
This dataset is similar to dataset S1, except that small-scale studies are defined as studies that report fewer than 200 physical interactions to BioGRID. Note that because the four high-throughput datasets used as data features all have more than 200 interactions, they are not included in this dataset.
The dataset contains 12,155 interactions that involve 3,222 yeast proteins.
Dataset S3: DIP_MIPS_iPfam
This dataset contains the union of all interactions from DIP (7 Oct 2007 version), MIPS (18 May 2006 version) and iPfam (version 21 of Pfam) that satisfy the following criteria:
1) For interactions in DIP, only those identified in small-scale experiments or in multiple experiments are considered.
2) For interactions in MIPS, only physical interactions that are neither yeast two-hybrid nor TAP-MS are considered.
3) The proteins/genes involved have valid values for all the features used for learning.
The dataset contains 3,201 interactions that involve 1,681 yeast proteins.
Supplementary Table S1: Complete set of prediction accuracies on BioGRID-10 (AUC, in percent)
Color code: red - rank 1, green - rank 2, blue - rank 3 of each mode
| Method | phy | loc | exp-gasch | exp-spellman | y2h-ito | y2h-uetz | tap-gavin | tap-krogan | int |
| Mode 1 | | | | | | | | | |
| direct | 58.04 | 66.55 | 64.61 | 57.41 | 51.52 | 52.13 | 59.37 | 61.62 | 70.91 |
| kCCA | 65.80 | 63.86 | 68.98 | 65.10 | 50.89 | 50.48 | 57.56 | 51.85 | 80.98 |
| kML | 63.87 | 68.10 | 69.67 | 68.99 | 52.76 | 53.85 | 60.86 | 57.69 | 73.47 |
| em | 71.22 | 75.14 | 67.53 | 64.96 | 55.90 | 53.13 | 63.74 | 68.20 | 81.65 |
| local SVM | 71.53 | 71.17 | 70.35 | 68.98 | 67.26 | 67.25 | 64.59 | 67.48 | 74.77 |
| local+pp SVM | 72.07 | 69.64 | 76.02 | 73.54 | 71.50 | 71.46 | 74.41 | 71.09 | 82.94 |
| local+ki SVM | 71.72 | 71.15 | 75.84 | 71.00 | 69.32 | 69.03 | 70.66 | 71.89 | 81.75 |
| local+pp+ki SVM | 71.78 | 70.40 | 76.73 | 71.37 | 70.42 | 70.43 | 73.49 | 72.47 | 83.19 |
| local SVR | 71.67 | 71.41 | 72.66 | 70.63 | 67.27 | 67.27 | 64.60 | 67.48 | 75.65 |
| local+pp SVR | 73.89 | 75.25 | 77.43 | 75.35 | 71.60 | 71.51 | 74.62 | 71.39 | 83.63 |
| local+ki SVR | 71.68 | 71.42 | 75.89 | 70.96 | 69.40 | 69.05 | 70.53 | 72.03 | 81.74 |
| local+pp+ki SVR | 72.40 | 75.19 | 77.41 | 73.81 | 70.44 | 70.57 | 73.59 | 72.64 | 83.59 |
| Mode 2 | | | | | | | | | |
| direct | 59.99 | 67.81 | 66.18 | 59.22 | 54.02 | 54.64 | 62.28 | 63.69 | 72.34 |
| Pkernel | 72.98 | 69.84 | 78.61 | 77.30 | 57.01 | 54.65 | 71.16 | 70.36 | 87.34 |
| local SVM | 76.17 | 78.68 | 76.07 | 73.46 | 72.26 | 72.23 | 68.39 | 72.48 | 81.29 |
| local+pp SVM | 75.85 | 73.66 | 79.71 | 75.61 | 74.05 | 73.80 | 75.89 | 75.10 | 87.80 |
| local+ki SVM | 76.06 | 78.70 | 79.02 | 73.32 | 72.68 | 72.03 | 71.22 | 75.55 | 85.53 |
| local+pp+ki SVM | 76.32 | 73.73 | 79.99 | 75.48 | 73.58 | 73.35 | 74.98 | 75.87 | 87.62 |
| local SVR | 76.89 | 78.73 | 79.72 | 77.32 | 72.93 | 72.89 | 68.81 | 73.15 | 82.82 |
| local+pp SVR | 77.71 | 80.71 | 82.56 | 80.62 | 74.74 | 74.41 | 76.36 | 75.12 | 88.78 |
| local+ki SVR | 76.76 | 78.73 | 80.62 | 76.44 | 73.39 | 72.76 | 72.42 | 76.22 | 86.12 |
| local+pp+ki SVR | 77.45 | 80.57 | 81.93 | 78.92 | 74.14 | 74.01 | 75.59 | 76.59 | 88.56 |
| Mode 3 (mode 2 with self-interactions removed) | | | | | | | | | |
| direct | 57.72 | 66.69 | 64.23 | 56.86 | 51.36 | 52.01 | 60.10 | 61.60 | 70.75 |
| Pkernel | 72.01 | 68.89 | 77.89 | 76.37 | 56.24 | 53.97 | 71.48 | 69.67 | 87.13 |
| local SVM | 76.47 | 78.56 | 76.27 | 73.88 | 72.57 | 72.54 | 68.64 | 72.81 | 81.39 |
| local+pp SVM | 75.84 | 73.41 | 79.93 | 76.16 | 74.48 | 74.21 | 76.38 | 75.63 | 87.79 |
| local+ki SVM | 76.40 | 78.57 | 79.56 | 73.90 | 72.92 | 72.35 | 71.63 | 75.94 | 85.43 |
| local+pp+ki SVM | 76.51 | 73.43 | 80.32 | 75.66 | 73.70 | 73.60 | 75.62 | 76.24 | 87.62 |
| local SVR | 77.17 | 78.71 | 79.87 | 77.56 | 73.21 | 73.18 | 69.05 | 73.44 | 82.97 |
| local+pp SVR | 78.18 | 80.44 | 82.57 | 80.41 | 75.05 | 74.83 | 76.76 | 75.70 | 88.87 |
| local+ki SVR | 77.10 | 78.71 | 80.74 | 76.41 | 73.51 | 72.97 | 72.72 | 76.53 | 85.96 |
| local+pp+ki SVR | 77.52 | 80.51 | 81.73 | 78.51 | 74.27 | 74.09 | 76.10 | 76.85 | 88.55 |
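The accuracies in these tables are reported as AUC percentages. As a rough illustration of what such a number measures, and assuming AUC here means the area under the ROC curve over candidate protein pairs, the sketch below computes it through the equivalent pairwise-comparison form: the fraction of (positive, negative) pairs in which the true interaction receives the higher score. The function name `auc_percent` and the input layout are hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
def auc_percent(scores, labels):
    """Area under the ROC curve, expressed as a percentage.

    scores: predicted scores, one per candidate protein pair.
    labels: 1 for a known interaction, 0 for a non-interaction.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return 100.0 * wins / (len(pos) * len(neg))
```

Under this reading, a value of 50 corresponds to random ranking and 100 to a ranking that places every true interaction above every non-interaction.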
Supplementary Table S2: Complete set of prediction accuracies on BioGRID-200 (AUC, in percent)
Color code: red - rank 1, green - rank 2, blue - rank 3 of each mode
| Method | phy | loc | exp-gasch | exp-spellman | y2h-ito | y2h-uetz | tap-gavin | tap-krogan | int |
| Mode 1 | | | | | | | | | |
| direct | 58.89 | 66.32 | 65.44 | 59.68 | 51.87 | 51.28 | 63.98 | 64.56 | 71.59 |
| kCCA | 69.14 | 66.36 | 72.30 | 62.74 | 53.52 | 50.85 | 63.23 | 58.49 | 85.73 |
| kML | 65.86 | 68.57 | 73.79 | 73.41 | 55.00 | 56.12 | 64.41 | 62.67 | 68.82 |
| em | 73.60 | 75.78 | 68.66 | 67.55 | 56.10 | 53.47 | 68.76 | 70.48 | 80.89 |
| local SVM | 76.67 | 76.78 | 78.92 | 77.49 | 75.08 | 75.07 | 71.24 | 75.34 | 82.56 |
| local+pp SVM | 76.35 | 75.85 | 80.02 | 78.29 | 75.86 | 76.48 | 77.63 | 76.51 | 85.36 |
| local+ki SVM | 75.88 | 76.71 | 80.42 | 78.04 | 75.55 | 75.27 | 75.15 | 76.91 | 85.46 |
| local+pp+ki SVM | 76.51 | 75.73 | 80.68 | 78.00 | 75.91 | 75.83 | 76.83 | 77.10 | 85.57 |
| local SVR | 77.18 | 76.48 | 80.23 | 79.02 | 75.08 | 75.07 | 71.91 | 75.34 | 83.09 |
| local+pp SVR | 77.60 | 78.92 | 81.98 | 80.59 | 76.10 | 76.48 | 76.67 | 76.54 | 85.98 |
| local+ki SVR | 75.79 | 76.50 | 80.87 | 78.59 | 75.59 | 75.33 | 75.03 | 76.96 | 85.42 |
| local+pp+ki SVR | 76.06 | 78.94 | 81.71 | 79.58 | 75.98 | 75.94 | 76.73 | 77.15 | 85.83 |
| Mode 2 | | | | | | | | | |
| direct | 60.52 | 66.81 | 66.97 | 61.41 | 54.01 | 53.70 | 65.19 | 65.81 | 72.50 |
| local SVM | 83.37 | 83.96 | 84.94 | 83.22 | 81.74 | 81.75 | 75.47 | 81.95 | 88.76 |
| local+pp SVM | 83.26 | 83.14 | 86.15 | 84.23 | 81.68 | 81.89 | 81.34 | 82.85 | 91.37 |
| local+ki SVM | 81.84 | 84.00 | 86.02 | 82.77 | 81.30 | 80.98 | 78.31 | 82.63 | 89.99 |
| local+pp+ki SVM | 82.20 | 83.06 | 86.17 | 82.38 | 81.54 | 81.54 | 80.91 | 82.76 | 91.16 |
| local SVR | 83.88 | 83.30 | 86.79 | 85.54 | 82.68 | 82.71 | 76.36 | 82.89 | 89.92 |
| local+pp SVR | 84.37 | 85.62 | 88.12 | 87.00 | 82.43 | 82.87 | 80.61 | 83.65 | 91.82 |
| local+ki SVR | 82.31 | 83.29 | 86.93 | 84.16 | 82.29 | 81.99 | 79.02 | 83.65 | 90.17 |
| local+pp+ki SVR | 82.63 | 85.55 | 87.02 | 85.03 | 82.44 | 82.54 | 81.09 | 83.80 | 91.51 |
| Mode 3 (mode 2 with self-interactions removed) | | | | | | | | | |
| direct | 58.91 | 65.99 | 65.61 | 59.81 | 52.10 | 51.79 | 63.74 | 64.40 | 71.37 |
| local SVM | 83.63 | 84.16 | 85.15 | 83.55 | 82.04 | 82.06 | 75.72 | 82.25 | 88.90 |
| local+pp SVM | 83.32 | 83.70 | 86.71 | 84.60 | 82.33 | 82.45 | 81.79 | 83.59 | 91.46 |
| local+ki SVM | 82.07 | 84.20 | 86.54 | 83.22 | 81.78 | 81.49 | 78.77 | 83.11 | 90.05 |
| local+pp+ki SVM | 82.50 | 83.55 | 86.66 | 82.75 | 82.10 | 82.02 | 81.64 | 83.45 | 91.28 |
| local SVR | 84.10 | 83.51 | 86.99 | 85.78 | 82.99 | 83.01 | 76.61 | 83.20 | 90.09 |
| local+pp SVR | 84.74 | 85.71 | 88.21 | 87.00 | 82.82 | 83.37 | 81.29 | 84.31 | 91.88 |
| local+ki SVR | 82.49 | 83.51 | 87.35 | 84.27 | 82.65 | 82.39 | 79.41 | 84.03 | 90.18 |
| local+pp+ki SVR | 82.67 | 85.74 | 87.43 | 85.11 | 82.79 | 82.87 | 81.67 | 84.25 | 91.55 |
Supplementary Table S3: Complete set of prediction accuracies on DIP_MIPS_iPfam (AUC, in percent)
Color code: red - rank 1, green - rank 2, blue - rank 3 of each mode
| Method | phy | loc | exp-gasch | exp-spellman | y2h-ito | y2h-uetz | tap-gavin | tap-krogan | int |
| Mode 1 | | | | | | | | | |
| direct | 63.09 | 64.23 | 68.60 | 62.24 | 53.40 | 57.34 | 63.46 | 64.58 | 73.68 |
| kCCA | 68.78 | 62.24 | 70.93 | 66.85 | 55.25 | 56.70 | 62.88 | 62.59 | 74.45 |
| kML | 65.04 | 67.58 | 70.09 | 69.80 | 58.12 | 59.90 | 63.72 | 61.19 | 77.58 |
| em | 63.22 | 67.90 | 65.15 | 61.74 | 56.23 | 58.31 | 68.02 | 62.92 | 78.46 |
| local SVM | 72.45 | 69.90 | 71.45 | 69.02 | 66.56 | 66.53 | 64.95 | 66.92 | 74.28 |
| local+pp SVM | 73.00 | 70.38 | 75.69 | 74.12 | 72.10 | 72.10 | 75.84 | 71.83 | 83.26 |
| local+ki SVM | 73.67 | 69.89 | 76.89 | 72.01 | 69.80 | 69.25 | 72.75 | 72.41 | 82.44 |
| local+pp+ki SVM | 72.93 | 70.76 | 77.46 | 72.17 | 70.86 | 70.78 | 74.47 | 72.81 | 83.20 |
| local SVR | 72.85 | 70.50 | 72.89 | 70.60 | 66.58 | 66.56 | 64.97 | 66.93 | 74.76 |
| local+pp SVR | 74.48 | 74.99 | 78.09 | 75.89 | 72.02 | 72.09 | 75.88 | 71.56 | 83.72 |
| local+ki SVR | 74.03 | 70.47 | 76.87 | 72.87 | 69.88 | 69.39 | 72.80 | 72.43 | 82.41 |
| local+pp+ki SVR | 73.62 | 74.92 | 78.35 | 75.08 | 70.93 | 70.97 | 74.48 | 73.01 | 83.39 |
| Mode 2 | | | | | | | | | |
| direct | 67.57 | 66.48 | 71.54 | 66.24 | 57.74 | 61.52 | 67.46 | 68.86 | 76.53 |
| Pkernel | 73.51 | 68.24 | 78.91 | 77.08 | 58.10 | 58.51 | 72.65 | 69.98 | 85.04 |
| local SVM | 77.78 | 77.79 | 76.67 | 73.99 | 72.93 | 72.98 | 68.68 | 73.23 | 81.10 |
| local+pp SVM | 77.42 | 75.15 | 79.94 | 77.10 | 76.21 | 76.20 | 78.45 | 76.28 | 87.10 |
| local+ki SVM | 78.31 | 77.80 | 80.86 | 75.24 | 75.52 | 73.99 | 74.65 | 77.51 | 85.95 |
| local+pp+ki SVM | 77.71 | 74.93 | 81.38 | 75.46 | 75.61 | 75.83 | 77.37 | 78.03 | 86.75 |
| local SVR | 78.78 | 77.80 | 79.84 | 77.38 | 73.46 | 73.49 | 69.01 | 73.72 | 82.12 |
| local+pp SVR | 79.25 | 81.65 | 83.01 | 81.67 | 76.76 | 76.88 | 79.75 | 76.99 | 88.26 |
| local+ki SVR | 78.88 | 77.80 | 81.55 | 77.83 | 76.11 | 74.62 | 75.56 | 78.07 | 86.51 |
| local+pp+ki SVR | 78.78 | 81.68 | 82.60 | 79.90 | 76.08 | 76.20 | 77.79 | 78.72 | 87.68 |
| Mode 3 (mode 2 with self-interactions removed) | | | | | | | | | |
| direct | 63.77 | 64.30 | 68.19 | 62.24 | 52.71 | 56.94 | 63.61 | 65.17 | 73.78 |
| Pkernel | 72.66 | 66.76 | 78.42 | 75.88 | 57.31 | 56.90 | 73.18 | 69.76 | 85.63 |
| local SVM | 77.91 | 77.88 | 76.83 | 74.23 | 73.34 | 73.40 | 68.58 | 73.68 | 81.18 |
| local+pp SVM | 77.81 | 75.17 | 79.91 | 76.99 | 75.93 | 75.76 | 78.28 | 77.45 | 86.80 |
| local+ki SVM | 78.49 | 77.88 | 80.82 | 75.64 | 74.94 | 73.45 | 73.96 | 77.05 | 85.30 |
| local+pp+ki SVM | 78.17 | 75.32 | 81.02 | 75.28 | 75.25 | 75.04 | 76.87 | 77.53 | 86.58 |
| local SVR | 78.95 | 77.89 | 80.09 | 77.66 | 73.95 | 74.00 | 69.02 | 74.28 | 82.40 |
| local+pp SVR | 79.05 | 80.91 | 82.42 | 80.88 | 76.33 | 76.30 | 79.10 | 77.45 | 87.54 |
| local+ki SVR | 78.86 | 77.90 | 81.36 | 77.39 | 75.34 | 73.87 | 74.60 | 77.45 | 85.74 |
| local+pp+ki SVR | 78.34 | 80.85 | 82.15 | 79.01 | 75.39 | 75.26 | 77.22 | 77.92 | 87.32 |
Supplementary Figure S1: Prediction accuracy of local modeling with and without training set expansion, trained on different sub-samples of the BioGRID-10 dataset.
Supplementary Figure S2: For each interaction in the BioGRID-10 dataset, we computed the difference between its rank among all predictions given by training set expansion (local+pp or local+ki) and its best rank among the predictions given by the four methods em, kCCA, kML and local. A positive rank difference indicates that training set expansion ranked the correct interaction higher than any of the four methods in comparison. This figure shows the correlations between the positive rank differences and 1) the minimum node degree, 2) the average node degree, and 3) the similarity (inner product according to the integrated kernel) of the two interacting proteins. Correlations and the corresponding p-values are computed using both Pearson and Spearman correlation.
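The rank-difference analysis described in this caption can be sketched as follows. This is an illustration under assumptions, not the code behind the figure: it assumes rank 1 is the top prediction, so that the difference is taken as (best baseline rank) minus (training set expansion rank), and all function names and the dictionary layout are hypothetical. P-values are omitted to keep the sketch free of external dependencies.

```python
from math import sqrt


def positive_rank_differences(expansion_rank, baseline_ranks):
    """Map each interaction to its rank difference, keeping positive ones.

    expansion_rank: dict pair -> rank under training set expansion.
    baseline_ranks: list of dicts pair -> rank, one per baseline method.
    Assumption: rank 1 is best, so a positive difference means training
    set expansion ranked the interaction above every baseline.
    """
    diffs = {}
    for pair, rank in expansion_rank.items():
        best_baseline = min(ranks[pair] for ranks in baseline_ranks)
        diff = best_baseline - rank
        if diff > 0:
            diffs[pair] = diff
    return diffs


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

The positive differences would then be correlated against a per-interaction quantity such as the minimum node degree of the two proteins. If SciPy is available, `scipy.stats.pearsonr` and `scipy.stats.spearmanr` return a p-value alongside each coefficient.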