Table I. The dataset of fragments of proteins: Fernandez-Escamilla,A.M. et al. (2004) Prediction of sequence-dependent and mutational effects on the aggregation of peptides and proteins. Nat. Biotechnol., 22, 1302-1306. | ||
Name | Sequence | Experimental amyloidogenicity |
τ-protein | ||
K19 | PGGGKVQIVYKPV | + |
K19d | PGGGKVYKPV | - |
Mut1 | PGGGKNAEVYKPV | - |
Mut2 | PGGGKVQIVEKPV | - |
K19Chym | QTAPVPMPDLKNVKSKIGSTENLKHQPGGGKVQIVY | - |
K19Chym1 | KPVDLSKVTSKCGSLGNIHHKPGGGQVEVKSEKLDF | - |
K19Chym2 | KDRVQSKIGSLDNITHVPGGGN | - |
K19Gluc4 | QTAPVPMPDLKNVKSKIGSTE | - |
K19Gluc41 | NLKHQPGGGKVQIVYKPVDLSKVTSKCGSLGNIHHKPGGGQVE | + |
K19Gluc42 | VKSE | - |
K19Gluc43 | KLDFKDRVQSKIGSLDNITHVPGGGN | - |
K19Gluc78 | QTAPVPMPD | - |
K19Gluc781 | LKNVKSKIGSTE | - |
K19Gluc782 | NLKHQPGGGKVQIVYKPVD | + |
K19Gluc783 | LSKVTSKCGSLGNIHHKPGGGQVE | - |
K19Gluc784 | VKSEKLDFKDRVQSKIGSLDNITHVPGGGN | - |
PHF8 | GKVQIVYK | + |
PHF6 | VQIVYK | + |
V313-K321 | VDLSKVTSK | - |
V318-G335 | VTSKCGSLGNIHHKPGGG | - |
V335-E342 | GQVEVKSE | - |
Amyloid β (Aβ) peptide (1-40) | ||
Whole | DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVV | + |
HABP1 | VPHQKLVFFAEDVGS | + |
HABP2 | VHPQKLVFFAEDVGS | + |
HABP3 | VHHPKLVFFAEDVGS | + |
HABP4 | VHHQPLVFFAEDVGS | + |
HABP5 | KKPVFFAED | - |
HABP6 | KKLPFFAED | - |
HABP7 | KKLVPFAED | - |
HABP8 | VHHQKLVPFAEDVGS | - |
HABP9 | KKLVFPAED | - |
HABP10 | KKLVFFPED | + |
HABP11 | VHHQEKLVFFAPDVGS | - |
HABP12 | VHHQEKLVFFAEPVGS | + |
HABP13 | VHHQEKLVFFAEDPGS | + |
HABP14 | VHHQEKLVFFAEDVPS | + |
HABP15 | KKLVFFAED | + |
HABP16 | VHHQKLVFFAEDVGS | + |
AB1 | KLVFF | - |
AB2 | QKLVFFA | - |
AB3 | HQKLVFFAE | - |
AB4 | HHQKLVFFAED | + |
AB5 | VHHQKLVFFAEDV | + |
AB6 | EVHHQKLVFFAEDVG | + |
AB7 | YEVHHQKLVFFAEDVGS | + |
AB8 | GYEVHHQKLVFFAEDVGSN | + |
AB9 | SGYEVHHQKLVFFAEDVGSNK | + |
AB10 | DSGYEVHHQKLVFFAEDVGSNKG | + |
AB11 | HDSGYEVHHQKLVFFAEDVGSNKGA | + |
Alpha synuclein | ||
NAC1-18 | EQVTNVGGAVVTGVTAVA | + |
NAC1-18s | TVNGVGEVTATAVQGVAV | + |
NAC3-18 | VTNVGGAVVTGVTAVA | + |
NAC1-13 | EQVTNVGGAVVTG | + |
NAC6-14 | VGGAVVTGV | + |
Acyl phosphatase | ||
1-17 | STAQSLKSVDYEVFGRV | - |
18-33 | QGVSFRMYTEDEARKI | - |
34-53 | GVVGWVKNTSKGTVTGQVQG | + |
54-68 | PEDKVNSMKSWLSKV | - |
69-85 | GSPSSRIDRTNFSNEKT | - |
86-98 | ISKLEYSNFSVRY | + |
β2-microglobulin | ||
A | IQRTPKIQVYSRHPAE | - |
B | NGKSNFLNCYVSG | - |
C | FHPSDIEVDLLK | - |
D | NGERIEKVEHSDLSFSKD | - |
E | DWSFYLLYYTEFT | + |
E1 | DWSFYLLYYTEFTPTGKDEYA | + |
F | PTGKDEYACRVNHVT | - |
G | LSQPKIVKWDRDM | - |
434 Cro repressor | ||
1Cro | MQTLSERLKKRRIALKY | - |
2Cro | YKMTQTELATKAGVK | - |
3Cro | YKQQSIQLIEAGVTKR | - |
4Cro | TKRPRFLYEIAMALNSD | + |
5Cro | AMALNCDPVWLQYGTKRGKA | - |
Sperm whale myoglobin | ||
A-Helix | VLSEGEWQLVLHVWAKVEA | + |
AB-Domain | EGEWQLVLHVWAKVEADVAGHGQDILIRLFK | + |
B-Helix | DVAGHGQDILIRLFKS | + |
BC-Turn | KSHPET | - |
CCD-Domain | HPETLEKFDRFKHLK | - |
D-Helix | TEAEMKA | - |
E-Helix | SEDLKKHGVTVLTALGAILK | - |
EF-Turn | KKGHHEAE | - |
F-helix | ELKPLAQSHA | - |
FG-Turn | ATKHKIP | - |
Myohemerythrin | ||
N-terminal | GWEIPEPYVWDESFRVFY | - |
C-terminal | GTDFKYKGKL | - |
A_helix | YEQLDEEHKKIFKGIFDCIRD | - |
A_helix | YEQLDEEHKKIFKGIFDCIRD | + |
AB_loop | RDNSA | - |
B_helix | SAPNLATLVKVTTNHFTHEEAMMD | + |
BC_loop | DAAKYSEV | - |
C_Helix | EVVPHKKMHKDFLEKIGGL | + |
CD_loop | GLSAPVD | - |
D_helix | AKNVDYCKEWLVNHIK | - |
D_helix | AKNVDYCKEWLVNHIK | - |
French bean plastocyanin | ||
Pc-1 | LEVLLGSG | - |
Pc-2 | LEVLLGSGDGSLVFV | + |
Pc-2a | SGDGSL | - |
Pc-3 | SLVFVPSEFS | - |
Pc-4 | SEFSV | - |
Pc-5 | SEFSVPSGEK | - |
Pc-6 | KIVFKNNA | - |
Pc-6a | GEKIVFKNNAGFPHNVVFDE | + |
Pc-7 | KIVFKNNAGFPH | - |
Pc-8 | KNNAGFPHNV | - |
Pc-9 | PHNVVFDEDDEIP | - |
Pc-10 | IPAGVDAVKISM | + |
Pc-10a | EIPAGV | - |
Pc-10b | DAVKIS | - |
Pc-11 | MPEEELL | - |
Pc-12 | MPEEELLNAPGETYVVTL | + |
Pc-13 | ELLNAPGETY | - |
Pc-13a | NAPGETY | - |
Pc-13b | APGET | - |
Pc-14 | GETYVVTL | + |
Pc-14a | ETYVVT | - |
Pc-15 | VTLDTKGTY | - |
Pc-16 | GTYSFYT | + |
Pc-16a | TYSFYC | - |
Pc-17 | YTSPHQGAGMV | - |
Pc-18 | MVGKVTVN | - |
Pc-19 | GTVSFVTSPHQGAGMVGKVTVN | + |
Bovine pancreatic trypsin inhibitor (BPTI) | ||
P1-15 | RPDFSLEPPYTGPSK | - |
P29-44 | LSQTFVYGGSRAKRNN | + |
P13-21 | PSKARIIRY | - |
P41-51 | KRNNFKSAEDS | - |
P16-28 | ARIIRYFYNAKAG | - |
P45-58 | FKSAEDSMRTSGGA | - |
P24-32 | NAKAGLSQT | - |
N-terminal domain of ribosomal protein L9 | ||
Beta_1 | MKVIFLKDVKG | + |
Beta_2 | KGKKGEIKNVAD | - |
Alpha_1 | GYANNFLFKQG | + |
Beta_3 | LAIEATPA | - |
Alpha_2 | TPANLKALEAQKQKEQR | - |
Glutathione S-transferase P domain II (Glutex) | ||
Alpha_4 | DQKEAALVDMVNDGVEDLRCKYATLIYT | - |
Alpha_5 | YEAGKEKYVKELPEHLKPFETLLSQ | - |
Alpha_6 | QISFADYNLLDLLRIHQVLN | + |
Alpha_7 | PLLSAYVARLSA | - |
Alpha_8 | PKIKAFLA | - |
Spectrin SH3 | ||
M_2 | AYVKKLDSGTGKELVLAL | - |
M_4 | YDYQEKSPREVTMKKGD | - |
M_8 | DILTLLNSTNKDWWKVEVND | + |
M_C | GGKDWWKVGG | - |
M_6 | DWWKVEVNDRQGFVPA | + |
M_68 | DILTLLNSTNKDWWKVEVNDRQGFVPA | + |
M_681 | DILTLLNSTNKDWWKVEVNDRQGFVPA | - |
Ada-2h | ||
H1_Wt | VPSNEEQIKNLLQLEAQEHLQY | - |
H1_Mt | VPSNEEQIKKLLELEAKKHLQY | - |
H2_WT | FVNVQAVKVFLESQGIAY | + |
H2_Mt | FVNVEAVKAFLEAHGIAY | + |
Ara | ||
Ara1 | AVGKSNLLSRYARNEFSA | - |
Ara2 | RFRAVTSAYYRGAVG | - |
Ara3 | TRRTTFESVGRWLDELKIHSD | - |
Ara4 | AVSVEEGKALAEEEGLF | - |
Ara5 | STNVKTAFEMVILDIYNNV | + |
Com-A | ||
ComA1 | DHPAVMEGTKTILETDSNLS | - |
ComA2 | EPSEQFIKQHDFSSY | - |
ComA3 | VNGMELSKQILQENPH | - |
ComA4 | EVEDYFEEAIRAGLH | - |
ComA5 | TESKEKITQYIYHVLNGEIL | + |
CheY | ||
Che_Y1 | DFSTMRRIVRNLLKELGYN | - |
Che_Y2 | EDGVDALNKLQAGGY | - |
Che_Y3 | MDGLELLKTIRADSAY | - |
Che_Y4 | AKKENIIAAAQAGASGY | + |
Che_Y5 | PFTAATLEEKLNKIFEKLGMY | + |
Flavodoxin | ||
FXN1 | GTGNTEKMAELIAKGIIESGKDY | - |
FXN3 | EESEFEPFIEEISTKISY | - |
FXN4 | GDGKWMRDFEQRMNGYGSV | - |
FXN5 | EPDEAEQDSIEFGKKIANIY | - |
P21-ras | ||
P21A | GVGKSALTIQLIQNHFVY | + |
P21B | EYSAMRDQYMRTGEG | - |
P21C | INNTKSFEDIHQYREQIKRVKDS | - |
P21D | ARTVESRQAQDLARSYGIP | - |
P21E | RQGVEDAFYTLVREIRQHK | + |
PL B1 protein | ||
PL_B1_95-114_pH_4.1 | VTIKANLIFANGFTQTAEFKG | + |
PL_B1_114-138_pH_2.4 | KGTFEKATSEAYAYADTLKKDNGEY | + |
PL_B1_136-155D_pH_6.1 | GEYTVDVADKGYTLNIKFAGD | + |
Protein G | ||
ProteinG2-19 | TYKLINGKTLKGETTTEA | - |
ProteinG21-40 | GDAATAEKVFKQYANDNGVD | - |
ProteinG41-56 | GEWTYDDATKTFTVTE | + |
Human prion protein | ||
1-10 | QGGGTHSQWN | - |
6-15 | HSQWNKPSKP | - |
11-20 | KPSKPKTNMK | - |
16-25 | KTNMKHMAGA | - |
21-30 | HMAGAAAAGA | - |
26-35 | AAAGAVVGGL | - |
31-40 | VVGGLGGYML | - |
36-45 | GGYMLGSAMS | - |
41-50 | GSAMSRPIIH | - |
51-60 | FGSDYEDRYY | - |
56-65 | EDRYYRENMH | - |
61-70 | RENMHRYPNQ | - |
66-75 | RYPNQVYYRP | - |
71-80 | VYYRPMDEYS | - |
76-85 | MDEYSNQNNF | - |
81-90 | NQNNFVHDCV | - |
86-95 | VHDCVNITIK | + |
91-100 | NITIKQHTVT | - |
96-105 | QHTVTTTTKG | - |
101-110 | TTTKGENFTE | - |
106-115 | ENFTETDVKM | - |
111-120 | TDVKMMERVV | - |
116-125 | MERVVEQMCI | - |
121-130 | EQMCITQYER | - |
126-135 | TQYERESQAY | - |
131-140 | ESQAYYQRGS | - |
136-145 | YQRGSSMVLF | - |
141-150 | SMVLFSSPPV | + |
146-155 | SSPPVILLIS | + |
151-160 | ILLISFLIFL | + |
156-163 | FLIFLIVG | + |
Human lysozyme | ||
5-14 | RCELARTLKR | + |
10-19 | RTLKRLGMDG | - |
15-24 | LGMDGYRGIS | - |
20-29 | YRGISLANWM | - |
25-34 | LANWMCLAKW | + |
30-39 | CLAKWESGYN | - |
35-44 | ESGYNTRATN | - |
40-49 | TRATNYNAGD | - |
45-54 | YNAGDRSTDY | - |
50-59 | RSTDYGIFQI | - |
55-64 | GIFQINSRYW | - |
60-69 | NSRYWCNDGK | - |
65-74 | CNDGKTPGAV | - |
70-79 | TPGAVNACHL | - |
75-84 | NACHLSCSAL | - |
85-94 | LQDNIADAVA | - |
90-99 | ADAVACAKRV | - |
95-104 | CAKRVVRDPQ | - |
100-109 | VRDPQGIRAW | - |
105-114 | GIRAWVAWRN | - |
110-119 | VAWRNRCQNR | - |
115-124 | RCQNRDVRQY | - |
120-129 | DVRQYVQGCG | - |
β2-microglobulin | ||
4-13 | RTPKIQVYSR | - |
9-18 | QVYSRHPAEN | - |
14-23 | HPAENGKSNF | - |
24-33 | LNCYVSGFHP | - |
29-38 | SGFHPSDIEV | - |
34-43 | SDIEVDLLKN | - |
39-48 | DLLKNGERIE | - |
44-53 | GERIEKVEHS | - |
49-58 | KVEHSDLSFS | - |
54-63 | DLSFSKDWSF | + |
59-68 | KDWSFYLLYY | + |
64-73 | YLLYYTEFTP | + |
69-78 | TEFTPTEKDE | + |
74-83 | TEKDEYACRV | - |
79-88 | YACRVNHVTL | - |
84-93 | NHVTLSQPKI | - |
Table II. The dataset of fibril-forming and nonfibril-forming peptides (AmylHex): Thompson,M.J. et al. (2006) The 3D profile method for identifying fibril-forming segments of proteins. Proc. Natl. Acad. Sci. USA, 103, 4074-4078. | |
Sequence | Experimental amyloidogenicity |
YTVIIE | + |
WTVIIE | + |
VTVIIE | + |
TTVIIE | + |
SYVIIE | + |
SVVIIE | + |
STVYIE | + |
STVWIE | + |
STVTIE | + |
STVNIE | + |
STVLIE | + |
STVIYE | + |
STVIIY | + |
STVIIW | + |
STVIIV | + |
STVIIT | + |
STVIIS | + |
STVIIQ | + |
STVIIN | + |
STVIIM | + |
STVIIL | + |
STVIII | + |
STVIIF | + |
STVIIE | + |
STVIID | + |
STVIIA | + |
STVIFE | + |
STVFIE | + |
STVEIE | + |
STSIIE | + |
STQIIE | + |
STNIIE | + |
STLIIE | + |
STFIIE | + |
STEIIE | + |
SSVIIE | + |
SQVIIE | + |
SNVIIE | + |
SMVIIE | + |
SLVIIE | + |
SIVIIE | + |
SGVIIE | + |
SFVIIE | + |
SEVIIE | + |
SDVIIE | + |
SAVIIE | + |
QTVIIE | + |
NTVIIE | + |
MTVIIE | + |
LTVIIE | + |
ITVIIE | + |
GTVIIE | + |
FTVIIE | + |
ETVIIE | + |
DTVIIE | + |
ATVIIE | + |
NHVTLS | + |
LLYYTE | + |
KIVKWD | + |
KDWSFY | + |
FYLLYY | + |
VEALYL | + |
LYQLEN | + |
LVEALY | + |
NFGAIL | + |
FLVHSS | + |
VQIVYK | + |
SWVIIE | - |
STYIIE | - |
STVVIE | - |
STVSIE | - |
STVQIE | - |
STVPIE | - |
STVMIE | - |
STVIWE | - |
STVIVE | - |
STVITE | - |
STVISE | - |
STVIQE | - |
STVIPE | - |
STVINE | - |
STVIME | - |
STVILE | - |
STVIIP | - |
STVIGE | - |
STVIEE | - |
STVIDE | - |
STVIAE | - |
STVGIE | - |
STVDIE | - |
STVAIE | - |
STTIIE | - |
STPIIE | - |
STMIIE | - |
STIIIE | - |
STGIIE | - |
STDIIE | - |
STAIIE | - |
SPVIIE | - |
PTVIIE | - |
KTVLIE | - |
KTVIYE | - |
KTVIVE | - |
KTVIIT | - |
KTVIIE | - |
YYTEFT | - |
YVSGFH | - |
WSFYLL | - |
VYSRHP | - |
VTLSQP | - |
VKWDRD | - |
TEFTPT | - |
SRHPAE | - |
SGFHPS | - |
SDLSFS | - |
RVNHVT | - |
RTPKIQ | - |
QPKIVK | - |
PTEKDE | - |
PSDIEV | - |
PKIQVY | - |
NGKSNF | - |
NGERIE | - |
LSQPKI | - |
LSFSKD | - |
LKNGER | - |
KWDRDM | - |
KVEHSD | - |
KSNFLN | - |
IQVYSR | - |
IQRTPK | - |
IEKVEH | - |
HPAENG | - |
FTPTEK | - |
FSKDWS | - |
FHPSDI | - |
EVDLLK | - |
ERIEKV | - |
EKDEYA | - |
EHSDLS | - |
DLLKNG | - |
DIEVDL | - |
AENGKS | - |
YQLENY | - |
SLYQLE | - |
SHLVEA | - |
RGFFYT | - |
HLVEAL | - |
GSHLVE | - |
GFFYTP | - |
GERGFF | - |
FYTPKT | - |
FVNQHL | - |
FFYTPK | - |
ERGFFY | - |
EALYLV | - |
NLGPVL | - |
LIAGFN | - |
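
For readers who want to use Table II programmatically, the following is a minimal sketch of one way to read its two-column rows into labelled records. The function name `parse_rows` and the row layout it expects (`SEQUENCE | + |` or `SEQUENCE | - |`) are assumptions based on how the table is printed here, not part of the original datasets. The three-column rows of Table I would need an extra name field.

```python
from collections import Counter


def parse_rows(lines):
    """Parse 'SEQUENCE | + |' style rows into (sequence, is_amyloidogenic) pairs.

    Hypothetical helper: captions, header rows, and any row without a
    '+' or '-' label in the second column are skipped.
    """
    records = []
    for line in lines:
        cells = [cell.strip() for cell in line.split("|")]
        # Drop the empty cells produced by trailing pipes.
        while cells and cells[-1] == "":
            cells.pop()
        if len(cells) != 2 or cells[1] not in {"+", "-"}:
            continue
        records.append((cells[0], cells[1] == "+"))
    return records


# A few rows copied from Table II, including its header row.
sample = [
    "Sequence | Experimental amyloidogenicity |",
    "STVIIE | + |",
    "VQIVYK | + |",
    "KTVIIE | - |",
    "STVIIP | - |",
]

records = parse_rows(sample)
counts = Counter(label for _, label in records)
print(counts[True], counts[False])
```

Applied to the full table, the same counting step gives the number of fibril-forming and nonfibril-forming hexapeptides, which is a quick check that no rows were lost in transcription.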