Table I. The dataset of fragments of proteins:
Fernandez-Escamilla,A.M. et al. (2004)
Prediction of sequence-dependent and mutational effects on the aggregation of peptides and proteins.
Nat. Biotechnol., 22, 1302-1306.
Name Sequence Experimental amyloidogenicity
τ-protein
K19 PGGGKVQIVYKPV +
K19d PGGGKVYKPV -
Mut1 PGGGKNAEVYKPV -
Mut2 PGGGKVQIVEKPV -
K19Chym QTAPVPMPDLKNVKSKIGSTENLKHQPGGGKVQIVY -
K19Chym1 KPVDLSKVTSKCGSLGNIHHKPGGGQVEVKSEKLDF -
K19Chym2 KDRVQSKIGSLDNITHVPGGGN -
K19Gluc4 QTAPVPMPDLKNVKSKIGSTE -
K19Gluc41 NLKHQPGGGKVQIVYKPVDLSKVTSKCGSLGNIHHKPGGGQVE +
K19Gluc42 VKSE -
K19Gluc43 KLDFKDRVQSKIGSLDNITHVPGGGN -
K19Gluc78 QTAPVPMPD -
K19Gluc781 LKNVKSKIGSTE -
K19Gluc782 NLKHQPGGGKVQIVYKEVD +
K19Gluc783 LSKVTSKCGSLGNIHHKPGGGQVE -
K19Gluc784 VKSEKLDFKDRVQSKIGSLDNITHVPGGGN -
PHF8 GKVQIVYK +
PHF6 VQIVYK +
V313-K321 VDLSKVTSK -
V318-G335 VTSKCGSLGNIHHKPGGG -
V335-E342 GQVEVSKE -
Amyloid beta Aβ peptide (1-40)
Whole DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVV +
HABP1 VPHQKLVFFAEDVGS +
HABP2 VHPQKLVFFAEDVGS +
HABP3 VHHPKLVFFAEDVGS +
HABP4 VHHQPLVFFAEDVGS +
HABP5 KKPVFFAED -
HABP6 KKLPFFAED -
HABP7 KKLVPFAED -
HABP8 VHHQKLVPFAEDVGS -
HABP9 KKLVFPAED -
HABP10 KKLVFFPED +
HABP11 VHHQEKLVFFAPDVGS -
HABP12 VHHQEKLVFFAEPVGS +
HABP13 VHHQEKLVFFAEDPGS +
HABP14 VHHQEKLVFFAEDVPS +
HABP15 KKLVFFAED +
HABP16 VHHQKLVFFAEDVGS +
AB1 KLVFF -
AB2 QKLVFFA -
AB3 HQKLVFFAE -
AB4 HHQKLVFFAED +
AB5 VHHQKLVFFAEDV +
AB6 EVHHQKLVFFAEDVG +
AB7 YEVHHQKLVFFAEDVGS +
AB8 GYEVHHQKLVFFAEDVGSN +
AB9 SGYEVHHQKLVFFAEDVGSNK +
AB10 DSGYEVHHQKLVFFAEDVGSNKG +
AB11 HDSGYEVHHQKLVFFAEDVGSNKGA +
Alpha synuclein
NAC1-18 EQVTNVGGAVVTGVTAVA +
NAC1-18s TVNGVGEVTATAVQGVAV +
NAC3-18 VTNVGGAVVTGVTAVA +
NAC1-13 EQVTNVGGAVVTG +
NAC6-14 VGGAVVTGV +
Acyl phosphatase
1-17 STAQSLKSVDYEVFGRV -
18-33 QGVSFRMYTEDEARKI -
34-53 GVVGWVKNTSKGTVTGQVQG +
54-68 PEDKVNSMKSWLSKV -
69-85 GSPSSRIDRTNFSNEKT -
86-98 ISKLEYSNFSVRY +
β2-microglobulin
A IQRTPKIQVYSRHPAE -
B NGKSNFLNCYVSG -
C FHPSDIEVDLLK -
D NGERIEKVEHSDLSFSKD -
E DWSFYLLYYTEFT +
E1 DWSFYLLYYTEFTPTGKDEYA +
F PTGKDEYACRVNHVT -
G LSQPKIVKWDRDM -
434 Cro repressor
1Cro MQTLSERLKKRRIALKY -
2Cro YKMTQTELATKAGVK -
3Cro YKQQSIQLIEAGVTKR -
4Cro TKRPRFLYEIAMALNSD +
5Cro AMALNCDPVWLQYGTKRGKA -
Sperm whale myoglobin
A-Helix VLSEGEWQLVLHVWAKVEA +
AB-Domain EGEWQLVLHVWAKVEADVAGHGQDILIRLFK +
B-Helix DVAGHGQDILIRLFKS +
BC-Turn KSHPET -
CCD-Domai HPETLEKFDRFKHLK -
D-Helix TEAEMKA -
E-Helix SEDLKKHGVTVLTALGAILK -
EF-Turn KKGHHEAE -
F-helix ELKPLAQSHA -
FG-Turn ATKHKIP -
Myohemerythrin
N-terminal GWEIPEPYVWDESFRVFY -
C-terminal GTDFKYKGKL -
A_helix YEQLDEEHKKIFKGIFDCIRD -
A_helix YEQLDEEHKKIFKGIFDCIRD +
AB_loop RDNSA -
B_helix SAPNLATLVKVTTNHFTHEEAMMD +
BC_loop DAAKYSEV -
C_Helix EVVPHKKMHKDFLEKIGGL +
CD_loop GLSAPVD -
D_helix AKNVDYCKEWLVNHIK -
D_helix AKNVDYCKEWLVNHIK -
French bean plastocyanin
Pc-1 LEVLLGSG -
Pc-2 LEVLLGSGDGSLVFV +
Pc-2a SGDGSL -
Pc-3 SLVFVPSEFS -
Pc-4 SEFSV -
Pc-5 SEFSVPSGEK -
Pc-6 KIVFKNNA -
Pc-6a GEKIVFKNNAGFPHNVVFDE +
Pc-7 KIVFKNNAGFPH -
Pc-8 KNNAGFPHNV -
Pc-9 PHNVVFDEDDEIP -
Pc-10 IPAGVDAVKISM +
Pc-10a EIPAGV -
Pc-10b DAVKIS -
Pc-11 MPEEELL -
Pc-12 MPEEELLNAPGETYVVTL +
Pc-13 ELLNAPGETY -
Pc-13a NAPGETY -
Pc-13b APGET -
Pc-14 GETYVVTL +
Pc-14a ETYVVT -
Pc-15 VTLDTKGTY -
Pc-16 GTYSFYT +
Pc-16a TYSFYC -
Pc-17 YTSPHQGAGMV -
Pc-18 MVGKVTVN -
Pc-19 GTVSFVTSPHQGAGMVGKVTVN +
Bovine pancreatic trypsin inhibitor (BPTI)
P1-15 RPDFSLEPPYTGPSK -
P29-44 LSQTFVYGGSRAKRNN +
P13-21 PSKARIIRY -
P41-51 KRNNFKSAEDS -
P16-28 ARIIRYFYNAKAG -
P45-58 FKSAEDSMRTSGGA -
P24-32 NAKAGLSQT -
N-terminal domain of ribosomal protein L9
Beta_1 MKVIFLKDVKG +
Beta_2 KGKKGEIKNVAD -
Alpha_1 GYANNFLFKQG +
Beta_3 LAIEATPA -
Alpha_2 TPANLKALEAQKQKEQR -
Glutathione S transferase P domain II (Glutex)
Alpha_4 DQKEAALVDMVNDGVEDLRCKYATLIYT -
Alpha_5 YEAGKEKYVKELPEHLKPFETLLSQ -
Alpha_6 QISFADYNLLDLLRIHQVLN +
Alpha_7 PLLSAYVARLSA -
Alpha_8 PKIKAFLA -
Spectrin SH3
M_2 AYVKKLDSGTGKELVLAL -
M_4 YDYQEKSPREVTMKKGD -
M_8 DILTLLNSTNKDWWKVEVND +
M_C GGKDWWKVGG -
M_6 DWWKVEVNDRQGFVPA +
M_68 DILTLLNSTNKDWWKVEVNDRQGFVPA +
M_681 DILTLLNSTNKDWWKVEVNDRQGFVPA -
Ada-2h
H1_Wt VPSNEEQIKNLLQLEAQEHLQY -
H1_Mt VPSNEEQIKKLLELEAKKHLQY -
H2_WT FVNVQAVKVFLESQGIAY +
H2_Mt FVNVEAVKAFLEAHGIAY +
Ara
Ara1 AVGKSNLLSRYARNEFSA -
Ara2 RFRAVTSAYYRGAVG -
Ara3 TRRTTFESVGRWLDELKIHSD -
Ara4 AVSVEEGKALAEEEGLF -
Ara5 STNVKTAFEMVILDIYNNV +
Com-A
ComA1 DHPAVMEGTKTILETDSNLS -
ComA2 EPSEQFIKQHDFSSY -
ComA3 VNGMELSKQILQENPH -
ComA4 EVEDYFEEAIRAGLH -
ComA5 TESKEKITQYIYHVLNGEIL +
CheY
Che_Y1 DFSTMRRIVRNLLKELGYN -
Che_Y2 EDGVDALNKLQAGGY -
Che_Y3 MDGLELLKTIRADSAY -
Che_Y4 AKKENIIAAAQAGASGY +
Che_Y5 PFTAATLEEKLNKIFEKLGMY +
Flavodoxin
FXN1 GTGNTEKMAELIAKGIIESGKDY -
FXN3 EESEFEPFIEEISTKISY -
FXN4 GDGKWMRDFEQRMNGYGSV -
FXN5 EPDEAEQDSIEFGKKIANIY -
P21-ras
P21A GVGKSALTIQLIQNHFVY +
P21B EYSAMRDQYMRTGEG -
P21C INNTKSFEDIHQYREQIKRVKDS -
P21D ARTVESRQAQDLARSYGIP -
P21E RQGVEDAFYTLVREIRQHK +
PL B1 protein
PL_B1_95-114_pH_4.1 VTIKANLIFANGFTQTAEFKG +
PL_B1_114-138_pH_2.4 KGTFEKATSEAYAYADTLKKDNGEY +
PL_B1_136-155D_pH_6.1 GEYTVDVADKGYTLNIKFAGD +
Protein G
ProteinG2-19 TYKLINGKTLKGETTTEA -
ProteinG21-40 GDAATAEKVFKQYANDNGVD -
ProteinG41-56 GEWTYDDATKTFTVTE +
Human prion protein
1-10 QGGGTHSQWN -
6-15 HSQWNKPSKP -
11-20 KPSKPKTNMK -
16-25 KTNMKHMAGA -
21-30 HMAGAAAAGA -
26-35 AAAGAVVGGL -
31-40 VVGGLGGYML -
36-45 GGYMLGSAMS -
41-50 GSAMSRPIIH -
51-60 FGSDYEDRYY -
56-65 EDRYYRENMH -
61-70 RENMHRYPNQ -
66-75 RYPNQVYYRP -
71-80 VYYRPMDEYS -
76-85 MDEYSNQNNF -
81-90 NQNNFVHDCV -
86-95 VHDCVNITIK +
91-100 NITIKQHTVT -
96-105 QHTVTTTTKG -
101-110 TTTKGENFTE -
106-115 ENFTETDVKM -
111-120 TDVKMMERVV -
116-125 MERVVEQMCI -
121-130 EQMCITQYER -
126-135 TQYERESQAY -
131-140 ESQAYYQRGS -
136-145 YQRGSSMVLF -
141-150 SMVLFSSPPV +
146-155 SSPPVILLIS +
151-160 ILLISFLIFL +
156-163 FLIFLIVG +
Human lysozyme
5-14 RCELARTLKR +
10-19 RTLKRLGMDG -
15-24 LGMDGYRGIS -
20-29 YRGISLANWM -
25-34 LANWMCLAKW +
30-39 CLAKWESGYN -
35-44 ESGYNTRATN -
40-49 TRATNYNAGD -
45-54 YNAGDRSTDY -
50-59 RSTDYGIFQI -
55-64 GIFQINSRYW -
60-69 NSRYWCNDGK -
65-74 CNDGKTPGAV -
70-79 TPGAVNACHL -
75-84 NACHLSCSAL -
85-94 LQDNIADAVA -
90-99 ADAVACAKRV -
95-104 CAKRVVRDPQ -
100-109 VRDPQGIRAW -
105-114 GIRAWVAWRN -
110-119 VAWRNRCQNR -
115-124 RCQNRDVRQY -
120-129 DVRQYVQGCG -
β2-microglobulin
4-13 RTPKIQVYSR -
9-18 QVYSRHPAEN -
14-23 HPAENGKSNF -
24-33 LNCYVSGFHP -
29-38 SGFHPSDIEV -
34-43 SDIEVDLLKN -
39-48 DLLKNGERIE -
44-53 GERIEKVEHS -
49-58 KVEHSDLSFS -
54-63 DLSFSKDWSF +
59-68 KDWSFYLLYY +
64-73 YLLYYTEFTP +
69-78 TEFTPTEKDE +
74-83 TEKDEYACRV -
79-88 YACRVNHVTL -
84-93 NHVTLSQPKI -


Table II. The dataset of fibril-forming and nonfibril-forming peptides (AmylHex):
Thompson,M.J. et al. (2006)
The 3D profile method for identifying fibril-forming segments of proteins.
Proc. Natl. Acad. Sci. USA, 103, 4074-4078.
Sequence Experimental amyloidogenicity
YTVIIE +
WTVIIE +
VTVIIE +
TTVIIE +
SYVIIE +
SVVIIE +
STVYIE +
STVWIE +
STVTIE +
STVNIE +
STVLIE +
STVIYE +
STVIIY +
STVIIW +
STVIIV +
STVIIT +
STVIIS +
STVIIQ +
STVIIN +
STVIIM +
STVIIL +
STVIII +
STVIIF +
STVIIE +
STVIID +
STVIIA +
STVIFE +
STVFIE +
STVEIE +
STSIIE +
STQIIE +
STNIIE +
STLIIE +
STFIIE +
STEIIE +
SSVIIE +
SQVIIE +
SNVIIE +
SMVIIE +
SLVIIE +
SIVIIE +
SGVIIE +
SFVIIE +
SEVIIE +
SDVIIE +
SAVIIE +
QTVIIE +
NTVIIE +
MTVIIE +
LTVIIE +
ITVIIE +
GTVIIE +
FTVIIE +
ETVIIE +
DTVIIE +
ATVIIE +
NHVTLS +
LLYYTE +
KIVKWD +
KDWSFY +
FYLLYY +
VEALYL +
LYQLEN +
LVEALY +
NFGAIL +
FLVHSS +
VQIVYK +
 
SWVIIE -
STYIIE -
STVVIE -
STVSIE -
STVQIE -
STVPIE -
STVMIE -
STVIWE -
STVIVE -
STVITE -
STVISE -
STVIQE -
STVIPE -
STVINE -
STVIME -
STVILE -
STVIIP -
STVIGE -
STVIEE -
STVIDE -
STVIAE -
STVGIE -
STVDIE -
STVAIE -
STTIIE -
STPIIE -
STMIIE -
STIIIE -
STGIIE -
STDIIE -
STAIIE -
SPVIIE -
PTVIIE -
KTVLIE -
KTVIYE -
KTVIVE -
KTVIIT -
KTVIIE -
YYTEFT -
YVSGFH -
WSFYLL -
VYSRHP -
VTLSQP -
VKWDRD -
TEFTPT -
SRHPAE -
SGFHPS -
SDLSFS -
RVNHVT -
RTPKIQ -
QPKIVK -
PTEKDE -
PSDIEV -
PKIQVY -
NGKSNF -
NGERIE -
LSQPKI -
LSFSKD -
LKNGER -
KWDRDM -
KVEHSD -
KSNFLN -
IQVYSR -
IQRTPK -
IEKVEH -
HPAENG -
FTPTEK -
FSKDWS -
FHPSDI -
EVDLLK -
ERIEKV -
EKDEYA -
EHSDLS -
DLLKNG -
DIEVDL -
AENGKS -
YQLENY -
SLYQLE -
SHLVEA -
RGFFYT -
HLVEAL -
GSHLVE -
GFFYTP -
GERGFF -
FYTPKT -
FVNQHL -
FFYTPK -
ERGFFY -
EALYLV -
NLGPVL -
LIAGFN -
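
For readers who want to load these tables programmatically, the following is a minimal sketch. It assumes that every data row ends in a whitespace-separated "+" or "-" label preceded by an uppercase one-letter amino acid sequence, with an optional name in front (as in Table I). The helper name parse_rows and the tuple layout are illustrative assumptions and are not part of the original datasets.

```python
import re

# Assumed row layout: "[name] SEQUENCE label", where label is "+" or "-".
# The name is optional because Table II rows carry only sequence and label.
ROW = re.compile(r"^(?:(?P<name>\S+)\s+)?(?P<seq>[A-Z]+)\s+(?P<label>[+-])$")

def parse_rows(lines):
    """Return (name, sequence, is_amyloidogenic) tuples for data rows.

    Lines that do not match the assumed layout (table titles, citations,
    protein group headings, blank lines) are skipped.
    """
    records = []
    for line in lines:
        match = ROW.match(line.strip())
        if match is None:
            continue
        records.append(
            (match.group("name"), match.group("seq"), match.group("label") == "+")
        )
    return records

# A few rows copied from the tables above, plus a heading and a blank line.
sample = [
    "τ-protein",
    "K19 PGGGKVQIVYKPV +",
    "K19d PGGGKVYKPV -",
    "VQIVYK +",
    "",
]

for record in parse_rows(sample):
    print(record)
```

Rows that share a name or sequence but differ in label (for example the two A_helix entries) are returned as separate records, so any deduplication is left to the caller.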