| Table I. The dataset of fragments of proteins: Fernandez-Escamilla,A.M. et al. (2004) Prediction of sequence-dependent and mutational effects on the aggregation of peptides and proteins. Nat. Biotechnol., 22, 1302-1306. | ||
| Name | Sequence | Experimental amyloidogenicity |
| τ-protein | ||
| K19 | PGGGKVQIVYKPV | + |
| K19d | PGGGKVYKPV | - |
| Mut1 | PGGGKNAEVYKPV | - |
| Mut2 | PGGGKVQIVEKPV | - |
| K19Chym | QTAPVPMPDLKNVKSKIGSTENLKHQPGGGKVQIVY | - |
| K19Chym1 | KPVDLSKVTSKCGSLGNIHHKPGGGQVEVKSEKLDF | - |
| K19Chym2 | KDRVQSKIGSLDNITHVPGGGN | - |
| K19Gluc4 | QTAPVPMPDLKNVKSKIGSTE | - |
| K19Gluc41 | NLKHQPGGGKVQIVYKPVDLSKVTSKCGSLGNIHHKPGGGQVE | + |
| K19Gluc42 | VKSE | - |
| K19Gluc43 | KLDFKDRVQSKIGSLDNITHVPGGGN | - |
| K19Gluc78 | QTAPVPMPD | - |
| K19Gluc781 | LKNVKSKIGSTE | - |
| K19Gluc782 | NLKHQPGGGKVQIVYKEVD | + |
| K19Gluc783 | LSKVTSKCGSLGNIHHKPGGGQVE | - |
| K19Gluc784 | VKSEKLDFKDRVQSKIGSLDNITHVPGGGN | - |
| PHF8 | GKVQIVYK | + |
| PHF6 | VQIVYK | + |
| V313-K321 | VDLSKVTSK | - |
| V318-G335 | VTSKCGSLGNIHHKPGGG | - |
| V335-E342 | GQVEVSKE | - |
| Amyloid beta Aβ peptide (1-40) | ||
| Whole | DAEFRHDSGYEVHHQKLVFFAEDVGSNKGAIIGLMVGGVV | + |
| HABP1 | VPHQKLVFFAEDVGS | + |
| HABP2 | VHPQKLVFFAEDVGS | + |
| HABP3 | VHHPKLVFFAEDVGS | + |
| HABP4 | VHHQPLVFFAEDVGS | + |
| HABP5 | KKPVFFAED | - |
| HABP6 | KKLPFFAED | - |
| HABP7 | KKLVPFAED | - |
| HABP8 | VHHQKLVPFAEDVGS | - |
| HABP9 | KKLVFPAED | - |
| HABP10 | KKLVFFPED | + |
| HABP11 | VHHQEKLVFFAPDVGS | - |
| HABP12 | VHHQEKLVFFAEPVGS | + |
| HABP13 | VHHQEKLVFFAEDPGS | + |
| HABP14 | VHHQEKLVFFAEDVPS | + |
| HABP15 | KKLVFFAED | + |
| HABP16 | VHHQKLVFFAEDVGS | + |
| AB1 | KLVFF | - |
| AB2 | QKLVFFA | - |
| AB3 | HQKLVFFAE | - |
| AB4 | HHQKLVFFAED | + |
| AB5 | VHHQKLVFFAEDV | + |
| AB6 | EVHHQKLVFFAEDVG | + |
| AB7 | YEVHHQKLVFFAEDVGS | + |
| AB8 | GYEVHHQKLVFFAEDVGSN | + |
| AB9 | SGYEVHHQKLVFFAEDVGSNK | + |
| AB10 | DSGYEVHHQKLVFFAEDVGSNKG | + |
| AB11 | HDSGYEVHHQKLVFFAEDVGSNKGA | + |
| Alpha synuclein | ||
| NAC1-18 | EQVTNVGGAVVTGVTAVA | + |
| NAC1-18s | TVNGVGEVTATAVQGVAV | + |
| NAC3-18 | VTNVGGAVVTGVTAVA | + |
| NAC1-13 | EQVTNVGGAVVTG | + |
| NAC6-14 | VGGAVVTGV | + |
| Acylphosphatase | ||
| 1-17 | STAQSLKSVDYEVFGRV | - |
| 18-33 | QGVSFRMYTEDEARKI | - |
| 34-53 | GVVGWVKNTSKGTVTGQVQG | + |
| 54-68 | PEDKVNSMKSWLSKV | - |
| 69-85 | GSPSSRIDRTNFSNEKT | - |
| 86-98 | ISKLEYSNFSVRY | + |
| β2-microglobulin | ||
| A | IQRTPKIQVYSRHPAE | - |
| B | NGKSNFLNCYVSG | - |
| C | FHPSDIEVDLLK | - |
| D | NGERIEKVEHSDLSFSKD | - |
| E | DWSFYLLYYTEFT | + |
| E1 | DWSFYLLYYTEFTPTGKDEYA | + |
| F | PTGKDEYACRVNHVT | - |
| G | LSQPKIVKWDRDM | - |
| 434 Cro repressor | ||
| 1Cro | MQTLSERLKKRRIALKY | - |
| 2Cro | YKMTQTELATKAGVK | - |
| 3Cro | YKQQSIQLIEAGVTKR | - |
| 4Cro | TKRPRFLYEIAMALNSD | + |
| 5Cro | AMALNCDPVWLQYGTKRGKA | - |
| Sperm whale myoglobin | ||
| A-Helix | VLSEGEWQLVLHVWAKVEA | + |
| AB-Domain | EGEWQLVLHVWAKVEADVAGHGQDILIRLFK | + |
| B-Helix | DVAGHGQDILIRLFKS | + |
| BC-Turn | KSHPET | - |
| CCD-Domai | HPETLEKFDRFKHLK | - |
| D-Helix | TEAEMKA | - |
| E-Helix | SEDLKKHGVTVLTALGAILK | - |
| EF-Turn | KKGHHEAE | - |
| F-helix | ELKPLAQSHA | - |
| FG-Turn | ATKHKIP | - |
| Myohemerythrin | ||
| N-terminal | GWEIPEPYVWDESFRVFY | - |
| C-terminal | GTDFKYKGKL | - |
| A_helix | YEQLDEEHKKIFKGIFDCIRD | - |
| A_helix | YEQLDEEHKKIFKGIFDCIRD | + |
| AB_loop | RDNSA | - |
| B_helix | SAPNLATLVKVTTNHFTHEEAMMD | + |
| BC_loop | DAAKYSEV | - |
| C_Helix | EVVPHKKMHKDFLEKIGGL | + |
| CD_loop | GLSAPVD | - |
| D_helix | AKNVDYCKEWLVNHIK | - |
| D_helix | AKNVDYCKEWLVNHIK | - |
| French bean plastocyanin | ||
| Pc-1 | LEVLLGSG | - |
| Pc-2 | LEVLLGSGDGSLVFV | + |
| Pc-2a | SGDGSL | - |
| Pc-3 | SLVFVPSEFS | - |
| Pc-4 | SEFSV | - |
| Pc-5 | SEFSVPSGEK | - |
| Pc-6 | KIVFKNNA | - |
| Pc-6a | GEKIVFKNNAGFPHNVVFDE | + |
| Pc-7 | KIVFKNNAGFPH | - |
| Pc-8 | KNNAGFPHNV | - |
| Pc-9 | PHNVVFDEDDEIP | - |
| Pc-10 | IPAGVDAVKISM | + |
| Pc-10a | EIPAGV | - |
| Pc-10b | DAVKIS | - |
| Pc-11 | MPEEELL | - |
| Pc-12 | MPEEELLNAPGETYVVTL | + |
| Pc-13 | ELLNAPGETY | - |
| Pc-13a | NAPGETY | - |
| Pc-13b | APGET | - |
| Pc-14 | GETYVVTL | + |
| Pc-14a | ETYVVT | - |
| Pc-15 | VTLDTKGTY | - |
| Pc-16 | GTYSFYT | + |
| Pc-16a | TYSFYC | - |
| Pc-17 | YTSPHQGAGMV | - |
| Pc-18 | MVGKVTVN | - |
| Pc-19 | GTVSFVTSPHQGAGMVGKVTVN | + |
| Bovine pancreatic trypsin inhibitor (BPTI) | ||
| P1-15 | RPDFSLEPPYTGPSK | - |
| P29-44 | LSQTFVYGGSRAKRNN | + |
| P13-21 | PSKARIIRY | - |
| P41-51 | KRNNFKSAEDS | - |
| P16-28 | ARIIRYFYNAKAG | - |
| P45-58 | FKSAEDSMRTSGGA | - |
| P24-32 | NAKAGLSQT | - |
| N-terminal domain of ribosomal protein L9 | ||
| Beta_1 | MKVIFLKDVKG | + |
| Beta_2 | KGKKGEIKNVAD | - |
| Alpha_1 | GYANNFLFKQG | + |
| Beta_3 | LAIEATPA | - |
| Alpha_2 | TPANLKALEAQKQKEQR | - |
| Glutathione S-transferase P domain II (Glutex) | ||
| Alpha_4 | DQKEAALVDMVNDGVEDLRCKYATLIYT | - |
| Alpha_5 | YEAGKEKYVKELPEHLKPFETLLSQ | - |
| Alpha_6 | QISFADYNLLDLLRIHQVLN | + |
| Alpha_7 | PLLSAYVARLSA | - |
| Alpha_8 | PKIKAFLA | - |
| Spectrin SH3 | ||
| M_2 | AYVKKLDSGTGKELVLAL | - |
| M_4 | YDYQEKSPREVTMKKGD | - |
| M_8 | DILTLLNSTNKDWWKVEVND | + |
| M_C | GGKDWWKVGG | - |
| M_6 | DWWKVEVNDRQGFVPA | + |
| M_68 | DILTLLNSTNKDWWKVEVNDRQGFVPA | + |
| M_681 | DILTLLNSTNKDWWKVEVNDRQGFVPA | - |
| Ada-2h | ||
| H1_Wt | VPSNEEQIKNLLQLEAQEHLQY | - |
| H1_Mt | VPSNEEQIKKLLELEAKKHLQY | - |
| H2_WT | FVNVQAVKVFLESQGIAY | + |
| H2_Mt | FVNVEAVKAFLEAHGIAY | + |
| Ara | ||
| Ara1 | AVGKSNLLSRYARNEFSA | - |
| Ara2 | RFRAVTSAYYRGAVG | - |
| Ara3 | TRRTTFESVGRWLDELKIHSD | - |
| Ara4 | AVSVEEGKALAEEEGLF | - |
| Ara5 | STNVKTAFEMVILDIYNNV | + |
| Com-A | ||
| ComA1 | DHPAVMEGTKTILETDSNLS | - |
| ComA2 | EPSEQFIKQHDFSSY | - |
| ComA3 | VNGMELSKQILQENPH | - |
| ComA4 | EVEDYFEEAIRAGLH | - |
| ComA5 | TESKEKITQYIYHVLNGEIL | + |
| CheY | ||
| Che_Y1 | DFSTMRRIVRNLLKELGYN | - |
| Che_Y2 | EDGVDALNKLQAGGY | - |
| Che_Y3 | MDGLELLKTIRADSAY | - |
| Che_Y4 | AKKENIIAAAQAGASGY | + |
| Che_Y5 | PFTAATLEEKLNKIFEKLGMY | + |
| Flavodoxin | ||
| FXN1 | GTGNTEKMAELIAKGIIESGKDY | - |
| FXN3 | EESEFEPFIEEISTKISY | - |
| FXN4 | GDGKWMRDFEQRMNGYGSV | - |
| FXN5 | EPDEAEQDSIEFGKKIANIY | - |
| P21-ras | ||
| P21A | GVGKSALTIQLIQNHFVY | + |
| P21B | EYSAMRDQYMRTGEG | - |
| P21C | INNTKSFEDIHQYREQIKRVKDS | - |
| P21D | ARTVESRQAQDLARSYGIP | - |
| P21E | RQGVEDAFYTLVREIRQHK | + |
| PL B1 protein | ||
| PL_B1_95-114_pH_4.1 | VTIKANLIFANGFTQTAEFKG | + |
| PL_B1_114-138_pH_2.4 | KGTFEKATSEAYAYADTLKKDNGEY | + |
| PL_B1_136-155D_pH_6.1 | GEYTVDVADKGYTLNIKFAGD | + |
| Protein G | ||
| ProteinG2-19 | TYKLINGKTLKGETTTEA | - |
| ProteinG21-40 | GDAATAEKVFKQYANDNGVD | - |
| ProteinG41-56 | GEWTYDDATKTFTVTE | + |
| Human prion protein | ||
| 1-10 | QGGGTHSQWN | - |
| 6-15 | HSQWNKPSKP | - |
| 11-20 | KPSKPKTNMK | - |
| 16-25 | KTNMKHMAGA | - |
| 21-30 | HMAGAAAAGA | - |
| 26-35 | AAAGAVVGGL | - |
| 31-40 | VVGGLGGYML | - |
| 36-45 | GGYMLGSAMS | - |
| 41-50 | GSAMSRPIIH | - |
| 51-60 | FGSDYEDRYY | - |
| 56-65 | EDRYYRENMH | - |
| 61-70 | RENMHRYPNQ | - |
| 66-75 | RYPNQVYYRP | - |
| 71-80 | VYYRPMDEYS | - |
| 76-85 | MDEYSNQNNF | - |
| 81-90 | NQNNFVHDCV | - |
| 86-95 | VHDCVNITIK | + |
| 91-100 | NITIKQHTVT | - |
| 96-105 | QHTVTTTTKG | - |
| 101-110 | TTTKGENFTE | - |
| 106-115 | ENFTETDVKM | - |
| 111-120 | TDVKMMERVV | - |
| 116-125 | MERVVEQMCI | - |
| 121-130 | EQMCITQYER | - |
| 126-135 | TQYERESQAY | - |
| 131-140 | ESQAYYQRGS | - |
| 136-145 | YQRGSSMVLF | - |
| 141-150 | SMVLFSSPPV | + |
| 146-155 | SSPPVILLIS | + |
| 151-160 | ILLISFLIFL | + |
| 156-163 | FLIFLIVG | + |
| Human lysozyme | ||
| 5-14 | RCELARTLKR | + |
| 10-19 | RTLKRLGMDG | - |
| 15-24 | LGMDGYRGIS | - |
| 20-29 | YRGISLANWM | - |
| 25-34 | LANWMCLAKW | + |
| 30-39 | CLAKWESGYN | - |
| 35-44 | ESGYNTRATN | - |
| 40-49 | TRATNYNAGD | - |
| 45-54 | YNAGDRSTDY | - |
| 50-59 | RSTDYGIFQI | - |
| 55-64 | GIFQINSRYW | - |
| 60-69 | NSRYWCNDGK | - |
| 65-74 | CNDGKTPGAV | - |
| 70-79 | TPGAVNACHL | - |
| 75-84 | NACHLSCSAL | - |
| 85-94 | LQDNIADAVA | - |
| 90-99 | ADAVACAKRV | - |
| 95-104 | CAKRVVRDPQ | - |
| 100-109 | VRDPQGIRAW | - |
| 105-114 | GIRAWVAWRN | - |
| 110-119 | VAWRNRCQNR | - |
| 115-124 | RCQNRDVRQY | - |
| 120-129 | DVRQYVQGCG | - |
| β2-microglobulin | ||
| 4-13 | RTPKIQVYSR | - |
| 9-18 | QVYSRHPAEN | - |
| 14-23 | HPAENGKSNF | - |
| 24-33 | LNCYVSGFHP | - |
| 29-38 | SGFHPSDIEV | - |
| 34-43 | SDIEVDLLKN | - |
| 39-48 | DLLKNGERIE | - |
| 44-53 | GERIEKVEHS | - |
| 49-58 | KVEHSDLSFS | - |
| 54-63 | DLSFSKDWSF | + |
| 59-68 | KDWSFYLLYY | + |
| 64-73 | YLLYYTEFTP | + |
| 69-78 | TEFTPTEKDE | + |
| 74-83 | TEKDEYACRV | - |
| 79-88 | YACRVNHVTL | - |
| 84-93 | NHVTLSQPKI | - |
| Table II. The dataset of fibril-forming and nonfibril-forming peptides (AmylHex): Thompson,M.J. et al. (2006) The 3D profile method for identifying fibril-forming segments of proteins. Proc. Natl. Acad. Sci. USA, 103, 4074-4078. | |
| Sequence | Experimental amyloidogenicity |
| YTVIIE | + |
| WTVIIE | + |
| VTVIIE | + |
| TTVIIE | + |
| SYVIIE | + |
| SVVIIE | + |
| STVYIE | + |
| STVWIE | + |
| STVTIE | + |
| STVNIE | + |
| STVLIE | + |
| STVIYE | + |
| STVIIY | + |
| STVIIW | + |
| STVIIV | + |
| STVIIT | + |
| STVIIS | + |
| STVIIQ | + |
| STVIIN | + |
| STVIIM | + |
| STVIIL | + |
| STVIII | + |
| STVIIF | + |
| STVIIE | + |
| STVIID | + |
| STVIIA | + |
| STVIFE | + |
| STVFIE | + |
| STVEIE | + |
| STSIIE | + |
| STQIIE | + |
| STNIIE | + |
| STLIIE | + |
| STFIIE | + |
| STEIIE | + |
| SSVIIE | + |
| SQVIIE | + |
| SNVIIE | + |
| SMVIIE | + |
| SLVIIE | + |
| SIVIIE | + |
| SGVIIE | + |
| SFVIIE | + |
| SEVIIE | + |
| SDVIIE | + |
| SAVIIE | + |
| QTVIIE | + |
| NTVIIE | + |
| MTVIIE | + |
| LTVIIE | + |
| ITVIIE | + |
| GTVIIE | + |
| FTVIIE | + |
| ETVIIE | + |
| DTVIIE | + |
| ATVIIE | + |
| NHVTLS | + |
| LLYYTE | + |
| KIVKWD | + |
| KDWSFY | + |
| FYLLYY | + |
| VEALYL | + |
| LYQLEN | + |
| LVEALY | + |
| NFGAIL | + |
| FLVHSS | + |
| VQIVYK | + |
| SWVIIE | - |
| STYIIE | - |
| STVVIE | - |
| STVSIE | - |
| STVQIE | - |
| STVPIE | - |
| STVMIE | - |
| STVIWE | - |
| STVIVE | - |
| STVITE | - |
| STVISE | - |
| STVIQE | - |
| STVIPE | - |
| STVINE | - |
| STVIME | - |
| STVILE | - |
| STVIIP | - |
| STVIGE | - |
| STVIEE | - |
| STVIDE | - |
| STVIAE | - |
| STVGIE | - |
| STVDIE | - |
| STVAIE | - |
| STTIIE | - |
| STPIIE | - |
| STMIIE | - |
| STIIIE | - |
| STGIIE | - |
| STDIIE | - |
| STAIIE | - |
| SPVIIE | - |
| PTVIIE | - |
| KTVLIE | - |
| KTVIYE | - |
| KTVIVE | - |
| KTVIIT | - |
| KTVIIE | - |
| YYTEFT | - |
| YVSGFH | - |
| WSFYLL | - |
| VYSRHP | - |
| VTLSQP | - |
| VKWDRD | - |
| TEFTPT | - |
| SRHPAE | - |
| SGFHPS | - |
| SDLSFS | - |
| RVNHVT | - |
| RTPKIQ | - |
| QPKIVK | - |
| PTEKDE | - |
| PSDIEV | - |
| PKIQVY | - |
| NGKSNF | - |
| NGERIE | - |
| LSQPKI | - |
| LSFSKD | - |
| LKNGER | - |
| KWDRDM | - |
| KVEHSD | - |
| KSNFLN | - |
| IQVYSR | - |
| IQRTPK | - |
| IEKVEH | - |
| HPAENG | - |
| FTPTEK | - |
| FSKDWS | - |
| FHPSDI | - |
| EVDLLK | - |
| ERIEKV | - |
| EKDEYA | - |
| EHSDLS | - |
| DLLKNG | - |
| DIEVDL | - |
| AENGKS | - |
| YQLENY | - |
| SLYQLE | - |
| SHLVEA | - |
| RGFFYT | - |
| HLVEAL | - |
| GSHLVE | - |
| GFFYTP | - |
| GERGFF | - |
| FYTPKT | - |
| FVNQHL | - |
| FFYTPK | - |
| ERGFFY | - |
| EALYLV | - |
| NLGPVL | - |
| LIAGFN | - |
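
The rows of Tables I and II share a simple pipe-delimited layout, so they can be loaded into labeled records with a few lines of code. The following is a minimal sketch, not part of the original datasets: the function name `parse_rows` and the tuple layout `(name, sequence, is_amyloidogenic)` are assumptions made for illustration. Caption, header, and protein-name rows are skipped because their last cell is not `+` or `-`.

```python
from typing import List, Optional, Tuple

# (name, sequence, is_amyloidogenic); name is None for two-column rows (Table II)
Row = Tuple[Optional[str], str, bool]


def parse_rows(lines: List[str]) -> List[Row]:
    """Parse pipe-delimited table rows into labeled peptide records."""
    rows: List[Row] = []
    for line in lines:
        # Split on pipes and drop the empty cells left by leading/trailing pipes.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        cells = [c for c in cells if c]
        # Keep only data rows: the last cell must be the +/- label.
        if len(cells) < 2 or cells[-1] not in ("+", "-"):
            continue
        name = cells[0] if len(cells) == 3 else None
        rows.append((name, cells[-2], cells[-1] == "+"))
    return rows


# A few rows copied from the tables above, including a header and a section row.
sample = [
    "| Name | Sequence | Experimental amyloidogenicity |",
    "| τ-protein | ||",
    "| PHF6 | VQIVYK | + |",
    "| K19d | PGGGKVYKPV | - |",
    "| STVIIE | + |",
]
parsed = parse_rows(sample)
```

Keeping the name optional lets the same function handle both the three-column rows of Table I and the two-column rows of Table II.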