(sorted by number of parameters, largest first)
Name | Params | Paper | Code | Notes |
---|---|---|---|---|
xTrimoPGLM | 100B | bioRxiv | Not available | |
ESM2 | 8M - 15B | bioRxiv | Code | |
ProGen2 | 151M - 6.4B | arXiv | Code | |
ProtTrans | 420M - 3B | Paper | Code | BFD+UniRef50 |
ProteinLM | 200M, 3B | arXiv | Code | |
RITA | 85M - 1.2B | arXiv | Code | |
ProGen1 | 1.2B | bioRxiv | Code | |
Ankh | 450M, 1.15B | arXiv | Code | |
ProtGPT2 | 738M | Paper | Code | |
Tranception | 700M | Paper | Code | |
ESM1 | 43M - 670M | Paper | Code | |
PoET | 57M - 604M | arXiv | Not available | Only available through OpenProtein.AI web app |
DistilProtBert | 230M | bioRxiv | Code | |
DARK | 128M | bioRxiv | | |
PRoBERTa | 44M | Paper | Code | |
TAPE | 38M | arXiv | Code | |
ProteinBERT | 16M | Paper | Code, PyTorch | ~106M proteins from UniRef90; 28 days over ~670M records (i.e. ~6.4 iterations) |
AminoBERT | | bioRxiv | Code | |

Name | Params | Paper | Code | Notes |
---|---|---|---|---|
PeTriBERT | 40M | bioRxiv | Not available | Optimized for protein design |

Name | Params | Paper | Code | Notes |
---|---|---|---|---|
CARP | 600K - 640M | bioRxiv | Code | CNN |
SeqVec | 93M | Paper | Code | bidirectional LSTM; UniRef50 |
UniRep | 90M | Paper | Code | mLSTM |
ProSE | 24M | Paper | Code | LSTM |

Name | Params | Paper | Code | Notes |
---|---|---|---|---|
TCR-BERT | 100M | bioRxiv | Code | |
AntiBERTa | 86M | Paper | Code | |
AntiBERTy | 26M | arXiv | Code | |
IgLM | 1.5M, 13M | bioRxiv | Code | |
Sapiens | 0.6M | Paper | Code | |
AbLang | | Paper | Code | |

Name | Params | Paper | Code | Notes |
---|---|---|---|---|
GenSLM | 25M - 25B | bioRxiv | Code | |
Nucleotide Transformer | 500M - 2.5B | bioRxiv | Code | |
GENA-LM | 110M - 336M | bioRxiv | Code | Inputs up to 36,000 base pairs |