A

longformer-base-4096

Otherby Ai2·Model page

Ai2's Longformer base model that extends transformer attention to handle documents up to 4,096 tokens long for NLP tasks.

Share:

Model Card

Longformer is a transformer model for long documents.

longformer-base-4096 is a BERT-like model started from the RoBERTa checkpoint and pretrained for MLM on long documents. It supports sequences of length up to 4,096.

Longformer uses a combination of a sliding window (local) attention and global attention. Global attention is user-configured based on the task to allow the model to learn task-specific representations. Please refer to the examples in modeling_longformer.py and the paper for more details on how to set global attention.

Citing

If you use Longformer in your research, please cite Longformer: The Long-Document Transformer.

@article{Beltagy2020Longformer,
  title={Longformer: The Long-Document Transformer},
  author={Iz Beltagy and Matthew E. Peters and Arman Cohan},
  journal={arXiv:2004.05150},
  year={2020},
}

Longformer is an open-source project developed by the Allen Institute for Artificial Intelligence (AI2). AI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering.

Author
A
Ai2
Organization · ✓
allenai
Details
Downloads1.3M
Likes228
AccessOpen Source
Licenseapache-2.0
Librarytransformers
CreatedMar 2, 2022
UpdatedApr 5, 2023
View on Hugging Face
Languages
en
Get the full context.

Sign up to read complete case studies, access detailed metrics, and unlock all use cases.

longformer-base-4096 — AI Model Details | Applied