Who created layoutlmv3-base?

layoutlmv3-base was published by Microsoft on Hugging Face.

layoutlmv3-base

Name: layoutlmv3-base
Author: Microsoft

layoutlmv3-base is Microsoft's 125-million-parameter multimodal model for document understanding, combining text, layout, and image signals.

Model Description

Microsoft Document AI | GitHub

LayoutLMv3 is a pre-trained multimodal Transformer for Document AI with unified text and image masking. The simple unified architecture and training objectives make LayoutLMv3 a general-purpose pre-trained model. For example, LayoutLMv3 can be fine-tuned for both text-centric tasks, including form understanding, receipt understanding, and document visual question answering, and image-centric tasks such as document image classification and document layout analysis.

LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei, ACM Multimedia 2022.
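The layout signal mentioned above is usually supplied as one bounding box per word. The sketch below shows one way to prepare those boxes, assuming the 0-1000 coordinate convention that the transformers documentation describes for LayoutLM-family models; this page does not state that convention. The helper name `normalize_box` and the sample words and page size are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def normalize_box(
    box: Sequence[float],
    page_width: float,
    page_height: float,
    scale: int = 1000,
) -> List[int]:
    """Rescale a pixel box (x0, y0, x1, y1) to integers in [0, scale].

    Assumption: the model expects boxes on a 0-1000 grid, independent
    of the page's pixel size. This helper is illustrative only.
    """
    if page_width <= 0 or page_height <= 0:
        raise ValueError("page dimensions must be positive")

    x0, y0, x1, y1 = box

    def clamp(value: int) -> int:
        # Keep coordinates inside the grid even if OCR overshoots the page.
        return max(0, min(scale, value))

    return [
        clamp(int(scale * x0 / page_width)),
        clamp(int(scale * y0 / page_height)),
        clamp(int(scale * x1 / page_width)),
        clamp(int(scale * y1 / page_height)),
    ]


# Hypothetical OCR output for a page that is 800 x 1000 pixels.
words = ["Invoice", "Total:"]
pixel_boxes = [(40, 50, 200, 100), (40, 900, 120, 940)]
page_width, page_height = 800, 1000

boxes = [normalize_box(b, page_width, page_height) for b in pixel_boxes]
for word, box in zip(words, boxes):
    print(word, box)
```

The resulting `words` and `boxes` lists, together with the page image, are the kind of input a text-centric fine-tuning pipeline for this model would typically consume. Check the transformers documentation for the exact processor arguments before relying on this.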

Citation

If you find LayoutLM useful in your research, please cite the following paper:

@inproceedings{huang2022layoutlmv3,
  author={Yupan Huang and Tengchao Lv and Lei Cui and Yutong Lu and Furu Wei},
  title={LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  year={2022}
}

License

The content of this project itself is licensed under the Attribution-NonCommercial-ShareAlike 4.0 International license (CC BY-NC-SA 4.0). Portions of the source code are based on the transformers project.

Microsoft Open Source Code of Conduct

Author

Microsoft

Organization · ✓

microsoft

Details

Downloads: 1.7M

Likes: 500

Access: Open source

Parameters: 125M

License: cc-by-nc-sa-4.0

Library: transformers

Created: 18 Apr 2022

Updated: 10 Apr 2024
