Here are the resources I always turn to for data annotation and for getting the highest possible quality in annotated datasets:
📚 Books:
- Ide, N., & Pustejovsky, J. (2017). Handbook of Linguistic Annotation. Springer. https://www.google.es/books/edition/Handbook_of_Linguistic_Annotation/F3koDwAAQBAJ?hl=en&gbpv=1&dq=handbook+of+linguistic+annotation&printsec=frontcover
- Pustejovsky, J., & Stubbs, A. (2012). Natural Language Annotation for Machine Learning: A guide to corpus-building for applications. O’Reilly Media, Inc. https://www.google.es/books/edition/Natural_Language_Annotation_for_Machine_/QtzmqamXxx4C?hl=en&gbpv=1
📄 Papers & white papers:
- Aldama, N., et al. Anotación de corpus lingüísticos: metodología utilizada en el Instituto de Ingeniería del Conocimiento (IIC). https://www.iic.uam.es/pdf/anotacion-corpus-linguisticos.pdf
- Klie, J.-C., Eckart de Castilho, R., & Gurevych, I. (2024). Analyzing Dataset Annotation Quality Management in the Wild. Computational Linguistics, 50(3), 817-866. https://doi.org/10.1162/coli_a_00516
- Tseng, T., et al. (2020). Best Practices for Managing Annotation Projects. Bloomberg. https://assets.bbhub.io/company/sites/40/2020/09/Annotation-Best-Practices-091020-FINAL.pdf