The Challenges of Forensic Linguistics in the Age of AI

Authors

DOI:

https://doi.org/10.15381/lengsoc.v23i2.29462

Keywords:

Forensic Linguistics, Artificial Intelligence, authorship attribution, deepfakes, hybrid texts

Abstract

This article analyzes the challenges that forensic linguistics faces in the era of Artificial Intelligence. It describes the automatic generation of text and voice with AI and examines how this affects authorship attribution and the authentication of evidence in judicial contexts. It also reviews current detection tools and their limitations. The findings suggest that traditional methodologies may not suffice to detect AI-generated texts, which points to the need for new tools and interdisciplinary approaches. The research also underscores the importance of developing ethical and legal frameworks to regulate the use of AI in judicial processes.

Author Biography

  • Sheila Queralt, Laboratorio SQ-Lingüistas Forenses, Barcelona, Spain

    She holds a PhD in Translation and Language Sciences and master's degrees in Forensic Linguistics, Forensic Policing and Criminal Intelligence, Criminalistics, Handwriting Sciences, Theoretical and Applied Linguistics, and Statistics Applied to Research. She also holds bachelor's degrees in Linguistics and in Translation and Interpreting. She is the director of the Laboratorio SQ-Lingüistas Forenses, teaches at several universities, and collaborates as an expert linguist with national and international police forces on cases involving corruption, cybersecurity, drug trafficking, homicide, and terrorism, among others. She is the author of several books, notably Fundamentos de la Lingüística Forense (2019), Atrapados por la lengua (2020, 3rd ed. 2024), Estafas amorosas (2022), and Lingüistas de Hoy (2023).

Published

2024-12-30

Issue

Section

Dossier on artificial intelligence, language, and digital discourse

How to Cite

Queralt, S. (2024). The Challenges of Forensic Linguistics in the Age of AI. Lengua y Sociedad, 23(2), 1099-1116. https://doi.org/10.15381/lengsoc.v23i2.29462