The Challenges of Forensic Linguistics in the Age of AI
DOI:
https://doi.org/10.15381/lengsoc.v23i2.29462
Keywords:
Forensic Linguistics, Artificial Intelligence, authorship attribution, deepfakes, hybrid texts
Abstract
This article analyzes the challenges that forensic linguistics faces in the era of Artificial Intelligence. It describes the automatic generation of text and voice using AI and examines how this affects authorship attribution and the authentication of evidence in judicial contexts. It also reviews current detection tools and their limitations. The findings suggest that traditional methodologies may not suffice to detect AI-generated texts, which points to the need for new tools and interdisciplinary approaches. The research also underscores the importance of developing ethical and legal frameworks to regulate the use of AI in judicial processes.
License
Copyright (c) 2024 Sheila Queralt

This work is licensed under a Creative Commons Attribution 4.0 International License.
AUTHORS RETAIN THEIR RIGHTS
a. Authors retain their trademark and patent rights, as well as their rights to any process or procedure described in the article.
b. Authors may submit to the journal Lengua y Sociedad papers previously disseminated as preprints in repositories. This should be stated in the cover letter.
c. Authors retain their right to share, copy, distribute, perform, and publicly communicate their article (e.g., to place it in an institutional repository or publish it in a book), with an acknowledgment of its initial publication in the journal Lengua y Sociedad.
d. Authors retain their right to publish their work subsequently and to use the article or any part of it (e.g., in a compilation of their papers, lecture notes, a thesis, or a book), always indicating its initial publication in the journal Lengua y Sociedad (the originator of the work, journal, volume, number, and date).