Information Technology and Telecommunications

Development of a ‘smart’ system for fake news detection

A research team from the University of Jaén and the University of Alicante has tested an artificial intelligence model that assesses the truthfulness of information in digital media. With this tool, both journalists and end users will be able to gauge the credibility of a text.

Jaén | 05/11/2021

A research team from the universities of Jaén and Alicante has created an application that automatically analyses news stories and determines their truthfulness with a high degree of accuracy. Although the model is still in the testing phase, it is proposed as a useful tool for filtering the vast amount of information that reaches journalists and private readers every day.

To identify fake news, the scientists have developed machine learning models that infer the credibility of a text and flag false news and misinformation. These artificial intelligence techniques allow the system to analyse a news item on two levels: whether the content contains inconsistencies, and whether the structure matches what any publication with journalistic rigour should have.


In the journal Expert Systems with Applications, the researchers have published an article entitled ‘Exploiting discourse structure of traditional digital media to enhance automatic fake news detection’ in which they present the prototype of a ‘fake news’ detector for websites. This tool aims to offer greater confidence to the reader and to provide journalists with new tools that allow them to distinguish between different pieces of information.

The system analyses the structure of the published news item taking into account traditional journalistic standards: the 5W+H rule and the inverted pyramid. These standards hold that a rigorous news story should answer the six basic questions (what, when, where, who, why and how) and should present its information in descending order of importance, from the most important facts to the finer details. “The structure of a publication gives clues as to whether it has a journalistic basis or whether it is instead simulating a real news story,” Miguel Ángel García, researcher at the University of Jaén and one of the authors of the article, told Fundación Descubre.
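The two standards can be illustrated with a small sketch. It is not the authors' model, only a hand-written check under stated assumptions: the `Element` structure, the function names and the choice of two sentences as the "lead" are all hypothetical, and the 5W+H elements are assumed to have been annotated already.

```python
from dataclasses import dataclass

# The six basic questions of the 5W+H rule.
FIVE_W_ONE_H = ("what", "when", "where", "who", "why", "how")


@dataclass
class Element:
    """One annotated 5W+H element (hypothetical structure for illustration)."""
    question: str        # one of FIVE_W_ONE_H
    sentence_index: int  # 0-based index of the sentence that answers it


def missing_questions(elements):
    """Return the basic questions that no annotated element answers."""
    answered = {e.question for e in elements}
    return [q for q in FIVE_W_ONE_H if q not in answered]


def lead_coverage(elements, lead_sentences=2):
    """Share of the six questions answered within the opening sentences.

    A rough proxy for the inverted pyramid: the more questions the lead
    answers, the more front-loaded the story is. The default of two
    sentences is an arbitrary assumption.
    """
    in_lead = {
        e.question
        for e in elements
        if e.question in FIVE_W_ONE_H and e.sentence_index < lead_sentences
    }
    return len(in_lead) / len(FIVE_W_ONE_H)


# Invented example: four questions answered, three of them in the lead.
story = [
    Element("what", 0),
    Element("who", 0),
    Element("when", 1),
    Element("where", 3),
]
print(missing_questions(story))  # ['why', 'how']
print(lead_coverage(story))      # 0.5
```

A story that leaves several questions unanswered, or that answers them only deep in the text, would score poorly on both checks. The system described in the article learns such signals from annotated data rather than applying fixed rules like these.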


Using natural language analysis, the experts have developed an algorithm that detects information that does not match this structure. The calculations rely on machine learning techniques, whereby the system ‘learns’ as it accumulates more data.

In addition, the machine can process thousands of data items in seconds, something a person could not do. “Thus, journalists can compare and contrast sources and detect incorrect structures, viral content or inconsistencies between the headline and the body of the news immediately and automatically. The end user will also have clues as to whether the news they read meets certain standards or not,” adds Estela Saquete, researcher at the University of Alicante and another author of the article.


Analyse, detect and highlight to alert

The ‘Intelligent Information Access Systems’ (SINAI) group of the University of Jaén and the ‘Natural Language Processing and Information Systems Group’ (GPLSI) of the University of Alicante carried out the tests with a Spanish-language dataset of more than 200 articles on health issues, a topic of special relevance given the large amount of fake news circulating about COVID-19.

The system is based on deep learning, which builds computational models composed of multiple layers of data processing. In this particular work, the experts define two layers: on the one hand, the structure of the news item, and on the other, the storyline. In this way, the machine predicts the credibility not only of the form but also of the content.

In addition, the researchers have applied a new scheme for data processing, known as fine-grained annotation, which consists of assigning labels to news items. The tag sets cover every possibility, even where the differences between categories are small. This provides a detailed description of each text at both levels.

Each label carries a set of attributes that go beyond linguistic aspects to include fact verification, semantic relationships between elements and contextual features. The attributes even cover the emotional charge that a piece of writing may contain, which distances it from the objectivity that should characterise a real news story.
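One way to picture such a labelling scheme is as a record per annotated fragment. The sketch below is only an illustration: the `Annotation` fields, the tag names and the attribute keys are invented here and are not taken from the published scheme.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """One labelled text fragment (hypothetical schema, not the authors' own)."""
    level: str   # "structure" or "content", the two levels of analysis
    label: str   # tag name, e.g. "WHO"; the names used here are invented
    text: str    # the annotated fragment
    attributes: dict = field(default_factory=dict)  # extra information


def group_by_level(annotations):
    """Group annotations by the level of analysis they belong to."""
    grouped = {}
    for ann in annotations:
        grouped.setdefault(ann.level, []).append(ann)
    return grouped


def with_attribute(annotations, name, value):
    """Return the annotations whose attribute `name` equals `value`."""
    return [a for a in annotations if a.attributes.get(name) == value]


# Invented example data, for illustration only.
example = [
    Annotation("structure", "WHO", "a health agency", {"reliability": "unverified"}),
    Annotation("structure", "WHAT", "a new treatment", {"reliability": "verified"}),
    Annotation("content", "CLAIM", "cures everything", {"emotional_charge": "high"}),
]

for ann in with_attribute(example, "reliability", "unverified"):
    print(ann.label, "->", ann.text)  # WHO -> a health agency
```

Keeping the attributes in a free-form dictionary lets each label carry a different set of properties, at the cost of no validation of the keys. A real annotation scheme would normally fix the allowed labels and attribute values in advance.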

The experts’ objective is an application that automatically marks up the text of a news item while it is being read, highlighting fragments that may be false and pointing to similar texts against which their truthfulness can be verified.

This research has been developed through the project ‘LIVING-LANG: Modelling the behaviour of digital entities through Human Language Technologies’ of the Spanish Ministry of Science and Innovation, and the project ‘SIIA: Human Language Technologies for an inclusive, egalitarian and accessible society’ of the Comunidad Valenciana.

Spanish version: Desarrollan un sistema ‘inteligente’ que detecta noticias falsas

References

Alba Bonet Jovera, Alejandro Piad Morffis, Estela Saquete, Patricio Martínez Barco and Miguel Ángel García Cumbreras. ‘Exploiting discourse structure of traditional digital media to enhance automatic fake news detection’. Expert Systems with Applications. 2021.

More information:

#CienciaDirecta, an Andalusian science news agency funded by the Consejería de Transformación Económica, Industria, Conocimiento y Universidades of the Junta de Andalucía.

Telephone: 954 232 349

E-mail: comunicacion@fundaciondescubre.es

