My articles and publications: full text at http://ipn.academia.edu/AdolfoGuzman (you may be asked to sign up; it is free). My Web page: http://alum.mit.edu/www/aguzman.
SOME VIDEOS ABOUT WHAT I DO.
Lecture "Ciudad inteligente, con conectividad y tecnología" (Oct. 2010): part 1 (15 min), part 2 (8 min), part 3 (9 min), part 4 (2 min).
Interview by redCudiMéxico, 2012: here (11 min).
"Avances en Inteligencia Artificial", interview at Univ. IBERO, Puebla, 2013: click here (53 min).
Video in the series "Personalities in the history of ESIME" (for the 100-year anniversary of ESIME-IPN, in Spanish) about Adolfo Guzmán, 2014: click here (1 h).
Interview "La visión de los egresados del IPN", 80 years after the creation of the IPN and 100 years after the creation of ESIME, 2014: watch on YouTube (1 h).
Seminar on "Big Data" (data science), in Spanish, 2014: click here (56 min).
Seminar on "Big Data", in English, 2014: click here (56 min).
Some works on data mining and its applications (CIC-IPN, 2016): click here (5 min).
"El auge y el ocaso de las máquinas de Lisp" (The rise and decline of Lisp machines; talk at the 2016 Annual Meeting of the Academia Mexicana de Computación): click here (56 min).
Interview on the functionality and competitiveness of Hotware 10, 2016: here (6 min).
Adolfo Guzmán Arenas, electronics engineer and researcher at the Centro de Investigación en Computación of the IPN, talks about his career and the importance of applied science for the country's development. 2017, Canal 11, Noticias TV (30 min).
How the world's first parallel-processing Lisp computer was built. March 2018. https://www.youtube.com/watch?v=dzyZGDhxwrU (12 min).
Talk "Historias de éxito en la computación mexicana", Códice IA series. Interview with A. Guzmán, "Entre la vida y la academia": https://bit.ly/3sIOQBc (45 min).
The CIC turns 25. Click here (51 min; Adolfo speaks on "Pasado y futuro del CIC", minutes 13.57 to 22.70).
Profile on ResearchGate: Adolfo Guzman-Arenas. My URL in Google Scholar: http://scholar.google.com/citations?user=Nw5lSdEAAAAJ. My ORCID: 0000-0002-8236-0469. Scopus Author ID: 6602302516.


Antecumen. Prototipo de herramienta para el análisis con cubos en memoria principal (Antecumen: a prototype tool for analysis with cubes in main memory)

184. Gilberto Martinez-Luna, Adolfo Guzmán-Arenas (2008). Antecumen. Prototipo de herramienta para el análisis con cubos en memoria principal. Click here. This technical paper was published at the Conferencia Mundial sobre Tecnologías de la Información y Comunicaciones 2008.
(Technical paper.) It describes a tool called Antecumem, used to perform analyses on databases stored in main memory. The description covers a list of business questions and the database store. The store is a data structure of mutually linked arrays, called Arblis, so data need not be sought on disk, which reduces data-retrieval time. Arblis holds the database, modeled as a multi-dimensional database (data cubes). This model allows defining operations on the cubes, typically concerned with events through time, although they can act along any other dimension. One such operation on the data under analysis is the percentage increase from one period to another. Arblis can answer the list of business questions posed here.

Ph. D. Thesis. A complete description of Antecumem and Arblis, with accessory programs, examples, and theoretical analysis, is found in the Ph. D. thesis of Gilberto Martínez Luna; click here. Abstract: This work deals with the design of a data structure called Arblis, a persistent, read-only warehouse in main memory in which searches and operations for data analysis are carried out. The design of the data structure reduces the time to read the data by a factor of 10 to 50 with respect to reading them from disk, a desirable reduction in information systems for data analysis and decision support. The stored data are already validated, and there is no modification process.
The data structure consists of two arrays linked to each other. The data store is used by a software tool called Antecumem: the tool fills the Arblis data structure, captures the desired tasks, performs the corresponding data analyses, and returns the results.
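A minimal sketch of the two-linked-arrays idea: one sorted array of dimension keys linked by position to an array of facts, so range questions are answered entirely in main memory. The layout, names, and data below are illustrative assumptions, not the actual Arblis design from the thesis.

```python
# Illustrative sketch only: a read-only, main-memory store kept in two
# arrays linked by position, in the spirit of Arblis described above.
from bisect import bisect_left, bisect_right

class TwoArrayStore:
    def __init__(self, rows):
        # Array 1: sorted dimension keys, e.g. (product, month).
        # Array 2: facts, linked to array 1 by position.
        rows = sorted(rows)
        self.keys  = [k for k, _ in rows]
        self.facts = [v for _, v in rows]

    def range_scan(self, lo, hi):
        """All facts whose key falls in [lo, hi], with no disk access."""
        i = bisect_left(self.keys, lo)
        j = bisect_right(self.keys, hi)
        return self.facts[i:j]

store = TwoArrayStore([(("soap", 1), 120), (("soap", 2), 150),
                       (("soda", 1), 300)])
print(sum(store.range_scan(("soap", 1), ("soap", 12))))  # 270
```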
The data-analysis tasks correspond to business questions over ranges of values in variables of special interest. Answering these questions generally requires merging thousands, and sometimes millions, of records. Moreover, the processes are expensive both in access to the data and in the number of operations between records.
The data ranges over the variables lead the tool to adopt the multi-dimensional model, the data cube, as its unit of work. This unit allows defining operations (union, intersection, and difference) on the results of analyzing the data cubes. These are generally of interest for events through time, but they can act along any dimension. In addition, operations on the facts can be defined, such as the percentage increase from one period to another.
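A hedged sketch of the cube operations just mentioned, representing a cube result as a dictionary from coordinate tuples to measures; this representation and the exact semantics (e.g., summing overlapping cells on union) are assumptions for illustration only.

```python
# Cube results as {coordinate-tuple: measure} dicts, with the set-style
# operations and the period-to-period percentage increase named above.

def cube_union(a, b):        # keep every cell; sum where both cubes have it
    return {k: a.get(k, 0) + b.get(k, 0) for k in a.keys() | b.keys()}

def cube_intersection(a, b): # cells present in both cubes
    return {k: (a[k], b[k]) for k in a.keys() & b.keys()}

def cube_difference(a, b):   # cells of a absent from b
    return {k: a[k] for k in a.keys() - b.keys()}

def pct_increase(a, b):      # percentage change from period a to period b
    return {k: 100.0 * (b[k] - a[k]) / a[k] for k in a.keys() & b.keys()}

jan = {("soap",): 100, ("soda",): 300}
feb = {("soap",): 150, ("milk",): 80}
print(pct_increase(jan, feb))   # {('soap',): 50.0}
```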
By identifying the key elements (parameters and data cube) of the business questions and classifying them, a working model was obtained that facilitates the creation of the algorithms that solve the questions. This model views the base as a multi-dimensional database. The flexibility of the model makes it possible to answer further business questions that were not considered at the start.
Based on the model, the tool uses an input screen to receive the parameters that define the type of business question and the data ranges that define the data cubes over which the questions are answered, and a corresponding output screen to return the results for interpretation.
As further proof of the usefulness of the data structure, Antecumem is used to model the nodes of a structure called a lattice. The lattice stores the views that form or complement the data cube. Depending on the ranges in a question, this structure allows either selecting the detailed records or deciding to read directly the records already accumulated in the lattice nodes. This decision further reduces the response time for other types of questions, beyond those modeled initially.
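A toy sketch of the lattice decision described above: if a materialized view already aggregates by exactly the dimensions a question groups on, its accumulated records are read directly; otherwise the detail records are scanned. The view names and contents are invented for the example, not taken from the thesis.

```python
# Lattice of materialized views: group-by dimensions -> accumulated cells.
VIEWS = {
    frozenset({"product"}):          {("soap",): 270, ("soda",): 300},
    frozenset({"product", "month"}): {("soap", 1): 120, ("soap", 2): 150,
                                      ("soda", 1): 300},
}

def answer(group_by, detail_records):
    node = VIEWS.get(frozenset(group_by))
    if node is not None:          # cheap path: records already accumulated
        return node
    return detail_records         # expensive path: scan the detail level

print(answer({"product"}, detail_records=None))  # served from the lattice
```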
Keywords: relational databases, in-memory cubes, in-memory data, data mining, incremental mining, curve fitting, multidimensional databases, OLAP, data cubes, lattices.

Automatic interchange of knowledge between business ontologies

185. Alma-Delia Cuevas, Adolfo Guzman-Arenas (2009). Automatic interchange of knowledge between business ontologies. This technical paper will appear in Proceedings of Intelligent Decision Support Technologies 2009, Japan. Click here. This is a shorter version, suited for a conference, of paper #183.
A person adds new knowledge to his or her mind by taking into account new information, additional details, better precision, synonyms, homonyms, redundancies, apparent contradictions, and inconsistencies between what he or she knows and the new knowledge being acquired. In this way, a person incrementally acquires information while keeping it consistent at all times. This information can be perfectly represented by ontologies. In contrast to this human capability, ontology-fusion algorithms lack these features, being merely computer-aided editors where a person resolves the details and inconsistencies. This article presents a method of Ontology Merging (OM), with its algorithm and implementation, to fuse or join two ontologies (obtained from Web documents) automatically (without human intervention), producing a third ontology while taking into account the inconsistencies, contradictions, and redundancies between the two, thus delivering a result close to reality. OM produces better results when compared against fusions carried out manually. Repeated use of OM allows acquiring much more information about the same topic.

An extensive explanation of OM, with examples, extended explanations, and theoretical considerations about its performance, can be found in the Ph. D. thesis of Alma Delia Cuevas Rasgado; click here. ABSTRACT: A person's knowledge increases as more information is obtained from the environment; information sources play an important role in this process. One does not learn from zero; even an animal is born with innate knowledge. Learning happens by adding new concepts or linking them to already existing ones. Although information from outside sources can contradict or confuse a person, he has the tools to somehow solve this problem. The knowledge accumulates in what we can call his ontology.
Ontologies can also be structured and defined in computers. This work focuses on ontology fusion; during the fusion, the same cases arise as those occurring to a person. The difference is that machines have no common sense, so the challenges are to automate the fusion, to carry it out in spite of problems (redundancies, descriptions at different levels of detail), and to make the result as close as possible to the one a person would obtain.
Previous works [11, 13, 28 and 40] perform ontology fusion in a semiautomatic, computer-assisted manner. Others [25 and 34] fuse ontologies expressed in a formal notation, but are incapable of fusing mutually inconsistent ontologies, as most real-life ontologies are.
This work presents a process for ontology merging that is automatic and robust: automatic, since the computer detects and solves the problems arising during the fusion, and robust, because merging succeeds even when the ontologies are mutually inconsistent or present information from different viewpoints. The efficiency of our algorithm is shown by converting several Internet documents by hand into ontologies in our notation and then fusing them automatically. Results show a slight error margin in comparison with manual fusions performed by an expert.
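To make the flavor of such a fusion concrete, here is a deliberately small sketch: two ontologies, stored as concept-to-relations dictionaries, are merged into a third, unifying concepts that a synonym table declares equal and taking the union of their descriptions. The real OM algorithm also resolves contradictions, redundancies, and differing levels of detail; the structure and synonym table here are illustrative assumptions, not the published method.

```python
SYNONYMS = {"auto": "car"}                 # hypothetical alignment table

def canon(concept):
    """Map a concept to its canonical name via the synonym table."""
    return SYNONYMS.get(concept, concept)

def merge(ont_a, ont_b):
    """Fuse two {concept: set of (relation, value)} ontologies."""
    fused = {}
    for ont in (ont_a, ont_b):
        for concept, rels in ont.items():
            c = canon(concept)
            fused.setdefault(c, set())
            fused[c] |= {(r, canon(v)) for r, v in rels}
    return fused

a = {"car":  {("is-a", "vehicle"), ("has", "engine")}}
b = {"auto": {("is-a", "vehicle"), ("has", "wheels")}}
print(merge(a, b))  # {'car': {... union of both descriptions ...}}
```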

Obtaining the consensus and inconsistency among a set of assertions on a qualitative attribute

186. Adolfo Guzman-Arenas, Adriana Jimenez. Obtaining the consensus and inconsistency among a set of assertions on a qualitative attribute. (Technical paper.) Click here.
It is well understood how to compute the average or centroid of a set of numeric values, as well as their variance. In this way we handle inconsistent measurements of the same property. We wish to solve the analogous problem on qualitative data: How to compute the “average” or consensus of a set of affirmations on a non-numeric fact, as reported for instance by different Web sites? What is the most likely truth among a set of inconsistent assertions about the same attribute?
Given a set (a bag, in fact) of statements about a qualitative feature, this paper provides a method, based on the theory of confusion, to assess the most plausible or "consensus" value: the value most likely to be true, given the information available. We also compute the inconsistency of the bag, which measures how far apart the testimonies in the bag are. All observers are equally credible, so differences arise from perception errors, due to the limited accuracy of the individual findings (the limited information extracted by the examination method from the observed reality).
Our approach differs from classical logic, which considers a set of assertions to be either consistent (True, or 1) or inconsistent (False, or 0), and it does not use Fuzzy Logic.
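As a rough illustration of the idea (not the paper's published definitions), the sketch below arranges qualitative values in a hierarchy, uses a simple asymmetric climb count as a stand-in for the confusion conf(r, s) of using value r instead of the intended s, and picks as consensus the value minimizing the mean confusion against the bag; that mean serves as the inconsistency. The hierarchy and measure are assumptions for the example.

```python
PARENT = {                      # toy hierarchy of qualitative values
    "vehicle": None,
    "car": "vehicle", "truck": "vehicle",
    "sedan": "car", "SUV": "car",
}

def is_descendant(r, s):
    """True if r equals s or lies below s in the hierarchy."""
    while r is not None:
        if r == s:
            return True
        r = PARENT[r]
    return False

def conf(r, s):
    """Stand-in confusion of using r instead of intended s: 0 if r is s
    or a specialization of s, else 1 per level s must be generalized
    before r fits under it."""
    steps = 0
    while not is_descendant(r, s):
        s = PARENT[s]
        steps += 1
    return steps

def consensus(bag):
    """Value minimizing mean confusion against the bag, plus that mean
    (reported as the bag's inconsistency)."""
    best = min(PARENT, key=lambda c: sum(conf(c, b) for b in bag))
    return best, sum(conf(best, b) for b in bag) / len(bag)

print(consensus(["sedan", "sedan", "SUV", "truck"]))  # ('sedan', 0.5)
```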

Un conjunto de reglas heurísticas para encontrar información interesante en bases de datos relacionales y un algoritmo para su aplicación (A set of heuristic rules for finding interesting information in relational databases, and an algorithm for their application)

This is the M. Sc. thesis of Mirna López Espíndola. Click here to obtain the full work.
ABSTRACT.
The storage of information in large databases makes it difficult to extract data that are useful or interesting for a specific user and goal. The goal of knowledge discovery in databases (KDD) is to obtain interesting information through the data mining process. Within this process lies the pruning of information, where the present study is placed. This pruning contemplates several ways of representing the information and several pruning methods; we use association rules to represent the information and propose a set of heuristic rules for pruning association rules. The heuristic rules were obtained from the elements that, according to several definitions of the concept "interesting," identify when and under what circumstances something is interesting. We propose an algorithm that indicates the order and circumstances in which each heuristic rule is to be applied. Finally, we work through an example of the application of our algorithm and compare its results with those obtained with the algorithm of [Sahar 99].
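As a generic illustration of the kind of pruning the thesis studies (not its specific heuristic rules, which are defined in the full text), the sketch below filters candidate association rules with the standard support, confidence, and lift thresholds; the transactions and thresholds are assumptions.

```python
from itertools import permutations

transactions = [{"bread", "butter"}, {"bread", "milk"},
                {"bread", "butter", "milk"}, {"milk"}]

def support(itemset):
    """Fraction of transactions containing the whole itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def interesting(antecedent, consequent,
                min_sup=0.4, min_conf=0.6, min_lift=1.1):
    """Keep a rule only if it clears all three generic thresholds."""
    sup = support(antecedent | consequent)
    conf = sup / support(antecedent)
    lift = conf / support(consequent)
    return sup >= min_sup and conf >= min_conf and lift >= min_lift

for a, c in permutations({"bread", "butter", "milk"}, 2):
    if interesting({a}, {c}):
        print(f"{a} -> {c}")   # prints bread -> butter, butter -> bread
```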