Extract, transform and load architecture for metadata collection
De Giusti, Marisa Raquel, Lira, Ariel Jorge and Oviedo, Néstor Fabián.
VI Simposio Internacional de Bibliotecas Digitales. Pontificia Universidad Católica de Río Grande Do Sul, Porto Alegre, 2011.
  ARK: https://n2t.net/ark:/13683/ptyc/wtQ
Abstract
Digital repositories that act as resource aggregators typically face challenges that fall roughly into three categories: extraction, improvement, and storage. The first comprises issues in handling different resource collection protocols (OAI-PMH, web crawling, web services, etc.) and different resource representations (XML, HTML, database tuples, unstructured documents, etc.). The second comprises information improvements based on controlled vocabularies, specific date formats, correction of malformed data, etc. The third deals with the destination of downloaded resources: unification into a common database, sorting by certain criteria, etc. This paper proposes an ETL architecture for designing a software application that provides a comprehensive solution to the challenges a digital repository faces as a resource aggregator. Design and implementation aspects considered during the development of this tool are described, with particular attention to architecture highlights.
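The three categories named in the abstract can be illustrated with a minimal sketch. This is not the authors' tool: every function name, the record layout, the `DD/MM/YYYY` input date format, and the small vocabulary table are assumptions made for illustration only. The extraction stage here reads in-memory dictionaries where a real aggregator would call an OAI-PMH harvester, crawler, or web service client.

```python
from datetime import datetime
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, str]

# Hypothetical controlled vocabulary: free-text type -> preferred term.
TYPE_VOCABULARY = {"paper": "article", "artículo": "article", "tesis": "thesis"}


def extract(raw_items: Iterable[Record]) -> Iterator[Record]:
    """Extraction: placeholder for a harvester, crawler, or service client."""
    for item in raw_items:
        yield dict(item)  # copy so later stages do not mutate the source


def normalize_date(record: Record) -> Record:
    """Improvement: rewrite an assumed 'DD/MM/YYYY' date as ISO 8601."""
    try:
        parsed = datetime.strptime(record.get("date", ""), "%d/%m/%Y")
        record["date"] = parsed.date().isoformat()
    except ValueError:
        pass  # leave values that do not match the assumed format untouched
    return record


def apply_vocabulary(record: Record) -> Record:
    """Improvement: map the resource type onto the controlled vocabulary."""
    term = record.get("type", "").strip().lower()
    record["type"] = TYPE_VOCABULARY.get(term, term)
    return record


def transform(records: Iterable[Record],
              steps: List[Callable[[Record], Record]]) -> Iterator[Record]:
    """Apply each improvement step, in order, to every record."""
    for record in records:
        for step in steps:
            record = step(record)
        yield record


def load(records: Iterable[Record],
         store: Dict[str, List[Record]]) -> Dict[str, List[Record]]:
    """Storage: unify records into one store, grouped by resource type."""
    for record in records:
        store.setdefault(record.get("type") or "unknown", []).append(record)
    return store


def run_pipeline(raw_items: Iterable[Record]) -> Dict[str, List[Record]]:
    """Chain the three stages: extract, transform, load."""
    steps = [normalize_date, apply_vocabulary]
    return load(transform(extract(raw_items), steps), {})
```

Keeping each stage as a separate function means a new collection protocol or a new improvement rule can be added without touching the others, which is the separation of concerns the three categories suggest.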
Creative Commons
This work is licensed under a Creative Commons license.
To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-nd/4.0/deed.es.