A Look at the DataRefiner

To bring new Data Governance capabilities to the Platform, we have added a new module: DataRefiner (also called DataCleaner), accessible from the «Analytics Tools» menu entry.

This module lets you load data in different formats (XLS, CSV, XML, JSON, etc.) from your PC, from the Internet or from the Platform itself (through a SQL query), and then clean, improve, restructure or reconcile that data before loading it into the Platform as an Ontology.

It also lets you work with data already stored in the Platform as Ontologies: you can process it, clean it, and generate files from it. For this purpose, the tool offers an MS Excel-like interface.

This module is built on OpenRefine, an open-source Java tool (BSD-3 license) that we covered here on the blog not long ago.

More info about it here.

Module capabilities

This module offers:

  • Importing files in different formats and from different origins.
  • Exporting processed data to different formats.
  • Importing data from an Ontology: you can connect to a Platform instance, select a query and load the resulting data into the tool.
  • Exporting already processed data (cleaned, aggregated, etc.) to an Ontology by choosing a Platform instance and working in Platform JSON format, or exporting it as a JSON file to your local machine.
  • Applying transformations to a file manually and then automating the application of those same rules to other files through a DataFlow component (for example, you could work only with data from one month, and then apply the same rules to a yearly file).
  • User-level security: each user can see only their own projects.
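To make the "define the rules once, reuse them on other files" idea more concrete, here is a minimal Python sketch. It is not DataRefiner's or OpenRefine's own rule format or API: the `RULES` list, the `apply_rules` and `read_csv` helpers and the sample data are all hypothetical, written only to illustrate the workflow of tuning cleaning rules on one month of data and then replaying them on a yearly file.

```python
import csv
import io

# Hypothetical rule format: (column name, function applied to the cell value).
# This is an illustration only, not the format DataRefiner stores internally.
RULES = [
    ("city", str.strip),                         # remove surrounding whitespace
    ("city", str.title),                         # normalise capitalisation
    ("amount", lambda v: v.replace(",", ".")),   # decimal comma -> decimal point
]


def apply_rules(rows, rules):
    """Return new rows with each rule applied, in order, to its column."""
    cleaned = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for column, transform in rules:
            if new_row.get(column) is not None:
                new_row[column] = transform(new_row[column])
        cleaned.append(new_row)
    return cleaned


def read_csv(text):
    """Parse CSV text into a list of dictionaries keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Made-up sample data: a small monthly file and a larger "yearly" file.
JANUARY = 'city,amount\n  madrid ,"10,5"\nSEVILLA,3\n'
FULL_YEAR = 'city,amount\n  madrid ,"10,5"\nSEVILLA,3\nbilbao ,"7,25"\n'

# Tune the rules on the small file, then replay the same rules on the large one.
january_clean = apply_rules(read_csv(JANUARY), RULES)
year_clean = apply_rules(read_csv(FULL_YEAR), RULES)
```

Keeping the rules as data, separate from the files they are applied to, is what makes the replay step possible: the same list can be run against any file that shares the column names.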

Our Development Portal has several guides that explain how to use DataRefiner in detail, so we encourage you to take a look.
