OpenRefine

OpenRefine, previously called Google Refine, is a Java-based application for working with messy data. It lets you load data, understand it, clean it up, and combine it with data from the web.

OpenRefine resembles a spreadsheet application in that it operates on rows and columns of data, and it can read spreadsheet file formats such as CSV, but it behaves more like a database. An OpenRefine project consists of a single table whose rows can be filtered, for example to show only the rows where a given column meets certain conditions. OpenRefine works by running a small server on your computer, which you interact with through a web browser.
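
The idea of a single table whose rows are filtered by a condition on one column can be sketched in a few lines of Python. This is only an illustration of the concept, not OpenRefine code: the sample data and the helper names load_rows and filter_rows are made up for this example.

```python
import csv
import io

# Hypothetical sample data standing in for an imported CSV file.
CSV_TEXT = """name,country,year
Ada,UK,1815
Grace,USA,1906
Alan,UK,1912
"""


def load_rows(text):
    """Parse CSV text into one table: a list of rows keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_rows(rows, column, predicate):
    """Keep only the rows whose value in `column` satisfies `predicate`."""
    return [row for row in rows if predicate(row[column])]


rows = load_rows(CSV_TEXT)
uk_rows = filter_rows(rows, "country", lambda value: value == "UK")
print([row["name"] for row in uk_rows])
```

The table itself is never changed by the filter. The filter only decides which rows are visible, which is roughly how filtering behaves in OpenRefine as well.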

OpenRefine is an open-source tool that allows you to clean up data, transform it from one format into another, and extend it with web services and external data. This activity is commonly known as data wrangling.

Unlike in spreadsheets, most operations in OpenRefine apply to all visible rows at once, for example transforming every cell of one column across all rows. The actions performed on a dataset are also stored in the project and can be replayed on other datasets. Formulas are not stored in cells; instead, they are used to transform the data, and each transformation is applied only once. Expressions can be written in General Refine Expression Language (GREL), Jython, or Clojure.
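
A rough sketch of these two ideas, transforming a whole column in one step and replaying the recorded steps on another dataset, might look like the following in plain Python. It is a simplified model under stated assumptions, not how OpenRefine is implemented: transform_column, replay, and the sample rows are hypothetical names and data chosen for illustration. Inside OpenRefine you would instead write an expression against the column, for instance something like value.trim() in GREL.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Operation = Callable[[List[Row]], List[Row]]


def transform_column(column: str, expression: Callable[[str], str]) -> Operation:
    """Build an operation that applies `expression` to every cell of `column`."""
    def operation(rows: List[Row]) -> List[Row]:
        # Return new rows so the input dataset is left untouched.
        return [{**row, column: expression(row[column])} for row in rows]
    return operation


def replay(history: List[Operation], rows: List[Row]) -> List[Row]:
    """Apply each recorded operation once, in order."""
    for operation in history:
        rows = operation(rows)
    return rows


# The recorded history: trim whitespace, then convert to title case.
history = [
    transform_column("city", str.strip),
    transform_column("city", str.title),
]

first = [{"city": "  new york "}, {"city": "BOSTON"}]
second = [{"city": "san francisco "}]

cleaned_first = replay(history, first)
cleaned_second = replay(history, second)  # same steps, different dataset
print(cleaned_first, cleaned_second)
```

The cells hold only the resulting values. The expressions live in the history, which is what makes it possible to run the same cleanup on a second dataset.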

OpenRefine keeps your data private on your own computer. Nothing leaves the machine unless you choose to share it or collaborate.


Author: Ayush Purawr