Last week, DVOSoftware announced the first release of DataValidator. The company's founder, Val Rayzman, is a former executive at Informatica. DataValidator is a tool that seems so obviously useful that it is curious no one else has built it before. It compares two data sources and finds differences based on the evaluation criteria you select.
For example, consider an ETL process that loads 5000 records from an operational system into a data warehouse. We want to know that all of the records loaded successfully and that none of the values were unexpectedly truncated or altered. In a few clicks, we can create and execute a test in DataValidator that will show any variances between the two tables. Of course, DataValidator can also handle more complex tests, including set tests (i.e. show all values that occur in one set but not the other), multiple tables, incremental values, lookups, etc.
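To make the idea concrete, here is a minimal sketch of the kind of check described above, written by hand in Python against an in-memory SQLite database. It is not how DataValidator works internally; the table names (`src_customers`, `dw_customers`), the `compare_tables` helper, and the sample rows are all hypothetical and exist only for illustration. It reports keys missing from the target, keys that appear only in the target, and rows whose values differ.

```python
import sqlite3


def compare_tables(conn, source, target, key, columns):
    """Compare two tables by key (hypothetical helper, for illustration only).

    Returns (missing, extra, mismatches):
      missing    - keys present in source but not in target
      extra      - keys present in target but not in source
      mismatches - {key: (source_values, target_values)} where values differ
    """
    cols = ", ".join([key] + columns)
    src = {row[0]: row[1:] for row in conn.execute(f"SELECT {cols} FROM {source}")}
    tgt = {row[0]: row[1:] for row in conn.execute(f"SELECT {cols} FROM {target}")}

    missing = sorted(src.keys() - tgt.keys())
    extra = sorted(tgt.keys() - src.keys())
    mismatches = {
        k: (src[k], tgt[k])
        for k in src.keys() & tgt.keys()
        if src[k] != tgt[k]
    }
    return missing, extra, mismatches


# Made-up sample data: one row never loaded, one value truncated.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src_customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dw_customers  (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO src_customers VALUES (1, 'Alexander'), (2, 'Beatrice'), (3, 'Carl');
INSERT INTO dw_customers  VALUES (1, 'Alexander'), (2, 'Beatr');
""")

missing, extra, mismatches = compare_tables(
    conn, "src_customers", "dw_customers", "id", ["name"]
)
print(missing)     # [3]
print(extra)       # []
print(mismatches)  # {2: (('Beatrice',), ('Beatr',))}
```

Even this small script needs to know the key column, the columns to compare, and how to read both sources, and it loads both tables fully into memory. Multiply that by every table in a warehouse and the appeal of building the same test in a few clicks becomes clear.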
The real value of DataValidator is in the reduced amount of time it takes to perform data validation. Data validation is a highly manual process in most organizations. Some have developed scripts that are automated and run after each load cycle. However, many organizations, short on time and resources, never get this far. Instead, they end up in a reactive mode, correcting data errors after they have been discovered by users. The irony here is that organizations purchase Informatica to speed up and automate load processes, but are then still required to manually write code for validation. DataValidator fills that void.
There are a few features that would make DataValidator a more complete tool, such as scheduling. Val assures me that these additional features are in the plan for future releases. The idea is to get the product into the market and learn from its users which features are needed most. That said, I believe that DataValidator offers a lot of value as it exists now. I don't think it is difficult to make an ROI case for something like this. But the challenge is the same one that data quality and profiling tools face: convincing organizations that data quality has a financial impact on the bottom line.
I am hoping to work with DataValidator on a live project soon and evaluate it more completely. I will be certain to share my experiences here.