Quality Testing

Quality is delighting customers

I want to know what big data and BI testing are, and which tools we can use when doing BI testing?


Replies to This Discussion

BI testing involves several stages:

1. Testing raw source data - This can be done using data-quality and data-analysis tools such as Talend or Informatica Data Quality.
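The kinds of checks such data-quality tools run (completeness, format, uniqueness) can be sketched in plain Python; the column names and sample rows below are hypothetical, just to show the idea:

```python
# Minimal data-quality checks on raw source rows.
# Column names and sample values are hypothetical.
rows = [
    {"customer_id": "1001", "email": "a@example.com", "amount": "10.50"},
    {"customer_id": "1002", "email": "", "amount": "7.25"},
    {"customer_id": "1001", "email": "c@example.com", "amount": "abc"},
]

def profile(rows):
    issues = []
    seen_ids = set()
    for i, row in enumerate(rows):
        if not row["email"]:                      # completeness check
            issues.append((i, "missing email"))
        try:                                      # type/format check
            float(row["amount"])
        except ValueError:
            issues.append((i, "non-numeric amount"))
        if row["customer_id"] in seen_ids:        # uniqueness check
            issues.append((i, "duplicate customer_id"))
        seen_ids.add(row["customer_id"])
    return issues

print(profile(rows))
# [(1, 'missing email'), (2, 'non-numeric amount'), (2, 'duplicate customer_id')]
```

Real tools run the same classes of checks declaratively and at scale; this just makes the logic visible.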

2. Testing ETL - Mainly done manually through SQL queries, but it can be partially automated using features provided by ETL tools, such as data comparison and data validation against business rules.
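The core of the manual SQL approach is reconciling source against target. A minimal sketch, using an in-memory sqlite3 database as a stand-in for the real source and warehouse (table and column names are hypothetical):

```python
import sqlite3

# Source-vs-target reconciliation: a row-count check plus an EXCEPT
# (minus) query to find rows that are missing or wrong in the target.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders (id INTEGER, amount REAL);
    CREATE TABLE tgt_orders (id INTEGER, amount REAL);
    INSERT INTO src_orders VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO tgt_orders VALUES (1, 10.0), (2, 20.0), (3, 35.0);
""")

def reconcile(conn):
    # Row-count check: cheapest way to catch dropped or duplicated rows.
    src_count = conn.execute("SELECT COUNT(*) FROM src_orders").fetchone()[0]
    tgt_count = conn.execute("SELECT COUNT(*) FROM tgt_orders").fetchone()[0]
    # Rows present in source but absent (or transformed wrongly) in target.
    mismatches = conn.execute("""
        SELECT id, amount FROM src_orders
        EXCEPT
        SELECT id, amount FROM tgt_orders
    """).fetchall()
    return src_count == tgt_count, mismatches

print(reconcile(conn))  # (True, [(3, 30.0)]) - counts match, one value differs
```

The same pair of queries, pointed at the real databases, is what most hand-written ETL test suites boil down to.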

3. Testing BI reports - Report functionality can be tested with functional automation tools such as Selenium and QTP.

Data validation can be performed by using the functional automation tool's facility for creating JDBC connections to the database and firing SQL queries.
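That pattern is: read a figure from the rendered report, run the report's underlying SQL against the database, and compare. A minimal sketch, assuming the UI value has already been scraped (e.g. by Selenium) and using sqlite3 in place of a JDBC connection; the table and query are hypothetical:

```python
import sqlite3

# Compare a total displayed on a BI report with the value computed
# directly from the database the report is built on.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, revenue REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("EMEA", 120.0), ("EMEA", 80.0), ("APAC", 50.0)])

def validate_report_total(conn, region, displayed_total):
    # displayed_total would come from the report UI via Selenium/QTP.
    db_total = conn.execute(
        "SELECT SUM(revenue) FROM sales WHERE region = ?", (region,)
    ).fetchone()[0]
    return db_total == displayed_total

print(validate_report_total(conn, "EMEA", 200.0))  # True
print(validate_report_total(conn, "APAC", 60.0))   # False - report disagrees
```

A failing comparison tells you the report and the warehouse disagree; it does not by itself say which side is wrong.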

Hi kiklo, 

Hope you are doing great.

Big Data is growing at a rapid pace and with Big Data comes bad data. Many companies are using Business Intelligence to make strategic decisions in the hope of gaining a competitive advantage in a tough business landscape. But bad data will cause them to make decisions that will cost their firms millions of dollars.

Testing big data is one of the toughest testing problems, and it takes considerable effort.

A typical agile work model, supported by a Hadoop ecosystem, involves analyzing ETL jobs (or extract, transform and load jobs), transforming data sets, validating them across different layers of data flow and migrating data across multiple databases. Understanding the flow of data against the well-defined business layers in a whole new enterprise warehouse model is typically challenging to handle. As a result, data scientists and analysts prefer automating this workflow and addressing complexities in data transformation and validation by developing Hadoop (Pig) scripts that validate and process data.
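The validate-and-process step those Pig scripts perform can be sketched in plain Python: filter malformed records at a layer boundary so only clean rows flow downstream. The record layout (id, ISO date, amount) is hypothetical:

```python
import datetime

# Pig-style FILTER step: keep only records that parse cleanly before
# passing them to the next layer of the data flow.
raw = [
    "1001,2023-01-05,49.99",
    "1002,not-a-date,12.00",
    "1003,2023-01-06,",
]

def is_valid(line):
    parts = line.split(",")
    if len(parts) != 3 or not parts[2]:
        return False
    try:
        datetime.date.fromisoformat(parts[1])  # date must parse
        float(parts[2])                        # amount must be numeric
    except ValueError:
        return False
    return True

clean = [line for line in raw if is_valid(line)]
print(clean)  # ['1001,2023-01-05,49.99']
```

In a real pipeline the rejected rows would typically be routed to an error table for investigation rather than silently dropped.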

Tools used for testing big data are listed below:


  • Databases: CouchDB, MongoDB, Cassandra, Redis, ZooKeeper, HBase


  • Hadoop, Hive, Pig, Cascading, Oozie, Kafka, S4, MapR, Flume


  • S3, HDFS (Hadoop Distributed File System)


  • Elastic, Google App Engine, EC2


  • R, Yahoo! Pipes, Mechanical Turk, BigSheets, Datameer

  • Increasing need for live integration of information: With information arriving from multiple sources in different formats, live integration has become imperative. This forces enterprises to maintain constantly clean and reliable data, which can only be ensured through end-to-end testing of the data sources and integrators.
  • Instant data collection and deployment: The power of predictive analytics and the ability to take decisive action have pushed enterprises to adopt instant data collection solutions. These decisions deliver significant business impact by leveraging insights from minute patterns in large data sets. Add to that the CIO's mandate to deploy solutions instantly to stay in tune with changing business dynamics. Unless the applications and data feeds are tested and certified for live deployment, these challenges cannot be met with the assurance that every critical operation demands.
  • Real-time scalability challenges: Big data applications are built to match the level of scalability and the monumental data processing involved in a given scenario. Critical errors in the architectural elements governing the design of big data applications can lead to catastrophic situations. Rigorous testing, involving smarter data sampling and cataloguing techniques coupled with high-end performance testing capabilities, is essential to meet the scalability problems that big data applications pose.

