
BIG DATA TESTING

Big Data generates value from the storage and processing of very large quantities of digital information that cannot be analysed with traditional computing techniques.


Presentation Transcript


  1. BIG DATA TESTING By QA InfoTech

  2. Scenario

  3. OMG!! Did he just ask me to catch rats in a place full of snakes?

  4. Agenda • What is Big Data • Characteristics of Big Data • Meaning of BIG DATA to “US” • Hadoop • Submitting a Map Reduce Job

  5. What is BIG DATA? • ‘Big Data’ is similar to ‘small data’, but bigger in size • Big Data generates value from the storage and processing of very large quantities of digital information that cannot be analyzed with traditional computing techniques. • Walmart handles more than 1 million customer transactions every hour. • Facebook handles 40 billion photos from its user base. • Decoding the human genome originally took 10 years to process; now it can be achieved in one week.

  6. Three Characteristics of Big Data: the 3 Vs (Volume, Velocity, Variety)

  7. What does BIG DATA TESTING mean to Testers? • Take into consideration these 3 perspectives: • Data • Infrastructure • Validation Tools

  8. Now the question comes: what technology is needed for handling BIG DATA? 1. HADOOP

  9. Hadoop & Its Components • Hadoop is an open-source software framework for storing and processing big data in a distributed fashion on large clusters of commodity hardware. Essentially, it accomplishes two tasks: massive data storage and faster processing. Source: http://www.trieuvan.com/apache/hadoop/common/

  10. How is Hadoop Helping? • HDFS: Java-based distributed file system that can store all kinds of data • Map Reduce: A software programming model for processing large sets of data in parallel • YARN: A resource management framework for scheduling and handling resource requests from distributed applications
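
To make the Map Reduce model concrete, here is a minimal plain-Java sketch that needs no cluster and no Hadoop library. It walks a few readings through the three phases (map, shuffle, reduce) in memory. The class name `MapReduceSketch`, the simplified "year,temperature" input format, and the sample values are assumptions made for illustration only; on a real cluster the framework performs the shuffle and runs the phases in parallel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapReduceSketch {

    // Map phase: turn one input line ("year,temperature") into a key/value pair.
    static Map.Entry<String, Integer> map(String line) {
        String[] parts = line.split(",");
        return Map.entry(parts[0], Integer.parseInt(parts[1]));
    }

    // Shuffle phase: group all values by key (done by the framework on a real cluster).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse all values for one key into a single result (the maximum).
    static int reduce(List<Integer> values) {
        int max = Integer.MIN_VALUE;
        for (int v : values) {
            max = Math.max(max, v);
        }
        return max;
    }

    // Run the three phases in sequence and return the maximum per year.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.add(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : shuffle(pairs).entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative sample data, not taken from the presentation's input file.
        List<String> lines = List.of("1949,111", "1949,78", "1950,0", "1950,22", "1950,-11");
        System.out.println(run(lines)); // prints {1949=111, 1950=22}
    }
}
```

Because each phase is a small pure function, a tester can check the map and reduce logic separately before the job ever runs on a cluster.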

  11. This is our Input File: Input Sampleset.txt

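  12. Map Reduce Program For Max Temperature: Driver Class
      // Create the job and tell Hadoop which jar contains the job classes
      Job job = Job.getInstance();
      job.setJarByClass(MaxTemperatureDriver.class);
      job.setJobName("Max Temperature");
      // Input and output locations come from the command line
      FileInputFormat.addInputPath(job, new Path(args[0]));
      FileOutputFormat.setOutputPath(job, new Path(args[1]));
      job.setMapperClass(MaxTemperatureMapper.class);
      job.setReducerClass(MaxTemperatureReducer.class);
      // Output types match what the reducer writes: (Text year, IntWritable max)
      job.setOutputKeyClass(Text.class);
      job.setOutputValueClass(IntWritable.class);
      // Submit the job and wait for it to finish
      System.exit(job.waitForCompletion(true) ? 0 : 1);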

  13. Mapper Class
      @Override
      public void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        String line = value.toString();
        // The year sits at a fixed position in each record
        String year = line.substring(15, 19);
        int airTemperature;
        if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
          airTemperature = Integer.parseInt(line.substring(88, 92));
        } else {
          airTemperature = Integer.parseInt(line.substring(87, 92));
        }
        // Emit (year, temperature) so the reducer can find the maximum per year
        context.write(new Text(year), new IntWritable(airTemperature));
      }
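
From a tester's point of view, the fixed-width parsing in the mapper is the part most likely to break on bad data. The sketch below pulls the same offsets (15 to 19 for the year, 87 to 92 for the signed temperature) into plain Java methods so they can be exercised without Hadoop. The class `TemperatureRecordParser` and the helper `syntheticLine` are hypothetical names introduced here; the helper builds a filler record, it does not reproduce the real input file.

```java
import java.util.Arrays;

public class TemperatureRecordParser {

    // Same offset as in the mapper: characters 15..18 hold the year.
    static String parseYear(String line) {
        return line.substring(15, 19);
    }

    // Same logic as in the mapper: skip an explicit '+' sign before parsing.
    static int parseTemperature(String line) {
        if (line.charAt(87) == '+') {
            return Integer.parseInt(line.substring(88, 92));
        }
        return Integer.parseInt(line.substring(87, 92));
    }

    // Hypothetical test helper: a 93-character line of '0' filler with the
    // year and a 5-character signed temperature placed at the mapper's offsets.
    static String syntheticLine(String year, String signedTemp) {
        char[] filler = new char[93];
        Arrays.fill(filler, '0');
        StringBuilder sb = new StringBuilder(new String(filler));
        sb.replace(15, 19, year);
        sb.replace(87, 92, signedTemp);
        return sb.toString();
    }

    public static void main(String[] args) {
        String warm = syntheticLine("1950", "+0022");
        String cold = syntheticLine("1950", "-0011");
        System.out.println(parseYear(warm) + " " + parseTemperature(warm)); // prints 1950 22
        System.out.println(parseYear(cold) + " " + parseTemperature(cold)); // prints 1950 -11
    }
}
```

Lines shorter than 92 characters would throw `StringIndexOutOfBoundsException` here, which is exactly the kind of malformed-record case worth adding to a test data set.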

  14. Reducer Class
      @Override
      public void reduce(Text key, Iterable<IntWritable> values, Context context)
          throws IOException, InterruptedException {
        // Start below any possible reading, then keep the largest value seen
        int maxValue = Integer.MIN_VALUE;
        for (IntWritable value : values) {
          maxValue = Math.max(maxValue, value.get());
        }
        // One output record per year: (year, maximum temperature)
        context.write(key, new IntWritable(maxValue));
      }
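
The reducer writes one (year, maximum) pair per key. Assuming the job uses Hadoop's default text output, where key and value are separated by a tab, a tester could validate the result against independently computed expectations along these lines. The class `OutputValidator`, its method names, and the sample values are assumptions for illustration, not part of the presentation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class OutputValidator {

    // Parse output lines of the form "year<TAB>maxTemperature" into a sorted map.
    static Map<String, Integer> parseOutput(List<String> lines) {
        Map<String, Integer> result = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split("\t");
            result.put(parts[0], Integer.parseInt(parts[1].trim()));
        }
        return result;
    }

    // Return readable mismatch messages; an empty list means the output matches.
    static List<String> diff(Map<String, Integer> expected, Map<String, Integer> actual) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Integer> e : expected.entrySet()) {
            Integer got = actual.get(e.getKey());
            if (!e.getValue().equals(got)) {
                problems.add(e.getKey() + ": expected " + e.getValue() + ", got " + got);
            }
        }
        for (String key : actual.keySet()) {
            if (!expected.containsKey(key)) {
                problems.add(key + ": unexpected key");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Illustrative values only; real expectations would come from a trusted source.
        Map<String, Integer> expected = new TreeMap<>(Map.of("1949", 111, "1950", 22));
        Map<String, Integer> actual = parseOutput(List.of("1949\t111", "1950\t21"));
        System.out.println(diff(expected, actual)); // prints [1950: expected 22, got 21]
    }
}
```

Checking both directions (missing keys and unexpected keys) catches dropped records as well as wrong maxima.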

  15. Thank You • For more information, please: • Contact us at info@qainfotech.com • Visit us at www.qainfotech.com • Read our blog at www.qainfotech.com/blog • Follow us on Twitter at www.twitter.com/qainfotech • USA Office: Farmington Hills, Michigan, U.S.A. • International Headquarters: Noida, Uttar Pradesh, India
