Hadoop Interview Questions and Answers for MapReduce in 2023

**Advantages of MapReduce**
Advantage	Description
Flexible	Hadoop MapReduce programs can access and operate on many types of structured and unstructured data
Parallel Processing	MapReduce divides a job into tasks that run in parallel
Resilient	It is fault tolerant: it quickly detects failures and automatically applies a recovery step
Scalable	Hadoop is a highly scalable platform that can store and distribute large data sets across many servers
Cost-effective	Hadoop's high scalability also makes it a cost-effective solution for ever-growing data storage needs
Simple	It is based on a simple programming model
Secure	Hadoop MapReduce works with the security mechanisms of HDFS and HBase
Speed	It stores data in a distributed file system and processes it in parallel, so even large sets of unstructured data can be processed in minutes

Karthik says:
Oct 3, 2016 at 5:16 am GMT
What is a custom key, and how can I implement one?
- EdurekaSupport says:
  Oct 3, 2016 at 10:15 am GMT
  Hey Karthik, thanks for checking out the blog. Here’s a brief explanation of custom keys and how to implement them.
  – In Hadoop, a data type used as a key must implement the WritableComparable interface, and a data type used as a value must implement the Writable interface.
  – If your custom key and value are of the same type, you can write one custom data type that implements WritableComparable and use it for both. Otherwise you need two data types: one for the key, which implements WritableComparable, and one for the value, which implements Writable.
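  // Custom data type usable as a key; the type parameter names the class it is compared against
  public class MyCustomKey implements WritableComparable<MyCustomKey>
  {
    // must implement write(DataOutput), readFields(DataInput) and compareTo(MyCustomKey)
  }
  // Mapper that emits the custom key; the other three type parameters are only examples
  public class MyMapper extends Mapper<LongWritable, Text, MyCustomKey, IntWritable>
  {
    // override map(LongWritable key, Text value, Context context) to emit MyCustomKey keys
  }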
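A fuller sketch of such a key may help show what the three required methods do. It rests on some assumptions: `UserYearKey`, its fields, and the `roundTrip` helper are hypothetical names invented for illustration, and the two small interfaces at the top are local stand-ins that mirror Hadoop's `org.apache.hadoop.io.Writable` and `WritableComparable`, so that the example compiles without Hadoop on the classpath. In a real job you would import Hadoop's interfaces instead of declaring them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Stand-ins mirroring Hadoop's interfaces (assumption: used only so this compiles alone).
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

interface WritableComparable<T> extends Writable, Comparable<T> {}

// Hypothetical composite key: a user name plus a year.
class UserYearKey implements WritableComparable<UserYearKey> {
    private String user = "";
    private int year;

    UserYearKey() {}  // Hadoop instantiates keys reflectively, so keep a no-arg constructor

    UserYearKey(String user, int year) {
        this.user = user;
        this.year = year;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(user);   // serialize the fields in a fixed order
        out.writeInt(year);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        user = in.readUTF();  // read them back in the same order as write()
        year = in.readInt();
    }

    @Override
    public int compareTo(UserYearKey other) {
        // Sort order seen by the reducer: by user first, then by year.
        int byUser = user.compareTo(other.user);
        return byUser != 0 ? byUser : Integer.compare(year, other.year);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof UserYearKey)) return false;
        UserYearKey k = (UserYearKey) o;
        return user.equals(k.user) && year == k.year;
    }

    @Override
    public int hashCode() {
        return 31 * user.hashCode() + year;  // the default HashPartitioner relies on hashCode()
    }

    // Serializes a key and reads it back into a fresh instance, similar to what
    // happens when keys travel from map tasks to reduce tasks.
    static UserYearKey roundTrip(UserYearKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            UserYearKey copy = new UserYearKey();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Overriding `equals` and `hashCode` consistently with `compareTo` matters because equal keys should land in the same partition and be grouped into the same reduce call.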
  - Karthik says:
    Oct 3, 2016 at 1:52 pm GMT
    Thank you..
bharadwaj says:
Sep 19, 2016 at 1:07 am GMT
Can you explain custom input formats in detail?
- EdurekaSupport says:
  Sep 26, 2016 at 11:49 am GMT
  Hey Bharadwaj, thanks for checking out the blog. With regard to your query, a custom input format can be implemented to suit a specific requirement. Before writing one, please have a look at some of the input formats already available in MapReduce.
  The default InputFormat is the TextInputFormat. This treats each line of each input file as a separate record, and performs no parsing. This is useful for unformatted data or line-based records like log files.
  A more interesting input format is the KeyValueTextInputFormat. This format also treats each line of input as a separate record. While the TextInputFormat treats the entire line as the value, the KeyValueTextInputFormat breaks the line itself into a key and a value by searching for a tab character. This is particularly useful for reading the output of one MapReduce job as the input to another.
  Finally, the SequenceFileInputFormat reads special binary files that are specific to Hadoop. These files include many features designed to allow data to be read rapidly into Hadoop mappers. Sequence files can be block-compressed and provide direct serialization and deserialization of several arbitrary data types (not just text). Sequence files can be generated as the output of other MapReduce tasks and are an efficient intermediate representation for data that is passing from one MapReduce job to another.
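The difference between the two line-based formats can be sketched in plain Java. This is only an illustration of the record shapes described above, not Hadoop's own reader code: `RecordShapes` and both method names are hypothetical, and the tab-splitting format is the one that Hadoop's API calls `KeyValueTextInputFormat`.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical helper that mimics the (key, value) pairs a mapper would receive.
class RecordShapes {
    // TextInputFormat-style record: the key is the line's byte offset in the file
    // and the value is the whole line, left unparsed.
    static Map.Entry<Long, String> asTextRecord(long byteOffset, String line) {
        return new SimpleEntry<>(byteOffset, line);
    }

    // Key/value-style record: split at the first tab character.
    // If the line has no tab, the whole line becomes the key and the value is empty.
    static Map.Entry<String, String> asKeyValueRecord(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            return new SimpleEntry<>(line, "");
        }
        return new SimpleEntry<>(line.substring(0, tab), line.substring(tab + 1));
    }
}
```

In an actual job the format is chosen on the `Job` object, for example with `job.setInputFormatClass(KeyValueTextInputFormat.class)`. A custom input format would replace that class with your own `InputFormat` implementation.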
  Hope this helps. Please get in touch if you have any other queries.
AMIT RAJPUT says:
Oct 10, 2015 at 9:17 am GMT
In the Hadoop framework, who decides the input split?
- sulthan syedibrahim says:
  Dec 7, 2015 at 9:46 am GMT
  The splits are computed by the job's InputFormat, and their size is controlled by three settings:
  i. mapreduce.input.fileinputformat.split.minsize
  ii. mapreduce.input.fileinputformat.split.maxsize, and
  iii. the HDFS block size, which is used by default
  Developers usually leave the split size equal to the block size. If a whole file has to be processed by a single mapper, you can set the minimum split size higher than the file size.
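How the three settings combine can be sketched as a small function. As far as I know this matches the rule used by `FileInputFormat` for file-based inputs (the minimum wins over the block size, and the maximum caps it), but treat it as an illustration: `SplitSizeSketch` is a hypothetical class name, and the sizes in the usage comment are made-up examples.

```java
// Hypothetical helper showing how the minimum size, maximum size and block size
// combine into the effective split size.
class SplitSizeSketch {
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        // Cap the block size at maxSize, then raise the result to at least minSize.
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }
}
```

With a 128 MB block and the defaults (a minimum of 1 byte and a maximum of `Long.MAX_VALUE`), the split size equals the block size. Raising the minimum to 1 GB gives 1 GB splits, so any file smaller than that is handled by a single mapper.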
bala says:
Oct 5, 2015 at 5:32 pm GMT
What is the generic InputSplit class?
Sande says:
Jul 11, 2015 at 7:10 am GMT
What data structure is used in Hadoop?
- EdurekaSupport says:
  Jul 16, 2015 at 8:56 am GMT
  Hi Sande, HDFS is the default underlying storage platform of Hadoop. It's like any other file system in the sense that it does not care what structure the files have. It only ensures that files are stored redundantly and can be retrieved quickly.
  So it is entirely up to you, the user, to store files with whatever structure you like inside them.
  A MapReduce program simply gets the file data fed to it as input: not necessarily the entire file, but parts of it, depending on the InputFormat and so on. The map function can then use the data in whatever way it wants.
Awanish says:
Sep 14, 2013 at 12:01 pm GMT
Very nice post, thanks a lot!
Very helpful.