Apache PIG Hadoop Developer Practice Questions
Mamun 100% Verified
1. What is PIG in Hadoop?
A. It is a sub-set of the API of Hadoop for data processing
B. It is a part of the Apache Hadoop project which provides a C-like scripting language
interface for doing data analysis
C. It is also a part of the Apache Hadoop project. It's a "PL/SQL"-like interface to write
programs for data analysis in a Hadoop cluster
D. PIG is the third most popular form of meat in the US behind poultry and beef. -
ANSWER B
2. Which of the following best describes the relationship between MapReduce and Pig?
A. Pig provides additional functionality that allows certain types of data manipulation not
possible with MapReduce.
B. Pig provides no additional capabilities to MapReduce. Pig programs are executed as
MapReduce jobs via the Pig interpreter.
C. Pig programs rely on MapReduce but are extensible, allowing developers to do
special-purpose processing not provided by MapReduce.
D. Pig provides the additional capability of allowing you to control the flow of multiple
MapReduce jobs. - ANSWER D
3. T/F: Grunt remembers command history. - ANSWER True
4. Top PIG commands: (5) _______, _______, describe, limit, filter - ANSWER load, dump,
describe, limit, filter
5. Each Pig statement ends with? - ANSWER ;
6. T/F: The DESCRIBE command in PIG is just like DESCRIBE in Oracle. - ANSWER True
7. What does the DUMP command do? (in PIG) - ANSWER Ans: it prints the contents of an
alias to the screen, like the cat command in Unix
8. Example of limit command: (PIG) - ANSWER Ans: B = LIMIT A 100 ;
(A is an alias, B is new alias)
9. PIG: FOREACH does what? - ANSWER Ans: runs through each row
10. PIG: Example of FOREACH: - ANSWER Ans: C = FOREACH B GENERATE symbol,
date, close;
(symbol, date, close are already defined column labels in B)
11. What would following do? (PIG)
A = load 'file1' using PigStorage(':'); - ANSWER Ans: will create alias A and fields will be
$0, $1 etc using : as field separator
12. Given above how to create an alias called A1 using just the first column in A and call
the column ID. (PIG) - ANSWER Ans: A1 = Foreach A Generate $0 as ID;
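The two answers above can be combined into one short script. This is only a sketch: 'file1' is the file named in the question, and the sample line in the comment is a hypothetical illustration of colon-separated input.

```pig
-- Hypothetical contents of 'file1', one record per line, e.g.: 101:alice:NY
A = LOAD 'file1' USING PigStorage(':');  -- no schema given, so fields are referenced as $0, $1, ...
A1 = FOREACH A GENERATE $0 AS ID;        -- keep only the first field and name it ID
DESCRIBE A1;                             -- shows the schema of A1
DUMP A1;                                 -- prints the rows of A1 to the screen
```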
13. How to run the pigscript1.pig on the local machine? - ANSWER Ans: pig -x local
pigscript1.pig
14. PIG: You ran: A = load './input.txt'; will it work? - ANSWER Ans: yes, because TAB
is the default delimiter.
15. What does flatten do? (PIG) - ANSWER Ans: Flatten un-nests tuples as well as bags
Consider a relation that has a tuple of the form (a, (b, c)). The expression
GENERATE $0, flatten($1),
will cause that tuple to become (a, b, c).
16. T/F: Flatten can be used to convert a bag into tuples: - ANSWER True
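A minimal sketch of both uses of FLATTEN, on a tuple and on a bag. The file name 'nested.txt', its layout, and the field names are assumptions made for illustration only.

```pig
-- Hypothetical tab-delimited file 'nested.txt' with lines such as:
-- a	(b,c)	{(x),(y)}
R = LOAD 'nested.txt' AS (f0:chararray,
                          t:tuple(f1:chararray, f2:chararray),
                          b:bag{r:tuple(v:chararray)});
T = FOREACH R GENERATE f0, FLATTEN(t);  -- tuple case: (a,(b,c)) becomes (a,b,c)
B = FOREACH R GENERATE f0, FLATTEN(b);  -- bag case: (a,{(x),(y)}) becomes (a,x) and (a,y)
DUMP T;
DUMP B;
```

Flattening a tuple widens the row, while flattening a bag produces one output row per tuple in the bag.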
17. Which of the following is a pig command that can be used to read a text file (t1) and
load it into an alias called A and each tuple will be just one string (whole line) called line.
- ANSWER Ans: A = LOAD 't1' AS (line:chararray);
18. What does the function TOKENIZE do? - ANSWER Ans: to split a string of words (all
words in a single tuple) into a bag of words (each word in a single tuple).
19. Which characters are considered to be word separators for TOKENIZE: - ANSWER
Ans: space, double quote ("), comma (,), parentheses (()), star (*).
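As a sketch of how TOKENIZE is typically combined with FLATTEN, the script below splits lines into words. It assumes a text file 't1' whose lines contain no TAB characters, so that each whole line lands in the single field line; the alias names W and X are illustrative.

```pig
A = LOAD 't1' AS (line:chararray);
W = FOREACH A GENERATE TOKENIZE(line) AS words;          -- one bag of words per input line
X = FOREACH A GENERATE FLATTEN(TOKENIZE(line)) AS word;  -- one row per word
DUMP X;
```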
20. PIG: To create the alias named B with alias A and grouping the records on the basis
of ID: - ANSWER Ans: B = GROUP A BY ID;
21. PIG: Syntax for Sorting in Pig for sorting alias A on the age column and putting the
resulting data in alias B - ANSWER Ans: B = ORDER A BY age;
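The GROUP and ORDER answers above can be sketched in context as follows. The file 'people.txt', its schema, and the aliases G, C and S are hypothetical and chosen only for illustration.

```pig
-- Hypothetical tab-delimited input with three columns: ID, name, age
A = LOAD 'people.txt' AS (ID:int, name:chararray, age:int);
G = GROUP A BY ID;                                  -- each row: (group, {bag of A rows with that ID})
C = FOREACH G GENERATE group AS ID, COUNT(A) AS n;  -- number of records per ID
S = ORDER A BY age;                                 -- ascending by default; add DESC to reverse
DUMP C;
DUMP S;
```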
22. Put the following in order as you would run a pig script to retrieve the records of the
stock symbol IBM: A) Group and average
B) Load the data
C) Filter all records beginning with IBM