Individuals who achieve Cloudera Certified Specialist in Apache HBase (CCSHB) have demonstrated their technical knowledge, skill, and ability working with Apache HBase.
Exam Code: CCB-400
Number of Questions: 45 questions
Time Limit: 90 minutes
Passing Score: 69%
Language: English, Japanese
Price: USD $295
Exam Sections
CCB-400 is designed to test a candidate’s fluency with the concepts and skills in the following areas:
Core HBase Concepts
Recognize the fundamental characteristics of Apache HBase and its role in a big data ecosystem. Identify differences between Apache HBase and a traditional RDBMS. Describe the relationship between Apache HBase and HDFS. Given a scenario, identify application characteristics that make the scenario an appropriate application for Apache HBase.
Data Model
Describe how an Apache HBase table is physically stored on disk. Identify the differences between a Column Family and a Column Qualifier. Given a data loading scenario, identify how Apache HBase will version the rows. Describe how Apache HBase cells store data. Detail what happens to data when it is deleted.
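The versioning and delete behavior named in these objectives can be illustrated with a toy model. The sketch below is plain Java, not HBase client code: the class `VersionedCell` and its methods are hypothetical names invented for illustration. It assumes a single cell coordinate (one row, one column family and qualifier), keeps versions sorted newest first, and treats a delete as a marker that masks versions at or before its timestamp until a simulated major compaction physically drops them.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model of a single cell coordinate; hypothetical, not part of HBase.
final class VersionedCell {
    // timestamp -> value, iterated newest timestamp first
    private final TreeMap<Long, String> versions = new TreeMap<>(Collections.reverseOrder());
    // Timestamp of the latest delete marker; versions at or before it are masked.
    private long tombstone = Long.MIN_VALUE;

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // A delete records a marker; it does not remove stored versions right away.
    void delete(long timestamp) {
        tombstone = Math.max(tombstone, timestamp);
    }

    // Returns the newest version that is not masked by the marker, or null.
    String get() {
        for (Map.Entry<Long, String> e : versions.entrySet()) {
            if (e.getKey() > tombstone) {
                return e.getValue();
            }
        }
        return null;
    }

    // Simulated major compaction: masked versions and the marker itself are dropped.
    void majorCompact() {
        final long marker = tombstone;
        versions.keySet().removeIf(ts -> ts <= marker);
        tombstone = Long.MIN_VALUE;
    }

    int storedVersions() {
        return versions.size();
    }
}
```

In this model a read after `delete` returns nothing even though `storedVersions()` is unchanged, which mirrors the idea that deleted data stays on disk until a major compaction rewrites the files.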
Architecture
Identify the major components of an Apache HBase cluster. Recognize how regions work and their benefits under various scenarios. Describe how a client finds a row in an HBase table. Understand the function and purpose of minor and major compactions. Given a region server crash scenario, describe how Apache HBase fails over to another region server. Describe RegionServer splits.
Schema Design
Describe the factors to be considered when creating Column Families. Given an access pattern, define the row keys for optimal read performance. Given an access pattern, define the row keys for locality.
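One common row key pattern for "most recent first" access is a reversed timestamp appended to an entity prefix. The following is a minimal sketch in plain Java with no HBase dependency; `ClickRowKeys` and `rowKey` are hypothetical names. It assumes a fixed-length source id and non-negative timestamps, and it relies on the fact that HBase orders row keys as unsigned byte strings.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the HBase API.
final class ClickRowKeys {
    // Builds <source_id><Long.MAX_VALUE - timestamp> as a byte array.
    // Assumes sourceId has a fixed length so the 8-byte suffix always starts
    // at the same offset, and that timestampMillis is non-negative.
    static byte[] rowKey(String sourceId, long timestampMillis) {
        byte[] id = sourceId.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(id.length + Long.BYTES)
                .put(id)
                .putLong(Long.MAX_VALUE - timestampMillis) // big-endian by default
                .array();
    }

    public static void main(String[] args) {
        byte[] older = rowKey("src01", 1_000L);
        byte[] newer = rowKey("src01", 2_000L);
        // Unsigned lexicographic comparison, the same ordering a scan follows.
        System.out.println(Arrays.compareUnsigned(newer, older) < 0);
    }
}
```

Because the suffix shrinks as the timestamp grows, a forward scan over one source id prefix meets newer clicks before older ones, and all rows for the same source id stay adjacent.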
API
Describe the functions and purpose of the HBaseAdmin class. Given a table and rowkey, use the get() operation to return specific versions of that row. Describe the behavior of the checkAndPut() method.
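The contract of checkAndPut() can be sketched without a cluster. The class below is a toy in-memory stand-in, not the HBase client: `ToyTable` and its string-based signature are assumptions made for illustration. It models the documented behavior that the check and the write happen as one atomic step, that the put is applied only when the current cell value equals the expected value, and that the return value reports whether the put was applied. A null expected value is treated as "the cell must not exist".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Toy stand-in for a table; hypothetical and greatly simplified.
final class ToyTable {
    // "row/family:qualifier" -> value
    private final Map<String, String> cells = new HashMap<>();

    private static String key(String row, String column) {
        return row + "/" + column;
    }

    synchronized void put(String row, String column, String value) {
        cells.put(key(row, column), value);
    }

    synchronized String get(String row, String column) {
        return cells.get(key(row, column));
    }

    // Compares the checked cell with `expected`; only on a match is the new
    // value written. synchronized makes check and write one atomic step here.
    // Returns true if the put was applied, false otherwise.
    synchronized boolean checkAndPut(String row, String checkColumn, String expected,
                                     String putColumn, String putValue) {
        String current = cells.get(key(row, checkColumn));
        if (!Objects.equals(current, expected)) {
            return false; // check failed, nothing is written
        }
        cells.put(key(row, putColumn), putValue);
        return true;
    }
}
```

A typical use is optimistic concurrency: read a value, compute an update, and apply it only if the value has not changed in the meantime, retrying when the method returns false.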
Administration
Recognize how to create, describe, and access data in tables from the shell. Describe how to bulk load data into Apache HBase. Recognize the benefits of managed region splits.
Sample Questions
Question 1
You want to store clickstream data in HBase. Your data consists of the following: the source id, the name of the cluster, the URL of the click, and the timestamp for each click.
Which rowkey would you use if you wanted to retrieve the source ids with a scan, sorted with the most recent first?
A. <(Long)timestamp>
B. <source_id><Long.MAX_VALUE - (Long)timestamp>
C. <timestamp><Long.MAX_VALUE>
D. <Long.MAX_VALUE><timestamp>
Question 2
Your application needs to retrieve 200 to 300 non-sequential rows from a table with one billion rows. You know the rowkey of each of the rows you need to retrieve. Which does your application need to implement?
A. Scan without range
B. Scan with start and stop row
C. HTable.get(Get get)
D. HTable.get(List<Get> gets)
Question 3
You perform a check and put operation from within an HBase application using the following:
table.checkAndPut(Bytes.toBytes("rowkey"),
    Bytes.toBytes("colfam"),
    Bytes.toBytes("qualifier"),
    Bytes.toBytes("barvalue"), newrow);
Which describes this check and put operation?
A. Check if rowkey/colfam/qualifier exists and the cell value "barvalue" is equal to newrow. Then return “true”.
B. Check if rowkey/colfam/qualifier exists and the cell value "barvalue" is NOT equal to newrow. Then return “true”.
C. Check if rowkey/colfam/qualifier exists and has the cell value "barvalue". If so, put the values in newrow and return “false”.
D. Check if rowkey/colfam/qualifier exists and has the cell value "barvalue". If so, put the values in newrow and return “true”.
Question 4
What is the advantage of using the bulk load API over doing individual Puts for bulk insert operations?
A. Writes bypass the HLog/MemStore reducing load on the RegionServer.
B. Users doing bulk Writes may disable writing to the WAL which results in possible data loss.
C. HFiles created by the bulk load API are guaranteed to be co-located with the RegionServer hosting the region.
D. HFiles written out via the bulk load API are more space efficient than those written out of RegionServers.
Question 5
You have a “WebLog” table in HBase. The Row Keys are the IP Addresses. You want to retrieve all entries that have an IP Address of 75.67.12.146. The shell command you would use is:
A. get 'WebLog', '75.67.12.146'
B. scan 'WebLog', '75.67.12.146'
C. get 'WebLog', {FILTER => '75.67.12.146'}
D. scan 'WebLog', {COLFAM => 'IP', FILTER => '75.67.12.146'}
Answers
Question 1: B
Question 2: D
Question 3: D
Question 4: A
Question 5: A