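Passing the Cloudera CCA-500 certification exam and earning the certificate is a milestone that marks a successful entry into the IT industry. Earning this certification is the dream of many IT professionals. However, the Cloudera CCA-500 exam is said to be so difficult that many candidates do not dare to attempt it. Giving up on something as too hard without even trying suggests a habit of giving up quickly in everything. Pass4Test provides Cloudera CCA-500 dump study materials prepared for the Cloudera CCA-500 exam. If you purchase and study this dump, even the difficult Cloudera CCA-500 exam becomes easy. Visit Pass4Test, which brings you the greatest return for the smallest investment.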
NO.1 You have installed a cluster running HDFS and MapReduce version 2 (MRv2) on YARN. You have no
dfs.hosts entry(ies) in your hdfs-site.xml configuration file. You configure a new worker node by
setting fs.default.name in its configuration files to point to the NameNode on your cluster, and you
start the DataNode daemon on that worker node. What do you have to do on the cluster to allow
the worker node to join, and start storing HDFS blocks?
A. Without creating a dfs.hosts file or making any entries, run the command
hadoop dfsadmin -refreshNodes on the NameNode
B. Restart the NameNode
C. Create a dfs.hosts file on the NameNode, add the worker node's name to it, then issue the
command hadoop dfsadmin -refreshNodes on the NameNode
D. Nothing; the worker node will automatically join the cluster when the DataNode daemon is started
Answer: D
NO.2 Your Hadoop cluster is configured with HDFS and MapReduce version 2 (MRv2) on YARN. Can
you configure a worker node to run a NodeManager daemon but not a DataNode daemon and still
have a functional cluster?
A. Yes. The daemon will receive data from the NameNode to run Map tasks
B. Yes. The daemon will get data from another (non-local) DataNode to run Map tasks
C. Yes. The daemon will receive Map tasks only
D. Yes. The daemon will receive Reducer tasks only
Answer: B
NO.3 Choose three reasons why you should run the HDFS balancer periodically.
A. To ensure that there is capacity in HDFS for additional data
B. To ensure that all blocks in the cluster are 128MB in size
C. To help HDFS deliver consistent performance under heavy loads
D. To ensure that there is consistent disk utilization across the DataNodes
E. To improve data locality for MapReduce
Answer: D
Explanation:
NOTE: There is only one correct answer in the options for this question. Please check the following
reference:
http://www.quora.com/Apache-Hadoop/It-is-recommended-that-you-run-the-HDFSbalancer-periodically-Why-Choose-3
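To make "consistent disk utilization across the DataNodes" concrete, here is a minimal Python sketch of the kind of check the balancer is concerned with: comparing each node's utilization to the cluster-wide average. The function name, node names, and sample figures are hypothetical; the 10-point default is chosen to mirror the balancer's default threshold, and the real balancer's logic is more involved than this.

```python
def unbalanced_nodes(usage, threshold_pct=10.0):
    """Return nodes whose utilization deviates from the cluster average
    by more than threshold_pct percentage points.

    usage maps a node name to a (used, capacity) pair in the same unit.
    The result maps each flagged node to its deviation in points.
    """
    total_used = sum(used for used, _ in usage.values())
    total_capacity = sum(cap for _, cap in usage.values())
    cluster_pct = 100.0 * total_used / total_capacity

    flagged = {}
    for node, (used, cap) in usage.items():
        node_pct = 100.0 * used / cap
        deviation = node_pct - cluster_pct
        if abs(deviation) > threshold_pct:
            flagged[node] = round(deviation, 1)
    return flagged


# Hypothetical cluster: three equally sized DataNodes (values in GB).
sample = {"dn1": (900, 1000), "dn2": (500, 1000), "dn3": (100, 1000)}
print(unbalanced_nodes(sample))  # {'dn1': 40.0, 'dn3': -40.0}
```

In this made-up cluster the average utilization is 50%, so dn1 is over-utilized and dn3 is under-utilized, which is the situation a periodic balancer run is meant to even out.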
NO.4 For each YARN job, the Hadoop framework generates task log files. Where are Hadoop task log
files stored?
A. Cached by the NodeManager managing the job containers, then written to a log directory on the
NameNode
B. Cached in the YARN container running the task, then copied into HDFS on job completion
C. In HDFS, in the directory of the user who generates the job
D. On the local disk of the slave node running the task
Answer: D
NO.5 You are running a Hadoop cluster with a NameNode on host mynamenode, a secondary
NameNode on host mysecondarynamenode and several DataNodes.
Which best describes how you determine when the last checkpoint happened?
A. Execute hdfs namenode -report on the command line and look at the Last Checkpoint
information
B. Execute hdfs dfsadmin -saveNamespace on the command line which returns to you the last
checkpoint value in fstime file
C. Connect to the web UI of the Secondary NameNode (http://mysecondarynamenode:50090/) and look at the
"Last Checkpoint" information
D. Connect to the web UI of the NameNode (http://mynamenode:50070) and look at the "Last
Checkpoint" information
Answer: C
Reference: https://www.inkling.com/read/hadoop-definitive-guide-tom-white-3rd/chapter10/hdfs
NO.6 On a cluster running MapReduce v2 (MRv2) on YARN, a MapReduce job is given a directory of
10 plain text files as its input directory. Each file is made up of 3 HDFS blocks. How many Mappers
will run?
A. We cannot say; the number of Mappers is determined by the ResourceManager
B. We cannot say; the number of Mappers is determined by the developer
C. 30
D. 3
E. 10
F. We cannot say; the number of mappers is determined by the ApplicationMaster
Answer: C
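The arithmetic behind the mapper count can be sketched in a few lines of Python. This is a simplified estimate that assumes a splittable plain-text input format creating one input split per HDFS block, with one map task per split; the function name and the 128 MB block size are illustrative assumptions, and custom input formats or split-size settings would change the result.

```python
def estimate_map_tasks(file_sizes, block_size):
    """Estimate the number of map tasks, assuming one input split
    (and therefore one mapper) per HDFS block of each file.

    file_sizes are positive sizes in bytes; block_size is in bytes.
    """
    # Ceiling division: a partially filled last block still needs a split.
    return sum((size + block_size - 1) // block_size for size in file_sizes)


BLOCK = 128 * 1024 * 1024   # assumed block size of 128 MB
files = [3 * BLOCK] * 10    # 10 files, each exactly 3 blocks long
print(estimate_map_tasks(files, BLOCK))  # 30
```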
NO.7 You are planning a Hadoop cluster and considering implementing 10 Gigabit Ethernet as the
network fabric.
Which workloads benefit the most from faster network fabric?
A. When your workload generates a large amount of output data, significantly larger than the
amount of intermediate data
B. When your workload consumes a large amount of input data, relative to the entire capacity of
HDFS
C. When your workload consists of processor-intensive tasks
D. When your workload generates a large amount of intermediate data, on the order of the input
data itself
Answer: D
NO.8 You observed that the number of spilled records from Map tasks far exceeds the number of
map output records. Your child heap size is 1GB and your io.sort.mb value is set to 1000MB. How
would you tune your io.sort.mb value to achieve maximum memory to disk I/O ratio?
A. For a 1GB child heap size an io.sort.mb of 128 MB will always maximize memory to disk I/O
B. Increase the io.sort.mb to 1GB
C. Decrease the io.sort.mb value to 0
D. Tune the io.sort.mb value until you observe that the number of spilled records equals (or is as
close as possible to) the number of map output records.
Answer: D
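The tuning goal in NO.8 can be expressed as a ratio of two job counters, "Spilled Records" and "Map output records". The Python sketch below is only an illustration: the function names, the 5% tolerance, and the sample counter values are hypothetical, and in practice the counters would be read from the job's counter output.

```python
def spill_ratio(spilled_records, map_output_records):
    """Ratio of spilled records to map output records.

    A value near 1.0 means each record was written to disk about once;
    larger values mean records were spilled and re-merged several times.
    """
    if map_output_records <= 0:
        raise ValueError("map_output_records must be positive")
    return spilled_records / map_output_records


def needs_sort_buffer_tuning(spilled_records, map_output_records, tolerance=0.05):
    """True when spills exceed map output by more than the tolerance."""
    return spill_ratio(spilled_records, map_output_records) > 1.0 + tolerance


# Hypothetical counter values from two runs of the same job.
print(needs_sort_buffer_tuning(3_000_000, 1_000_000))  # True
print(needs_sort_buffer_tuning(1_000_000, 1_000_000))  # False
```

A run that reports True here is a candidate for another adjustment of io.sort.mb, repeated until the ratio settles close to 1.0.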