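Pass4test is a site that brings hope to anyone working toward an IT certification. If you want to pass an IT certification exam and earn an internationally recognized credential, you can prepare with Pass4test's Cloudera CCA-500 (Cloudera Certified Administrator for Apache Hadoop (CCAH)) dump and pass the exam on the first attempt. The Cloudera CCA-500 (Cloudera Certified Administrator for Apache Hadoop (CCAH)) exam is a popular exam. If you study the Cloudera CCA-500 (Cloudera Certified Administrator for Apache Hadoop (CCAH)) dump and still fail the exam, the cost of the dump is refunded without question.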
NO.1 You observed that the number of spilled records from Map tasks far exceeds the number of
map output records. Your child heap size is 1GB and your io.sort.mb value is set to 1000MB. How
would you tune your io.sort.mb value to achieve maximum memory to disk I/O ratio?
A. For a 1GB child heap size an io.sort.mb of 128 MB will always maximize memory to disk I/O
B. Increase the io.sort.mb to 1GB
C. Decrease the io.sort.mb value to 0
D. Tune the io.sort.mb value until you observe that the number of spilled records equals (or is as
close as possible to) the number of map output records.
Answer: D
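The comparison described in NO.1 can be sketched as a simple ratio of two job counters. This is a minimal illustration only: the function name and the counter values are hypothetical, and in practice the numbers would be read from the job's "Spilled Records" and "Map output records" counters.

```python
def spill_ratio(spilled_records: int, map_output_records: int) -> float:
    """Return spilled records per map output record.

    A value of 1.0 means each map output record was written to disk once;
    larger values mean records were spilled and re-merged several times.
    """
    if map_output_records <= 0:
        raise ValueError("map_output_records must be positive")
    return spilled_records / map_output_records


# Hypothetical counter values, for illustration only.
print(spill_ratio(3_000_000, 1_000_000))  # spills far exceed map output
print(spill_ratio(1_000_000, 1_000_000))  # the target described in option D
```

A ratio well above 1.0 suggests the sort buffer is mis-sized for the task's heap, which is the situation the question describes.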
NO.2 On a cluster running MapReduce v2 (MRv2) on YARN, a MapReduce job is given a directory of
10 plain text files as its input directory. Each file is made up of 3 HDFS blocks. How many Mappers
will run?
A. We cannot say; the number of Mappers is determined by the ResourceManager
B. We cannot say; the number of Mappers is determined by the developer
C. 30
D. 3
E. 10
F. We cannot say; the number of mappers is determined by the ApplicationMaster
Answer: C
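The arithmetic behind NO.2 can be sketched as follows, assuming the default behavior for plain text input, where each input split corresponds to one HDFS block and each split is processed by one Mapper. The 128 MB block size and the helper name are assumptions for illustration; the sketch also ignores details such as the small slack Hadoop allows on the final split of a file.

```python
import math


def count_input_splits(file_sizes_bytes, split_size_bytes):
    """Estimate the number of input splits (and so Mappers) for a set of files.

    Simplification: one split per split_size_bytes chunk of each file,
    with at least one split per file.
    """
    return sum(
        max(1, math.ceil(size / split_size_bytes)) for size in file_sizes_bytes
    )


BLOCK = 128 * 1024 * 1024      # assumed HDFS block size
files = [3 * BLOCK] * 10       # 10 files, each made up of 3 blocks
print(count_input_splits(files, BLOCK))  # 30
```

If the job overrides the minimum or maximum split size, the count changes accordingly, so the figure holds only under the default settings assumed here.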
NO.3 Choose three reasons why you should run the HDFS balancer periodically.
A. To ensure that there is capacity in HDFS for additional data
B. To ensure that all blocks in the cluster are 128MB in size
C. To help HDFS deliver consistent performance under heavy loads
D. To ensure that there is consistent disk utilization across the DataNodes
E. To improve data locality for MapReduce
Answer: C, D, E
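The idea of "consistent disk utilization across the DataNodes" in NO.3 can be illustrated with a small sketch. The balancer treats a DataNode as balanced when its utilization is within a threshold (10 percent by default) of the overall cluster utilization. The node names and usage figures below are hypothetical, and the function is an illustration of that check, not the balancer's actual implementation.

```python
def nodes_outside_threshold(used_by_node, capacity_by_node, threshold_pct=10.0):
    """Return the nodes whose utilization differs from the cluster average
    by more than threshold_pct percentage points."""
    cluster_util = 100.0 * sum(used_by_node.values()) / sum(capacity_by_node.values())
    unbalanced = []
    for node, used in used_by_node.items():
        node_util = 100.0 * used / capacity_by_node[node]
        if abs(node_util - cluster_util) > threshold_pct:
            unbalanced.append(node)
    return unbalanced


# Hypothetical usage and capacity, in arbitrary units.
used = {"dn1": 90, "dn2": 50, "dn3": 40}
capacity = {"dn1": 100, "dn2": 100, "dn3": 100}
print(nodes_outside_threshold(used, capacity))  # ['dn1', 'dn3']
```

Here the cluster average is 60 percent, so dn1 (90 percent) and dn3 (40 percent) would be candidates for block movement, while dn2 sits exactly at the threshold.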
NO.4 Your Hadoop cluster is configured with HDFS and MapReduce version 2 (MRv2) on YARN. Can
you configure a worker node to run a NodeManager daemon but not a DataNode daemon and still
have a functional cluster?
A. Yes. The daemon will receive data from the NameNode to run Map tasks
B. Yes. The daemon will get data from another (non-local) DataNode to run Map tasks
C. Yes. The daemon will receive Map tasks only
D. Yes. The daemon will receive Reducer tasks only
Answer: B
NO.5 You have installed a cluster running HDFS and MapReduce version 2 (MRv2) on YARN. You have no
dfs.hosts entry(ies) in your hdfs-site.xml configuration file. You configure a new worker node by
setting fs.default.name in its configuration files to point to the NameNode on your cluster, and you
start the DataNode daemon on that worker node. What do you have to do on the cluster to allow
the worker node to join, and start storing HDFS blocks?
A. Without creating a dfs.hosts file or making any entries, run the command
hadoop dfsadmin -refreshNodes on the NameNode
B. Restart the NameNode
C. Create a dfs.hosts file on the NameNode, add the worker node's name to it, then issue the
command hadoop dfsadmin -refreshNodes on the NameNode
D. Nothing; the worker node will automatically join the cluster when the DataNode daemon is started
Answer: D
NO.6 For each YARN job, the Hadoop framework generates a task log file. Where are Hadoop task log
files stored?
A. Cached by the NodeManager managing the job containers, then written to a log directory on the
NameNode
B. Cached in the YARN container running the task, then copied into HDFS on job completion
C. In HDFS, in the directory of the user who generates the job
D. On the local disk of the slave node running the task
Answer: D
NO.7 You are planning a Hadoop cluster and considering implementing 10 Gigabit Ethernet as the
network fabric.
Which workloads benefit the most from faster network fabric?
A. When your workload generates a large amount of output data, significantly larger than the
amount of intermediate data
B. When your workload consumes a large amount of input data, relative to the entire capacity of
HDFS
C. When your workload consists of processor-intensive tasks
D. When your workload generates a large amount of intermediate data, on the order of the input
data itself
Answer: D
NO.8 You are running a Hadoop cluster with a NameNode on host mynamenode, a secondary
NameNode on host mysecondarynamenode and several DataNodes.
Which best describes how you determine when the last checkpoint happened?
A. Execute hdfs namenode -report on the command line and look at the Last Checkpoint
information
B. Execute hdfs dfsadmin -saveNamespace on the command line which returns to you the last
checkpoint value in fstime file
C. Connect to the web UI of the Secondary NameNode (http://mysecondary:50090/) and look at the
"Last Checkpoint" information
D. Connect to the web UI of the NameNode (http://mynamenode:50070) and look at the "Last
Checkpoint" information
Answer: C