The IBM Big Data Engineer C2090-101 exam will be retired on October 31, 2021. If you plan to earn the IBM Certified Data Engineer - Big Data certification, take and pass the C2090-101 exam before the retirement date. We provide the latest IBM certification C2090-101 real exam questions to help you prepare for the test. Big Data Engineers focus on collecting, parsing, managing, and analyzing large data sets in order to provide data scientists with the right data sets and visual tools for analysis.
IBM Certification C2090-101 Exam Topics
IBM certification C2090-101 exam topics cover the following sections.
Data Loading 34%
Load unstructured data into InfoSphere BigInsights
Import streaming data into Hadoop using InfoSphere Streams
Create a BigSheets workbook
Import data into Hadoop and create Big SQL table definitions
Import data to HBase
Import data to Hive
Use Data Click to load from relational sources into InfoSphere BigInsights with a self-service process
Extract data from a relational source using Sqoop
Load log data into Hadoop using Flume
Insert data via IBM General Parallel File System (GPFS) Posix file system API
Load data with Hadoop command line utility
Data Security 8%
Keep data secure within PCI standards
Use masking (e.g. Optim, Big SQL) and redaction to protect sensitive data
Architecture and Integration 17%
Implement MapReduce
Evaluate use cases for selecting Hive, Big SQL, or HBase
Create and/or query a Solr index
Evaluate use cases for selecting potential file formats (e.g. JSON, CSV, Parquet, Sequence, etc.)
Utilize Apache Hue for search visualization
Performance and Scalability 15%
Use Resilient Distributed Dataset (RDD) to improve MapReduce performance
Choose file formats to optimize performance of Big SQL, Jaql, etc.
Make specific performance tuning decisions for Hive and HBase
Analyze performance considerations when using Apache Spark
Data Preparation, Transformation, and Export 26%
Use Jaql query methods to transform data in InfoSphere BigInsights
Capture and prep social data for analytics
Integrate SPSS model scoring in InfoSphere Streams
Implement entity resolution within a Big Data platform (e.g. Big Match)
Utilize Pig for data transformation and data manipulation
Use Big SQL to transform data in InfoSphere BigInsights
Export processing results out of Hadoop (e.g. DataClick, DataStage, etc.)
Utilize consistent regions in InfoSphere Streams to ensure at-least-once processing
Share IBM C2090-101 Real Exam Questions
IBM C2090-101 real exam questions can help you test your knowledge of all the IBM Big Data Engineer C2090-101 topics above. Some IBM Certification C2090-101 real exam questions and answers are shared below.
1. Which of the following techniques is NOT employed by Big SQL to improve performance?
A. Query Optimization
B. Predicate Pushdown
C. Compression efficiency
D. Load data into DB2 and return the data
Answer: D
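Predicate pushdown, one of the techniques listed in question 1, can be illustrated with a small sketch. This is not Big SQL code: `scan`, `is_eu`, and the sample rows are hypothetical names invented for illustration, and the in-memory list only stands in for a storage layer. The point it tries to show is that applying the filter inside the scan reduces the number of rows handed to the query engine while returning the same result.

```python
# Hypothetical in-memory "table"; a real engine would read files or a database.
ROWS = [{"id": i, "region": "EU" if i % 4 == 0 else "US"} for i in range(1000)]


def scan(rows, predicate=None):
    """Stand-in for a storage-layer scan.

    When a predicate is supplied (pushed down), only matching rows are returned.
    """
    return [row for row in rows if predicate is None or predicate(row)]


def is_eu(row):
    """Example filter condition, equivalent to WHERE region = 'EU'."""
    return row["region"] == "EU"


# Without pushdown: every row reaches the engine, which filters afterwards.
transferred_all = scan(ROWS)
result_late = [row for row in transferred_all if is_eu(row)]

# With pushdown: the filter runs inside the scan, so fewer rows are transferred.
transferred_filtered = scan(ROWS, predicate=is_eu)

print(len(transferred_all), len(transferred_filtered))
```

Both paths produce the same rows; the difference is only in how much data crosses the boundary between storage and engine.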
2. When embedding SPSS models within InfoSphere Streams, what SPSS product must be installed on the same machine as InfoSphere Streams?
A. SPSS Modeler
B. SPSS Solution Publisher
C. SPSS Accelerator for InfoSphere Streams
D. None, the SPSS software runs remotely to the Streams machine
Answer: B
3. Which of the following statements regarding Sqoop are TRUE? (Choose two.)
A. All columns in a table must be imported
B. Sqoop bypasses MapReduce for enhanced performance
C. Each row from a source table is represented as a separate record in HDFS
D. When using a password file, the file containing the password must reside in HDFS
E. Multiple options files can be specified when invoking Sqoop from the command line
Answer: CE
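Option C in question 3 (each source row becomes a separate record) can be sketched with the Python standard library. This is a simulation, not Sqoop: it uses an in-memory SQLite table in place of a relational source and a list of strings in place of files in HDFS, and `export_rows_as_records` and the `employees` table are hypothetical names. It assumes comma-delimited text output, and it also selects only a subset of columns, which is why option A does not hold.

```python
import sqlite3


def export_rows_as_records(conn, table, columns, delimiter=","):
    """Return one delimited text record per source row.

    Hypothetical helper that mimics the shape of a delimited text import;
    it does not call Sqoop or write to HDFS.
    """
    column_list = ", ".join(columns)
    cursor = conn.execute(f"SELECT {column_list} FROM {table}")
    return [delimiter.join(str(value) for value in row) for row in cursor]


# Build a tiny relational source in memory.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ana", "ops"), (2, "Raj", "dev")],
)

# Import only two of the three columns; each row becomes one record.
records = export_rows_as_records(conn, "employees", ["id", "name"])
for record in records:
    print(record)
```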
4. Use of Bulk Load in HBase for loading a large volume of data will result in which of the following?
A. It will use less CPU but will use more network resource
B. It will use less network resource but more CPU
C. It will behave the same way as using the HBase API for loading a large volume of data
D. None of the above
Answer: D
5. Which of the following are capabilities of the Apache Spark project?
A. Large scale machine learning
B. Large scale graph processing
C. Live data stream processing
D. All of the above
Answer: D