Cloud Practice

#1 (Accuracy: 100% / 1 vote)

A finance company has collected stock return data for 5,000 publicly traded companies. A financial analyst has a dataset that contains 2,000 attributes for each company. The financial analyst wants to use Amazon SageMaker to identify the top 15 attributes that are most valuable to predict future stock returns.

Which solution will meet these requirements with the LEAST operational overhead?

A. Use the linear learner algorithm in SageMaker to train a linear regression model to predict the stock returns. Identify the most predictive features by ranking absolute coefficient values.
B. Use random forest regression in SageMaker to train a model to predict the stock returns. Identify the most predictive features based on Gini importance scores.
C. Use an Amazon SageMaker Data Wrangler quick model visualization to predict the stock returns. Identify the most predictive features based on the quick model's feature importance scores.
D. Use Amazon SageMaker Autopilot to build a regression model to predict the stock returns. Identify the most predictive features based on an Amazon SageMaker Clarify report.

#2 (Accuracy: 100% / 1 vote)

A data scientist needs to develop a model to detect fraud. The data scientist has less data for fraudulent transactions than for legitimate transactions.

The data scientist needs to check for bias in the model before finalizing the model. The data scientist needs to develop the model quickly.

Which solution will meet these requirements with the LEAST operational overhead?

A. Process and reduce bias by using the synthetic minority oversampling technique (SMOTE) in Amazon EMR. Use Amazon SageMaker Studio Classic to develop the model. Use Amazon Augmented AI (Amazon A2I) to check the model for bias before finalizing the model.
B. Process and reduce bias by using the synthetic minority oversampling technique (SMOTE) in Amazon EMR. Use Amazon SageMaker Clarify to develop the model. Use Amazon Augmented AI (Amazon A2I) to check the model for bias before finalizing the model.
C. Process and reduce bias by using the synthetic minority oversampling technique (SMOTE) in Amazon SageMaker Studio. Use Amazon SageMaker JumpStart to develop the model. Use Amazon SageMaker Clarify to check the model for bias before finalizing the model.
D. Process and reduce bias by using an Amazon SageMaker Studio notebook. Use Amazon SageMaker JumpStart to develop the model. Use Amazon SageMaker Model Monitor to check the model for bias before finalizing the model.

#3 (Accuracy: 100% / 1 vote)

A business-to-business (B2B) ecommerce company wants to develop a fair and equitable risk mitigation strategy to reject potentially fraudulent transactions. The company wants to reject fraudulent transactions despite the possibility of losing some profitable transactions or customers.

Which solution will meet these requirements with the LEAST operational effort?

A. Use Amazon SageMaker to approve transactions only for products the company has sold in the past.
B. Use Amazon SageMaker to train a custom fraud detection model based on customer data.
C. Use the Amazon Fraud Detector prediction API to approve or deny any activities that Fraud Detector identifies as fraudulent.
D. Use the Amazon Fraud Detector prediction API to identify potentially fraudulent activities so the company can review the activities and reject fraudulent transactions.

#4 (Accuracy: 100% / 1 vote)

A data scientist uses Amazon SageMaker Data Wrangler to analyze and visualize data. The data scientist wants to refine a training dataset by selecting predictor variables that are strongly predictive of the target variable. The target variable correlates with other predictor variables.

The data scientist wants to understand the variance in the data along various directions in the feature space.

Which solution will meet these requirements?

A. Use the SageMaker Data Wrangler multicollinearity measurement features with a variance inflation factor (VIF) score. Use the VIF score as a measurement of how closely the variables are related to each other.
B. Use the SageMaker Data Wrangler Data Quality and Insights Report quick model visualization to estimate the expected quality of a model that is trained on the data.
C. Use the SageMaker Data Wrangler multicollinearity measurement features with the principal component analysis (PCA) algorithm to provide a feature space that includes all of the predictor variables.
D. Use the SageMaker Data Wrangler Data Quality and Insights Report feature to review features by their predictive power.
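
For reference, two terms from the options above can be illustrated outside of Data Wrangler. The sketch below is only an illustration under simplifying assumptions: it uses exactly two hypothetical predictor columns (`x1`, `x2`, made-up values), in which case the variance inflation factor reduces to 1 / (1 - r^2), and the PCA explained-variance ratios are the eigenvalues of the 2x2 covariance matrix divided by the total variance. It does not reproduce how Data Wrangler computes either quantity.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    # Sample covariance (divides by n - 1).
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def vif_two_features(xs, ys):
    # With exactly two predictors, VIF = 1 / (1 - r^2),
    # where r is the Pearson correlation between them.
    r = covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))
    return 1.0 / (1.0 - r * r)

def pca_explained_variance_ratio(xs, ys):
    # Eigenvalues of the 2x2 covariance matrix [[a, b], [b, c]],
    # expressed as fractions of the total variance (a + c).
    a, b, c = covariance(xs, xs), covariance(xs, ys), covariance(ys, ys)
    half_trace = (a + c) / 2.0
    root = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    eigenvalues = [half_trace + root, half_trace - root]
    return [ev / (a + c) for ev in eigenvalues]

# Hypothetical toy data: two strongly correlated predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]

vif = vif_two_features(x1, x2)
ratios = pca_explained_variance_ratio(x1, x2)
print(f"VIF: {vif:.1f}")
print("Explained variance ratios:", [round(r, 4) for r in ratios])
```

A high VIF indicates that one predictor is nearly a linear function of the other, while the explained-variance ratios describe how much of the total variance lies along each principal direction.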

#5 (Accuracy: 100% / 2 votes)

A machine learning (ML) engineer is preparing a dataset for a classification model. The ML engineer notices that some continuous numeric features have values that are significantly greater than the values of most other features. A business expert explains that the features are independently informative and that the dataset is representative of the target distribution.

After training, the model's inference accuracy is lower than expected.

Which preprocessing technique will result in the GREATEST increase of the model's inference accuracy?

A. Normalize the problematic features.
B. Bootstrap the problematic features.
C. Remove the problematic features.
D. Extrapolate synthetic features.

#6 (Accuracy: 100% / 1 vote)

A manufacturing company produces 100 types of steel rods. The rod types have varying material grades and dimensions. The company has sales data for the steel rods for the past 50 years.

A data scientist needs to build a machine learning (ML) model to predict future sales of the steel rods.

Which solution will meet this requirement in the MOST operationally efficient way?

A. Use the Amazon SageMaker DeepAR forecasting algorithm to build a single model for all the products.
B. Use the Amazon SageMaker DeepAR forecasting algorithm to build separate models for each product.
C. Use Amazon SageMaker Autopilot to build a single model for all the products.
D. Use Amazon SageMaker Autopilot to build separate models for each product.

#7 (Accuracy: 100% / 1 vote)

An ecommerce company has observed that customers who use the company's website rarely view the items that the website recommends. The company wants to recommend items that customers are more likely to want to purchase.

Which solution will meet this requirement in the SHORTEST amount of time?

A. Host the company's website on Amazon EC2 Accelerated Computing instances to increase the website response speed.
B. Host the company's website on Amazon EC2 GPU-based instances to increase the speed of the website's search tool.
C. Integrate Amazon Personalize into the company's website to provide customers with personalized recommendations.
D. Use Amazon SageMaker to train a Neural Collaborative Filtering (NCF) model to make product recommendations.

#8 (Accuracy: 100% / 2 votes)

A data scientist is building a new model for an ecommerce company. The model will predict how many minutes it will take to deliver a package.

During model training, the data scientist needs to evaluate model performance.

Which metrics should the data scientist use to meet this requirement? (Choose two.)

A. InferenceLatency
B. Mean squared error (MSE)
C. Root mean squared error (RMSE)
D. Precision
E. Accuracy
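
For reference, the two error measures named in options B and C can be computed as in the sketch below. The delivery times are made-up values for illustration only, and the function names are hypothetical rather than part of any SageMaker API.

```python
import math

def mse(actual, predicted):
    # Mean of the squared differences between actual and predicted values.
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    # Square root of the MSE, which is in the same unit as the target (minutes here).
    return math.sqrt(mse(actual, predicted))

# Hypothetical delivery times in minutes.
actual_minutes = [30.0, 45.0, 60.0, 25.0]
predicted_minutes = [28.0, 50.0, 55.0, 27.0]

mse_value = mse(actual_minutes, predicted_minutes)
rmse_value = rmse(actual_minutes, predicted_minutes)
print(f"MSE: {mse_value:.2f} minutes^2, RMSE: {rmse_value:.2f} minutes")
```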

#9 (Accuracy: 100% / 2 votes)

An ecommerce company discovers that the search tool for the company's website is not presenting the top search results to customers. The company needs to resolve the issue so the search tool will present results that customers are most likely to want to purchase.

Which solution will meet this requirement with the LEAST operational effort?

A. Use the Amazon SageMaker BlazingText algorithm to add context to search results through query expansion.
B. Use the Amazon SageMaker XGBoost algorithm to improve candidate ranking.
C. Use Amazon CloudSearch and sort results by the search relevance score.
D. Use Amazon CloudSearch and sort results by the geographic location.

#10 (Accuracy: 100% / 2 votes)

A machine learning (ML) engineer is creating a binary classification model. The ML engineer will use the model in a highly sensitive environment.

There is no cost associated with missing a positive label. However, the cost of making a false positive inference is extremely high.

What is the most important metric to optimize the model for in this scenario?

A. Accuracy
B. Precision
C. Recall
D. F1
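
For reference, all four metrics in the options above can be derived from the counts of a binary confusion matrix. The sketch below uses made-up counts and a hypothetical helper name; it only shows the standard definitions.

```python
def classification_metrics(tp, fp, fn, tn):
    # tp/fp/fn/tn: true positives, false positives, false negatives, true negatives.
    precision = tp / (tp + fp)   # share of positive predictions that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical confusion-matrix counts.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Only precision has false positives as the sole error term in its denominator, and only recall has false negatives there; accuracy and F1 are affected by both kinds of error.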
