Amazon AWS Certified Machine Learning - Specialty
#91 (Accuracy: 92% / 7 votes)
A Machine Learning Specialist works for a credit card processing company and needs to predict which transactions may be fraudulent in near-real time.
Specifically, the Specialist must train a model that returns the probability that a given transaction may be fraudulent.

How should the Specialist frame this business problem?
  • A. Streaming classification
  • B. Binary classification
  • C. Multi-category classification
  • D. Regression classification
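For context on the framing in #91: a fraud model that returns a per-transaction probability is a binary classifier with probabilistic output. The toy sketch below (hypothetical features and labels, scikit-learn assumed available) only illustrates that output shape.

```python
# Toy illustration of a binary fraud classifier that returns a probability
# per transaction. Features and labels here are made up for demonstration.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[120.0, 0.1],    # [transaction amount, merchant risk score] - assumed features
              [5000.0, 0.9],
              [42.0, 0.2],
              [8700.0, 0.8]])
y = np.array([0, 1, 0, 1])     # 0 = legitimate, 1 = fraudulent

model = LogisticRegression().fit(X, y)

# predict_proba returns [P(legitimate), P(fraudulent)] per row; the second
# column is the fraud probability the question describes.
fraud_probability = model.predict_proba(X)[:, 1]
print(fraud_probability)
```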
#92 (Accuracy: 100% / 4 votes)
A retail company stores 100 GB of daily transactional data in Amazon S3 at periodic intervals. The company wants to identify the schema of the transactional data. The company also wants to perform transformations on the transactional data that is in Amazon S3.

The company wants to use a machine learning (ML) approach to detect fraud in the transformed data.

Which combination of solutions will meet these requirements with the LEAST operational overhead? (Choose three.)
  • A. Use Amazon Athena to scan the data and identify the schema.
  • B. Use AWS Glue crawlers to scan the data and identify the schema.
  • C. Use Amazon Redshift stored procedures to perform data transformations.
  • D. Use AWS Glue workflows and AWS Glue jobs to perform data transformations.
  • E. Use Amazon Redshift ML to train a model to detect fraud.
  • F. Use Amazon Fraud Detector to train a model to detect fraud.
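As a rough sketch of the schema-discovery piece (option B), the snippet below shows how a Glue crawler could be pointed at the S3 data with boto3. The crawler name, IAM role ARN, database name, and S3 path are all placeholders.

```python
# Minimal sketch: create and start an AWS Glue crawler to infer the schema of
# transactional data in S3. All names, ARNs, and paths below are placeholders.
import boto3

glue = boto3.client("glue")

glue.create_crawler(
    Name="daily-transactions-crawler",
    Role="arn:aws:iam::123456789012:role/GlueCrawlerRole",
    DatabaseName="transactions_db",
    Targets={"S3Targets": [{"Path": "s3://example-bucket/daily-transactions/"}]},
)
glue.start_crawler(Name="daily-transactions-crawler")

# Once the crawl finishes, the inferred tables land in the Glue Data Catalog
# and can feed Glue jobs/workflows for the downstream transformations.
```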
#93 (Accuracy: 100% / 4 votes)
A machine learning specialist is developing a proof of concept for government users whose primary concern is security. The specialist is using Amazon SageMaker to train a convolutional neural network (CNN) model for a photo classifier application.
The specialist wants to protect the data so that it cannot be accessed and transferred to a remote host by malicious code accidentally installed on the training container.
Which action will provide the MOST secure protection?
  • A. Remove Amazon S3 access permissions from the SageMaker execution role.
  • B. Encrypt the weights of the CNN model.
  • C. Encrypt the training and validation dataset.
  • D. Enable network isolation for training jobs.
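For reference on the network-isolation control named in option D, the sketch below shows the SageMaker Python SDK flag that disables outbound network access from the training container; the image URI, role ARN, and S3 paths are placeholders.

```python
# Minimal sketch: launch a SageMaker training job with network isolation so
# the training container cannot make outbound network calls. The image URI,
# role ARN, and S3 paths are placeholders.
from sagemaker.estimator import Estimator

estimator = Estimator(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/cnn-training:latest",
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    output_path="s3://example-bucket/model-artifacts/",
    enable_network_isolation=True,  # container has no network access during training
)

# SageMaker itself still stages the S3 input channels and uploads the model
# artifacts, so training proceeds even though the container is isolated.
estimator.fit({"training": "s3://example-bucket/training-data/"})
```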
#94 (Accuracy: 100% / 3 votes)
A Machine Learning Specialist is working for an online retailer that wants to run analytics on every customer visit, processed through a machine learning pipeline.
The data needs to be ingested by Amazon Kinesis Data Streams at up to 100 transactions per second, and the JSON data blob is 100 KB in size.

What is the MINIMUM number of shards in Kinesis Data Streams the Specialist should use to successfully ingest this data?
  • A. 1 shard
  • B. 10 shards
  • C. 100 shards
  • D. 1,000 shards
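The arithmetic behind #94, assuming the standard quota of 1 MB/s (or 1,000 records/s) of writes per shard:

```python
# Worked calculation for the minimum number of shards.
import math

records_per_second = 100
record_size_kb = 100
shard_write_limit_mb_per_second = 1  # per-shard ingest quota

ingest_mb_per_second = records_per_second * record_size_kb / 1024  # ~9.8 MB/s
min_shards = math.ceil(ingest_mb_per_second / shard_write_limit_mb_per_second)
print(min_shards)  # 10
```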
#95 (Accuracy: 93% / 11 votes)
A Machine Learning Specialist uploads a dataset to an Amazon S3 bucket protected with server-side encryption using AWS KMS.
How should the ML Specialist define the Amazon SageMaker notebook instance so it can read the same dataset from Amazon S3?
  • A. Define security group(s) to allow all HTTP inbound/outbound traffic and assign those security group(s) to the Amazon SageMaker notebook instance.
  • B. Configure the Amazon SageMaker notebook instance to have access to the VPC. Grant permission in the KMS key policy to the notebook's KMS role.
  • C. Assign an IAM role to the Amazon SageMaker notebook with S3 read access to the dataset. Grant permission in the KMS key policy to that role.
  • D. Assign the same KMS key used to encrypt data in Amazon S3 to the Amazon SageMaker notebook instance.
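To make the IAM-role approach in option C concrete, the sketch below shows the kind of KMS key policy statement that would let the notebook's execution role decrypt the dataset. The account ID and role name are placeholders, and the role would separately need S3 read access to the bucket.

```python
# Minimal sketch of a KMS key policy statement (expressed as a Python dict)
# granting the SageMaker notebook's IAM role use of the key. Account ID and
# role name are placeholders.
key_policy_statement = {
    "Sid": "AllowNotebookRoleToUseKey",
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::123456789012:role/SageMakerNotebookRole"},
    "Action": [
        "kms:Decrypt",          # read SSE-KMS-encrypted objects
        "kms:DescribeKey",
        "kms:GenerateDataKey",  # only needed if the notebook also writes encrypted objects
    ],
    "Resource": "*",
}
```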
#96 (Accuracy: 100% / 4 votes)
A Machine Learning Specialist deployed a model that provides product recommendations on a company's website. Initially, the model was performing very well and resulted in customers buying more products on average. However, within the past few months, the Specialist has noticed that the effect of product recommendations has diminished and customers are starting to return to their original habits of spending less. The Specialist is unsure of what happened, as the model has not changed from its initial deployment over a year ago.
Which method should the Specialist try to improve model performance?
  • A. The model needs to be completely re-engineered because it is unable to handle product inventory changes.
  • B. The model's hyperparameters should be periodically updated to prevent drift.
  • C. The model should be periodically retrained from scratch using the original data while adding a regularization term to handle product inventory changes
  • D. The model should be periodically retrained using the original training data plus new data as product inventory changes.
#97 (Accuracy: 100% / 4 votes)
A Data Science team is designing a dataset repository where it will store a large amount of training data commonly used in its machine learning models. As Data Scientists may create an arbitrary number of new datasets every day, the solution has to scale automatically and be cost-effective.
Also, it must be possible to explore the data using SQL.
Which storage scheme is MOST adapted to this scenario?
  • A. Store datasets as files in Amazon S3.
  • B. Store datasets as files in an Amazon EBS volume attached to an Amazon EC2 instance.
  • C. Store datasets as tables in a multi-node Amazon Redshift cluster.
  • D. Store datasets as global tables in Amazon DynamoDB.
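For the SQL-exploration requirement, datasets stored as files in S3 can be queried in place with Amazon Athena; the sketch below (placeholder database, table, and output location) shows the boto3 call.

```python
# Minimal sketch: run a SQL query over S3-resident dataset files via Athena.
# Database, table, and output location are placeholders.
import boto3

athena = boto3.client("athena")

athena.start_query_execution(
    QueryString="SELECT label, COUNT(*) AS n FROM samples GROUP BY label",
    QueryExecutionContext={"Database": "training_datasets"},
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
)
```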
#98 (Accuracy: 100% / 3 votes)
A data science team is working with a tabular dataset that the team stores in Amazon S3. The team wants to experiment with different feature transformations such as categorical feature encoding. Then the team wants to visualize the resulting distribution of the dataset. After the team finds an appropriate set of feature transformations, the team wants to automate the workflow for feature transformations.

Which solution will meet these requirements with the MOST operational efficiency?
  • A. Use Amazon SageMaker Data Wrangler preconfigured transformations to explore feature transformations. Use SageMaker Data Wrangler templates for visualization. Export the feature processing workflow to a SageMaker pipeline for automation.
  • B. Use an Amazon SageMaker notebook instance to experiment with different feature transformations. Save the transformations to Amazon S3. Use Amazon QuickSight for visualization. Package the feature processing steps into an AWS Lambda function for automation.
  • C. Use AWS Glue Studio with custom code to experiment with different feature transformations. Save the transformations to Amazon S3. Use Amazon QuickSight for visualization. Package the feature processing steps into an AWS Lambda function for automation.
  • D. Use Amazon SageMaker Data Wrangler preconfigured transformations to experiment with different feature transformations. Save the transformations to Amazon S3. Use Amazon QuickSight for visualization. Package each feature transformation step into a separate AWS Lambda function. Use AWS Step Functions for workflow automation.
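As a small, tool-agnostic illustration of the categorical feature encoding the team wants to experiment with (toy data, pandas assumed available):

```python
# Toy example of one categorical feature transformation (one-hot encoding)
# followed by a quick look at the resulting distribution.
import pandas as pd

df = pd.DataFrame({
    "product_category": ["books", "toys", "books", "electronics"],
    "price": [12.5, 8.0, 20.0, 150.0],
})

encoded = pd.get_dummies(df, columns=["product_category"])
print(encoded.describe())
```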
#99 (Accuracy: 100% / 5 votes)
A social media company wants to develop a machine learning (ML) model to detect inappropriate or offensive content in images. The company has collected a large dataset of labeled images and plans to use the built-in Amazon SageMaker image classification algorithm to train the model. The company also intends to use SageMaker pipe mode to speed up the training.

The company splits the dataset into training, validation, and testing datasets.
The company stores the training and validation images in folders that are named Training and Validation, respectively. The folders contain subfolders that correspond to the names of the dataset classes. The company resizes the images to the same size and generates two input manifest files named training.lst and validation.lst, for the training dataset and the validation dataset, respectively. Finally, the company creates two separate Amazon S3 buckets for uploads of the training dataset and the validation dataset.

Which additional data preparation steps should the company take before uploading the files to Amazon S3?
  • A. Generate two Apache Parquet files, training.parquet and validation.parquet, by reading the images into a Pandas data frame and storing the data frame as a Parquet file. Upload the Parquet files to the training S3 bucket.
  • B. Compress the training and validation directories by using the Snappy compression library. Upload the manifest and compressed files to the training S3 bucket.
  • C. Compress the training and validation directories by using the gzip compression library. Upload the manifest and compressed files to the training S3 bucket.
  • D. Generate two RecordIO files, training.rec and validation.rec, from the manifest files by using the im2rec Apache MXNet utility tool. Upload the RecordIO files to the training S3 bucket.
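Option D refers to MXNet's im2rec utility; the sketch below shows one plausible invocation from Python, assuming im2rec.py is available locally and that training.lst and validation.lst sit alongside their image directories. Exact flags may differ by MXNet version.

```python
# Rough sketch: generate training.rec and validation.rec from the .lst manifest
# files with MXNet's im2rec.py. Paths and flags are assumptions and may need
# adjusting for the installed MXNet version.
import subprocess

for prefix, image_root in [("training", "Training/"), ("validation", "Validation/")]:
    # im2rec reads <prefix>.lst and writes <prefix>.rec (plus an index file).
    subprocess.run(
        ["python", "im2rec.py", "--num-thread", "8", prefix, image_root],
        check=True,
    )
```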
#100 (Accuracy: 100% / 4 votes)
A company wants to create an artificial intelligence (AI) yoga instructor that can lead large classes of students. The company needs to create a feature that can accurately count the number of students who are in a class. The company also needs a feature that can differentiate students who are performing a yoga stretch correctly from students who are performing a stretch incorrectly.

To determine whether students are performing a stretch correctly, the solution needs to measure the location and angle of each student’s arms and legs.
A data scientist must use Amazon SageMaker to access video footage of a yoga class by extracting image frames and applying computer vision models.

Which combination of models will meet these requirements with the LEAST effort? (Choose two.)
  • A. Image Classification
  • B. Optical Character Recognition (OCR)
  • C. Object Detection
  • D. Pose estimation
  • E. Image Generative Adversarial Networks (GANs)