MCQ Questions on Machine Learning

What is Machine Learning (ML)?

The autonomous acquisition of knowledge through the use of manual programs
The selective acquisition of knowledge through the use of computer programs
The selective acquisition of knowledge through the use of manual programs
The autonomous acquisition of knowledge through the use of computer programs

Correct option is D

Father of Machine Learning (ML)

Geoffrey Chaucer
Geoffrey Hill
Geoffrey Everest Hinton
None of the above

Correct option is C

Which is FALSE regarding regression?

It may be used for interpretation
It is used for prediction
It discovers causal relationships
It relates inputs to outputs

Correct option is C

Choose the correct option regarding machine learning (ML) and artificial intelligence (AI)

ML is a set of techniques that turns a dataset into a software
AI is a software that can emulate the human mind
ML is an alternate way of programming intelligent machines
All of the above

Correct option is D

Which of the following is not a factor that affects the performance of a learner system?

Good data structures
Representation scheme used
Training scenario
Type of feedback

Correct option is A

In general, to have a well-defined learning problem, we must identify which of the following

The class of tasks
The measure of performance to be improved
The source of experience
All of the above

Correct option is D

Successful applications of ML

Learning to recognize spoken words
Learning to drive an autonomous vehicle
Learning to classify new astronomical structures
Learning to play world-class backgammon
All of the above

Correct option is E

Which of the following is not a learning method?

Analogy
Introduction
Memorization
Deduction

Correct option is B

In language understanding, which of the following is not a level of knowledge?

Empirical
Logical
Phonological
Syntactic

Correct option is A

Designing a machine learning approach involves:-

Choosing the type of training experience
Choosing the target function to be learned
Choosing a representation for the target function
Choosing a function approximation algorithm
All of the above

Correct option is E

Concept learning infers a _____-valued function from training examples of its input and output.

Decimal
Hexadecimal
Boolean
All of the above

Correct option is C

Which of the following is not a supervised learning?

Naive Bayesian
PCA
Linear Regression
Decision Tree

Correct option is B

What is Machine Learning?

(i) Artificial Intelligence
(ii) Deep Learning
(iii) Data Statistics

A. Only (i)

B. (i) and (ii)

C. All

D. None

Correct option is B

What kind of learning algorithm is used for “facial identities or facial expressions”?

Prediction
Recognizing Patterns
Generating Patterns
Recognizing Anomalies

Correct option is B

Which of the following is not a type of learning?

Unsupervised Learning
Supervised Learning
Semi-unsupervised Learning
Reinforcement Learning

Correct option is C

Real-Time Decisions, Game AI, Learning Tasks, Skill Acquisition, and Robot Navigation are applications of which of the following?

Supervised Learning: Classification
Reinforcement Learning
Unsupervised Learning: Clustering
Unsupervised Learning: Regression

Correct option is B

Targeted Marketing, Recommender Systems, and Customer Segmentation are applications of which of the following?

Supervised Learning: Classification
Unsupervised Learning: Clustering
Unsupervised Learning: Regression
Reinforcement Learning

Correct option is B

Fraud Detection, Image Classification, Diagnostics, and Customer Retention are applications of which of the following?

Unsupervised Learning: Regression
Supervised Learning: Classification
Unsupervised Learning: Clustering
Reinforcement Learning

Correct option is B

Which of the following is not a symbolic function representation in machine learning?

Rules in propositional logic
Hidden-Markov Models (HMM)
Rules in first-order predicate logic
Decision Trees

Correct option is B

Which of the following is not a numerical function representation in machine learning?

Neural Network
Support Vector Machines
Case-based
Linear Regression

Correct option is C

The FIND-S algorithm starts from the most specific hypothesis and generalizes it by considering only _____ examples

Negative
Positive
Negative or Positive
None of the above

Correct option is B

The FIND-S algorithm ignores _____ examples

Negative
Positive
Both
None of the above

Correct option is A
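
The two FIND-S questions above can be illustrated with a minimal sketch. It assumes hypotheses are conjunctions of attribute constraints, with "?" accepting any value and None standing for the most specific (empty) constraint. The function name `find_s` and the toy training data are illustrative assumptions, not part of the quiz.

```python
def find_s(examples):
    """Return the most specific hypothesis consistent with the positive examples.

    examples: list of (attribute_tuple, label) pairs, label True for positive.
    """
    n = len(examples[0][0])
    h = [None] * n  # most specific hypothesis: matches nothing yet
    for x, label in examples:
        if not label:
            continue  # FIND-S ignores negative examples
        for i, value in enumerate(x):
            if h[i] is None:
                h[i] = value  # first positive example: copy its values
            elif h[i] != value:
                h[i] = "?"  # conflicting values: generalize to "any value"
    return tuple(h)


# Hypothetical toy data in the style of a weather concept
training = [
    (("Sunny", "Warm", "Normal", "Strong"), True),
    (("Sunny", "Warm", "High", "Strong"), True),
    (("Rainy", "Cold", "High", "Strong"), False),
]
hypothesis = find_s(training)
```

Only the two positive examples change the hypothesis; the negative example is skipped, which is why the algorithm depends on the training set being consistent.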

The Candidate-Elimination algorithm represents the _____.

Solution Space
Version Space
Elimination Space
All of the above

Correct option is B

Inductive learning is based on the knowledge that if something happens a lot, it is likely to be generally true.

True
False

Correct option is A

Inductive learning takes examples and generalizes rather than starting with _____ knowledge

Inductive
Existing
Deductive
None of these

Correct option is B

A drawback of FIND-S is that it assumes consistency within the training set

True
False

Correct option is A

What strategies can help reduce overfitting in decision trees?

(i) Enforce a maximum depth for the tree
(ii) Enforce a minimum number of samples in leaf nodes
(iii) Pruning
(iv) Make sure each leaf node is one pure class

A. All

B. (i), (ii) and (iii)

C. (i), (iii), (iv)

D. None

Correct option is B

Which of the following is a widely used and effective machine learning algorithm based on the idea of bagging?

Decision Tree
Random Forest
Regression
Classification

Correct option is B

To find the minimum or the maximum of a function, we set the gradient to zero because which of the following

Depends on the type of problem
The value of the gradient at extrema of a function is always zero
Both (A) and (B)
None of these

Correct option is B

Which of the following is a disadvantage of decision trees?

Decision trees are prone to overfitting
Decision trees are robust to outliers
Factor analysis
None of the above

Correct option is A

What is a perceptron?

A single layer feed-forward neural network with pre-processing
A neural network that contains feedback
A double layer auto-associative neural network
An auto-associative neural network

Correct option is A

Which of the following is true for neural networks?

(i) The training time depends on the size of the network
(ii) Neural networks can be simulated on a conventional computer
(iii) Artificial neurons are identical in operation to biological ones

A. All

B. Only (ii)

C. (i) and (ii)

D. None

Correct option is C

What are the advantages of neural networks over conventional computers?

(i) They have the ability to learn by example
(ii) They are more fault tolerant
(iii) They are more suited for real-time operation due to their high "computational" rates

A. (i) and (ii)

B. (i) and (iii)

C. Only (i)

D. All

E. None

Correct option is D

What is Neuro software?

It is software used by Neurosurgeon
Designed to aid experts in real world
It is powerful and easy neural network
A software used to analyze neurons

Correct option is C

Which is true for neural networks?

Each node computes its weighted input
Node could be in excited state or non-excited state
It has set of nodes and connections
All of the above

Correct option is D

What is the objective of backpropagation algorithm?

To develop learning algorithm for multilayer feedforward neural network, so that network can be trained to capture the mapping implicitly
To develop learning algorithm for multilayer feedforward neural network
To develop learning algorithm for single layer feedforward neural network
All of the above

Correct option is A

Which of the following is true?

Single layer associative neural networks do not have the ability to:-

(i) Perform pattern recognition
(ii) Find the parity of a picture
(iii) Determine whether two or more shapes in a picture are connected or not

A. (ii) and (iii)

B. Only (ii)

C. All

D. None

Correct option is A

The backpropagation law is also known as generalized delta rule

True
False

Correct option is A

Which of the following is true?

(i) On average, neural networks have higher computational rates than conventional computers
(ii) Neural networks learn by example
(iii) Neural networks mimic the way the human brain works

A. All

B. (ii) and (iii)

C. (i), (ii) and (iii)

D. None

Correct option is A

What is true regarding backpropagation rule?

Error in output is propagated backwards only to determine weight updates
There is no feedback of signal at any stage
It is also called generalized delta rule
All of the above

Correct option is D

There is feedback in final stage of backpropagation

True
False

Correct option is B

An auto-associative network is

A neural network that has only one loop
A neural network that contains feedback
A single layer feed-forward neural network with pre-processing
A neural network that contains no loops

Correct option is B

A 3-input neuron has weights 1, 4 and 3. The transfer function is linear with the constant of proportionality being equal to 3. The inputs are 4, 8 and 5 respectively. What will be the output?

Correct option is B
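
The output can be checked directly: a linear transfer function multiplies the weighted sum of the inputs by the constant of proportionality. A minimal sketch, assuming the neuron has no bias term (the variable names are illustrative):

```python
weights = [1, 4, 3]
inputs = [4, 8, 5]
k = 3  # constant of proportionality of the linear transfer function

# Weighted sum of the inputs: 1*4 + 4*8 + 3*5
weighted_sum = sum(w * x for w, x in zip(weights, inputs))

# Linear transfer function: output = k * weighted_sum
output = k * weighted_sum
```

With these numbers the weighted sum is 51, so the output is 3 × 51 = 153.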

Which of the following is true regarding the backpropagation rule?

Hidden layers output is not all important, they are only meant for supporting input and output layers
Actual output is determined by computing the outputs of units for each hidden layer
It is a feedback neural network
None of the above

Correct option is B

What is back propagation?

It is another name given to the curvy function in the perceptron
It is the transmission of error back through the network to allow weights to be adjusted so that the network can learn
It is another name given to the curvy function in the perceptron
None of the above

Correct option is B

The general limitations of back propagation rule is/are

Scaling
Slow convergence
Local minima problem
All of the above

Correct option is D

What is the meaning of generalized in statement “backpropagation is a generalized delta rule” ?

Because delta is applied to only input and output layers, thus making it more simple and generalized
It has no significance
Because delta rule can be extended to hidden layer units
None of the above

Correct option is C

Neural networks are complex _____ functions with many parameters

Linear
Non-linear
Discrete
Exponential

Correct option is B

The general tasks that are performed with backpropagation algorithm

Pattern mapping
Prediction
Function approximation
All of the above

Correct option is D

Backpropagation learning is based on gradient descent along the error surface.

True
False

Correct option is A
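
Gradient descent along an error surface can be sketched for the simplest case, a single linear unit with one weight. This is not full backpropagation; the function name `train_linear_unit`, the learning rate, and the toy data are assumptions chosen for illustration.

```python
def train_linear_unit(samples, learning_rate=0.05, steps=50):
    """Fit y = w * x by batch gradient descent on E(w) = 0.5 * sum((y - w*x)^2)."""
    w = 0.0
    for _ in range(steps):
        # dE/dw = -sum((y - w*x) * x); move against the gradient
        gradient = -sum((y - w * x) * x for x, y in samples)
        w -= learning_rate * gradient
    return w


samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated by y = 2x
w = train_linear_unit(samples)
```

Each step moves the weight downhill on the squared-error surface, so `w` approaches 2. Backpropagation applies the same idea to every weight in a multilayer network by propagating the error terms backwards.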

In backpropagation rule, how to stop the learning process?

No heuristic criteria exist
On basis of average gradient value
There is convergence involved
None of these

Correct option is B

Applications of NN (Neural Network)

Risk management
Data validation
Sales forecasting
All of the above

Correct option is D

The network that involves backward links from output to the input and hidden layers is known as

Recurrent neural network
Self organizing maps
Perceptrons
Single layered perceptron

Correct option is A

A decision tree is a display of an algorithm.

True
False

Correct option is A

Which of the following is/are the decision tree nodes?

End Nodes
Decision Nodes
Chance Nodes
All of the above

Correct option is D

End Nodes are represented by which of the following

Solar street light
Triangles
Circles
Squares

Correct option is B

Decision Nodes are represented by which of the following

Solar street light
Triangles
Circles
Squares

Correct option is D

Chance Nodes are represented by which of the following

Solar street light
Triangles
Circles
Squares

Correct option is C

Advantage of Decision Trees

Possible Scenarios can be added
Use a white box model, if given result is provided by a model
Worst, best and expected values can be determined for different scenarios
All of the above

Correct option is D

_____ terms are required for building a Bayes model.

Correct option is C

Which of the following is the consequence between a node and its predecessors while creating bayesian network?

Conditionally independent
Functionally dependent
Both Conditionally dependant & Dependant
Dependent

Correct option is A

What is needed to make probabilistic systems feasible in the world?

Feasibility
Reliability
Crucial robustness
None of the above

Correct option is C

Bayes rule can be used for:-

Solving queries
Increasing complexity
Answering probabilistic query
Decreasing complexity

Correct option is C

_____ provides ways and means of weighing up the desirability of goals and the likelihood of achieving them

Utility theory
Decision theory
Bayesian networks
Probability theory

Correct option is A

Which of the following is provided by the Bayesian network?

Complete description of the problem
Partial description of the domain
Complete description of the domain
All of the above

Correct option is C

Probability provides a way of summarizing the _____ that comes from our laziness and ignorance

Belief
Uncertainty
Joint probability distributions
Randomness

Correct option is B

The entries in the full joint probability distribution can be calculated as

Using variables
Both Using variables & information
Using information
All of the above

Correct option is C

Causal chain (for example, smoking causes cancer) gives rise to:-

Conditional independence
Conditional dependence
Both
None of the above

Correct option is A

The bayesian network can be used to answer any query by using:-

Full distribution
Joint distribution
Partial distribution
All of the above

Correct option is B

Bayesian networks allow compact specification of:-

Joint probability distributions
Belief
Propositional logic statements
All of the above

Correct option is A

The compactness of the bayesian network can be described by

Fully structured
Locally structured
Partially structured
All of the above

Correct option is B

The Expectation-Maximization Algorithm has been used to identify conserved domains in unaligned proteins only. State True or False.

True
False

Correct option is B

Which of the following is correct about the Naive Bayes?

Assumes that all the features in a dataset are independent
Assumes that all the features in a dataset are equally important
Both
All of the above

Correct option is C

Which of the following is false regarding EM Algorithm?

The alignment provides an estimate of the base or amino acid composition of each column in the site
The column-by-column composition of the site already available is used to estimate the probability of finding the site at any position in each of the sequences
The row-by-column composition of the site already available is used to estimate the probability
None of the above

Correct option is C

The Naïve Bayes algorithm is a _____ learning algorithm.

Supervised
Reinforcement
Unsupervised
None of these

Correct option is A

The EM algorithm includes two repeated steps; step 2 is _____.

The normalization
The maximization step
The minimization step
None of the above

Correct option is B

Examples of Naïve Bayes Algorithm is/are

Spam filtration
Sentiment analysis
Classifying articles
All of the above

Correct option is D

In the intermediate steps of “EM Algorithm”, the number of each base in each column is determined and then converted to

True
False

Correct option is A

The Naïve Bayes algorithm is based on _____ and used for solving classification problems.

Bayes Theorem
Candidate elimination algorithm
EM algorithm
None of the above

Correct option is A

Types of Naïve Bayes Model:

Gaussian
Multinomial
Bernoulli
All of the above

Correct option is D

Disadvantages of Naïve Bayes Classifier:

Naïve Bayes assumes that all features are independent or unrelated, so it cannot learn the relationship between features
It performs well in multi-class predictions as compared to the other algorithms
Naïve Bayes is one of the fast and easy ML algorithms to predict a class of datasets
It is the most popular choice for text classification problems.

Correct option is A

The benefit of Naïve Bayes:-

Naïve Bayes is one of the fast and easy ML algorithms to predict a class of datasets
It is the most popular choice for text classification problems.
It can be used for binary as well as multi-class classification
All of the above

Correct option is D

In which of the following types of sampling the information is carried out under the opinion of an expert?

Convenience sampling
Judgement sampling
Quota sampling
Purposive sampling

Correct option is B

Full form of MDL?

Minimum Description Length
Maximum Description Length
Minimum Domain Length
None of these

Correct option is A

For the analysis of ML algorithms, we need

Computational learning theory
Statistical learning theory
Both A & B
None of these

Correct option is C

PAC stands for

Probably Approximately Correct
Probably Approx Correct
Probably Approximate Computation
Probably Approx Computation

Correct option is A

The _____ of hypothesis h with respect to target concept c and distribution D is the probability that h will misclassify an instance drawn at random according to D.

True Error
Type 1 Error
Type 2 Error
None of these

Correct option is A

Statement: True error defined over entire instance space, not just training data

True
False

Correct option is A

What areas does computational learning theory (CLT) comprise?

Sample Complexity
Computational Complexity
Mistake Bound
All of these

Correct option is D

What area of CLT tells “How many examples we need to find a good hypothesis ?”?

Sample Complexity
Computational Complexity
Mistake Bound
None of these

Correct option is A

What area of CLT tells “How much computational power we need to find a good hypothesis ?”?

Sample Complexity
Computational Complexity
Mistake Bound
None of these

Correct option is B

What area of CLT tells “How many mistakes we will make before finding a good hypothesis ?”?

Sample Complexity
Computational Complexity
Mistake Bound
None of these

Correct option is C

(For this question and the next) Can we say that concepts described by conjunctions of Boolean literals are PAC-learnable?

Correct option is A

How large is the hypothesis space when we have n Boolean attributes?

|H| = 3ⁿ
|H| = 2ⁿ
|H| = 1ⁿ
|H| = 4ⁿ

Correct option is A
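
The count 3ⁿ follows when the hypothesis space is taken to be conjunctions of Boolean literals: each attribute is either required to be true, required to be false, or left out of the conjunction. A small enumeration sketch under that assumption (the function name `conjunction_hypotheses` is illustrative):

```python
from itertools import product


def conjunction_hypotheses(n):
    """Enumerate every conjunction over n Boolean attributes.

    Each attribute is required True (1), required False (0), or unconstrained ("?").
    """
    return list(product((1, 0, "?"), repeat=n))


# Size of the hypothesis space for n = 1..4
sizes = {n: len(conjunction_hypotheses(n)) for n in range(1, 5)}
```

The sizes come out as 3, 9, 27, and 81, that is, three choices per attribute multiplied across n attributes.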

The VC dimension of hypothesis space H1 is larger than the VC dimension of hypothesis space H2. Which of the following can be inferred from this?

The number of examples required for learning a hypothesis in H1 is larger than the number of examples required for H2
The number of examples required for learning a hypothesis in H1 is smaller than the number of examples required for H2
No relation to number of samples required for PAC learning.

Correct option is A

For a particular learning task, if the requirement of error parameter changes from 0.1 to 0.01. How many more samples will be required for PAC learning?

Same
2 times
1000 times
10 times

Correct option is D
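
The factor of 10 can be seen from the standard sample-complexity bound for a consistent learner over a finite hypothesis space, m ≥ (1/ε)(ln|H| + ln(1/δ)), in which m scales with 1/ε. The sketch below evaluates that bound; the choice of |H| = 3¹⁰ and δ = 0.05 is an assumption made only for illustration.

```python
import math


def pac_sample_bound(h_size, epsilon, delta):
    """Smallest integer m with m >= (1/epsilon) * (ln|H| + ln(1/delta))."""
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / epsilon)


h_size = 3 ** 10  # assumed: conjunctions over 10 Boolean attributes
delta = 0.05      # assumed confidence parameter

m_loose = pac_sample_bound(h_size, 0.1, delta)
m_tight = pac_sample_bound(h_size, 0.01, delta)
ratio = m_tight / m_loose
```

Tightening ε from 0.1 to 0.01 raises the bound from 140 to 1399 examples, a ratio of roughly 10.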

Computational complexity of classes of learning problems depends on which of the following?

The size or complexity of the hypothesis space considered by learner
The accuracy to which the target concept must be approximated
The probability that the learner will output a successful hypothesis
All of these

Correct option is D

The instance-based learner is a

Lazy-learner
Eager learner
Can't say

Correct option is A

When to consider nearest neighbour algorithms?

Instances map to points in ℝⁿ
Not more than 20 attributes per instance
Lots of training data
None of these
A, B & C

Correct option is E

What are the advantages of the nearest neighbour algorithm?

Training is very fast
Can learn complex target functions
Don't lose information
All of these

Correct option is D

What are the difficulties with k-nearest neighbour algo?

Calculate the distance of the test case from all training cases
Curse of dimensionality
Both A & B
None of these

Correct option is C

What if the target function is real valued in kNN algo?

Calculate the mean of the k nearest neighbours
Calculate the SD of the k nearest neighbour
None of these

Correct option is A
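
For a real-valued target, k-NN averages the target values of the k nearest neighbours. A minimal one-dimensional sketch (the function name `knn_regress` and the toy data are assumptions):

```python
def knn_regress(train, query, k):
    """Predict a real value as the mean target of the k nearest training points.

    train: list of (x, y) pairs with scalar x; distance is |x - query|.
    """
    nearest = sorted(train, key=lambda pair: abs(pair[0] - query))[:k]
    return sum(y for _, y in nearest) / k


train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (10.0, 20.0)]
prediction = knn_regress(train, 2.2, k=3)
```

The three nearest points to 2.2 are x = 2, 3, and 1, so the prediction is the mean of 4, 6, and 2, which is 4.0.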

What is/are true about Distance-weighted KNN?

The weight of the neighbour is considered
The distance of the neighbour is considered
Both A & B
None of these

Correct option is C

What is/are advantage(s) of Distance-weighted k-NN over k-NN?

Robust to noisy training data
Quite effective when a sufficient large set of training data is provided
Both A & B
None of these

Correct option is C

What is/are advantage(s) of Locally Weighted Regression?

Pointwise approximation of complex target function
Earlier data has no influence on the new ones
Both A & B
None of these

Correct option is C

In locally weighted regression (LWR), the quality of the result depends on

Choice of the function
Choice of the kernel function K
Choice of the hypothesis space H
All of these

Correct option is D

How many types of layers are there in radial basis function neural networks?

Correct option is A: input layer, hidden layer, and output layer

The neurons in the hidden layer contain Gaussian transfer functions whose outputs are _____ proportional to the distance from the centre of the neuron.

Directly
Inversely
equal
None of these

Correct option is B
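
A Gaussian transfer function makes this concrete: the output is largest at the centre and falls off as the input moves away. Strictly, it decays exponentially with squared distance rather than as 1/distance, which is the loose sense in which the quiz says "inversely". In the sketch below, the width parameter `sigma` and the function name are assumptions.

```python
import math


def gaussian_rbf(x, centre, sigma=1.0):
    """Gaussian transfer function: output shrinks as x moves away from the centre."""
    r = math.dist(x, centre)  # Euclidean distance to the neuron's centre
    return math.exp(-(r ** 2) / (2 * sigma ** 2))


centre = (0.0, 0.0)
# Outputs at distances 0, 1, 2, and 3 from the centre
outputs = [gaussian_rbf((d, 0.0), centre) for d in (0.0, 1.0, 2.0, 3.0)]
```

The first output is exactly 1.0 (at the centre), and each later output is smaller than the one before it.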

PNN/GRNN networks have one neuron for each point in the training file, while RBF networks have a variable number of neurons that is usually

less than the number of training points
greater than the number of training points
equal to the number of training points
None of these

Correct option is A

Which network is more accurate when the size of the training set is small to medium?

PNN/GRNN
RBF
K-means clustering
None of these

Correct option is A

What is/are true about RBF network?

A kind of supervised learning
Design of NN as curve fitting problem
Use of multidimensional surface to interpolate the test data
All of these

Correct option is D

Application of CBR

Design
Planning
Diagnosis
All of these

Correct option is D

What is/are advantages of CBR?

A local approx. is found for each test case
Knowledge is in a form understandable to human
Fast to train
All of these

Correct option is D

In the k-NN algorithm, given a set of training examples and a value of k < size of training set (n), the algorithm predicts the class of a test example to be the

Least frequent class among the classes of the k closest training examples
Most frequent class among the classes of the k closest training examples
Class of the closest point
Most frequent class among the classes of the k farthest training examples.

Correct option is B
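
The majority-vote rule can be sketched in a few lines. This assumes Euclidean distance and a brute-force search over all training points; the function name `knn_classify` and the toy data are illustrative.

```python
import math
from collections import Counter


def knn_classify(train, query, k):
    """Return the most frequent label among the k training points closest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b")]
label_near_origin = knn_classify(train, (0.5, 0.5), k=3)
label_far_corner = knn_classify(train, (5, 5.5), k=3)
```

Sorting by distance to every training point is also what makes prediction expensive on large training sets.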

Which of the following statements is true about PCA?

(i) We must standardize the data before applying PCA
(ii) We should select the principal components which explain the highest variance
(iii) We should select the principal components which explain the lowest variance
(iv) We can use PCA for visualizing the data in lower dimensions

A. (i), (ii) and (iv).

B. (ii) and (iv)

C. (iii) and (iv)

D. (i) and (iii)

Correct option is A

Genetic algorithm is a

Search technique used in computing to find true or approximate solution to optimization and search problem
Sorting technique used in computing to find true or approximate solution to optimization and sort problem
Both A & B
None of these

Correct option is A

GA techniques are inspired by _____ biology

Evolutionary
Cytology
Anatomy
Ecology

Correct option is A

When would the genetic algorithm terminate?

Maximum number of generations has been produced
Satisfactory fitness level has been reached for the population
Both A & B
None of these

Correct option is C

The algorithm operates by iteratively updating a pool of hypotheses, called the _____

Population
Fitness
None of these

Correct option is A

What is the correct representation of GA?

GA(Fitness, Fitness_threshold, p)
GA(Fitness, Fitness_threshold, p, r )
GA(Fitness, Fitness_threshold, p, r, m)
GA(Fitness, Fitness_threshold)

Correct option is C
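
A compact sketch of a genetic algorithm with this signature is given below. It assumes the usual reading of the parameters: p is the population size, r the fraction of the population replaced by crossover each generation, and m the mutation rate. The bit-string length, generation cap, seed, and fitness function are illustrative assumptions, and this is a simplified sketch rather than a reference implementation.

```python
import random


def ga(fitness, fitness_threshold, p, r, m, length=8, max_generations=200, seed=0):
    rng = random.Random(seed)
    population = ["".join(rng.choice("01") for _ in range(length)) for _ in range(p)]
    for _ in range(max_generations):
        if fitness(max(population, key=fitness)) >= fitness_threshold:
            break  # a satisfactory hypothesis has been found
        weights = [fitness(h) + 1e-9 for h in population]  # avoid all-zero weights
        # Select: keep (1 - r) * p members, chosen in proportion to fitness
        survivors = rng.choices(population, weights=weights, k=round((1 - r) * p))
        # Crossover: r * p / 2 pairs each produce two offspring
        offspring = []
        for _ in range(round(r * p / 2)):
            a, b = rng.choices(population, weights=weights, k=2)
            point = rng.randrange(1, length)
            offspring += [a[:point] + b[point:], b[:point] + a[point:]]
        population = survivors + offspring
        # Mutate: flip one random bit in a fraction m of the new population
        for i in rng.sample(range(len(population)), round(m * len(population))):
            j = rng.randrange(length)
            h = population[i]
            population[i] = h[:j] + ("1" if h[j] == "0" else "0") + h[j + 1:]
    return max(population, key=fitness)


def ones_fraction(h):
    """Toy fitness: fraction of bits set to 1."""
    return h.count("1") / len(h)


best = ga(ones_fraction, fitness_threshold=1.0, p=20, r=0.6, m=0.1)
```

The loop stops either when the fitness threshold is reached or when the generation cap runs out, matching the two termination conditions asked about earlier.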

Genetic operators include

Crossover
Mutation
Both A & B
None of these

Correct option is C

The operator that produces two new offspring from two parent strings by copying selected bits from each parent is called

Mutation
Inheritance
Crossover
None of these

Correct option is C
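
The two operators can be sketched on bit strings as follows. Single-point crossover is only one of several crossover variants, and the function names are illustrative assumptions.

```python
def single_point_crossover(parent1, parent2, point):
    """Swap the tails of two equal-length bit strings after position `point`."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


def point_mutation(bits, index):
    """Flip the single bit at `index`."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]


children = single_point_crossover("11101001000", "00001010101", point=5)
mutant = point_mutation("11101001000", 7)
```

Here the children are 11101010101 and 00001001000: each keeps the first five bits of one parent and the remaining bits of the other. The mutant differs from its parent in exactly one position.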

A schema, which represents a set of bit strings, is any string composed of

0s, 1s
only 0s
only 1s
0s, 1s, *s

Correct option is D

0*10 represents the set of bit strings that includes exactly

0010, 0110
0010, 0010
0100, 0110
0100, 0010

Correct option is A
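
A schema can be expanded mechanically by treating each "*" as "don't care". A short sketch (the function name `schema_instances` is an assumption):

```python
from itertools import product


def schema_instances(schema):
    """List every bit string matched by a schema, where "*" means "don't care"."""
    slots = [("0", "1") if symbol == "*" else (symbol,) for symbol in schema]
    return ["".join(bits) for bits in product(*slots)]


matches = schema_instances("0*10")
```

For 0*10 this yields exactly 0010 and 0110; a schema with j asterisks matches 2ʲ bit strings.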

If correct(h) is the percentage of all training examples correctly classified by hypothesis h, then the fitness function is equal to

Fitness(h) = (correct(h))²
Fitness(h) = (correct(h))³
Fitness(h) = correct(h)
Fitness(h) = (correct(h))⁴

Correct option is A

Statement: In genetic programming, individuals in the evolving population are computer programs rather than bit strings

True
False

Correct option is A

According to the _____ view, evolution over many generations was directly influenced by the experiences of individual organisms during their lifetime

Baldwin
Lamarckian
Bayes
None of these

Correct option is B

Search through the hypothesis space cannot be characterized. Why?

Hypotheses are created by crossover and mutation operators that allow radical changes between successive generations
Hypotheses are not created by crossover and mutation
None of these

Correct option is A

ILP stands for

Inductive Logical programming
Inductive Logic Programming
Inductive Logical Program
Inductive Logic Program

Correct option is B

What is/are the requirements for the Learn-One-Rule method?

Input: accepts a set of +ve and -ve training examples.
Output: delivers a single rule that covers many +ve examples and few -ve.
Output rule has a high accuracy but not necessarily a high coverage
A & B
A, B & C

Correct option is E

_____ is any predicate (or its negation) applied to any set of terms.

Literal
Null
Clause
None of these

Correct option is A

Ground literal is a literal that

Contains only variables
Does not contain any functions
Does not contain any variables
Contains only functions

Correct option is C

_____ emphasizes learning feedback that evaluates the learner’s performance without providing standards of correctness in the form of behavioural targets

Reinforcement learning
Supervised Learning
None of these

Correct option is A

Features of Reinforcement learning

Set of problem rather than set of techniques
RL is training by reward and punishment
RL is learning from trial and error with the world
All of these

Correct option is D

Which type of feedback is used by RL?

Purely Instructive feedback
Purely Evaluative feedback
Both A & B
None of these

Correct option is B

What is/are the problem solving methods for RL?

Dynamic programming
Monte Carlo Methods
Temporal-difference learning
All of these

Correct option is D

The FIND-S Algorithm

A. Starts from the most specific hypothesis

B. It considers negative examples

C. It considers both negative and positive examples

D. None of these

Correct option is A

The hypothesis space has a general-to-specific ordering of hypotheses, and the search can be efficiently organized by taking advantage of a naturally occurring structure over the hypothesis space

A. TRUE

B. FALSE

Correct option is A

The version space is:

The subset of all hypotheses consistent with the training examples, called the version space with respect to the hypothesis space H and the training examples D because it contains all plausible versions of the target concept
The version space consists of only specific hypotheses
None of these

Correct option is A

The Candidate-Elimination Algorithm

A. The key idea in the Candidate-Elimination algorithm is to output a description of the set of all hypotheses consistent with the training examples

B. The Candidate-Elimination algorithm computes the description of this set without explicitly enumerating all of its members

C. This is accomplished by using the more-general-than partial ordering and maintaining a compact representation of the set of consistent hypotheses

D. All of these

Correct option is D

Concept learning is basically acquiring the definition of a general category from given sample positive and negative training examples of the category

A. TRUE

B. FALSE

Correct option is A

The hypothesis h1 is more-general-than hypothesis h2 (h1 > h2) if and only if h1 ≥ h2 is true and h2 ≥ h1 is false. We also say h2 is more-specific-than h1.

A. The statement is true

B. The statement is false

C. We cannot say

D. None of these

Correct option is A

The List-Then-Eliminate Algorithm

A. The List-Then-Eliminate algorithm initializes the version space to contain all hypotheses in H, then eliminates any hypothesis found inconsistent with any training example

B. The List-Then-Eliminate algorithm does not initialize the version space

C. None of these

Correct option is A
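
List-Then-Eliminate is simple enough to sketch directly, although it is only practical for tiny hypothesis spaces. The sketch assumes hypotheses are tuples of attribute constraints with "?" meaning "any value", and it leaves out the empty hypothesis that matches nothing. The function names and the toy domains are illustrative assumptions.

```python
from itertools import product


def matches(hypothesis, x):
    """True if every constraint is "?" or equals the instance's value."""
    return all(h == "?" or h == v for h, v in zip(hypothesis, x))


def list_then_eliminate(domains, examples):
    """Enumerate H, then drop every hypothesis inconsistent with some example."""
    hypothesis_space = product(*[tuple(d) + ("?",) for d in domains])
    return [h for h in hypothesis_space
            if all(matches(h, x) == label for x, label in examples)]


domains = [("Sunny", "Rainy"), ("Warm", "Cold")]
examples = [(("Sunny", "Warm"), True), (("Rainy", "Cold"), False)]
version_space = list_then_eliminate(domains, examples)
```

Of the nine enumerated hypotheses, three remain: (Sunny, Warm), (Sunny, ?), and (?, Warm). Candidate-Elimination reaches the same version space without listing every hypothesis, by tracking only its most general and most specific boundaries.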

What will take place as the agent observes its interactions with the world?

A. Learning

B. Hearing

C. Perceiving

D. Speech

Correct option is A

Which element modifies the performance element so that it makes better decisions?

A. Performance element

B. Changing element

C. Learning element

D. None of the mentioned

Correct option is C

The claim that any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples is called:

A. Inductive Learning Hypothesis

B. Null Hypothesis

C. Actual Hypothesis

D. None of these

Correct option is A

Feature of ANN in which ANN creates its own organization or representation of information it receives during learning time is

A. Adaptive Learning

B. Self Organization

C. What-If Analysis

D. Supervised Learning

Correct option is B

How does a decision tree reach its decision?

A. Single test

B. Two test

C. Sequence of test

D. No test

Correct option is C

Which of the following is a disadvantage of decision trees?

A. Factor analysis

B. Decision trees are robust to outliers

C. Decision trees are prone to be overfit

D. None of the above

Correct option is C

Tree- and rule-based classification algorithms generate which kind of rule to perform the classification?

A. if-then.

B. then

C. do

D. Answer

Correct option is A

What is Gini Index?

A. It is a type of index structure

B. It is a measure of purity

C. None of the options

Correct option is B
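
In decision-tree learning, the Gini index scores how mixed the class labels at a node are; it is usually stated as an impurity, where 0 means the node is pure. A minimal sketch of that formula (the function name `gini_impurity` is an assumption):

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 - sum of squared class proportions (0 means a pure node)."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


pure = gini_impurity(["yes", "yes", "yes", "yes"])
mixed = gini_impurity(["yes", "yes", "no", "no"])
```

A node with a single class scores 0.0, and an evenly split two-class node scores 0.5, the maximum for two classes.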

What is not a RNN in machine learning?

A. One output to many inputs

B. Many inputs to a single output

C. RNNs for nonsequential input

D. Many inputs to many outputs

Correct option is A

Which of the following sentences are correct in reference to Information gain?

A. It is biased towards multi-valued attributes

B. ID3 makes use of information gain

C. The approach used by ID3 is greedy

D. All of these

Correct option is D
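
The bias towards multi-valued attributes can be seen in a small calculation. The sketch below computes entropy-based information gain for a toy dataset in which a unique `id` attribute is compared with a binary `windy` attribute. The data and function names are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy after splitting on attribute."""
    total = len(labels)
    remainder = 0.0
    for value in set(row[attribute] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attribute] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


rows = [
    {"id": 1, "windy": False},
    {"id": 2, "windy": False},
    {"id": 3, "windy": True},
    {"id": 4, "windy": True},
]
labels = ["yes", "yes", "yes", "no"]
gain_id = information_gain(rows, labels, "id")
gain_windy = information_gain(rows, labels, "windy")
```

Splitting on `id` puts every row in its own pure subset, so it gets the maximum possible gain even though it cannot generalize. A greedy learner such as ID3 would therefore pick it over `windy`.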

A Neural Network can answer

A. For Loop questions

B. what-if questions

C. If-Then-Else analysis questions

D. None of these

Correct option is B

Artificial neural network used for

A. Pattern Recognition

B. Classification

C. Clustering

D. All

Correct option is D

Which of the following are the advantage/s of Decision Trees?

Possible Scenarios can be added
Use a white box model, If given result is provided by a model
Worst, best and expected values can be determined for different scenarios
All of the mentioned

Correct option is D

What is the mathematical likelihood that something will occur?

A. Classification

B. Probability

C. Naïve Bayes Classifier

D. None of the other

Correct option is B

What does the Bayesian network provide?

Complete description of the domain
Partial description of the domain
Complete description of the problem
None of the mentioned

Correct option is A

Where can the Bayes rule be used?

A. Solving queries

B. Increasing complexity

C. Decreasing complexity

D. Answering probabilistic query

Correct option is D

How many terms are required for building a Bayes model?

A. 2

B. 3

C. 4

D. 1

Correct option is B
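
The three terms are one conditional probability and two unconditional probabilities, as in Bayes' rule P(H | E) = P(E | H) · P(H) / P(E). A minimal sketch with made-up numbers (the function name and the values are assumptions for illustration only):

```python
def bayes_posterior(likelihood, prior, evidence):
    """Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)."""
    return likelihood * prior / evidence


# Assumed illustrative values: P(E|H) = 0.5, P(H) = 0.02, P(E) = 0.05
posterior = bayes_posterior(likelihood=0.5, prior=0.02, evidence=0.05)
```

With these values the posterior P(H | E) is 0.2, ten times the prior.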

What is needed to make probabilistic systems feasible in the world?

A. Reliability

B. Crucial robustness

C. Feasibility

D. None of the mentioned

Correct option is B

It was shown that the Naive Bayesian method

A. Can be much more accurate than the optimal Bayesian method

B. Is always worse off than the optimal Bayesian method

C. Can be almost optimal only when attributes are independent

D. Can be almost optimal when some attributes are dependent

Correct option is C

What is the consequence between a node and its predecessors while creating Bayesian network?

A. Functionally dependent

B. Dependent

C. Conditionally independent

D. Both Conditionally dependent & Dependent

Correct option is C

How can the compactness of the Bayesian network be described?

A. Locally structured

B. Fully structured

C. Partial structure

D. All of the mentioned

Correct option is A

How can the entries in the full joint probability distribution be calculated?

A. Using variables

B. Using information

C. Both Using variables & information

D. None of the mentioned

Correct option is B

How can the Bayesian network be used to answer any query?

A. Full distribution

B. Joint distribution

C. Partial distribution

D. All of the mentioned

Correct option is B

Sample Complexity is

A. The sample complexity is the number of training-samples that we need to supply to the algorithm, so that the function returned by the algorithm is within an arbitrarily small error of the best possible function, with probability arbitrarily close to 1

B. How many training examples are needed for learner to converge to a successful hypothesis.

C. All of these

Correct option is C

PAC stands for

A. Probably Approximately Correct

B. Probability Applied Correctly

C. Partition Approximately Correct

Correct option is A

Which of the following will be true about k in k-NN in terms of variance

A. When you increase k, the variance increases

B. When you decrease k, the variance increases

C. Can't say

D. None of these

Correct option is B

Which of the following option is true about k-NN algorithm?

A. It can be used for classification

B. It can be used for regression

C. It can be used in both classification and regression

Correct option is C
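To make the point concrete, here is a minimal k-NN sketch that reuses the same neighbour search for both classification (majority vote) and regression (mean of the neighbours' targets). The helper names and the toy data are assumptions for illustration. It uses plain Euclidean distance and no tie-breaking rules beyond Python's stable sort.

```python
from collections import Counter
from math import dist

def k_nearest(train, query, k):
    """Return the targets of the k training points closest to `query`."""
    ranked = sorted(train, key=lambda pair: dist(pair[0], query))
    return [target for _, target in ranked[:k]]

def knn_classify(train, query, k):
    """Majority vote among the k nearest neighbours."""
    return Counter(k_nearest(train, query, k)).most_common(1)[0][0]

def knn_regress(train, query, k):
    """Mean target value of the k nearest neighbours."""
    neighbours = k_nearest(train, query, k)
    return sum(neighbours) / len(neighbours)

# Hypothetical toy data: (features, target) pairs.
points = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((1.0, 1.0), "b"), ((0.9, 1.1), "b")]
values = [((0.0,), 1.0), ((1.0,), 2.0), ((2.0,), 3.0), ((10.0,), 50.0)]
predicted_class = knn_classify(points, (0.2, 0.1), k=3)
predicted_value = knn_regress(values, (1.0,), k=3)
```

Averaging over a larger k smooths out individual noisy neighbours, which is the intuition behind the bias and variance questions in this section.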

In k-NN, overfitting is very likely due to the curse of dimensionality. Which of the following options would you consider to handle this problem? 1) Dimensionality reduction 2) Feature selection

1
2
1 and 2
None of these

Correct option is C

When you find noise in the data, which of the following options would you consider in k-NN?

A. I will increase the value of k

B. I will decrease the value of k

C. Noise cannot be dependent on the value of k

D. None of these

Correct option is A

Which of the following will be true about k in k-NN in terms of Bias?

A. When you increase k, the bias increases

B. When you decrease k, the bias increases

C. Can't say

D. None of these

Correct option is A

What is used to mitigate overfitting in a test set?

A. Overfitting set

B. Training set

C. Validation dataset

D. Evaluation set

Correct option is C

A radial basis function is a

A. Activation function

B. Weight

C. Learning rate

D. none

Correct option is A

Mistake Bound is

How many training examples are needed for learner to converge to a successful hypothesis.
How much computational effort is needed for a learner to converge to a successful hypothesis
How many training examples will the learner misclassify before converging to a successful hypothesis
None of these

Correct option is C

All of the following are suitable problems for genetic algorithms EXCEPT

A. dynamic process control

B. pattern recognition with complex patterns

C. simulation of biological models

D. simple optimization with few variables

Correct option is D

Adding more basis functions in a linear model… (Pick the most probable option)

A. Decreases model bias

B. Decreases estimation bias

C. Decreases variance

D. Doesn't affect bias and variance

Correct option is A

Which of these are types of crossover

A. Single point

B. Two point

C. Uniform

D. All of these

Correct option is D
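The three crossover types listed above can be sketched on bit strings as follows. This is an illustrative sketch only: the function names, the equal-length string representation, and the 0.5 swap probability for uniform crossover are assumptions, not a reference implementation.

```python
import random

def single_point(p1, p2, rng):
    """Swap the tails of two equal-length parents at one random cut."""
    cut = rng.randint(1, len(p1) - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def two_point(p1, p2, rng):
    """Swap the middle segment between two distinct random cuts."""
    a, b = sorted(rng.sample(range(1, len(p1)), 2))
    return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]

def uniform(p1, p2, rng):
    """Choose each position independently from either parent."""
    c1, c2 = [], []
    for x, y in zip(p1, p2):
        if rng.random() < 0.5:
            x, y = y, x
        c1.append(x)
        c2.append(y)
    return "".join(c1), "".join(c2)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
mum, dad = "11111111", "00000000"
children = {
    "single": single_point(mum, dad, rng),
    "two": two_point(mum, dad, rng),
    "uniform": uniform(mum, dad, rng),
}
```

In every variant the two offspring are complementary: each position of the first child comes from one parent and the same position of the second child comes from the other.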

A feature F1 can take the values A, B, C, D, E, and F, and represents the grades of students from a college. Which of the following statements is true in this case?

A. Feature F1 is an example of nominal

B. Feature F1 is an example of ordinal

C. It doesn't belong to any of the above categories.

Correct option is B

You observe the following while fitting a linear regression to the data: As you increase the amount of training data, the test error decreases and the training error increases. The train error is quite low (almost what you expect it to), while the test error is much higher than the train error. What do you think is the main reason behind this behaviour? Choose the most probable option.

A. High variance

B. High model bias

C. High estimation bias

D. None of the above

Correct option is A

Genetic algorithms are heuristic methods that do not guarantee an optimal solution to a problem

A. TRUE

B. FALSE

Correct option is A

Which of the following statements about regularization is correct?

A. Using too large a value of lambda can cause your hypothesis to underfit the

B. Using too large a value of lambda can cause your hypothesis to overfit the

C. Using a very large value of lambda cannot hurt the performance of your hypothesis.

D. None of the above

Correct option is A

Consider the following: (a) Evolution (b) Selection (c) Reproduction (d) Mutation Which of the following are found in genetic algorithms?

A. All

B. a, b, c

C. a, b

D. b, d

Correct option is A

Genetic algorithms are

A. A part of evolutionary computing

B. Inspired by Darwin's theory of evolution, "survival of the fittest"

C. Adaptive heuristic search algorithms based on the evolutionary ideas of natural selection and genetics

D. All of the above

Correct option is D

Genetic algorithms belong to the family of methods in the

A. artificial intelligence area

B. optimization

C. complete enumeration family of methods

D. Non-computer based (human) solutions area

Correct option is A

For a two player chess game, the environment encompasses the opponent

A. True

B. False

Correct option is A

Which among the following is not a necessary feature of a reinforcement learning solution to a learning problem?

A. exploration versus exploitation dilemma

B. trial and error approach to learning

C. learning based on rewards

D. representation of the problem as a Markov Decision Process

Correct option is D

Which of the following sentence is FALSE regarding reinforcement learning

A. It relates inputs to outputs.

B. It is used for prediction.

C. It may be used for interpretation.

D. It discovers causal relationships.

Correct option is D

The EM algorithm is guaranteed to never decrease the value of its objective function on any iteration

A. TRUE

B. FALSE

Correct option is A

Consider the following modification to the tic-tac-toe game: at the end of the game, a coin is tossed and the agent wins if a head appears, regardless of whatever has happened in the game. Can reinforcement learning be used to learn an optimal policy of playing tic-tac-toe in this case?

A. Yes

B. No

Correct option is B

Of the two repeated steps in the EM algorithm, step 2 is ______

the maximization step
the minimization step
the optimization step
the normalization step

Correct option is A

Suppose the reinforcement learning player was greedy, that is, it always played the move that brought it to the position that it rated the best. Might it learn to play better, or worse, than a non greedy player?

A. Worse

B. Better

Correct option is A
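The effect of purely greedy play can be seen in a toy setting. The sketch below is an assumption-laden illustration, not the tic-tac-toe learner from the question: it uses a two-armed bandit with fixed, hypothetical payouts and compares a purely greedy player (epsilon = 0) with one that explores 10% of the time.

```python
import random

def run_bandit(epsilon, steps, rng, payouts=(0.5, 1.0)):
    """Play a toy bandit with fixed payouts and return the total reward."""
    estimates = [0.0] * len(payouts)   # running value estimate per arm
    counts = [0] * len(payouts)
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(payouts))        # explore: random arm
        else:
            arm = estimates.index(max(estimates))    # exploit: first best arm
        reward = payouts[arm]
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return total

rng = random.Random(1)  # fixed seed so the sketch is repeatable
greedy_total = run_bandit(0.0, 1000, rng)
exploring_total = run_bandit(0.1, 1000, rng)
```

In this toy setting the greedy player locks onto the first arm it tries and never discovers the better one, while the exploring player finds it and then exploits it most of the time.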

A chess agent trained by using Reinforcement Learning can be trained by playing against a copy of the same

A. True

B. False

Correct option is A

The EM iteration alternates between performing an expectation (E) step, which creates a function for the expectation of the log-likelihood evaluated using the current estimate for the parameters, and a maximization (M) step, which computes parameters maximizing the expected log-likelihood found in the E step.

A. TRUE

B. FALSE

Correct option is A

Expectation–maximization (EM) algorithm is an

A. Iterative

B. Incremental

C. None

Correct option is A

Features that need to be identified for a well-posed learning problem:

A. Class of tasks

B. Performance measure

C. Training experience

D. All of these

Correct option is D

A computer program that learns to play checkers might improve its performance as:

A. Measured by its ability to win at the class of tasks involving playing checkers

B. Experience obtained by playing games against

C. Both a & b

D. None of these

Correct option is C

Learning symbolic representations of concepts is known as:

A. Artificial Intelligence

B. Machine Learning

C. Both a & b

D. None of these

Correct option is A

The field of study that gives computers the capability to learn without being explicitly programmed

A. Machine Learning

B. Artificial Intelligence

C. Deep Learning

D. Both a & b

Correct option is A

The autonomous acquisition of knowledge through the use of computer programs is called

A. Artificial Intelligence

B. Machine Learning

C. Deep learning

D. All of these

Correct option is B

Learning that enables massive quantities of data is known as

A. Artificial Intelligence

B. Machine Learning

C. Deep learning

D. All of these

Correct option is B

A different learning method does not include

A. Memorization

B. Analogy

C. Deduction

D. Introduction

Correct option is D

Types of learning used in machine

A. Supervised

B. Unsupervised

C. Reinforcement

D. All of these

Correct option is D

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. This describes a

A. Supervised learning problem

B. Unsupervised learning problem

C. Well posed learning problem

D. All of these

Correct option is C

Which of the following is a widely used and effective machine learning algorithm based on the idea of bagging?

A. Decision Tree

B. Regression

C. Classification

D. Random Forest

Correct option is D

How many types are available in machine learning?

A. 1

B. 2

C. 3

D. 4

Correct option is C

Learning in which a model learns from the rewards it received for its previous actions is known as:

A. Supervised learning

B. Unsupervised learning

C. Reinforcement learning

D. Concept learning

Correct option is C

A subset of machine learning that involves systems that think and learn like humans using artificial neural networks.

A. Artificial Intelligence

B. Machine Learning

C. Deep Learning

D. All of these

Correct option is C

A learning method in which the training data contains a small amount of labeled data and a large amount of unlabeled data is known as

A. Supervised Learning

B. Semi Supervised Learning

C. Unsupervised Learning

D. Reinforcement Learning

Correct option is B

Methods used for the calibration in Supervised Learning

A. Platt Calibration

B. Isotonic Regression

C. All of these

D. None of above

Correct option is C

The basic design issues for designing a learning

A. Choosing the Training Experience

B. Choosing the Target Function

C. Choosing a Function Approximation Algorithm

D. Estimating Training Values

E. All of these

Correct option is E

In Machine learning the module that must solve the given performance task is known as:

A. Critic

B. Generalizer

C. Performance system

D. All of these

Correct option is C

A learning method in which multiple models, such as classifiers or experts, are strategically generated and combined to solve a particular computational problem is called

A. Supervised Learning

B. Semi Supervised Learning

C. Unsupervised Learning

D. Reinforcement Learning

E. Ensemble learning

Correct option is E

In a learning system, the component that takes as input the current hypothesis (currently learned function) and outputs a new problem for the Performance System to explore is known as:

A. Critic

B. Generalizer

C. Performance system

D. Experiment generator

E. All of these

Correct option is D

Learning method that is used to improve the classification, prediction, function approximation etc of a model

A. Supervised Learning

B. Semi Supervised Learning

C. Unsupervised Learning

D. Reinforcement Learning

E. Ensemble learning

Correct option is E

In a learning system the component that takes as input the history or trace of the game and produces as output a set of training examples of the target function is known as:

A. Critic

B. Generalizer

C. Performance system

D. All of these

Correct option is A

The most common issue when using ML is

A. Lack of skilled resources

B. Inadequate Infrastructure

C. Poor Data Quality

D. None of these

Correct option is C

How can you ensure that your model is not overfitting?

A. Cross validation

B. Regularization

C. All of these

D. None of these

Correct option is C

A way to ensemble multiple classifications or regression

A. Stacking

B. Bagging

C. Blending

D. Boosting

Correct option is A

How well a model is going to generalize in new environment is known as

A. Data Quality

B. Transparent

C. Implementation

D. None of these

Correct option is B

Common classes of problems in machine learning are

A. Classification

B. Clustering

C. Regression

D. All of these

Correct option is D

Cost complexity pruning algorithm is used in?

A. CART

B. C4.5

C. ID3

D. All of these

Correct option is A

Which one of these is not a tree based learner?

A. CART

B. C4.5

C. ID3

D. Bayesian Classifier

Correct option is D

Which one of these is a tree based learner?

A. Rule based

B. Bayesian Belief Network

C. Bayesian classifier

D. Random Forest

Correct option is D

What is the approach of basic algorithm for decision tree induction?

A. Greedy

B. Top Down

C. Procedural

D. Step by Step

Correct option is A

Which of the following classifications would best suit the student performance classification systems?

A. If-then analysis

B. Market-basket analysis

C. Regression analysis

D. Cluster analysis

Correct option is A

What are the two approaches to tree pruning?

A. Pessimistic pruning and Optimistic pruning

B. Post pruning and Pre pruning

C. Cost complexity pruning and time complexity pruning

D. None of these

Correct option is B

How will you counter over-fitting in decision tree?

A. By pruning the longer rules

B. By creating new rules

C. Both 'By pruning the longer rules' and 'By creating new rules'

D. None of these

Correct option is A

Which of the following sentences are true?

A. In pre-pruning a tree is ‘pruned’ by halting its construction early

B. A pruning set of class labeled tuples is used to estimate cost

C. The best pruned tree is the one that minimizes the number of encoding bits

D. All of these

Correct option is D

Which of the following is a disadvantage of decision trees?

A. Factor analysis

B. Decision trees are robust to outliers

C. Decision trees are prone to overfitting

D. None of the above

Correct option is C

In which of the following scenarios is gain ratio preferred over information gain?

A. When a categorical variable has a very large number of categories

B. When a categorical variable has a very small number of categories

C. The number of categories is not the reason

D. None of these

Correct option is A
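A small sketch can show why gain ratio helps with many-valued attributes. It divides information gain by the split information, that is, the entropy of the attribute's own value distribution. The function names and the toy data are assumptions for illustration.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain divided by the split information of the attribute."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    gain = entropy(labels) - sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy(values)  # entropy of the attribute's value distribution
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data: both attributes separate the classes perfectly.
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
binary = ["a", "a", "b", "b", "a", "b", "a", "b"]   # 2 categories
row_id = [str(i) for i in range(8)]                  # 8 categories
ratio_binary = gain_ratio(binary, labels)
ratio_row_id = gain_ratio(row_id, labels)
```

Both attributes have the same information gain (1 bit), but the eight-valued one is penalized by its larger split information, so its gain ratio drops to one third.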

Major pruning techniques used in decision tree are

A. Minimum error

B. Smallest tree

C. Both a & b

D. None of these

Correct option is B

What does the central limit theorem state?

A. If the sample size increases sampling distribution must approach normal distribution

B. If the sample size decreases then the sample distribution must approach normal distribution.

C. If the sample size increases then the sampling distribution must approach an exponential distribution

D. If the sample size decreases then the sampling distribution must approach an exponential distribution

Correct option is A
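A quick simulation can make the theorem tangible. The sketch below draws repeated samples from a uniform(0, 1) distribution, which is far from normal, and looks at the distribution of the sample means. The sample sizes, the trial count, and the seed are arbitrary choices for illustration.

```python
import random
from statistics import mean, pstdev

def sample_means(sample_size, trials, rng):
    """Means of `trials` samples, each made of `sample_size` uniform(0, 1) draws."""
    return [
        sum(rng.random() for _ in range(sample_size)) / sample_size
        for _ in range(trials)
    ]

def share_within_one_sd(values):
    """Fraction of values within one standard deviation of their mean."""
    centre, spread = mean(values), pstdev(values)
    return sum(abs(v - centre) <= spread for v in values) / len(values)

rng = random.Random(42)  # fixed seed so the sketch is repeatable
small = sample_means(2, 2000, rng)
large = sample_means(50, 2000, rng)
spread_small, spread_large = pstdev(small), pstdev(large)
share_large = share_within_one_sd(large)
```

With the larger sample size the means cluster much more tightly around 0.5, and roughly 68% of them fall within one standard deviation, as expected for a normal distribution.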

The difference between the expected sample value and the estimated value of the parameter is called?

A. Bias

B. Error

C. Contradiction

D. Difference

Correct option is A

In which of the following types of sampling the information is carried out under the opinion of an expert?

A. Quota sampling

B. Convenience sampling

C. Purposive sampling

D. Judgment sampling

Correct option is D

Which of the following is a subset of population?

A. Distribution

B. Sample

C. Data

D. Set

Correct option is B

The sampling error is defined as?

A. Difference between population and parameter

B. Difference between sample and parameter

C. Difference between population and sample

D. Difference between parameter and sample

Correct option is C

Machine learning is interested in the best hypothesis h from some space H, given observed training data D. Here best hypothesis means

A. Most general hypothesis

B. Most probable hypothesis

C. Most specific hypothesis

D. None of these

Correct option is B

Practical difficulties with Bayesian Learning :

A. Initial knowledge of many probabilities is required

B. No consistent hypothesis

C. Hypotheses make probabilistic predictions

D. None of these

Correct option is A

Bayes’ theorem states that the relationship between the probability of the hypothesis before getting the evidence P(H) and the probability of the hypothesis after getting the evidence P(H∣E) is

[P(E∣H)P(H)] / P(E)
[P(E∣H) P(E) ] / P(H)
[P(E) P(H) ] / P(E∣H)
None of these

Correct option is A

A doctor knows that Cold causes fever 50% of the time. Prior probability of any patient having cold is 1/50,000. Prior probability of any patient having fever is 1/20. If a patient has fever, what is the probability he/she has cold?

P(C/F)= 0.0003
P(C/F)=0.0004
P(C/F)= 0.0002
P(C/F)=0.0045

Correct option is C
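The arithmetic behind this answer follows directly from Bayes' rule, P(C|F) = P(F|C) P(C) / P(F). A minimal check, with variable names chosen for this example:

```python
def posterior(likelihood, prior, evidence):
    """Bayes' rule: P(H|E) = P(E|H) * P(H) / P(E)."""
    return likelihood * prior / evidence

p_fever_given_cold = 0.5      # cold causes fever 50% of the time
p_cold = 1 / 50_000           # prior probability of cold
p_fever = 1 / 20              # prior probability of fever
p_cold_given_fever = posterior(p_fever_given_cold, p_cold, p_fever)
print(round(p_cold_given_fever, 4))  # 0.0002
```

Written out by hand: 0.5 × (1/50,000) / (1/20) = 0.5 × 20 / 50,000 = 0.0002.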

Which of the following will be true about k in K-Nearest Neighbor in terms of Bias?

A. When you increase k, the bias increases

B. When you decrease k, the bias increases

C. Can't say

D. None of these

Correct option is A

When you find noise in the data, which of the following options would you consider in K-Nearest Neighbor?

A. I will increase the value of k

B. I will decrease the value of k

C. Noise cannot be dependent on value of k

D. None of these

Correct option is A

In K-Nearest Neighbor it is very likely to overfit due to the curse of dimensionality. Which of the following option would you consider to handle such problem?

Dimensionality Reduction
Feature selection

A. 1

B. 2

C. 1 and 2

D. None of these

Correct option is C

Radial basis functions are closely related to distance-weighted regression, but the approach is

A. lazy learning

B. eager learning

C. concept learning

D. none of these

Correct option is B

Radial basis function networks provide a global approximation to the target function, represented by ______ of many local kernel functions.

A. a series combination

B. a linear combination

C. a parallel combination

D. a non linear combination

Correct option is B

The most significant phase in a genetic algorithm is

A. Crossover

B. Mutation

C. Selection

D. Fitness function

Correct option is A

The crossover operator produces two new offspring from

A. Two parent strings, by copying selected bits from each parent

B. One parent strings, by copying selected bits from selected parent

C. Two parent strings, by copying selected bits from one parent

D. None of these

Correct option is A

Mathematically characterize the evolution over time of the population within a GA based on the concept of

A. Schema

B. Crossover

C. Don't care

D. Fitness function

Correct option is A

In a genetic algorithm, the process of selecting parents that mate and recombine to create offspring for the next generation is known as:

A. Tournament selection

B. Rank selection

C. Fitness sharing

D. Parent selection

Correct option is D

Crossover operations are performed in genetic programming by replacing

A. Randomly chosen sub tree of one parent program by a sub tree from the other parent program.

B. Randomly chosen root node tree of one parent program by a sub tree from the other parent program

C. Randomly chosen root node tree of one parent program by a root node tree from the other parent program

D. None of these

Correct option is A
