Question Bank For DMDW
Subject Code:
Multiple Choice Questions
UNIT-1
1. What is the use of Data Mining?
a. time variant non-volatile collection of data
b. The actual discovery phase of a knowledge discovery process
c. The stage of selecting the right data
d. None of these
2. The various operations carried out on data during processing include
a. Manipulation
b. Analysis
c. Calculation
d. None of these
3. How is Euclidean distance measured?
a. The process of finding a solution for a problem simply by enumerating all possible solutions
according to some pre-defined order and then testing them
b. The distance between two points as calculated using the Pythagoras theorem
c. A stage of the KDD process in which new data is added to the existing selection
d. None of these
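As a study aid for the question above, here is a minimal sketch of a Euclidean distance computation. The function name `euclidean_distance` and the sample points are assumptions chosen for illustration, not part of the question bank.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Square root of the sum of squared coordinate differences
    # (the Pythagorean theorem generalized to n dimensions).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Illustrative points (assumed): a 3-4-5 right triangle.
d = euclidean_distance((0, 0), (3, 4))
print(d)  # 5.0
```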
4. Information content is________?
a. Restriction that requires data in one column of a database table to be a subset of
another column
b. One of the defining aspects of a data warehouse
c. The amount of information within data as opposed to the amount of redundancy or noise
d. None of these
5. Knowledge Discovery in Databases refers to ___________
a. collection of interesting and useful patterns in a database
b. Set of columns in a database table that can be used to identify each record within this table
uniquely.
c. Non-trivial extraction of implicit, previously unknown, and potentially useful information from
data
d. None of these
6. Classification and regression are the tasks of_________?
a. Data manipulation
b. Data Analysis
c. Data mining
d. None of these
7. In data preprocessing, noise is referred to as ______________?
a. Random errors in a database table
b. A component of a network
c. One of the defining aspects of a data warehouse
d. None of these
8. Which of the following are the properties of entities?
a. Groups
b. Table
c. Attributes
d. Switchboards
9. In which step of Knowledge Discovery, multiple data sources are combined?
a. Data Cleaning
b. Data Integration
c. Data Selection
d. Data Transformation
10. On which of the following does the critical value for a chi-square statistic rely?
a. The degrees of freedom
b. The sum of the frequencies
c. The row totals
d. The number of variables
UNIT-2
Multiple Choice Questions
1. OLAP stands for
a) Online analytical processing
b) Online analysis processing
c) Online transaction processing
d) Online aggregate processing
2. Data that can be modeled as dimension attributes and measure attributes are called _______
data.
a) Multidimensional
b) Single-dimensional
c) Measured
d) Dimensional
3. The generalization of a cross-tab which is represented visually is ____________, which is also
called a data cube.
a) Two dimensional cube
b) Multidimensional cube
c) N-dimensional cube
d) Cuboid
4. The process of viewing the cross-tab (Single dimensional) with a fixed value of one attribute
is
a) Slicing
b) Dicing
c) Pivoting
d) Both Slicing and Dicing
5. The operation of moving from finer-granularity data to a coarser granularity (by means of
aggregation) is called a ________
a) Rollup
b) Drill down
c) Dicing
d) Pivoting
6. In SQL the cross-tabs are created using
a) Slice
b) Dice
c) Pivot
d) All of the mentioned
8. What do data warehouses support?
a) OLAP
b) OLTP
c) OLAP and OLTP
d) Operational databases
9. Business intelligence (BI) is a broad category of application programs which includes
_____________
a) Decision support
b) Data mining
c) OLAP
d) All of the mentioned
T1 I1,I2,I3
T2 I2,I3,I4
T3 I4,I5
T4 I1,I2,I4
T5 I1,I2,I3,I5
T6 I1,I2,I3,I4
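The transaction table above lends itself to support counting. The following is a minimal sketch, assuming the rows are market-basket transactions over items I1 to I5 and that the intended exercise is frequent-itemset mining (the question text for the table is not included here). The helper name `support_counts` is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Transactions copied from the table above.
transactions = {
    "T1": {"I1", "I2", "I3"},
    "T2": {"I2", "I3", "I4"},
    "T3": {"I4", "I5"},
    "T4": {"I1", "I2", "I4"},
    "T5": {"I1", "I2", "I3", "I5"},
    "T6": {"I1", "I2", "I3", "I4"},
}

def support_counts(transactions, size):
    """Count how many transactions contain each itemset of the given size."""
    counts = Counter()
    for items in transactions.values():
        # Sorting gives each itemset one canonical tuple form.
        for itemset in combinations(sorted(items), size):
            counts[itemset] += 1
    return counts

item_counts = support_counts(transactions, 1)
pair_counts = support_counts(transactions, 2)

print(item_counts[("I2",)])       # 5
print(pair_counts[("I1", "I2")])  # 4
```

Comparing these counts against a chosen minimum support threshold gives the frequent 1-itemsets and 2-itemsets; the threshold itself is not stated in the table.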
UNIT-4
Multiple Choice Questions
1. Which of the following clustering types has the characteristic shown in the figure below?
a) Partitional
b) Hierarchical
c) Naive bayes
d) None of the mentioned
2. Point out the correct statement.
a) The choice of an appropriate metric will influence the shape of the clusters
b) Hierarchical clustering is also called HCA
c) In general, the merges and splits are determined in a greedy manner
d) All of the mentioned
3. Which of the following is finally produced by Hierarchical Clustering?
a) final estimate of cluster centroids
b) tree showing how close things are to each other
c) assignment of each point to clusters
d) all of the mentioned
4. Which of the following is required by K-means clustering?
a) defined distance metric
b) number of clusters
c) initial guess as to cluster centroids
d) all of the mentioned
5. Point out the wrong statement.
a) k-means clustering is a method of vector quantization
b) k-means clustering aims to partition n observations into k clusters
c) k-nearest neighbor is the same as k-means
d) none of the mentioned
6. Which of the following functions is used for k-means clustering?
a) k-means
b) k-mean
c) heatmap
d) none of the mentioned
7. Which of the following clustering types requires a merging approach?
a) Partitional
b) Hierarchical
c) Naive Bayes
d) None of the mentioned
8. Which of the following combinations is incorrect?
a) Continuous – Euclidean distance
b) Continuous – correlation similarity
c) Binary – Manhattan distance
d) None of the mentioned
Part - A Short Answer Type Questions
1. Define web mining. [CO-4, PO-1]
2. What is a multimedia database? [CO-3, PO-1]
3. Define web content mining. [CO-3, PO-1]
4. Define web structure mining. [CO-3, PO-1]
5. Define web usage mining. [CO-3, PO-1]
6. What is spatial mining? [CO-3, PO-1]
7. What is time series analysis? [CO-3, PO-1]
8. Define sequence mining. [CO-3, PO-1]
9. Define graph mining. [CO-3, PO-1]
10. What are the applications of data mining? [CO-3, PO-1]
11. What are the additional themes in data mining? [CO-3, PO-1]
12. What is page rank? [CO-3, PO-1]
13. Write notes on k-means algorithm. [CO-3, PO-1]
14. List the density-based methods. [CO-3, PO-1]
15. List the partitioning methods. [CO-3, PO-1]
16. Differentiate between agglomerative and divisive hierarchical clustering. [CO-3, PO-1]