(Study Cards)
Zero-probability - ANS-Add 1 to every case (Laplacian correction)
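A minimal sketch of the add-one idea, assuming a single categorical attribute; the function name and the income counts (0 low, 990 medium, 10 high) are illustrative, not from the cards:

```python
from collections import Counter

def laplace_probabilities(values, categories):
    """Estimate P(category) with add-one (Laplacian) smoothing."""
    counts = Counter(values)
    # Each category gets one extra pseudo-count, so the total grows
    # by the number of categories.
    total = len(values) + len(categories)
    return {c: (counts[c] + 1) / total for c in categories}

# Illustrative data: "low" never occurs, so its raw estimate would be 0.
values = ["medium"] * 990 + ["high"] * 10
probs = laplace_probabilities(values, ["low", "medium", "high"])
```

With these counts, "low" gets 1/1003 instead of 0, while the other estimates barely change.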
A consumer bikes twice each week. What type of temporal sample is this? - ANS-Cyclic
accuracy - ANS-sensitivity * pos/(pos+neg) + specificity * neg/(pos+neg)
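A quick check of the accuracy formula, assuming pos and neg are the counts of actual positive and negative cases; the confusion-matrix counts below are made up for illustration:

```python
def accuracy_from_rates(sensitivity, specificity, pos, neg):
    """Accuracy as the class-weighted sum of sensitivity and specificity."""
    total = pos + neg
    return sensitivity * pos / total + specificity * neg / total

# Illustrative counts: TP=90, FN=10, TN=850, FP=50.
tp, fn, tn, fp = 90, 10, 850, 50
pos, neg = tp + fn, tn + fp
acc = accuracy_from_rates(tp / pos, tn / neg, pos, neg)
direct = (tp + tn) / (pos + neg)  # same value computed directly
```

Both routes give the same number, which is why the weighted form is a valid definition of accuracy.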
Application Domains of Data Mining - ANS-Healthcare, Business Intelligence,
Earth/environment, medical discovery, industry AI
Apriori Algorithm - ANS-A fast method of finding frequent itemsets, which also involves
pruning infrequent itemsets and self-joining k-itemsets only if
their first (k-1) items are the same.
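A sketch of the join-and-prune step described above, assuming each frequent k-itemset is given as a sorted tuple with no duplicates; the function name `apriori_gen` and the sample itemsets are illustrative:

```python
from itertools import combinations

def apriori_gen(frequent_k):
    """Build candidate (k+1)-itemsets from frequent k-itemsets."""
    frequent = sorted(frequent_k)
    frequent_set = set(frequent)
    candidates = []
    for i, a in enumerate(frequent):
        for b in frequent[i + 1:]:
            # Join only itemsets that share their first (k-1) items.
            if a[:-1] != b[:-1]:
                break  # sorted input: no later itemset shares the prefix
            candidate = a + (b[-1],)
            # Prune: every k-subset of the candidate must be frequent.
            if all(s in frequent_set
                   for s in combinations(candidate, len(a))):
                candidates.append(candidate)
    return candidates

frequent_2 = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]
candidates_3 = apriori_gen(frequent_2)
```

Here (A, B, C) survives, while (B, C, D) is pruned because its subset (C, D) is not frequent.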
Association Rules - ANS-Association rules specify a relation between attributes that
appears more frequently than expected if the attributes were independent.
Backpropagation - ANS-classification error => weight adjustment
Bayes' Theorem - ANS-The probability of an event occurring based on other event
probabilities.
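A small worked example of Bayes' theorem, P(H|X) = P(X|H) P(H) / P(X); the function name and all three input probabilities are hypothetical values chosen for illustration:

```python
def posterior(prior, likelihood, likelihood_given_not):
    """P(H|X) from P(H), P(X|H) and P(X|not H)."""
    # P(X) by the law of total probability.
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: P(H)=0.01, P(X|H)=0.9, P(X|not H)=0.05.
p = posterior(prior=0.01, likelihood=0.9, likelihood_given_not=0.05)
```

Even with a strong likelihood, the small prior keeps the posterior near 0.15.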
Bayesian Belief Networks - ANS-A data mining technique that is used to deliver advanced
knowledge-based systems to solve real-world problems. Involves the conditional
dependency of variables and usually includes a conditional probability table.
Bi-Clustering - ANS-Cluster both objects and attributes
Challenges of anomaly detection - ANS-Normal vs. abnormal, performance (latency,
scalability), interpretability
Classification - ANS-categorical class labels (e.g. fraud detection)
Classification-based Methods for Anomaly Detection - ANS-Supervised learning,
Challenges: class imbalance, new patterns
Clustering-based Methods for Anomaly Detection - ANS-Unsupervised learning, generalizable
to different applications (clustering method, similarity measure)
Collective Anomaly - ANS-Group of objects deviates from the norm, structural relationship among
objects, super object (group of related objects)
Collective outlier - ANS-When a group of objects differs from the rest
Confidence - ANS-P(Y ... classes (the observations) with classes obtained by some
more accurate procedure, or from a more accurate source (the reference)
Constraint-based Clustering - ANS-Benefits: involves targeted mining, domain knowledge,
and efficiency (e.g. objects can include sales in a specific location/time/category); Distance functions
include weighted attributes and boundaries
Contextual Anomaly - ANS-Context features (behavior features; e.g. similar weather
conditions), identifying context (frequent patterns), detecting anomaly within context
Contextual outlier - ANS-When an object differs in a context
Correlation rules - ANS-Measure of dependent/correlated events: lift(A,B) = P(A U B) /
(P(A)P(B))
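A sketch of the lift measure over a toy transaction list, reading P(A U B) as the fraction of transactions that contain both itemsets; the function name and the transactions are made up, and both itemsets are assumed to occur at least once:

```python
def lift(transactions, a, b):
    """lift(A,B) = P(A U B) / (P(A) P(B)) for itemsets a and b."""
    n = len(transactions)
    p_a = sum(1 for t in transactions if a <= t) / n
    p_b = sum(1 for t in transactions if b <= t) / n
    # A transaction supports A U B when it contains every item of both.
    p_ab = sum(1 for t in transactions if (a | b) <= t) / n
    return p_ab / (p_a * p_b)

transactions = [
    {"milk", "bread"}, {"milk", "bread"}, {"milk"}, {"bread"}, {"eggs"},
]
value = lift(transactions, {"milk"}, {"bread"})
```

A value above 1 suggests positive correlation, 1 suggests independence, and below 1 suggests negative correlation.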
Data Fusion - ANS-multi-modal data
Decision Tree Induction - ANS-Basic algorithm: attribute selection, attribute split
Key properties: top-down, recursive (divide-and-conquer, greedy)
deep neural network (DNN) - ANS-Refers to a neural network with multiple hidden layers (e.g.
convolutional neural network)
DENCLUE - ANS-Influence function, overall density
Density-Based Clustering - ANS-Local clusters with high density (e.g. DBSCAN - connected
dense neighborhood, DENCLUE - sum of local influence functions).
Key features: arbitrary cluster shape, noise-tolerant, single scan, adjustable
density parameters
Ensemble - ANS-Combined use of multiple models, Bagging (equal weights, majority
voting, training set is a random sample with replacement), Boosting (weighted votes)
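A minimal sketch of the two bagging ingredients named above (sampling with replacement and equal-weight majority voting), using only the standard library; the helper names and sample data are illustrative, and no actual model is trained here:

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """Return the most common label; every model's vote counts equally."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
data = list(range(10))
sample = bootstrap_sample(data, rng)
winner = majority_vote(["fraud", "ok", "ok"])
```

In bagging, each model would be trained on its own bootstrap sample and the final label taken by `majority_vote`; boosting would instead weight each vote.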
Example of spatial temporal anomaly? - ANS-Remote sensing data