## Data Mining Homework Help | Project | Homework Problem Solution

### Data Mining Homework Help (Sample)

| Attribute | Data-Type | Data-Labels [count value] |
|---|---|---|
| Class | Nominal | 1-NoAuto [6], 2-AutoPilot [9] |
| Stability | Nominal | 1-Stab [12], 2-Xstab [1] |
| Error | Nominal | 1-XL [1], 2-LX [1], 3-MM [7], 4-SS [3] |
| Sign | Nominal | 1-pp [6], 2-nn [1] |
| Wind | Nominal | 1-head [3], 2-tail [4] |
| Magnitude | Nominal | 1-Low [1], 2-Med [3], 3-Strong [3], 4-OutOfRange [4] |
| Visibility | Nominal | 1-Yes [14], 2-No [1] |

### a. Data Pre-processing

### Data attributes Reordering

Best rules found:

ERROR=4 3 ==> Class=2 3 conf:(1)

MAGNITUDE=2 3 ==> Class=2 3 conf:(1)

STABILITY=1 ERROR=4 3 ==> Class=2 3 conf:(1)

STABILITY=1 MAGNITUDE=2 3 ==> Class=2 3 conf:(1)

ERROR=4 SIGN=1 3 ==> Class=2 3 conf:(1)

ERROR=4 WIND=2 3 ==> Class=2 3 conf:(1)

ERROR=4 VISIBILITY=1 3 ==> Class=2 3 conf:(1)

SIGN=1 MAGNITUDE=2 3 ==> Class=2 3 conf:(1)

MAGNITUDE=2 VISIBILITY=1 3 ==> Class=2 3 conf:(1)

STABILITY=1 ERROR=4 SIGN=1 3 ==> Class=2 3 conf:(1)

The rules above show how the given attribute values relate to class value '2' (AutoPilot landing). For example, if Stability=1 (Stab) and Magnitude=2 (Med), then the predicted class is AutoPilot landing.

Taken together, the rules show that occurrences of Magnitude=2 (Med), Stability=1 (Stab), Error=4 (SS), and Sign=1 (pp) lead to the predicted landing value 'AutoPilot'.
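To make the `conf:(1)` figure concrete: in this style of rule output, the number after the left-hand side is the count of instances that match it, the number after the right-hand side is the count that also have the stated class, and confidence is the ratio of the two. The sketch below computes those three values in Python. The function name `rule_stats` and the four records are hypothetical, made up purely for illustration; they are not the dataset used in this sample.

```python
def rule_stats(records, antecedent, consequent):
    """Return (antecedent_count, rule_count, confidence) for antecedent ==> consequent."""
    def matches(record, conditions):
        # A record matches when every attribute=value condition holds.
        return all(record.get(key) == value for key, value in conditions.items())

    covered = [r for r in records if matches(r, antecedent)]
    hits = [r for r in covered if matches(r, consequent)]
    confidence = len(hits) / len(covered) if covered else 0.0
    return len(covered), len(hits), confidence


# Hypothetical toy records (NOT the homework dataset).
records = [
    {"ERROR": 4, "MAGNITUDE": 2, "Class": 2},
    {"ERROR": 4, "MAGNITUDE": 1, "Class": 2},
    {"ERROR": 3, "MAGNITUDE": 4, "Class": 1},
    {"ERROR": 3, "MAGNITUDE": 2, "Class": 2},
]

# Rule "ERROR=4 ==> Class=2" on the toy data.
print(rule_stats(records, {"ERROR": 4}, {"Class": 2}))
```

A rule with confidence 1 is one where every instance matching the left-hand side has the stated class; with counts as small as those in the listing, such rules rest on very few instances.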

Best rules found for Class=1 (NoAuto, manual landing):

STABILITY=2 1 ==> Class=1 1 conf:(1)

ERROR=1 1 ==> Class=1 1 conf:(1)

ERROR=2 1 ==> Class=1 1 conf:(1)

SIGN=2 1 ==> Class=1 1 conf:(1)

MAGNITUDE=4 1 ==> Class=1 1 conf:(1)

VISIBILITY=2 1 ==> Class=2 1 conf:(1)

STABILITY=1 ERROR=1 1 ==> Class=1 1 conf:(1)

STABILITY=1 ERROR=2 1 ==> Class=1 1 conf:(1)

STABILITY=1 SIGN=2 1 ==> Class=1 1 conf:(1)

STABILITY=1 MAGNITUDE=4 1 ==> Class=1 1 conf:(1)

Taken together, these rules show that occurrences of Magnitude=4 (OutOfRange), Stability=1 (Stab), Error=2 (LX), and Sign=2 (nn) lead to the predicted landing value 'NoAuto' (manual landing).

Decision Tree

J48

With a 66% percentage split, J48 gives its highest accuracy, 66%; changing the training and test set sizes decreases the accuracy. An unpruned tree is generated here to visualize how every attribute relates to the class attribute values, as shown in the tree below.

J48 unpruned tree

ERROR = 1: 1 (1.0)

ERROR = 2: 1 (1.0)

ERROR = 3

| MAGNITUDE = 1: 2 (5.0/2.0)

| MAGNITUDE = 2: 2 (2.0)

| MAGNITUDE = 3: 1 (2.0/1.0)

| MAGNITUDE = 4: 1 (1.0)

ERROR = 4: 2 (3.0)

Number of Leaves : 7, Size of the tree : 9

The tree clearly shows that ERROR values '1' (XL) and '2' (LX) are associated with class value '1' (NoAuto, i.e. manual landing). ERROR value '4' (SS), and ERROR value '3' (MM) combined with Magnitude values '1' or '2' (Low, Med), are associated with class '2' (AutoPilot landing).
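The unpruned tree listed above can also be read as a set of nested conditions. The following Python sketch transcribes the branches exactly as printed, as an aid to reading the tree; the function name `j48_predict` is hypothetical, and the code is a hand transcription, not something exported from Weka.

```python
def j48_predict(error, magnitude):
    """Class predicted by the printed unpruned tree: 1 = NoAuto, 2 = AutoPilot."""
    if magnitude not in (1, 2, 3, 4):
        raise ValueError("MAGNITUDE must be 1, 2, 3 or 4")
    if error in (1, 2):
        # ERROR = 1 (XL) or 2 (LX): leaf with class 1
        return 1
    if error == 3:
        # ERROR = 3 (MM): the tree splits further on MAGNITUDE
        return 2 if magnitude in (1, 2) else 1
    if error == 4:
        # ERROR = 4 (SS): leaf with class 2
        return 2
    raise ValueError("ERROR must be 1, 2, 3 or 4")


print(j48_predict(error=3, magnitude=2))  # MM error, Med magnitude
```

Only ERROR and MAGNITUDE appear because they are the only attributes the printed tree tests. Leaves such as `2 (5.0/2.0)` misclassify some of the instances that reach them, which is consistent with the modest accuracy reported.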

Yes, the tree classifies successfully, as required for predicting future landing values from the relation between the given attribute values and the class. Association rule mining gives similar results.

The main challenge is to achieve more accurate predictions: the applied algorithm gives only 66% accuracy, and changing the training and test set sizes decreases the accuracy rather than increasing it. Other classifiers therefore need to be tested to obtain higher accuracy.

Distance metrics are used to find similar data objects, which supports robust algorithms for data mining tasks such as classification and clustering. A distance metric therefore plays a very important role in measuring the similarity among data items. In clustering, the more similar items are placed in one group or cluster; the greater the similarity among the data in a cluster, the greater the chance that a particular item belongs to that cluster. In general, K-means is a heuristic algorithm that partitions a data set into K clusters by minimizing the sum of squared distances within each cluster. Commonly used distance metrics include the Euclidean, Manhattan, Chebyshev, and Minkowski distances.

Clustering minimizes an error criterion that measures the "distance" of each instance to its representative value. The best-known criterion is the sum of squared error (SSE), which measures the total squared Euclidean distance of instances to their representative values. The challenge is to choose clusters that have the lowest SSE.
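The four distance metrics and the SSE criterion can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: points are plain tuples of numbers, each cluster's representative value is taken to be its centroid (mean), and the function names and sample points are hypothetical.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p (p=1 is Manhattan, p=2 is Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean(a, b):
    return minkowski(a, b, 2)


def manhattan(a, b):
    return minkowski(a, b, 1)


def chebyshev(a, b):
    """Largest absolute difference along any single coordinate."""
    return max(abs(x - y) for x, y in zip(a, b))


def sse(clusters):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    total = 0.0
    for points in clusters:
        # Centroid = per-coordinate mean of the points in this cluster.
        centroid = [sum(coords) / len(points) for coords in zip(*points)]
        for point in points:
            total += sum((x - c) ** 2 for x, c in zip(point, centroid))
    return total


# Hypothetical sample points, for illustration only.
a, b = (0, 0), (3, 4)
print(euclidean(a, b), manhattan(a, b), chebyshev(a, b))
print(sse([[(0, 0), (2, 0)], [(5, 5)]]))
```

Comparing `sse` across alternative groupings of the same points is one way to see which grouping is tighter: the grouping with the lower value has instances closer to their representatives.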