## Monday, January 7, 2013

### Naive Bayes Classifier in Java

Introduction

The Naive Bayes approach is a generative, supervised learning method based on a simplifying hypothesis: it assumes that, within a class, the presence of a specific feature is unrelated to the presence of any other feature. This condition of independence between the features of the model is essential to proper classification.

Mathematically, Bayes' theorem gives the relationship between the probabilities of A and B, p(A) and p(B), and the conditional probabilities of A given B and B given A, p(A|B) and p(B|A).

In its most common form, Bayes' formula is defined for a proposition (or class) A and evidence (or observation) B as $p(A|B)= \frac{p(B|A)\,p(A)}{p(B)}$

- p(A), the prior, is the initial degree of belief in A.
- p(A|B), the posterior, is the degree of belief in A after accounting for B.
- p(B|A)/p(B) represents the support B provides for A.

The case above can be extended to a network of cause-effect conditional probabilities p(X|Y).
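
As a quick numerical illustration of Bayes' formula, the following sketch computes a posterior from a prior, a likelihood and the evidence. The class name `BayesRule`, its methods and the probabilities used in `main` are hypothetical and chosen only for illustration; they are not part of the classifier described in this post.

```java
// Hypothetical helper used only to illustrate Bayes' formula.
final class BayesRule {

    // p(A|B) = p(B|A) * p(A) / p(B)
    static double posterior(double prior, double likelihood, double evidence) {
        if (evidence <= 0.0) {
            throw new IllegalArgumentException("p(B) must be positive");
        }
        return likelihood * prior / evidence;
    }

    // p(B) expanded over A and not-A (law of total probability)
    static double evidence(double prior, double likelihoodGivenA, double likelihoodGivenNotA) {
        return likelihoodGivenA * prior + likelihoodGivenNotA * (1.0 - prior);
    }

    public static void main(String[] args) {
        double prior = 0.01;        // assumed p(A)
        double pBGivenA = 0.9;      // assumed p(B|A)
        double pBGivenNotA = 0.05;  // assumed p(B|not A)

        double pB = evidence(prior, pBGivenA, pBGivenNotA);
        double pAGivenB = posterior(prior, pBGivenA, pB);
        System.out.printf("p(B) = %.4f, p(A|B) = %.4f%n", pB, pAGivenB);
    }
}
```

With these made-up numbers, a weak prior of 1% is raised to roughly 15% once the evidence B is taken into account.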

When the features of the model are known to be independent, the probability that an observation $\vec{x} = (\ldots, x_{i}, \ldots)$ belongs to a class C is computed as $p(C|\vec{x})=\frac{p(C)\prod_{i} p(x_{i}|C)}{p(\vec{x})}$. It is usually more convenient to work with the logarithm of this probability when assigning a new observation to a class: $\log p(C|\vec{x}) = \sum_{i} \log p(x_{i}|C) + \log p(C) - \log p(\vec{x})$. The term $\log p(\vec{x})$ is the same for every class, so it does not affect which class scores highest.
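
The logarithmic form can be sketched in a few lines of Java. The example below is a minimal illustration, not the implementation presented later in this post: it assumes Gaussian class-conditional densities for p(x_i|C), and the class name `LogPosteriorSketch`, its methods and the numbers in `main` are all hypothetical.

```java
// Sketch only: assumes Gaussian class-conditional densities p(x_i|C).
final class LogPosteriorSketch {

    // Log of the Gaussian density N(x; mean, variance)
    static double logGaussian(double x, double mean, double variance) {
        double diff = x - mean;
        return -0.5 * Math.log(2.0 * Math.PI * variance) - diff * diff / (2.0 * variance);
    }

    // sum_i log p(x_i|C) + log p(C); log p(x) is left out because it is
    // identical for every class and does not change the ranking
    static double logScore(double[] x, double[] means, double[] variances, double classProb) {
        double score = Math.log(classProb);
        for (int i = 0; i < x.length; i++) {
            score += logGaussian(x[i], means[i], variances[i]);
        }
        return score;
    }

    // Index of the class with the highest log score
    static int argMax(double[] x, double[][] means, double[][] variances, double[] classProbs) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < classProbs.length; c++) {
            double score = logScore(x, means[c], variances[c], classProbs[c]);
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] means = { {0.0, 0.0}, {3.0, 3.0} };      // made-up class means
        double[][] variances = { {1.0, 1.0}, {1.0, 1.0} };  // made-up class variances
        double[] classProbs = { 0.5, 0.5 };                 // made-up p(C)
        System.out.println(argMax(new double[] {2.6, 3.1}, means, variances, classProbs));
    }
}
```

Summing logarithms instead of multiplying probabilities also avoids numerical underflow when the number of features is large.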

Note: For the sake of readability, all non-essential code such as error checking, comments, exceptions, validation of class and method arguments, scoping qualifiers and imports is omitted.

Software design

The class in the example below implements a basic version of the Naive Bayes algorithm. The model and its features are defined by the nested class NClass. This model class holds the feature parameters (mean and variance of the prior observations) and the class probability p(C). The computation of the means and variances of the prior observations is implemented in the NClass.computeStats method. Some of the methods, setters, getters, comments and conditional tests on arguments are omitted for the sake of clarity. The kernel function is selected at run-time. This implementation supports any number of features and classes.

```java
public final class NaiveBayes implements Classifier {

  public final class NClass {
    private double[] _params = null;          // per-feature sums, then means
    private double[] _paramsVariance = null;  // per-feature sums of squares, then variances
    private double _classProb = 0.0;          // observation count, then p(C)

    public NClass(int numParams) {
      _params = new double[numParams];
    }

    // Accumulates the sum and the sum of squares of each feature.
    private void add(double[] data) {
      int numObservations = 0;
      _paramsVariance = new double[_params.length];

      for (int j = 0; j < data.length; ) {
        j++;  // the first entry of each record is skipped (it is not a feature value)
        for (int k = 0; k < _params.length; k++, j++) {
          _params[k] += data[j];
          _paramsVariance[k] += data[j]*data[j];
        }
        numObservations++;
      }
      _classProb = numObservations;
    }

    // Converts the accumulated sums into means, variances and p(C).
    private void computeStats() {
      double inv = 1.0/_classProb;  // _classProb still holds the observation count
      for (int k = 0; k < _params.length; k++) {
        _params[k] *= inv;  // mean
        // variance = E[x^2] - mean^2
        _paramsVariance[k] = _paramsVariance[k]*inv - _params[k]*_params[k];
      }
      _classProb /= _numObservations;
    }
  }
}
```
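
A common way to obtain the mean and the variance in a single pass is to accumulate the sum and the sum of squares of the observations, then derive the population variance as E[x²] - mean². The sketch below checks this on a small, made-up sample; the class name `RunningStats` and its methods are hypothetical and independent of the NClass code.

```java
// Hypothetical single-pass accumulator for one feature.
final class RunningStats {
    private double sum = 0.0;
    private double sumSquares = 0.0;
    private int count = 0;

    void add(double x) {
        sum += x;
        sumSquares += x * x;
        count++;
    }

    double mean() {
        return sum / count;
    }

    // Population variance: E[x^2] - (E[x])^2
    double variance() {
        double m = mean();
        return sumSquares / count - m * m;
    }

    public static void main(String[] args) {
        RunningStats stats = new RunningStats();
        for (double x : new double[] {2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0}) {
            stats.add(x);
        }
        // prints "mean = 5.0, variance = 4.0"
        System.out.println("mean = " + stats.mean() + ", variance = " + stats.variance());
    }
}
```

This formula is simple but can lose precision when the mean is large compared to the spread of the data; it is used here only because it matches the sum and sum-of-squares bookkeeping.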

Kernel functions can be used to improve the classification of observations by increasing the distance between the prior observations that belong to different classes during the training phase. In the case of two classes C1 and C2 (Bernoulli classification), the kernel increases the distance between the mean values m1 and m2 of the prior observations of the two classes, adjusted for the variance.

As Java (as of version 7) does not support local functions or closures, we need to create a class hierarchy to implement the different kernel (discriminant) functions. The example below defines a simple linear kernel and a logistic (sigmoid) kernel, both implemented as nested classes: $y = \theta x$ and $y =\frac{1}{1+e^{-x}}$

```java
public interface Discriminant {
  public double estimate(double value);
}

// Nested class that implements a linear discriminant
public static class DiscriminantKernel implements Discriminant {
  private double _theta = 1.0;

  public DiscriminantKernel(double theta) {
    _theta = theta;
  }

  public double estimate(double value) {
    return value*_theta;
  }
}

// Nested class that implements a sigmoid function as kernel
public static class SigmoidKernel implements Discriminant {
  public double estimate(double value) {
    return 1.0/(1.0 + Math.exp(-value));
  }
}
```
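
Since Java 8, an interface with a single abstract method can also be implemented with a lambda expression, which removes the need for one class per kernel. The sketch below redeclares an interface of the same shape as Discriminant so that it is self-contained; the class name `DiscriminantDemo` and the factory methods `linear` and `sigmoid` are hypothetical.

```java
// Sketch only: lambda-based kernels, requires Java 8 or later.
final class DiscriminantDemo {

    // Same shape as the Discriminant interface, redeclared to keep the sketch self-contained
    interface Discriminant {
        double estimate(double value);
    }

    // Linear kernel y = theta * x
    static Discriminant linear(double theta) {
        return value -> theta * value;
    }

    // Logistic (sigmoid) kernel y = 1 / (1 + exp(-x))
    static Discriminant sigmoid() {
        return value -> 1.0 / (1.0 + Math.exp(-value));
    }

    public static void main(String[] args) {
        System.out.println(linear(2.0).estimate(1.5));  // prints 3.0
        System.out.println(sigmoid().estimate(0.0));    // prints 0.5
    }
}
```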

Ultimately, the NaiveBayes class implements the key components of the learning algorithm:
• Training: train
• Run-time classification: classify

A new observation is classified using the logarithmic version of the Naive Bayes formula, logP.
First, let's define the NaiveBayes class and its constructors.

```java
public final class NaiveBayes implements Classifier {

  public final class NClass { /* defined above */ }

  private CDoubleArray[] _valuesArray = null;  // historical data, one array per class
  private NClass[] _classes = null;
  private int _numObservations = 0;
  private int _step = 1;
  private Discriminant _kF = null;             // kernel function

  public NaiveBayes() {
    this(0, 0);
  }

  // Uses a linear kernel with theta = 1.0 by default
  public NaiveBayes(int numParams, int numClasses) {
    this(numParams, numClasses, new DiscriminantKernel(1.0));
  }

  public NaiveBayes(int numParams, int numClasses, final Discriminant kf) {
    _classes = new NClass[numClasses];
    _valuesArray = new CDoubleArray[numClasses];

    for (int k = 0; k < numClasses; k++) {
      _classes[k] = new NClass(numParams);
      _valuesArray[k] = new CDoubleArray();
    }
    _kF = kf;
    this.discretize(0, numClasses);
  }
  // ...
}
```

Training

Next, the training method train is defined. It merely computes the statistics on the historical data, _valuesArray, and assigns them to the predefined classes, _classes.

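```java
public int train() throws ClassifierException {
  double[] values = null;

  // Accumulate the historical observations of each class
  for (int j = 0; j < _valuesArray.length; j++) {
    values = _valuesArray[j].currentValues();
    _classes[j].add(values);
  }

  // Convert the accumulated sums into means, variances and p(C)
  for (int j = 0; j < _classes.length; j++) {
    _classes[j].computeStats();
  }
  // Guard against a model without any class
  return (values != null) ? values.length : 0;
}
```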

Classification

The run-time classification method classify uses the prior conditional probability to assign a new observation to an existing class. It generates the class id for a set of values or observations.

```java
public int classify(double[] values) {
  // Compute the normalizing denominator of each feature: the largest
  // distance between the observed value and the class means
  double[] normalizedPriorProb = new double[values.length];
  double prob = 0.0;

  for (int valueIndex = 0; valueIndex < values.length; valueIndex++) {
    for (int classid = 0; classid < _classes.length; classid++) {
      prob = Math.abs(values[valueIndex] - _classes[classid]._params[valueIndex]);
      if (prob > normalizedPriorProb[valueIndex]) {
        normalizedPriorProb[valueIndex] = prob;
      }
    }
  }
  return logP(values, normalizedPriorProb);
}
```

A new observation, values, is assigned to the appropriate class according to its likelihood, or log of conditional probability, by the method logP.
logP computes the likelihood for each value and applies the logarithmic form of the Naive Bayes formula, summing the logs of the prior probabilities and the log of the class probability.

```java
private int logP(double[] values, double[] denominator) {
  double score = 0.0,
         adjustedValue = 0.0,
         prior = 0.0,
         bestScore = -Double.MAX_VALUE;
  int bestMatchedClass = -1;

  // Walk through all the classes defined in the model
  for (int classid = 0; classid < _classes.length; classid++) {
    double[] classParameters = _classes[classid]._params;
    score = 0.0;

    for (int k = 0; k < values.length; k++) {
      // Apply the kernel, then normalize the distance to the class mean
      adjustedValue = _kF.estimate(values[k]);
      prior = Math.abs(adjustedValue - classParameters[k])/denominator[k];
      score += Math.log(1.0 - prior);
    }
    // Add the log of the class probability p(C)
    score += Math.log(_classes[classid]._classProb);

    // Keep the class with the highest score
    if (score > bestScore) {
      bestScore = score;
      bestMatchedClass = classid;
    }
  }
  return bestMatchedClass;
}
```

Some of the ancillary private methods are omitted for the sake of clarity. We will look at the implementation of the same classifier in Scala in a later post.


Labels: Bayesian Network Java Barcode Kernel method Machine learning Naive Bayes