machine learning - How to use a different dataset with scikit-learn and NLTK?


I am trying to run the built-in Naive Bayes classifiers of scikit-learn and NLTK on my own raw data. The data is a set of tab-separated rows, each containing a label, a paragraph, and some other attributes. I am interested in classifying the paragraphs, so I need to convert the data into a format suitable for the built-in classifiers of scikit-learn/NLTK. I want to apply Gaussian, Bernoulli, and multinomial Naive Bayes to the paragraphs.

Question 1: For scikit-learn, the example given imports the iris data. I checked the iris data, and it consists of precomputed numeric values for the dataset. How can I convert my data into such a format and then directly call the Gaussian function? Is there a standard way of doing so?
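For reference, a minimal sketch of what such a conversion might look like with scikit-learn's CountVectorizer (the file name data.tsv and the column order are assumptions based on the description above):

    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import GaussianNB, MultinomialNB

    labels, paragraphs = [], []
    with open("data.tsv", encoding="utf-8") as f:   # tab-separated: label, paragraph, ...
        for line in f:
            fields = line.rstrip("\n").split("\t")
            labels.append(fields[0])
            paragraphs.append(fields[1])

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(paragraphs)        # sparse document-term count matrix
    y = np.array(labels)

    MultinomialNB().fit(X, y)                       # accepts the sparse counts directly
    GaussianNB().fit(X.toarray(), y)                # GaussianNB needs a dense array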
Question 2: For NLTK, what should the input to the NaiveBayesClassifier.classify function be? A dict with boolean values? How can it be made multinomial or Gaussian?

Regarding question 2:

nltk.NaiveBayesClassifier.classify expects a so-called 'featureset'. A featureset is a dictionary with feature names as keys and feature values as values, e.g. {'word1': True, 'word2': True, 'word3': False}. NLTK's Naive Bayes classifier cannot be used as a multinomial approach. However, you can install scikit-learn and use the nltk.classify.scikitlearn wrapper module to deploy scikit-learn's multinomial classifier.
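A minimal sketch of both points (the toy training data is made up purely for illustration):

    from nltk.classify import NaiveBayesClassifier
    from nltk.classify.scikitlearn import SklearnClassifier
    from sklearn.naive_bayes import MultinomialNB

    def featureset(paragraph):
        # boolean bag-of-words featureset, as NLTK expects: {'word': True, ...}
        return {word: True for word in paragraph.lower().split()}

    train_paragraphs = [                            # made-up toy data
        ("the team won the match last night", "sports"),
        ("the stock market fell sharply today", "finance"),
        ("the striker scored two more goals", "sports"),
        ("the bank raised interest rates again", "finance"),
    ]
    train_data = [(featureset(text), label) for text, label in train_paragraphs]

    # NLTK's own Naive Bayes on the boolean featuresets
    nb = NaiveBayesClassifier.train(train_data)
    print(nb.classify(featureset("rates and markets fell")))

    # multinomial Naive Bayes through the scikit-learn wrapper
    multinomial = SklearnClassifier(MultinomialNB()).train(train_data)
    print(multinomial.classify(featureset("two goals won the match")))

The wrapper vectorizes the featuresets internally, so the same dict-based features feed both the NLTK classifier and the scikit-learn estimator.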

