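I'm confused about why one would use cross_val_score.
As I understand it, cross_val_score tells me whether my model is overfitting or underfitting, and it does not train my model.
I only have one feature, a TF-IDF representation (a sparse matrix). If the fit turns out to be poor, I don't know what to do about it.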
Q1: Did I use it in the wrong order? I've seen both 'cross->fit' and 'fit->cross' examples.
Q2: What do the scores in '#print1' tell me? Does it mean I have to train my model k times (with the same training set), where k is the number of folds that gives the best score?
My code right now:
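from sklearn.metrics import classification_report
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# X_train / X_test are sparse TF-IDF matrices; GaussianNB needs dense arrays
model1 = GaussianNB(priors=None)
scores = cross_val_score(model1, X_train.toarray(), y_train, cv=3, scoring='accuracy')
# print1
print(scores.mean())
model1.fit(X_train.toarray(), y_train)
predictions1 = model1.predict(X_test.toarray())  # held-out data
# print2
print(classification_report(y_test, predictions1))  # y_true first, then y_pred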
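
To make my understanding concrete, here is a minimal sketch on synthetic data (not my real TF-IDF data). The names X_demo, y_demo and demo_model are made up for illustration. As far as I can tell, cross_val_score fits clones of the estimator, returns one score per fold, and leaves the object I passed in unfitted, so a separate fit call is still needed before predict:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Assumption: small dense toy data standing in for the TF-IDF features.
X_demo, y_demo = make_classification(n_samples=150, n_features=5, random_state=0)

demo_model = GaussianNB()

# With cv=3 the result holds three accuracy values, one per held-out fold.
fold_scores = cross_val_score(demo_model, X_demo, y_demo, cv=3, scoring="accuracy")
print(fold_scores.shape)

# The estimator passed in was only cloned, so it has no fitted attributes yet.
fitted_after_cv = hasattr(demo_model, "classes_")
print(fitted_after_cv)

# Only an explicit fit on the training data makes it usable for predict.
demo_model.fit(X_demo, y_demo)
fitted_after_fit = hasattr(demo_model, "classes_")
print(fitted_after_fit)
```

If that is right, the per-fold scores are only an estimate of how the model generalizes, and the single fit call afterwards is what produces the model used on the held-out test set.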