machine learning - Mallet CRF SimpleTagger Performance Tuning


A question for anyone who has used the SimpleTagger class of the Java library Mallet for conditional random fields (CRFs). Assume that I am already using the multi-thread option with the maximum number of CPUs available (this is the case): where do I start, and what should I try, if I need it to run faster?

A related question: is there a way to do something similar to stochastic gradient descent, which would speed up the training process?

The type of training that I want to do is simple:

  Input:     Feature1 ... FeatureN SequenceLabel ...
  Test data: Feature1 ... FeatureN ...
  Output:    Feature1 ... FeatureN SequenceLabel ...

(Where the features are the output of processing I have done on the data in my own code.)
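To make the layout above concrete, here is a minimal sketch of reading data in that shape. It assumes the usual SimpleTagger convention of one token per line, whitespace-separated features with the label as the last item, and a blank line between sequences. The function name `parse_sequences` and the feature names are made up for illustration.

```python
def parse_sequences(text, labeled=True):
    """Split whitespace-separated feature lines into sequences.

    Blank lines separate sequences. When labeled is True, the last
    token on each line is treated as the sequence label.
    """
    sequences, current = [], []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            # A blank line ends the current sequence.
            if current:
                sequences.append(current)
                current = []
            continue
        if labeled:
            current.append((tokens[:-1], tokens[-1]))
        else:
            current.append((tokens, None))
    if current:
        sequences.append(current)
    return sequences


# Hypothetical training data: two sequences, label in the last column.
train_text = """\
CAPITALIZED SUFFIX_ill PERSON
LOWERCASE SUFFIX_ks O

LOWERCASE O
"""
train = parse_sequences(train_text)
```

Test data would be read the same way with `labeled=False`, since it has no label column.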

I have had problems getting any CRF classifier other than Mallet to even approximately work, but I may have to backtrack and revisit one of the other implementations, or try a new one.

Yes, stochastic gradient descent (SGD) is usually much faster than the L-BFGS optimizer that Mallet uses. I suggest you try CRFSuite, which you can train with either SGD or L-BFGS. There are also other SGD-based CRF implementations you could try, but they are more difficult to set up.
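As a rough illustration of why SGD tends to make faster progress per pass over the data, here is a toy sketch. It is not a CRF and not L-BFGS: it uses a one-feature logistic regression, with plain full-batch gradient descent standing in for the batch optimizer. All names and the data are made up for illustration. The point is only that SGD updates the parameters after every example, while the batch method updates once per pass.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def batch_epoch(w, b, data, lr):
    """One full-batch step: average the gradient over all examples, update once."""
    gw = gb = 0.0
    for x, y in data:
        err = sigmoid(w * x + b) - y
        gw += err * x
        gb += err
    n = len(data)
    return w - lr * gw / n, b - lr * gb / n


def sgd_epoch(w, b, data, lr):
    """One SGD pass: update the parameters after every single example."""
    for x, y in data:
        err = sigmoid(w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return w, b


def log_loss(w, b, data):
    """Mean negative log-likelihood of the data under the model."""
    total = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)


# Toy data: negative x is class 0, positive x is class 1.
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

w_batch, b_batch = 0.0, 0.0
w_sgd, b_sgd = 0.0, 0.0
for _ in range(5):  # five passes over the data for both methods
    w_batch, b_batch = batch_epoch(w_batch, b_batch, data, 0.5)
    w_sgd, b_sgd = sgd_epoch(w_sgd, b_sgd, data, 0.5)

batch_loss = log_loss(w_batch, b_batch, data)
sgd_loss = log_loss(w_sgd, b_sgd, data)
print(f"batch loss: {batch_loss:.3f}, SGD loss: {sgd_loss:.3f}")
```

After the same number of passes, the SGD run reaches a lower loss on this toy problem because it has taken four updates per pass instead of one. Real CRF training behaves differently in the details, so treat this only as intuition.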

Otherwise, I believe CRF++ is the most widely used CRF software. It is based on L-BFGS, though, so it may not be fast enough for you.

Both CRFSuite and CRF++ should be easy to get started with.

Note that all of these will be slow if you have a large number of labels. CRFSuite, at least, can be configured to take into account only the label n-grams observed in the training data (in an (n-1)th-order model), which will generally make training and prediction much faster.
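To see why restricting the model to observed label n-grams helps, the sketch below counts label bigrams (the first-order case) in a small, made-up training set and compares that with the number of all possible label pairs. This is not CRFSuite's API, just an illustration of how many transitions a model would need to consider in each case. The labels and the helper name `observed_label_bigrams` are hypothetical.

```python
from itertools import product


def observed_label_bigrams(label_sequences):
    """Collect the set of adjacent label pairs seen in the training data."""
    seen = set()
    for labels in label_sequences:
        seen.update(zip(labels, labels[1:]))
    return seen


# Hypothetical label sequences from a small tagging task.
label_sequences = [
    ["B-PER", "I-PER", "O", "O"],
    ["O", "B-LOC", "O"],
    ["B-PER", "O", "B-LOC", "I-LOC"],
]

label_set = sorted({label for seq in label_sequences for label in seq})
all_bigrams = set(product(label_set, repeat=2))  # every possible transition
seen_bigrams = observed_label_bigrams(label_sequences)

print(f"{len(label_set)} labels, {len(all_bigrams)} possible transitions, "
      f"{len(seen_bigrams)} observed")
```

The number of possible transitions grows with the square of the number of labels, while the number observed in real data is often far smaller, which is where the savings come from.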
