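References, with thanks to those who explored this ground first
Doc; 深入浅出 Stanford NLP(深入篇) ("Stanford NLP Explained Simply, Advanced Part"); Download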
Get the CoreNLP source from git and download the Chinese models jar. Extract the jar into src, then import build.xml with Ant in Eclipse: File > New > Other > Java > Java Project from Existing Ant Buildfile
On startup, the server kept failing to find StanfordCoreNLP-chinese.properties
In the run Configuration, set the program argument: -serverProperties edu/stanford/nlp/pipeline/StanfordCoreNLP-chinese.properties
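To see why the file is not found, it can help to check where the path given to -serverProperties actually resolves. The sketch below is not CoreNLP code: the class name PropertiesLookupCheck and the method locate are made up for illustration. It assumes the lookup order suggested by the name findStreamInClasspathOrFileSystem, that is, classpath first and file system second.

```java
import java.io.File;
import java.net.URL;

/** Hypothetical diagnostic helper, not part of CoreNLP. */
class PropertiesLookupCheck {

  /** Reports whether a path resolves on the classpath, on disk, or not at all. */
  static String locate(String path) {
    // Classpath lookup: the path is relative to a classpath root, without a leading "/".
    URL onClasspath = PropertiesLookupCheck.class.getClassLoader().getResource(path);
    if (onClasspath != null) {
      return "classpath: " + onClasspath;
    }
    // File system lookup: relative paths are resolved against the working directory.
    File onDisk = new File(path);
    if (onDisk.isFile()) {
      return "file system: " + onDisk.getAbsolutePath();
    }
    return "not found";
  }

  public static void main(String[] args) {
    String path = args.length > 0
        ? args[0]
        : "edu/stanford/nlp/pipeline/StanfordCoreNLP-chinese.properties";
    System.out.println(path + " -> " + locate(path));
  }
}
```

Run it with the same classpath and working directory as the Eclipse launch configuration. If it prints "not found", the extracted models are probably not on the classpath that the server sees.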
Modify getInputStreamFromURLOrClasspathOrFileSystem at line 472 of IOUtils.java:
    try {
      String base = IOUtils.class.getResource("/").getPath();
      if (textFileOrUrl.indexOf("/") != 0) {
        // resource root directory + location of the file to look up
        textFileOrUrl = base + textFileOrUrl;
      }
      in = findStreamInClasspathOrFileSystem(textFileOrUrl);
    }
Debugging then showed that the models named in the properties file were still not being loaded, so line 78 of StanfordCoreNLPServer.java was modified as well.
With this change the models are loaded when the server starts:
    protected static String preloadedAnnotators = "tokenize, ssplit, pos, lemma, ner, depparse, coref, natlog, openie";
Run StanfordCoreNLPServer.java from edu.stanford.nlp.pipeline; the server is then available at localhost:9000
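Once the server is listening, a quick way to confirm that it answers is to POST some text to it. The following is a minimal sketch, not an official client: the class name CoreNlpClientSketch, the sample sentence and the choice of annotators are illustrative assumptions. It relies on the server's documented convention of passing a JSON properties object in the properties query parameter and the raw text in the request body.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical client sketch for a locally running CoreNLP server. */
class CoreNlpClientSketch {

  /** Builds the request URL with a URL-encoded JSON properties object. */
  static String buildUrl(String host, int port, String annotators) {
    String props = "{\"annotators\":\"" + annotators + "\",\"outputFormat\":\"json\"}";
    try {
      return "http://" + host + ":" + port + "/?properties=" + URLEncoder.encode(props, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalStateException("UTF-8 is always supported", e);
    }
  }

  /** Sends the text as the POST body and returns the raw response. */
  static String annotate(String url, String text) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
    conn.setRequestMethod("POST");
    conn.setDoOutput(true);
    try (OutputStream out = conn.getOutputStream()) {
      out.write(text.getBytes(StandardCharsets.UTF_8));
    }
    try (InputStream in = conn.getInputStream()) {
      ByteArrayOutputStream buf = new ByteArrayOutputStream();
      byte[] chunk = new byte[4096];
      int n;
      while ((n = in.read(chunk)) != -1) {
        buf.write(chunk, 0, n);
      }
      return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    } finally {
      conn.disconnect();
    }
  }

  public static void main(String[] args) throws IOException {
    // Port 9000 matches the address in the notes; the annotator list is only an example.
    String url = buildUrl("localhost", 9000, "tokenize,ssplit,pos");
    System.out.println(annotate(url, "斯坦福大学位于加利福尼亚州。"));
  }
}
```

If the server is up and the Chinese models were loaded, the response should be a JSON document. A connection error or an HTTP error status points back to the startup problems described above.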
Adding custom words and regenerating the dictionary
Download the Chinese word segmenter and extract it
Add the dictionary
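The my_dict.txt file used in the command that follows has to exist first. The sketch below shows one way to generate it, assuming the plain-text dictionary format is UTF-8 with one word per line. That format is an assumption here and is worth checking against the segmenter's own documentation. The class name CustomDictSketch and the sample words are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper that writes a custom word list, not part of the segmenter. */
class CustomDictSketch {

  /** Trims entries and drops blanks and duplicates, keeping first-seen order. */
  static List<String> clean(List<String> words) {
    Set<String> seen = new LinkedHashSet<>();
    for (String w : words) {
      String t = w.trim();
      if (!t.isEmpty()) {
        seen.add(t);
      }
    }
    return new ArrayList<>(seen);
  }

  /** Writes one word per line as UTF-8 (assumed dictionary format). */
  static void write(Path target, List<String> words) throws IOException {
    Files.write(target, clean(words), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    // Sample entries only; replace with the domain terms to be added.
    List<String> words = Arrays.asList("深度学习", "自然语言处理", " 深度学习 ");
    write(Paths.get("my_dict.txt"), words);
  }
}
```

Writing the file explicitly as UTF-8 avoids depending on the platform default encoding, which is a common source of garbled Chinese entries on Windows.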
java -cp "*" -mx1g edu.stanford.nlp.wordseg.ChineseDictionary -inputDicts my_dict.txt,dict-chris6.ser.gz -output new_dict.ser.gz
Compile the code with this command: cd CoreNLP ; ant
Then run this command to build a jar with the latest version of the code: cd CoreNLP/classes ; jar -cf ../stanford-corenlp.jar edu
java -Xmx8g -cp "*" -Djava.io.tmpdir=/your/path/tmp edu.stanford.nlp.pipeline.StanfordCoreNLPServer -serverProperties StanfordCoreNLP-chinese.properties -port 6668