WordNet Research 6: JWI (Java Wordnet Interface), the Java Interface to WordNet


 JWI (the MIT Java Wordnet Interface) is a Java library for interfacing with Wordnet. JWI supports access to Wordnet versions 1.6 through 3.0, among other related Wordnet extensions. Wordnet is a freely and publicly available semantic dictionary of English, developed at Princeton University.

JWI is written for Java 1.5.0 and has the package namespace edu.mit.jwi. The distribution does not include the Wordnet dictionary files; these can be downloaded from the Wordnet download site. This version of software is distributed under a license that makes it free to use for all purposes, as long as proper copyright acknowledgement is made.

The javadoc API is posted online for your convenience. So is the version changelog. If you find JWI useful, have found a bug, or would like to request a new feature, please contact me.

Description: Link
Binary Files Only: edu.mit.jwi_2.2.2.jar (143 kb)
User's Manual: edu.mit.jwi_2.2.2_manual.pdf (276 kb)
Source Only: edu.mit.jwi_2.2.2_src.zip (143 kb)
Javadocs: edu.mit.jwi_2.2.2_javadoc.zip (617 kb) | online
Development Kit (binaries and source): edu.mit.jwi_2.2.2_jdk.jar (273 kb)
All-in-One (jdk, javadocs, manual): edu.mit.jwi_2.2.2_all.zip (1,090 kb)


JWI is a project led by Mark Alan Finlayson at the MIT Computer Science and Artificial Intelligence Laboratory. It is a Java API for accessing WordNet.






      WNHOME = "E:\Commonly Application\WordNet\2.1";
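The examples in this post build the dictionary path from the WNHOME environment variable. A minimal sketch of that step in plain Java follows, with a check for the case where the variable is not set. The class name WordNetHome and the method dictDir are hypothetical helpers for illustration and are not part of JWI.

```java
import java.io.File;

class WordNetHome {

    /** Returns the "dict" directory under the given WordNet home; fails fast if home is missing. */
    static File dictDir(String wnhome) {
        if (wnhome == null || wnhome.trim().length() == 0) {
            throw new IllegalStateException("WNHOME is not set");
        }
        return new File(wnhome, "dict");
    }

    public static void main(String[] args) {
        try {
            File dir = dictDir(System.getenv("WNHOME"));
            System.out.println(dir + " is a directory: " + dir.isDirectory());
        } catch (IllegalStateException e) {
            // report the problem instead of failing later with a confusing path such as "null\dict"
            System.out.println(e.getMessage());
        }
    }
}
```

Checking the variable up front avoids building a path that starts with the literal text "null" when WNHOME is undefined.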






import java.io.File;
import java.io.IOException;
import java.net.URL;

import edu.mit.jwi.Dictionary;
import edu.mit.jwi.IDictionary;
import edu.mit.jwi.item.IIndexWord;
import edu.mit.jwi.item.IWord;
import edu.mit.jwi.item.IWordID;
import edu.mit.jwi.item.POS;

public class test {

    public static void main(String[] args) throws IOException {
        testDictionary();
    }

    public static void testDictionary() throws IOException {
        // construct the URL to the Wordnet dictionary directory
        String wnhome = System.getenv("WNHOME"); // read the WNHOME environment variable
        String path = wnhome + File.separator + "dict";
        URL url = new URL("file", null, path); // URL pointing at WordNet's dict directory

        // construct the dictionary object and open it
        IDictionary dict = new Dictionary(url);
        dict.open(); // open the dictionary

        // look up first sense of the word "dog"
        IIndexWord idxWord = dict.getIndexWord("dog", POS.NOUN); // index word for (dog, noun)
        IWordID wordID = idxWord.getWordIDs().get(0); // ID of the first sense of "dog"
        IWord word = dict.getWord(wordID); // the word itself
        System.out.println("Id = " + wordID);
        System.out.println(" Lemma = " + word.getLemma());
        System.out.println(" Gloss = " + word.getSynset().getGloss());
    }
}

Output:

Id = WID-02064081-N-??-dog
 Lemma = dog
 Gloss = a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds; "the dog barked all night"
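The ID printed above packs several fields into one string. As a rough illustration, assuming the layout is WID-offset-POS-wordnumber-lemma with ?? standing for an unknown word number, the sketch below splits such a string in plain Java. The class WordIdParts is hypothetical and is not part of JWI; real code should rely on JWI's own ID classes.

```java
class WordIdParts {
    final int offset;         // synset offset, e.g. 2064081
    final char pos;           // part-of-speech tag, e.g. 'N'
    final String wordNumber;  // position of the word in its synset, or "??" if unknown
    final String lemma;       // the word form, e.g. "dog"

    WordIdParts(int offset, char pos, String wordNumber, String lemma) {
        this.offset = offset;
        this.pos = pos;
        this.wordNumber = wordNumber;
        this.lemma = lemma;
    }

    /** Splits an ID string under the assumed layout; the lemma may itself contain hyphens. */
    static WordIdParts parse(String id) {
        String[] p = id.split("-", 5); // limit 5 keeps hyphens inside the lemma
        if (p.length != 5 || !p[0].equals("WID") || p[2].length() != 1) {
            throw new IllegalArgumentException("not a word ID: " + id);
        }
        return new WordIdParts(Integer.parseInt(p[1]), p[2].charAt(0), p[3], p[4]);
    }

    public static void main(String[] args) {
        WordIdParts parts = parse("WID-02064081-N-??-dog");
        System.out.println(parts.offset + " " + parts.pos + " " + parts.wordNumber + " " + parts.lemma);
    }
}
```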





















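import java.io.File;
import java.io.IOException;
import java.util.Iterator;

import edu.mit.jwi.IDictionary;
import edu.mit.jwi.IRAMDictionary;
import edu.mit.jwi.RAMDictionary;
import edu.mit.jwi.data.ILoadPolicy;
import edu.mit.jwi.item.IIndexWord;
import edu.mit.jwi.item.IWordID;
import edu.mit.jwi.item.POS;

public class RAMDictionaryTest {

    public static void main(String[] args) throws IOException, InterruptedException {
        String wnhome = System.getenv("WNHOME"); // read the WNHOME environment variable
        String path = wnhome + File.separator + "dict";
        File wnDir = new File(path);
        testRAMDictionary(wnDir);
    }

    public static void testRAMDictionary(File wnDir) throws IOException, InterruptedException {
        // construct the dictionary without loading it into memory, and open it
        IRAMDictionary dict = new RAMDictionary(wnDir, ILoadPolicy.NO_LOAD);
        dict.open();

        // traverse once while the data is still read from disk
        trek(dict);

        // now load into memory
        System.out.print("\nLoading Wordnet into memory...");
        long t = System.currentTimeMillis();
        dict.load(true); // block until loading has finished
        System.out.printf("Load time: done (%1d msec)\n", System.currentTimeMillis() - t);

        // traverse again, this time from memory
        trek(dict);
    }

    /** Treks around WordNet, visiting every sense of every index word. */
    public static void trek(IDictionary dict) {
        int tickNext = 0;
        int tickSize = 20000; // words per progress dot (value assumed)
        int seen = 0;
        System.out.print("Treking across Wordnet");
        long t = System.currentTimeMillis();
        for (POS pos : POS.values()) { // every part of speech
            for (Iterator<IIndexWord> i = dict.getIndexWordIterator(pos); i.hasNext();) {
                for (IWordID wid : i.next().getWordIDs()) {
                    // count the words in this sense's synset
                    seen += dict.getWord(wid).getSynset().getWords().size();
                    if (seen > tickNext) {
                        System.out.print('.');
                        tickNext = seen + tickSize;
                    }
                }
            }
        }
        System.out.printf("done (%1d msec)\n", System.currentTimeMillis() - t);
        System.out.println("In my trek I saw " + seen + " words");
    }
}

Output:

Treking across Wordnet...........................done (3765 msec)
In my trek I saw 523260 words

Loading Wordnet into memory...Load time: done (10625 msec)
Treking across Wordnet...........................done (328 msec)
In my trek I saw 523260 words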


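      As the results show, traversing WordNet without loading it into memory took 3765 ms, while traversing it after loading took 328 ms. The 10625 ms in the output is the time spent loading the dictionary into memory. In terms of traversal time alone, the in-memory dictionary is roughly 11 times faster.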
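Loading is not free, so it only pays off when the dictionary is traversed more than a few times. A small worked calculation using the three timings reported above shows where the break-even point lies. The class LoadBreakEven is a hypothetical helper written for this illustration, and the result applies only to this one measurement.

```java
class LoadBreakEven {

    /**
     * Smallest number of traversals n for which loading first is cheaper overall,
     * i.e. loadMs + n * ramTrekMs < n * diskTrekMs.
     */
    static long breakEvenTreks(long loadMs, long diskTrekMs, long ramTrekMs) {
        long savedPerTrek = diskTrekMs - ramTrekMs;
        if (savedPerTrek <= 0) {
            throw new IllegalArgumentException("in-memory traversal is not faster");
        }
        // n must be strictly greater than loadMs / savedPerTrek
        return loadMs / savedPerTrek + 1;
    }

    public static void main(String[] args) {
        // timings from the run above: load 10625 ms, disk trek 3765 ms, in-memory trek 328 ms
        System.out.println(breakEvenTreks(10625, 3765, 328)); // prints 4
    }
}
```

With these numbers each in-memory traversal saves 3437 ms, so the 10625 ms loading cost is recovered from the fourth traversal onward.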





import java.io.File;
import java.io.IOException;
import java.net.URL;

import edu.mit.jwi.Dictionary;
import edu.mit.jwi.IDictionary;
import edu.mit.jwi.item.IIndexWord;
import edu.mit.jwi.item.ISynset;
import edu.mit.jwi.item.IWord;
import edu.mit.jwi.item.IWordID;
import edu.mit.jwi.item.POS;

public class GetWordSynsetsTest {

    public static void main(String[] args) throws IOException {
        String wnhome = System.getenv("WNHOME"); // WordNet root directory, from WNHOME
        String path = wnhome + File.separator + "dict";
        URL url = new URL("file", null, path);
        IDictionary dict = new Dictionary(url);
        dict.open(); // open the dictionary

        getSynonyms(dict); // testing
    }

    public static void getSynonyms(IDictionary dict) {
        // look up first sense of the word "go"
        IIndexWord idxWord = dict.getIndexWord("go", POS.VERB);
        IWordID wordID = idxWord.getWordIDs().get(0); // 1st meaning
        IWord word = dict.getWord(wordID);
        ISynset synset = word.getSynset(); // ISynset is the interface for a word's synonym set

        // iterate over words associated with the synset
        for (IWord w : synset.getWords()) {
            System.out.println(w.getLemma());
        }
    }
}












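2. You may have passed the wrong part of speech to a call such as getIndexWord("go", POS.VERB), for example POS.ADVERB instead of the POS.VERB used above. Because "go" has no adverb sense, getIndexWord returns null, and the next call on that result throws a NullPointerException.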
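The usual defence is to check the lookup result before using it. The sketch below shows the pattern in plain Java with a hypothetical in-memory map standing in for the dictionary, so that it runs without the WordNet files. The names SafeLookup, getIndexEntry and firstSense, and the entry "go#1", are placeholders and are not JWI API or WordNet data.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SafeLookup {

    // Hypothetical stand-in for the dictionary index: "lemma/POS" -> list of senses.
    static final Map<String, List<String>> INDEX = new HashMap<String, List<String>>();
    static {
        INDEX.put("go/VERB", Arrays.asList("go#1"));
    }

    /** Mimics a lookup that returns null when the lemma has no entry for that part of speech. */
    static List<String> getIndexEntry(String lemma, String pos) {
        return INDEX.get(lemma + "/" + pos);
    }

    /** Returns the first sense, or null when there is no entry, instead of throwing. */
    static String firstSense(String lemma, String pos) {
        List<String> senses = getIndexEntry(lemma, pos);
        if (senses == null || senses.isEmpty()) {
            return null; // the caller decides how to report "not found"
        }
        return senses.get(0);
    }

    public static void main(String[] args) {
        System.out.println(firstSense("go", "VERB"));   // found
        System.out.println(firstSense("go", "ADVERB")); // no entry, so null rather than an exception
    }
}
```

With JWI the same guard would be a null check on the value returned by getIndexWord before calling getWordIDs on it.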







import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

import edu.mit.jwi.Dictionary;
import edu.mit.jwi.IDictionary;
import edu.mit.jwi.item.IIndexWord;
import edu.mit.jwi.item.ISynset;
import edu.mit.jwi.item.ISynsetID;
import edu.mit.jwi.item.IWord;
import edu.mit.jwi.item.IWordID;
import edu.mit.jwi.item.POS;
import edu.mit.jwi.item.Pointer;

public class GetHypernymsTest {

    public static void main(String[] args) throws IOException {
        String wnhome = System.getenv("WNHOME"); // WordNet root directory, from WNHOME
        String path = wnhome + File.separator + "dict";
        File wnDir = new File(path);
        IDictionary dict = new Dictionary(wnDir);
        dict.open(); // open the dictionary

        getHypernyms(dict);
    }

    public static void getHypernyms(IDictionary dict) {
        IIndexWord idxWord = dict.getIndexWord("article", POS.NOUN); // index word for "article"
        IWordID wordID = idxWord.getWordIDs().get(0); // ID of the first sense
        IWord word = dict.getWord(wordID); // the word itself
        ISynset synset = word.getSynset(); // the synset that contains the word

        // get the hypernyms: related synsets reached through the HYPERNYM pointer
        List<ISynsetID> hypernyms = synset.getRelatedSynsets(Pointer.HYPERNYM);

        // print out each hypernym's id and synonyms
        List<IWord> words;
        for (ISynsetID sid : hypernyms) {
            words = dict.getSynset(sid).getWords(); // list of words in the synset
            System.out.print(sid + "{");
            for (Iterator<IWord> i = words.iterator(); i.hasNext();) {
                System.out.print(i.next().getLemma());
                if (i.hasNext()) {
                    System.out.print(", ");
                }
            }
            System.out.println("}");
        }
    }
}






SID-06282025-N{nonfiction, nonfictional_prose}
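The code above prints only the direct hypernyms. Repeating the same step on each result climbs further up the hierarchy. The sketch below illustrates that loop in plain Java, using a hypothetical map from each synset to its first hypernym in place of JWI so that it runs without the WordNet files. The class HypernymChain and the synset names are placeholders, not WordNet data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HypernymChain {

    /** Follows the first-hypernym link repeatedly; stops at a root or when a node repeats. */
    static List<String> chain(Map<String, String> firstHypernym, String start) {
        List<String> path = new ArrayList<String>();
        String current = start;
        while (current != null && !path.contains(current)) { // the contains check guards against cycles
            path.add(current);
            current = firstHypernym.get(current);
        }
        return path;
    }

    public static void main(String[] args) {
        Map<String, String> links = new HashMap<String, String>();
        links.put("synset-A", "synset-B");
        links.put("synset-B", "synset-C");
        System.out.println(chain(links, "synset-A")); // [synset-A, synset-B, synset-C]
    }
}
```

With JWI, the map lookup would correspond to calling getRelatedSynsets(Pointer.HYPERNYM) on the current synset and taking one of the returned IDs.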








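    This happens when you know for certain that two words are both in WordNet, yet you cannot get from item X to item Y with JWI. The cause is mixing up JWI's lexical pointers and semantic pointers. A pointer here is not a C/C++ pointer; it is a link between word entries or between synsets.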


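Lexical pointers: relationships between one word and another, such as the synonym-like relationship between dog and domestic dog. They also cover homographs, derivational relations, and so on. They operate between individual words.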
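Why mixing the two levels fails can be shown with a toy model. The sketch below stores word-level links and synset-level links in two separate tables, loosely mirroring the split in JWI between IWord.getRelatedWords and ISynset.getRelatedSynsets. The class PointerLevels, its methods, and all names in the data are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PointerLevels {

    // word-level links (lexical pointers), keyed by "word|pointer"
    private final Map<String, List<String>> wordLinks = new HashMap<String, List<String>>();
    // synset-level links (semantic pointers), keyed by "synset|pointer"
    private final Map<String, List<String>> synsetLinks = new HashMap<String, List<String>>();

    void addLexical(String word, String pointer, String... targets) {
        wordLinks.put(word + "|" + pointer, Arrays.asList(targets));
    }

    void addSemantic(String synset, String pointer, String... targets) {
        synsetLinks.put(synset + "|" + pointer, Arrays.asList(targets));
    }

    /** Looks only at word-level links. */
    List<String> relatedWords(String word, String pointer) {
        return lookup(wordLinks, word + "|" + pointer);
    }

    /** Looks only at synset-level links. */
    List<String> relatedSynsets(String synset, String pointer) {
        return lookup(synsetLinks, synset + "|" + pointer);
    }

    private static List<String> lookup(Map<String, List<String>> table, String key) {
        List<String> hit = table.get(key);
        return hit == null ? Collections.<String>emptyList() : hit;
    }

    public static void main(String[] args) {
        PointerLevels wn = new PointerLevels();
        wn.addSemantic("synset-X", "HYPERNYM", "synset-Y");
        wn.addLexical("word-x", "DERIVED", "word-y");

        System.out.println(wn.relatedWords("word-x", "HYPERNYM"));     // [] : wrong level
        System.out.println(wn.relatedSynsets("synset-X", "HYPERNYM")); // [synset-Y]
    }
}
```

Asking a word for a link that is stored at the synset level returns nothing, even though both items exist, which is the symptom described above.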











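      As for how to use the two kinds of pointer (when to use lexical pointers, when to use semantic pointers, and when either will work), the WordNet documentation unfortunately explains very little. However, M.A. Finlayson examined and tallied the pointers in WordNet and compiled the pointer statistics table shown in the figure.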










