Information-Retrieval

Steps to run the project:

  1. Creating the index locally. The commands below dump partial inverted index files and partial catalog files.

a) Creating stemmed index:
python createIndexing.py --do_stem --outf output/stemmed/

b) Creating unstemmed index:
python createIndexing.py --outf output/unstemmed/

c) Creating compressed stemmed index:
python createIndexing.py --do_stem --compress --outf output/stem_compressed/
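To illustrate what a partial inverted index and its catalog might look like, here is a minimal sketch. The on-disk layout (one line per term, a catalog mapping each term to a byte offset and length) and the function names `build_partial_index`, `write_partial_index` and `read_postings` are assumptions made for illustration; they are not taken from `createIndexing.py`.

```python
import io
from collections import defaultdict


def build_partial_index(docs):
    """Map each term to {doc_id: [positions]} for one batch of documents."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def write_partial_index(index, out):
    """Write one line per term; return a catalog of term -> (offset, length).

    The line format "term doc:pos,pos;doc:pos" is hypothetical.
    """
    catalog = {}
    for term in sorted(index):
        postings = ";".join(
            f"{doc_id}:{','.join(map(str, positions))}"
            for doc_id, positions in sorted(index[term].items())
        )
        line = f"{term} {postings}\n".encode("utf-8")
        catalog[term] = (out.tell(), len(line))
        out.write(line)
    return catalog


def read_postings(term, catalog, inp):
    """Seek straight to a term's line using the catalog entry."""
    offset, length = catalog[term]
    inp.seek(offset)
    return inp.read(length).decode("utf-8").rstrip("\n")


docs = {"d1": "the cat sat", "d2": "the dog sat down"}
buf = io.BytesIO()  # stands in for a partial index file on disk
catalog = write_partial_index(build_partial_index(docs), buf)
print(read_postings("sat", catalog, buf))
```

The catalog is what makes lookups cheap: a term's postings can be read with a single seek instead of scanning the whole index file.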

  2. Merging: This step merges all the partial indexes into a final merged inverted index.

a) Uncompressed stemmed:
python merging.py --dirpath output/stemmed/

b) Compressed stemmed:
python merging.py --compress --dirpath output/stem_compressed/
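The merge step can be pictured as a k-way merge over partial indexes that are each sorted by term. The sketch below assumes each partial index is available as an iterable of `(term, postings)` pairs in term order; `merge_partial_indexes` is a hypothetical name, and `merging.py` may work differently (for example, by merging files pairwise).

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_partial_indexes(partials):
    """Merge term-sorted partial indexes into one term-sorted stream.

    partials: list of iterables of (term, postings_list), each sorted by term.
    Yields (term, combined_postings) with postings sorted by doc id.
    """
    # heapq.merge lazily interleaves the already-sorted inputs by term.
    merged = heapq.merge(*partials, key=itemgetter(0))
    # Entries for the same term are now adjacent, so groupby can combine them.
    for term, group in groupby(merged, key=itemgetter(0)):
        postings = []
        for _, plist in group:
            postings.extend(plist)
        postings.sort()
        yield term, postings


partial_1 = [("cat", [("d1", 1)]), ("sat", [("d1", 1)])]
partial_2 = [("dog", [("d2", 1)]), ("sat", [("d2", 2)])]
for term, postings in merge_partial_indexes([partial_1, partial_2]):
    print(term, postings)
```

Because the inputs are consumed lazily, only one entry per partial index needs to be in memory at a time, which is the usual reason for building partial indexes first.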

  3. Running models: This step runs the retrieval models on the query file against the merged index.

a) Uncompressed stemmed:
python runModels.py \
  -qf data/queryfile1.txt \
  -stem \
  -i output/stemmed/combined_inverted_index_84.txt \
  -vocab output/stemmed/vocab.pickle \
  -c output/stemmed/combined_catalog_84.txt \
  --ptype min \
  --p_alpha .1 \
  -o output/models/stemmed/

b) Compressed stemmed:
python runModels.py \
  -qf data/queryfile1.txt \
  -stem \
  --compress \
  -i output/stem_compressed/combined_inverted_index_84.txt \
  -vocab output/stem_compressed/vocab.pickle \
  -c output/stem_compressed/combined_catalog_84.txt \
  --ptype min \
  --p_alpha .1 \
  -o output/models/stemmed_compressed/
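The evaluation command in this README reads `output/models/stemmed/tf-idf.out`, so TF-IDF is among the models that `runModels.py` produces. As a rough illustration of that kind of scoring, the sketch below uses the textbook weighting `tf * log(N / df)`; the exact variant used by the project is not stated here, and `tf_idf_scores` and the in-memory index shape are assumptions.

```python
import math
from collections import defaultdict


def tf_idf_scores(query_terms, index, num_docs):
    """Score documents for a query with a plain tf * log(N / df) weighting.

    index: term -> {doc_id: term_frequency} (hypothetical in-memory shape).
    Returns doc_id -> score for every document containing a query term.
    """
    scores = defaultdict(float)
    for term in query_terms:
        postings = index.get(term, {})
        if not postings:
            continue  # term not in the index: contributes nothing
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return dict(scores)


index = {"cat": {"d1": 2}, "sat": {"d1": 1, "d2": 1}}
scores = tf_idf_scores(["cat", "sat"], index, num_docs=2)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Note that a term occurring in every document gets an IDF of zero under this weighting, so it does not affect the ranking.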

  4. Evaluation: Run the trec script to evaluate the precision of each model's output file, for example for TF-IDF:

./trec.pl -q data/qrel.txt output/models/stemmed/tf-idf.out
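For intuition about what a precision evaluation computes, here is a minimal sketch of precision at rank k. It assumes the qrel file uses the standard TREC layout `query_id iteration doc_id relevance`; the helper names and the sample document ids are made up for illustration, and this is not a reimplementation of `trec.pl`.

```python
def parse_qrels(lines):
    """Collect relevant doc ids per query from TREC-style qrel lines."""
    relevant = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        query_id, _iteration, doc_id, relevance = parts
        if int(relevance) > 0:
            relevant.setdefault(query_id, set()).add(doc_id)
    return relevant


def precision_at_k(ranked_doc_ids, relevant_docs, k):
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for doc_id in ranked_doc_ids[:k] if doc_id in relevant_docs)
    return hits / k


# Hypothetical qrel lines and ranking for a single query.
qrels = ["85 0 DOC-1 1", "85 0 DOC-2 0", "85 0 DOC-3 1"]
relevant = parse_qrels(qrels)
ranking = ["DOC-1", "DOC-2", "DOC-3", "DOC-4"]
print(precision_at_k(ranking, relevant["85"], k=2))
```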