Peer-Reviewed

A Survey of Information Retrieval Techniques

Received: 25 June 2017     Accepted: 10 July 2017     Published: 28 November 2017
Abstract

The explosive growth of resources stored in various forms and transmitted over the internet has necessitated research into information retrieval technologies. The major information retrieval mechanisms commonly employed include the vector space model, the Boolean model, the fuzzy set model, and the probabilistic retrieval model. These models measure the similarity between a query and the documents in a collection in order to retrieve the documents that best match the query. These approaches are keyword-based: they use lists of keywords to describe the information content. In this paper, a survey of these models is provided in order to understand their working mechanisms and shortcomings. This understanding is vital because it informs the choice of an information retrieval technique based on the underlying requirements. The survey reveals that each of the current information retrieval models falls short of expectations in one way or another. As such, they are not ideal for high-precision information retrieval applications.
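
As a rough illustration of how two of the surveyed models compare a query with documents, the following minimal sketch contrasts Boolean AND matching with vector space ranking by cosine similarity. It is not taken from the paper: the whitespace tokenizer, the raw term-frequency weights, the sample documents, and the function names are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop-word removal.
    return text.lower().split()


def boolean_and_match(query, document):
    # Boolean model (AND query): match only if every query term occurs in the document.
    return set(tokenize(query)) <= set(tokenize(document))


def cosine_similarity(query, document):
    # Vector space model: cosine of the angle between raw term-frequency vectors.
    q = Counter(tokenize(query))
    d = Counter(tokenize(document))
    dot = sum(q[term] * d[term] for term in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy collection, for illustration only.
documents = [
    "fuzzy set model for information retrieval",
    "boolean model of retrieval",
    "growth of internet resources",
]
query = "information retrieval model"

# Boolean retrieval returns an unranked set of exact matches.
boolean_hits = [doc for doc in documents if boolean_and_match(query, doc)]

# Vector space retrieval ranks every document by its similarity to the query.
ranked = sorted(documents, key=lambda doc: cosine_similarity(query, doc), reverse=True)
```

In this toy collection the Boolean model returns only the first document, while the vector space model still gives the second document a non-zero score because it shares two of the three query terms. Both depend purely on keyword overlap, which is the limitation the abstract points to.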

Published in Advances in Networks (Volume 5, Issue 2)
DOI 10.11648/j.net.20170502.12
Page(s) 40-46
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2017. Published by Science Publishing Group

Keywords

Information Retrieval, Model, Fuzzy, Boolean, Probabilistic, Query

Cite This Article
  • APA Style

    Mang’are Fridah Nyamisa, Waweru Mwangi, Wilson Cheruiyot. (2017). A Survey of Information Retrieval Techniques. Advances in Networks, 5(2), 40-46. https://doi.org/10.11648/j.net.20170502.12

    ACS Style

    Mang’are Fridah Nyamisa; Waweru Mwangi; Wilson Cheruiyot. A Survey of Information Retrieval Techniques. Adv. Netw. 2017, 5(2), 40-46. doi: 10.11648/j.net.20170502.12

    AMA Style

    Mang’are Fridah Nyamisa, Waweru Mwangi, Wilson Cheruiyot. A Survey of Information Retrieval Techniques. Adv Netw. 2017;5(2):40-46. doi: 10.11648/j.net.20170502.12

  • @article{10.11648/j.net.20170502.12,
      author = {Mang’are Fridah Nyamisa and Waweru Mwangi and Wilson Cheruiyot},
      title = {A Survey of Information Retrieval Techniques},
      journal = {Advances in Networks},
      volume = {5},
      number = {2},
      pages = {40-46},
      doi = {10.11648/j.net.20170502.12},
      url = {https://doi.org/10.11648/j.net.20170502.12},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.net.20170502.12},
      year = {2017}
    }
    

  • TY  - JOUR
    T1  - A Survey of Information Retrieval Techniques
    AU  - Mang’are Fridah Nyamisa
    AU  - Waweru Mwangi
    AU  - Wilson Cheruiyot
    Y1  - 2017/11/28
    PY  - 2017
    N1  - https://doi.org/10.11648/j.net.20170502.12
    DO  - 10.11648/j.net.20170502.12
    T2  - Advances in Networks
    JF  - Advances in Networks
    JO  - Advances in Networks
    SP  - 40
    EP  - 46
    PB  - Science Publishing Group
    SN  - 2326-9782
    UR  - https://doi.org/10.11648/j.net.20170502.12
    VL  - 5
    IS  - 2
    ER  - 

Author Information
  • Department of Computing, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya

  • Department of Computing, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya

  • Department of Computing, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya
