Full text search in MongoDB

Full text search in MongoDB is a custom implementation created by the MongoDB developers and exposed as a dedicated index type.

It has features such as:

1. Full text search as an index type when creating new indexes, just like any other.

2. Indexing of multiple fields, with weighting to give different fields higher priority.

3. Support for Latin-based languages initially, with plans for other character sets later. The initial languages are Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Norwegian, Portuguese, Romanian, Russian, Spanish and Swedish.

4. Support for advanced queries, similar to the Google search syntax, e.g. negation and phrase matching.

5. Stemming, to deal with plurals.

6. Stop words, which are ignored when indexing and searching.
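The negation and phrase matching syntax can be sketched as a small helper that assembles a search string. This is only an illustration: `buildSearchString` and its option names are hypothetical and not part of MongoDB. The sketch assumes that a phrase is wrapped in double quotes and that a negated term is prefixed with a hyphen.

```javascript
// Hypothetical helper (not part of MongoDB): builds a search string for the
// "text" command from plain terms, exact phrases, and excluded terms.
function buildSearchString(opts) {
  var terms = opts.terms || [];
  var phrases = opts.phrases || [];
  var exclude = opts.exclude || [];
  var parts = [];

  // Plain terms are passed through unchanged.
  terms.forEach(function (t) { parts.push(t); });
  // Phrases are wrapped in double quotes.
  phrases.forEach(function (p) { parts.push('"' + p + '"'); });
  // A leading hyphen negates a term.
  exclude.forEach(function (t) { parts.push("-" + t); });

  return parts.join(" ");
}

var search = buildSearchString({
  terms: ["novelist"],
  phrases: ["marriage problem"],
  exclude: ["handball"]
});

// In the mongo shell, with a text index in place, the string would be used as:
//   db.test.runCommand("text", { search: search });
```

Keeping the string assembly separate from the command call makes it easy to inspect what is actually sent to the server.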

This looks like a good, general-purpose full text search engine, which fits well with how MongoDB is developing into a multi-purpose database. It may well never reach the sophistication of “proper” search products like Elasticsearch or Solr, but that is probably not the goal.

To try it out, switch to a test database, enable text search, and insert two sample documents:

use test

db.adminCommand( { setParameter : 1, textSearchEnabled : true } );

tc = db.test

tc.save( { _id: 1, title: "Olivia Shakespear",text: "Olivia Shakespear (born Olivia Tucker; 17 March 1863 – 3 October 1938) was a British novelist, playwright, and patron of the arts. She wrote six books that are described as \"marriage problem\" novels. Her works sold poorly, sometimes only a few hundred copies. Her last novel, Uncle Hilary, is considered her best. She wrote two plays in collaboration with Florence Farr." } );
tc.save( { _id: 2, title: "Linn-Kristin Riegelhuth Koren", text: "Linn-Kristin Riegelhuth Koren (born 1 August 1984, in Ski) is a Norwegian handballer playing for Larvik HK and the Norwegian national team. She is commonly known as Linka. Outside handball she is a qualified nurse." } );

Then we can create a new index on the title field:
tc.ensureIndex( { "title": "text" } );

> res = tc.runCommand( "text", { search: "Olivia" } );
{
    "queryDebugString" : "olivia||||||",
    "language" : "english",
    "results" : [
        {
            "score" : 0.75,
            "obj" : {
                "_id" : 1,
                "title" : "Olivia Shakespear",
                "text" : "Olivia Shakespear (born Olivia Tucker; 17 March 1863 – 3 October 1938) was a British novelist, playwright, and patron of the arts. She wrote six books that are described as \"marriage problem\" novels. Her works sold poorly, sometimes only a few hundred copies. Her last novel, Uncle Hilary, is considered her best. She wrote two plays in collaboration with Florence Farr."
            }
        }
    ],
    "stats" : {
        "nscanned" : 1,
        "nscannedObjects" : 0,
        "n" : 1,
        "timeMicros" : 128
    },
    "ok" : 1
}

In MongoDB, a collection can have only one full text index, so we drop the existing index before creating a compound text index that covers both fields:

tc.dropIndexes()

tc.ensureIndex( { "title": "text", "text": "text" } );
res = tc.runCommand( "text", { search: "novelists" } );
{
    "queryDebugString" : "novelist||||||",
    "language" : "english",
    "results" : [
        {
            "score" : 0.5116279069767442,
            "obj" : {
                "_id" : 1,
                "title" : "Olivia Shakespear",
                "text" : "Olivia Shakespear (born Olivia Tucker; 17 March 1863 – 3 October 1938) was a British novelist, playwright, and patron of the arts. She wrote six books that are described as \"marriage problem\" novels. Her works sold poorly, sometimes only a few hundred copies. Her last novel, Uncle Hilary, is considered her best. She wrote two plays in collaboration with Florence Farr."
            }
        }
    ],
    "stats" : {
        "nscanned" : 1,
        "nscannedObjects" : 0,
        "n" : 1,
        "timeMicros" : 90
    },
    "ok" : 1
}
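The search for "novelists" matches a document that only contains "novelist", and the queryDebugString shows the stemmed term. The idea can be sketched in plain JavaScript as follows. This is a toy illustration only: the `normalize` function, the tiny stop word list, and the plural-stripping rule are assumptions made for the example, and MongoDB's real stemmer handles far more cases.

```javascript
// Toy illustration only, not MongoDB's implementation.
// Tiny hypothetical stop word list.
var STOP_WORDS = ["a", "an", "and", "the", "of", "was"];

function normalize(text) {
  return text
    .toLowerCase()
    .split(/[^a-z]+/)                                   // split on non-letters
    .filter(function (w) { return w.length > 0; })      // drop empty pieces
    .filter(function (w) { return STOP_WORDS.indexOf(w) === -1; })
    .map(function (w) {
      // Naive plural stripping: "novelists" becomes "novelist".
      return w.length > 3 && w.charAt(w.length - 1) === "s" ? w.slice(0, -1) : w;
    });
}

var queryTerms = normalize("novelists");
var docTerms = normalize("was a British novelist, playwright, and patron of the arts");

// The query matches when any normalized query term appears in the document.
var matches = queryTerms.some(function (t) { return docTerms.indexOf(t) !== -1; });
```

Because both the query and the document text go through the same normalization, the plural form in the query and the singular form in the document end up as the same term.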

 

We can see the indexes using the getIndexes() method:

tc.getIndexes()
[
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "ns" : "test.test",
        "name" : "_id_"
    },
    {
        "v" : 0,
        "key" : {
            "_fts" : "text",
            "_ftsx" : 1
        },
        "ns" : "test.test",
        "name" : "title_text_text_text",
        "weights" : {
            "text" : 1,
            "title" : 1
        },
        "default_language" : "english",
        "language_override" : "language"
    }
]

You can specify the weights and default_language options when creating the index, e.g.

tc.ensureIndex( { "title": "text", "text": "text" }, {weights: { title: 10 }, default_language: "norwegian" } );
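To get a feel for what a weight does, here is a deliberately simplified model in plain JavaScript: count the matches of a term in each indexed field, multiply each count by the field's weight, and add the results. The function `weightedMatchSum` is hypothetical, and the number it returns is not the score MongoDB reports. It only shows why a match in a heavily weighted field counts for more.

```javascript
// Simplified model, not MongoDB's actual scoring formula: count the matches
// of a term in each indexed field, multiply by the field's weight, and sum.
function weightedMatchSum(doc, term, weights) {
  var total = 0;
  Object.keys(weights).forEach(function (field) {
    var words = String(doc[field] || "").toLowerCase().split(/[^a-z]+/);
    var count = words.filter(function (w) { return w === term; }).length;
    total += count * weights[field];
  });
  return total;
}

var doc = {
  title: "Olivia Shakespear",
  text: "Olivia Shakespear was a British novelist"
};

// Equal weights: one match in each field.
var equal = weightedMatchSum(doc, "olivia", { title: 1, text: 1 });
// Title weighted 10, as in the ensureIndex call above.
var boosted = weightedMatchSum(doc, "olivia", { title: 10, text: 1 });
```

With the title weighted at 10, the single title match dominates the sum, so documents that mention the term in their title would be ranked ahead of those that only mention it in the body.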

Posted On 21 May 2015 By MicroPyramid

