Integrate Elasticsearch 7.9 Chinese search into your Laravel 7 project in 5 minutes

Just five steps:

  1. Start the Elasticsearch 7.9 Docker image that bundles the ik Chinese word segmentation plugin
  2. Configure Scout in Laravel 7
  3. Configure the model
  4. Import the data
  5. Search

#Demo URL

https://www.ar414.com/search?query=php%E5%91%A8%E6%9D%B0%E4%BC%A6

Search scope

  • Article content
  • Title
  • Tags

Result weight

  1. Number of distinct keywords matched
  2. Number of times the keywords occur

Search page

  • Highlighting
  • Display of the segmented search terms
  • Results pagination

#Foreword

My blog is mostly in Chinese and I wanted to add search to it, so I wrote up the process as an article along the way.

Many people have already written Laravel + Elasticsearch tutorials and examples, but as Elasticsearch and Laravel are upgraded, many older articles no longer apply to the new versions. Whatever open source project you use, it is best to work through its documentation yourself, treating the documentation for your current version as the primary source and tutorials as a supplement.

#Use Elasticsearch integrated with ik Chinese word segmentation plugin

#Pull docker image

$ docker pull ar414/elasticsearch-7.9-ik-plugin

#Create log and data storage directories

These host directories are mounted into the Docker container so that data is not lost when the container restarts

$ mkdir -p /data/elasticsearch/data
$ mkdir -p /data/elasticsearch/log
$ chmod -R 777 /data/elasticsearch/data
$ chmod -R 777 /data/elasticsearch/log

#Run

$ docker run -d -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -v /data/elasticsearch/data:/var/lib/elasticsearch -v /data/elasticsearch/log:/var/log/elasticsearch ar414/elasticsearch-7.9-ik-plugin

#Verification

$ curl http://localhost:9200
{
  "name" : "01ac21393985",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "h8L336qcRb2i1aydOv04Og",
  "version" : {
    "number" : "7.9.0",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "a479a2a7fce0389512d6a9361301708b92dff667",
    "build_date" : "2020-08-11T21:36:48.204330Z",
    "build_snapshot" : false,
    "lucene_version" : "8.6.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

#Test Chinese word segmentation

curl -X POST "http://localhost:9200/_analyze?pretty" -H 'Content-Type: application/json' -d'
{
  "analyzer": "ik_max_word",
  "text":     "laravel天下无敌"
}
'

{
  "tokens" : [
    {
      "token" : "laravel",
      "start_offset" : 0,
      "end_offset" : 7,
      "type" : "ENGLISH",
      "position" : 0
    },
    {
      "token" : "天下无敌",
      "start_offset" : 7,
      "end_offset" : 11,
      "type" : "CN_WORD",
      "position" : 1
    },
    {
      "token" : "天下",
      "start_offset" : 7,
      "end_offset" : 9,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "无敌",
      "start_offset" : 9,
      "end_offset" : 11,
      "type" : "CN_WORD",
      "position" : 3
    }
  ]
}
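
To use the segmentation result from PHP rather than read it by eye, the response can be decoded and reduced to its token strings. This is a minimal sketch: the JSON is the response shown above with nothing added, and extractTokens is a hypothetical helper name, not part of any library.

```php
<?php
// Copy of the _analyze response shown above.
$response = <<<'JSON'
{
  "tokens": [
    {"token": "laravel",  "start_offset": 0, "end_offset": 7,  "type": "ENGLISH", "position": 0},
    {"token": "天下无敌", "start_offset": 7, "end_offset": 11, "type": "CN_WORD", "position": 1},
    {"token": "天下",     "start_offset": 7, "end_offset": 9,  "type": "CN_WORD", "position": 2},
    {"token": "无敌",     "start_offset": 9, "end_offset": 11, "type": "CN_WORD", "position": 3}
  ]
}
JSON;

/**
 * Hypothetical helper: return the token strings of an _analyze response body.
 */
function extractTokens(string $json): array
{
    $decoded = json_decode($json, true);
    if (!is_array($decoded)) {
        throw new InvalidArgumentException('Response is not valid JSON');
    }

    // Each entry of "tokens" carries the term in its "token" key.
    return array_column($decoded['tokens'] ?? [], 'token');
}

$tokens = extractTokens($response);
echo implode(' | ', $tokens), PHP_EOL; // laravel | 天下无敌 | 天下 | 无敌
```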


#Using Elasticsearch in Laravel Project

matchish/laravel-scout-elasticsearch

Elasticsearch provides an official SDK, but there is a more elegant and faster way to use Elasticsearch in a Laravel project: Laravel itself ships the Scout full-text search solution, so we only need to replace the default Algolia (https://www.algolia.com/) driver with an Elasticsearch driver.

#Installation

$ composer require laravel/scout
$ composer require matchish/laravel-scout-elasticsearch

#Configuration

  1. Generate the Scout configuration file (config/scout.php)
$ php artisan vendor:publish --provider="Laravel\Scout\ScoutServiceProvider"
Copied File [\vendor\laravel\scout\config\scout.php] To [\config\scout.php]
Publishing complete.
  2. Specify the Scout driver
  • Option 1: set it in the .env file (recommended)
SCOUT_DRIVER=Matchish\ScoutElasticSearch\Engines\ElasticSearchEngine
  • Option 2: change the default driver directly in config/scout.php from
'driver' => env('SCOUT_DRIVER','algolia')
to
'driver' => env('SCOUT_DRIVER','Matchish\ScoutElasticSearch\Engines\ElasticSearchEngine')
  3. Specify the Elasticsearch host and port

If Elasticsearch runs in Docker, use the IP address of the docker0 interface; on Linux you can look it up with ifconfig.

Configure it in .env:

ELASTICSEARCH_HOST=172.17.0.1:9200
  4. Register the service provider

config/app.php

'providers' => [
     // Other Service Providers
     \Matchish\ScoutElasticSearch\ElasticSearchServiceProvider::class
],
  5. Clear the configuration cache
$ php artisan config:clear

Laravel is now connected to Elasticsearch.
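
The two ways of specifying the driver lead to the same result because the second argument of env() is only a fallback. The sketch below illustrates that with plain PHP; envOrDefault is a simplified, hypothetical stand-in for Laravel's env() helper (the real helper also reads the .env file and casts values such as "true" or "null"), so treat it as an illustration of the precedence only.

```php
<?php
/**
 * Simplified stand-in for Laravel's env() helper (assumption: only process
 * environment variables are consulted, and values are always strings).
 */
function envOrDefault(string $key, string $default): string
{
    $value = getenv($key);

    return $value === false ? $default : $value;
}

$engine = 'Matchish\ScoutElasticSearch\Engines\ElasticSearchEngine';

// Second approach only: SCOUT_DRIVER is not set, so the default written
// into config/scout.php is used.
putenv('SCOUT_DRIVER');
$fromDefault = envOrDefault('SCOUT_DRIVER', $engine);

// First approach: SCOUT_DRIVER is set, so it overrides whatever default
// config/scout.php still contains.
putenv('SCOUT_DRIVER=' . $engine);
$fromEnv = envOrDefault('SCOUT_DRIVER', 'algolia');

var_dump($fromDefault === $fromEnv); // bool(true)
```

Because the environment variable wins whenever it is present, setting it in .env leaves config/scout.php untouched, which is presumably why that approach is the recommended one.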

#Use in a real project

#Requirements

The search box in the upper right corner of the blog finds articles related to the keywords, matching against the following:

  • Article content
  • Article title
  • Article tags

Two MySQL tables and the following fields are involved:

  • article
    • title
    • tags
  • article_content
    • content

#Configure Elasticsearch indexing for articles

  1. Create an index configuration file (config/elasticsearch.php)
$ touch config/elasticsearch.php
  2. Configure the field mappings in elasticsearch.php
<?php
return [
     'indices' => [
         'mappings' => [
             'blog-articles' => [
                 "properties"=> [
                     "content"=> [
                         "type"=> "text",
                         "analyzer"=> "ik_max_word",
                         "search_analyzer"=> "ik_smart"
                     ],
                     "tags"=> [
                         "type"=> "text",
                         "analyzer"=> "ik_max_word",
                         "search_analyzer"=> "ik_smart"
                     ],
                     "title"=> [
                         "type"=> "text",
                         "analyzer"=> "ik_max_word",
                         "search_analyzer"=> "ik_smart"
                     ]
                 ]
             ]
         ]
     ],
];
  • analyzer: the analyzer applied to the field text at index time
  • search_analyzer: the analyzer applied to the search terms
  • Choose according to your business scenario (finer-grained segmentation uses more resources; a typical setup uses ik_max_word for analyzer and ik_smart for search_analyzer):
    • ik_max_word: provided by the ik Chinese word segmentation plugin; splits the text into as many terms as possible
      • laravel天下无敌 -> laravel, 天下无敌, 天下, 无敌
    • ik_smart: provided by the ik Chinese word segmentation plugin; splits the text into as few terms as possible
      • laravel天下无敌 -> laravel, 天下无敌
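
Since all three fields use the same analyzer pair, the mapping can also be generated instead of written out three times. This is only a sketch of one way to do it: ikTextField is a hypothetical helper name, and the resulting array is meant to be identical to the hand-written configuration above.

```php
<?php
/**
 * Hypothetical helper: mapping of one text field that is indexed with
 * ik_max_word and searched with ik_smart, as in the configuration above.
 */
function ikTextField(): array
{
    return [
        'type' => 'text',
        'analyzer' => 'ik_max_word',
        'search_analyzer' => 'ik_smart',
    ];
}

// The three searchable fields of the blog-articles index.
$fields = ['content', 'tags', 'title'];

$config = [
    'indices' => [
        'mappings' => [
            'blog-articles' => [
                // Every field name gets the same mapping.
                'properties' => array_fill_keys($fields, ikTextField()),
            ],
        ],
    ],
];

print_r($config['indices']['mappings']['blog-articles']['properties']['title']);
```

Inside config/elasticsearch.php the file would end with return $config; and an inline array or closure may be preferable to a named function, because a named function cannot be declared twice if the file is loaded more than once.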

#Configure Article Model

Reading the Laravel Scout documentation first is recommended.

  1. Add the Laravel Scout Searchable trait

    namespace App\Models\Blog;
    use Laravel\Scout\Searchable;
    class Article extends BlogBaseModel
    {
        use Searchable;
    }
    
  2. Specify the index (the elasticsearch.indices.mappings.blog-articles key in the configuration file created above)

    /**
     * Specify index
     * @return string
     */
    public function searchableAs()
    {
        return 'blog-articles';
    }
    
  3. Define the fields to import into the index

    /**
     * Define the fields to import into the index
     * @return array
     */
    public function toSearchableArray()
    {
        return [
            'content' => ArticleContent::query()
                ->where('article_id', $this->id)
                ->value('content'),
            'tags' => implode(',', $this->tags),
            'title' => $this->title,
        ];
    }
    
  4. Specify the unique ID stored in the search index

    /**
     * Specify the unique ID stored in the search index
     * @return mixed
     */
    public function getScoutKey()
    {
        return $this->id;
    }

    /**
     * Specify the key name of the unique ID stored in the search index
     * @return string
     */
    public function getScoutKeyName()
    {
        return 'id';
    }
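
To see what one indexed document looks like without touching the database, the logic of toSearchableArray() can be reproduced with plain arrays. This is a stand-alone sketch under stated assumptions: buildSearchableDocument is a hypothetical function, the sample article is made up, and tags are assumed to already be an array, as the implode() call in the model requires.

```php
<?php
/**
 * Hypothetical stand-alone version of toSearchableArray(): a plain array
 * stands in for the Article model and a string for the article_content row.
 */
function buildSearchableDocument(array $article, ?string $content): array
{
    return [
        // Assumption: a missing article_content row is indexed as an empty string.
        'content' => $content ?? '',
        // The tags array is flattened into one comma-separated text field.
        'tags' => implode(',', $article['tags']),
        'title' => $article['title'],
    ];
}

// Made-up sample data, for illustration only.
$document = buildSearchableDocument(
    ['id' => 1, 'title' => 'Laravel Scout', 'tags' => ['php', 'laravel']],
    'laravel天下无敌'
);

echo json_encode($document, JSON_UNESCAPED_UNICODE), PHP_EOL;
// {"content":"laravel天下无敌","tags":"php,laravel","title":"Laravel Scout"}
```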
    

#Data import

Importing writes the rows of the database table into Lucene through Elasticsearch: Elasticsearch wraps Lucene and exposes a REST API for working with it.

  • Import everything in one step: php artisan scout:import
  • Import a specific model: php artisan scout:import ${model}
$ php artisan scout:import "App\Models\Blog\Article"
Importing [App\Models\Blog\Article]
Switching to the new index
5/5 [⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬⚬] 100%
[OK] All [App\Models\Blog\Article] records have been imported.

Common reasons for a failed import:

  • Unresolvable dependency resolving [Parameter #0 [ integer $retries ]] in class Elasticsearch\Transport
    • Cause: the configuration cache was not cleared after the configuration was changed; run php artisan config:clear
  • invalid_index_name_exception
    • Cause: searchableAs is misconfigured; once an alias has been created for the index, searchableAs must return that alias
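
One way to rule out a badly formed name before importing is to check the value returned by searchableAs against the index naming rules in the Elasticsearch 7.x documentation. The function below is a sketch of those rules, not code from Elasticsearch or from the Scout driver; indexNameProblem is a hypothetical name.

```php
<?php
/**
 * Sketch of the Elasticsearch 7.x index naming rules.
 * Returns null when the name looks valid, otherwise a short reason.
 */
function indexNameProblem(string $name): ?string
{
    if ($name === '' || $name === '.' || $name === '..') {
        return 'name must not be empty, "." or ".."';
    }
    if (strlen($name) > 255) {
        return 'name must not exceed 255 bytes';
    }
    if (preg_match('/[A-Z]/', $name) === 1) {
        return 'name must be lowercase';
    }
    // Forbidden characters: \ / * ? " < > | space , # :
    if (strpbrk($name, "\\/*?\"<>| ,#:") !== false) {
        return 'name contains a forbidden character';
    }
    if (in_array($name[0], ['-', '_', '+'], true)) {
        return 'name must not start with "-", "_" or "+"';
    }

    return null;
}

foreach (['blog-articles', 'Blog-Articles', 'blog articles', '_blog'] as $name) {
    echo $name, ' => ', indexNameProblem($name) ?? 'ok', PHP_EOL;
}
```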

#Check if the index is correct

$ curl -XGET http://localhost:9200/blog-articles/_mapping?pretty
{
   "blog-articles_1598362919": {
     "mappings": {
       "properties": {
         "__class_name": {
           "type": "text",
           "fields": {
             "keyword": {
               "type": "keyword",
               "ignore_above": 256
             }
           }
         },
         "content": {
           "type": "text",
           "analyzer": "ik_max_word",
           "search_analyzer": "ik_smart"
         },
         "tags": {
           "type": "text",
           "analyzer": "ik_max_word",
           "search_analyzer": "ik_smart"
         },
         "title": {
           "type": "text",
           "analyzer": "ik_max_word",
           "search_analyzer": "ik_smart"
         }
       }
     }
   }
}
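
Note that the response is keyed by the concrete index name (blog-articles_1598362919), while blog-articles is the alias that was queried. The check can also be scripted; the sketch below takes a trimmed copy of the response above and lists every field that does not carry the expected ik analyzers. fieldsWithoutIk is a hypothetical helper name.

```php
<?php
// Trimmed copy of the _mapping response shown above.
$response = <<<'JSON'
{
  "blog-articles_1598362919": {
    "mappings": {
      "properties": {
        "content": {"type": "text", "analyzer": "ik_max_word", "search_analyzer": "ik_smart"},
        "tags":    {"type": "text", "analyzer": "ik_max_word", "search_analyzer": "ik_smart"},
        "title":   {"type": "text", "analyzer": "ik_max_word", "search_analyzer": "ik_smart"}
      }
    }
  }
}
JSON;

/**
 * Hypothetical helper: return "index.field" for every expected field whose
 * mapping lacks the ik_max_word / ik_smart analyzer pair.
 */
function fieldsWithoutIk(string $json, array $fields): array
{
    $decoded = json_decode($json, true);
    if (!is_array($decoded)) {
        throw new InvalidArgumentException('Response is not valid JSON');
    }

    $problems = [];
    foreach ($decoded as $indexName => $index) {
        $properties = $index['mappings']['properties'] ?? [];
        foreach ($fields as $field) {
            $analyzer = $properties[$field]['analyzer'] ?? null;
            $searchAnalyzer = $properties[$field]['search_analyzer'] ?? null;
            if ($analyzer !== 'ik_max_word' || $searchAnalyzer !== 'ik_smart') {
                $problems[] = $indexName . '.' . $field;
            }
        }
    }

    return $problems;
}

$problems = fieldsWithoutIk($response, ['content', 'tags', 'title']);
echo $problems === [] ? 'mapping ok' : implode(', ', $problems), PHP_EOL; // mapping ok
```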

#Test

  1. Create a test command
$ php artisan make:command ElasticTest
  2. Code
<?php

namespace App\Console\Commands;

use App\Models\Blog\Article;
use App\Models\Blog\ArticleContent;
use Illuminate\Console\Command;
use Illuminate\Support\Carbon;

class ElasticTest extends Command
{
    /**
     * The name and signature of the console command.
     *
     * @var string
     */
    protected $signature = 'elasticsearch {query}';

    /**
     * The console command description.
     *
     * @var string
     */
    protected $description = 'elasticsearch test';

    /**
     * Create a new command instance.
     *
     * @return void
     */
    public function __construct()
    {
        parent::__construct();
    }

    /**
     * Execute the console command.
     *
     * @return mixed
     */
    public function handle()
    {
        // Time the search in milliseconds
        $startTime = Carbon::now()->getPreciseTimestamp(3);
        $articles = Article::search($this->argument('query'))->get()->toArray();
        $elapsedMs = Carbon::now()->getPreciseTimestamp(3) - $startTime;
        echo "Elapsed time (milliseconds): {$elapsedMs}\n";

        // The content lives in another table; replace each result with its
        // content so the matches are easy to inspect while testing
        foreach ($articles as &$article) {
            $article = ArticleContent::query()->where('article_id', $article['id'])->value('content');
        }
        unset($article); // drop the reference left behind by the foreach

        var_dump($articles);
    }
}

  3. Test
$ php artisan elasticsearch 周杰伦

  4. Complex queries

For example, custom highlighting:

use ONGR\ElasticsearchDSL\Highlight\Highlight;

Article::search($query, function ($client, $body) {
    $highlight = new Highlight();
    $highlight->addField('content', ['type' => 'plain']);
    $highlight->addField('title');
    $highlight->addField('tags');
    $body->addHighlight($highlight);
    $body->setSource(['title', 'tags']);
    return $client->search(['index' => (new Article())->searchableAs(), 'body' => $body->toArray()]);
})->raw();

The $client and $body passed to the callback come from two packages, the official Elasticsearch PHP client and ONGR ElasticsearchDSL (the source of the Highlight class above), so you can build whatever queries those two packages support.
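
The raw result of such a query still has to be turned into something the search page can render. The sketch below assumes that raw() hands back the decoded Elasticsearch response as an array; the sample response is made up but follows the standard hits / _source / highlight layout, and mergeHighlights is a hypothetical helper.

```php
<?php
// Made-up response in the shape Elasticsearch returns for a highlighted search.
$raw = [
    'hits' => [
        'total' => ['value' => 1, 'relation' => 'eq'],
        'hits' => [
            [
                '_id' => '1',
                '_score' => 1.2,
                '_source' => ['title' => '周杰伦 and PHP', 'tags' => 'php,music'],
                // Each highlighted field holds a list of fragments; <em> is
                // the default highlight tag.
                'highlight' => ['title' => ['<em>周杰伦</em> and PHP']],
            ],
        ],
    ],
];

/**
 * Hypothetical helper: one row per hit, where highlighted fragments replace
 * the plain _source value of the same field.
 */
function mergeHighlights(array $raw): array
{
    $rows = [];
    foreach ($raw['hits']['hits'] ?? [] as $hit) {
        $row = $hit['_source'] ?? [];
        foreach ($hit['highlight'] ?? [] as $field => $fragments) {
            $row[$field] = implode(' ... ', $fragments);
        }
        $row['id'] = $hit['_id'];
        $rows[] = $row;
    }

    return $rows;
}

$rows = mergeHighlights($raw);
echo $rows[0]['title'], PHP_EOL; // <em>周杰伦</em> and PHP
```

Fields without a highlight entry (tags in this sample) keep their plain _source value, so the page can render every row the same way.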
