Laravel 13 Vector Search in Practice

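Summary

This article walks through a hands-on example of building a semantic search system with Laravel 13. It covers:

  • How vector search works and where it applies
  • Database design and migrations
  • Generating and storing embeddings
  • Implementing similarity search
  • Improving search results
  • A complete worked example

It is intended for Laravel developers who want to add semantic search to their applications.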

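1. How Vector Search Works

1.1 Core Concepts

Vector search converts text into high-dimensional vectors (embeddings) and ranks results by the similarity between those vectors, so that matches reflect meaning rather than exact keyword overlap.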
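
To make the similarity computation concrete, here is a minimal sketch of cosine similarity in plain PHP. The three-dimensional vectors are made-up toy values, not real embeddings, and the function name cosineSimilarity is chosen for illustration only.

```php
<?php

declare(strict_types=1);

/**
 * Cosine similarity between two equal-length numeric vectors.
 * The result lies in [-1, 1]; higher means the vectors point in a more similar direction.
 */
function cosineSimilarity(array $a, array $b): float
{
    $a = array_values($a);
    $b = array_values($b);

    if ($a === [] || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot   += $value * $b[$i];
        $normA += $value * $value;
        $normB += $b[$i] * $b[$i];
    }

    if ($normA == 0.0 || $normB == 0.0) {
        throw new InvalidArgumentException('A zero vector has no direction.');
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Toy "embeddings" (illustrative values only).
$query = [0.9, 0.1, 0.0];
$docA  = [0.8, 0.2, 0.1]; // points in nearly the same direction as the query
$docB  = [0.0, 0.1, 0.9]; // points in a very different direction

printf("query vs docA: %.3f\n", cosineSimilarity($query, $docA));
printf("query vs docB: %.3f\n", cosineSimilarity($query, $docB));
```

In a real system the vectors come from an embedding model and the comparison runs inside the database, but the ranking principle is the same: docA scores higher than docB because its direction is closer to the query's.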

1.2 Use Cases

  • Document search
  • Product recommendations
  • Similar-content discovery
  • Question-answering systems

2. Database Design

2.1 Migration

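<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\DB;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    public function up(): void
    {
        // pgvector must be enabled before a vector column can be created.
        DB::statement('CREATE EXTENSION IF NOT EXISTS vector');

        Schema::create('documents', function (Blueprint $table) {
            $table->id();
            $table->string('title');
            $table->text('content');
            // 1536 must match the output dimension of the embedding model in use.
            $table->vector('embedding', 1536);
            $table->timestamps();
        });

        // Approximate nearest-neighbour index for cosine distance.
        // IVFFlat clusters are better when the index is built after the table has data.
        DB::statement('CREATE INDEX documents_embedding_idx ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100)');
    }

    public function down(): void
    {
        Schema::dropIfExists('documents');
    }
};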
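
The value lists = 100 in the index statement is a tuning parameter. As a rough sketch of how it might be chosen, the helper below encodes the starting points suggested in the pgvector documentation: rows / 1000 for tables of up to one million rows, sqrt(rows) beyond that, and roughly sqrt(lists) probes at query time. The function names are hypothetical and the results are starting values to benchmark, not guarantees.

```php
<?php

declare(strict_types=1);

/**
 * Suggest a starting value for the IVFFlat "lists" parameter
 * from the expected number of rows in the table.
 */
function suggestIvfflatLists(int $rowCount): int
{
    if ($rowCount < 0) {
        throw new InvalidArgumentException('Row count cannot be negative.');
    }

    $lists = $rowCount <= 1_000_000
        ? intdiv($rowCount, 1000)
        : (int) round(sqrt($rowCount));

    // An index needs at least one list.
    return max(1, $lists);
}

/**
 * Suggest a starting value for ivfflat.probes (lists scanned per query).
 */
function suggestIvfflatProbes(int $lists): int
{
    return max(1, (int) round(sqrt($lists)));
}

$rows  = 100_000;
$lists = suggestIvfflatLists($rows);

printf("rows=%d lists=%d probes=%d\n", $rows, $lists, suggestIvfflatProbes($lists));
```

Under this heuristic, lists = 100 corresponds to a table of about 100,000 documents. More probes raise recall at the cost of query speed, so both values are worth measuring against real data.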

3. Model

<?php

namespace App\Models;

use Illuminate\Database\Eloquent\Model;
use Laravel\Ai\Embeddings;

class Document extends Model
{
    protected $fillable = ['title', 'content', 'embedding'];

    protected $casts = [
        'embedding' => 'vector',
    ];

    protected static function booted(): void
    {
        // Generate an embedding on first save unless the caller supplied one.
        // This runs only on create, so later edits to content do not refresh it.
        static::creating(function (self $doc) {
            if (empty($doc->embedding)) {
                $doc->embedding = Embeddings::from($doc->content)->generate()->vector;
            }
        });
    }
}
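
pgvector accepts and returns vectors in a bracketed text form such as [0.1,2,-3.5], so whatever cast sits on the embedding attribute has to convert between that text and a PHP array. The sketch below shows that conversion in plain PHP. The function names toVectorLiteral and fromVectorLiteral are hypothetical, and this is an illustration of the idea rather than the framework's own cast.

```php
<?php

declare(strict_types=1);

/**
 * Serialize a PHP array of numbers into pgvector's text format, e.g. "[0.1,2,-3.5]".
 */
function toVectorLiteral(array $vector): string
{
    $parts = [];

    foreach ($vector as $value) {
        if ((!is_int($value) && !is_float($value)) || !is_finite((float) $value)) {
            throw new InvalidArgumentException('Vector elements must be finite numbers.');
        }
        $parts[] = (string) (float) $value;
    }

    return '[' . implode(',', $parts) . ']';
}

/**
 * Parse pgvector's text format back into an array of floats.
 */
function fromVectorLiteral(string $literal): array
{
    $inner = trim(trim($literal), '[]');

    if ($inner === '') {
        return [];
    }

    return array_map('floatval', explode(',', $inner));
}

$literal = toVectorLiteral([0.1, 2, -3.5]);

echo $literal, "\n";
var_dump(fromVectorLiteral($literal));
```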

4. Search Service

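<?php

namespace App\Services;

use App\Models\Document;
use Illuminate\Database\Eloquent\Collection;
use Laravel\Ai\Embeddings;

class SemanticSearchService
{
    public function search(string $query, int $limit = 10): Collection
    {
        // Embed the query with the same model that was used for the stored documents.
        $embedding = Embeddings::from($query)->generate()->vector;

        // Rank documents by similarity to the query vector, most similar first.
        return Document::query()
            ->whereVectorSimilarTo('embedding', $embedding)
            ->withSimilarityScore()
            ->orderByDesc('similarity')
            ->limit($limit)
            ->get();
    }
}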
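
Conceptually, the query above asks the database to order rows by cosine distance to the query vector (pgvector's <=> operator returns 1 minus the cosine similarity) and keep the closest ones. The following sketch reproduces that ranking in memory with plain PHP. The documents and vectors are toy values, the function names are hypothetical, and an exhaustive scan like this only suits small data sets.

```php
<?php

declare(strict_types=1);

/**
 * Cosine distance, defined as 1 - cosine similarity.
 */
function cosineDistance(array $a, array $b): float
{
    $dot = $normA = $normB = 0.0;

    foreach (array_values($a) as $i => $x) {
        $y      = array_values($b)[$i];
        $dot   += $x * $y;
        $normA += $x * $x;
        $normB += $y * $y;
    }

    return 1.0 - $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Return the $limit documents most similar to the query vector, best match first.
 *
 * @param array<int, array{title: string, embedding: array}> $documents
 */
function nearestDocuments(array $queryVector, array $documents, int $limit = 10): array
{
    $scored = [];

    foreach ($documents as $doc) {
        $scored[] = [
            'title'      => $doc['title'],
            'similarity' => 1.0 - cosineDistance($queryVector, $doc['embedding']),
        ];
    }

    // Highest similarity first (equivalently, smallest distance first).
    usort($scored, fn (array $l, array $r): int => $r['similarity'] <=> $l['similarity']);

    return array_slice($scored, 0, $limit);
}

// Toy corpus (illustrative values only).
$documents = [
    ['title' => 'Pricing',       'embedding' => [0.9, 0.1, 0.0]],
    ['title' => 'Refund policy', 'embedding' => [0.1, 0.9, 0.1]],
    ['title' => 'Shipping',      'embedding' => [0.0, 0.2, 0.9]],
];

foreach (nearestDocuments([0.2, 0.95, 0.05], $documents, 2) as $hit) {
    printf("%s (%.3f)\n", $hit['title'], $hit['similarity']);
}
```

A vector index replaces the full scan with an approximate lookup, trading a small amount of recall for much lower query time on large tables.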

5. Controller

<?php

namespace App\Http\Controllers;

use App\Services\SemanticSearchService;
use Illuminate\Http\JsonResponse;
use Illuminate\Http\Request;

class SearchController extends Controller
{
    public function __construct(
        protected SemanticSearchService $search
    ) {}

    public function search(Request $request): JsonResponse
    {
        // Reject very short queries and cap the page size at 50 results.
        $validated = $request->validate([
            'query' => 'required|string|min:3',
            'limit' => 'nullable|integer|min:1|max:50',
        ]);

        $results = $this->search->search(
            $validated['query'],
            (int) ($validated['limit'] ?? 10)
        );

        return response()->json([
            'query' => $validated['query'],
            'results' => $results,
        ]);
    }
}

6. Complete Example

6.1 Importing Documents

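<?php

namespace App\Services;

use App\Models\Document;
use Laravel\Ai\Embeddings;

class DocumentImportService
{
    public function import(array $documents): void
    {
        // Embed all contents in one batch request instead of one request per document.
        $texts = array_column($documents, 'content');
        $embeddings = Embeddings::batch($texts)->generate();

        // array_values() keeps $index aligned with the order of $texts.
        foreach (array_values($documents) as $index => $doc) {
            Document::create([
                'title' => $doc['title'],
                'content' => $doc['content'],
                // Supplying the embedding here skips generation in the model's creating hook.
                'embedding' => $embeddings[$index]->vector,
            ]);
        }
    }
}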
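
For large imports, sending every text in a single request may run into provider limits on batch size, so it can help to split the work into chunks. The sketch below shows one way to do that in plain PHP with array_chunk. The function importInBatches and the $embedBatch callable are hypothetical stand-ins: in an application the callable would wrap the embedding call used above, and the returned rows would be passed to Document::create.

```php
<?php

declare(strict_types=1);

/**
 * Embed documents in fixed-size batches and return rows ready to be stored.
 *
 * @param array<int, array{title: string, content: string}> $documents
 * @param callable(array): array $embedBatch  Takes a list of texts, returns one vector per text, in order.
 */
function importInBatches(array $documents, callable $embedBatch, int $batchSize = 100): array
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $rows = [];

    // array_chunk() reindexes each chunk from 0, so $i lines up with $vectors.
    foreach (array_chunk($documents, $batchSize) as $chunk) {
        $vectors = array_values($embedBatch(array_column($chunk, 'content')));

        if (count($vectors) !== count($chunk)) {
            throw new RuntimeException('Embedding count does not match batch size.');
        }

        foreach ($chunk as $i => $doc) {
            $rows[] = [
                'title'     => $doc['title'],
                'content'   => $doc['content'],
                'embedding' => $vectors[$i],
            ];
        }
    }

    return $rows;
}

// Fake embedder for demonstration: one number per text (its byte length).
$fakeEmbedder = fn (array $texts): array => array_map(
    fn (string $text): array => [(float) strlen($text)],
    $texts
);

$rows = importInBatches(
    [
        ['title' => 'A', 'content' => 'first document'],
        ['title' => 'B', 'content' => 'second document'],
        ['title' => 'C', 'content' => 'third'],
    ],
    $fakeEmbedder,
    2
);

printf("%d rows prepared\n", count($rows));
```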

6.2 Advanced Search

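public function advancedSearch(string $query, array $filters = [])
{
    $embedding = Embeddings::from($query)->generate()->vector;

    // Use a separate variable for the builder so the $query string is not overwritten.
    $builder = Document::query()
        ->whereVectorSimilarTo('embedding', $embedding)
        ->withSimilarityScore();

    // Note: the migration in section 2.1 does not define a category column;
    // this filter assumes one has been added to the documents table.
    if (!empty($filters['category'])) {
        $builder->where('category', $filters['category']);
    }

    if (!empty($filters['date_from'])) {
        $builder->where('created_at', '>=', $filters['date_from']);
    }

    return $builder->orderByDesc('similarity')->get();
}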

7. Performance Optimization

7.1 Index Tuning

-- HNSW index: faster queries than IVFFlat, at the cost of slower builds and more memory
CREATE INDEX documents_embedding_hnsw_idx ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

7.2 Caching Strategy

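// Requires: use Illuminate\Support\Facades\Cache;
public function search(string $query)
{
    // Identical query strings share one cache entry for an hour (3600 seconds).
    $key = 'search:' . md5($query);

    return Cache::remember($key, 3600, function () use ($query) {
        // Run the search here and return its results
    });
}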
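
Hashing the raw query means that "Laravel  Vector" and "laravel vector" produce different cache entries, and that searches with different limits would collide if a limit were added later. One possible refinement, sketched below in plain PHP, is to normalize the query and include the limit in the key. The function name searchCacheKey and the normalization rules are assumptions for illustration and should be adapted to how queries are actually embedded.

```php
<?php

declare(strict_types=1);

/**
 * Build a cache key from a normalized query string and the result limit.
 */
function searchCacheKey(string $query, int $limit = 10): string
{
    // Collapse runs of whitespace and trim the ends; fall back to the raw
    // query if the pattern fails (for example on invalid UTF-8).
    $normalized = preg_replace('/\s+/u', ' ', trim($query)) ?? trim($query);

    // Case-fold so that queries differing only in letter case share an entry.
    $normalized = function_exists('mb_strtolower')
        ? mb_strtolower($normalized, 'UTF-8')
        : strtolower($normalized);

    return 'search:' . md5($normalized . '|' . $limit);
}

echo searchCacheKey('  Laravel   Vector '), "\n";
echo searchCacheKey('laravel vector'), "\n";
echo searchCacheKey('laravel vector', 20), "\n";
```

The first two keys are identical and the third differs. Whether case-folding is appropriate depends on the embedding model, since some models embed differently cased text differently.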

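8. Conclusion

This walkthrough covered:

  1. How vector search works: computing semantic similarity
  2. Database design: integrating pgvector
  3. Embedding generation: the Laravel AI SDK
  4. Search implementation: similarity queries
  5. Performance optimization: indexes and caching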
