Delving Into Different Search Methods
To set the context, suppose we have a collection of texts about various technical topics and want to search for information related to “Machine Learning.” We will now look at how Keyword Search, Similarity Search, and Semantic Search offer different levels of depth and understanding, from simple keyword matching to recognizing related concepts and contexts.
Let us first look at the standard code components used for this program.
1. Standard Code Components Used
A. Libraries Imported
import os
import re
from whoosh.index import create_in
from whoosh.fields import Schema, TEXT
from whoosh.qparser import QueryParser
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from transformers import pipeline
import numpy as np
The following libraries are imported in this block:
os for file system operations.
re for regular expressions.
whoosh for creating and managing a search index.
scikit-learn for TF-IDF vectorization and similarity computation.
transformers for using a deep learning model for feature extraction.
numpy for numerical operations, especially sorting.
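Since four of these libraries are third-party, it can help to confirm they are installed before running the rest of the program. The following is a minimal sketch, not part of the original program: the helper name missing_packages and the REQUIRED mapping are hypothetical, and the pip names assume the packages are installed from PyPI.

```python
import importlib.util

# Import name -> pip package name (assumes installation from PyPI).
REQUIRED = {
    "whoosh": "Whoosh",
    "sklearn": "scikit-learn",
    "transformers": "transformers",
    "numpy": "numpy",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for module_name, pip_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Install first: pip install " + " ".join(missing))
    else:
        print("All third-party libraries are available.")
```

The check only looks up each top-level module without importing it, so it stays fast even though transformers is slow to load.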
B. Sample Document Initialization
# Sample documents used for demonstrating all three search methods
documents = [
"Machine learning is a field of artificial intelligence that uses statistical techniques.",
"Natural language processing (NLP) is a part of artificial intelligence that deals with the interaction between computers and humans using natural language. ",
"Deep learning models are a subset of machine learning algorithms that use neural networks with many layers.",
"AI is transforming the world by automating tasks, providing insights through data analysis, and enabling new technologies like autonomous vehicles and advanced robotics. ",
"Natural language processing can be challenging due to the complexity and variability of human language. ",
"The application of machine learning in healthcare is revolutionizing the way diseases are diagnosed and treated.",
"Autonomous vehicles rely heavily on AI and machine learning to navigate and make decisions.",
"Speech recognition technology has advanced considerably thanks to deep learning models. "
]
This block defines a list of sample documents containing text related to various topics in artificial intelligence, machine learning, and natural language processing.
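Before comparing the three methods, it is useful to see which sample documents mention the query phrase literally. The sketch below is only a baseline for orientation, not one of the three search methods themselves: the helper name literal_matches is hypothetical, and the block repeats the sample documents so that it runs on its own.

```python
import re

documents = [
    "Machine learning is a field of artificial intelligence that uses statistical techniques.",
    "Natural language processing (NLP) is a part of artificial intelligence that deals with the interaction between computers and humans using natural language.",
    "Deep learning models are a subset of machine learning algorithms that use neural networks with many layers.",
    "AI is transforming the world by automating tasks, providing insights through data analysis, and enabling new technologies like autonomous vehicles and advanced robotics.",
    "Natural language processing can be challenging due to the complexity and variability of human language.",
    "The application of machine learning in healthcare is revolutionizing the way diseases are diagnosed and treated.",
    "Autonomous vehicles rely heavily on AI and machine learning to navigate and make decisions.",
    "Speech recognition technology has advanced considerably thanks to deep learning models.",
]


def literal_matches(query, docs):
    """Return indices of documents containing the query phrase, ignoring case."""
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    return [index for index, doc in enumerate(docs) if pattern.search(doc)]


hits = literal_matches("machine learning", documents)
print(hits)  # [0, 2, 5, 6]
```

Four of the eight documents contain the exact phrase. The remaining four discuss closely related topics such as AI, natural language processing, and deep learning without using it, which is the gap that recognizing related concepts and contexts is meant to close.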