Luigi Fugaro
Unleashing the Power of
Vector Search in .NET
SPONSOR
Join at slido.com
#1041068
Agenda
★ The Data Balance
★ Turning Data Into Vectors
★ Enter ML Embeddings
★ Redis as Vector Database
★ Redis OM .NET for Vector and more
★ Demo – Live Coding... dotnet run
The Data Balance
The Data Balance
Growth
IDC Report 2023 - https://www.box.com/resources/unstructured-data-paper
Around 80% of the data generated by organizations is unstructured
The Data Balance
Growth
Data types:
Unstructured: No inherent structure/many degrees of freedom ~ Docs, PDFs, images, audio, video
Quasi-Structured: Erratic patterns/formats ~ Clickstreams
Semi-Structured: There's a discernible pattern ~ Spreadsheets / XML / JSON
Structured: Schema/defined data model ~ Database
The Data Balance
How to deal with unstructured data?
Common approaches were labeling and tagging
These are labor-intensive, subjective, and error-prone
What are the common approaches to
deal with Unstructured Data?
Turning Data
into Vectors !!
Turning Data into Vectors
What is a Vector?
Numeric representation of something in N-dimensional space using floating-point numbers
Can represent anything... entire documents, images, video, audio
Quantifies features or characteristics of the item
More importantly... they are comparable
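To make "comparable" concrete, here is a minimal sketch, assuming a vector is just a float array and using Euclidean distance as the comparison. The three items and their 3-dimensional values are made up for illustration; real embeddings have many more dimensions.

```csharp
using System;

// Euclidean (L2) distance between two vectors of equal length.
double EuclideanDistance(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same number of dimensions.");

    double sum = 0;
    for (int i = 0; i < a.Length; i++)
    {
        double diff = a[i] - b[i];
        sum += diff * diff;
    }
    return Math.Sqrt(sum);
}

// Hypothetical feature vectors (made-up values, not produced by a model).
float[] sneaker = { 0.9f, 0.1f, 0.3f };
float[] runningShoe = { 0.8f, 0.2f, 0.3f };
float[] handbag = { 0.1f, 0.9f, 0.7f };

// The sneaker is closer to the running shoe than to the handbag.
Console.WriteLine(EuclideanDistance(sneaker, runningShoe) < EuclideanDistance(sneaker, handbag)); // True
```

A smaller distance means the two items are more alike in the feature space.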
Enter ML Embeddings
Using Machine Learning
for Feature Extraction
Enter ML Embeddings
Machine Learning/Deep Learning has leaped forward in the last decade
ML models outperform humans in many tasks nowadays
CV (Computer Vision) models excel at detection/classification
LLMs (Large Language Models) have advanced exponentially
Enter ML Embeddings
Feature Engineering
Enter ML Embeddings
Automated Feature Engineering
ML models extract latent features
ML embeddings catch the gray areas between features
The process of generating the embeddings is called vectorizing
What are Vector Embeddings?
Why Vectors?
Creating and Storing them
Vectors
Visually
Semantic Relationship Syntactic Relationship
Visually
https://jalammar.github.io/illustrated-word2vec
Vectors
“King”
[ 0.50451, 0.68607, -0.59517, -0.022801, 0.60046, -0.13498, -0.08813, 0.47377, -0.61798, -0.31012,
-0.076666, 1.493, -0.034189, -0.98173, 0.68229, 0.81722, -0.51874, -0.31503, -0.55809, 0.66421,
0.1961, -0.13495, -0.11476, -0.30344, 0.41177, -2.223, -1.0756, -1.0783, -0.34354, 0.33505,
1.9927, -0.04234, -0.64319, 0.71125, 0.49159, 0.16754, 0.34344, -0.25663, -0.8523, 0.1661,
0.40102, 1.1685, -1.0137, -0.21585, -0.15155, 0.78321, -0.91241, -1.6106, -0.64426, -0.51042 ]
Vectors can be operated upon (subtraction, addition)
https://jalammar.github.io/illustrated-word2vec
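A minimal sketch of operating on vectors, assuming plain float arrays and element-wise subtraction and addition. The three input vectors are placeholders with made-up values, not real word embeddings.

```csharp
using System;
using System.Linq;

// Element-wise addition and subtraction.
// Note: Zip stops at the shorter input, so both vectors should have the same length.
float[] Add(float[] left, float[] right) => left.Zip(right, (x, y) => x + y).ToArray();
float[] Subtract(float[] left, float[] right) => left.Zip(right, (x, y) => x - y).ToArray();

// Hypothetical 3-dimensional vectors (made-up values).
float[] first = { 1.0f, 2.0f, 3.0f };
float[] second = { 0.5f, 0.5f, 0.5f };
float[] third = { 2.0f, 0.0f, 1.0f };

// (first - second) + third, computed dimension by dimension.
float[] result = Add(Subtract(first, second), third);
Console.WriteLine(string.Join(" ", result));
```

The result is itself a vector in the same space, so it can be compared against stored vectors like any other.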
Vectorizing
Generating Vector Embeddings
for your Data
Vectorizing
1. Choose an Embedding Method
2. Clean and preprocess the data as needed
3. Train/Refine the embedding model
4. Generate Embeddings
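The steps above can be sketched with a deliberately simple stand-in for an embedding model. This is an assumption for illustration only: it counts words from a fixed vocabulary instead of using a trained ML model, so it skips step 3 and captures no semantics. The function name and vocabulary are hypothetical.

```csharp
using System;
using System.Linq;

// Toy "embedding method" (step 1): one dimension per vocabulary word, value = occurrence count.
float[] Vectorize(string text, string[] vocabulary)
{
    // Step 2: clean and preprocess (lowercase, split on spaces and basic punctuation).
    string[] tokens = text.ToLowerInvariant()
        .Split(new[] { ' ', ',', '.', '!', '?' }, StringSplitOptions.RemoveEmptyEntries);

    // Step 4: generate the vector.
    return vocabulary.Select(word => (float)tokens.Count(t => t == word)).ToArray();
}

string[] vocab = { "red", "shoes", "bag" };
float[] vector = Vectorize("Red shoes, red bag!", vocab);
Console.WriteLine(string.Join(" ", vector)); // 2 1 1
```

A real embedding model replaces the counting step with learned features, but the input and output shapes are the same: data in, fixed-length float vector out.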
Vectorizing
Better Models, better Vectors
Embeddings can capture the semantics of complex data
Option #1: Use a pre-trained model
Option #2: Train your models with custom data
Vector similarity is a downstream tool to analyze embeddings
Vectorizing
Similarity Metrics
➢ Measure “closeness” between vectors in multi-dimensional space
➢ Enable efficient similarity search in vector databases
➢ Improve relevance and precision of search results
Vectorizing
Similarity/Distance Metrics
Cosine Similarity
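Cosine similarity is the dot product of two vectors divided by the product of their lengths, so it measures the angle between them and ignores magnitude. A minimal sketch in plain C#, with no library assumed:

```csharp
using System;

// Cosine similarity: 1 = same direction, 0 = orthogonal, -1 = opposite direction.
double CosineSimilarity(float[] left, float[] right)
{
    if (left.Length != right.Length)
        throw new ArgumentException("Vectors must have the same number of dimensions.");

    double dot = 0, normLeft = 0, normRight = 0;
    for (int i = 0; i < left.Length; i++)
    {
        dot += left[i] * right[i];
        normLeft += left[i] * left[i];
        normRight += right[i] * right[i];
    }

    if (normLeft == 0 || normRight == 0)
        throw new ArgumentException("Cosine similarity is undefined for a zero vector.");

    return dot / (Math.Sqrt(normLeft) * Math.Sqrt(normRight));
}

Console.WriteLine(CosineSimilarity(new[] { 1f, 0f }, new[] { 2f, 0f })); // 1 (same direction)
Console.WriteLine(CosineSimilarity(new[] { 1f, 0f }, new[] { 0f, 3f })); // 0 (orthogonal)
```

A cosine distance can be derived as 1 minus the similarity, so that smaller means closer.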
Vectorizing
VSS in Redis
Redis as Vector Database
Which algorithms are used to calculate
similarity/distance metrics?
VSS in Redis
Index and query vector data stored as BLOBs in Redis Hashes/JSON
3 distance metrics: Euclidean, Inner Product and Cosine
2 indexing methods: HNSW and Flat
Pre-filtering queries with GEO, TAG, TEXT or NUMERIC fields
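To illustrate what a Flat index with pre-filtering does conceptually, here is an in-memory sketch, not Redis code: it filters candidates by a tag first, then compares the query against every remaining vector and keeps the K closest by cosine distance. The catalog entries, tags, and function names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine distance = 1 - cosine similarity (smaller means closer).
double CosineDistance(float[] left, float[] right)
{
    double dot = 0, normLeft = 0, normRight = 0;
    for (int i = 0; i < left.Length; i++)
    {
        dot += left[i] * right[i];
        normLeft += left[i] * left[i];
        normRight += right[i] * right[i];
    }
    return 1 - dot / (Math.Sqrt(normLeft) * Math.Sqrt(normRight));
}

// Exhaustive ("flat") K-nearest-neighbor search with a tag pre-filter.
List<(string Id, double Distance)> FlatKnn(
    IEnumerable<(string Id, string Tag, float[] Vector)> items,
    float[] query, int k, Func<string, bool> tagFilter)
{
    return items
        .Where(item => tagFilter(item.Tag))                                      // pre-filter
        .Select(item => (item.Id, Distance: CosineDistance(item.Vector, query))) // score everything left
        .OrderBy(hit => hit.Distance)
        .Take(k)
        .ToList();
}

// Made-up catalog with 2-dimensional vectors.
var catalog = new List<(string Id, string Tag, float[] Vector)>
{
    ("shirt-1", "men", new[] { 1f, 0f }),
    ("shirt-2", "women", new[] { 0.9f, 0.1f }),
    ("shoe-1", "men", new[] { 0f, 1f }),
};

var hits = FlatKnn(catalog, new[] { 1f, 0.1f }, 1, tag => tag == "men");
Console.WriteLine(hits[0].Id); // shirt-1
```

Without the filter, shirt-2 would be the nearest neighbor here. An HNSW index avoids scanning every vector by searching a graph instead, trading exactness for speed.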
Redis OM
How to implement the
solution
Redis OM
A Redis Framework
It’s more than Object Mapping
Redis OM stands for Redis Object Mapping, a suite of libraries designed to
facilitate object-oriented interaction with Redis. It simplifies working with
Redis by abstracting direct commands into higher-level operations.
https://github.com/redis/redis-om-dotnet
Redis OM
More to come...
Redis OM
Why?
● Redis OM simplifies application development by abstracting Redis' complexity, increasing productivity, and enhancing code readability.
What?
◆ Redis OM enables building real-time applications, supporting:
● Model Data
● Perform CRUD Operations
● Index and Search Data
Redis OM
dotnet run --project ProductCatalog
Redis OM
Fashion Product Finder
Breakdown
■ A Product domain mapped to Redis JSON
■ [Indexed] decorated field for text and numeric indexing
■ [ImageVectorizer] decorated field to generate embeddings
■ [SentenceVectorizer] decorated field to generate embeddings
■ Pre-filter the search if needed
■ Entity Streams to query for K nearest neighbors
■ Display results
Demo… plan B
Model → Redis JSON: [Document]
using Redis.OM.Modeling;

namespace ProductCatalog.Model;

// Store each Product as a JSON document in Redis.
[Document(StorageType = StorageType.Json)]
public class Product
{
    // Other fields...
}
Demo… plan B
Make Product Searchable → [Indexed] decoration
Decorate what you want to make searchable
using Redis.OM.Modeling;

namespace ProductCatalog.Model;

[Document(StorageType = StorageType.Json)]
public class Product
{
    [RedisIdField] [Indexed] public int Id { get; set; }
    // Other fields...
    [Indexed] public string Gender { get; set; }
    [Indexed] public int? Year { get; set; }
    // Other fields...
}
Demo… plan B
Automatic Embedding Generation
[IVectorizer] decoration
namespace ProductCatalog.Model;

[Document(StorageType = StorageType.Json)]
public class Product
{
    // Other fields...

    // HNSW (approximate) index over embeddings generated from the image at this URL.
    [Indexed(Algorithm = VectorAlgorithm.HNSW, DistanceMetric = DistanceMetric.COSINE)]
    [ImageVectorizer] public Vector<string> ImageUrl { get; set; }

    // Flat (exhaustive) index over embeddings generated from the product name.
    [Indexed(Algorithm = VectorAlgorithm.FLAT, DistanceMetric = DistanceMetric.COSINE)]
    [SentenceVectorizer] public Vector<string> ProductDisplayName { get; set; }
}
Demo… plan B
Searching with Fluent API
[IVectorizer] decoration
[HttpGet("byImage")]
public IEnumerable<CatalogResponse> ByImage([FromQuery] string url)
{
    var collection = _provider.RedisCollection<Product>();
    var response = collection.NearestNeighbors(x => x.ImageUrl, 15, url);
    return response.Select(CatalogResponse.Of);
}

[HttpGet("byDescription")]
public IEnumerable<CatalogResponse> ByDescription([FromQuery] string description)
{
    var collection = _provider.RedisCollection<Product>();
    var response = collection.NearestNeighbors(x => x.ProductDisplayName, 15, description);
    return response.Select(CatalogResponse.Of);
}
Redis OM
The tools and techniques to unlock
the value in Unstructured Data have
evolved greatly...
Redis OM
Databases like Redis and frameworks
like Redis OM can help!
Redis OM
INTEGRATIONS
FEATURES
Storage: HASH | JSON
Indexing: HNSW (ANN) | Flat (KNN)
Distance: L2 | Cosine | IP
Search Types: KNN/ANN | Hybrid | Range | Full Text
Management: Realtime CRUD operations, aliasing, temp indices, and more
Ecosystem integrations
NEW REDIS ENTERPRISE 7.2 FEATURE
Scalable search and query for improved performance, up to 16X compared to previous versions
Redis as Vector Database
Vector Similarity Search
Use Cases
VSS Use cases
Vector Similarity Search Use Cases
Question & Answering
VSS Use cases
Vector Similarity Search Use Cases
Context retrieval for Retrieval Augmented Generation (RAG)
By pairing Redis Enterprise with Large Language Models (LLMs) such as OpenAI's ChatGPT, you can give the LLM access to external contextual knowledge.
➔ Enables more accurate answers and helps reduce model 'hallucinations'.
➔ An LLM combines text fragments in a (most often) semantically correct way.
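A minimal sketch of the retrieval step, assuming the knowledge base is an in-memory list of text chunks with precomputed embeddings. In a real setup the embeddings come from a model and the search runs in the vector database. The chunks, 2-dimensional embeddings, prompt layout, and function names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

double Cosine(float[] u, float[] v)
{
    double dot = 0, normU = 0, normV = 0;
    for (int i = 0; i < u.Length; i++)
    {
        dot += u[i] * v[i];
        normU += u[i] * u[i];
        normV += v[i] * v[i];
    }
    return dot / (Math.Sqrt(normU) * Math.Sqrt(normV));
}

// Pick the k chunks most similar to the question and prepend them as context.
string BuildPrompt(string question, float[] questionEmbedding,
    IEnumerable<(string Text, float[] Embedding)> knowledgeBase, int k)
{
    var context = knowledgeBase
        .OrderByDescending(chunk => Cosine(chunk.Embedding, questionEmbedding))
        .Take(k)
        .Select(chunk => chunk.Text);

    return "Context:\n" + string.Join("\n", context) + "\n\nQuestion: " + question;
}

// Hypothetical knowledge base with made-up embeddings.
var chunks = new List<(string Text, float[] Embedding)>
{
    ("Redis supports HNSW and Flat vector indexes.", new[] { 0.9f, 0.1f }),
    ("Our office is closed on Sundays.", new[] { 0.1f, 0.9f }),
};

string prompt = BuildPrompt("Which vector indexes does Redis support?", new[] { 1f, 0f }, chunks, 1);
Console.WriteLine(prompt);
```

The resulting prompt is what gets sent to the LLM, so the model answers from the retrieved context rather than from memory alone.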
VSS Use cases
Vector Similarity Search Use Cases
LLM Conversational Memory
The idea is to improve the model quality and personalization through an adaptive memory.
➔ Persist all conversation history (memories) as embeddings in a vector database.
➔ A conversational agent checks for relevant memories to aid or personalize the LLM behaviour.
➔ Allows users to change topics seamlessly, without misunderstandings.
VSS Use cases
Vector Similarity Search Use Cases
Semantic Caching
Because LLM completions are expensive, semantic caching helps reduce the overall cost of an ML-powered application.
➔ Use a vector database to cache input prompts.
➔ Cache hits are evaluated by semantic similarity.
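A minimal sketch of a semantic cache lookup, assuming an in-memory list in place of the vector database and a similarity threshold of 0.95. The threshold, the 2-dimensional embeddings, and the function names are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;

double Similarity(float[] u, float[] v)
{
    double dot = 0, normU = 0, normV = 0;
    for (int i = 0; i < u.Length; i++)
    {
        dot += u[i] * v[i];
        normU += u[i] * u[i];
        normV += v[i] * v[i];
    }
    return dot / (Math.Sqrt(normU) * Math.Sqrt(normV));
}

// Returns true and the stored completion when a cached prompt is similar enough.
bool TryGetCached(List<(float[] PromptEmbedding, string Completion)> cache,
    float[] promptEmbedding, double threshold, out string completion)
{
    completion = "";
    bool found = false;
    double bestScore = threshold;
    foreach (var entry in cache)
    {
        double score = Similarity(entry.PromptEmbedding, promptEmbedding);
        if (score >= bestScore)
        {
            bestScore = score;
            completion = entry.Completion;
            found = true;
        }
    }
    return found;
}

// Hypothetical cache content with a made-up prompt embedding.
var semanticCache = new List<(float[] PromptEmbedding, string Completion)>();
semanticCache.Add((new[] { 1f, 0f }, "Redis is an in-memory data store."));

bool similarHit = TryGetCached(semanticCache, new[] { 0.98f, 0.05f }, 0.95, out string cached);
bool unrelatedHit = TryGetCached(semanticCache, new[] { 0f, 1f }, 0.95, out _);
Console.WriteLine($"{similarHit} {unrelatedHit}"); // True False
```

On a miss, the application would call the LLM and add the new prompt embedding and completion to the cache.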
Redis resources
Additional resources for learning about Redis
Redis Launchpad (launchpad.redis.com): Central place to find example apps that are built on Redis
Redis University (university.redis.com): Free online courses taught by Redis experts
Developers Portal (developer.redis.com): Create a database, code your application, explore your data
Redis Certification (university.redis.com/certification): Professional certification program for developers
Redis resources
Which programming languages are
currently supported by Redis OM?
.NET Conference 2024
Thank you
Questions?

More Related Content

Similar to Unleashing the Power of Vector Search in .NET - DotNETConf2024.pdf

Big Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft AzureBig Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft AzureMark Kromer
 
Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Neo4j
 
Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...
Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...
Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...Karen Thompson
 
[WITH THE VISION 2017] IoT/AI時代を生き抜くためのデータ プラットフォーム (Leveraging Azure Data Se...
[WITH THE VISION 2017] IoT/AI時代を生き抜くためのデータ プラットフォーム (Leveraging Azure Data Se...[WITH THE VISION 2017] IoT/AI時代を生き抜くためのデータ プラットフォーム (Leveraging Azure Data Se...
[WITH THE VISION 2017] IoT/AI時代を生き抜くためのデータ プラットフォーム (Leveraging Azure Data Se...Naoki (Neo) SATO
 
Cloud Modernization and Data as a Service Option
Cloud Modernization and Data as a Service OptionCloud Modernization and Data as a Service Option
Cloud Modernization and Data as a Service OptionDenodo
 
Hoe het Azure ecosysteem een cruciale rol speelt in uw IoT-oplossing (Glenn C...
Hoe het Azure ecosysteem een cruciale rol speelt in uw IoT-oplossing (Glenn C...Hoe het Azure ecosysteem een cruciale rol speelt in uw IoT-oplossing (Glenn C...
Hoe het Azure ecosysteem een cruciale rol speelt in uw IoT-oplossing (Glenn C...Codit
 
Simplifying & accelerating application development with MongoDB's intelligent...
Simplifying & accelerating application development with MongoDB's intelligent...Simplifying & accelerating application development with MongoDB's intelligent...
Simplifying & accelerating application development with MongoDB's intelligent...Maxime Beugnet
 
Benefits of the Azure cloud
Benefits of the Azure cloudBenefits of the Azure cloud
Benefits of the Azure cloudJames Serra
 
20171106_OracleWebcast_ITTrends_EFavuzzi_KPatenge
20171106_OracleWebcast_ITTrends_EFavuzzi_KPatenge20171106_OracleWebcast_ITTrends_EFavuzzi_KPatenge
20171106_OracleWebcast_ITTrends_EFavuzzi_KPatengeKarin Patenge
 
Dev show september 8th 2020 power platform - not just a simple toy
Dev show september 8th 2020   power platform - not just a simple toyDev show september 8th 2020   power platform - not just a simple toy
Dev show september 8th 2020 power platform - not just a simple toyJens Schrøder
 
Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...
Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...
Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...Kai Wähner
 
Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDBMongoDB
 
Modern Business Intelligence and Advanced Analytics
Modern Business Intelligence and Advanced AnalyticsModern Business Intelligence and Advanced Analytics
Modern Business Intelligence and Advanced AnalyticsCollective Intelligence Inc.
 
Unleashing the Power of Vector Search in .NET - SharpCoding2024.pdf
Unleashing the Power of Vector Search in .NET - SharpCoding2024.pdfUnleashing the Power of Vector Search in .NET - SharpCoding2024.pdf
Unleashing the Power of Vector Search in .NET - SharpCoding2024.pdfLuigi Fugaro
 
Confluent & MongoDB APAC Lunch & Learn
Confluent & MongoDB APAC Lunch & LearnConfluent & MongoDB APAC Lunch & Learn
Confluent & MongoDB APAC Lunch & Learnconfluent
 
Data APIs as a Foundation for Systems of Engagement
Data APIs as a Foundation for Systems of EngagementData APIs as a Foundation for Systems of Engagement
Data APIs as a Foundation for Systems of EngagementVictor Olex
 
SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?
SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?
SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?Denodo
 
Web Services Foundation Technologies
Web Services Foundation TechnologiesWeb Services Foundation Technologies
Web Services Foundation TechnologiesPankaj Saharan
 
17 applied architectures
17 applied architectures17 applied architectures
17 applied architecturesMajong DevJfu
 

Similar to Unleashing the Power of Vector Search in .NET - DotNETConf2024.pdf (20)

Big Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft AzureBig Data Analytics in the Cloud with Microsoft Azure
Big Data Analytics in the Cloud with Microsoft Azure
 
RavenDB overview
RavenDB overviewRavenDB overview
RavenDB overview
 
Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!
 
Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...
Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...
Cis 555 Week 4 Assignment 2 Automated Teller Machine (Atm)...
 
[WITH THE VISION 2017] IoT/AI時代を生き抜くためのデータ プラットフォーム (Leveraging Azure Data Se...
[WITH THE VISION 2017] IoT/AI時代を生き抜くためのデータ プラットフォーム (Leveraging Azure Data Se...[WITH THE VISION 2017] IoT/AI時代を生き抜くためのデータ プラットフォーム (Leveraging Azure Data Se...
[WITH THE VISION 2017] IoT/AI時代を生き抜くためのデータ プラットフォーム (Leveraging Azure Data Se...
 
Cloud Modernization and Data as a Service Option
Cloud Modernization and Data as a Service OptionCloud Modernization and Data as a Service Option
Cloud Modernization and Data as a Service Option
 
Hoe het Azure ecosysteem een cruciale rol speelt in uw IoT-oplossing (Glenn C...
Hoe het Azure ecosysteem een cruciale rol speelt in uw IoT-oplossing (Glenn C...Hoe het Azure ecosysteem een cruciale rol speelt in uw IoT-oplossing (Glenn C...
Hoe het Azure ecosysteem een cruciale rol speelt in uw IoT-oplossing (Glenn C...
 
Simplifying & accelerating application development with MongoDB's intelligent...
Simplifying & accelerating application development with MongoDB's intelligent...Simplifying & accelerating application development with MongoDB's intelligent...
Simplifying & accelerating application development with MongoDB's intelligent...
 
Benefits of the Azure cloud
Benefits of the Azure cloudBenefits of the Azure cloud
Benefits of the Azure cloud
 
20171106_OracleWebcast_ITTrends_EFavuzzi_KPatenge
20171106_OracleWebcast_ITTrends_EFavuzzi_KPatenge20171106_OracleWebcast_ITTrends_EFavuzzi_KPatenge
20171106_OracleWebcast_ITTrends_EFavuzzi_KPatenge
 
Dev show september 8th 2020 power platform - not just a simple toy
Dev show september 8th 2020   power platform - not just a simple toyDev show september 8th 2020   power platform - not just a simple toy
Dev show september 8th 2020 power platform - not just a simple toy
 
Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...
Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...
Enterprise Integration Patterns Revisited (again) for the Era of Big Data, In...
 
Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDB
 
Modern Business Intelligence and Advanced Analytics
Modern Business Intelligence and Advanced AnalyticsModern Business Intelligence and Advanced Analytics
Modern Business Intelligence and Advanced Analytics
 
Unleashing the Power of Vector Search in .NET - SharpCoding2024.pdf
Unleashing the Power of Vector Search in .NET - SharpCoding2024.pdfUnleashing the Power of Vector Search in .NET - SharpCoding2024.pdf
Unleashing the Power of Vector Search in .NET - SharpCoding2024.pdf
 
Confluent & MongoDB APAC Lunch & Learn
Confluent & MongoDB APAC Lunch & LearnConfluent & MongoDB APAC Lunch & Learn
Confluent & MongoDB APAC Lunch & Learn
 
Data APIs as a Foundation for Systems of Engagement
Data APIs as a Foundation for Systems of EngagementData APIs as a Foundation for Systems of Engagement
Data APIs as a Foundation for Systems of Engagement
 
SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?
SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?
SAP Analytics Cloud: Haben Sie schon alle Datenquellen im Live-Zugriff?
 
Web Services Foundation Technologies
Web Services Foundation TechnologiesWeb Services Foundation Technologies
Web Services Foundation Technologies
 
17 applied architectures
17 applied architectures17 applied architectures
17 applied architectures
 

More from Luigi Fugaro

Ottimizzare le performance dell'API Server K8s come utilizzare cache e eventi...
Ottimizzare le performance dell'API Server K8s come utilizzare cache e eventi...Ottimizzare le performance dell'API Server K8s come utilizzare cache e eventi...
Ottimizzare le performance dell'API Server K8s come utilizzare cache e eventi...Luigi Fugaro
 
Sharp Coding 2023 - Luigi Fugaro - ACRE.pdf
Sharp Coding 2023 - Luigi Fugaro - ACRE.pdfSharp Coding 2023 - Luigi Fugaro - ACRE.pdf
Sharp Coding 2023 - Luigi Fugaro - ACRE.pdfLuigi Fugaro
 
Red Hat Summit Connect 2023 - Redis Enterprise, the engine of Generative AI
Red Hat Summit Connect 2023 - Redis Enterprise, the engine of Generative AIRed Hat Summit Connect 2023 - Redis Enterprise, the engine of Generative AI
Red Hat Summit Connect 2023 - Redis Enterprise, the engine of Generative AILuigi Fugaro
 
Caching Patterns for lazy devs for lazy loading - Luigi Fugaro VDTJAN23
Caching Patterns for lazy devs for lazy loading - Luigi Fugaro VDTJAN23Caching Patterns for lazy devs for lazy loading - Luigi Fugaro VDTJAN23
Caching Patterns for lazy devs for lazy loading - Luigi Fugaro VDTJAN23Luigi Fugaro
 
Codemotion Milan '22 - Real Time Data - No CRDTs, no party!
Codemotion Milan '22 - Real Time Data - No CRDTs, no party!Codemotion Milan '22 - Real Time Data - No CRDTs, no party!
Codemotion Milan '22 - Real Time Data - No CRDTs, no party!Luigi Fugaro
 
OpenSlava 2018 - Cloud Native Applications with OpenShift
OpenSlava 2018 - Cloud Native Applications with OpenShiftOpenSlava 2018 - Cloud Native Applications with OpenShift
OpenSlava 2018 - Cloud Native Applications with OpenShiftLuigi Fugaro
 
Redis - Non solo cache
Redis - Non solo cacheRedis - Non solo cache
Redis - Non solo cacheLuigi Fugaro
 
JDV for Codemotion Rome 2017
JDV for Codemotion Rome 2017JDV for Codemotion Rome 2017
JDV for Codemotion Rome 2017Luigi Fugaro
 
2.5tier Javaday (italian)
2.5tier Javaday (italian)2.5tier Javaday (italian)
2.5tier Javaday (italian)Luigi Fugaro
 

More from Luigi Fugaro (9)

Ottimizzare le performance dell'API Server K8s come utilizzare cache e eventi...
Ottimizzare le performance dell'API Server K8s come utilizzare cache e eventi...Ottimizzare le performance dell'API Server K8s come utilizzare cache e eventi...
Ottimizzare le performance dell'API Server K8s come utilizzare cache e eventi...
 
Sharp Coding 2023 - Luigi Fugaro - ACRE.pdf
Sharp Coding 2023 - Luigi Fugaro - ACRE.pdfSharp Coding 2023 - Luigi Fugaro - ACRE.pdf
Sharp Coding 2023 - Luigi Fugaro - ACRE.pdf
 
Red Hat Summit Connect 2023 - Redis Enterprise, the engine of Generative AI
Red Hat Summit Connect 2023 - Redis Enterprise, the engine of Generative AIRed Hat Summit Connect 2023 - Redis Enterprise, the engine of Generative AI
Red Hat Summit Connect 2023 - Redis Enterprise, the engine of Generative AI
 
Caching Patterns for lazy devs for lazy loading - Luigi Fugaro VDTJAN23
Caching Patterns for lazy devs for lazy loading - Luigi Fugaro VDTJAN23Caching Patterns for lazy devs for lazy loading - Luigi Fugaro VDTJAN23
Caching Patterns for lazy devs for lazy loading - Luigi Fugaro VDTJAN23
 
Codemotion Milan '22 - Real Time Data - No CRDTs, no party!
Codemotion Milan '22 - Real Time Data - No CRDTs, no party!Codemotion Milan '22 - Real Time Data - No CRDTs, no party!
Codemotion Milan '22 - Real Time Data - No CRDTs, no party!
 
OpenSlava 2018 - Cloud Native Applications with OpenShift
OpenSlava 2018 - Cloud Native Applications with OpenShiftOpenSlava 2018 - Cloud Native Applications with OpenShift
OpenSlava 2018 - Cloud Native Applications with OpenShift
 
Redis - Non solo cache
Redis - Non solo cacheRedis - Non solo cache
Redis - Non solo cache
 
JDV for Codemotion Rome 2017
JDV for Codemotion Rome 2017JDV for Codemotion Rome 2017
JDV for Codemotion Rome 2017
 
2.5tier Javaday (italian)
2.5tier Javaday (italian)2.5tier Javaday (italian)
2.5tier Javaday (italian)
 

Recently uploaded

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 

Recently uploaded (20)

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
How to convert PDF to text with Nanonets

Unleashing the Power of Vector Search in .NET - DotNETConf2024.pdf

  • 1. Luigi Fugaro Unleashing the Power of Vector Search in .NET
  • 3. Join at slido.com #1041068
  • 4. Agenda ★ The Data Balance ★ Turning Data Into Vectors ★ Enter ML Embeddings ★ Redis as Vector Database ★ Redis OM .NET for Vector and more ★ Demo – Live Coding... dotnet run
  • 6. The Data Balance Growth IDC Report 2023 - https://www.box.com/resources/unstructured-data-paper Around 80% of the data generated by organizations is Unstructured
  • 7. The Data Balance Growth Data type: Unstructured (No inherent structure/many degrees of freedom ~ Docs, PDFs, images, audio, video); Quasi-Structured (Erratic patterns/formats ~ Clickstreams); Semi-Structured (There's a discernible pattern ~ Spreadsheets / XML / JSON); Structured (Schema/defined data model ~ Database)
  • 8. The Data Balance How to deal with unstructured data? Common approaches were labeling and tagging These are labor-intensive, subjective, and error-prone
  • 12. What are the common approaches to deal with Unstructured Data?
  • 14. Turning Data into Vectors What is a Vector? Numeric representation of something in N-dimensional space using floating-point numbers Can represent anything... entire documents, images, video, audio Quantifies features or characteristics of the item More importantly... they are comparable
  • 15. Enter ML Embeddings Using Machine Learning for Feature Extraction
  • 16. Enter ML Embeddings Machine Learning/Deep Learning has leaped forward in the last decade ML models outperform humans in many tasks nowadays CV (Computer Vision) models excel at detection/classification LLMs (Large Language Models) have advanced exponentially
  • 18. Enter ML Embeddings Automated Feature Engineering ML models extract latent features ML embeddings catch the gray areas between features The process of generating the embeddings is vectorizing
  • 19. What are Vector Embeddings?
  • 22. Visually https://jalammar.github.io/illustrated-word2vec Vectors “King” [ 0.50451 , 0.68607 , -0.59517 , -0.022801, 0.60046 , -0.13498 , -0.08813 , 0.47377 , -0.61798 , -0.31012 , -0.076666, 1.493 , -0.034189, -0.98173 , 0.68229 , 0.81722 , -0.51874 , -0.31503 , -0.55809 , 0.66421 , 0.1961 , -0.13495 , -0.11476 , -0.30344 , 0.41177 , -2.223 , -1.0756 , -1.0783 , -0.34354 , 0.33505 , 1.9927 , -0.04234 , -0.64319 , 0.71125 , 0.49159 , 0.16754 , 0.34344 , -0.25663 , -0.8523 , 0.1661 , 0.40102 , 1.1685 , -1.0137 , -0.21585 , -0.15155 , 0.78321 , -0.91241 , -1.6106 , -0.64426 , -0.51042 ]
  • 25. Vectors can be operated upon https://jalammar.github.io/illustrated-word2vec Vectors
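The claim that vectors can be operated upon is easiest to see with the classic word2vec analogy "king - man + woman ≈ queen". The sketch below is a minimal illustration, assuming hypothetical 3-dimensional vectors whose values are invented for this example (they are not taken from word2vec or from the talk); real embeddings have far more dimensions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Element-wise helpers for equal-length vectors.
double[] Add(double[] a, double[] b) => a.Zip(b, (x, y) => x + y).ToArray();
double[] Sub(double[] a, double[] b) => a.Zip(b, (x, y) => x - y).ToArray();

// Cosine similarity: dot product divided by the product of the vector lengths.
double Cosine(double[] a, double[] b)
{
    double dot = a.Zip(b, (x, y) => x * y).Sum();
    double normA = Math.Sqrt(a.Sum(x => x * x));
    double normB = Math.Sqrt(b.Sum(x => x * x));
    return dot / (normA * normB);
}

// Hypothetical toy "embeddings" (royalty, masculinity, femininity).
// These numbers are assumptions made up for illustration only.
var words = new Dictionary<string, double[]>
{
    ["king"]  = new[] { 0.9, 0.8, 0.1 },
    ["queen"] = new[] { 0.9, 0.1, 0.8 },
    ["man"]   = new[] { 0.1, 0.9, 0.1 },
    ["woman"] = new[] { 0.1, 0.1, 0.9 },
};

// king - man + woman
double[] result = Add(Sub(words["king"], words["man"]), words["woman"]);

// Pick the word whose vector points in the most similar direction.
string closest = words
    .OrderByDescending(kv => Cosine(result, kv.Value))
    .First().Key;

Console.WriteLine($"king - man + woman is closest to: {closest}");
```

With these toy values the arithmetic lands nearest to "queen"; with real embeddings the result depends on the model that produced them.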
  • 27. Vectorizing 1. Choose an Embedding Method 2. Clean and preprocess the data as needed 3. Train/Refine the embedding model 4. Generate Embeddings
  • 28. Vectorizing Better Models, better Vectors Embeddings can capture the semantics of complex data Option #1: Use a pre-trained model Option #2: Train your models with custom data Vector similarity is a downstream tool to analyze embeddings
  • 29. Vectorizing Similarity Metrics ➢ Measure “closeness” between vectors in multi-dimensional space ➢ Enable efficient similarity search in vector databases ➢ Improve relevance and precision of search results
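To make "closeness" concrete, here is a minimal sketch of the three metrics commonly used for this purpose: Euclidean (L2) distance, inner product, and cosine similarity. The function names and sample vectors are assumptions chosen for illustration, not code from the talk.

```csharp
using System;
using System.Linq;

// Inner (dot) product: sum of element-wise products.
double Dot(double[] a, double[] b) => a.Zip(b, (x, y) => x * y).Sum();

// Euclidean (L2) distance: straight-line distance between the two points.
double Euclidean(double[] a, double[] b) =>
    Math.Sqrt(a.Zip(b, (x, y) => (x - y) * (x - y)).Sum());

// Cosine similarity: dot product normalised by both vector lengths,
// so only the direction matters, not the magnitude.
double Cosine(double[] a, double[] b) =>
    Dot(a, b) / (Math.Sqrt(Dot(a, a)) * Math.Sqrt(Dot(b, b)));

double[] u = { 1.0, 2.0, 3.0 };
double[] v = { 2.0, 4.0, 6.0 };   // same direction as u, twice the length
double[] w = { -3.0, 0.0, 1.0 };  // orthogonal to u (dot product is zero)

Console.WriteLine($"cosine(u, v)    = {Cosine(u, v)}");
Console.WriteLine($"euclidean(u, v) = {Euclidean(u, v)}");
Console.WriteLine($"dot(u, w)       = {Dot(u, w)}");
```

Here `u` and `v` point the same way, so their cosine similarity is approximately 1 even though their Euclidean distance is not zero. This is why the choice of metric matters for a given embedding model.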
  • 32. VSS in Redis Redis as Vector Database
  • 33. Which algorithms calculate similarity/distance metrics?
  • 35. VSS in Redis Index and query vector data stored as BLOBs in Redis Hashes/JSON 3 distance metrics: Euclidean, Inner Product and Cosine 2 indexing methods: HNSW and Flat Pre-filtering queries with GEO, TAG, TEXT or NUMERIC fields
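Conceptually, a Flat index answers a query by comparing it against every stored vector (exact KNN), while HNSW trades some accuracy for speed (approximate search). The sketch below shows the brute-force idea with an optional tag pre-filter. It is an in-memory illustration only, not how Redis implements it; `FlatKnn`, the catalog entries, and the vectors are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine distance, computed here as 1 minus cosine similarity (smaller is closer).
double CosineDistance(double[] a, double[] b)
{
    double dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    return 1.0 - dot / (Math.Sqrt(na) * Math.Sqrt(nb));
}

// Exact K-nearest-neighbour search: optionally pre-filter by tag,
// score every remaining candidate, and keep the k closest ids.
List<string> FlatKnn(
    IEnumerable<(string Id, string Tag, double[] Vector)> items,
    double[] query, int k, string? tagFilter = null) =>
    items
        .Where(item => tagFilter == null || item.Tag == tagFilter)
        .OrderBy(item => CosineDistance(item.Vector, query))
        .Take(k)
        .Select(item => item.Id)
        .ToList();

// Hypothetical catalog with made-up 2-dimensional embeddings.
var catalog = new List<(string Id, string Tag, double[] Vector)>
{
    ("shirt-1", "Men",   new[] { 1.0, 0.0 }),
    ("shirt-2", "Women", new[] { 0.9, 0.1 }),
    ("shoe-1",  "Men",   new[] { 0.0, 1.0 }),
    ("shoe-2",  "Women", new[] { 0.1, 0.9 }),
};

double[] query = { 1.0, 0.05 };
List<string> nearest = FlatKnn(catalog, query, 2);
List<string> nearestWomen = FlatKnn(catalog, query, 2, "Women");

Console.WriteLine(string.Join(", ", nearest));
Console.WriteLine(string.Join(", ", nearestWomen));
```

The cost of this approach grows linearly with the number of stored vectors, which is the usual motivation for an approximate index such as HNSW on larger datasets.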
  • 36. Redis OM How to implement the solution
  • 38. A Redis Framework It’s more than Object Mapping Redis OM stands for Redis Object Mapping, a suite of libraries designed to facilitate object-oriented interaction with Redis. It simplifies working with Redis by abstracting direct commands into higher-level operations. https://github.com/redis/redis-om-dotnet Redis OM
  • 40. Why? ● Redis OM simplifies application development by abstracting Redis' complexity, increasing productivity, and enhancing code readability. What? ◆ Redis OM enables building real-time applications, supporting: ● Model Data ● Perform CRUD Operations ● Index and Search Data Redis OM
  • 41. Redis OM dotnet run --project ProductCatalog
  • 42. Redis OM Fashion Product Finder Breakdown ■ A Product domain mapped to Redis JSON ■ [Indexed] decorated field for text and numeric indexing ■ [ImageVectorizer] decorated field to generate embeddings ■ [SentenceVectorizer] decorated field to generate embeddings ■ Pre-filter the search if needed ■ Entity Streams to query for K nearest neighbors ■ Display results
  • 43. Demo… plan B Model → Redis JSON [Document]
      namespace ProductCatalog.Model;
      [Document(StorageType = StorageType.Json)]
      public class Product
      {
          // Other fields...
      }
  • 44. Demo… plan B Make Product Searchable → [Indexed] decoration Decorate what you want to make searchable
      namespace ProductCatalog.Model;
      [Document(StorageType = StorageType.Json)]
      public class Product
      {
          [RedisIdField]
          [Indexed]
          public int Id { get; set; }
          // Other fields...
          [Indexed]
          public string Gender { get; set; }
          [Indexed]
          public int? Year { get; set; }
          // Other fields...
      }
  • 45. Demo… plan B Automatic Embedding Generation [IVectorizer] decoration
      namespace ProductCatalog.Model;
      [Document(StorageType = StorageType.Json)]
      public class Product
      {
          // Other fields...
          // Image embedding, indexed with HNSW (approximate search) and cosine distance
          [Indexed(Algorithm = VectorAlgorithm.HNSW, DistanceMetric = DistanceMetric.COSINE)]
          [ImageVectorizer]
          public Vector<string> ImageUrl { get; set; }
          // Sentence embedding, indexed with Flat (exact search) and cosine distance
          [Indexed(Algorithm = VectorAlgorithm.FLAT, DistanceMetric = DistanceMetric.COSINE)]
          [SentenceVectorizer]
          public Vector<string> ProductDisplayName { get; set; }
      }
  • 46. Demo… plan B Searching with Fluent API [IVectorizer] decoration
      [HttpGet("byImage")]
      public IEnumerable<CatalogResponse> ByImage([FromQuery] string url)
      {
          var collection = _provider.RedisCollection<Product>();
          var response = collection.NearestNeighbors(x => x.ImageUrl, 15, url);
          return response.Select(CatalogResponse.Of);
      }
      [HttpGet("byDescription")]
      public IEnumerable<CatalogResponse> ByDescription([FromQuery] string description)
      {
          var collection = _provider.RedisCollection<Product>();
          var response = collection.NearestNeighbors(x => x.ProductDisplayName, 15, description);
          return response.Select(CatalogResponse.Of);
      }
  • 48. Redis OM The tools and techniques to unlock the value in Unstructured Data have evolved greatly...
  • 49. Redis OM Databases like Redis and frameworks like Redis OM can help!
  • 50. Redis as Vector Database FEATURES Storage: HASH | JSON; Indexing: HNSW (ANN) | Flat (KNN); Distance: L2 | Cosine | IP; Search Types: KNN/ANN | Hybrid | Range | Full Text; Management: Realtime CRUD operations, aliasing, temp indices, and more INTEGRATIONS Ecosystem integrations NEW REDIS ENTERPRISE 7.2 FEATURE Scalable search and query for improved performance, up to 16X compared to previous versions
  • 52. VSS Use cases Vector Similarity Search Use Cases Question & Answering
  • 53. VSS Use cases Vector Similarity Search Use Cases Context retrieval for Retrieval Augmented Generation (RAG) By pairing Redis Enterprise with Large Language Models (LLM) such as OpenAI's ChatGPT, you can give the LLM access to external contextual knowledge. ➔ Enables more accurate answers and helps prevent model 'hallucinations'. ➔ An LLM combines text fragments in a (most often) semantically correct way.
  • 54. VSS Use cases Vector Similarity Search Use Cases LLM Conversation Memory The idea is to improve the model quality and personalization through an adaptive memory. ➔ Persist all conversation history (memories) as embeddings in a vector database. ➔ A conversational agent checks for relevant memories to aid or personalize the LLM behaviour. ➔ Allows users to change topics seamlessly, without misunderstandings.
  • 55. VSS Use cases Vector Similarity Search Use Cases Semantic Caching Because LLM completions are expensive, semantic caching helps reduce the overall costs of the ML-powered application. ➔ Use vector database to cache input prompts. ➔ Cache hits evaluated by semantic similarity.
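The semantic caching idea can be sketched as follows: store each prompt's embedding next to its completion, and treat a new prompt as a cache hit when its embedding is similar enough to a stored one. This is a minimal in-memory illustration under stated assumptions: the embeddings are hypothetical hand-written vectors (a real application would obtain them from an embedding model and keep them in a vector database), and the 0.95 threshold is an arbitrary value chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine similarity between two equal-length vectors.
double Cosine(double[] a, double[] b)
{
    double dot = a.Zip(b, (x, y) => x * y).Sum();
    return dot / (Math.Sqrt(a.Sum(x => x * x)) * Math.Sqrt(b.Sum(x => x * x)));
}

// In-memory stand-in for the vector database: prompt embedding plus cached completion.
var cache = new List<(double[] Embedding, string Completion)>();

// Returns the completion of the most similar cached prompt if its similarity
// reaches the threshold; otherwise null, meaning the LLM must be called.
string? Lookup(double[] queryEmbedding, double threshold = 0.95)
{
    var best = cache
        .Select(e => (Score: Cosine(e.Embedding, queryEmbedding), e.Completion))
        .OrderByDescending(e => e.Score)
        .FirstOrDefault();
    return best.Completion != null && best.Score >= threshold ? best.Completion : null;
}

// Hypothetical embedding of a previously answered prompt.
cache.Add((new[] { 0.9, 0.1, 0.0 }, "Redis supports HNSW and Flat indexes."));

string? hit = Lookup(new[] { 0.88, 0.12, 0.01 });  // near-duplicate prompt
string? miss = Lookup(new[] { 0.0, 0.2, 0.9 });    // unrelated prompt

Console.WriteLine(hit ?? "(miss: call the LLM)");
Console.WriteLine(miss ?? "(miss: call the LLM)");
```

The threshold controls the trade-off: a lower value saves more LLM calls but risks returning an answer to a question that only looks similar.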
  • 57. Redis resources Additional resources for learning about Redis Redis Launchpad (launchpad.redis.com): Central place to find example apps that are built on Redis; Redis University (university.redis.com): Free online courses taught by Redis experts; Developers Portal (developer.redis.com): Create a database, Code your application, Explore your data; Redis Certification (university.redis.com/certification): Professional certification program for developers
  • 59. Which programming languages are actually supported by Redis OM?