This topic covers Azure solution architectures that combine IoT and AI to solve common business problems. Using two examples, a near-real-time recommender system and an object detection solution based on image recognition, we review each architecture, build it from the ground up, and illustrate how typical real-world challenges can be addressed.
4. Uniquely Identify Users
• Every user has a UniqueID, whether or not they are registered. For an unregistered user, the UniqueID can be generated in several ways, listed below.
BrowserID
• Unique ID generated from the browser's user agent string.
Browser|BrowserVersion|OS|OSVersion|Processor|MozillaMajorVersion|GeckoMajorVersion
ComputerID
• Generated from the user's IP address and HTTPS session key: getISP(requestIP)|getHTTPSClientKey()
FingerPrintID
• JavaScript-based fingerprinting using a modified fingerprint.js: FingerPrint.get()
SessionID
• Random key generated when the user first visits the site: BrowserID|ComputerID|randombytes(256)
GoogleID
• Generated from the __utma cookie: getCookie(__utma).uniqueid
Unregistered users
• Generate the UniqueID as above and store it in a cookie.
• User actions on the website are traced: searches, views, likes, and views of contact information.
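As a rough illustration of the BrowserID and SessionID formats above, the following Python sketch joins the fields with a pipe and hashes them. The use of SHA-256, the function names, the sample field values, and the choice of 32 random bytes are assumptions made for this example; the slide only specifies the pipe-separated layouts.

```python
import hashlib
import secrets


def browser_id(user_agent_fields):
    """Hash the pipe-joined user agent fields into a stable BrowserID.

    Hashing with SHA-256 is an assumption; the source only lists the fields.
    """
    raw = "|".join(str(field) for field in user_agent_fields)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def session_id(browser_id_value, computer_id_value, random_bytes=32):
    """Build BrowserID|ComputerID|random, using a cryptographic random source.

    random_bytes=32 is an illustrative choice, not the source's randombytes(256).
    """
    random_part = secrets.token_hex(random_bytes)
    return "|".join([browser_id_value, computer_id_value, random_part])


# Hypothetical field values, in the order given on the slide.
fields = ["Firefox", "121", "Windows", "10", "x64", "5", "20100101"]
bid = browser_id(fields)
sid = session_id(bid, "hypothetical-computer-id")
```

The BrowserID is deterministic for the same browser, while the SessionID changes on every call because of the random suffix; that suffix is what would be stored in the cookie.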
5. Functional Features
User Profiling
• Establish detailed user profiles by collecting information such as industry, preferences, recent searches, and location.
• Encourage users to enter additional preferences and requirements to fine-tune recommendations.
Contextual Awareness
• Use geolocation data to offer services relevant to the user's location.
• Open question: are there time-sensitive offers that expire during the day or are available for a limited time only? Consider temporal factors such as time of day and day of the week.
Behavioral Analysis
• Analyze user behaviour, including clicks, time spent on offer pages, and even offer previews.
• Analyze interaction behaviour (likes, comments).
6. Functional Features (Continued…)
Collaborative Filtering
• Implement collaborative filtering to reflect the preferences and feedback of similar users.
Supplier Score
• Integrate a supplier scoring system based on user reviews, ratings, and purchase history. Suppliers with high scores should be considered reputable and reliable service providers.
Real-time Feedback
• Allow users to provide feedback on recommended services directly through the banner, e.g. "hide" and "add to favorites", with sub-options to hide this offer or all offers from that supplier.
• The feedback shall be used to continuously improve the recommendations for that user and also to adjust the collaborative score.
Transparency
• Provide users with transparency on why certain services are recommended by highlighting key factors such as user preferences, location relevance, and supplier scores (i.e. by decomposing the score).
• This breakdown will also be useful for assessing the recommender during development.
7. Types of Feature Filters
Content-based filtering (similarity between item features)
• Uses only the descriptions and attributes of the items a user has previously consumed to model that user's preferences.
• The algorithm tries to recommend items that are similar to those the user liked in the past (or is examining in the present).
• Candidate items are compared with items previously rated by the user, and the best-matching items are recommended.
Collaborative item filtering (similarity between items based on interactions)
• The underlying assumption of the collaborative filtering approach is that if person A has the same opinion as person B on a set of items, A is more likely to share B's opinion on a given item than that of a randomly chosen person.
• The method makes automatic predictions (filtering) about the interests of a user by collecting preference or taste information from many users (collaborating).
8. Hybrid Feature Filters
Hybrid Filter Recommender
• Combining collaborative filtering and content-based filtering can be more effective than either pure approach in some cases.
• Hybrid methods can also be used to overcome some common problems in recommender systems, such as cold start (no previous history).
RecommenderHybridScore = RecommenderContentScore*ContentWeight
+ RecommenderCollaborativeScore*CollaborativeWeight
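The weighted sum above can be sketched in a few lines of Python. This is a minimal illustration that assumes each score has its own weight; the function names, the default weights, and the cold-start rule (fall back to content only when a user has few interactions) are hypothetical and not taken from the slides.

```python
def hybrid_score(content_score, collaborative_score,
                 content_weight=0.5, collaborative_weight=0.5):
    """Weighted sum of the content-based and collaborative scores."""
    return (content_score * content_weight
            + collaborative_score * collaborative_weight)


def cold_start_weights(interaction_count, min_interactions=5):
    """Hypothetical rule: with too little history, rely on content only."""
    if interaction_count < min_interactions:
        return 1.0, 0.0
    return 0.5, 0.5


# A new user with no history is scored from content alone.
content_w, collab_w = cold_start_weights(interaction_count=0)
new_user_score = hybrid_score(0.8, 0.4, content_w, collab_w)
```

Shifting the weights toward content for users without history is one simple way to handle the cold start case mentioned above.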
9. Feature Types
#   | Feature | Weight (Guest User / Auth User) | Type
F01 | Location | 5 / 5 | Content
F02 | Trending offers (offer views and interactions score) | 5 / 2 | Collaborative
F03 | Person/Organization keywords, profile | N/A / 10 | Collaborative
F04 | View time (time spent on offer during scrolling) (most recent = most important) | 3 | Collaborative
F05 | Search history, Type of service | 7 / 7 | Collaborative
F06 | Promoted offers (a paid service) | 10 / 5 | Content
F07 | Content (keywords, service type) | 10 / 10 | Content
F08 | Clicks (most recent = most important) | 15 / 10 | Collaborative
F09 | Interactions (likes, comments, views of contact details) (most recent = most important) | N/A / 20 | Collaborative
F10 | Realtime feedback - hide offer, hide supplier | 20 / 30 | Collaborative
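One plausible way to apply these weights is a normalized weighted sum over the features for which a signal is available. The sketch below is an assumption about how the weights might be used, not the method from the slides: it copies only a few rows (F01, F03, F07, F08, F10), assumes every signal is already scaled to the range 0 to 1, and uses hypothetical names throughout.

```python
# Subset of the weights above; F03 is omitted for guests because it is N/A.
WEIGHTS = {
    "guest": {"F01": 5, "F07": 10, "F08": 15, "F10": 20},
    "auth": {"F01": 5, "F03": 10, "F07": 10, "F08": 10, "F10": 30},
}


def weighted_feature_score(signals, user_kind):
    """Weighted average of the available feature signals (each in 0..1).

    Features without a weight for this user kind are ignored.
    """
    weights = WEIGHTS[user_kind]
    used = {name: w for name, w in weights.items() if name in signals}
    total_weight = sum(used.values())
    if total_weight == 0:
        return 0.0
    return sum(signals[name] * w for name, w in used.items()) / total_weight


# Hypothetical signals: the offer is nearby (F01) and moderately clicked (F08).
signals = {"F01": 1.0, "F08": 0.5}
guest_score = weighted_feature_score(signals, "guest")
auth_score = weighted_feature_score(signals, "auth")
```

Normalizing by the sum of the weights actually used keeps guest and authenticated scores on the same scale even though the two user kinds have different features available.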
10. Recommender Flow
Identification
• Unique User ID - used for tracking user behaviour on the website
• Anonymous users - ID generated automatically from the browser agent and IP location
• Authorized users
Incentivisation
• The web portal shall support SSO using both Facebook and Google to increase the likelihood of identifying users.
• Some information on an offer shall be hidden to encourage the end user to log in (e.g. contact info is visible only to registered users, although registration is free).
Data Collection
• User-Based Collaborative Data
• Anonymous user, Authenticated user
• Authenticated user only
• History of purchases
• History of visited offers (with more weight on the most recent offers accessed)
• History of interactions
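Giving more weight to the most recently visited offers can be sketched with an exponential decay. The decay form, the 7-day half-life, and the function names are assumptions chosen for illustration; the slides only state that recent visits should count more.

```python
def recency_weight(age_days, half_life_days=7.0):
    """Weight that halves every half_life_days (hypothetical default of 7)."""
    return 0.5 ** (age_days / half_life_days)


def recency_weighted_interest(visits, half_life_days=7.0):
    """Sum decayed weights per offer.

    visits is a list of (offer_id, age_days) pairs, where age_days is the
    time since the visit.
    """
    scores = {}
    for offer_id, age_days in visits:
        weight = recency_weight(age_days, half_life_days)
        scores[offer_id] = scores.get(offer_id, 0.0) + weight
    return scores


# Offer "a" was visited today and a week ago; offer "b" two weeks ago.
interest = recency_weighted_interest([("a", 0), ("a", 7), ("b", 14)])
```

With these inputs, offer "a" accumulates 1.0 + 0.5 and offer "b" only 0.25, so the more recent activity dominates.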
11. Recommender Flow (Continued…)
• Item-Based Collaborative Data
• User preferences, behavior, and interactions with the platform - historical ratings, reviews, and any explicit feedback.
• Time spent reading an offer (compare Facebook and Instagram reels, as well as TikTok).
• The offer feed may be different for each user and could be implemented with infinite scroll, where the user navigates through a summary view (images changing in a slideshow, title, short description) and the time spent reading the summary is used as a feature indicating the end user's level of interest.
• Content-based Data
• Information about the items or services being recommended - attributes, features, or content-related information about each item.
Data Preprocessing
• Clean and preprocess the data: handle missing values, remove duplicates, and format the data for easy processing.
User-Based Collaborative Filtering
o Calculate user-based collaborative filtering score
o Build a user-item matrix: Create a matrix where rows represent users, columns represent items, and the cells contain
ratings or interactions.
o Calculate user similarities: Use metrics like cosine similarity or Pearson correlation to measure the similarity between
users based on their preferences.
o Predict ratings for items: Predict the ratings for items that a user has not interacted with based on the ratings of similar
users.
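The three steps above can be sketched with plain Python lists. This is a minimal illustration under several assumptions: a rating of 0 means "no interaction", similarity is cosine over the full rating rows, and the prediction is a similarity-weighted average. The sample ratings and function names are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_user_based(matrix, user, item):
    """Predict matrix[user][item] from users who have rated that item."""
    weighted_sum = 0.0
    similarity_sum = 0.0
    for other, row in enumerate(matrix):
        if other == user or row[item] == 0:
            continue  # skip the user themselves and users without a rating
        similarity = cosine_similarity(matrix[user], row)
        weighted_sum += similarity * row[item]
        similarity_sum += abs(similarity)
    return weighted_sum / similarity_sum if similarity_sum else 0.0


# Rows are users, columns are items, 0 means no interaction (made-up data).
ratings = [
    [5, 4, 0],
    [5, 4, 5],
    [1, 0, 1],
]
prediction = predict_user_based(ratings, user=0, item=2)
```

User 0 resembles user 1 much more than user 2, so the prediction for the missing item is pulled toward user 1's rating of 5.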
12. Recommender Flow (Continued…)
Item-Based Collaborative Filtering
• Transpose the user-item matrix: Create an item-user matrix by transposing the user-item matrix.
• Calculate item similarities: Measure the similarity between items based on user interactions using techniques like cosine similarity.
• Predict ratings for users: Predict the ratings a user might give to items they haven't interacted with based on the ratings of similar items.
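The item-based steps can be sketched the same way, again with plain Python lists. The sketch assumes a rating of 0 means "no interaction" and predicts with a similarity-weighted average over the items the user has already rated; the sample data and function names are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def transpose(matrix):
    """Turn a user-item matrix into an item-user matrix."""
    return [list(column) for column in zip(*matrix)]


def predict_item_based(matrix, user, item):
    """Predict matrix[user][item] from the user's ratings of similar items."""
    item_rows = transpose(matrix)
    weighted_sum = 0.0
    similarity_sum = 0.0
    for other, other_row in enumerate(item_rows):
        if other == item or matrix[user][other] == 0:
            continue  # only use other items this user has rated
        similarity = cosine_similarity(item_rows[item], other_row)
        weighted_sum += similarity * matrix[user][other]
        similarity_sum += abs(similarity)
    return weighted_sum / similarity_sum if similarity_sum else 0.0


# Rows are users, columns are items, 0 means no interaction (made-up data).
ratings = [
    [5, 4, 0],
    [5, 4, 5],
    [1, 0, 1],
]
prediction = predict_item_based(ratings, user=0, item=2)
```

Here the prediction for user 0 is derived from that user's own ratings of items 0 and 1, weighted by how similarly those items were rated by everyone else.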
Content Filtering
• Extract features for items: For each item, extract relevant features or attributes. This could include textual data, categorical data, or any other relevant information.
• Build an item-feature matrix where rows represent items, columns represent features, and the cells contain the values of the features.
• Calculate content similarities: Use similarity measures (e.g., cosine similarity) to determine how similar items are based on their features.
• Combine collaborative and content similarities: This could involve weighting each similarity measure based on its importance.
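A minimal sketch of the content filtering steps, assuming the features have already been extracted into numeric vectors (for example one-hot encoded service types). The offer names, feature layout, and the 50/50 default weighting are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(item_features, target):
    """Rank all other items by content similarity to the target item."""
    target_vector = item_features[target]
    ranked = [
        (name, cosine_similarity(target_vector, vector))
        for name, vector in item_features.items()
        if name != target
    ]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


def combined_similarity(content_sim, collaborative_sim, content_weight=0.5):
    """Weight the two similarity measures; the weights sum to 1."""
    return content_weight * content_sim + (1 - content_weight) * collaborative_sim


# Item-feature matrix: rows are items, columns are made-up binary features.
item_features = {
    "offer_a": [1, 0, 1],
    "offer_b": [1, 0, 1],
    "offer_c": [0, 1, 0],
}
ranking = most_similar(item_features, "offer_a")
```

In this example offer_b shares all features with offer_a and ranks first, while offer_c shares none and gets a similarity of 0.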
Hybrid Recommender
• Combine the recommendations from collaborative and content filtering, giving appropriate weight to each.
13. Evaluation and Improvement
Evaluation
• Split the data into training and testing sets. Evaluate the performance of the recommender system by comparing predicted ratings or recommendations with actual user interactions in the testing set.
• Use metrics such as Root Mean Squared Error (RMSE), i.e. the distance between the real and the predicted score.
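The split and the RMSE metric can be sketched as follows. The 80/20 split, the fixed seed, and the function names are assumptions for this example.

```python
import math
import random


def train_test_split(interactions, test_fraction=0.2, seed=42):
    """Shuffle a copy of the interactions and split off a test set."""
    shuffled = list(interactions)
    random.Random(seed).shuffle(shuffled)
    test_size = int(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


def rmse(actual, predicted):
    """Root mean squared error between actual and predicted scores."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


train, test = train_test_split(range(10))
error = rmse([1, 2], [2, 3])  # every prediction is off by exactly 1
```

A lower RMSE means the predicted scores are closer to the real ones; a value of 0 means a perfect match on the test set.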
Continuous Improvement
• Implement mechanisms to continuously update and retrain the recommender system as new data becomes available.
Takeaways
• Content Based Recommendation System
o https://www.youtube.com/watch?v=9_YdIjGl2m4
• Kaggle Music Recommender
o https://www.kaggle.com/code/vatsalmavani/music-recommendation-system-using-spotify-dataset
• Kaggle Movie Recommender
o https://www.kaggle.com/code/rounakbanik/movie-recommender-systems
• Kaggle E-commerce Recommender
o https://www.kaggle.com/code/shawamar/product-recommendation-system-for-e-commerce
• Kaggle Content based Recommender
o https://www.kaggle.com/code/omeroruccelik/content-based-recommendation-systems
• Recommender System in Python
o https://www.kaggle.com/code/gspmoreira/recommender-systems-in-python-101
• Azure ML Recommender System
o https://azure.microsoft.com/en-us/blog/building-recommender-systems-with-azure-machine-learning-service/
14. App Service - Hybrid Connection
• Access on-premises AZ SQL DB from the cloud
• Install Hybrid Connection Manager on-premises
17. Azure Service-to-Function Mapping
Function App
• Receives images for scoring (Flow 1)
• Receives predictions to be saved (Flow 2)
• Synchronizes device configuration with the DB
AKS
• Hosting AI Models
AZ ML Service
• Labeling images
• Datasets
• Models
• Training
• Deployments
Function App (Builder)
• Builds datasets
• Creates ML job
• Deploys models
Web App
• Shows prediction results
• Manages known items and daily tasks (menu)
Azure SQL DB
• Asset Structure
• Menus
• Predictions
IoT Hub
• Device Configuration
Storage
• Stores prediction images
18. Gateway Module Architecture Flow
[Diagram: gateway module processing steps]
• Camera captures image
• Deduplication (scene has changes)
• Check Target - crop image to tray area
• Deduplication - check image focus
• Check for Known Items (Flow 1)
• Deduplication - check items on tray
• GDPR Check - personal items
• Upload image to storage
• Persist Image (Flow 2)
AI models: Tray Detection AI Model, GDPR Detection AI Model
19. Auxiliary Flows
Flow 1: Send Items for Object Detection
Flow 2: Saving Prediction Results
[Diagram components]
• RPI Device, API Management, Dynamic Endpoint, AZ ML Deployment Endpoint
• RPI Device, Azure Function, Dynamic Endpoint
20. Object Scene Deduplication
• Deduplication 1: The image must be at least 20% different than the previous one. Checked by comparing image hash values.
• Deduplication 2: The image must pass a blurriness threshold. Checked via pixel variance.
• Deduplication 3: Compare the polygons (fingerprints) formed by the centroids of the detected foods in the two images.
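The three checks can be sketched in plain Python on small grayscale images represented as lists of rows. Everything beyond the 20% threshold is an assumption: the slides do not say which image hash is used, so an average hash on an already downscaled image stands in for it, and the variance threshold, the centroid tolerance, and all function names are hypothetical.

```python
import math


def average_hash(gray):
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def scene_changed(previous, current, min_difference=0.20):
    """Deduplication 1: at least 20% of the hash bits must differ."""
    h1, h2 = average_hash(previous), average_hash(current)
    differing = sum(a != b for a, b in zip(h1, h2))
    return differing / len(h1) >= min_difference


def pixel_variance(gray):
    """Variance of pixel intensities; low values suggest a flat or blurry image."""
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def is_sharp(gray, min_variance=100.0):
    """Deduplication 2: hypothetical variance threshold."""
    return pixel_variance(gray) >= min_variance


def same_layout(centroids_a, centroids_b, tolerance=10.0):
    """Deduplication 3: same number of items, each centroid within tolerance."""
    if len(centroids_a) != len(centroids_b):
        return False
    pairs = zip(sorted(centroids_a), sorted(centroids_b))
    return all(math.dist(a, b) <= tolerance for a, b in pairs)


top_dark = [[0, 0], [255, 255]]
top_bright = [[255, 255], [0, 0]]
flat = [[10, 10], [10, 10]]
```

An image would be processed further only if the scene changed, the image is sharp, and the centroid layout differs from the previous image; sorting the centroids is a simplification that works when items move only slightly between frames.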
23. Detection Issues from Labeling
• Too small polygon shape
• Too simple polygon shape
• Too complex polygon shape
• Wrong polygon shape
• ML assisted labeling glitch