- The document discusses various topics related to perception, representation, structure, and recognition of visual concepts including taxonomic hierarchies, conceptual categories, and flexible knowledge structures.
- Studies are described that examine the conceptual categories emerging at different layers of deep neural networks trained on visual datasets, as well as investigations into semantic representations derived from object co-occurrence in scenes.
- The analysis of neural network representations and human behavioral data suggests a more flexible representation of conceptual knowledge that captures cross-cutting relationships rather than a pure hierarchical structure.
2. Quillian, 1966
Knowledge structure
• A taxonomic hierarchy can provide an efficient mechanism for storing and retrieving semantic information.
• General benefits of this structure:
  • Inheritance property
  • Economy of use
  • Generalization
• Semantic deficit
• Cognitive development
(M. R. Quillian, E. Rosch, E. Warrington)
→ need for a more flexible structure of conceptual knowledge
3. • The superordinate level contains a very wide range of object categories with different appearances and shapes.
Frequency feature set (FreqFeat):
FI = F(input_image)   (2-D Fourier transform)
magnitude(x, y) = sqrt(Re(FI(x, y))² + Im(FI(x, y))²)
phase(x, y) = tan⁻¹(Im(FI(x, y)) / Re(FI(x, y)))
FreqFeat(1) = |magnitude(x, y)|
FreqFeat(2) = log(1 + magnitude(x, y))
FreqFeat(3) = phase(x, y)
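The FreqFeat definitions above can be sketched in NumPy (a minimal illustration; the function name and the grayscale input are our assumptions):

```python
import numpy as np

def freq_features(img):
    """Frequency-domain features of a grayscale image, following the
    FreqFeat definitions: magnitude, log-magnitude, and phase of the FFT."""
    FI = np.fft.fft2(img)                 # FI = F(input_image)
    magnitude = np.abs(FI)                # sqrt(Re^2 + Im^2)
    phase = np.arctan2(FI.imag, FI.real)  # tan^-1(Im / Re)
    feat1 = magnitude                     # FreqFeat(1)
    feat2 = np.log(1 + magnitude)         # FreqFeat(2)
    feat3 = phase                         # FreqFeat(3)
    return feat1, feat2, feat3
```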
Gabor feature set (GaborFeat):
GI(x, y, s, o) = Gabor(input_image, s, o)
gaborEnergy(s, o) = Σₓ,ᵧ |GI(x, y, s, o)|
orientations o ∈ {0, π/6, π/3, π/2, 2π/3, 5π/6}, scales s ∈ {0.5, 1, 1.5, 2}
Entropy feature set (EntropyFeat):
entropy(I) = −Σ H(I)·log(H(I)), where H(I) is the normalized intensity histogram of image I
EntropyFeat(1) = entropy(input_image_RGB)
EntropyFeat(2) = entropy(input_image_gray)
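A minimal sketch of the entropy features, assuming a 256-bin intensity histogram and a mean-over-channels grayscale conversion:

```python
import numpy as np

def image_entropy(img, bins=256):
    """Shannon entropy of the normalised intensity histogram:
    entropy = -sum_i H(i) * log(H(i))."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                        # skip empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())

def entropy_feat(img_rgb):
    """EntropyFeat(1) on the RGB image, EntropyFeat(2) on a grayscale version."""
    gray = img_rgb.mean(axis=2)         # simple grayscale conversion (assumption)
    return image_entropy(img_rgb), image_entropy(gray)
```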
Color histogram feature set (colorHFeat):
Diversity = the number of filled bins of the histogram
Variability = the number of peaks of the histogram
colorHFeat(1) = diversity(input_image_RGB)
colorHFeat(2) = variability(input_image_RGB)
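The diversity and variability measures can be sketched directly from their definitions (the bin count and the strict-local-maximum peak criterion are our assumptions):

```python
import numpy as np

def color_hist_features(img_rgb, bins=32):
    """Diversity = number of filled histogram bins;
    Variability = number of peaks (strict local maxima) of the histogram."""
    hist, _ = np.histogram(img_rgb, bins=bins, range=(0, 256))
    diversity = int((hist > 0).sum())
    inner = hist[1:-1]                  # a peak must exceed both neighbours
    variability = int(((inner > hist[:-2]) & (inner > hist[2:])).sum())
    return diversity, variability
```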
[Bar charts: clustering quality (RI, F, P, R) of the four feature sets — FreqFeat, GaborFeat, EntropyFeat, colorHFeat — covering frequency, orientation, color, and entropy information.]
Global strategy:
for im = 1 to size(ImageDataset) do
    InputImage = ImageDataset(im)
    BF = BasicFeatures(InputImage)
    CF(im, :) = ComplexFeatures(BF)
end for
Cluster(CF)
Local strategy:
for im = 1 to size(ImageDataset) do
    image = ImageDataset(im)
    for i = 1 to exploringWindowsNum do
        InputImage = randomPatch(image)
        BF(i, :) = BasicFeatures(InputImage)
        CFi(i, :) = ComplexFeatures(BF, InputImage)
    end for
    CF(im, :) = Average(CFi)
end for
Cluster(CF)
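Both strategies can be sketched end-to-end; `complex_features` below is a stand-in descriptor, not the paper's exact feature set:

```python
import numpy as np

def complex_features(img):
    """Stand-in for ComplexFeatures: any fixed-length descriptor works;
    here, a 4-dim summary of spectrum and intensity statistics."""
    FI = np.fft.fft2(img)
    return np.array([np.abs(FI).mean(), np.angle(FI).std(),
                     img.mean(), img.std()])

def global_strategy(dataset):
    """Global strategy: one descriptor per whole image; the CF matrix is
    then clustered (e.g. with k-means)."""
    return np.stack([complex_features(img) for img in dataset])

def local_strategy(dataset, exploring_windows=10, patch=16, seed=0):
    """Local strategy: average descriptors of randomly explored patches."""
    rng = np.random.default_rng(seed)
    cf = []
    for img in dataset:
        h, w = img.shape
        ys = rng.integers(0, h - patch + 1, exploring_windows)
        xs = rng.integers(0, w - patch + 1, exploring_windows)
        cfi = [complex_features(img[y:y + patch, x:x + patch])
               for y, x in zip(ys, xs)]
        cf.append(np.mean(cfi, axis=0))
    return np.stack(cf)
```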
[Bar charts: clustering quality (RI, F, P, R) of FreqFeat, GaborFeat, EntropyFeat, and colorHFeat on the Hemera and Caltech-coil datasets.]
• We compared the discriminative power of the energy of low-level visual features, i.e., the color, orientation, and frequency feature sets, in an unsupervised manner.
Superordinate level: Artificial/Natural
Sadeghi, Z., Ahmadabadi, M. N., & Araabi, B. N. (2013). Unsupervised categorization of objects into artificial and natural
superordinate classes using features from low-level vision. International Journal of Image Processing, 7(4), 339-352.
4. Basic (intermediate) level: Animal/Plant
[Figure: global shape descriptors for two example silhouettes — vertical projection, horizontal projection, left profile, right profile, top profile, bottom profile.]
[Plot: classification accuracy vs. number of hidden units (raw, 25, 50, 75, 100) for the six descriptors h, v, l, r, t, b.]

Unsupervised results (%):
            h      v      l      r      t      b      [h,v,t,b]
P         68.57  69.17  60.94  59.08  67.77  65.07   70.21
R         59.98  59.70  53.70  58.87  58.02  55.68   60.39
F1-score  63.99  64.09  57.09  58.97  62.52  60.01   64.93
accuracy  60.57  60.93  52.86  52.17  59.37  56.66   61.91
Shape descriptors: moment, profile, and projection shape descriptors
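The profile and projection descriptors can be sketched for a binary silhouette (the handling of empty rows/columns is a simplification of ours):

```python
import numpy as np

def shape_descriptors(mask):
    """Projection/profile descriptors of a binary object mask.

    Projections count foreground pixels per column/row; profiles record, for
    each row/column, the index of the first foreground pixel seen from the
    corresponding border (0 for empty rows/columns in this simple sketch).
    """
    v = mask.sum(axis=0)                                   # vertical projection
    h = mask.sum(axis=1)                                   # horizontal projection
    l = mask.argmax(axis=1)                                # left profile
    r = mask.shape[1] - 1 - mask[:, ::-1].argmax(axis=1)   # right profile
    t = mask.argmax(axis=0)                                # top profile
    b = mask.shape[0] - 1 - mask[::-1, :].argmax(axis=0)   # bottom profile
    return {'h': h, 'v': v, 'l': l, 'r': r, 't': t, 'b': b}
```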
[Bar chart: boys vs. girls, counts 0–50.]
Children's mental object representations: the shape concept is the dominant strategy.
Sadeghi, Z. (2019). Visual Categorization of Objects into Animal and Plant Classes Using Global Shape
Descriptors. arXiv preprint arXiv:1901.11398.
5. Utilizing hierarchical information
We defined a dictionary of features based on a PCA approach in two modes: flat and hierarchical subspaces. Our aim was to show the effect of contextual prior information on improving recognition accuracy.
[Bar charts: per-class recognition accuracy for 12 Caltech-101 classes (cougar-body, elephant, flamingo, gerenuk, pigeon, rooster, bonsai, joshua-tree, lotus, strawberry, sunflower, water-lilly) in flat vs. conceptual modes, using the total vs. half the number of eigenvectors; #nt = 20, 18, 30, 24.]
flat mode: u = Aᵀ·P
conceptual mode: u_S = A_Sᵀ·P, S ∈ {animal, plant}; subFeat = [u_animalS, u_plantS]
[Diagram: SVM classification on eigenspace features — the flat mode uses a single eigenspace; the hierarchical mode uses separate animal and plant subspaces.]
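The flat vs. conceptual eigenspace features can be sketched with PCA via SVD (dimensions, helper names, and the centering step are our assumptions):

```python
import numpy as np

def pca_basis(X, k):
    """Columns of A = top-k principal directions of the row-data matrix X."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T                                  # shape (d, k)

def flat_features(A, p):
    """Flat mode: project sample p onto one shared eigenspace, u = A^T p."""
    return A.T @ p

def conceptual_features(A_animal, A_plant, p):
    """Conceptual (hierarchical) mode: concatenate projections onto the
    class-specific animal and plant subspaces, subFeat = [u_animal, u_plant]."""
    return np.concatenate([A_animal.T @ p, A_plant.T @ p])
```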
Sadeghi, Z., Araabi, B. N., & Ahmadabadi, M. N. (2015). A computational approach towards visual object
recognition at taxonomic levels of concepts. Computational intelligence and neuroscience, 2015, 72-72.
6. Sadeghi, 2016
• The human knowledge structure of visual form and appearance information was examined using a behavioral dataset.
• We trained a 3-layer deep belief network on this dataset and performed an unsupervised learning scheme on the obtained deep representations.
• There is a progressive differentiation across the layers of a deep network trained on data with hierarchical structure.
• In a distributed approach such as a neural network, connections are potentially sensitive to many kinds of structure.
Developmental learning in DNNs
• There is a progression in depth across the hidden layers of the DBN: low-level layers represent finer distinctions and high-level layers represent coarser distinctions.
Sadeghi, Z. (2016). Deep learning and developmental learning: emergence of fine-to-coarse conceptual categories at layers of deep belief network. Perception, 45(9), 1036-1045.
7. Featural vs. distributed approach in object representation
[Figures: similarity matrices for objects in the scene dataset, the feature dataset, and the verbal dataset.]
• Like the feature-similarity approach, representations derived from the scene dataset revealed strong relationships within category co-ordinates.
• Unlike the feature approach, analysis of the scene dataset also captured information about cross-category associations.
• Structure of semantic relationships:
  • Taxonomic similarity: object concepts are related to the degree to which they share basic properties or features.
  • Associative similarity: concepts (words) are related to the degree to which they occur in similar linguistic contexts.
• Here, we investigated whether the second approach could be applied to non-verbal patterns of object co-occurrence in natural environments.
Sadeghi, Z., McClelland, J. L., & Hoffman, P. (2015). You shall know an object by the company it keeps: An investigation of semantic
representations derived from object co-occurrence in visual scenes. Neuropsychologia, 76, 52-61.
8. Correlation matrix
The hierarchical tree captures many strong similarity relations (reflected by dark red near the main diagonal) but also misses many others (dark red away from the main diagonal).
Human knowledge
structure
Agglomerative clustering
A discrete structure may often provide an imperfect guide to the full structure present in a dataset. We offer the view that semantic structure might best be captured by a more flexible system of representation that can be sensitive to multiple types of structure present in a dataset.
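The agglomerative analysis can be sketched with SciPy, clustering directly from a correlation/similarity matrix (average linkage is our choice of method):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def cluster_similarity_matrix(sim, n_clusters):
    """Build an agglomerative (average-linkage) tree from a similarity
    matrix and cut it into n_clusters groups. The resulting discrete tree
    captures strong near-diagonal structure but cannot represent
    cross-cutting relations."""
    dist = 1.0 - np.asarray(sim, float)    # similarity -> dissimilarity
    dist = (dist + dist.T) / 2.0           # enforce exact symmetry
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method='average')
    return fcluster(Z, n_clusters, criterion='maxclust')
```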
• The first dimension identifies aquatic vs. non-aquatic mammals;
• the second identifies predators vs. prey;
• and the third picks out the size dimension.
These are cross-cutting dimensions.
McClelland, J. L., Sadeghi, Z., & Saxe, A. M. (2017). A Critique of Pure Hierarchy:
Uncovering Cross-Cutting Structure in a Natural Dataset. In Neurocomputational
Models of Cognitive Development and Processing: Proceedings of the 14th Neural
Computation and Psychology Workshop (pp. 51-68).
9. Neurons’ encoding and representation
• quantification of the degree of neurons’ responses to different classes
The multi-faceted degree MF(nᵢ) of a neuron nᵢ is computed from the flatness(nᵢ) and sparsity(nᵢ) of its responses.
• Localist representation: neurons respond selectively to one thing.
• Distributed representation: each unit or neuron is involved in coding many different things.
The multi-faceted degree quantifies the number of sources of information that stimulate each neuron.
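A sketch of how a neuron's class-response profile can be quantified; the specific formulas below (Treves–Rolls sparseness and a normalised-entropy flatness) are our illustrative choices for the quantities MF combines, not necessarily the paper's exact definitions:

```python
import numpy as np

def sparsity(r):
    """Treves-Rolls sparseness of a neuron's mean responses r over classes:
    ~1/len(r) for a one-class (localist) neuron, ~1 for a flat profile."""
    r = np.asarray(r, float)
    n = r.size
    return (r.sum() / n) ** 2 / ((r ** 2).sum() / n)

def flatness(r):
    """Normalised entropy of the response profile:
    1 = responds equally to all classes, 0 = responds to a single class."""
    p = np.asarray(r, float)
    p = p / p.sum()
    p = p[p > 0]                        # skip zeros to avoid log(0)
    return float(-(p * np.log(p)).sum() / np.log(len(r)))
```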
Rodrigo Quian Quiroga, Itzhak Fried and Christof Koch, 2013
[Figure: example responses of an SF (single-faceted) vs. an MF (multi-faceted) neuron.]
IC visualization can reveal the multi-responsive property of neurons based on the similarity of the patterns captured by each component.
Sadeghi, Z. (2020). Conceptual Content in Deep Convolutional Neural Networks: An analysis into multi-
faceted properties of neurons. In Proceedings of International Joint Conference on Computational
Intelligence: IJCCI 2019 (pp. 19-31). Springer Singapore.
10. Attention in Object recognition
• Comparing the performance of model and human in rapid object recognition in the task of animal/non-animal classification
Linsley et al., 2018
Eberhardt et al., 2014
Importance maps from human viewpoints in ordered time frames.
While recognition accuracy increases with higher stages of visual processing, human decisions agreed best with predictions from intermediate stages.
• cueing deep nets to more meaningful
image regions derived from experimental
study
• Comparing performance of human and
model in different levels of occlusion
Sadeghi, Z. (2020). An Investigation on Performance of Attention Deep Neural Networks in Rapid Object
Recognition. In Intelligent Computing Systems: Third International Symposium, ISICS 2020, Sharjah,
United Arab Emirates, March 18–19, 2020, Proceedings 3 (pp. 1-10). Springer International Publishing.
11. Occluded object recognition
[Figure: example stimuli for the consistent and inconsistent cases.]
Comparison                               p-value
Hit const vs. hit inconst                0.0027
Miss const vs. miss inconst              0.0027
Sup hit const vs. sup hit inconst        0.0027
Sup miss const vs. sup miss inconst      0.0027
Hypo_pos1 vs. hypo_neg1                  0.0027
Hypo_pos2 vs. hypo_neg2                  0.0027
Resp-time const vs. inconst              4.6921e-11
Sadeghi, Z. (2020). The effect of top-down
attention in occluded object recognition. arXiv
preprint arXiv:2007.10232.
12. Visualization and information analysis
Sadeghi, Z. (2019, September). An Information Analysis
Approach into Feature Understanding of Convolutional
Deep Neural Networks. In Machine Learning, Optimization,
and Data Science: 5th International Conference, LOD
2019, Siena, Italy, September 10–13, 2019,
Proceedings (pp. 36-44).