TENSOR DECOMPOSITION WITH PYTHON
1. TENSOR DECOMPOSITION WITH PYTHON
LEARNING STRUCTURES FROM MULTIDIMENSIONAL DATA
ANDRÉ PANISSON
@apanisson
ISI Foundation, Torino & New York City
2. WHAT IS DATA DECOMPOSITION?
DECOMPOSITION == FACTORIZATION
Representing a dataset as a sum of (interpretable) parts
▸ Represent data as the combination of many components / factors
▸ Dimensionality reduction: each new dimension represents a latent variable:
▸ text corpus => topics
▸ shopping behaviour => segments (user segmentation)
▸ social network => groups, communities
▸ psychology surveys => personality traits
▸ electronic medical records => health conditions
▸ chemical solutions => chemical ingredients
4. DATA DECOMPOSITION
▸ Decomposition of data represented in two dimensions:
MATRIX FACTORIZATION
▸ text: documents X terms
▸ surveys: subjects X questions
▸ electronic medical records: patients X diagnosis/drugs
▸ Decomposition of data represented in more dimensions:
TENSOR FACTORIZATION
▸ social networks: user X user (adjacency matrix) X time
▸ text: authors X terms X time
▸ spectroscopy:
solution sample X wavelength (emission) X wavelength (excitation)
5. WHY TENSOR FACTORIZATION + PYTHON?
▸ Matrix Factorization is already used in many fields
▸ Tensor Factorization is becoming very popular
for multiway data analysis
▸ TF is very useful to explore time-varying network data
▸ But still, the most used tool is Matlab
▸ There’s room for improvement in
the Python libraries for TF
7. FACTOR ANALYSIS
Spearman ~1900
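$X \approx WH$
$X_{\text{tests} \times \text{subjects}} \approx W_{\text{tests} \times \text{intelligences}} \, H_{\text{intelligences} \times \text{subjects}}$
Spearman, 1927: The abilities of man.
[Figure: X (tests × subjects) ≈ W (tests × Int.) · H (Int. × subjects)]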
8. TOPIC MODELING / LATENT SEMANTIC ANALYSIS
Blei, David M. "Probabilistic topic models." Communications of the ACM 55.4 (2012): 77-84.
[Figure: three panels, "Topics", "Documents", and "Topic proportions and assignments". Each topic is a list of words with probabilities between 0.01 and 0.04; example topics: (gene, dna, genetic), (life, evolve, organism), (brain, neuron, nerve), (data, number, computer)]
9. TOPIC MODELING / LATENT SEMANTIC ANALYSIS
X≈WH
Non-negative Matrix Factorization (NMF):
(~1970 Lawson, ~1995 Paatero, ~2000 Lee & Seung)
2005 Gaussier et al. "Relation between PLSA and NMF and implications."
$\arg\min_{W,H} \| X - WH \|$ s.t. $W, H \ge 0$
[Figure: X (documents × terms) ≈ W (documents × topic) · H (topic × terms); annotation: "Sparse Matrix!"]
10. NON-NEGATIVE MATRIX FACTORIZATION (NMF)
NMF gives a parts-based representation
(Lee & Seung, Nature 1999)
[Figure: the Original image written as a product of two factors, once for NMF and once for PCA]
NMF is similar to Spectral Clustering
(Ding et al. - SDM 2005)
$\arg\min_{W,H} \| X - WH \|$ s.t. $W, H \ge 0$
$W \leftarrow W \bullet \frac{X H^T}{W H H^T}$
$H \leftarrow H \bullet \frac{W^T X}{W^T W H}$
NMF brings interpretation!
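The multiplicative update rules on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the scikit-learn implementation: the function name `nmf_mu`, the random initialisation, the iteration count and the small `eps` added to the denominators (to avoid division by zero) are all assumptions made for the example. The products and divisions are element-wise, matching the • in the update rules.

```python
import numpy as np

def nmf_mu(X, n_components, n_iter=200, eps=1e-12, seed=0):
    """NMF by multiplicative updates (hypothetical helper, for illustration)."""
    rng = np.random.default_rng(seed)
    # Non-negative random start; the updates keep W and H non-negative
    W = rng.random((X.shape[0], n_components))
    H = rng.random((n_components, X.shape[1]))
    for _ in range(n_iter):
        # W <- W * (X H^T) / (W H H^T), element-wise
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        # H <- H * (W^T X) / (W^T W H), element-wise
        H *= (W.T @ X) / (W.T @ W @ H + eps)
    return W, H

# Toy non-negative matrix built from 4 hidden components (made-up data)
rng = np.random.default_rng(1)
X = rng.random((30, 4)) @ rng.random((4, 20))
W, H = nmf_mu(X, n_components=4)
print("relative error:", np.linalg.norm(X - W @ H) / np.linalg.norm(X))
```

Because each update only multiplies by non-negative ratios, no explicit projection onto the non-negative orthant is needed.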
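11.
import matplotlib.pyplot as plt
from sklearn import datasets, decomposition, utils

# fetch_mldata was removed from scikit-learn; fetch_openml serves the same MNIST data
digits = datasets.fetch_openml('mnist_784', version=1, as_frame=False)
A = utils.shuffle(digits.data)
nmf = decomposition.NMF(n_components=20)
W = nmf.fit_transform(A)   # per-image weights of each component
H = nmf.components_        # the 20 learned parts, one per row

# Show each component as a 28x28 image
plt.rc("image", cmap="binary")
plt.figure(figsize=(8, 4))
for i in range(20):
    plt.subplot(4, 5, i + 1)  # a 4x5 grid holds all 20 components
    plt.imshow(H[i].reshape(28, 28))
    plt.xticks(())
    plt.yticks(())
plt.tight_layout()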
13. BEYOND MATRICES: HIGH DIMENSIONAL DATASETS
Cichocki et al. Nonnegative Matrix and Tensor Factorizations
Environmental analysis
▸ Measurement as a function of (Location, Time, Variable)
Sensory analysis
▸ Score as a function of (Wine sample, Judge, Attribute)
Process analysis
▸ Measurement as a function of (Batch, Variable, time)
Spectroscopy
▸ Intensity as a function of (Wavelength, Retention, Sample, Time, Location, …)
…
MULTIWAY DATA ANALYSIS
21. RANK-1 TENSOR
The outer product of N vectors results in a rank-1 tensor
array([[[ 1., 2.],
[ 2., 4.],
[ 3., 6.],
[ 4., 8.]],
[[ 2., 4.],
[ 4., 8.],
[ 6., 12.],
[ 8., 16.]],
[[ 3., 6.],
[ 6., 12.],
[ 9., 18.],
[ 12., 24.]]])
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3, 4])
c = np.array([1, 2])
# Rank-1 tensor: each entry is the product of one entry from each vector
T = np.zeros((a.shape[0], b.shape[0], c.shape[0]))
for i in range(a.shape[0]):
    for j in range(b.shape[0]):
        for k in range(c.shape[0]):
            T[i, j, k] = a[i] * b[j] * c[k]
$T = a^{(1)} \circ \cdots \circ a^{(N)}$
[Figure: T = a ∘ b ∘ c]
$T_{i,j,k} = a^{(1)}_i a^{(2)}_j a^{(3)}_k$
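The triple loop for the outer product can also be written without explicit loops. The sketch below builds the same rank-1 tensor with `np.einsum` and with broadcasting, and checks that its mode-1 unfolding has matrix rank 1. The variable names `T_einsum`, `T_broadcast` and `unfolded` are chosen for this example only.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3, 4])
c = np.array([1, 2])

# Outer product of three vectors: T[i, j, k] = a[i] * b[j] * c[k]
T_einsum = np.einsum('i,j,k->ijk', a, b, c)

# Same tensor through broadcasting
T_broadcast = a[:, None, None] * b[None, :, None] * c[None, None, :]

# Flattening the last two modes gives a matrix whose rows are all multiples
# of one another, so its matrix rank is 1
unfolded = T_einsum.reshape(a.shape[0], -1)
print(T_einsum.shape, np.linalg.matrix_rank(unfolded))
```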
22. TENSOR RANK
▸ Every tensor can be written as a sum of rank-1 tensors
[Figure: tensor = a1 ∘ b1 ∘ c1 + … + aJ ∘ bJ ∘ cJ]
▸ Tensor rank: smallest number of rank-1 tensors
that can generate it by summing up
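$X \approx \sum_{r=1}^{R} a^{(1)}_r \circ a^{(2)}_r \circ \cdots \circ a^{(N)}_r \equiv [\![ A^{(1)}, A^{(2)}, \cdots, A^{(N)} ]\!]$
$T \approx \sum_{r=1}^{R} a_r \circ b_r \circ c_r \equiv [\![ A, B, C ]\!]$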
23.
array([[[ 61., 82.],
[ 74., 100.],
[ 87., 118.],
[ 100., 136.]],
[[ 77., 104.],
[ 94., 128.],
[ 111., 152.],
[ 128., 176.]],
[[ 93., 126.],
[ 114., 156.],
[ 135., 186.],
[ 156., 216.]]])
import numpy as np

# Factor matrices: one column per rank-1 component
A = np.array([[1, 2, 3],
              [4, 5, 6]]).T
B = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]]).T
C = np.array([[1, 2],
              [3, 4]]).T
# Explicit sum over the rank-1 components
T = np.zeros((A.shape[0], B.shape[0], C.shape[0]))
for i in range(A.shape[0]):
    for j in range(B.shape[0]):
        for k in range(C.shape[0]):
            for r in range(A.shape[1]):
                T[i, j, k] += A[i, r] * B[j, r] * C[k, r]
# Equivalent one-liner
T = np.einsum('ir,jr,kr->ijk', A, B, C)
$\sum_{r=1}^{R} a_r \circ b_r \circ c_r \equiv [\![ A, B, C ]\!]$: Kruskal tensor
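The Kruskal form is not limited to three modes. As a minimal sketch, the hypothetical helper `kruskal_to_tensor` below (a name chosen for this example, not taken from the slides) sums R rank-1 tensors for any number of factor matrices, using `np.multiply.outer` to build each component.

```python
import numpy as np

def kruskal_to_tensor(factors):
    """Sum of R rank-1 tensors from N factor matrices of shape (dim_n, R)."""
    R = factors[0].shape[1]
    T = np.zeros(tuple(f.shape[0] for f in factors))
    for r in range(R):
        # Outer product of the r-th column of every factor matrix
        component = factors[0][:, r]
        for f in factors[1:]:
            component = np.multiply.outer(component, f[:, r])
        T += component
    return T

A = np.array([[1, 2, 3], [4, 5, 6]]).T
B = np.array([[1, 2, 3, 4], [5, 6, 7, 8]]).T
C = np.array([[1, 2], [3, 4]]).T
T = kruskal_to_tensor([A, B, C])
print(T.shape)
```

For three factors this agrees with `np.einsum('ir,jr,kr->ijk', A, B, C)`; the loop version simply avoids hard-coding the number of modes.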
24. TENSOR FACTORIZATION
▸ CANDECOMP/PARAFAC factorization (CP)
▸ extensions of SVD / PCA / NMF to tensors
NON-NEGATIVE TENSOR FACTORIZATION
▸ Decompose a non-negative tensor to
a sum of R non-negative rank-1 tensors
$\arg\min_{A,B,C} \| T - [\![ A, B, C ]\!] \|$
with $[\![ A, B, C ]\!] \equiv \sum_{r=1}^{R} a_r \circ b_r \circ c_r$
subject to $A \ge 0$, $B \ge 0$, $C \ge 0$
25. TENSOR FACTORIZATION: HOW TO
Alternating Least Squares (ALS):
Fix all factor matrices but one, and solve a least-squares problem for the remaining one
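$\min_{A \ge 0} \| T_{(1)} - A (C \odot B)^T \|$
$\min_{B \ge 0} \| T_{(2)} - B (C \odot A)^T \|$
$\min_{C \ge 0} \| T_{(3)} - C (B \odot A)^T \|$
$\odot$ denotes the Khatri-Rao product, which is a column-wise Kronecker product, i.e., $C \odot B = [c_1 \otimes b_1, c_2 \otimes b_2, \ldots, c_r \otimes b_r]$
Unfolded tensor on the kth mode:
$T_{(1)} = \hat{A} (\hat{C} \odot \hat{B})^T$
$T_{(2)} = \hat{B} (\hat{C} \odot \hat{A})^T$
$T_{(3)} = \hat{C} (\hat{B} \odot \hat{A})^T$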
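The unfolding identities can be checked numerically. The sketch below is an illustration under stated assumptions: `khatri_rao` and `unfold` are helpers written for this example, and `unfold` uses the convention in which the earliest remaining index varies fastest along the columns (Fortran order), which is the convention that makes the three identities hold as written.

```python
import numpy as np

def khatri_rao(X, Y):
    """Column-wise Kronecker product: column r is kron(X[:, r], Y[:, r])."""
    return np.einsum('ir,jr->ijr', X, Y).reshape(X.shape[0] * Y.shape[0], -1)

def unfold(T, mode):
    """Mode-`mode` unfolding; the earliest remaining index varies fastest."""
    return np.reshape(np.moveaxis(T, mode, 0), (T.shape[mode], -1), order='F')

A = np.array([[1, 2, 3], [4, 5, 6]]).T
B = np.array([[1, 2, 3, 4], [5, 6, 7, 8]]).T
C = np.array([[1, 2], [3, 4]]).T
T = np.einsum('ir,jr,kr->ijk', A, B, C)

# One check per mode: unfolded tensor == factor times transposed Khatri-Rao product
mode1_ok = np.allclose(unfold(T, 0), A @ khatri_rao(C, B).T)
mode2_ok = np.allclose(unfold(T, 1), B @ khatri_rao(C, A).T)
mode3_ok = np.allclose(unfold(T, 2), C @ khatri_rao(B, A).T)
print(mode1_ok, mode2_ok, mode3_ok)
```

With a different unfolding convention (for example C order), the order of the two factors inside the Khatri-Rao product has to be swapped accordingly.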
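26.
import numpy as np

# n, m, o are the tensor dimensions and r the number of components (defined elsewhere);
# T is assumed to be a scikit-tensor dtensor, which provides uttkrp
# Random non-negative start: a multiplicative update cannot leave an all-zero factor
F = [np.random.rand(n, r), np.random.rand(m, r), np.random.rand(o, r)]
# Gram matrix F[k]^T F[k] of each factor, kept up to date by the solver
FF_init = np.array([f.T.dot(f) for f in F])

def iter_solver(T, F, FF_init):
    # Update each factor
    for k in range(len(F)):
        # Compute the inner-product matrix (element-wise product of the other Gram matrices)
        FF = np.ones((r, r))
        for i in list(range(k)) + list(range(k + 1, len(F))):
            FF = FF * FF_init[i]
        # unfolded tensor times Khatri-Rao product
        XF = T.uttkrp(F, k)
        # multiplicative update, element-wise
        F[k] = F[k] * XF / (F[k].dot(FF))
        # F[k] = nnls(FF, XF.T).T
        FF_init[k] = F[k].T.dot(F[k])
    return F, FF_init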
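$\min_{A \ge 0} \| T_{(1)} - A (C \odot B)^T \|$
$\min_{B \ge 0} \| T_{(2)} - B (C \odot A)^T \|$
$\min_{C \ge 0} \| T_{(3)} - C (B \odot A)^T \|$
$\arg\min_{W,H} \| X - WH \|$ s.t. $W, H \ge 0$
J. Kim and H. Park. Fast Nonnegative Tensor Factorization with an Active-set-like Method.
In High-Performance Scientific Computing: Algorithms and Applications, Springer, 2012, pp. 311-326.
$W \leftarrow W \bullet \frac{X H^T}{W H H^T}$ (tensor analogue of $X H^T$: $T_{(1)} (C \odot B)$)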
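For readers without scikit-tensor, the same multiplicative scheme can be sketched with NumPy alone. This is an illustrative stand-in, not the code from the slide: `mttkrp` and `ntf_mu` are hypothetical names, the "unfolded tensor times Khatri-Rao product" step is computed with `np.einsum` instead of `uttkrp`, and the iteration count, seed and `eps` guard are assumptions.

```python
import numpy as np

def mttkrp(T, F, k):
    """Mode-k unfolded tensor times the Khatri-Rao product of the other factors."""
    specs = ['ijk,jr,kr->ir', 'ijk,ir,kr->jr', 'ijk,ir,jr->kr']
    others = [F[i] for i in range(3) if i != k]
    return np.einsum(specs[k], T, *others)

def ntf_mu(T, r, n_iter=200, eps=1e-12, seed=0):
    """Non-negative CP of a 3-way tensor by multiplicative updates (sketch)."""
    rng = np.random.default_rng(seed)
    F = [rng.random((dim, r)) for dim in T.shape]
    for _ in range(n_iter):
        for k in range(3):
            # Element-wise product of the Gram matrices of the other factors
            FF = np.ones((r, r))
            for i in range(3):
                if i != k:
                    FF *= F[i].T @ F[i]
            # Same form as W <- W * (X H^T) / (W H H^T) in the matrix case
            F[k] *= mttkrp(T, F, k) / (F[k] @ FF + eps)
    return F

def relative_error(T, F):
    approx = np.einsum('ir,jr,kr->ijk', *F)
    return np.linalg.norm(T - approx) / np.linalg.norm(T)

# Non-negative tensor with two components (the factor matrices from slide 23)
A = np.array([[1, 2, 3], [4, 5, 6]], dtype=float).T
B = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=float).T
C = np.array([[1, 2], [3, 4]], dtype=float).T
T = np.einsum('ir,jr,kr->ijk', A, B, C)

F = ntf_mu(T, r=2)
print("relative error:", relative_error(T, F))
```

The recovered factors need not equal A, B and C column for column: CP factors are only determined up to scaling and permutation of the components.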
27. HOW TO INTERPRET: USER X TERM X TIME
X is a 3-way tensor in which $x_{nmt}$ is 1 if term m was used by user n at interval t, and 0 otherwise
$A_{N \times K}$ is the association of each user n to a factor k
$B_{M \times K}$ is the association of each term m to a factor k
$C_{T \times K}$ shows the time activity of each factor
[Figure: X (N×M×T; users × terms × time) = factors A (N×K, users), B (M×K, terms), C (T×K, time)]
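As a small illustration of how such a tensor might be assembled, the sketch below fills a binary user × term × time array from a list of observations. The event log, the names and the index dictionaries are made up for this example; real data would come from whatever source records who used which term and when.

```python
import numpy as np

# Hypothetical toy log of (user, term, time interval) observations
events = [
    ("ann", "tensor", 0),
    ("ann", "python", 0),
    ("bob", "python", 1),
    ("ann", "tensor", 1),
    ("cat", "matlab", 2),
    ("bob", "python", 1),  # a repeated observation still yields a single 1
]

users = sorted({u for u, _, _ in events})
terms = sorted({m for _, m, _ in events})
n_intervals = max(t for _, _, t in events) + 1

user_idx = {u: n for n, u in enumerate(users)}
term_idx = {m: i for i, m in enumerate(terms)}

# x[n, m, t] = 1 if term m was used by user n at interval t, 0 otherwise
X = np.zeros((len(users), len(terms), n_intervals))
for u, m, t in events:
    X[user_idx[u], term_idx[m], t] = 1

print(X.shape)
```

Factorizing X with K factors would then give A with one row per entry of `users`, B with one row per entry of `terms`, and C with one row per time interval, so the index dictionaries are what map factor loadings back to names.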