4. Perceived Tempo
Metrical ambiguity:
listeners don’t agree on bpm
they typically fall into two camps
perceived values differ by a factor of 2 or 3
McKinney and Moelants:
experiments with 24-40 subjects
released their experimental data
5. Perceived Tempo
Metrical ambiguity:
[Figure: histograms of listener tap rates (listeners vs bpm); McKinney and Moelants, 2004]
6. Machine-Estimated Tempo
Also affected by metrical ambiguity:
makes estimation difficult
natural to see multiple bpm values
estimated values often out by a factor of 2 or 3
(“octave error”)
9. Crowd Sourcing
Music:
over 4000 songs
30-second clips
• rock, country, pop, soul, funk and R&B, jazz,
latin, reggae, disco, rap, punk, electronic,
trance, industrial, house, folk, ...
• recent releases back to the 60s
10. Response
First week after release:
4k tracks annotated by 2k listeners
20k labels and bpm estimates
To date:
6k tracks annotated by 27k listeners
200k labels and bpm estimates
11. Analysis: ambiguity
When people tap to a song at different bpm,
do they really disagree about whether it’s
slow or fast?
Investigation:
inspect labels from people who tap differently
quantify disagreement for ambiguous songs
12. Analysis: ambiguity
Subset of slow/fast songs:
labelled by at least five listeners
majority label “slow” or “fast”
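A minimal sketch of this filtering step in Python, assuming each song's labels arrive as a list of strings (the function and variable names are hypothetical):

    from collections import Counter

    def slow_fast_subset(labels_by_song, min_listeners=5):
        """Keep songs labelled by at least five listeners whose
        majority label is 'slow' or 'fast'."""
        subset = {}
        for song, labels in labels_by_song.items():
            if len(labels) < min_listeners:
                continue  # too few listeners to trust a majority
            top_label, _ = Counter(labels).most_common(1)[0]
            if top_label in ("slow", "fast"):
                subset[song] = top_label
        return subset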
16. Analysis: ambiguity
Quantify disagreement over labels:
model conflict, extremity of tempo
conflict coefficient:
C = \frac{\min(L_s, L_f)}{\max(L_s, L_f)} \cdot \frac{L_s + L_f}{L}
L_s, L_f, L: number of slow, fast, and all labels for a song
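A sketch of the coefficient in Python, reading the formula as written above (the worked values are just an illustration):

    def conflict_coefficient(n_slow, n_fast, n_total):
        """C is 0 when the slow/fast labels all agree, and grows
        towards 1 when they are evenly split and dominate all labels."""
        if n_slow == 0 and n_fast == 0:
            return 0.0
        ratio = min(n_slow, n_fast) / max(n_slow, n_fast)
        extremity = (n_slow + n_fast) / n_total
        return ratio * extremity

    print(conflict_coefficient(6, 4, 12))  # 0.555...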
18. Analysis: ambiguity
Subset of metrically ambiguous songs:
at least 30% of listeners tap at half/twice the
majority estimate
Compared to the rest:
no significant difference in C
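One reading of the 30% rule as a Python sketch; using the median tap as the majority estimate and a 4% tolerance are assumptions, not details from the slides:

    from statistics import median

    def is_metrically_ambiguous(taps, tol=0.04):
        """taps: per-listener bpm estimates for one song.
        True when at least 30% of listeners tap at (roughly)
        half or twice the majority estimate."""
        majority = median(taps)

        def near(x, target):
            return abs(x - target) <= tol * target

        octave_taps = [t for t in taps
                       if near(t, majority / 2) or near(t, majority * 2)]
        return len(octave_taps) >= 0.3 * len(taps)

    print(is_metrically_ambiguous([120, 120, 118, 60, 61, 122, 60]))  # True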
19. Evaluation metrics
MIREX metrics:
designed to capture metrical ambiguity
and replicate human disagreement
Ambiguity considered unhelpful for:
automatic playlisting
DJ tools, production tools
jogging
20. Evaluation metrics
Application-oriented:
compare with majority* human estimate
(*median in most popular bin)
categorise machine estimates (see the sketch after this list)
same as humans
twice as fast
twice as slow
three times as fast
and so on
unrelated to humans
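A sketch of this categorisation, assuming a fixed tolerance band (4% here, a hypothetical choice) for calling two tempi equal up to a metrical factor:

    def categorise(machine_bpm, human_bpm, tol=0.04):
        """Relate a machine estimate to the majority human estimate."""
        factors = [(1, "same"), (2, "x2"), (0.5, "/2"),
                   (3, "x3"), (1 / 3, "/3")]
        for factor, label in factors:
            target = factor * human_bpm
            if abs(machine_bpm - target) <= tol * target:
                return label
        return "unrelated"

    print(categorise(240, 120))  # 'x2'
    print(categorise(97, 100))   # 'same'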
22. Analysis: machine vs human
[Bar chart: share of machine estimates per category (x2, same, /2, unrelated, other) for BPM List, VAMP, and EchoNest; y-axis 0-80%]
23. Analysis: controlled test
Controlled comparison:
exploit experience from website A/B testing
use this to improve the algorithm iteratively
The result is independent of any quality metric
24. Analysis: controlled test
When a visitor arrives at the page:
choose a source S at random
choose a bpm value at random
choose two songs given that value by S
display them together
Then ask which sounds faster!
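A sketch of the pairing step under this protocol; the data layout and all names are hypothetical:

    import random

    # toy data: source name -> {song id: bpm estimate}
    estimates = {
        "BPM List": {"song_a": 120, "song_b": 120, "song_c": 90},
        "VAMP":     {"song_a": 118, "song_b": 118, "song_c": 180},
    }

    def pick_pair(estimates):
        while True:
            source = random.choice(sorted(estimates))           # a source S at random
            by_song = estimates[source]
            bpm = random.choice(sorted(set(by_song.values())))  # a bpm value at random
            songs = [s for s, b in by_song.items() if b == bpm]
            if len(songs) >= 2:                                 # need two songs S gave that bpm
                a, b = random.sample(songs, 2)
                return source, bpm, a, b  # display together; ask which sounds faster

    print(pick_pair(estimates))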
25. Analysis: controlled test
Null Hypothesis:
there will be presentation effects
listeners will attend to subtle differences
but
these effects are independent of the source
of bpm estimates
if the quality of the sources is the same
26. Analysis: controlled test
[Bar chart: share of "same" vs "different" responses for BPM List, VAMP, and EchoNest; y-axis 0-100%]
27. Analysis: improving estimates
Adjust bpm based on class:
imagine an accurate slow/fast classifier
(Hockman and Fujinaga, 2010)
adjust as follows:
bpm := bpm/2 if slow and bpm > 100
bpm := bpm*2 if fast and bpm < 100
otherwise don’t adjust
simulation: use the majority human label as the classifier output
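The adjustment rule above as runnable Python; the 100 bpm threshold and the slow/fast labels come from the slide, the function name is ours:

    def adjust_bpm(bpm, label):
        """Fold octave errors back using a slow/fast class label."""
        if label == "slow" and bpm > 100:
            return bpm / 2
        if label == "fast" and bpm < 100:
            return bpm * 2
        return bpm

    print(adjust_bpm(160, "slow"))  # 80.0
    print(adjust_bpm(70, "fast"))   # 140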
28. Analysis: adjusted vs human
[Bar chart: share of adjusted machine estimates per category (x2, same, /2, unrelated, other) for BPM List, VAMP, and EchoNest; y-axis 0-80%]
29. Conclusions
Crowd sourcing:
gather thousands of data points in a few
days, half a million over time
humans agree on slow/fast labels, even
when they tap at different bpm
Improving machine estimates:
use controlled testing
exploit a slow/fast classifier
30. Thanks!
mark@last.fm @gamboviol
http://mir-in-action.blogspot.com
http://playground.last.fm/demo/speedo
http://users.last.fm/~mark/speedo.tgz
We are looking for interns/research fellows!