Enabling open and reproducible research
at computer systems' conferences
the good, the bad and the ugly
CNRS webinar
Grenoble
March 2017
Grigori Fursin
Chief Scientist, cTuning foundation, France
CTO, dividiti, UK
fursin.net / research
cTuning.org / ae
Grigori Fursin “Enabling open and reproducible research at computer systems' conferences ( cKnowledge.org )” 22
Seminar outline
• What is computer systems research?
• Major problems in computer systems’ research in the past 15 years
• Artifact Evaluation Initiative
• Good: active community participation with ACM/industrial support
• Bad: software and hardware chaos; highly stochastic behaviour
• Ugly: ad-hoc, non-portable scripts difficult to customize and reuse
• Improving Artifact Evaluation
• Preparing common replication/reproducibility methodology
(new ACM taskforce on reproducibility)
• Introducing community-driven artifact and paper reviewing
• Introducing common workflow framework (Collective Knowledge)
• Introducing simple JSON API and meta for artifacts
• Introducing portable and customizable package manager
• Demonstrating open and reproducible computer systems’ research
• Collaboratively optimizing deep learning across diverse datasets/SW/HW
• Conclusions and call for action!
Back to basics: what is computer systems’ research?
[Diagram: computer system → result]
Users of computer systems (researchers, engineers,
entrepreneurs) want to quickly prototype their algorithms
or develop efficient, reliable and cheap products
[Diagram: what stands between an algorithm and its result: program, compilers, binary and libraries, run-time environment, hardware and simulators, storage, data set, state of the system]
Computer systems’ researchers, software developers and hardware
designers help end-users:
• improve all characteristics of their tasks
(execution time, power consumption,
numerical instability, size, faults, price …)
• guarantee real-time constraints
(bandwidth, QoS, etc)
Two major problems: rising complexity and physical limitations
Finding an efficient, reliable and cheap solution
for end-user tasks is far from trivial!
Thousands of benchmarks and real applications
MPI, OpenMP, TBB, CUDA, OpenCL, StarPU, OmpSs
C,C++,Fortran,Java,Python,assembler
LLVM,GCC,ICC,PGI (hundreds of optimizations)
BLAS,MAGMA,ViennaCL,cuBLAS,clBLAST,cuDNN,
openBLAS, clBLAS
TensorFlow, Caffe, Torch, TensorRT
Infinite number of possible data sets
Linux, Windows, Android, MacOS
heterogeneous, many-core, out-of-order, cache
x86, ARM, PTX, NN, extensions
Numerous architecture and platform simulators
Too many design and optimization choices!
What are you paying for?
Users expect new platforms to be faster, more energy efficient,
more accurate and more reliable. Is it true?
Mobile devices, sensors, IoT:
• Smart and powerful mobile devices are everywhere (mobile phones, IoT)
• Many attempts to run DNN, HOG, SLAM algorithms
• Various SW/HW optimizations may result in 7x speedups and 5x energy savings, but poor accuracy
• 2x speedups without sacrificing accuracy are enough to enable RT processing
HPC platforms:
• Weather prediction; physics; medicine; finances
• Supercomputers and data centers cost millions of $ to build, install and use
• Unchanged algorithms may run at only a fraction of peak performance, thus wasting expensive resources and energy!
• Some years later you may get more efficient algorithms, just before new systems arrive!
New systems require further tedious, ad-hoc and error-prone optimization.
This slows down innovation and development of new products.
How can we solve these problems?
What did I learn from other sciences that deal with complex systems:
physics, mathematics, chemistry, biology, AI …?
Major breakthroughs came from collaborative and reproducible R&D based on
sharing, validation, systematization
and reuse of artifacts and knowledge!
statistical analysis,
data mining,
machine learning
My original background is NOT in computer engineering
but in physics, electronics and machine learning
(my first research project, in 1994, was to develop an artificial neural network chip)
cTuning.org (2008-cur) – collaborative, machine learning-based optimization
cTuning1 framework and public portal to
1) share realistic benchmarks and data sets
2) share whole experimental setups
(benchmarking and autotuning)
3) crowdsource empirical optimization
4) collect results in a centralized repository
5) apply machine learning to predict
optimizations
6) involve the community to improve models
• G. Fursin et al. MILEPOST GCC: Machine learning based self-tuning compiler. 2008, 2011
• G. Fursin. Collective Tuning Initiative: automating and accelerating development and
optimization of computing systems, 2009
cTuning.org/ae (2014-cur.) – artifact evaluation at CGO,PPoPP,PACT,SC
“Artifact Evaluation for Publications” (Dagstuhl Perspectives Workshop 15452), 2016,
Bruce R. Childers, Grigori Fursin, Shriram Krishnamurthi, Andreas Zeller,
http://drops.dagstuhl.de/opus/volltexte/2016/5762
Numerous publications
My personal AE goals
• validate experimental results from
published articles and restore trust
(see ACM TRUST’14 @ PLDI workshop:
cTuning.org/event/acm-trust2014)
• promote artifact sharing (benchmarks,
data sets, tools, models)
• enable fair comparison of results and
techniques
• develop common methodology for
reproducible computer systems’ research
• build upon others’ research
How does Artifact Evaluation (AE) work?
Authors of accepted articles have the
option to submit related material to
an AE committee for evaluation:
http://cTuning.org/ae/submission.html
• Abstract
• Packed artifact (or remote access)
• Artifact Appendix (with a version to
keep track of a methodology)
Formalized reviewing process:
http://ctuning.org/ae/reviewing.html
Multiple criteria for artifact evaluation
Artifact ranking:
1. Significantly exceeded expectations
2. Exceeded expectations
3. Met expectations
4. Fell below expectations
5. Significantly fell below expectations
PC members nominate one or two
senior PhD students/engineers
for the AE committee (pool of reviewers!)
A paper whose artifacts
pass evaluation receives
an AE badge
Artifact evaluation timeline
paper accepted → artifacts submitted → evaluator bidding → artifacts assigned → evaluations available → evaluations finalized → artifact decision
• 7..12 days to prepare artifacts according to guidelines: cTuning.org/submission.html
• 2..4 days for evaluators to bid on artifacts (according to their knowledge and access to required SW/HW)
• 2 days to assign artifacts: ensure at least 3 reviews per artifact, reduce risks, avoid mix-ups, minimize conflicts of interest
• 2 weeks to review artifacts according to guidelines: cTuning.org/reviewing.html
• 3..4 days for authors to respond to reviews and fix problems
• 2..3 days to finalize reviews
• 2..3 days to add AE stamp and AE appendix to a camera-ready paper
Light communication between authors and reviewers is allowed via AE chairs
(to preserve anonymity of the reviewers)
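The phase durations in the timeline can be added up to see how long one AE round takes end to end. This is only a minimal sketch: the phase labels are shortened paraphrases, the phases are assumed to run strictly one after another, and "2 weeks" is taken as 14 days.

```python
# Phase durations in days as (label, minimum, maximum), copied from the timeline.
# Assumption: phases do not overlap and "2 weeks" means 14 days.
PHASES = [
    ("prepare artifacts", 7, 12),
    ("evaluator bidding", 2, 4),
    ("assign artifacts", 2, 2),
    ("review artifacts", 14, 14),
    ("author response", 3, 4),
    ("finalize reviews", 2, 3),
    ("add AE stamp and appendix", 2, 3),
]


def total_days(phases):
    """Return the (shortest, longest) end-to-end duration in days."""
    shortest = sum(low for _, low, _ in phases)
    longest = sum(high for _, _, high in phases)
    return shortest, longest


if __name__ == "__main__":
    shortest, longest = total_days(PHASES)
    print(f"One AE round takes between {shortest} and {longest} days")
```

Under these assumptions a full round spans 32 to 42 days, which is why the schedule leaves little slack for rebuttals.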
Artifact Evaluation: good
Year   PPoPP   CGO   PACT   Total   Problems   Rejected
2015   10      8     -      18      7          2
2016   12      11    -      23      4          0
2016   -       -     5      5       2          0
2017   14      13    -      27      7          0
• Strong support from academia, industry and ACM
• Active participation in AE discussion sessions
• Lots of feedback
• Many interesting artifacts!
NOTE: we consider AE a cooperative process and try to help authors fix artifacts
and pass evaluation (particularly if artifacts will be open-sourced)
We use this practical experience (> 70 papers in the past 3 years!)
to continuously improve common experimental methodology
for reproducible computer systems’ research
ACM taskforce on reproducibility
In 2016 ACM organized a special taskforce (former AE chairs) to develop
a common methodology for artifact sharing and evaluation across all SIGs!
We produced the “Result and Artifact Review and Badging” policy:
http://www.acm.org/publications/policies/artifact-review-badging
1) Define terminology
• Repeatability (same team, same experimental setup)
• Replicability (different team, same experimental setup)
• Reproducibility (different team, different experimental setup)
2) Prepare new sets of badges (covering various SIGs)
• Artifacts Evaluated – Functional
• Artifacts Evaluated – Reusable
• Artifacts Available
• Results Replicated
• Results Reproduced
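The three terms differ only in who repeats the experiment and on which setup, so they can be written as a small lookup. A sketch for illustration only; the function name `acm_term` is made up here, and the case "same team, different setup" is left undefined because the slide does not name it.

```python
def acm_term(same_team, same_setup):
    """Map (team, setup) to the term used on the slide, or None if not covered."""
    if same_team and same_setup:
        return "Repeatability"
    if not same_team and same_setup:
        return "Replicability"
    if not same_team and not same_setup:
        return "Reproducibility"
    # Same team on a different setup is not one of the three defined cases.
    return None


if __name__ == "__main__":
    for team, setup in [(True, True), (False, True), (False, False)]:
        print(team, setup, acm_term(team, setup))
```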
We introduced Artifact Appendices to describe experiments
Two years ago we introduced Artifact Appendix templates to unify
Artifact submissions and let authors add up to two pages of such
appendices to their camera ready paper:
http://cTuning.org/ae/submission.html
http://cTuning.org/ae/submission_extra.html
The idea is to help readers better understand what was evaluated
and let them reproduce published research and build upon it.
We did not receive complaints about our appendices and many
researchers decided to add them to their camera ready papers
(see http://cTuning.org/ae/artifacts.html).
Similar AE appendices are now used by other conferences (SC, RTSS):
http://sc17.supercomputing.org/submitters/technical-papers/reproducibility-initiatives-for-technical-papers/artifact-description-paper-title
We are now trying to unify Artifact Appendices across all SIGs
Artifact Evaluation: bad
• too many artifacts to evaluate – need to somehow scale AE while keeping the
quality (41 evaluators, ~120 reviews to handle during 2.5 weeks)
• difficult to find evaluators with appropriate skills and access
to proprietary SW and rare HW
• very intense schedule and not enough time for rebuttals
• communication between authors and reviewers via AE chairs is a bottleneck
[Timeline: paper accepted → artifacts submitted → evaluator bidding → artifacts assigned → evaluations available → evaluations finalized → artifact decision; phases of 7..12 days, 2..4 days, 2 days, 2 weeks, 3..4 days, 2..3 days and 2..3 days]
New option of open artifact evaluation!
Introduce two evaluation options: private and public
a) traditional evaluation for private artifacts (for example, from industry, though less and less common)
paper accepted → artifacts submitted → evaluator bidding → artifacts assigned → evaluations available → evaluations finalized → artifact decision
7..12 days, 2..4 days, 2 days, 2 weeks, 3..4 days, 2..3 days, 2..3 days
b) open evaluation of public and open-source artifacts (if already available at
GitHub, BitBucket, GitLab with “discussion mechanisms” during submission…)
paper accepted → artifacts submitted → AE chairs announce public artifacts at XSEDE/GRID5000/etc → AE chairs monitor open discussions until artifacts are evaluated → artifact decision
any time, 1..2 days, from a few days to 2 weeks, 3..4 days, 2..3 days
Trying open artifact and paper evaluation
At CGO/PPoPP’17, we sent requests to validate several open-source artifacts to the public
mailing lists of the conferences, networks of excellence, supercomputer centers, etc.
We found evaluators who were willing to help and had access to rare hardware or supercomputers
as well as the required software and proprietary benchmarks
Authors quickly fixed issues and answered research questions while AE chairs steered the discussion!
GRID5000 users participated in open evaluation of a PPoPP’17 artifact:
https://github.com/thu-pacman/self-checkpoint/issues/1
See other public evaluation examples:
cTuning.org/ae/artifacts.html
We validated open reviewing of publications via Reddit at ADAPT’16:
http://adapt-workshop.org
Artifact Evaluation: what can possibly be ugly ;) ?
Ugly: no common experimental methodology and SW/HW chaos
• difficult (sometimes impossible) to reproduce
empirical results across ever changing
software and hardware stack
(highly stochastic behavior)
• everyone uses their own ad-hoc scripts to
prepare and run experiments with many
hardwired paths
• practically impossible to customize and reuse
artifacts (for example, try another compiler,
library, data set)
• practically impossible to run on another OS
or platform
• no common API and meta information for
shared artifacts and results
(benchmarks, data sets, tools)
[Figure: a cloud of tool, version and parameter names between algorithm and result, for example GCC 4.1.x to 5.2, ICC 10.1 to 12.1, LLVM 2.6 to 3.7, Open64, XLC, OpenMP, MPI, OpenCL, CUDA 4.x and 7.x, TBB, MKL, ATLAS, gprof, perf, oprofile, PAPI, TAU, Scalasca, VTune Amplifier, SimpleScalar, ARM v8, Intel SandyBridge, SSE4, AVX, hardware counters, polyhedral transformations, LTO, pass reordering, frequency, bandwidth, memory size, execution time, algorithm precision]
Awarding distinguished artifacts: not enough!
The cTuning foundation and
NVIDIA are pleased to present the
Joint CGO/PPoPP Distinguished Artifact Award
for
“Demystifying GPU Microarchitecture
to Tune SGEMM Performance”
Xiuxia Zhang1, Guangming Tan1, Shuangbai Xue1, Jiajia Li2, Mingyu Chen1
1 Chinese Academy of Sciences 2 Georgia Institute of Technology
February 2017
Using Virtual Machines and Docker
VM or Docker images are good at hiding complexity
but they do not solve major problems
in computer systems’ research
Docker is useful to archive artifacts
while hiding all underlying complexity!
However Docker does not address many other
issues vital for open and collaborative
computer systems’ research, i.e. how to
1) work with a native user SW/HW environment
2) customize and reuse artifacts and workflows
3) capture run-time state
4) deal with hardware dependencies
5) deal with proprietary benchmarks and tools
6) automate validation of experiments
Images are also often too large in size!
Need portable and customizable
workflow framework suitable for
computer systems’ research!
Some attempts to clean up this mess in computer systems’ R&D
Better package managers
apt; yum; spack; pip; homebrew …
Smart build tools
cmake; easybuild …
Multi-versioning
spack; easybuild; virtualenv …
Third-party resources
cKnowledge.org/reproducibility
Numerous online projects
But we want to systematize our local artifacts
without being locked into third-party
web services or having to learn a complex GUI
Missing: portable and customizable workflow
framework suitable for computer systems’ research!
Open-source Collective Knowledge Framework (2015-cur.)
cKnowledge.org ; github.com/ctuning/ck
Eventually we had no choice but to use all
our experience and develop a portable and
customizable workflow framework
for computer systems’ research:
1) organize your artifacts (programs, data sets,
tools, scripts) as customizable and reusable
components with JSON API and meta data
2) assemble portable and customizable
workflows with JSON API from shared artifacts
like LEGO™
3) integrate a portable package and environment
manager which can detect multiple versions
of required software or install missing software across
Linux, MacOS, Windows and Android
4) integrate a web server to show interactive
reports (locally!) or exchange data when
crowdsourcing experiments
Noticed during AE: all projects have similar structure
Create project directory
• Programs (image corner detection, matmul OpenCL, compression, neural network CUDA):
ad-hoc scripts to compile and run a program…
Have some common meta: which datasets can be used, how to compile, CMD, …
• Data sets (image-jpeg-0001, bzip2-0006, txt-0012, video-raw-1280x1024):
ad-hoc dirs for data sets with some ad-hoc scripts to find them, extract features, etc
Have some (common) meta: filename, size, width, height, colors, …
• Compilers (GCC V5.2, LLVM 3.6, Intel Compilers 2015):
ad-hoc scripts to set up environment for a given and possibly proprietary compiler
Have some common meta: compilation, linking and optimization flags
• Experiments (csv speedups, txt hardware counters, xls table with graph):
ad-hoc dirs and scripts to record and analyze experiments
Have some common meta: features, characteristics, optimizations
Collective Knowledge: organize and share artifacts as reusable components
Each artifact becomes an entry with a data UID and alias plus meta.json, managed by a Python module:
• Python module “program” with functions: compile and run
(image corner detection, matmul OpenCL, compression, neural network CUDA)
• Python module “soft” with function: setup
(GCC V5.2, LLVM 3.6, Intel Compilers 2015)
• Python module “dataset” with function: extract_features
(image-jpeg-0001, bzip2-0006, txt-0012, video-raw-1280x1024)
• Python module “experiment” with functions: add, get, analyze
(csv speedups, txt hardware counters, xls table with graph)
Collective Knowledge: organize and share your artifacts as reusable components
Every function of the “program”, “soft”, “dataset” and “experiment” modules takes JSON input and returns JSON output
CK: small python module (~200Kb); no extra dependencies; Linux; Win; MacOS
ck <function> <module>:<data UID> @input.json
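The command form above maps naturally onto a single JSON-like request. The sketch below shows one way such a command line could be turned into a dictionary; it is not CK's own parser. The key names `action`, `module_uoa` and `data_uoa`, and the rule that a bare flag becomes `"yes"`, are assumptions made for this example.

```python
import json


def parse_command(argv):
    """Turn ['compile', 'program:susan', '--speed', '@input.json'] into one dict.

    Hypothetical helper: key names are illustrative, not taken from CK.
    """
    if len(argv) < 2:
        raise ValueError("expected: <function> <module>[:<data UID>] [options]")
    request = {"action": argv[0]}
    module, _, data = argv[1].partition(":")
    request["module_uoa"] = module
    if data:
        request["data_uoa"] = data
    for arg in argv[2:]:
        if arg.startswith("@"):
            # '@file.json' merges the JSON file content into the request.
            with open(arg[1:]) as handle:
                request.update(json.load(handle))
        elif arg.startswith("--"):
            key, separator, value = arg[2:].partition("=")
            # Assumption: a flag without a value is recorded as "yes".
            request[key] = value if separator else "yes"
        else:
            raise ValueError(f"unexpected argument: {arg}")
    return request


if __name__ == "__main__":
    print(parse_command(["compile", "program:cbench-automotive-susan", "--speed"]))
```

Keeping the whole request in one dictionary is what lets the same function be called from the command line, from Python, or through a web service without changes.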
Connect into workflows as LEGO™
Local repository structure (with internal UID and meta)!
CK local project repo:
/ module UID and alias / data UID or alias / .cm / meta.json
• program / image corner detection, matmul OpenCL, compression, neural network CUDA, …
• soft / gcc 5.2, llvm 3.6, icc 2015, …
• dataset / image-jpeg-0001, bzip2-0006, video-raw-1280x1024, …
• experiment / …
• module / program, soft, dataset, …
Both code (with API) and data (with meta) inside repository
Can be referenced and cross-linked via CID (similar to DOI):
module UOA : data UOA
• Local files: no lock-in to third-party web services
• No complex GUI
• Can be shared via GIT/SVN/etc
• Can be easily indexed via ElasticSearch to speed up search
• Can be easily connected to Python-based predictive analytics (scikit-learn)
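Because every entry is a plain directory with a `.cm/meta.json` file, a repository can be listed with a few lines of standard-library Python. The sketch below assumes exactly the layout shown on this slide (module directory, then data directory, then `.cm/meta.json`); the helper names and the demo entries are invented for illustration and are not part of CK.

```python
import json
import tempfile
from pathlib import Path


def list_entries(repo_root):
    """Return [("module:data", meta), ...] for each <module>/<data>/.cm/meta.json."""
    entries = []
    for meta_path in sorted(Path(repo_root).glob("*/*/.cm/meta.json")):
        data_dir = meta_path.parent.parent
        meta = json.loads(meta_path.read_text())
        entries.append((f"{data_dir.parent.name}:{data_dir.name}", meta))
    return entries


def demo():
    """Build a throwaway repository with two made-up entries and list them."""
    with tempfile.TemporaryDirectory() as root:
        for module, data, meta in [
            ("program", "matmul-opencl", {"tags": ["opencl"]}),
            ("dataset", "image-jpeg-0001", {"width": 640}),
        ]:
            cm_dir = Path(root) / module / data / ".cm"
            cm_dir.mkdir(parents=True)
            (cm_dir / "meta.json").write_text(json.dumps(meta))
        return list_entries(root)


if __name__ == "__main__":
    for cid, meta in demo():
        print(cid, meta)
```

The `module:data` string returned here mirrors the "module UOA : data UOA" cross-reference form mentioned above.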
Provide customizable and SW/HW independent access to tools
Ad-hoc experimental workflows: hardwired experiment scripts / cmake around “black box” tools
(CLBlast, OpenBLAS, cuBLAS, LLVM 3.8, GCC 7.0, CUDA 8.0, Caffe CPU, Caffe OpenCL, Caffe CUDA),
a “big data” database, and ad-hoc scripts to process CSV, XLS, TXT, etc.
Implement SW/HW independent, reusable
and shareable CK modules with unified CMD
and JSON API for various experimental
scenarios (such as program module)
CK program entry (ck list program): source files and auxiliary scripts
.cm/meta.json describes soft dependencies,
data sets, and how to compile and run this program
Use program entries to describe a given program
in meta.json and store sources (if needed)
CK program module:
JSON API (input) → read program meta → detect all software dependencies (ask user if multiple versions exist) → prepare environment → compile program → run program → JSON API (output)
$ ck pull repo:ck-autotuning
$ ck compile program:cbench-automotive-susan --speed
$ ck run program:cbench-automotive-susan
$ ck run program:caffe
Portable and customizable package manager in the CK
Soft entries in CK describe how
to detect if a given software is
already installed, how to set up
all its environment including
all paths (to binaries, libraries,
include, aux tools, etc),
and how to detect its version.
$ ck detect soft --tags=compiler,cuda
$ ck detect soft:compiler.gcc
$ ck detect soft:compiler.llvm --target_os=android19-arm
$ ck list soft:compiler*
$ ck detect soft:lib.cublas
$ ck search soft --tags=blas
Env entries are created in CK local
repo for all found software
instances together with their meta
and an auto-generated environment
script env.sh (on Linux) or env.bat
(on Windows).
local / env / 03ca0be16962f471 / env.sh
Tags: compiler,cuda,v8.0
local / env / 0a5ba198d48e3af3 / env.bat
Tags: lib,blas,cublas,v8.0
$ ck show env
$ ck show env --tags=cublas
$ ck rm env:* --tags=cublas
Package entries describe how to
install a given software if it is not
installed (using install.sh script on
Linux host or install.bat on
Windows host).
$ ck install package:caffemodel-bvlc-googlenet
$ ck install package:imagenet-2012-val
$ ck install package:lib-caffe-bvlc-master-cuda
$ ck list package:*caffemodel*
$ ck search package --tags=caffe
• Works on Linux, Windows, MacOS, Android
• Detects multiple versions of various software
• Builds missing software
• Extended by the community
github.com/ctuning/ck-env
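The detect-then-generate-environment idea can be illustrated without CK itself. The sketch below scans a list of directories for executables whose name starts with a given prefix and renders a small `env.sh`-style script for one of them. It assumes a POSIX system; the function names, the `DEMO_CC` variable and the fake `gcc-5`/`gcc-7` files are all made up for this example and do not reflect how CK soft entries are implemented.

```python
import os
import tempfile


def find_instances(prefix, search_dirs):
    """Return sorted paths of executable files whose name starts with prefix."""
    found = []
    for directory in search_dirs:
        if not os.path.isdir(directory):
            continue
        for name in os.listdir(directory):
            path = os.path.join(directory, name)
            if (name.startswith(prefix) and os.path.isfile(path)
                    and os.access(path, os.X_OK)):
                found.append(path)
    return sorted(found)


def env_script(var_name, path):
    """Render a shell snippet exposing one found instance (hypothetical format)."""
    bin_dir = os.path.dirname(path)
    return f'export {var_name}="{path}"\nexport PATH="{bin_dir}:$PATH"\n'


def demo():
    """Create two fake executables and one plain file, then detect the executables."""
    with tempfile.TemporaryDirectory() as root:
        for name, mode in [("gcc-5", 0o755), ("gcc-7", 0o755), ("gcc-notes.txt", 0o644)]:
            path = os.path.join(root, name)
            with open(path, "w") as handle:
                handle.write("#!/bin/sh\n")
            os.chmod(path, mode)
        return [os.path.basename(p) for p in find_instances("gcc", [root])]


if __name__ == "__main__":
    print(demo())
    print(env_script("DEMO_CC", "/opt/gcc/bin/gcc-7"))
```

When several instances are found, a tool has to pick one or ask the user, which is the "multiple versions" case the slide refers to.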
Customizable and reusable workflows with JSON API as LEGO™
Chaining CK modules to an experimental pipeline via JSON API to quickly prototype research ideas
Public modular auto-tuning and machine
learning repository and buildbot
Unified
web services Interdisciplinary crowd
Pipeline: choose exploration strategy → select choices (program, data set, compiler, flags, opts, hardware design …) → compile program or kernel → run program or kernel → perform statistical analysis → apply Pareto filter → model and predict choices and behavior → reduce complexity
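The Pareto filter step of the pipeline keeps only those experimental results that are not beaten on every objective by some other result. A minimal sketch, assuming that all objectives are to be minimized (for example execution time and code size); the sample numbers are invented for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # (execution time in seconds, code size in bytes); made-up sample data.
    results = [(10.3, 131938), (10.1, 140000), (13.3, 120000), (13.5, 150000)]
    print(pareto_front(results))
```

This quadratic scan is fine for a few thousand points; larger explorations would need a sorted or incremental variant.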
Shared scenarios from past research
…
Expose and unify information needed for performance
analysis and optimization combined with data mining!
Optimizing/modeling behavior b
of any object in the CK (program, library function, kernel, …)
as a function of design and optimization choices c, features f
and run-time state s
b = B( c , f , s )
… … … …
Flattened JSON vectors
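To feed nested JSON into statistical or machine learning tools, it first has to be turned into flat key/value vectors. The sketch below produces keys in the style of `##characteristics#execution_times@1`: each dictionary level adds `#key` and each list position adds `@index`. The exact rules (root prefix `#`, zero-based indices) are inferred from that single example key and should be treated as assumptions rather than CK's specification.

```python
def flatten(value, prefix="#"):
    """Flatten nested dicts and lists into {flat_key: scalar}.

    Assumed key scheme: '#key' per dict level, '@index' per list item,
    starting from the root prefix '#'.
    """
    flat = {}
    if isinstance(value, dict):
        for key, item in value.items():
            flat.update(flatten(item, f"{prefix}#{key}"))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            flat.update(flatten(item, f"{prefix}@{index}"))
    else:
        flat[prefix] = value
    return flat


if __name__ == "__main__":
    sample = {
        "characteristics": {"execution_times": ["10.3", "10.1", "13.3"]},
        "state": {"frequency": "2.27"},
    }
    for key, item in flatten(sample).items():
        print(key, "=", item)
```

With flat keys, the choices c, features f and state s of each experiment become columns of a table, which is the form most modeling libraries expect for fitting b = B(c, f, s).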
Gradually add JSON specification (depends on research scenario)
Autotuning and machine learning specification:
{
  "characteristics":{
    "execution times": ["10.3","10.1","13.3"],
    "code size": "131938", ...},
  "choices":{
    "os":"linux", "os version":"2.6.32-5-amd64",
    "compiler":"gcc", "compiler version":"4.6.3",
    "compiler_flags":"-O3 -fno-if-conversion",
    "platform":{"processor":"intel xeon e5520",
                "l2":"8192", ...}, ...},
  "features":{
    "semantic features": {"number_of_bb": "24", ...},
    "hardware counters": {"cpi": "1.4", ...}, ...},
  "state":{
    "frequency":"2.27", ...}
}
CK flattened JSON key
##characteristics#execution_times@1
"flattened_json_key":{
  "type": "text" | "integer" | "float" | "dict" | "list" | "uid",
  "characteristic": "yes" | "no",
  "feature": "yes" | "no",
  "state": "yes" | "no",
  "has_choice": "yes" | "no",
  "choices": [ list of strings if categorical choice ],
  "explore_start": "start number if numerical range",
  "explore_stop": "stop number if numerical range",
  "explore_step": "step if numerical range",
  "can_be_omitted": "yes" | "no"
  ...
}
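The flattened key format can be sketched in a few lines. The rule below is inferred only from the single example key on this slide (`##characteristics#execution_times@1`): dictionary keys are joined with `#`, list indices with `@`, and the result starts with an extra `#`. The function is a hypothetical helper written for illustration, not CK's own implementation, and the sample uses `execution_times` with an underscore to match that key.

```python
def flatten(obj, prefix="#"):
    """Flatten nested dicts and lists into a single-level dict of string keys."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, prefix + "#" + key))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            flat.update(flatten(value, prefix + "@" + str(index)))
    else:
        # Leaf value: the accumulated prefix is the flattened key
        flat[prefix] = obj
    return flat


experiment = {
    "characteristics": {
        "execution_times": ["10.3", "10.1", "13.3"],
        "code_size": "131938",
    },
    "state": {"frequency": "2.27"},
}

flat = flatten(experiment)
print(flat["##characteristics#execution_times@1"])  # prints 10.1
```

Each flattened key addresses exactly one scalar, which is what makes it possible to attach a description such as "characteristic", "feature" or "has_choice" to every individual value.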
• “Collective Mind: Towards practical and collaborative autotuning”,
Journal of Scientific Programming 22 (4), 2014
http://hal.inria.fr/hal-01054763
• “Collective Mind, Part II: Towards Performance- and Cost-Aware
Software Engineering as a Natural Science”, CPC 2015, London, UK
http://arxiv.org/abs/1506.06256
• Android application to crowdsource autotuning across mobile devices:
https://play.google.com/store/apps/details?id=openscience.crowdsource.experiments
• “Collective Knowledge: towards R&D sustainability”, DATE 2016, Dresden, Germany
http://bit.ly/ck-date16
• “Optimizing convolutional neural networks on embedded platforms with OpenCL”,
IWOCL 2016, Amsterdam
http://dl.acm.org/citation.cfm?id=2909449
The community now uses CK to collaboratively tackle old problems
Crowdsource performance analysis and optimization, machine learning and
run-time adaptation across diverse workloads and hardware provided
by volunteers
$ sudo pip install ck
$ ck pull repo:ck-crowdtuning
$ ck crowdtune program --gcc
Reproducibility came as a side effect!
• Can preserve the whole experimental setup with all data and software dependencies
• Can perform statistical analysis for characteristics
• Community can add missing features or improve machine learning models
Execution time:
10 sec.
Reproducibility of experimental results as a side effect
Variation of experimental results:
10 ± 5 secs.
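Reporting variation rather than a single number takes only a few lines of standard Python. The sketch below reuses the three execution times from the JSON example earlier in the deck; the `summarize` helper and the choice of minimum, mean and sample standard deviation are assumptions made for illustration, not the exact statistics CK computes.

```python
import statistics


def summarize(samples):
    """Return basic statistics for repeated measurements of one characteristic."""
    return {
        "repetitions": len(samples),
        "min": min(samples),
        "max": max(samples),
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples),  # sample standard deviation
    }


# Execution times (sec.) stored as strings, as in the JSON example
times = [float(t) for t in ["10.3", "10.1", "13.3"]]
summary = summarize(times)
print("%.1f ± %.1f sec." % (summary["mean"], summary["stdev"]))
```

Keeping the raw samples alongside the summary matters: a mean and a deviation hide whether the distribution has one peak or several.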
[Figure: distribution of execution time (sec.)]
Unexpected behavior: expose it to the community, including experts,
to explain it, find the missing feature and add it to the system
[Figure: distribution of execution time (sec.) split into Class A and Class B; CPU frequency axis: 800MHz, 2400MHz]
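Once a missing feature such as CPU frequency is recorded with every run, a distribution with two peaks can be separated into classes by grouping on that feature. The sketch below assumes each run is a dictionary with a `frequency_mhz` field and a `time` field; both field names and all the numbers are made up for illustration and are not measurements from the talk.

```python
from collections import defaultdict
import statistics


def mean_time_by_frequency(runs):
    """Group execution times by recorded CPU frequency and average each group."""
    groups = defaultdict(list)
    for run in runs:
        groups[run["frequency_mhz"]].append(run["time"])
    return {freq: statistics.mean(times) for freq, times in sorted(groups.items())}


# Made-up runs: the same program measured at two CPU frequencies
runs = [
    {"frequency_mhz": 800, "time": 30.0},
    {"frequency_mhz": 2400, "time": 10.0},
    {"frequency_mhz": 800, "time": 30.2},
    {"frequency_mhz": 2400, "time": 10.2},
]

means = mean_time_by_frequency(runs)
for freq, mean_time in means.items():
    print("%d MHz: %.1f sec." % (freq, mean_time))
```

Without the frequency field the four runs would look like one noisy measurement; with it, each class becomes a stable result in its own right.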
Gradually extend/fix shared workflows by the community during AE
• Init pipeline
• Detected system information
• Initialize parameters
• Prepare dataset
• Clean program
• Prepare compiler flags
• Use compiler profiling
• Use cTuning CC/MILEPOST GCC for fine-grain program analysis and tuning
• Use universal Alchemist plugin (with any OpenME-compatible compiler or tool)
• Use Alchemist plugin (currently for GCC)
• Compile program
• Get objdump and md5sum (if supported)
• Use OpenME for fine-grain program analysis and online tuning (build & run)
• Use 'Intel VTune Amplifier' to collect hardware counters
• Use 'perf' to collect hardware counters
• Set frequency (in Unix, if supported)
• Get system state before execution
• Run program
• Check output for correctness (use dataset UID to save different outputs)
• Finish OpenME
• Misc info
• Observed characteristics
• Observed statistical characteristics
• Finalize pipeline
We’ve shared and continuously extend our
experimental workflow for autotuning:
http://github.com/ctuning/ck-autotuning
http://cknowledge.org/repo
http://github.com/ctuning
• Hundreds of benchmarks/kernels/codelets
(CPU, OpenMP, OpenCL, CUDA)
• Thousands of data sets
• Wrappers around all main tools and libs
• Optimization description of major compilers
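The idea of a pipeline built from small stages that exchange dictionaries can be sketched as follows. Every stage takes the shared state and returns a dictionary with a `return` code (0 for success) and, on failure, an `error` message. The stage names echo a few steps from the list above, but their bodies are placeholders invented for illustration, and the whole block is an assumption about the style rather than code from ck-autotuning.

```python
def run_pipeline(stages, state):
    """Run stages in order, merging each result into the shared state."""
    for stage in stages:
        result = stage(state)
        if result.get("return", 0) > 0:
            return result  # stop at the first failing stage
        state.update(result)
    state["return"] = 0
    return state


def prepare_compiler_flags(state):
    # Placeholder: a real stage would pick flags from the exploration strategy
    return {"return": 0, "compiler_flags": "-O3"}


def compile_program(state):
    # Placeholder: only builds the command line, does not invoke a compiler
    cmd = "%s %s %s" % (state["compiler"], state["compiler_flags"], state["source"])
    return {"return": 0, "compile_cmd": cmd}


def check_output(state):
    if "compile_cmd" not in state:
        return {"return": 1, "error": "program was not compiled"}
    return {"return": 0, "checked": True}


out = run_pipeline(
    [prepare_compiler_flags, compile_program, check_output],
    {"compiler": "gcc", "source": "susan.c"},
)
print(out["compile_cmd"])
```

Because stages only communicate through the dictionary, one can be added, removed or replaced without touching the others, which is what makes such a workflow easy for reviewers and other researchers to extend.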
Distinguished CGO’17 artifact in Collective Knowledge format
The highest-ranked artifact at CGO’17 turned out to be implemented
using the Collective Knowledge framework:
"Software Prefetching for Indirect Memory Accesses",
Sam Ainsworth and Timothy M. Jones
https://github.com/SamAinsworth/reproduce-cgo2017-paper
It takes advantage of the portable package manager
to install the required LLVM for either x86 or ARM platforms,
automatically builds LLVM plugins,
runs empirical experiments on a user machine via a CK workflow,
compares speedups with results pre-recorded by the authors,
and prepares an interactive report
See PDF with Artifact Evaluation Appendix:
http://ctuning.org/ae/resources/paper-with-distinguished-ck-artifact-and-ae-appendix-cgo2017.pdf
See snapshot of an interactive CK dashboard:
https://github.com/SamAinsworth/reproduce-cgo2017-paper/files/618737/ck-aarch64-dashboard.pdf
Collective Knowledge mini-tutorials
Create repository: ck add repo:my_new_project
Add new module: ck add my_new_project:module:my_module
Add new data for this module: ck add my_new_project:my_module:my_data @@dict
{"tags":"cool","data"}
Add dummy function to module: ck add_action my_module --func=my_func
Test dummy function: ck my_func my_module --param1=var1
List my_module data: ck list my_module
Find data by tags: ck search my_module --tags=cool
Archive repository: ck zip repo:my_new_project
Pull existing repo from GitHub: ck pull repo:ck-autotuning
List modules from this repo: ck list ck-autotuning:module:*
Compile program (using GCC): ck compile program:cbench-automotive-susan --speed
Run program: ck run program:cbench-automotive-susan
Start server for crowdsourcing: ck start web
View interactive articles: ck browser http://localhost:3344
https://michel-steuwer.github.io/About-CK/
https://github.com/ctuning/ck/wiki/Compiler-autotuning
http://github.com/ctuning/ck/wiki
CK demo: workflow to collaboratively optimize DNNs
• “Deep” (multi-layered) neural networks that take advantage of
structure in digital signals (e.g. images).
• State-of-the-art for applications in automotive, robotics,
healthcare, entertainment (e.g. image classification).
• Commonly trained on workstations or clusters with GPGPUs.
• Increasingly deployed on mobile and embedded systems.
Difficult and time-consuming to build and optimize
due to multiple programming models, multiple objectives
(performance, energy, accuracy, memory footprint)
and continuously evolving hardware,
particularly on platforms with constrained resources (sensors, IoT)
Perfect use case for Collective Knowledge Framework!
Deploying DNNs on resource-constrained platforms
• Can we deploy a DNN to achieve desired performance
(rate and accuracy of recognition) on a given platform?
• Can we identify or build such a platform under given
constraints such as those on power, memory, price?
• If all else fails, can we design (autotune) another DNN
(topology) by trading off performance, accuracy and costs?
NVIDIA Drive PX2: ~250 Watts, ~$15,000 (early development boards), ~$1,000 per unit?
? (platform yet to be identified): < 10 Watts, < $100
Customizable CK workflow to collaboratively co-design DNNs
We collaborate with General Motors and ARM to develop open-source
CK-based tools to collaboratively evaluate and optimize various DNN engines
and models across diverse platforms and datasets
cKnowledge.org/ai
cKnowledge.org/repo
Download Android app to participate in collaborative evaluation
and optimization of DNN using your mobile device and CK web-service:
https://play.google.com/store/apps/details?id=openscience.crowdsource.video.experiments
The CK portable workflow system and package manager allow users to run DNN
experiments on Windows, Linux, MacOS and Android!
$ ck pull repo --url=http://github.com/dividiti/ck-caffe
$ ck install package:lib-caffe-bvlc-master-cpu-universal
$ ck install package:lib-caffe-bvlc-opencl-viennacl-universal --target_os=android21-arm64
$ ck run program:caffe-classification
$ firefox http://cKnowledge.org/repo
• Collective Knowledge approach helps fight SW/HW chaos
• CK is also changing the mentality of computer systems’ researchers:
• sharing artifacts as customizable and reusable components with JSON API
• sharing customizable and portable experimental workflows
• building a repository of representative benchmarks and data sets
• crowdsourcing experiments and sharing negative/unexpected results
• collaboratively improving reproducibility of empirical experiments
• using data mining (statistical analysis and machine learning) to predict
efficient software optimizations and hardware designs
• collaboratively improving prediction models and finding missing features
• formulating and solving important real-world problems
• CK brings industry and academia closer together (common research
methodology, reproducible experiments, validated techniques)
cKnowledge.org: from ad-hoc computer systems’ research to data science
Current state of computer engineering
[Figure: cloud of tools and techniques lying between "Task, idea" and "Result": GCC 4.1.x, 4.2.x, 4.3.x, 4.4.x, 4.5.x, 4.6.x, 4.7.x; ICC 10.1, 11.0, 11.1, 12.0, 12.1; LLVM 2.6, 2.7, 2.8, 2.9, 3.1; Phoenix; MVS XLC; Open64; Jikes; Testarossa; OpenMP; MPI; HMPP; OpenCL; CUDA; gprof; prof; perf; oprofile; PAPI; TAU; Scalasca; VTune Amplifier; likwid; TBB; MKL; ATLAS; scheduling; algorithm-level; program-level; function-level; codelet; loop-level; hardware counters; IPA; polyhedral transformations; LTO; threads; process; pass reordering; run-time adaptation; per phase reconfiguration; cache size; frequency; bandwidth; HDD size; TLB; ISA; memory size; cores; processors; power consumption; execution time; reliability]
Task, idea
Result
Quick, non-reproducible hack? Ad-hoc heuristic? Quick publication?
Waste of expensive resources and energy?
The community (“crowd”): collaboratively optimize shared programs and data sets across diverse hardware
Systematization and unification of collective knowledge (big data)
Classification, predictive modeling
Optimal solutions
Extrapolate shared knowledge to build fast, energy efficient, reliable and cheap
computer systems to boost innovation in science and technology!
Collective Knowledge open RDK and repository
cKnowledge.org: enable open & reproducible computer systems’ research!
Help us improve artifact evaluation!
We need your feedback: remember that new AE procedures
may affect you at future conferences
• AE steering committee: http://cTuning.org/ae/committee.html
• Mailing list: https://groups.google.com/forum/#!forum/collective-knowledge
Feel free to reuse and improve our Artifact Evaluation procedures
and Artifact Appendices:
http://cTuning.org/ae
https://github.com/ctuning/ck-web-artifact-evaluation
Extra resources
• ACM Result and Artifact Review and Badging policy:
http://www.acm.org/publications/policies/artifact-review-badging
• Evaluation of frameworks for reproducible R&D:
https://wilkie.github.io/reproducibility-site
• Community driven artifact evaluation and paper reviewing:
http://dl.acm.org/citation.cfm?doid=2618137.2618142
Questions, comments, collaborations?
Many new exciting opportunities for collaboration
Artifact sharing, customizable and portable workflows, experiment crowdsourcing,
collaborative deep learning optimization, public repositories of knowledge, adaptive systems
Acknowledgments

complex analysis best book for solving questions.pdfcomplex analysis best book for solving questions.pdf
complex analysis best book for solving questions.pdfSubhamKumar3239
 
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...Christina Parmionova
 
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxEnvironmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxpriyankatabhane
 
Oxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptxOxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptxfarhanvvdk
 
Pests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPRPests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPRPirithiRaju
 
Abnormal LFTs rate of deco and NAFLD.pptx
Abnormal LFTs rate of deco and NAFLD.pptxAbnormal LFTs rate of deco and NAFLD.pptx
Abnormal LFTs rate of deco and NAFLD.pptxzeus70441
 
CHROMATOGRAPHY PALLAVI RAWAT.pptx
CHROMATOGRAPHY  PALLAVI RAWAT.pptxCHROMATOGRAPHY  PALLAVI RAWAT.pptx
CHROMATOGRAPHY PALLAVI RAWAT.pptxpallavirawat456
 
Probability.pptx, Types of Probability, UG
Probability.pptx, Types of Probability, UGProbability.pptx, Types of Probability, UG
Probability.pptx, Types of Probability, UGSoniaBajaj10
 
cybrids.pptx production_advanges_limitation
cybrids.pptx production_advanges_limitationcybrids.pptx production_advanges_limitation
cybrids.pptx production_advanges_limitationSanghamitraMohapatra5
 
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Sérgio Sacani
 
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer ZahanaEGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer ZahanaDr.Mahmoud Abbas
 

Recently uploaded (20)

LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2
LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2
LESSON PLAN IN SCIENCE GRADE 4 WEEK 1 DAY 2
 
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learning
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep LearningCombining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learning
Combining Asynchronous Task Parallelism and Intel SGX for Secure Deep Learning
 
Introduction of Human Body & Structure of cell.pptx
Introduction of Human Body & Structure of cell.pptxIntroduction of Human Body & Structure of cell.pptx
Introduction of Human Body & Structure of cell.pptx
 
Total Legal: A “Joint” Journey into the Chemistry of Cannabinoids
Total Legal: A “Joint” Journey into the Chemistry of CannabinoidsTotal Legal: A “Joint” Journey into the Chemistry of Cannabinoids
Total Legal: A “Joint” Journey into the Chemistry of Cannabinoids
 
6.2 Pests of Sesame_Identification_Binomics_Dr.UPR
6.2 Pests of Sesame_Identification_Binomics_Dr.UPR6.2 Pests of Sesame_Identification_Binomics_Dr.UPR
6.2 Pests of Sesame_Identification_Binomics_Dr.UPR
 
final waves properties grade 7 - third quarter
final waves properties grade 7 - third quarterfinal waves properties grade 7 - third quarter
final waves properties grade 7 - third quarter
 
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdfKDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
KDIGO-2023-CKD-Guideline-Public-Review-Draft_5-July-2023.pdf
 
Ultrastructure and functions of Chloroplast.pptx
Ultrastructure and functions of Chloroplast.pptxUltrastructure and functions of Chloroplast.pptx
Ultrastructure and functions of Chloroplast.pptx
 
complex analysis best book for solving questions.pdf
complex analysis best book for solving questions.pdfcomplex analysis best book for solving questions.pdf
complex analysis best book for solving questions.pdf
 
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...
 
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptxEnvironmental Acoustics- Speech interference level, acoustics calibrator.pptx
Environmental Acoustics- Speech interference level, acoustics calibrator.pptx
 
Introduction Classification Of Alkaloids
Introduction Classification Of AlkaloidsIntroduction Classification Of Alkaloids
Introduction Classification Of Alkaloids
 
Oxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptxOxo-Acids of Halogens and their Salts.pptx
Oxo-Acids of Halogens and their Salts.pptx
 
Pests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPRPests of Sunflower_Binomics_Identification_Dr.UPR
Pests of Sunflower_Binomics_Identification_Dr.UPR
 
Abnormal LFTs rate of deco and NAFLD.pptx
Abnormal LFTs rate of deco and NAFLD.pptxAbnormal LFTs rate of deco and NAFLD.pptx
Abnormal LFTs rate of deco and NAFLD.pptx
 
CHROMATOGRAPHY PALLAVI RAWAT.pptx
CHROMATOGRAPHY  PALLAVI RAWAT.pptxCHROMATOGRAPHY  PALLAVI RAWAT.pptx
CHROMATOGRAPHY PALLAVI RAWAT.pptx
 
Probability.pptx, Types of Probability, UG
Probability.pptx, Types of Probability, UGProbability.pptx, Types of Probability, UG
Probability.pptx, Types of Probability, UG
 
cybrids.pptx production_advanges_limitation
cybrids.pptx production_advanges_limitationcybrids.pptx production_advanges_limitation
cybrids.pptx production_advanges_limitation
 
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
Observation of Gravitational Waves from the Coalescence of a 2.5–4.5 M⊙ Compa...
 
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer ZahanaEGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
EGYPTIAN IMPRINT IN SPAIN Lecture by Dr Abeer Zahana
 

Enabling open and reproducible computer systems research: the good, the bad and the ugly

  • 1. Enabling open and reproducible research at computer systems' conferences: the good, the bad and the ugly
      CNRS webinar, Grenoble, March 2017
      Grigori Fursin
      Chief Scientist, cTuning foundation, France
      CTO, dividiti, UK
      fursin.net/research
      cTuning.org/ae
  • 2. Seminar outline
      Grigori Fursin “Enabling open and reproducible research at computer systems' conferences ( cKnowledge.org )”
      • What is computer systems research?
      • Major problems in computer systems’ research in the past 15 years
      • Artifact Evaluation Initiative
          • Good: active community participation with ACM/industrial support
          • Bad: software and hardware chaos; highly stochastic behaviour
          • Ugly: ad-hoc, non-portable scripts difficult to customize and reuse
      • Improving Artifact Evaluation
          • Preparing common replication/reproducibility methodology (new ACM taskforce on reproducibility)
          • Introducing community-driven artifact and paper reviewing
          • Introducing common workflow framework (Collective Knowledge)
          • Introducing simple JSON API and meta for artifacts
          • Introducing portable and customizable package manager
      • Demonstrating open and reproducible computer systems’ research
          • Collaboratively optimizing deep learning across diverse datasets/SW/HW
      • Conclusions and call for action!
  • 3. Back to basics: what is computer systems’ research?
      Users of computer systems (researchers, engineers, entrepreneurs) want to quickly prototype their algorithms or develop efficient, reliable and cheap products.
      Diagram labels: Computer System; Result
  • 4. Back to basics: what is computer systems’ research?
      Users of computer systems (researchers, engineers, entrepreneurs) want to quickly prototype their algorithms or develop efficient, reliable and cheap products.
      Computer systems’ researchers, software developers and hardware designers help end-users:
      • improve all characteristics of their tasks (execution time, power consumption, numerical instability, size, faults, price …)
      • guarantee real-time constraints (bandwidth, QoS, etc)
      Diagram labels: Algorithm; Program; Compilers; Binary and libraries; Run-time environment; State of the system; Data set; Hardware, Simulators; Storage; Result
  • 5. Two major problems: rising complexity and physical limitations
      Finding an efficient, reliable and cheap solution for end-user tasks is far from trivial!
      • Thousands of benchmarks and real applications
      • MPI, OpenMP, TBB, CUDA, OpenCL, StarPU, OmpSs
      • C, C++, Fortran, Java, Python, assembler
      • LLVM, GCC, ICC, PGI (hundreds of optimizations)
      • BLAS, MAGMA, ViennaCL, cuBLAS, clBLAST, cuDNN, openBLAS, clBLAS
      • TensorFlow, Caffe, Torch, TensorRT
      • Infinite number of possible data sets
      • Linux, Windows, Android, MacOS
      • heterogeneous, many-core, out-of-order, cache
      • x86, ARM, PTX, NN, extensions
      • Numerous architecture and platform simulators
      Too many design and optimization choices!
      Diagram labels: Algorithm; Program; Compilers; Binary and libraries; Run-time environment; State of the system; Data set; Hardware, Simulators; Storage; Result
  • 6. What are you paying for?
      Mobile devices, sensors, IoT:
      • Smart and powerful mobile devices are everywhere (mobile phones, IoT)
      • Many attempts to run DNN, HOG, SLAM algorithms
      • Various SW/HW optimizations may result in 7x speedups and 5x energy savings, but poor accuracy
      • 2x speedups without sacrificing accuracy: enough to enable RT processing
      HPC platforms:
      • Weather prediction; physics; medicine; finances
      • Supercomputers and data centers cost millions of $ to build, install and use
      • Unchanged algorithms may run at only a fraction of peak performance, thus wasting expensive resources and energy!
      • Some years later you may get more efficient algorithms, just before new systems arrive!
      Users expect new platforms to be faster, more energy efficient, more accurate and more reliable. Is it true?
      New systems require further tedious, ad-hoc and error-prone optimization. This slows down innovation and development of new products.
  • 7. How can we solve these problems?
      My original background is NOT in computer engineering but in physics, electronics and machine learning (first research project in 1994 to develop an artificial neural network chip).
      What did I learn from other sciences that deal with complex systems: physics, mathematics, chemistry, biology, AI …?
      Major breakthroughs came from collaborative and reproducible R&D based on sharing, validation, systematization and reuse of artifacts and knowledge!
      Techniques: statistical analysis, data mining, machine learning
  • 8. cTuning.org (2008-cur): collaborative, machine learning-based optimization
      Idea: cTuning1 framework and public portal to
      1) share realistic benchmarks and data sets
      2) share whole experimental setups (benchmarking and autotuning)
      3) crowdsource empirical optimization
      4) collect results in a centralized repository
      5) apply machine learning to predict optimizations
      6) involve the community to improve models
      • G. Fursin et al. MILEPOST GCC: Machine learning based self-tuning compiler. 2008, 2011
      • G. Fursin. Collective Tuning Initiative: automating and accelerating development and optimization of computing systems, 2009
      Diagram labels: Algorithm; Program; Compilers; Binary and libraries; Run-time environment; State of the system; Data set; Hardware, Simulators; Storage; Result
  • 9. cTuning.org/ae (2014-cur.): artifact evaluation at CGO, PPoPP, PACT, SC
      Numerous publications
      “Artifact Evaluation for Publications” (Dagstuhl Perspectives Workshop 15452), 2016, Bruce R. Childers, Grigori Fursin, Shriram Krishnamurthi, Andreas Zeller, http://drops.dagstuhl.de/opus/volltexte/2016/5762
      My personal AE goals
      • validate experimental results from published articles and restore trust (see ACM TRUST’14 @ PLDI workshop: cTuning.org/event/acm-trust2014)
      • promote artifact sharing (benchmarks, data sets, tools, models)
      • enable fair comparison of results and techniques
      • develop common methodology for reproducible computer systems’ research
      • build upon others’ research
  • 10. How does Artifact Evaluation (AE) work?
      Authors of accepted articles have the option to submit related material to an AE committee for evaluation.
      Submission: http://cTuning.org/ae/submission.html
      • Abstract
      • Packed artifact (or remote access)
      • Artifact Appendix (with a version to keep track of the methodology)
      Reviewing: http://ctuning.org/ae/reviewing.html
      • Formalized reviewing process
      • Multiple criteria for artifact evaluation
      • Artifact ranking:
          1. Significantly exceeded expectations
          2. Exceeded expectations
          3. Met expectations
          4. Fell below expectations
          5. Significantly fell below expectations
      PC members nominate one or two senior PhD students/engineers for the AE committee, which creates a pool of reviewers!
      A paper whose artifacts passed evaluation receives an AE badge.
  • 11. Artifact evaluation timeline
      Timeline stages: paper accepted; artifacts submitted; evaluator bidding; artifacts assigned; evaluations available; evaluations finalized; artifact decision
      • 7..12 days to prepare artifacts according to guidelines: cTuning.org/submission.html
      • 2..4 days for evaluators to bid on artifacts (according to their knowledge and access to required SW/HW)
      • 2 days to assign artifacts: ensure at least 3 reviews per artifact, reduce risks, avoid mix-ups, minimize conflicts of interest
      • 2 weeks to review artifacts according to guidelines: cTuning.org/reviewing.html
      • 3..4 days for authors to respond to reviews and fix problems
      • 2..3 days to finalize reviews
      • 2..3 days to add AE stamp and AE appendix to a camera-ready paper
      Light communication between authors and reviewers is allowed via AE chairs (to preserve anonymity of the reviewers).
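The step durations on the timeline slide can be added up to get a rough end-to-end estimate. The sketch below assumes the seven steps run strictly one after another with no overlap, and reads "2 weeks" as 14 days; both are assumptions, and the step labels are shorthand rather than official names.

```python
# Each step is (label, min_days, max_days), taken from the timeline slide.
STEPS = [
    ("prepare artifacts", 7, 12),
    ("evaluator bidding", 2, 4),
    ("assign artifacts", 2, 2),
    ("review artifacts", 14, 14),  # "2 weeks" read as 14 days (assumption)
    ("author response", 3, 4),
    ("finalize reviews", 2, 3),
    ("add AE stamp and appendix", 2, 3),
]


def total_days(steps):
    """Return (shortest, longest) total duration, assuming sequential steps."""
    shortest = sum(min_days for _, min_days, _ in steps)
    longest = sum(max_days for _, _, max_days in steps)
    return shortest, longest


shortest, longest = total_days(STEPS)
print(f"AE takes roughly {shortest}..{longest} days after paper acceptance")
```

Under these assumptions the whole process spans about five to six weeks, which is consistent with the "very intense schedule" complaint raised later in the talk.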
  • 12. Artifact Evaluation: good
      Year  PPoPP  CGO  PACT  Total  Problems  Rejected
      2015  10     8    -     18     7         2
      2016  12     11   -     23     4         0
      2016  -      -    5     5      2         0
      2017  14     13   -     27     7         0
      • Strong support from academia, industry and ACM
      • Active participation in AE discussion sessions
      • Lots of feedback
      • Many interesting artifacts!
      NOTE: we consider AE a cooperative process and try to help authors fix artifacts and pass evaluation (particularly if artifacts will be open-sourced).
      We use this practical experience (> 70 papers in the past 3 years!) to continuously improve common experimental methodology for reproducible computer systems’ research.
  • 13. ACM taskforce on reproducibility
      In 2016 ACM organized a special taskforce (former AE chairs) to develop common methodology for artifact sharing and evaluation across all SIGs!
      We produced the “Result and Artifact Review and Badging” policy: http://www.acm.org/publications/policies/artifact-review-badging
      1) Define terminology
          • Repeatability (same team, same experimental setup)
          • Replicability (different team, same experimental setup)
          • Reproducibility (different team, different experimental setup)
      2) Prepare new sets of badges (covering various SIGs)
          • Artifacts Evaluated – Functional
          • Artifacts Evaluated – Reusable
          • Artifacts Available
          • Results Replicated
          • Results Reproduced
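The three terms on the taskforce slide differ only in who re-runs the experiment and on which setup, so they can be written as a small lookup. This is a minimal sketch; the function name `classify` is hypothetical, and the case "same team, different setup" is not named on the slide, so it returns `None` here.

```python
from typing import Optional


def classify(same_team: bool, same_setup: bool) -> Optional[str]:
    """Map (team, setup) to the term used on the slide; None if the slide names no term."""
    if same_team and same_setup:
        return "repeatability"
    if not same_team and same_setup:
        return "replicability"
    if not same_team and not same_setup:
        return "reproducibility"
    return None  # same team, different setup: not defined on the slide


# An AE committee re-running the authors' packaged setup is a different team
# on the same experimental setup.
print(classify(same_team=False, same_setup=True))
```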
  • 14. We introduced Artifact Appendices to describe experiments
      Two years ago we introduced Artifact Appendix templates to unify artifact submissions and let authors add up to two pages of such appendices to their camera-ready paper:
      http://cTuning.org/ae/submission.html
      http://cTuning.org/ae/submission_extra.html
      The idea is to help readers better understand what was evaluated and let them reproduce published research and build upon it.
      We did not receive complaints about our appendices, and many researchers decided to add them to their camera-ready papers (see http://cTuning.org/ae/artifacts.html).
      Similar AE appendices are now used by other conferences (SC, RTSS): http://sc17.supercomputing.org/submitters/technical-papers/reproducibility-initiatives-for-technical-papers/artifact-description-paper-title
      We are now trying to unify Artifact Appendices across all SIGs.
  • 15. Artifact Evaluation: bad
      • too many artifacts to evaluate: need to somehow scale AE while keeping the quality (41 evaluators, ~120 reviews to handle during 2.5 weeks)
      • difficult to find evaluators with appropriate skills and access to proprietary SW and rare HW
      • very intense schedule and not enough time for rebuttals
      • communication between authors and reviewers via AE chairs is a bottleneck
      Timeline stages: paper accepted; artifacts submitted; evaluator bidding; artifacts assigned; evaluations available; evaluations finalized; artifact decision
      Timeline durations: 7..12 days; 2..4 days; 2 days; 2 weeks; 3..4 days; 2..3 days; 2..3 days
  • 16. New option of open artifact evaluation!
      Introduce two evaluation options: private and public
      a) traditional evaluation for private artifacts (for example, from industry, though less and less common)
          Timeline stages: paper accepted; artifacts submitted; evaluator bidding; artifacts assigned; evaluations available; evaluations finalized; artifact decision
          Timeline durations: 7..12 days; 2..4 days; 2 days; 2 weeks; 3..4 days; 2..3 days; 2..3 days
      b) open evaluation of public and open-source artifacts (if already available at GitHub, BitBucket, GitLab with “discussion mechanisms” during submission…)
          Timeline stages: paper accepted; artifacts submitted; AE chairs announce public artifacts at XSEDE/GRID5000/etc; AE chairs monitor open discussions until artifacts are evaluated; artifact decision
          Timeline durations: any time; 1..2 days; from a few days to 2 weeks; 3..4 days; 2..3 days
  • 17. Trying open artifact and paper evaluation
      At CGO/PPoPP’17, we sent requests to validate several open-source artifacts to public mailing lists from the conferences, networks of excellence, supercomputer centers, etc.
      We found evaluators who were willing to help and had access to rare hardware or supercomputers as well as the required software and proprietary benchmarks.
      Authors quickly fixed issues and answered research questions while AE chairs steered the discussion!
      GRID5000 users participated in open evaluation of a PPoPP’17 artifact: https://github.com/thu-pacman/self-checkpoint/issues/1
      See other public evaluation examples: cTuning.org/ae/artifacts.html
      We validated open reviewing of publications via Reddit at ADAPT’16: http://adapt-workshop.org
  • 18. Artifact Evaluation: what can possibly be ugly ;) ?
      Diagram labels: Algorithm; Program; Compilers; Binary and libraries; Run-time environment; State of the system; Data set; Hardware, Simulators; Storage; Result
  • 19. Ugly: no common experimental methodology and SW/HW chaos
      • difficult (sometimes impossible) to reproduce empirical results across an ever-changing software and hardware stack (highly stochastic behavior)
      • everyone uses their own ad-hoc scripts to prepare and run experiments with many hardwired paths
      • practically impossible to customize and reuse artifacts (for example, try another compiler, library, data set)
      • practically impossible to run on another OS or platform
      • no common API and meta information for shared artifacts and results (benchmarks, data sets, tools)
      Word cloud between Algorithm and Result: GCC 4.1.x, GCC 4.2.x, GCC 4.3.x, GCC 4.4.x, GCC 4.5.x, GCC 4.6.x, GCC 4.7.x, ICC 10.1, ICC 11.0, ICC 11.1, ICC 12.0, ICC 12.1, LLVM 2.6, LLVM 2.7, LLVM 3.7, LLVM 2.9, LLVM 3.0, Phoenix, MVS 2013, XLC, Open64, Jikes, Testarossa, OpenMP, MPI, HMPP, OpenCL, CUDA 4.x, gprof, prof, perf, oprofile, PAPI, TAU, Scalasca, VTune Amplifier, predictive scheduling, algorithm-level, TBB, MKL, ATLAS, program-level, function-level, Codelet, loop-level, hardware counters, IPA, polyhedral transformations, LTO, pass reordering, per phase reconfiguration, frequency, bandwidth, HDD size, TLB, memory size, execution time, GCC 5.2, LLVM 3.4, SVM, ARM v8, Intel SandyBridge, SSE4, AVX, CUDA 7.x, SimpleScalar, algorithm precision
  • 20. Awarding distinguished artifacts: not enough!
      The cTuning foundation and NVIDIA are pleased to present the Joint CGO/PPoPP Distinguished Artifact Award (February 2017)
      for “Demystifying GPU Microarchitecture to Tune SGEMM Performance”
      Xiuxia Zhang (1), Guangming Tan (1), Shuangbai Xue (1), Jiajia Li (2), Mingyu Chen (1)
      (1) Chinese Academy of Sciences; (2) Georgia Institute of Technology
  • 21. Using Virtual Machines and Docker
      VM or Docker images are good at hiding complexity, but they do not solve the major problems in computer systems’ research.
      Docker is useful to archive artifacts while hiding all underlying complexity!
      However, Docker does not address many other issues vital for open and collaborative computer systems’ research, i.e. how to
      1) work with a native user SW/HW environment
      2) customize and reuse artifacts and workflows
      3) capture run-time state
      4) deal with hardware dependencies
      5) deal with proprietary benchmarks and tools
      6) automate validation of experiments
      Images are also often too large!
      Need a portable and customizable workflow framework suitable for computer systems’ research!
      Diagram: the word cloud of compilers, tools, libraries, optimizations and metrics from slide 19, wrapped in a VM box that produces the Result
  • 22. Some attempts to clean up this mess in computer systems’ R&D
      • Better package managers: apt; yum; spack; pip; homebrew …
      • Smart build tools: cmake; easybuild …
      • Multi-versioning: spack; easybuild; virtualenv …
      • Third-party resources: cKnowledge.org/reproducibility
      Numerous online projects exist, but we want to systematize our local artifacts without being locked into third-party web services or having to learn a complex GUI.
      Missing: a portable and customizable workflow framework suitable for computer systems’ research!
      Diagram: the word cloud of compilers, tools, libraries, optimizations and metrics from slide 19, between Algorithm and Result
  • 23. Open-source Collective Knowledge Framework (2015-cur.)
      cKnowledge.org ; github.com/ctuning/ck
      Eventually we had no choice but to use all our experience and develop a portable and customizable workflow framework for computer systems’ research:
      1) organize your artifacts (programs, data sets, tools, scripts) as customizable and reusable components with JSON API and meta data
      2) assemble portable and customizable workflows with JSON API from shared artifacts as LEGO™
      3) integrate a portable package and environment manager which can detect multiple versions of required software or install it across Linux, MacOS, Windows and Android
      4) integrate a web server to show interactive reports (locally!) or exchange data when crowdsourcing experiments
      Diagram: the word cloud of compilers, tools, libraries, optimizations and metrics from slide 19, between Algorithm and Result
  • 24. Noticed during AE: all projects have similar structure
      Create project directory
      • Programs (image corner detection; matmul OpenCL; compression; neural network CUDA)
          Ad-hoc scripts to compile and run a program…
          Have some common meta: which datasets can be used, how to compile, CMD, …
      • Data sets (image-jpeg-0001; bzip2-0006; txt-0012; video-raw-1280x1024)
          Ad-hoc dirs for data sets with some ad-hoc scripts to find them, extract features, etc
          Have some (common) meta: filename, size, width, height, colors, …
      • Compilers (GCC V5.2; LLVM 3.6; Intel Compilers 2015)
          Ad-hoc scripts to set up environment for a given and possibly proprietary compiler
          Have some common meta: compilation, linking and optimization flags
      • Experiments (csv speedups; txt hardware counters; xls table with graph)
          Ad-hoc dirs and scripts to record and analyze experiments
          Have some common meta: features, characteristics, optimizations
      Diagram labels: Idea; Algorithm; Program; Compiler; Binary and libraries; Architecture; Run-time environment; State of the system; Data set; Result
  • 25. Collective Knowledge: organize and share artifacts as reusable components
      Each entry has a data UID and alias, and a meta.json file.
      • Python module “program” with functions: compile and run
          Entries: image corner detection; matmul OpenCL; compression; neural network CUDA
      • Python module “soft” with function: setup
          Entries: GCC V5.2; LLVM 3.6; Intel Compilers 2015
      • Python module “dataset” with function: extract_features
          Entries: image-jpeg-0001; bzip2-0006; txt-0012; video-raw-1280x1024
      • Python module “experiment” with functions: add, get, analyze
          Entries: csv speedups; txt hardware counters; xls table with graph
  • 26. Collective Knowledge: organize and share your artifacts as reusable components
    Same modules and entries as on slide 25 (“program”, “soft”, “dataset”, “experiment”; each entry with meta.json, data UID and alias). Every module function takes JSON input and returns JSON output:
      ck <function> <module>:<data UID> @input.json
    CK: small Python module (~200Kb); no extra dependencies; Linux; Win; MacOS
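The calling convention on slide 26 (a function, a module, a data entry and a JSON input producing a JSON output) can be illustrated with a minimal sketch. This is not CK's implementation: the `MODULES` registry, the `access` helper, the `data_uoa` key and the `return` error code are hypothetical names chosen for illustration only.

```python
import json

# Hypothetical stand-ins for module functions: each takes a dict and returns a dict.
def program_compile(i):
    return {"return": 0, "binary": i["data_uoa"] + ".out", "flags": i.get("flags", "")}

def program_run(i):
    return {"return": 0, "cmd": "./" + i["data_uoa"] + ".out"}

# Hypothetical registry: module name -> function name -> callable.
MODULES = {"program": {"compile": program_compile, "run": program_run}}

def access(function, target, input_json="{}"):
    """Dispatch '<function> <module>:<data>' with a JSON input; return a JSON string."""
    module, _, data = target.partition(":")
    i = json.loads(input_json)
    i["data_uoa"] = data
    try:
        func = MODULES[module][function]
    except KeyError:
        return json.dumps({"return": 1, "error": "unknown module or function"})
    return json.dumps(func(i))

out = json.loads(access("compile", "program:matmul", '{"flags": "-O3"}'))
```

Because every function has the same dict-in, dict-out shape, the same dispatcher can serve a command line, a web service or another module without per-tool glue code.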
  • 27. Collective Knowledge: organize and share your artifacts as reusable components
    Same modules and entries as on slide 25 (“program”, “soft”, “dataset”, “experiment”; each entry with meta.json, data UID and alias), each with JSON input and JSON output.
    Connect into workflows as LEGO™
    CK: small Python module (~200Kb); no extra dependencies; Linux; Win; MacOS
  • 28. Local repository structure (with internal UID and meta)!
    CK local project repo / module UID and alias / data UID or alias / .cm / meta.json
      • program: image corner detection, matmul OpenCL, compression, neural network CUDA, …
      • soft: gcc 5.2, llvm 3.6, icc 2015, …
      • dataset: image-jpeg-0001, bzip2-0006, video-raw-1280x1024, …
      • experiment: …
      • module: program, soft, dataset, …
    Both code (with API) and data (with meta) are inside the repository.
      • Can be referenced and cross-linked via CID (similar to DOI): module UOA : data UOA
      • Local files: no lock-in to third-party web services
      • No complex GUI
      • Can be shared via GIT/SVN/etc
      • Can be easily indexed via ElasticSearch to speed up search
      • Can be easily connected to Python-based predictive analytics (scikit-learn)
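The directory layout on slide 28 (repo / module / data / .cm / meta.json) is simple enough to read with the standard library alone. The sketch below assumes that layout exactly as written on the slide; the helper names `add_entry`, `load_meta` and `list_entries` and the example tags are hypothetical, and real CK entries also carry UIDs that are not modeled here.

```python
import json
import os
import tempfile

def add_entry(repo, module, data, meta):
    """Create <repo>/<module>/<data>/.cm/meta.json and return its path."""
    cm_dir = os.path.join(repo, module, data, ".cm")
    os.makedirs(cm_dir, exist_ok=True)
    path = os.path.join(cm_dir, "meta.json")
    with open(path, "w") as f:
        json.dump(meta, f)
    return path

def load_meta(repo, cid):
    """Resolve a 'module:data' reference to the entry's meta dictionary."""
    module, data = cid.split(":", 1)
    with open(os.path.join(repo, module, data, ".cm", "meta.json")) as f:
        return json.load(f)

def list_entries(repo, module):
    """List data entries of a module that contain a .cm/meta.json file."""
    base = os.path.join(repo, module)
    return sorted(d for d in os.listdir(base)
                  if os.path.isfile(os.path.join(base, d, ".cm", "meta.json")))

# Build a throwaway repository with two data set entries (tags are made up).
repo = tempfile.mkdtemp()
add_entry(repo, "dataset", "image-jpeg-0001", {"tags": ["image", "jpeg"]})
add_entry(repo, "dataset", "bzip2-0006", {"tags": ["compressed"]})
meta = load_meta(repo, "dataset:image-jpeg-0001")
```

Since entries are plain files, the whole repository can be versioned with Git or indexed by an external search engine without any conversion step.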
  • 29. Provide customizable and SW/HW independent access to tools
    Ad-hoc experimental workflows: tools (CLBlast, OpenBLAS, LLVM 3.8, GCC 7.0, CUDA 8.0, cuBLAS, Caffe OpenCL, Caffe CUDA, Caffe CPU) are tied together by hardwired experiment scripts / cmake, a “black box” “big data” database, and ad-hoc scripts to process CSV, XLS, TXT, etc.
    Implement SW/HW independent, reusable and shareable CK modules with unified CMD and JSON API for various experimental scenarios (such as the program module).
    CK program module: JSON API (input) → Read program meta → Detect all software dependencies; ask user if multiple versions exist → Prepare environment → Compile program → Run program → JSON API (output)
    CK program entry (ck list program): use program entries to describe a given program in meta.json and store sources (if needed)
      • Source files and auxiliary scripts
      • .cm/meta.json: describes soft dependencies, data sets, and how to compile and run this program
    $ ck pull repo:ck-autotuning
    $ ck compile program:cbench-automotive-susan --speed
    $ ck run program:cbench-automotive-susan
    $ ck run program:caffe
  • 30. Portable and customizable package manager in the CK
    Soft entries in CK describe how to detect if a given software is already installed, how to set up all its environment including all paths (to binaries, libraries, include, aux tools, etc), and how to detect its version.
    $ ck detect soft --tags=compiler,cuda
    $ ck detect soft:compiler.gcc
    $ ck detect soft:compiler.llvm --target_os=android19-arm
    $ ck list soft:compiler*
    $ ck detect soft:lib.cublas
    $ ck search soft --tags=blas
    Env entries are created in the CK local repo for all found software instances, together with their meta and an auto-generated environment script env.sh (on Linux) or env.bat (on Windows).
    Local CK repo:
      local / env / 03ca0be16962f471 / env.sh    Tags: compiler,cuda,v8.0
      local / env / 0a5ba198d48e3af3 / env.bat   Tags: lib,blas,cublas,v8.0
    $ ck show env
    $ ck show env --tags=cublas
    $ ck rm env:* --tags=cublas
    Package entries describe how to install a given software if it is not installed (using the install.sh script on a Linux host or install.bat on a Windows host).
    $ ck install package:caffemodel-bvlc-googlenet
    $ ck install package:imagenet-2012-val
    $ ck install package:lib-caffe-bvlc-master-cuda
    $ ck list package:*caffemodel*
    $ ck search package --tags=caffe
      • Works on Linux, Windows, MacOS, Android
      • Detects multiple versions of various software
      • Builds missing software
      • Extended by the community
    github.com/ctuning/ck-env
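The tag-based lookup behind commands such as `ck show env --tags=cublas` can be sketched as a subset test over tag lists. The two entries below copy the UIDs and tags shown on the slide; the `search_env` and `resolve` helpers and their status values are hypothetical and only illustrate the idea of finding an installed instance, asking the user when several match, or falling back to a package install when none does.

```python
# In-memory copy of the two env entries listed on the slide.
ENV_ENTRIES = {
    "03ca0be16962f471": {"tags": ["compiler", "cuda", "v8.0"], "script": "env.sh"},
    "0a5ba198d48e3af3": {"tags": ["lib", "blas", "cublas", "v8.0"], "script": "env.bat"},
}

def search_env(entries, tags):
    """Return UIDs of entries whose tag list contains every requested tag."""
    wanted = set(t for t in tags.split(",") if t)
    return sorted(uid for uid, e in entries.items() if wanted <= set(e["tags"]))

def resolve(entries, tags):
    """Pick one env entry for a dependency, or report why no single one was chosen."""
    found = search_env(entries, tags)
    if not found:
        return {"status": "missing", "uids": []}       # a package install would be needed
    if len(found) > 1:
        return {"status": "ambiguous", "uids": found}  # the user would be asked to choose
    return {"status": "ok", "uids": found}
```

For example, `resolve(ENV_ENTRIES, "cublas")` selects the single cuBLAS entry, while `resolve(ENV_ENTRIES, "v8.0")` matches both entries and is reported as ambiguous.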
  • 31. Customizable and reusable workflows with JSON API as LEGO™
    Chaining CK modules into an experimental pipeline via JSON API to quickly prototype research ideas:
      • Choose exploration strategy
      • Select choices (program, data set, compiler, flags, opts, hardware design, …)
      • Compile program or kernel
      • Run program or kernel
      • Perform statistical analysis
      • Apply Pareto filter
      • Model and predict choices and behavior
      • Reduce complexity
    Around the pipeline: public modular auto-tuning and machine learning repository and buildbot; unified web services; interdisciplinary crowd; shared scenarios from past research.
    Expose and unify information needed for performance analysis and optimization combined with data mining!
    Optimizing/modeling behavior b of any object in the CK (program, library function, kernel, …) as a function of design and optimization choices c, features f and run-time state s, all represented as flattened JSON vectors:
      b = B( c , f , s )
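The "Apply Pareto filter" stage keeps only those measured behaviors b that are not beaten on every objective by some other set of choices c. A minimal sketch, assuming all objectives are minimized; the function name `pareto_filter` and the sample (execution time, code size) pairs are hypothetical and not taken from the talk.

```python
def pareto_filter(points):
    """Keep points not dominated by any other point (all objectives minimized)."""
    def dominates(a, b):
        # a dominates b if it is no worse everywhere and strictly better somewhere.
        return (all(x <= y for x, y in zip(a, b))
                and any(x < y for x, y in zip(a, b)))
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Made-up (execution time in seconds, code size in bytes) pairs, one per choice vector c.
results = [(10.3, 131938), (9.8, 140000), (12.0, 120000), (11.0, 150000)]
frontier = pareto_filter(results)
```

Here the last point is dropped because the first one is both faster and smaller; the remaining three each win on at least one objective. The quadratic scan is adequate for small experiment sets and would need a smarter algorithm for very large ones.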
  • 32. Gradually add JSON specification (depends on research scenario)
    Autotuning and machine learning specification:
      {
        "characteristics":{
          "execution times": ["10.3","10.1","13.3"],
          "code size": "131938", ...},
        "choices":{
          "os":"linux", "os version":"2.6.32-5-amd64",
          "compiler":"gcc", "compiler version":"4.6.3",
          "compiler_flags":"-O3 -fno-if-conversion",
          "platform":{"processor":"intel xeon e5520", "l2":"8192", ...}, ...},
        "features":{
          "semantic features": {"number_of_bb": "24", ...},
          "hardware counters": {"cpi": "1.4", ...}, ...},
        "state":{
          "frequency":"2.27", ...}
      }
    CK flattened JSON key: ##characteristics#execution_times@1
    Description of each flattened key:
      "flattened_json_key":{
        "type": "text" | "integer" | "float" | "dict" | "list" | "uid",
        "characteristic": "yes" | "no",
        "feature": "yes" | "no",
        "state": "yes" | "no",
        "has_choice": "yes" | "no",
        "choices": [ list of strings if categorical choice ],
        "explore_start": "start number if numerical range",
        "explore_stop": "stop number if numerical range",
        "explore_step": "step if numerical range",
        "can_be_omitted": "yes" | "no"
        ...
      }
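The flattened key `##characteristics#execution_times@1` suggests a simple rule: start from `#`, append `#key` for each dictionary level and `@index` for each list position. The sketch below implements that rule as inferred from this single example; it is an assumption about the format, not CK's own flattening code, and the function name `flatten` and the shortened `spec` dictionary are illustrative.

```python
def flatten(obj, prefix="#"):
    """Flatten nested dicts and lists into {flat_key: scalar} pairs."""
    out = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            out.update(flatten(value, prefix + "#" + str(key)))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            out.update(flatten(value, prefix + "@" + str(index)))
    else:
        out[prefix] = obj  # scalar leaf: the accumulated prefix is the flat key
    return out

# Shortened version of the slide's specification, with underscores in key names.
spec = {
    "characteristics": {"execution_times": ["10.3", "10.1", "13.3"],
                        "code_size": "131938"},
    "state": {"frequency": "2.27"},
}
flat = flatten(spec)
```

Flat keys turn an arbitrarily nested experiment record into a fixed-width vector of named columns, which is the form that statistical and machine-learning tools expect. Empty dictionaries and lists produce no keys in this sketch.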
  • 33. The community now uses CK to collaboratively tackle old problems
    Crowdsource performance analysis and optimization, machine learning, and run-time adaptation across diverse workloads and hardware provided by volunteers:
    $ sudo pip install ck
    $ ck pull repo:ck-crowdtuning
    $ ck crowdtune program --gcc
      • “Collective Mind: Towards practical and collaborative autotuning”, Journal of Scientific Programming 22 (4), 2014 http://hal.inria.fr/hal-01054763
      • “Collective Mind, Part II: Towards Performance- and Cost-Aware Software Engineering as a Natural Science”, CPC 2015, London, UK http://arxiv.org/abs/1506.06256
      • Android application to crowdsource autotuning across mobile devices: https://play.google.com/store/apps/details?id=openscience.crowdsource.experiments
      • “Collective Knowledge: towards R&D sustainability”, DATE 2016, Dresden, Germany http://bit.ly/ck-date16
      • “Optimizing convolutional neural networks on embedded platforms with OpenCL”, IWOCL 2016, Amsterdam http://dl.acm.org/citation.cfm?id=2909449
  • 34. Reproducibility of experimental results as a side effect
    Reproducibility came as a side effect!
      • Can preserve the whole experimental setup with all data and software dependencies
      • Can perform statistical analysis for characteristics
      • Community can add missing features or improve machine learning models
    Execution time: 10 sec.
  • 35. Reproducibility of experimental results as a side effect
    Variation of experimental results: 10 ± 5 secs.
  • 36. Reproducibility of experimental results as a side effect
    [Plot: distribution of execution time (sec.)]
    Unexpected behavior: expose it to the community, including experts, to explain it, find the missing feature and add it to the system.
  • 37. Reproducibility of experimental results as a side effect
    [Plot: distribution of execution time (sec.) with two classes: Class A and Class B, separated by CPU frequency (800MHz vs 2400MHz)]
    Unexpected behavior: expose it to the community, including experts, to explain it, find the missing feature and add it to the system.
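Slides 34 to 37 move from a single number to a distribution with two classes. A minimal sketch of that kind of check: summarize repeated measurements, then split them at the widest gap to see whether they fall into two groups. The helper names and the sample timings are hypothetical, and the gap heuristic is a deliberately simple stand-in for proper statistical analysis, not the method used in CK.

```python
import statistics

def summarize(times):
    """Report mean and spread of repeated measurements (needs at least one value)."""
    mean = statistics.mean(times)
    half_range = (max(times) - min(times)) / 2
    return {"mean": mean, "min": min(times), "max": max(times),
            "half_range": half_range, "relative_variation": half_range / mean}

def split_at_largest_gap(times):
    """Split sorted measurements into two groups at the widest gap (needs >= 2 values)."""
    ordered = sorted(times)
    gaps = [ordered[k + 1] - ordered[k] for k in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]

# Made-up repeated measurements (seconds) of the same program on the same machine.
times = [5.1, 5.0, 5.2, 14.9, 15.0, 15.1]
stats = summarize(times)
fast, slow = split_at_largest_gap(times)
```

With these values the mean is about 10 seconds with a spread of about 5 seconds, yet no single run takes 10 seconds: the runs form two tight groups. A hidden feature such as CPU frequency is a natural candidate to record in the run-time state once such a split shows up.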
  • 38. Gradually extend/fix shared workflows by the community during AE
    We’ve shared and continuously extend our experimental workflow for autotuning:
      • Init pipeline
      • Detect system information
      • Initialize parameters
      • Prepare dataset
      • Clean program
      • Prepare compiler flags
      • Use compiler profiling
      • Use cTuning CC/MILEPOST GCC for fine-grain program analysis and tuning
      • Use universal Alchemist plugin (with any OpenME-compatible compiler or tool)
      • Use Alchemist plugin (currently for GCC)
      • Compile program
      • Get objdump and md5sum (if supported)
      • Use OpenME for fine-grain program analysis and online tuning (build & run)
      • Use 'Intel VTune Amplifier' to collect hardware counters
      • Use 'perf' to collect hardware counters
      • Set frequency (in Unix, if supported)
      • Get system state before execution
      • Run program
      • Check output for correctness (use dataset UID to save different outputs)
      • Finish OpenME
      • Misc info
      • Observed characteristics
      • Observed statistical characteristics
      • Finalize pipeline
    http://github.com/ctuning/ck-autotuning
    http://cknowledge.org/repo
    http://github.com/ctuning
      • Hundreds of benchmarks/kernels/codelets (CPU, OpenMP, OpenCL, CUDA)
      • Thousands of data sets
      • Wrappers around all main tools and libs
      • Optimization description of major compilers
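A pipeline like the one listed above can be modeled as a list of stages that pass one dictionary along, so that the community can insert, remove or fix a stage without touching the others. The sketch below is a hypothetical illustration of that structure: the stage functions, the `return` error code and the `log` field are assumptions, and the stages only record placeholders instead of compiling or running anything.

```python
def prepare_compiler_flags(state):
    state["flags"] = state.get("flags", "-O3")
    return state

def compile_program(state):
    state["binary"] = state["program"] + ".out"  # placeholder: nothing is compiled
    return state

def run_program(state):
    if "binary" not in state:
        state["return"] = 1
        state["error"] = "nothing to run"
        return state
    state["characteristics"] = {"execution_times": []}  # a real stage would measure these
    return state

def run_pipeline(stages, state):
    """Run stages in order, passing one dict along; stop at the first failure."""
    state = dict(state, **{"return": 0, "log": []})
    for stage in stages:
        state = stage(state)
        state["log"].append(stage.__name__)
        if state.get("return", 0) != 0:
            break
    return state

ok = run_pipeline([prepare_compiler_flags, compile_program, run_program],
                  {"program": "susan"})
bad = run_pipeline([run_program, compile_program], {"program": "susan"})
```

The second call shows the failure path: running before compiling sets a non-zero code, and the pipeline stops with the log showing which stage failed.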
  • 39. Distinguished CGO’17 artifact in Collective Knowledge format
    The highest ranked artifact at CGO’17 turned out to be implemented using the Collective Knowledge Framework:
    “Software Prefetching for Indirect Memory Accesses”, Sam Ainsworth and Timothy M. Jones
    https://github.com/SamAinsworth/reproduce-cgo2017-paper
    It takes advantage of a portable package manager to install the required LLVM for either x86 or ARM platforms, automatically build LLVM plugins, run empirical experiments on a user machine via a CK workflow, compare speedups with results pre-recorded by the authors, and prepare an interactive report.
    See PDF with Artifact Evaluation Appendix: http://ctuning.org/ae/resources/paper-with-distinguished-ck-artifact-and-ae-appendix-cgo2017.pdf
    See snapshot of an interactive CK dashboard: https://github.com/SamAinsworth/reproduce-cgo2017-paper/files/618737/ck-aarch64-dashboard.pdf
  • 40. Collective Knowledge mini-tutorials
    Create repository: ck add repo:my_new_project
    Add new module: ck add my_new_project:module:my_module
    Add new data for this module: ck add my_new_project:my_module:my_data @@dict {"tags":"cool","data"}
    Add dummy function to module: ck add_action my_module --func=my_func
    Test dummy function: ck my_func my_module --param1=var1
    List my_module data: ck list my_module
    Find data by tags: ck search my_module --tags=cool
    Archive repository: ck zip repo:my_new_project
    Pull existing repo from GitHub: ck pull repo:ck-autotuning
    List modules from this repo: ck list ck-autotuning:module:*
    Compile program (using GCC): ck compile program:cbench-automotive-susan --speed
    Run program: ck run program:cbench-automotive-susan
    Start server for crowdsourcing: ck start web
    View interactive articles: ck browser http://localhost:3344
    https://michel-steuwer.github.io/About-CK/
    https://github.com/ctuning/ck/wiki/Compiler-autotuning
    http://github.com/ctuning/ck/wiki
  • 41. CK demo: workflow to collaboratively optimize DNNs
      • “Deep” (multi-layered) neural networks take advantage of structure in digital signals (e.g. images).
      • State of the art for applications in automotive, robotics, healthcare, entertainment (e.g. image classification).
      • Commonly trained on workstations or clusters with GPGPUs.
      • Increasingly deployed on mobile and embedded systems.
    DNNs are difficult and time consuming to build and optimize due to multiple programming models, multiple objectives (performance, energy, accuracy, memory footprint) and continuously evolving hardware, particularly with constrained resources (sensors, IoT).
    Perfect use case for the Collective Knowledge Framework!
  • 42. Deploying DNNs on resource-constrained platforms
      • Can we deploy a DNN to achieve desired performance (rate and accuracy of recognition) on a given platform?
      • Can we identify or build such a platform under given constraints such as those on power, memory, price?
      • If all else fails, can we design (autotune) another DNN (topology) by trading off performance, accuracy and costs?
    NVIDIA Drive PX2: ~250 Watts; ~$15,000 (early development boards); ~$1,000 per unit?
    ?: < 10 Watts; < $100
  • 43. Customizable CK workflow to collaboratively co-design DNNs
    We collaborate with General Motors and ARM to develop open-source CK-based tools to collaboratively evaluate and optimize various DNN engines and models across diverse platforms and datasets:
    cKnowledge.org/ai
    cKnowledge.org/repo
    Download the Android app to participate in collaborative evaluation and optimization of DNNs using your mobile device and the CK web service: https://play.google.com/store/apps/details?id=openscience.crowdsource.video.experiments
    The CK portable workflow system and package manager allow users to run DNN experiments on Windows, Linux, MacOS, Android!
    $ ck pull repo --url=http://github.com/dividiti/ck-caffe
    $ ck install package:lib-caffe-bvlc-master-cpu-universal
    $ ck install package:lib-caffe-bvlc-opencl-viennacl-universal --target_os=android21-arm64
    $ ck run program:caffe-classification
    $ firefox http://cKnowledge.org/repo
  • 44. cKnowledge.org: from ad-hoc computer systems’ research to data science
      • The Collective Knowledge approach helps fight SW/HW chaos
      • CK is also changing the mentality of computer systems’ researchers:
          • sharing artifacts as customizable and reusable components with JSON API
          • sharing customizable and portable experimental workflows
          • building a repository of representative benchmarks and data sets
          • crowdsourcing experiments and sharing negative/unexpected results
          • collaboratively improving reproducibility of empirical experiments
          • using data mining (statistical analysis and machine learning) to predict efficient software optimizations and hardware designs
          • collaboratively improving prediction models and finding missing features
          • formulating and solving important real-world problems
      • CK brings industry and academia closer together (common research methodology, reproducible experiments, validated techniques)
  • 45. cKnowledge.org: enable open & reproducible computer systems’ research!
    [Diagram: “Current state of computer engineering”, a cloud of tools and concepts between “Task, idea” and “Result”]
      • Compilers: GCC 4.1.x to 4.7.x, ICC 10.1 to 12.1, LLVM 2.6 to 3.1, Phoenix, MVS, XLC, Open64, Jikes, Testarossa
      • Programming models and libraries: OpenMP, MPI, HMPP, OpenCL, CUDA, TBB, MKL, ATLAS
      • Profiling tools: gprof, prof, perf, oprofile, PAPI, TAU, Scalasca, VTune Amplifier, likwid
      • Optimization levels and techniques: algorithm-level, program-level, function-level, codelet, loop-level, IPA, polyhedral transformations, LTO, pass reordering, scheduling, threads, process, run-time adaptation, per-phase reconfiguration
      • Hardware and metrics: hardware counters, cache size, frequency, bandwidth, HDD size, TLB, ISA, memory size, cores, processors, threads, power consumption, execution time, reliability
    Quick, non-reproducible hack? Ad-hoc heuristic? Quick publication? Waste of expensive resources and energy?
    Collective Knowledge open RDK and repository: systematization and unification of collective knowledge (big data); classification, predictive modeling; optimal solutions; “crowd” (the community).
    Collaboratively optimize shared programs and data sets across diverse hardware.
    Extrapolate shared knowledge to build fast, energy efficient, reliable and cheap computer systems to boost innovation in science and technology!
  • 46. Help us improve artifact evaluation!
    We need your feedback: remember that new AE procedures may affect you at future conferences.
      • AE steering committee: http://cTuning.org/ae/committee.html
      • Mailing list: https://groups.google.com/forum/#!forum/collective-knowledge
    Feel free to reuse and improve our Artifact Evaluation procedures and Artifact Appendices:
    http://cTuning.org/ae
    https://github.com/ctuning/ck-web-artifact-evaluation
    Extra resources
      • ACM Result and Artifact Review and Badging policy: http://www.acm.org/publications/policies/artifact-review-badging
      • Evaluation of frameworks for reproducible R&D: https://wilkie.github.io/reproducibility-site
      • Community driven artifact evaluation and paper reviewing: http://dl.acm.org/citation.cfm?doid=2618137.2618142
  • 47. Questions, comments, collaborations?
    Many new exciting opportunities for collaboration: artifact sharing, customizable and portable workflows, experiment crowdsourcing, collaborative deep learning optimization, public repositories of knowledge, adaptive systems.
    Acknowledgments