20. what we want from scribe agent
•easy to deploy
•works without any httpd configuration
•delivery target failover/takeback
•lightweight (without JVM)
•stable
2011 9 26
22. scribeline
log delivery agent tool
python 2.4, thrift
easy to set up and start/stop
works without any httpd configuration
works with logrotate-ed log files
automatic delivery target failover/takeback
https://github.com/tagomoris/scribe_line
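
The failover/takeback behaviour listed above can be sketched roughly as follows. This is not scribeline's actual code (which targets Python 2.4); it is a minimal modern-Python illustration, and the class name `TargetSelector`, the `retry_after` parameter, and the host names are all hypothetical. It assumes one primary and one secondary target, with the agent returning to the primary after a fixed wait.

```python
import time


class TargetSelector:
    """Choose a delivery target, preferring the primary (hypothetical sketch)."""

    def __init__(self, primary, secondary, retry_after=60.0, clock=time.monotonic):
        self.primary = primary
        self.secondary = secondary
        self.retry_after = retry_after      # seconds to stay on the secondary
        self.clock = clock                  # injectable so the logic can be tested
        self.primary_failed_at = None       # None means the primary is in use

    def current(self):
        """Return the target that the next batch of log lines should go to."""
        if self.primary_failed_at is None:
            return self.primary
        if self.clock() - self.primary_failed_at >= self.retry_after:
            # Takeback: the wait is over, so try the primary again.
            self.primary_failed_at = None
            return self.primary
        # Failover: the primary failed recently, so use the secondary.
        return self.secondary

    def report_failure(self, target):
        """Record a failed send; only primary failures trigger failover."""
        if target == self.primary:
            self.primary_failed_at = self.clock()
```

A sender loop would call `current()` before each send and `report_failure()` when the connection breaks, for example with `TargetSelector("scribe1.example:1463", "scribe2.example:1463")` (hypothetical hosts).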
23. how to set up scribeline at livedoor
1. yum install scribeline
(tar xzf && cd && sudo make install)
2. vi /etc/scribeline.conf
   blog /var/log/httpd/access_log
   blogimg /var/log/nginx/access_log
3. /etc/init.d/scribeline start
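
The two config lines in step 2 read as "category, then log file path". Assuming the file holds nothing but such whitespace-separated pairs (the real /etc/scribeline.conf may contain more than the slide shows), a reader for that layout could look like the sketch below. The function name `parse_scribeline_conf` is hypothetical and is not part of scribeline.

```python
def parse_scribeline_conf(text):
    """Parse '<category> <path>' lines into a list of (category, path) tuples.

    Assumption: one pair per line, separated by whitespace; blank lines skipped.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace only
        if len(parts) != 2:
            raise ValueError("expected '<category> <path>': %r" % raw)
        entries.append((parts[0], parts[1]))
    return entries


if __name__ == "__main__":
    sample = (
        "blog /var/log/httpd/access_log\n"
        "blogimg /var/log/nginx/access_log\n"
    )
    for category, path in parse_scribeline_conf(sample):
        print(category, path)
```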
25. overview
(diagram not reproduced; surviving labels: hourly, daily, hourly, on demand)
26. what we want about hive client
•easy to experiment
•from PC on our desks
•result caching
•protection against data loss
•friendly look & feel
27. shib
hive client web application
node.js, thrift, kyoto tycoon
query history browser
query editor, based on copy&paste
result caching & download tsv/csv
filter INSERT/DROP/CREATE ...
https://github.com/tagomoris/shib
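
The statement filter and result caching listed above could work along these lines. shib itself is a node.js application, so this Python sketch is only an illustration of the idea, not its implementation. The names `is_allowed` and `cache_key` are hypothetical, the blocked list holds only the three keywords named on the slide, and checking just the first keyword of the query is an assumption.

```python
import hashlib
import re

# Only the keywords named on the slide; the slide's "..." is left unfilled.
BLOCKED_KEYWORDS = ("INSERT", "DROP", "CREATE")


def is_allowed(query):
    """Return True if the query's first keyword is not in the blocked list."""
    match = re.match(r"\s*([A-Za-z]+)", query)
    if match is None:
        return False  # empty or non-keyword input is rejected
    return match.group(1).upper() not in BLOCKED_KEYWORDS


def cache_key(query):
    """Build a cache key so identical queries share one cached result.

    Assumption: queries that differ only in whitespace count as identical.
    """
    normalized = " ".join(query.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

A client would check `is_allowed(query)` before sending anything to Hive, then look up `cache_key(query)` in its result store (Kyoto Tycoon in shib's case) before running the query again.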
30. what shib cannot do now
•access control
•graph & chart
•hive 0.7.0+ features support
•database, authentication and ...
•mapreduce status notification
31. what we are trying now
•New cluster
  •more nodes
  •CDH3b2 + Hive 0.6.0 -> CDH3u1
•New tools
  •Hoop (instead of fuse-hdfs)
  •Any stream processing framework