3. New Demand for IT infrastructure
Capacity
Massive processing power
Massive Storage
Security
Availability
Scalability
Start small and grow on demand
Cost effective
4. New Challenges
High operating cost
Manpower cost
Equipment cost
Energy cost
High operating complexity
Changing technology
Increasing complexity
Network, server, storage, security
5. Dream machine
Computer with infinite capacity
Start small and grow big based on my demand
Capacity can scale up and down on demand
Pay only for what we use
No complex operation and maintenance
6. What is Cloud Computing?
• A style of computing in which dynamically scalable and often virtualized computing resources are provided as a service over the Internet.
Source: Wikipedia (cloud computing)
[Figure: a cloud containing Google, Salesforce, Amazon, Microsoft, and Yahoo]
7. Power Grid Inspiration for Computing?: Deliver ICT services
as “computing utilities” to users
8. Economics of Cloud Usage
Source: “Above the Clouds: A Berkeley View of Cloud Computing”, RAD Lab, UC Berkeley
9. Why should we move to the cloud?
Quick start-up
No need to purchase any equipment. Subscribe, pay, and use it.
Scalability
Less demand, less computing power; more demand, more computing power
Elasticity
Handles demand surges
Less maintenance
No need to hire people to fix broken servers, respond to hacking, or tune systems
Lower operating cost
Pay only for what you really use
Cut the cost of maintaining a huge infrastructure
It is cool and trendy
Just a stupid excuse for when people do not believe you ^_^
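The scalability and pay-per-use points on this slide can be illustrated with a small calculation. This is only a sketch: the per-server capacity, the hourly price, and the demand figures are made-up assumptions, not numbers from any provider.

```python
import math
from typing import Sequence

# Hypothetical numbers, chosen only for illustration.
REQUESTS_PER_SERVER = 100      # requests/second one server can handle (assumed)
PRICE_PER_SERVER_HOUR = 0.10   # price of one server-hour (assumed)


def servers_needed(demand: int) -> int:
    """Smallest number of servers that covers the demand (at least one)."""
    return max(1, math.ceil(demand / REQUESTS_PER_SERVER))


def elastic_cost(hourly_demand: Sequence[int]) -> float:
    """Pay per use: each hour is billed for the servers that hour needs."""
    return sum(servers_needed(d) for d in hourly_demand) * PRICE_PER_SERVER_HOUR


def fixed_cost(hourly_demand: Sequence[int]) -> float:
    """Own the hardware: capacity is sized for the peak and paid for every hour."""
    peak = max(servers_needed(d) for d in hourly_demand)
    return peak * len(hourly_demand) * PRICE_PER_SERVER_HOUR


if __name__ == "__main__":
    demand = [50, 80, 120, 900, 950, 300, 90, 40]  # a short demand surge
    print(f"elastic: {elastic_cost(demand):.2f}  fixed: {fixed_cost(demand):.2f}")
```

With a short surge like the one in the example, the elastic bill only grows during the busy hours, while the fixed setup pays for peak capacity all the time.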
10. Cloud Computing Definition (NIST)
This cloud model is composed of
five essential characteristics
three service models
four deployment models.
11. 5 Characteristics of a Cloud System
Broad network access
On-demand self-service
Resource pooling
Rapid elasticity
Measured service
12. Three Cloud Service Models
Software as a Service
• End user software
• Gmail, Google Docs, Facebook
Platform as a Service
• Programming platform
• Azure, Google App Engine
Infrastructure as a Service
• Computer servers
• VMware, EC2, OpenStack
13. Cloud Deployment Model
Private Cloud
• Internal cloud used by one organization
Community Cloud
• Internal cloud shared by multiple organizations
Public Cloud
• Provider's cloud shared by many users
Hybrid Cloud
• Cloud composed of two or more clouds
14. Using an IaaS Cloud
Users view the cloud as a number of servers
They look the same as co-located servers
Each one is actually a virtual server
Windows or many flavors of Linux
Users can start, stop, and reboot servers from a web interface
Normal web-based applications work fine
Usage is charged on a pay-per-use basis
You can try it at aws.amazon.com
Opening a new account and starting a new server takes less than 30 minutes
15. Using a PaaS Cloud
A PaaS cloud gives you an API to program on the cloud
Applications need to be ported, e.g.
.NET to Windows Azure
Python to Google App Engine
Pros and Cons
More lightweight than IaaS, but needs some application porting effort
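To give a feel for what "programming to the platform's API" means, here is a minimal sketch of a Python web application written against WSGI, the standard interface that Python hosting platforms commonly expect. It is a generic illustration, not Google App Engine's own API or configuration; the names `application` and `call_locally` are chosen for this example.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI application: the platform calls this once per request."""
    body = b"Hello from the cloud\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_locally(app):
    """Invoke the app the way a WSGI server would, without any network."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a fake "GET /" request
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_locally(application))
```

The porting effort mentioned above is mostly about fitting an existing application into an entry point like this and replacing local files or databases with the services the platform offers.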
16. Using SaaS Cloud
You have already used it!
Facebook
Gmail
Calendar
Google Maps
Run applications directly from your browser
No coding, no porting: just pay and use, or use it for free
17. What can the Cloud do?
Server consolidation
An IaaS cloud lets you use many servers hosted by service providers
Scalable web applications
Community websites like Sanook, Kapook
Web apps for anything you want to do
Back end for mobile apps
iCloud, Google Cloud are being used
18. The Cloud and I
[Figure: the Internet cloud holding my money, data, computing power, personal information, services, applications, video, music, books, pictures, and games]
Access anytime, anywhere, anyhow
Storage and sharing
Reliability, security, availability
19. The Cloud and I
Google docs (Office)
Spreadsheet
Word processor
Presentation
Calendar
Gmail
20. The Cloud and I
[Figure: my cloud (Google, Facebook, Dropbox, Amazon) holding my calendar, pictures, music, and documents]
21. Work Life with a Cloud
Appointments (Google Calendar)
My secretary takes an appointment and adds it to the calendar
I see it on every device quickly, and so does she
My device notifies me
Email (Gmail)
I can go to any computer or device with a browser, and my email follows me there.
I have no need to install a mail client or maintain a mail server
22. Work Life with a Cloud
Documents (Google Docs)
I can create basic documents, good spreadsheets, and basic presentations without installing any software
I can download a document and edit it on my computer
I can share my documents with others on the Internet and edit them together
Storage (Google Drive, Dropbox)
Create a presentation on a notebook, drop it in Dropbox
Present from an iPad or smartphone
Secure, no need to carry a thumb drive
Easily share files with other people, making teamwork easy
24. Play Life with a Cloud
Pictures
Using Instagram, the photos and videos I take instantly appear on Twitter and Facebook, neatly cataloged
Pictures can be shared, tagged, and commented on among my 2000 friends on Facebook!
If I want, they will know where I was. (A little dangerous)
Communication
My thoughts can be spread anytime, any way using Facebook, Google Plus, Multiply
I can even “hang out” with friends on Google Plus
25. Play Life with a Cloud
Books
Amazon Kindle Store: buy a book from Amazon and they will keep it on their cloud
Unlimited bookshelves, no cleaning, no dusting
Read your book on any device: iPad, iPhone, Android phone, tablet, PC, Mac
I read mine on my iPad and my Galaxy S2 phone
26. Play Life with a Cloud
Music
The iTunes Store allows you to shop for music and movies
You can download them and play them on many of your devices
The media industry is changing: now you can own a radio station or TV station and reach audiences around the world
Power shifts from infrastructure providers (TV stations) to content creators (like Grammy etc.)
27. Some Existing Cloud Computing Systems
Amazon AWS
Google App Engine
Microsoft Azure
OpenStack
30. Google App Engine
Google App Engine is a platform for developing and hosting web applications in Google-managed data centers
First released as a beta version in April 2008.
Google App Engine virtualizes applications across multiple servers and data centers.
Google App Engine is free up to a certain level of used resources. Fees are charged for additional storage, bandwidth, or CPU cycles required by the application.
31. App Engine Architecture
[Figure: requests and responses reach the app, which runs in a Python VM process with the stdlib and a read-only file system (R/O FS)]
Stateless APIs: urlfetch, mail, images
Stateful APIs: datastore, memcache
34. Cloud Application Development
[Figure: a Web 2.0 UI Tier, a Processing Tier, and a Data Management Tier]
Separate the processing logic, UI, and data management (DM) tiers
Use a Service-Oriented Architecture (SOA) design
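The tier separation on this slide can be sketched in a few lines. This is a hypothetical, in-memory example: the class names and the "score" domain are invented for illustration, and in a real cloud application each tier would typically be a separate service reached over the network.

```python
class DataTier:
    """Data management tier: the only place that stores records."""

    def __init__(self):
        self._scores = {}

    def save(self, name, score):
        self._scores[name] = score

    def load_all(self):
        return dict(self._scores)


class ProcessingTier:
    """Processing tier: business rules, no storage or presentation details."""

    def __init__(self, data):
        self._data = data

    def record(self, name, score):
        if not 0 <= score <= 100:
            raise ValueError("score must be between 0 and 100")
        self._data.save(name, score)

    def average(self):
        scores = self._data.load_all().values()
        return sum(scores) / len(scores) if scores else 0.0


class UITier:
    """UI tier: formats results for the user and calls the processing service."""

    def __init__(self, service):
        self._service = service

    def render_average(self):
        return f"Average score: {self._service.average():.1f}"


if __name__ == "__main__":
    service = ProcessingTier(DataTier())
    service.record("ann", 80)
    service.record("bob", 90)
    print(UITier(service).render_average())
```

Because each tier only talks to the one below it through a small interface, any tier can be scaled or replaced (for example, swapping the in-memory store for a cloud datastore) without touching the others.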
35. OpenStack Architecture
OpenStack is a cloud operating system that controls large pools of compute,
storage, and networking resources throughout a datacenter, all managed
through a dashboard that gives administrators control while empowering their
users to provision resources through a web interface.
42. We are living in the world of Data
Video Surveillance
Social Media
Mobile Sensors
Gene Sequencing
Smart Grids
Geophysical Exploration
Medical Imaging
43. Big Data
“Big data is data that exceeds the processing capacity of
conventional database systems. The data is too big,
moves too fast, or doesn’t fit the strictures of your
database architectures. To gain value from this data, you
must choose an alternative way to process it.”
Reference: “What is big data? An introduction to the big data landscape.”, Edd Dumbill, http://radar.oreilly.com/2012/01/what-is-big-data.html
44. The Value of Big Data
Analytical use
Big data analytics can reveal insights previously hidden by data too costly to process.
For example, peer influence among customers, revealed by analyzing shoppers’ transactions, social and geographical data.
Being able to process every item of data in reasonable time removes the troublesome need for sampling and promotes an investigative approach to data.
Enabling new products
Facebook has been able to craft a highly personalized user experience and create a new kind of advertising business
45. 3 Characteristics of Big Data
Volume
• Volumes of data are larger than those conventional relational database infrastructures can cope with
Velocity
• The rate at which data flows in is much faster.
• Mobile events and interactions by users.
• Video, images, and audio from users
Variety
• The source data is diverse and doesn’t fall into neat relational structures, e.g. text from social networks, image data, a raw feed directly from a sensor source.
46. Big Data Challenge
Volume
How to process data so big that it cannot be moved or stored.
Velocity
A lot of data arrives so fast that it cannot all be stored, such as web usage logs, Internet and mobile messages. Stream processing is needed to filter out unused data or extract knowledge in real time.
Variety
So many types of unstructured data formats make conventional databases useless.
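The stream-processing idea under Velocity (filter what is not needed, keep only a small summary) can be sketched with Python generators. The log format here, "user path status", is an assumption made up for this example; a real system would use a dedicated stream-processing framework.

```python
from collections import Counter
from typing import Iterable, Iterator


def parse(lines: Iterable[str]) -> Iterator[dict]:
    """Turn raw log lines into events, silently dropping malformed ones."""
    for line in lines:
        parts = line.split()
        if len(parts) == 3 and parts[2].isdigit():
            user, path, status = parts
            yield {"user": user, "path": path, "status": int(status)}


def keep_errors(events: Iterable[dict]) -> Iterator[dict]:
    """Filter step: discard everything except server errors (status >= 500)."""
    for event in events:
        if event["status"] >= 500:
            yield event


def count_by_path(events: Iterable[dict]) -> Counter:
    """Extract knowledge: keep a small running count instead of the raw data."""
    counts = Counter()
    for event in events:
        counts[event["path"]] += 1
    return counts


if __name__ == "__main__":
    log = ["u1 /home 200", "u2 /buy 500", "garbage", "u3 /buy 503", "u4 /home 500"]
    print(count_by_path(keep_errors(parse(log))))
```

Each event is examined once as it passes through and then discarded, so memory use depends on the size of the summary, not on the size of the stream.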
47. How to deal with big data
Integration of
Storage
Processing
Analysis Algorithm
Visualization
[Figure: boxes labeled Massive Data, Data Stream, Processing, Stream processing, Storage, Analysis, and Visualize]
48. Hadoop
Hadoop is a platform for distributing computing problems across a number of servers. First developed and released as open source by Yahoo.
It implements the MapReduce approach pioneered by Google in compiling its search indexes.
A dataset is distributed among multiple servers and operated on there: the “map” stage. The partial results are then recombined: the “reduce” stage.
Hadoop utilizes its own distributed filesystem, HDFS, which makes data available to multiple computing nodes.
The Hadoop usage pattern involves three stages:
loading data into HDFS,
MapReduce operations, and
retrieving results from HDFS.
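The map and reduce stages described above can be illustrated with the classic word-count example. This is a single-process sketch in plain Python, not Hadoop code: the chunks stand in for pieces of a dataset stored on different servers, and the function names are chosen for this example.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: each server turns its own chunk into (key, value) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Group all values by key so each key can be reduced independently."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: recombine the partial results for every key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    """Run map over every chunk, then shuffle and reduce the results."""
    mapped = [map_phase(chunk) for chunk in chunks]  # parallel on a real cluster
    return reduce_phase(shuffle(mapped))


if __name__ == "__main__":
    print(word_count(["the cloud is big", "big data in the cloud"]))
```

On a real cluster the map calls run in parallel on the machines that already hold each chunk, which is why the data does not have to be moved before it is processed.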
49. WHAT FACEBOOK KNOWS
Cameron Marlow calls himself Facebook's "in-house sociologist." He and his team can analyze essentially all the information the site gathers.
http://www.facebook.com/data
50. Study of Human Society
Facebook, in collaboration with the University of Milan, conducted an experiment that involved the entire social network as of May 2011: more than 10 percent of the world's population.
Analyzing the 69 billion friend connections among those 721 million people showed that four intermediary friends are usually enough to introduce anyone to a random stranger.
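The measurement behind this result is a shortest-path computation on the friendship graph. A minimal sketch using breadth-first search on a tiny, made-up graph is shown below; the names and the graph are invented, and the actual study used far more scalable techniques than this.

```python
from collections import deque


def degrees_of_separation(friends, start, goal):
    """Return the number of friendship hops from start to goal, or None."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, dist = queue.popleft()
        for friend in friends.get(person, ()):
            if friend == goal:
                return dist + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, dist + 1))
    return None  # the two people are not connected at all


if __name__ == "__main__":
    # Hypothetical friendship graph: each person maps to a list of friends.
    friends = {
        "ann": ["bob"],
        "bob": ["ann", "cat"],
        "cat": ["bob", "dan"],
        "dan": ["cat"],
        "eve": [],
    }
    print(degrees_of_separation(friends, "ann", "dan"))
```

The number of intermediary friends is the number of hops minus one, so a path of five hops corresponds to four intermediaries.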
51. The links of Love
Often young women specify that they are “in a relationship” with their “best friend forever”.
Roughly 20% of all relationships for the 15-and-under crowd are between girls.
This number dips to 15% for 18-year-olds and is just 7% for 25-year-olds.
Among anonymous US users who were over 18 at the start of the relationship, the average of the shortest number of steps to get from any one U.S. user to any other individual is 16.7.
This is much higher than the 4.74 steps you’d need to go from any Facebook user to another through friendship, as opposed to romantic, ties.
Graph shows the relationships of anonymous US users who were over 18 at the start of the relationship.
http://www.facebook.com/notes/facebook-data-team/the-links-of-love/10150572088343859
52. Why?
Facebook can improve the user experience
make useful predictions about users' behavior
make better guesses about which ads you might be more or less open to at any given time
Right before Valentine's Day this year, a blog post from the Data Science Team listed the songs most popular with people who had recently signaled on Facebook that they had entered or left a relationship
53. How does Facebook handle Big Data?
Facebook built its data storage system using open-source software called Hadoop.
Hadoop spreads the data across many machines inside a data center.
It uses Hive, open-source software that acts as a translation service, making it possible to query vast Hadoop data stores using relatively simple code.
Much of Facebook's data resides in one Hadoop store more than 100 petabytes in size (a petabyte is a million gigabytes), says Sameet Agarwal, a director of engineering at Facebook who works on data infrastructure, and the quantity is growing exponentially. "Over the last few years we have more than doubled in size every year."
55. San Diego Supercomputer Center Unleashes the Value of its User Data
Challenge
To make SDSC's data stores widely available so that they could be accessed, searched, and shared anywhere via Web-based access, SDSC made the decision to move from a tape-based system to cloud-based object storage.
Solution
OpenStack Object Storage uses open-source software to create
redundant, scalable storage using clusters of standardized servers to
store petabytes of accessible data.
Objects are written to multiple hardware devices, with the OpenStack
software responsible for ensuring data replication and integrity
across the cluster. Storage clusters can scale horizontally by adding
new nodes. Should a node fail, OpenStack replicates its content from
other active nodes.
Benefit
Today, SDSC's Cloud Storage provides academic and research
partners with a convenient and affordable way to store, share, and
archive data, including extremely large data sets. Utilizing the
OpenStack Object Storage software, files (objects) are written to
multiple physical storage arrays simultaneously, ensuring that at least
two verified copies exist on different servers at all times.
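The replication behaviour described on this slide (write every object to several nodes, and restore the copy count from surviving nodes when one fails) can be modelled with a toy in-memory store. This is not the OpenStack Object Storage API: the class `ToyObjectStore`, its methods, and the placement rule (pick the least-loaded node) are all assumptions made for illustration.

```python
class ToyObjectStore:
    """Toy model of replicated object storage (not the OpenStack API)."""

    def __init__(self, node_names, replicas=2):
        if replicas > len(node_names):
            raise ValueError("need at least as many nodes as replicas")
        self.replicas = replicas
        self.nodes = {name: {} for name in node_names}  # node -> {key: data}

    def _least_loaded(self, exclude):
        """Pick the node holding the fewest objects, skipping excluded ones."""
        candidates = [n for n in self.nodes if n not in exclude]
        return min(candidates, key=lambda n: (len(self.nodes[n]), n))

    def holders(self, key):
        """Names of the nodes that currently hold a copy of the object."""
        return sorted(n for n, objs in self.nodes.items() if key in objs)

    def put(self, key, data):
        """Write the object to several different nodes."""
        chosen = set()
        for _ in range(min(self.replicas, len(self.nodes))):
            node = self._least_loaded(chosen)
            self.nodes[node][key] = data
            chosen.add(node)

    def get(self, key):
        """Read the object from any node that has it."""
        for objs in self.nodes.values():
            if key in objs:
                return objs[key]
        raise KeyError(key)

    def fail_node(self, name):
        """Remove a node, then restore the copy count from surviving replicas."""
        lost_keys = list(self.nodes.pop(name))
        for key in lost_keys:
            survivors = self.holders(key)
            if not survivors:
                continue  # no copy left anywhere: the object is lost
            data = self.nodes[survivors[0]][key]
            current = set(survivors)
            while len(current) < min(self.replicas, len(self.nodes)):
                target = self._least_loaded(current)
                self.nodes[target][key] = data
                current.add(target)


if __name__ == "__main__":
    store = ToyObjectStore(["n1", "n2", "n3"], replicas=2)
    store.put("scan.dat", b"abc")
    store.fail_node(store.holders("scan.dat")[0])
    print(store.holders("scan.dat"), store.get("scan.dat"))
```

Scaling horizontally in this model simply means adding another entry to `nodes`; new writes then spread onto it because it is the least loaded.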
56. Cloud Library
Cloud Library is an e-book lending service that will allow users to browse and borrow digital books directly from their iPads, Nooks and Android-based tablets.
3M will outfit local libraries with its own software, hardware and e-book collection, which users will be able to access via special apps or 3M's new eReaders, which will be synced with available digital content.
Discovery Terminal download stations in libraries will allow visitors to leaf through the collection from a touch-based interface.
Random House and IPG have signed on to the initiative
57. Moving KU Computer Engineering on the Cloud
Introduction
The Department of Computer Engineering is one of the leading computer engineering departments in Thailand (23 years)
Research and Education
30 faculty members
20-30 Ph.D. students, 50 Master's, 120 MSIT, 400 undergraduates
Mission
Must support teaching and research by providing server / network / service infrastructure
Driving toward a mobile, anytime, anywhere infrastructure
58. Moving KU Computer Engineering on the Cloud
Challenge and Opportunity
Must provide a scalable and reliable infrastructure
Servers, Storage
Services
Previously, a number of physical servers were used
They age quickly, are hard to maintain, and take a lot of space
They consume a lot of power and cooling
59. Moving KU Computer Engineering on the Cloud
The Cloud is the Solution
For servers, use a VM cloud (VMware) to consolidate all small servers into a set of VMs on only 5 machines
Every lab and professor can request VMs for their own use
Can scale easily by adding more physical servers
Moving to centralized large storage using a NAS/SAN storage cloud
60. Standards are needed
The IEEE Standards Association (IEEE-SA) has formed two new Working Groups (WGs) around IEEE P2301 and IEEE P2302.
IEEE P2301 is a cloud computing standard covering critical areas such as application, portability, management, and interoperability interfaces, as well as file formats and operation conventions.
IEEE P2302 defines essential topology, protocols,
functionality, and governance required for reliable
cloud-to-cloud interoperability and federation.
61. Trend
Software as a Service
Framework as a Service
Virtualized Infrastructure
Physical Infrastructure
63. Cloud computing open issues
People do not trust others to hold their important data
Then why do people trust their bank to hold all their money?
People do not trust that a cloud provider can provide a robust and secure environment
How many times has your system gone down or been hacked compared to Google or Facebook?
Does the average company have better staff than an ISP that deals with these problems on a daily basis?
Interesting!
64. Conclusion
Cloud Computing is here!
You are using it every day
At the SaaS level, such as Facebook, Gmail
Let's fly above the cloud and see what it can do for you.
The sources of information are expanding. Many new sources are machine generated. It’s also big files (seismic scans can be 5TB per file) and massive numbers of small files (email, social media).
Leading companies for decades have always sought to leverage new sources of data, and the insights that can be gleaned from those data sources, as new sources of competitive advantage.
More detailed structured data
New unstructured data
Device-generated data
But big data isn’t only about data; a comprehensive big data strategy also needs to consider the role and prominence of new, enabling technologies such as:
Scale-out storage
MPP database architectures
Hadoop and the Hadoop ecosystem
In-database analytics
In-memory computing
Data virtualization
Data visualization