Gabriel PREDA
@eRadical
(Almost) Serverless Analytics System
with BigQuery & AppEngine
Agenda
Going Serverless with
- AppEngine & Tasks
- Pub/Sub, DataStore
BigQuery
- Load
  - Batch
  - Streaming Inserts
- Query
- UDF
- Export
...some BigQueries...
Aeons... Some years ago...
~ 500,000 - 2,000,000 events / day
(on average)
Some time ago...
~2,000,000 - 22,000,000 events / day
Dec 2014: 57,430,000 events / day
1 day to recompute » 12 hours
NOW()
22,000,000 - 70,000,000 events / day
AVG » 40,000,000 events / day
Processing ~30GB-70GB / day
Recompute 1 day » 10-20 minutes
serverless?
Desired for: https://www.innertrends.com
other... (almost) serverless products
Cloud Functions (alpha - Node.JS)
Cloud DataFlow (Java, Python - beta)
BigQuery
https://cloud.google.com/bigquery/docs/
BigQuery - data types
● STRING - UTF-8 (2 bytes + encoded string size)
● BYTES - base64 encoded (except in Avro)
● INTEGER - 64-bit signed (8 bytes)
● FLOAT (8 bytes)
● BOOLEAN - true/false, 1/0 only in CSV (1 byte)
● TIMESTAMP - e.g. "2014-08-19 12:41:35.220 UTC" (8 bytes)
● DATE, TIME, DATETIME - limited support in Legacy SQL
● RECORD - a collection of fields (size of fields)
https://cloud.google.com/bigquery/data-types
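The per-type sizes above make it possible to estimate how many bytes one row contributes. A minimal sketch using only the sizes listed on the slide; the schema shape, the field names and the `estimateRowBytes` helper are hypothetical, and NULL values are simply skipped because the slide does not say how they are counted.

```javascript
'use strict';

// Fixed sizes in bytes, as listed on the slide.
const FIXED_SIZES = { INTEGER: 8, FLOAT: 8, BOOLEAN: 1, TIMESTAMP: 8 };

// Estimate the size in bytes of one row.
// `schema` is an array of { name, type, fields? } (hypothetical shape).
function estimateRowBytes(schema, row) {
  let total = 0;
  for (const field of schema) {
    const value = row[field.name];
    if (value === null || value === undefined) continue; // assumption: NULLs not counted
    if (field.type === 'STRING') {
      // 2 bytes + size of the UTF-8 encoded string
      total += 2 + Buffer.byteLength(String(value), 'utf8');
    } else if (field.type === 'RECORD') {
      // A RECORD costs the sum of its fields
      total += estimateRowBytes(field.fields, value);
    } else if (field.type in FIXED_SIZES) {
      total += FIXED_SIZES[field.type];
    } else {
      throw new Error('Type not covered by this sketch: ' + field.type);
    }
  }
  return total;
}

// Hypothetical event row.
const schema = [
  { name: 'user_id', type: 'INTEGER' },
  { name: 'country', type: 'STRING' },
  { name: 'ts', type: 'TIMESTAMP' },
  { name: 'device', type: 'RECORD', fields: [
    { name: 'os', type: 'STRING' },
    { name: 'mobile', type: 'BOOLEAN' },
  ] },
];
const row = {
  user_id: 42,
  country: 'RO',
  ts: '2014-08-19 12:41:35.220 UTC',
  device: { os: 'Android', mobile: true },
};

console.log(estimateRowBytes(schema, row)); // 8 + (2+2) + 8 + (2+7) + 1 = 30
```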
BigQuery -> loadData()
Formats: CSV, JSON (newline delimited), Avro, Parquet (experimental)
Tools: Web UI, bq, API
Source:
local files,
Cloud Storage, [demo]
Cloud Datastore (backup files),
POST requests,
SQL DML*
Google Sheets
- Federated Data Sources
- Streaming Inserts
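Streaming inserts go through the `tabledata.insertAll` API method instead of a load job. A minimal sketch of assembling that request body; `buildInsertAllBody` and the `event_id` field are hypothetical names, and actually sending the request (authentication, HTTP client, retries) is not shown.

```javascript
'use strict';

// Build the JSON body of a tabledata.insertAll request.
// `idOf` returns a stable ID per event; BigQuery uses insertId for
// best-effort de-duplication when a request is retried.
function buildInsertAllBody(events, idOf) {
  return {
    kind: 'bigquery#tableDataInsertAllRequest',
    skipInvalidRows: false,
    ignoreUnknownValues: false,
    rows: events.map((event) => ({
      insertId: String(idOf(event)),
      json: event,
    })),
  };
}

// Hypothetical events.
const events = [
  { event_id: 'e-1', user_id: 42, name: 'login' },
  { event_id: 'e-2', user_id: 43, name: 'play' },
];
const body = buildInsertAllBody(events, (e) => e.event_id);
console.log(JSON.stringify(body, null, 2));
```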
BigQuery -> loadData()
bq load ...
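For the JSON format, a load job expects newline-delimited JSON: one complete object per line. A minimal sketch of producing such content before handing the file to `bq load --source_format=NEWLINE_DELIMITED_JSON`; the `toNdjson` helper and the sample rows are hypothetical, and the remaining `bq load` arguments are left out because the slide elides them.

```javascript
'use strict';

// Serialize rows as newline-delimited JSON: one object per line.
// JSON.stringify escapes newlines inside string values, so every
// row is guaranteed to stay on a single line.
function toNdjson(rows) {
  if (rows.length === 0) return '';
  return rows.map((row) => JSON.stringify(row)).join('\n') + '\n';
}

// Hypothetical rows; the second one contains an embedded newline.
const sample = [
  { user_id: 42, name: 'login' },
  { user_id: 43, name: 'multi\nline' },
];
process.stdout.write(toNdjson(sample));
```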
BigQuery -> loadData()
Got some rows?
BigQuery -> SELECT … FROM surprise…
query:
SELECT { * | field_path.* | expression } [ [ AS ] alias ] [ , ... ]
[ FROM from_body
[ WHERE bool_expression ]
[ OMIT RECORD IF bool_expression]
[ GROUP [ EACH ] BY [ ROLLUP ] { field_name_or_alias } [ , ... ] ]
[ HAVING bool_expression ]
[ ORDER BY field_name_or_alias [ { DESC | ASC } ] [, ... ] ]
[ LIMIT n ]
];
from_body:
from_item [, ...] | # Warning: Comma means UNION ALL here
from_item [ join_type ] JOIN [ EACH ] from_item [ ON join_predicate ] |
(FLATTEN({ table_name | (query) }, field_name_or_alias)) |
table_wildcard_function
from_item:
{ table_name | (query) } [ [ AS ] alias ]
join_type:
{ INNER | [ FULL ] [ OUTER ] | RIGHT [ OUTER ] | LEFT [ OUTER ] | CROSS }
BigQuery -> SELECT … FROM surprise…
Date-Partitioned Tables [demo]
Table Decorators - See the past w/ @
Table Wildcard Functions - TABLE_DATE_RANGE() & TABLE_QUERY()
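Table decorators and TABLE_DATE_RANGE() both end up as text in the FROM clause of a legacy SQL query. A minimal sketch of helpers that build those table references; the helper names and the `mydata.events` tables are hypothetical, while the `@` decorator and TABLE_DATE_RANGE forms follow the legacy SQL syntax.

```javascript
'use strict';

// Snapshot decorator: the table as it was at `millis`.
// Positive = milliseconds since the epoch, negative = relative to now.
function snapshotDecorator(table, millis) {
  if (!Number.isInteger(millis)) throw new TypeError('millis must be an integer');
  return `[${table}@${millis}]`;
}

// Range decorator: rows added between startMs and endMs.
// Omitting endMs means "up to now".
function rangeDecorator(table, startMs, endMs) {
  if (!Number.isInteger(startMs)) throw new TypeError('startMs must be an integer');
  const end = endMs === undefined ? '' : String(endMs);
  return `[${table}@${startMs}-${end}]`;
}

// Wildcard over daily tables named <prefix>YYYYMMDD.
function tableDateRange(prefix, startDate, endDate) {
  return `TABLE_DATE_RANGE([${prefix}], TIMESTAMP('${startDate}'), TIMESTAMP('${endDate}'))`;
}

// Hypothetical usage: count events in three daily tables.
const sql = 'SELECT COUNT(1) FROM ' +
  tableDateRange('mydata.events_', '2014-03-25', '2014-03-27');
console.log(sql);
console.log(snapshotDecorator('mydata.events', -3600000)); // one hour ago
```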
Interesting functions
- DateTime » UTC_USEC_TO_DAY/HOUR/MONTH/WEEK/YEAR()
» Shifts a UNIX timestamp in microseconds to the beginning of the period it occurs in.
- JSON_EXTRACT[_SCALAR]()
- URL functions » HOST(), DOMAIN(), TLD()
- REGEXP_MATCH(), REGEXP_EXTRACT()
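The UTC_USEC_TO_* functions truncate a microsecond timestamp to the start of its period. A minimal sketch of the same arithmetic for the DAY and HOUR variants only (WEEK, MONTH and YEAR need calendar logic and are left out); the JavaScript function names are hypothetical.

```javascript
'use strict';

const USEC_PER_HOUR = 3600 * 1e6;
const USEC_PER_DAY = 24 * USEC_PER_HOUR;

// Shift a UNIX timestamp in microseconds to the beginning of its period.
// Current microsecond timestamps are well below 2^53, so plain Numbers
// represent them exactly.
function truncateUsec(usec, periodUsec) {
  return Math.floor(usec / periodUsec) * periodUsec;
}

const utcUsecToDay = (usec) => truncateUsec(usec, USEC_PER_DAY);
const utcUsecToHour = (usec) => truncateUsec(usec, USEC_PER_HOUR);

console.log(utcUsecToDay(1274259481071200));  // 1274227200000000
console.log(utcUsecToHour(1274259481071200)); // 1274256000000000
```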
BigQuery -> User Defined Functions
bigquery.defineFunction(
  'expandAssetLibrary', // Name of the function exported to SQL
  ['user_id', 'video_id', 'stage_settings'], // Names of input columns
  [ {'name': 'user_id', 'type': 'integer'}, // Output schema
    {'name': 'video_id', 'type': 'string'},
    {'name': 'asset', 'type': 'string'} ],
  expandAssetLibrary // Reference to JavaScript UDF
);
function expandAssetLibrary(row, emit) {
  // … (rest of the body elided on the slide)
  emit({ user_id: row.user_id, video_id: row.video_id, asset: ss.url.replace('http://', '') });
}
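A legacy UDF is a plain function of `(row, emit)`, so it can be exercised locally before it is registered with `bigquery.defineFunction`. A minimal sketch under loud assumptions: the slide elides how `stage_settings` is parsed, so the JSON shape `{"assets": [{"url": ...}]}`, the `expandAssetsSketch` function and the `runUdf` harness are all hypothetical.

```javascript
'use strict';

// Hypothetical UDF body. ASSUMPTION: stage_settings is a JSON string
// shaped like {"assets": [{"url": "http://..."}]}; the real parsing
// is not shown on the slide.
function expandAssetsSketch(row, emit) {
  let settings;
  try {
    settings = JSON.parse(row.stage_settings);
  } catch (e) {
    return; // skip rows with malformed or missing settings
  }
  const assets = settings && Array.isArray(settings.assets) ? settings.assets : [];
  for (const ss of assets) {
    if (!ss || typeof ss.url !== 'string') continue;
    emit({
      user_id: row.user_id,
      video_id: row.video_id,
      asset: ss.url.replace('http://', ''),
    });
  }
}

// Local stand-in for the query engine: call the UDF once per input row
// and collect everything it emits.
function runUdf(udf, rows) {
  const out = [];
  for (const row of rows) udf(row, (emitted) => out.push(emitted));
  return out;
}

const input = [
  { user_id: 1, video_id: 'v1',
    stage_settings: '{"assets":[{"url":"http://a.example/x.png"},{"url":"http://b.example/y.png"}]}' },
  { user_id: 2, video_id: 'v2', stage_settings: 'not json' },
];
console.log(runUdf(expandAssetsSketch, input));
```

One input row can produce zero, one or many output rows, which is why the UDF emits rows instead of returning a value.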
BigQuery -> DML
Standard SQL only
Maximum UPDATE/DELETE statements per day per table: 48
Maximum UPDATE/DELETE statements per day per project: 500
Maximum INSERT statements per day per table: 1,000
Maximum INSERT statements per day per project: 10,000
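These daily limits translate into a minimum average spacing between statements. A small worked calculation using the numbers quoted on the slide; the `minSpacingMinutes` helper is hypothetical, and the limits may have changed since the talk.

```javascript
'use strict';

const MINUTES_PER_DAY = 24 * 60;

// Average spacing (in minutes) that keeps evenly spread statements
// within a per-day limit.
function minSpacingMinutes(limitPerDay) {
  if (!(limitPerDay > 0)) throw new RangeError('limitPerDay must be positive');
  return MINUTES_PER_DAY / limitPerDay;
}

// Limits as quoted on the slide.
console.log(minSpacingMinutes(48));   // UPDATE/DELETE per table: 30 minutes
console.log(minSpacingMinutes(1000)); // INSERT per table: 1.44 minutes
```

At 48 UPDATE/DELETE statements per table per day, DML works for occasional corrections but not for row-by-row updates.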
BigQuery -> export()
To: Google Cloud Storage
Format: CSV, JSON [.gz], Avro
…1G files
BigQuery -> some (Big)Queries
SELECT year, COUNT(1)
FROM [bigquery-public-data:samples.natality]
WHERE father_age < 18
GROUP BY year
ORDER BY year;

SELECT year, COUNT(1)
FROM [bigquery-public-data:samples.natality]
WHERE mother_age < 18
GROUP BY year
ORDER BY year;

SELECT table_id, row_count, CEIL(size_bytes/POW(1024, 3)) AS gb
FROM [bigquery-public-data:ghcn_m.__TABLES__]
ORDER BY gb DESC;
BigQuery -> some (Big)Queries
SELECT REGEXP_EXTRACT(path, r'.*\.(.*)$') AS file_extension,
  COUNT(1) AS k
FROM [bigquery-public-data:github_repos.files]
GROUP BY file_extension
ORDER BY k DESC
LIMIT 20;

SELECT table_id, row_count,
  CEIL(size_bytes/POW(1024, 3)) AS gb
FROM [bigquery-public-data:github_repos.__TABLES__]
ORDER BY gb DESC;

Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 

(Almost) Serverless Analytics System with BigQuery & AppEngine

  • 1. Gabriel PREDA @eRadical (Almost) Serverless Analytics System with BigQuery & AppEngine
  • 2. Agenda Going Serverless with AppEngine & Tasks Pub/Sub, DataStore BigQuery Load Batch Streaming Inserts Query UDF Export ...some BigQueries...
  • 3. Aeons Some years ago... ~500,000 - 2,000,000 events / day (on average)
  • 4. Some time ago... ~2,000,000 - 22,000,000 events / day Dec 2014: 57,430,000 events / day 1 day to recompute » 12 hours
  • 5. NOW() 22,000,000 - 70,000,000 events / day AVG » 40,000,000 events / day Processing ~30GB-70GB / day Recompute 1 day » 10-20 minutes
  • 7. other... (almost) serverless products Cloud Functions (alpha - Node.JS) Cloud DataFlow (Java, Python - beta)
  • 9. BigQuery - data types
      ● STRING - UTF-8 (2 bytes + encoded string size)
      ● BYTES - base64 encoded (except in Avro)
      ● INTEGER - 64-bit signed (8 bytes)
      ● FLOAT (8 bytes)
      ● BOOLEAN - true/false, 1/0 only in CSV (1 byte)
      ● TIMESTAMP ex: "2014-08-19 12:41:35.220 UTC" (8 bytes)
      ● DATE, TIME, DATETIME - limited support in Legacy SQL
      ● RECORD - a collection of fields (size of fields)
      https://cloud.google.com/bigquery/data-types
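The per-type sizes on the data types slide can be turned into a rough row-size estimate. The sketch below is illustrative only: the helper names `value_size` and `row_size` and the sample event are hypothetical, and it covers only the types for which the slide states a size (BYTES, DATE, TIME and DATETIME are left out).

```python
# Fixed per-value sizes in bytes, as listed on the slide.
FIXED_SIZES = {"INTEGER": 8, "FLOAT": 8, "BOOLEAN": 1, "TIMESTAMP": 8}


def value_size(bq_type, value):
    """Estimate the size in bytes of one value (hypothetical helper)."""
    if bq_type == "STRING":
        # 2 bytes + the UTF-8 encoded length of the string.
        return 2 + len(value.encode("utf-8"))
    if bq_type == "RECORD":
        # A RECORD is the sum of its fields; value is a list of (type, value).
        return sum(value_size(t, v) for t, v in value)
    return FIXED_SIZES[bq_type]


def row_size(row):
    """Estimate the size in bytes of a row given as (type, value) pairs."""
    return sum(value_size(t, v) for t, v in row)


# Made-up sample event, not taken from the talk.
event = [
    ("STRING", "page_view"),                          # 2 + 9 bytes
    ("INTEGER", 42),                                  # 8 bytes
    ("BOOLEAN", True),                                # 1 byte
    ("TIMESTAMP", "2014-08-19 12:41:35.220 UTC"),     # 8 bytes
    ("RECORD", [("STRING", "é"), ("FLOAT", 1.5)]),    # (2 + 2) + 8 bytes
]

print(row_size(event))
```

Multiplying such an estimate by the daily event count gives a ballpark for the volume a query over one day would scan.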
  • 10. BigQuery -> loadData()
      Formats: CSV, JSON (newline delimited), Avro, Parquet (experimental)
      Tools: Web UI, bq, API
      Source:
        local files,
        Cloud Storage, [demo]
        Cloud Datastore (backup files),
        POST requests,
        SQL DML*,
        Google Sheets
      - Federated Data Sources
      - Streaming Inserts
  • 13. BigQuery -> SELECT … FROM surprise…
      query:
        SELECT { * | field_path.* | expression } [ [ AS ] alias ] [ , ... ]
          [ FROM from_body
            [ WHERE bool_expression ]
            [ OMIT RECORD IF bool_expression ]
            [ GROUP [ EACH ] BY [ ROLLUP ] { field_name_or_alias } [ , ... ] ]
            [ HAVING bool_expression ]
            [ ORDER BY field_name_or_alias [ { DESC | ASC } ] [, ... ] ]
            [ LIMIT n ]
          ];
      from_body:
        from_item [, ...] |  # Warning: Comma means UNION ALL here
        from_item [ join_type ] JOIN [ EACH ] from_item [ ON join_predicate ] |
        (FLATTEN({ table_name | (query) }, field_name_or_alias)) |
        table_wildcard_function
      from_item:
        { table_name | (query) } [ [ AS ] alias ]
      join_type:
        { INNER | [ FULL ] [ OUTER ] | RIGHT [ OUTER ] | LEFT [ OUTER ] | CROSS }
  • 14. BigQuery -> SELECT … FROM surprise…
      Date-Partitioned Tables [demo]
      Table Decorators - See the past w/ @
      Table Wildcard Functions - TABLE_DATE_RANGE() & TABLE_QUERY()
      Interesting functions
      - DateTime » UTC_USEC_TO_DAY/HOUR/MONTH/WEEK/YEAR() » Shifts a UNIX timestamp in microseconds to the beginning of the period it occurs in.
      - JSON_EXTRACT[_SCALAR]()
      - URL functions » HOST(), DOMAIN(), TLD()
      - REGEXP_MATCH(), REGEXP_EXTRACT()
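The "shift to the beginning of the period" behaviour of UTC_USEC_TO_DAY() and UTC_USEC_TO_HOUR() can be mimicked locally with integer arithmetic on microsecond timestamps. This is a minimal sketch, not BigQuery's implementation: the Python function names are hypothetical stand-ins, and the WEEK, MONTH and YEAR variants are omitted because they need calendar logic rather than a fixed period length.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
USEC_PER_HOUR = 3600 * 1_000_000
USEC_PER_DAY = 24 * USEC_PER_HOUR


def to_usec(dt):
    """Convert an aware datetime to a UNIX timestamp in microseconds."""
    return (dt - EPOCH) // timedelta(microseconds=1)


def from_usec(usec):
    """Convert a UNIX timestamp in microseconds back to a UTC datetime."""
    return EPOCH + timedelta(microseconds=usec)


def utc_usec_to_hour(usec):
    """Truncate to the start of the hour the timestamp falls in."""
    return usec - usec % USEC_PER_HOUR


def utc_usec_to_day(usec):
    """Truncate to the start of the UTC day the timestamp falls in."""
    return usec - usec % USEC_PER_DAY


# The TIMESTAMP example from the data types slide.
ts = to_usec(datetime(2014, 8, 19, 12, 41, 35, 220000, tzinfo=timezone.utc))
print(from_usec(utc_usec_to_hour(ts)))
print(from_usec(utc_usec_to_day(ts)))
```

Grouping events by such a truncated value is what makes these functions convenient for per-day or per-hour aggregates.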
  • 15. BigQuery -> User Defined Functions
      bigquery.defineFunction(
        'expandAssetLibrary',                       // Name of the function exported to SQL
        ['user_id', 'video_id', 'stage_settings'],  // Names of input columns
        [
          {'name': 'user_id', 'type': 'integer'},   // Output schema
          {'name': 'video_id', 'type': 'string'},
          {'name': 'asset', 'type': 'string'}
        ],
        expandAssetLibrary                          // Reference to JavaScript UDF
      );

      function expandAssetLibrary(row, emit) {
        …………………………
        emit({
          user_id: row.user_id,
          video_id: row.video_id,
          asset: ss.url.replace('http://', '')
        });
      }
  • 16. BigQuery -> DML
      Standard SQL only
      Maximum UPDATE/DELETE statements per day per table: 48
      Maximum UPDATE/DELETE statements per day per project: 500
      Maximum INSERT statements per day per table: 1,000
      Maximum INSERT statements per day per project: 10,000
  • 17. BigQuery -> export() To: Google Cloud Storage Format: CSV, JSON [.gz], Avro …1G files
  • 18. BigQuery -> some (Big)Queries
      SELECT year, count(1)
      FROM [bigquery-public-data:samples.natality]
      WHERE father_age < 18
      GROUP BY year
      ORDER BY year

      SELECT year, count(1)
      FROM [bigquery-public-data:samples.natality]
      WHERE mother_age < 18
      GROUP BY year
      ORDER BY year

      SELECT table_id, row_count, CEIL(size_bytes/POW(1024, 3)) AS gb
      FROM [bigquery-public-data:ghcn_m.__TABLES__]
      ORDER BY gb DESC
  • 19. BigQuery -> some (Big)Queries
      SELECT REGEXP_EXTRACT(path, r'.*\.(.*)$') AS file_extension, COUNT(1) AS k
      FROM [bigquery-public-data:github_repos.files]
      GROUP BY file_extension
      ORDER BY k DESC
      LIMIT 20

      SELECT table_id, row_count, CEIL(size_bytes/POW(1024, 3)) AS gb
      FROM [bigquery-public-data:github_repos.__TABLES__]
      ORDER BY gb DESC
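The file-extension query depends on the dot in the pattern being escaped (`.*\.(.*)$`): a greedy `.*` then backtracks to the last literal dot and the group captures what follows. The sketch below reproduces that logic locally with Python's `re` module on a handful of made-up paths. The helper `file_extension` is hypothetical, and Python's regex engine is not the one BigQuery uses, so treat it as an approximation of the matching behaviour only.

```python
import re
from collections import Counter

# Escaped dot: capture everything after the last "." in the path.
EXTENSION_RE = re.compile(r".*\.(.*)$")


def file_extension(path):
    """Return the text after the last dot, or None when there is no dot."""
    match = EXTENSION_RE.match(path)
    return match.group(1) if match else None


# Made-up sample paths, not rows from the public dataset.
paths = ["src/main.py", "README.md", "lib/util.py", "Makefile", "archive.tar.gz"]

# Rough local equivalent of GROUP BY file_extension ORDER BY k DESC.
counts = Counter(file_extension(p) for p in paths)
for extension, k in counts.most_common():
    print(extension, k)

# With an unescaped dot the group is always empty, because the greedy
# prefix swallows the whole path except its final character.
print(repr(re.match(r".*.(.*)$", "src/main.py").group(1)))
```

Paths without a dot yield no match, which in the query shows up as a NULL group.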