Unidata’s Common Data Model

John Caron
Unidata/UCAR
Nov 2006
Goals / Overview
• Look at the landscape of scientific
datasets from a few thousand feet up.
• What semantics are needed to make
these useful?
– georeferencing
– specialized subsetting
What’s a Data Model?
• An Abstract Data Model describes data objects
and what methods you can use on them.
• An API is the interface to the Data Model for a
specific programming language
• A file format is a way to persist the objects in
the Data Model.
• An Abstract Data Model removes the details of
any particular API and the persistence format.
Common Data Model Layers
• Scientific Datatypes: Point, Trajectory, Station, Profile, Radial, Grid, Swath
• Coordinate Systems
• Data Access
NetCDF-Java version 2.2 architecture
[Diagram labels: Application; Scientific Datatypes; Datatype Adapter; NetcdfDataset; CoordSystem Builder; NetcdfFile; I/O service provider; ADDE; THREDDS; OPeNDAP; Catalog.xml; NcML; NetCDF-3; NetCDF-4; HDF5; GRIB; NIDS; GINI; Nexrad; DMSP; …]
NetCDF-4 and Common Data Model (Data Access Layer)
I/O Service Provider Implementations
• General: NetCDF, HDF5, OPeNDAP
• Gridded: GRIB-1, GRIB-2
• Radar: NEXRAD level 2 and 3, DORADE
• Point: BUFR, ASCII
• Satellite: DMSP, GINI
• In development
– NOAA: GOES (Knapp/Nelson), many others
Coordinate Systems needed
• NetCDF, OPeNDAP, HDF data models do
not have integrated coordinate systems
– so georeferencing is not part of the API
– Need conventions to specify it (e.g. CF-1,
COARDS, etc.)

• Contrast GRIB, HDF-EOS, other
specialized formats
NetCDF Coordinate Variables
dimensions:
lat = 64;
lon = 128;
variables:
float lat(lat);
float lon(lon);
double temperature(lat,lon);
Coordinate Variables
– One-dimensional variable with the same
name as its dimension
– Strictly monotonic values
– No missing values
The coordinates of a point (i,j,k) are
{CV1(i), CV2(j), CV3(k)}
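As a rough illustration of the rule above, the following Java sketch looks up the coordinates of a grid point from 1D coordinate variables and checks the strict-monotonicity requirement. The class and method names (`CoordinateVariables1D`, `coordinatesOf`, `isStrictlyMonotonic`) are hypothetical and are not part of NetCDF-Java; the lat/lon values are made up.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the NetCDF-Java API.
final class CoordinateVariables1D {

    // True if the values are strictly increasing or strictly decreasing.
    static boolean isStrictlyMonotonic(double[] values) {
        if (values.length < 2) return true;
        boolean increasing = values[1] > values[0];
        for (int i = 1; i < values.length; i++) {
            boolean ok = increasing ? values[i] > values[i - 1] : values[i] < values[i - 1];
            if (!ok) return false;
        }
        return true;
    }

    // The coordinates of the point at index (i, j, k, ...) are {CV1(i), CV2(j), CV3(k), ...}.
    static double[] coordinatesOf(double[][] coordVars, int... index) {
        if (coordVars.length != index.length) {
            throw new IllegalArgumentException("need one index per coordinate variable");
        }
        double[] coords = new double[index.length];
        for (int d = 0; d < index.length; d++) {
            coords[d] = coordVars[d][index[d]];
        }
        return coords;
    }

    public static void main(String[] args) {
        double[] lat = {-10.0, 0.0, 10.0};            // made-up values for lat(lat)
        double[] lon = {100.0, 110.0, 120.0, 130.0};  // made-up values for lon(lon)
        double[] point = coordinatesOf(new double[][] {lat, lon}, 2, 1);
        System.out.println(Arrays.toString(point));
    }
}
```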
Limitations of 1D Coordinate Variables
• Non lat/lon horizontal grids:
float temperature(y,x);
float lat(y, x);
float lon(y, x);
• Trajectory data:
float NKoreaRadioactivity(pt);
float lat(pt);
float lon(pt);
float altitude(pt);
float time(pt);
General Coordinates in CF-1.0
float P(y,x);
P:coordinates = "lat lon";
float lat(y, x);
float lon(y, x);
float Sr90(pt);
Sr90:coordinates = "lat lon altitude time";
Coordinate Systems (abstract)
• A Coordinate System for a data variable is
a set of Coordinate Variables2 such that the
coordinates of the (i,j,k) data point are
{CV1(i,j,k), CV2(i,j,k), CV3(i,j,k), CV4(i,j,k), …}
(previously: {CV1(i), CV2(j), CV3(k)})

• The dimensions of each Coordinate
Variable must be a subset of the
dimensions of the data variable.
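The subset rule above can be sketched as a simple check on dimension names. This is only an illustration: `CoordinateSystemCheck` and `dimensionsAreSubset` are hypothetical names, not NetCDF-Java classes, and dimensions are represented here as plain strings.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the NetCDF-Java API.
final class CoordinateSystemCheck {

    // A coordinate variable can take part in a data variable's coordinate system
    // only if each of its dimensions is also a dimension of the data variable.
    static boolean dimensionsAreSubset(List<String> dataDims, List<String> coordDims) {
        return dataDims.containsAll(coordDims);
    }

    public static void main(String[] args) {
        List<String> gridData = Arrays.asList("t", "z", "y", "x");
        // lat(y,x) against gridData(t,z,y,x)
        System.out.println(dimensionsAreSubset(gridData, Arrays.asList("y", "x")));
        // height(t,z,y,x) against gridData(t,z,y,x)
        System.out.println(dimensionsAreSubset(gridData, Arrays.asList("t", "z", "y", "x")));
        // lat(pt) against gridData(t,z,y,x)
        System.out.println(dimensionsAreSubset(gridData, Arrays.asList("pt")));
    }
}
```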
Need Coordinate Axis Types
float gridData(t,z,y,x);
float time(t);
float y(y);
float x(x);
float lat(y,x);
float lon(y,x);
float height(t,z,y,x);

float radialData(radial, gate);
float distance(gate);
float azimuth(radial);
float elevation(radial);
float time(radial);
The same??
float stationObs(pt);
float lat(pt);
float lon(pt);
float z(pt);
float time(pt);

float trajectory(pt);
float lat(pt);
float lon(pt);
float z(pt);
float time(pt);
Revised Coordinate Systems
1. Specify Coordinate Variables
2. Specify Coordinate Types
(time, lat, lon, projection x, y, height,
pressure, z, radial, azimuth, elevation)

3. Specify connectivity (implicit or
explicit) between data points
– Implicit: Neighbors in index space are
(connected) neighbors in coordinate
space. Allows efficient searching.
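One way to see why implicit connectivity helps with searching: if a 1D axis is sorted, the index nearest to a coordinate value can be found by binary search rather than by scanning every point. The sketch below assumes an ascending, non-empty axis; `AxisSearch` and `findNearestIndex` are hypothetical names, not part of NetCDF-Java.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the NetCDF-Java API.
final class AxisSearch {

    // Index of the axis value closest to target.
    // Assumes the axis is non-empty and sorted in ascending order.
    static int findNearestIndex(double[] axis, double target) {
        int pos = Arrays.binarySearch(axis, target);
        if (pos >= 0) return pos;                 // exact match
        int insertion = -pos - 1;                 // index of first element greater than target
        if (insertion == 0) return 0;             // target lies below the axis
        if (insertion == axis.length) return axis.length - 1; // target lies above the axis
        double below = target - axis[insertion - 1];
        double above = axis[insertion] - target;
        return below <= above ? insertion - 1 : insertion;
    }

    public static void main(String[] args) {
        double[] x = {0.0, 10.0, 20.0, 30.0};     // made-up axis values
        System.out.println(findNearestIndex(x, 12.0));
    }
}
```

For unconnected dimensions such as point observations, index order carries no such information, so a search of this kind is not available without building a separate index.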
Gridded Data
float gridData(t,z,y,x);
float time(t); // Time
float y(y); // GeoY
float x(x); // GeoX
float z(t,z,y,x); // Height or Pressure
• Cartesian coordinates
• All dimensions are connected

Connected means neighbors in index space
are neighbors in coordinate space
Coordinate Systems UML
Scientific Data Types
• Based on datasets Unidata is familiar with
– APIs are evolving

• How are data points connected?
• Intended to scale to large, multifile
collections
• Intended to support “specialized queries”
– Space, Time

• Corresponding “standard” NetCDF file
conventions
Gridded Data
• Cartesian coordinates
• All dimensions are connected
• x, y, z, time
• recently added runtime and ensemble
• refactored into GridDatatype interface
float gridData(t,z,y,x);
float time(t);
float y(y);
float x(x);
float lat(y,x);
float lon(y,x);
float height(t,z,y,x);
GridDatatype methods
CoordinateAxis getTaxis();
CoordinateAxis getXaxis();
CoordinateAxis getYaxis();
CoordinateAxis getZaxis();
Projection getProjection();
int[] findXYindexFromCoord(double x_coord, double y_coord);
LatLonRect getLatLonBoundingBox();
Array getDataSlice(Range[] …);
GridDatatype makeSubset(Range[] …);
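The slide elides the `Range[]` arguments, and the sketch below does not try to reproduce them or any NetCDF-Java class. It only illustrates the general idea of subsetting by index ranges, using a plain `double[][]` as a stand-in for one (y, x) slice of a grid. `GridSubsetSketch` and `subset` are hypothetical names, and the inclusive first/last convention is an assumption of this example.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the NetCDF-Java API.
final class GridSubsetSketch {

    // Copy rows yFirst..yLast and columns xFirst..xLast (all inclusive) of a (y, x) slice.
    static double[][] subset(double[][] data, int yFirst, int yLast, int xFirst, int xLast) {
        double[][] out = new double[yLast - yFirst + 1][];
        for (int y = yFirst; y <= yLast; y++) {
            // copyOfRange takes an exclusive upper bound, hence xLast + 1
            out[y - yFirst] = Arrays.copyOfRange(data[y], xFirst, xLast + 1);
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] slice = {
            {0.0, 1.0, 2.0, 3.0},
            {10.0, 11.0, 12.0, 13.0},
            {20.0, 21.0, 22.0, 23.0}
        };
        System.out.println(Arrays.deepToString(subset(slice, 1, 2, 1, 2)));
    }
}
```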
Radial Data
• Polar coordinates
• All dimensions are connected
• No separate time dimension
radialData(radial, gate) :
distance(gate)
azimuth(radial)
elevation(radial)
time(radial)
Swath
• lat/lon coordinates
• No separate time dimension
• All dimensions are connected
swathData(line,cell)
lat(line,cell)
lon(line,cell)
time(line)
z(line,cell) ??
Point Observation Data
• Set of measurements at the
same point in space and time
• Point dimension not connected
float obs1(pt);
float obs2(pt);
float lat(pt);
float lon(pt);
float z(pt);
float time(pt);
Structure {
lat, lon, z, time;
v1, v2, ...
} obs( pt);
PointObsDataset Methods
// Iterator<StructureData>
Iterator getData(
LatLonRect boundingBox,
Date start, Date end);
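A minimal sketch of what a space/time query over point observations might look like. `PointObs` and `PointObsQuery` are hypothetical stand-ins, not the NetCDF-Java classes named on the slide; the bounding box is passed as four plain numbers instead of a `LatLonRect`, and it does not handle boxes that cross the date line.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical stand-in for one observation record (lat, lon, time, one value).
final class PointObs {
    final double lat;
    final double lon;
    final Date time;
    final double value;

    PointObs(double lat, double lon, Date time, double value) {
        this.lat = lat;
        this.lon = lon;
        this.time = time;
        this.value = value;
    }
}

// Hypothetical query helper; not part of the NetCDF-Java API.
final class PointObsQuery {

    // Linear scan: the pt dimension is not connected, so index order
    // gives no hint about where or when an observation was taken.
    static List<PointObs> getData(List<PointObs> all,
                                  double latMin, double latMax,
                                  double lonMin, double lonMax,
                                  Date start, Date end) {
        List<PointObs> hits = new ArrayList<>();
        for (PointObs o : all) {
            boolean inBox = o.lat >= latMin && o.lat <= latMax
                    && o.lon >= lonMin && o.lon <= lonMax;
            boolean inTime = !o.time.before(start) && !o.time.after(end);
            if (inBox && inTime) {
                hits.add(o);
            }
        }
        return hits;
    }
}
```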
Time series Station Data
Structure {
name;
lat, lon, z;
Structure{
time;
v1, v2, ...
} obs(*); // connected
} stn(stn); // not connected
StationObs Methods
// List<Station>
List getStations(
LatLonRect boundingBox);
// Iterator<StructureData>
Iterator getData(
Station s,
Date start, Date end);
Trajectory Data
• pt dimension is connected
• Collection dimension not
connected
Structure {
lat, lon, z, time;
v1, v2, ...
} obs(pt); // connected
Structure {
name;
Structure {
lat, lon, z, time;
v1, v2, ...
} obs(*); // connected
} traj(traj); // not connected
Profiler/Sounding Station Data
Structure {
name;
lat, lon, time;
Structure {
z;
v1, v2, ...
} obs(*); // connected
} loc(nloc); // not connected
Structure {
name;
lat, lon;
Structure {
time;
Structure {
z;
v1, v2, ...
} obs(*); // connected
} time(*); // connected
} stn(stn); // not connected
Unstructured Grid
• Pt dimension not connected
• Looks the same as point data
• Need to specify the connectivity
explicitly
float unstructGrid(t,z,pt);
float lat(pt);
float lon(pt);
float time(t);
float height(z);
Data Types Summary
• Data access through a standard API
• Convenient georeferencing
• Specialized subsetting methods
– Efficiency for large datasets
Payoff
N + M instead of N * M things on your TODO List!
[Diagram labels: File Format #1, File Format #2, … File Format #N; CDM; Visualization & Analysis; NetCDF file; OPeNDAP Server; WCS Service; Web Service]
THREDDS Data Server
[Diagram labels: HTTP Tomcat Server; Catalog.xml; THREDDS Server (OPeNDAP, HTTPServer, WCS); NetCDF-Java library; hostname.edu; Datasets; IDD Data; Application]
Next: DataType Aggregation
• Work at the CDM DataType level, know (some)
data semantics
• Forecast Model Collection
– Combine multiple model forecasts into single
dataset with two time dimensions
– With NOAA/IOOS (Steve Hankin)
• Point/Station/Trajectory/Profile Data
– Allow space/time queries, return nested sequences
– Start from / standardize “Dapper conventions”
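One way to picture a dataset with two time dimensions, assuming they are the model run time and the forecast offset from that run (the slide does not name them, so this is an assumption). `ForecastTimes` and `validTimes` are hypothetical names, and times are plain hours since an arbitrary common epoch.

```java
// Hypothetical helper for illustration; not part of the NetCDF-Java API.
final class ForecastTimes {

    // validTime[run][step] = runTimeHours[run] + offsetHours[step]
    static double[][] validTimes(double[] runTimeHours, double[] offsetHours) {
        double[][] valid = new double[runTimeHours.length][offsetHours.length];
        for (int r = 0; r < runTimeHours.length; r++) {
            for (int s = 0; s < offsetHours.length; s++) {
                valid[r][s] = runTimeHours[r] + offsetHours[s];
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        double[] runs = {0.0, 12.0};          // two model runs, 12 hours apart (made up)
        double[] offsets = {0.0, 6.0, 12.0};  // forecast offsets in hours (made up)
        double[][] valid = validTimes(runs, offsets);
        // The same valid time can appear in more than one run.
        System.out.println(valid[0][2] == valid[1][0]);
    }
}
```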
Forecast Model Collections
Conclusion
• Standardized Data Access in good shape
– HDF5, NetCDF, OPeNDAP
– Write an IOSP for proprietary formats (Java)

• But that’s not good enough!
• To do:
– Standard representations of coordinate
systems
– Classifications of data types, standard
services for them

Syngulon - Selection technology May 2024.pdf
 
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi IbrahimzadeFree and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
 
Demystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John StaveleyDemystifying gRPC in .Net by John Staveley
Demystifying gRPC in .Net by John Staveley
 
Speed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in MinutesSpeed Wins: From Kafka to APIs in Minutes
Speed Wins: From Kafka to APIs in Minutes
 
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
TEST BANK For, Information Technology Project Management 9th Edition Kathy Sc...
 
Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024Extensible Python: Robustness through Addition - PyCon 2024
Extensible Python: Robustness through Addition - PyCon 2024
 
ECS 2024 Teams Premium - Pretty Secure
ECS 2024   Teams Premium - Pretty SecureECS 2024   Teams Premium - Pretty Secure
ECS 2024 Teams Premium - Pretty Secure
 
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdfSimplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
Simplified FDO Manufacturing Flow with TPMs _ Liam at Infineon.pdf
 
Intro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджераIntro in Product Management - Коротко про професію продакт менеджера
Intro in Product Management - Коротко про професію продакт менеджера
 
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
ASRock Industrial FDO Solutions in Action for Industrial Edge AI _ Kenny at A...
 
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
FDO for Camera, Sensor and Networking Device – Commercial Solutions from VinC...
 
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
Behind the Scenes From the Manager's Chair: Decoding the Secrets of Successfu...
 
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdfLinux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
Linux Foundation Edge _ Overview of FDO Software Components _ Randy at Intel.pdf
 
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
Secure Zero Touch enabled Edge compute with Dell NativeEdge via FDO _ Brad at...
 
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová10 Differences between Sales Cloud and CPQ, Blanka Doktorová
10 Differences between Sales Cloud and CPQ, Blanka Doktorová
 
AI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří KarpíšekAI revolution and Salesforce, Jiří Karpíšek
AI revolution and Salesforce, Jiří Karpíšek
 
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdfHow Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
How Red Hat Uses FDO in Device Lifecycle _ Costin and Vitaliy at Red Hat.pdf
 
PLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. StartupsPLAI - Acceleration Program for Generative A.I. Startups
PLAI - Acceleration Program for Generative A.I. Startups
 
The UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, OcadoThe UX of Automation by AJ King, Senior UX Researcher, Ocado
The UX of Automation by AJ King, Senior UX Researcher, Ocado
 

Unidata's Common Data Model

  • 1. Unidata’s Common Data Model John Caron Unidata/UCAR Nov 2006
  • 2. Goals / Overview • Look at the landscape of scientific datasets from a few thousand feet up. • What semantics are needed to make these useful? – georeferencing – specialized subsetting
  • 3. What’s a Data Model? • An Abstract Data Model describes data objects and what methods you can use on them. • An API is the interface to the Data Model for a specific programming language • A file format is a way to persist the objects in the Data Model. • An Abstract Data Model removes the details of any particular API and the persistence format.
  • 4. Common Data Model Layers: Scientific Datatypes (Point, Station, Profile, Trajectory, Radial, Grid, Swath); Coordinate Systems; Data Access
  • 5. NetCDF-Java version 2.2 architecture [diagram labels: Application; Scientific Datatypes; Datatype Adapter; NetcdfDataset; CoordSystem Builder; NetcdfFile; I/O service provider; ADDE; THREDDS; OPeNDAP; Catalog.xml; NcML; NetCDF-3; NetCDF-4; HDF5; GRIB; GINI; NIDS; Nexrad; DMSP; …]
  • 6. NetCDF-4 and Common Data Model (Data Access Layer)
  • 7. I/O Service Provider Implementations • General: NetCDF, HDF5, OPeNDAP • Gridded: GRIB-1, GRIB-2 • Radar: NEXRAD level 2 and 3, DORADE • Point: BUFR, ASCII • Satellite: DMSP, GINI • In development – NOAA: GOES (Knapp/Nelson), many others
  • 8. Coordinate Systems needed • NetCDF, OPeNDAP, HDF data models do not have integrated coordinate systems – so georeferencing not part of API – Need conventions to specify (eg CF-1, COARDS, etc) • Contrast GRIB, HDF-EOS, other specialized formats
  • 9. NetCDF Coordinate Variables dimensions: lat = 64; lon = 128; variables: float lat(lat); float lon(lon); double temperature(lat,lon);
  • 10. Coordinate Variables – One-dimensional variable with the same name as its dimension – Strictly monotonic values – No missing values The coordinates of a point (i,j,k) are {CV1(i), CV2(j), CV3(k)}
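The three conditions on slide 10, and the lookup rule {CV1(i), CV2(j), CV3(k)}, can be sketched in a few lines. This is a minimal illustration in plain Python, not part of any NetCDF library; the function names `is_coordinate_variable` and `point_coordinates` and the sample values are assumptions made for the example.

```python
def is_coordinate_variable(name, dims, values):
    """Check the slide's three conditions for a 1D coordinate variable."""
    # One dimension, and the variable shares its name.
    one_dim_same_name = len(dims) == 1 and dims[0] == name
    # No missing values; `v == v` is False for NaN.
    no_missing = all(v is not None and v == v for v in values)
    if not (one_dim_same_name and no_missing):
        return False
    # Strictly monotonic: strictly increasing or strictly decreasing.
    increasing = all(a < b for a, b in zip(values, values[1:]))
    decreasing = all(a > b for a, b in zip(values, values[1:]))
    return increasing or decreasing


def point_coordinates(index, coord_vars):
    """Coordinates of index (i, j, ...) are (CV1[i], CV2[j], ...)."""
    return tuple(cv[i] for i, cv in zip(index, coord_vars))


# Hypothetical sample axes for temperature(lat, lon).
lat = [-10.0, 0.0, 10.0]
lon = [100.0, 110.0, 120.0, 130.0]
corner = point_coordinates((2, 1), (lat, lon))
```

Because each axis depends on a single index, a grid of 64 x 128 points needs only 64 + 128 stored coordinate values.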
  • 11. Limitations of 1D Coordinate Variables • Non-lat/lon horizontal grids: float temperature(y,x); float lat(y,x); float lon(y,x); • Trajectory data: float NKoreaRadioactivity(pt); float lat(pt); float lon(pt); float altitude(pt); float time(pt);
  • 12. General Coordinates in CF-1.0 float P(y,x); P:coordinates = "lat lon"; float lat(y,x); float lon(y,x); float Sr90(pt); Sr90:coordinates = "lat lon altitude time";
  • 13. Coordinate Systems (abstract) • A Coordinate System for a data variable is a set of Coordinate Variables such that the coordinates of the (i,j,k) data point are {CV1(i,j,k), CV2(i,j,k), CV3(i,j,k), CV4(i,j,k)…} (the previous definition was {CV1(i), CV2(j), CV3(k)}) • The dimensions of each Coordinate Variable must be a subset of the dimensions of the data variable.
  • 14. Need Coordinate Axis Types float gridData(t,z,y,x); float time(t); float y(y); float x(x); float lat(y,x); float lon(y,x); float height(t,z,y,x); float radialData(radial,gate); float distance(gate); float azimuth(radial); float elevation(radial); float time(radial);
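The subset rule on slide 13 can be made concrete with a small sketch. It assumes coordinate values are held as nested Python lists indexed in the order of the coordinate variable's own dimensions; `is_valid_coordinate` and `coordinate_at` are hypothetical helper names, not functions from the NetCDF-Java API.

```python
def is_valid_coordinate(data_dims, coord_dims):
    """A coordinate variable's dimensions must be a subset of the data variable's."""
    return set(coord_dims) <= set(data_dims)


def coordinate_at(data_dims, index, coord_dims, coord_values):
    """Evaluate one coordinate variable at a data index.

    Only the indices belonging to the coordinate variable's own
    dimensions are used; the others are ignored.
    """
    position = dict(zip(data_dims, index))
    value = coord_values
    for dim in coord_dims:
        value = value[position[dim]]
    return value


# Hypothetical data variable gridData(t, y, x) with a 1D time axis
# and a 2D latitude field lat(y, x).
data_dims = ("t", "y", "x")
time = [0, 6]
lat2d = [[40.0, 40.1], [41.0, 41.1]]

t_value = coordinate_at(data_dims, (1, 0, 1), ("t",), time)
lat_value = coordinate_at(data_dims, (1, 0, 1), ("y", "x"), lat2d)
```

The same lookup covers the earlier 1D case, since a 1D coordinate variable is just one whose dimension list has a single entry.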
  • 15. The same?? float stationObs(pt); float lat(pt); float lon(pt); float z(pt); float time(pt); float trajectory(pt); float lat(pt); float lon(pt); float z(pt); float time(pt);
  • 16. Revised Coordinate Systems 1. Specify Coordinate Variables 2. Specify Coordinate Types (time, lat, lon, projection x, y, height, pressure, z, radial, azimuth, elevation) 3. Specify connectivity (implicit or explicit) between data points – Implicit: Neighbors in index space are (connected) neighbors in coordinate space. Allows efficient searching.
  • 17. Gridded Data float gridData(t,z,y,x); float time(t); // Time float y(y); // GeoY float x(x); // GeoX float z(t,z,y,x); // Height or Pressure • Cartesian coordinates • All dimensions are connected "Connected" means that neighbors in index space are neighbors in coordinate space
  • 19. Scientific Data Types • Based on datasets Unidata is familiar with – APIs are evolving • How are data points connected? • Intended to scale to large, multifile collections • Intended to support “specialized queries” – Space, Time • Corresponding “standard” NetCDF file conventions
  • 20. Gridded Data • Cartesian coordinates • All dimensions are connected • x, y, z, time • recently added runtime and ensemble • refactored into GridDatatype interface float gridData(t,z,y,x); float time(t); float y(y); float x(x); float lat(y,x); float lon(y,x); float height(t,z,y,x);
  • 21. GridDatatype methods CoordinateAxis getTaxis(); CoordinateAxis getXaxis(); CoordinateAxis getYaxis(); CoordinateAxis getZaxis(); Projection getProjection(); int[] findXYindexFromCoord( double x_coord, double y_coord); LatLonRect getLatLonBoundingBox(); Array getDataSlice (Range[] …) GridDatatype makeSubset (Range[] …)
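To illustrate why connected, monotonic axes allow efficient searching, here is a rough sketch of the kind of lookup a method such as `findXYindexFromCoord` could perform on 1D x and y axes. It is not the NetCDF-Java implementation: it assumes strictly increasing axes, clamps out-of-range coordinates to the nearest edge, and uses hypothetical names (`nearest_index`, `find_xy_index`).

```python
from bisect import bisect_left


def nearest_index(axis, coord):
    """Index of the axis value nearest to coord.

    Assumes axis is strictly increasing, so a binary search
    (O(log n)) replaces a scan over every point.
    """
    pos = bisect_left(axis, coord)
    if pos == 0:
        return 0
    if pos == len(axis):
        return len(axis) - 1
    before, after = axis[pos - 1], axis[pos]
    # Ties go to the lower index.
    return pos if (after - coord) < (coord - before) else pos - 1


def find_xy_index(x_axis, y_axis, x_coord, y_coord):
    """Return (x_index, y_index) of the grid cell nearest the given coordinates."""
    return nearest_index(x_axis, x_coord), nearest_index(y_axis, y_coord)


# Hypothetical projection axes.
x_axis = [0.0, 10.0, 20.0, 30.0]
y_axis = [100.0, 105.0, 110.0]
cell = find_xy_index(x_axis, y_axis, 12.0, 109.0)
```

For 2D lat(y,x) and lon(y,x) coordinates the axes are no longer independent, so a different search strategy would be needed.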
  • 22. Radial Data • Polar coordinates • All dimensions are connected • No separate time dimension radialData(radial, gate) : distance(gate) azimuth(radial) elevation(radial) time(radial)
  • 23. Swath • lat/lon coordinates • no separate time dimension • all dimensions are connected swathData(line,cell) lat(line,cell) lon(line,cell) time(line) z(line,cell) ??
  • 24. Point Observation Data • Set of measurements at the same point in space and time • Point dimension not connected float obs1(pt); float obs2(pt); float lat(pt); float lon(pt); float z(pt); float time(pt); Structure { lat, lon, z, time; v1, v2, ... } obs( pt);
  • 25. PointObsDataset Methods // Iterator<StructureData> Iterator getData( LatLonRect boundingBox, Date start, Date end);
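A space/time query of the kind slide 25 describes might look like the following sketch. It is plain Python rather than the NetCDF-Java interface; the `PointObs` record, the argument layout of `get_data`, and the sample observations are all assumptions for illustration. The box test ignores bounding boxes that cross the date line.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class PointObs:
    lat: float
    lon: float
    time: datetime
    value: float


def get_data(observations, lat_min, lat_max, lon_min, lon_max, start, end):
    """Yield observations inside the bounding box and the time window.

    The point dimension is not connected, so nothing can be assumed
    about ordering and every observation has to be examined.
    """
    for obs in observations:
        in_box = lat_min <= obs.lat <= lat_max and lon_min <= obs.lon <= lon_max
        if in_box and start <= obs.time <= end:
            yield obs


# Hypothetical observations.
observations = [
    PointObs(40.0, -105.0, datetime(2006, 11, 1, 0), 1.5),
    PointObs(10.0, 20.0, datetime(2006, 11, 1, 6), 2.5),
    PointObs(41.0, -104.0, datetime(2006, 11, 2, 0), 3.5),
]
selected = list(
    get_data(observations, 35.0, 45.0, -110.0, -100.0,
             datetime(2006, 11, 1), datetime(2006, 11, 1, 12))
)
```

Returning an iterator, as the slide's signature does, lets a caller stream results from a large collection without holding them all in memory.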
  • 26. Time series Station Data Structure { name; lat, lon, z; Structure{ time; v1, v2, ... } obs(*); // connected } stn(stn); // not connected
  • 27. StationObs Methods // List<Station> List getStations( LatLonRect boundingBox); // Iterator<StructureData> Iterator getData( Station s, Date start, Date end);
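The nested station structure distinguishes an unconnected station dimension from a connected time series within each station. The sketch below shows one way that difference could be exploited: a linear scan over stations, but a binary search within a station's time-ordered observations. The `Station` class, the use of plain numbers for time, and both function names are assumptions for this example, not the NetCDF-Java types.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field


@dataclass
class Station:
    name: str
    lat: float
    lon: float
    times: list = field(default_factory=list)   # sorted ascending, e.g. hours since an epoch
    values: list = field(default_factory=list)  # one value per time


def get_stations(stations, lat_min, lat_max, lon_min, lon_max):
    """Stations are not connected, so each one is tested against the box."""
    return [s for s in stations
            if lat_min <= s.lat <= lat_max and lon_min <= s.lon <= lon_max]


def get_data(station, start, end):
    """Observations are ordered in time, so the window is found by binary search."""
    lo = bisect_left(station.times, start)
    hi = bisect_right(station.times, end)
    return list(zip(station.times[lo:hi], station.values[lo:hi]))


# Hypothetical stations.
stations = [
    Station("A", 40.0, -105.0, [0, 6, 12, 18], [1.0, 2.0, 3.0, 4.0]),
    Station("B", 10.0, 20.0, [0, 6], [5.0, 6.0]),
]
in_box = get_stations(stations, 30.0, 50.0, -110.0, -100.0)
series = get_data(in_box[0], 6, 12)
```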
  • 28. Trajectory Data • pt dimension is connected • Collection dimension not connected Structure { lat, lon, z, time; v1, v2, ... } obs(pt); // connected Structure { name; Structure { lat, lon, z, time; v1, v2, ... } obs(*); // connected } traj(traj) // not connected
  • 29. Profiler/Sounding Station Data Structure { name; lat, lon, time; Structure { z; v1, v2, ... } obs(*); // connected } loc(nloc); // not connected Structure { name; lat, lon; Structure { time, Structure { z; v1, v2, ... } obs(*); // connected } time(*); // connected } stn(stn); // not connected
  • 30. Unstructured Grid • Pt dimension not connected • Looks the same as point data • Need to specify the connectivity explicitly float unstructGrid(t,z,pt); float lat(pt); float lon(pt); float time(t); float height(z);
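Explicit connectivity for an unstructured grid can be stored in several ways; the slide does not say which one is intended. As one possible illustration, the sketch below assumes connectivity is given as a list of edges between point indices and builds a neighbor table from it. The name `build_neighbors` and the sample edges are hypothetical.

```python
def build_neighbors(num_points, edges):
    """Build a neighbor table from an explicit edge list.

    Index order of the pt dimension carries no neighbor information,
    so adjacency has to come from the edges themselves.
    """
    neighbors = {p: set() for p in range(num_points)}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


# Hypothetical mesh: a triangle (0, 1, 2) with one extra point hanging off point 2.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
neighbors = build_neighbors(4, edges)
```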
  • 31. Data Types Summary • Data access through a standard API • Convenient georeferencing • Specialized subsetting methods – Efficiency for large datasets
  • 32. Payoff N + M instead of N * M things on your TODO List! [diagram: File Format #1, File Format #2, … File Format #N feed into the CDM, which feeds Visualization & Analysis, NetCDF file, OPeNDAP Server, WCS Service, Web Service]
  • 33. THREDDS Data Server [diagram labels: Application; HTTP; Tomcat Server; THREDDS Server (OPeNDAP, HTTPServer, WCS); Catalog.xml; NetCDF-Java library; Datasets; IDD Data; hostname.edu]
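The N + M payoff can be shown with a toy pipeline. In this sketch the "common model" is simply a list of floats, and the readers and consumers are invented for the example; none of them correspond to real CDM components.

```python
def pieces_of_code(num_formats, num_consumers, common_model):
    """Count the converters needed with and without a shared data model."""
    if common_model:
        return num_formats + num_consumers
    return num_formats * num_consumers


# Each reader turns one (hypothetical) text format into the common model.
READERS = {
    "csv": lambda text: [float(v) for v in text.split(",")],
    "lines": lambda text: [float(v) for v in text.splitlines()],
}

# Each consumer works only on the common model, never on a raw format.
CONSUMERS = {
    "mean": lambda values: sum(values) / len(values),
    "maximum": max,
}


def run(fmt, consumer, text):
    """Any reader can be paired with any consumer through the common model."""
    return CONSUMERS[consumer](READERS[fmt](text))
```

With 8 formats and 5 consumers, for instance, the shared model needs 13 pieces of code where direct pairing would need 40.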
  • 34. Next: DataType Aggregation • Work at the CDM DataType level, know (some) data semantics • Forecast Model Collection – Combine multiple model forecasts into single dataset with two time dimensions – With NOAA/IOOS (Steve Hankin) • Point/Station/Trajectory/Profile Data – Allow space/time queries, return nested sequences – Start from / standardize “Dapper conventions”
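One way to picture a dataset with two time dimensions is a table indexed by model run time and forecast offset, where the valid time of each entry is run time plus offset. The sketch below assumes each run is a mapping from offset (in hours) to a single value; the layout, the function names, and the sample numbers are illustrative assumptions, not the planned CDM design.

```python
def aggregate_runs(runs):
    """Combine several model runs into one table with two time dimensions.

    runs maps run time (hours) -> {forecast offset (hours) -> value}.
    Returns (run_times, offsets, table), where table[r][o] is None
    when a run has no forecast at that offset.
    """
    run_times = sorted(runs)
    offsets = sorted({off for forecast in runs.values() for off in forecast})
    table = [[runs[r].get(o) for o in offsets] for r in run_times]
    return run_times, offsets, table


def forecasts_valid_at(runs, valid_time):
    """All (run time, offset, value) entries whose valid time matches."""
    return [
        (run, offset, value)
        for run in sorted(runs)
        for offset, value in sorted(runs[run].items())
        if run + offset == valid_time
    ]


# Hypothetical runs at hours 0 and 6.
runs = {
    0: {0: 10.0, 6: 11.0, 12: 12.0},
    6: {0: 11.5, 6: 12.5},
}
run_times, offsets, table = aggregate_runs(runs)
at_noon = forecasts_valid_at(runs, 12)
```

Keeping both time dimensions preserves every forecast for a given valid time, so a user can compare how successive runs predicted the same moment.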
  • 36. Conclusion • Standardized Data Access in good shape – HDF5, NetCDF, OPeNDAP – Write an IOSP for proprietary formats (Java) • But that’s not good enough! • To do: – Standard representations of coordinate systems – Classifications of data types, standard services for them

Editor's Notes

  1. Diversity of formats:
  2. Appropriate design decision for General formats
  3. Need a more dynamic system for real-time and very large datasets. Catalog is a file, but these are services, that is, code. Show IDD Server catalog – show satellite DQC, then show radar DQC