Using data to improve
student research
EasyBib is an automatic
bibliography composer.
Students use it to cite
sources for their research.
We teach information
literacy.
•18% of all student papers include plagiarism (1)
•50% likelihood of using a credible vs. non-credible source (1)
•4% increase in the use of paper mills and cheating sites (1)
•~16% of students are adequately prepared for college (2)
Source: (1) TurnItIn; (2) Both Sides Now: Librarians Looking at Information Literacy from High School and College.
That’s how we felt too.
The problem is becoming
bigger.
Unprepared students
make for unprepared
adults.
It’s not just students who
plagiarize:
•Pal Schmitt, former president
of Hungary
•German education minister
•Jayson Blair (former New
York Times writer)
•Jonah Lehrer, journalist and
author
•Fareed Zakaria (reporter,
author, host)
We are in the right place
to figure it out.
Over half of all
students in the
US (40M)
Over half a billion
citations
We asked ourselves the
following questions:
•What are students using in their
research?
•How good are their sources?
•How can we help them?
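We started with the
basics.
// Queue one Google Analytics event per citation on the "citations" tracker:
// category = title, action = publisher, label = id.
_gaq.push([
  'citations._trackEvent',
  citationTitle,
  citationPublisher,
  citationId
]);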
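The call above uses the classic Google Analytics (ga.js) asynchronous queue: each command is an array pushed onto `_gaq`, and the `citations.` prefix routes it to a tracker named `citations`. A minimal sketch of wrapping that call, assuming a hypothetical `trackCitation` helper and a citation object with `title`, `publisher`, and `id` fields (neither is from the talk). A plain array stands in for the queue so the sketch runs without a browser:

```javascript
// A plain array stands in for the ga.js command queue; in a page, ga.js
// replaces it with an object whose push() executes each command.
var _gaq = [];

// Hypothetical helper: queue one event per citation.
// ga.js reads the arguments as (category, action, label), so the title is
// the category, the publisher is the action, and the id is the label.
function trackCitation(queue, citation) {
  queue.push([
    'citations._trackEvent',
    String(citation.title),
    String(citation.publisher),
    String(citation.id)
  ]);
  return queue.length;
}

trackCitation(_gaq, { title: 'World Factbook', publisher: 'CIA', id: 42 });
console.log(_gaq[0]);
```

Passing the queue in as an argument keeps the helper testable outside a page that has loaded ga.js.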
Here’s what we found.
Top sources 2010
•Wikipedia
•Google
1. The New York Times
2. CIA World Factbook
3. Oracle Thinkquest
4. Buzzle
5. US BLS
6. Dictionary.com
7. CDC
8. PBS
9. eHow
Source: EasyBib Google Analytics Oct 2010-Nov 2010 data.
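A ranking like the one above can be derived by counting tracked citations per publisher. A minimal sketch, assuming each exported event row has a `publisher` field; the row shape and the sample rows are made up for illustration and are not EasyBib data:

```javascript
// Count citations per publisher and return the n most cited, highest first.
function topSources(events, n) {
  const counts = new Map();
  for (const e of events) {
    counts.set(e.publisher, (counts.get(e.publisher) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0])) // ties: alphabetical
    .slice(0, n)
    .map(([publisher, count]) => ({ publisher, count }));
}

// Illustrative rows only.
const sampleEvents = [
  { publisher: 'Wikipedia' }, { publisher: 'PBS' }, { publisher: 'Wikipedia' },
  { publisher: 'CDC' }, { publisher: 'PBS' }, { publisher: 'Wikipedia' }
];
const ranking = topSources(sampleEvents, 2);
console.log(ranking);
```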
What could we do?
•Warn them when their source’s
credibility is in question
•Analyze the quality of their full
bibliography
•Make it easier to not plagiarize
•Suggest better sources
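The first two ideas can be sketched as a lookup plus a summary. This is a minimal illustration, assuming a hypothetical hand-maintained table that labels publishers as `credible`, `questionable`, or `not_credible`. The labels and entries below are placeholders, not EasyBib's actual ratings or rules:

```javascript
// Hypothetical ratings table; entries are placeholders for illustration.
const RATINGS = new Map([
  ['example-journal.org', 'credible'],
  ['example-wiki.org', 'questionable'],
  ['example-contentfarm.com', 'not_credible']
]);

// Return a warning string when a source's credibility is in question,
// or null when it is rated credible. Unknown sources are flagged as unrated.
function warnFor(publisher) {
  const rating = RATINGS.get(publisher) || 'unrated';
  return rating === 'credible' ? null : publisher + ': ' + rating;
}

// Summarize a whole bibliography: share of credible sources plus warnings.
function auditBibliography(publishers) {
  const warnings = publishers.map(warnFor).filter((w) => w !== null);
  const credible = publishers.length - warnings.length;
  return {
    credibleShare: publishers.length ? credible / publishers.length : 0,
    warnings
  };
}

const audit = auditBibliography([
  'example-journal.org', 'example-contentfarm.com', 'unknown.example'
]);
console.log(audit);
```

Treating unknown publishers as `unrated` rather than as credible is a design choice made for this sketch; it errs on the side of prompting the student to check the source.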
Define credibility.
Improve citation quality
Gave students access to
their own analytics
To combat plagiarism, we
built an audit trail for notes
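The talk does not describe how the audit trail is built. One minimal sketch, assuming an append-only log in which every note records where its text came from (`quote`, `paraphrase`, or `own`) and which source it is tied to. All names here are hypothetical:

```javascript
// Append-only log of note events; entries are frozen once recorded.
class NoteTrail {
  constructor() {
    this.entries = [];
  }

  // kind: 'quote' | 'paraphrase' | 'own'; sourceId may be null for 'own'.
  record(noteId, kind, sourceId, text) {
    if (kind !== 'own' && sourceId == null) {
      throw new Error('quotes and paraphrases must name a source');
    }
    const entry = Object.freeze({
      seq: this.entries.length + 1, noteId, kind, sourceId, text
    });
    this.entries.push(entry);
    return entry;
  }

  // Every source a note has drawn on, in first-use order, without repeats.
  sourcesFor(noteId) {
    const ids = this.entries
      .filter((e) => e.noteId === noteId && e.sourceId != null)
      .map((e) => e.sourceId);
    return [...new Set(ids)];
  }
}

const trail = new NoteTrail();
trail.record('n1', 'quote', 'src-7', '"Direct quotation"');
trail.record('n1', 'paraphrase', 'src-7', 'Reworded version');
trail.record('n1', 'own', null, 'My own conclusion');
console.log(trail.sourcesFor('n1'));
```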
So after all this...
Does it blend (tm)?
1. Wikipedia
2. Bio.com
3. History.com
4. PBS
5. Mayo Clinic
6. CDC
7. The New York Times
8. BBC
9. CNN
10. WebMD
11. US BLS
• Wikipedia still on top, but...
• No content farms, no Google.
• WebMD is questionable, but its credibility can be argued for.
Source: Apr-May 2013 Google Analytics data
We have to admit, it’s getting
better...
Help students find better
sources
How does the Research
engine currently work?
Cloudant (CouchDB)
MySQL
Lucene/Solr
Slow, asynchronous, lots of moving
parts.
Starting to do a bit more
StatsD::increment($metrics);
$response = $rediska->publish(
array('realtime'),
$citation
);
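The PHP snippet above does two things per citation: it bumps a counter and publishes the citation on a `realtime` channel. A minimal in-process sketch of the same pattern in JavaScript, using a small `Channel` class as a stand-in for Redis pub/sub and a `Map` as a stand-in for StatsD. The metric name `citations.created` is an assumption:

```javascript
// Minimal in-process pub/sub: a stand-in for the Redis channel.
class Channel {
  constructor() {
    this.subscribers = [];
  }
  subscribe(fn) {
    this.subscribers.push(fn);
  }
  // Deliver the message to every subscriber; return how many received it.
  publish(message) {
    this.subscribers.forEach((fn) => fn(message));
    return this.subscribers.length;
  }
}

const counters = new Map(); // stand-in for StatsD counters
function increment(metric) {
  counters.set(metric, (counters.get(metric) || 0) + 1);
}

const realtime = new Channel();

// Count the citation, then broadcast it to current subscribers.
function publishCitation(citation) {
  increment('citations.created');
  return realtime.publish(citation);
}

const seen = [];
realtime.subscribe((citation) => seen.push(citation.title));

const deliveredTo = publishCitation({ title: 'CIA World Factbook' });
console.log(deliveredTo, seen, counters.get('citations.created'));
```

As with Redis `PUBLISH`, the return value is the number of subscribers that received the message, so a result of 0 means nobody was listening.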
There’s a lot more we can
do, and data will help us.
Cloudant Search
•Full-text search integrated into Cloudant
•Lucene syntax
•Indexing is easy
function(doc){
index("title", doc.title, {"store": "yes"});
}
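Cloudant runs an index function once per document, and each `index(name, value, options)` call adds one field to the search index. A minimal sketch of exercising such a function locally, assuming a stub `index` that only records its calls (the stub and the `publisher` field are illustrative; the real `index` is supplied by Cloudant). Guard clauses skip fields a document lacks, and the sketch uses the boolean form `store: true` where the slide uses the string "yes":

```javascript
// Stub standing in for the index() function Cloudant supplies at index time.
const indexed = [];
function index(name, value, options) {
  indexed.push({ name, value, options });
}

// Index function in the style of the slide, with guard clauses added.
function indexCitation(doc) {
  if (typeof doc.title === 'string') {
    index('title', doc.title, { store: true });
  }
  if (typeof doc.publisher === 'string') {
    index('publisher', doc.publisher, { store: true });
  }
}

[
  { title: 'World Factbook', publisher: 'CIA' },
  { publisher: 'PBS' },   // no title: only the publisher is indexed
  { unrelated: true }     // nothing to index
].forEach(indexCitation);

console.log(indexed.map((c) => c.name + '=' + c.value));
```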
•Grouping of sources via chained map-reduce
map: function(doc){
  if (doc.title){ emit({"title": doc.title}, 1); }
}
reduce: _sum
dbcopy: citationGroup
------
// Runs over the citationGroup documents, which carry "key" and "value".
map: function(doc){
  if (doc.key && doc.key.title){ emit(doc.value, doc.key.title); }
}
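The two stages above can be traced locally. A minimal sketch in plain JavaScript, assuming the first view's reduced rows land in the `citationGroup` database as documents with `key` and `value` fields. The helper names and sample titles are illustrative:

```javascript
// Stage 1: map each citation to ({title}, 1), then sum per key,
// which is what the built-in _sum reducer does when grouping by key.
function stageOne(docs) {
  const sums = new Map();
  for (const doc of docs) {
    if (doc.title) {
      sums.set(doc.title, (sums.get(doc.title) || 0) + 1);
    }
  }
  // Shape of the rows copied into the citationGroup database.
  return [...sums.entries()].map(([title, count]) => ({
    key: { title },
    value: count
  }));
}

// Stage 2: re-key by count so that a view query returns titles ordered
// by how often they were cited.
function stageTwo(groupDocs) {
  const rows = [];
  for (const doc of groupDocs) {
    if (doc.key && doc.key.title) {
      rows.push({ key: doc.value, value: doc.key.title });
    }
  }
  return rows.sort((a, b) => a.key - b.key); // views sort by key, ascending
}

const citations = [
  { title: 'Hamlet' }, { title: 'Moby-Dick' }, { title: 'Hamlet' },
  { title: 'Hamlet' }, { notATitle: true }
];
const byCount = stageTwo(stageOne(citations));
console.log(byCount);
```

Chaining is what makes the second sort possible: a single view can only be ordered by its emitted key, so the counts have to become keys in a second view.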
Live data analysis.
Crowdsourcing.
•Use Cloudant Search to power
feedback on sources (# of times
cited in real time, quality of
bibliographies derived from)
•Allow users to submit their own
credibility evaluations and aggregate
results
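Aggregating user-submitted evaluations can be as simple as averaging votes per source while withholding a verdict until enough people have weighed in. A minimal sketch, assuming a hypothetical 1-to-5 rating scale and a minimum of three votes. Both numbers are placeholders, not values from the talk:

```javascript
const MIN_VOTES = 3; // placeholder threshold

// votes: [{ source, rating }] with rating from 1 (poor) to 5 (excellent).
// Returns a Map of source -> { votes, mean }, where mean stays null until
// the source has at least MIN_VOTES valid ratings.
function aggregateRatings(votes) {
  const totals = new Map();
  for (const { source, rating } of votes) {
    if (!Number.isInteger(rating) || rating < 1 || rating > 5) continue; // drop bad input
    const t = totals.get(source) || { votes: 0, sum: 0 };
    t.votes += 1;
    t.sum += rating;
    totals.set(source, t);
  }
  const result = new Map();
  for (const [source, t] of totals) {
    result.set(source, {
      votes: t.votes,
      mean: t.votes >= MIN_VOTES ? t.sum / t.votes : null
    });
  }
  return result;
}

const crowd = aggregateRatings([
  { source: 'A', rating: 5 }, { source: 'A', rating: 4 },
  { source: 'A', rating: 3 }, { source: 'B', rating: 2 },
  { source: 'B', rating: 9 } // out of range, ignored
]);
console.log(crowd.get('A'), crowd.get('B'));
```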
SourceRank!
Credibility weighting + crowdsourcing
Synchronous & realtime via Cloudant Search
Value nodes based on nearest neighbors
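The talk does not spell out how SourceRank would be computed. One possible reading of "value nodes based on nearest neighbors" is to blend a source's own score with the average score of the sources it is cited alongside. A minimal sketch under that assumption; the function name, the neutral default of 0.5 for unrated sources, and the `alpha` weight are all hypothetical:

```javascript
// ratings: Map of source -> own score in [0, 1].
// bibliographies: arrays of sources that were cited together.
// alpha: weight on a source's own score; the rest comes from its neighbors.
function rankSources(ratings, bibliographies, alpha) {
  // Two sources are neighbors if they appear in the same bibliography.
  const neighbors = new Map();
  for (const bib of bibliographies) {
    for (const a of bib) {
      if (!neighbors.has(a)) neighbors.set(a, new Set());
      for (const b of bib) {
        if (a !== b) neighbors.get(a).add(b);
      }
    }
  }
  const ranked = new Map();
  for (const [source, set] of neighbors) {
    const own = ratings.has(source) ? ratings.get(source) : 0.5; // unrated: neutral
    const rated = [...set].filter((n) => ratings.has(n));
    const neighborMean = rated.length
      ? rated.reduce((sum, n) => sum + ratings.get(n), 0) / rated.length
      : own;
    ranked.set(source, alpha * own + (1 - alpha) * neighborMean);
  }
  return ranked;
}

const scores = rankSources(
  new Map([['A', 1.0], ['B', 0.0]]),
  [['A', 'C'], ['B', 'D'], ['A', 'B']],
  0.5
);
console.log(scores);
```

In this toy input the unrated source C ends up above the unrated source D only because C is cited next to the well-rated A and D next to the poorly rated B.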
And other things...
Driving growth
We have the largest UGC citation
set. Making this searchable
creates a “moat.”
The more people that use EasyBib,
the better the tool becomes.
What about other data
analytics tools?
Too stretched to learn more complex tools
(looking for easy answers)
Costs (GA is free!)
EMR, Hadoop, Redshift, Cloudant Search:
This is what’s next.
Questions?
Darshan Somashekar
@darshan
darshan@imagineeasy.com

Easybib Open Analytics NYC

  • 1. Using data to improve student research
  • 2. EasyBib is an automatic bibliography composer. Students use it to cite sources for their research.
  • 3. We teach information literacy.
      • 18% of all student papers include plagiarism (1)
      • 50% likelihood of using a credible vs. non-credible source (1)
      • 4% increase in the use of paper mills and cheating sites (1)
      • ~16% of students are adequately prepared for college (2)
      Source: (1) TurnItIn; (2) Both Sides Now: Librarians Looking at Information Literacy from High School and College.
  • 4. That’s how we felt too...
  • 5. The problem is becoming bigger.
  • 6. Unprepared students make for unprepared adults. It’s not just students who plagiarize:
      • Pal Schmitt, former president of Hungary
      • German education minister
      • Jayson Blair (former New York Times writer)
      • Jonah Lehrer, journalist and author
      • Fareed Zakaria (reporter, author, host)
  • 7. We are in the right place to figure it out.
      • Over half of all students in the US (40M)
      • Over half a billion citations
  • 8. We asked ourselves the following questions:
      • What are students using in their research?
      • How good are their sources?
      • How can we help them?
  • 9. We started with the basics.
      _gaq.push(['citations._trackEvent', citationTitle, citationPublisher, citationId]);
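The tracking call on this slide can be sketched as follows. This is a minimal illustration, not EasyBib's production code. It assumes the classic ga.js command queue, where `_trackEvent` takes a category, an action, and an optional string label, and where the `citations.` prefix addresses a tracker named `citations`. The helper names `buildCitationEvent` and `trackCitation` and the sample citation are hypothetical.

```javascript
// Stand-in for the ga.js command queue. On a real page, ga.js drains this array.
var _gaq = _gaq || [];

// Hypothetical helper: builds the event command for one citation,
// using the same argument order as the slide.
function buildCitationEvent(citation) {
  return [
    'citations._trackEvent',     // method on the tracker named "citations"
    String(citation.title),      // event category
    String(citation.publisher),  // event action
    String(citation.id)          // optional event label (a string)
  ];
}

// Hypothetical helper: queues the event so the tracker can send it later.
function trackCitation(queue, citation) {
  queue.push(buildCitationEvent(citation));
}

// Sample data, invented for illustration.
trackCitation(_gaq, { id: 42, title: 'Example article', publisher: 'Example Publisher' });
```

Passing the queue in as a parameter keeps the helper easy to exercise without loading the analytics script.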
  • 10. Here’s what we found. Top sources 2010
      • Wikipedia
      • Google
      1. The New York Times
      2. CIA World Factbook
      3. Oracle Thinkquest
      4. Buzzle
      5. US BLS
      6. Dictionary.com
      7. CDC
      8. PBS
      9. eHow
      Source: EasyBib Google Analytics Oct 2010-Nov 2010 data.
  • 11. What could we do?
      • Warn them when their source’s credibility is in question
      • Analyze the quality of their full bibliography
      • Make it easier to not plagiarize
      • Suggest better sources
  • 14. Gave students access to their own analytics
  • 15. To combat plagiarism, we built an audit trail for notes
  • 16. So after all this... Does it blend (tm)?
      1. Wikipedia
      2. Bio.com
      3. History.com
      4. PBS
      5. Mayo Clinic
      6. CDC
      7. The New York Times
      8. BBC
      9. CNN
      10. WebMD
      11. US BLS
      • Wikipedia still on top, but...
      • No content farms, no Google.
      • WebMD is questionable, but its credibility can be argued for.
      Source: Apr-May 2013 Google Analytics data
  • 17. We have to admit, it’s getting better...
  • 18. Help students find better sources
  • 19. How does the Research engine currently work?
      • Cloudant (CouchDB)
      • MySQL
      • Lucene/Solr
      Slow, asynchronous, lots of moving parts.
  • 20. Starting to do a bit more:
      StatsD::increment($metrics);
      $response = $rediska->publish(array('realtime'), $citation);
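The pattern on this slide, bumping a counter and then publishing the citation to a realtime channel, can be sketched with in-memory stand-ins. Nothing below talks to StatsD or Redis. The `Counters` and `PubSub` classes, the metric name `citations.created`, and the sample citation are all assumptions made for illustration.

```javascript
// In-memory stand-in for a StatsD-style counter (hypothetical, no network I/O).
class Counters {
  constructor() { this.counts = new Map(); }
  increment(metric) {
    this.counts.set(metric, (this.counts.get(metric) || 0) + 1);
  }
  get(metric) { return this.counts.get(metric) || 0; }
}

// In-memory stand-in for a Redis-style publish/subscribe channel.
class PubSub {
  constructor() { this.subscribers = new Map(); }
  subscribe(channel, handler) {
    if (!this.subscribers.has(channel)) this.subscribers.set(channel, []);
    this.subscribers.get(channel).push(handler);
  }
  // Delivers the message and returns how many subscribers received it.
  publish(channel, message) {
    const handlers = this.subscribers.get(channel) || [];
    handlers.forEach((handler) => handler(message));
    return handlers.length;
  }
}

const counters = new Counters();
const bus = new PubSub();
const received = [];
bus.subscribe('realtime', (message) => received.push(JSON.parse(message)));

// Count the citation, then fan it out to realtime listeners.
function recordCitation(citation) {
  counters.increment('citations.created');
  return bus.publish('realtime', JSON.stringify(citation));
}

const delivered = recordCitation({ id: 1, publisher: 'Example Publisher' });
```

Keeping the counter and the channel as separate objects mirrors the slide, where the metric and the realtime message are two independent calls.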
  • 21. There’s a lot more we can do, and data will help us.
  • 22. Cloudant Search
      • Full-text search integrated into Cloudant
      • Lucene syntax
      • Indexing is easy:
        function(doc){ index("title", doc.title, {"store": "yes"}); }
      • Grouping of sources via chained map-reduce:
        map: function(doc){ if (doc.title){ emit({"title": doc.title}, 1); } }
        reduce: _sum
        dbcopy: citationGroup
        ------
        map: function(doc){ if (doc.key && doc.key.title){ emit(doc.value, doc.key.title); } }
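The two-stage grouping on this slide can be simulated in plain JavaScript to show the data flow. This is a sketch that runs without a database, not Cloudant itself. It assumes the copied documents carry the reduced row in `key` and `value` fields. The sample citations and the function names `stageOne` and `stageTwo` are hypothetical.

```javascript
// Sample citation documents, invented for illustration.
const citations = [
  { title: 'Source A' },
  { title: 'Source B' },
  { title: 'Source A' },
  { url: 'no-title' },
  { title: 'Source A' }
];

// Stage 1: the map emits ({title}, 1) and the reduce sums per key, like _sum.
function stageOne(docs) {
  const totals = new Map();
  for (const doc of docs) {
    if (doc.title) {
      totals.set(doc.title, (totals.get(doc.title) || 0) + 1);
    }
  }
  // Mimic the copy step: one document per reduced row, with key and value fields.
  return Array.from(totals, ([title, count]) => ({ key: { title }, value: count }));
}

// Stage 2: map over the copied documents and emit (count, title).
// Rows are sorted by key, as a view would return them.
function stageTwo(groupDocs) {
  const rows = [];
  for (const doc of groupDocs) {
    if (doc.key && doc.key.title) {
      rows.push({ key: doc.value, value: doc.key.title });
    }
  }
  return rows.sort((a, b) => a.key - b.key);
}

const citationGroup = stageOne(citations);
const byCount = stageTwo(citationGroup);
```

Reading `byCount` from the end gives the most-cited titles first, which is the kind of ranking the top-sources slides rely on.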
  • 23. Live data analysis. Crowdsourcing.
      • Use Cloudant Search to power feedback on sources (# of times cited in real time, quality of bibliographies derived from)
      • Allow users to submit their own credibility evaluations and aggregate results
  • 24. SourceRank!
      • Credibility weighting + crowdsourcing
      • Synchronous & realtime via Cloudant Search
      • Value nodes based on nearest neighbors
      • And other things...
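Aggregating user-submitted credibility evaluations could look something like the sketch below. The slides do not say how evaluations are stored or combined, so the record shape, the 1 to 5 rating scale, and the use of a simple mean are all assumptions, and `aggregateEvaluations` is a hypothetical name.

```javascript
// Hypothetical evaluation records: each user rates a source from 1 to 5.
const evaluations = [
  { source: 'example.org', rating: 5 },
  { source: 'example.org', rating: 4 },
  { source: 'example.net', rating: 2 }
];

// Groups ratings by source and reports the count and mean for each.
function aggregateEvaluations(records) {
  const bySource = new Map();
  for (const { source, rating } of records) {
    const entry = bySource.get(source) || { count: 0, total: 0 };
    entry.count += 1;
    entry.total += rating;
    bySource.set(source, entry);
  }
  const summary = {};
  for (const [source, { count, total }] of bySource) {
    summary[source] = { count, mean: total / count };
  }
  return summary;
}

const summary = aggregateEvaluations(evaluations);
```

Keeping the count next to the mean lets a caller discount sources that have only a handful of evaluations.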
  • 25. Driving growth We have the largest UGC citation set. Making this searchable creates a “moat.” The more people that use EasyBib, the better the tool becomes.
  • 26. What about other data analytics tools?
      • Too stretched to learn more complex tools (looking for easy answers)
      • Costs (GA is free!)
      • EMR, Hadoop, Redshift, Cloudant Search: This is what’s next.