JSTOR Labs has been exploring new ways to use the JSTOR Corpus, leading to a series of innovative projects in which content from the JSTOR archive is “mashed up” alongside other content. In this talk, we will demonstrate content mashups that Labs has developed, including Understanding Shakespeare and Text Analyzer. We will also describe how both open, collaborative partnerships and natural language processing have made these innovative projects possible.
Your Chocolate, My Peanut Butter: JSTOR Labs' Content Mashups - NFAIS Webinar on Content Integrations
1. Your Chocolate, My Peanut Butter:
JSTOR Labs’ Content Mashups
Ron Snyder
Ron.Snyder@ithaka.org
Jan 16, 2018
NFAIS Webinar on Content Integrations
2. ITHAKA is a not-for-profit organization that helps the academic community use digital technologies to preserve the scholarly record and to advance research and teaching in sustainable ways.
JSTOR is a not-for-profit digital library of academic journals, books, and primary sources.
Ithaka S+R is a not-for-profit research and consulting service that helps academic, cultural, and publishing communities thrive in the digital environment.
Portico is a not-for-profit preservation service for digital publications, including electronic journals, books, and historical collections.
Artstor provides 2+ million high-quality images and digital asset management software to enhance scholarship and teaching.
3. JSTOR Labs works with partner publishers, libraries and labs to create tools for researchers, teachers and students that are immediately useful – and a little bit magical.
4. THE JSTOR CORPUS
• Full runs of 2,600 journals
• 10 million articles
• 50,000 books
• 2 million primary sources
• Especially Humanities and Social Sciences
5. What new and innovative uses can we find for the JSTOR Corpus?
8. WHAT WE LEARNED
• The value of open, collaborative partnerships
• The power of combining our corpus with other content
• The JSTOR Corpus is valuable not just as content but as data
10. WHAT WE LEARNED
• Free and open matters.
• Structured data is valuable.
• APIs can take time to find their audience.
• Not everyone wants to share what they’ve built.
16. WHAT WE LEARNED
• Combining semantic indexing with topic modeling is powerful.
• Keyword searching is great, but there’s more we can do for users.
• Getting users to adopt a new tool takes effort.