Modeling documents with Generative
Adversarial Networks
John Glover
Overview
Learning representations of natural language documents
A brief introduction to Generative Adversarial Networks
Energy-based Generative Adversarial Networks
An adversarial document model
Future work & conclusion
Representation learning
The ability to learn robust, reusable feature representations
from unlabelled data has potential applications in a wide
variety of machine learning tasks, such as data retrieval
and classification.
One way to create such representations is to train deep
generative models that can learn to capture the complex
distributions of real-world data.
Document representations: LDA
The traditional approach to doing this is to use a topic model
such as Latent Dirichlet Allocation (LDA).
In LDA, documents consist of a mixture of topics, with each
topic defining a probability distribution over the words in
the vocabulary.
Documents are represented by a vector of mixture weights
over associated topics.
Document representations: LDA
[Figure: LDA plate diagram with nodes α, β, θ, z and w, and plates N and M.]
α is the parameter of the Dirichlet prior on the
per-document topic distributions, β is the parameter of the
Dirichlet prior on the per-topic word distribution, θ_m is the
topic distribution for document m, z_{mn} is the topic for the
nth word in document m, and w_{mn} is the specific word.
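The generative process implied by these variables can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not code from the talk: the function names, the symmetric priors (a single scalar for α and for β) and the toy sizes are all assumptions made for the example.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Draw an index according to the given probability vector."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_corpus(num_docs, doc_len, num_topics, vocab_size, alpha, beta, seed=0):
    """Sample documents from the LDA generative process (symmetric priors assumed)."""
    rng = random.Random(seed)
    # One word distribution per topic, each drawn from a Dirichlet(beta) prior.
    topic_word = [sample_dirichlet([beta] * vocab_size, rng) for _ in range(num_topics)]
    docs, thetas = [], []
    for _ in range(num_docs):
        theta = sample_dirichlet([alpha] * num_topics, rng)  # theta_m
        words = []
        for _ in range(doc_len):
            z = sample_categorical(theta, rng)               # z_mn: topic of word n
            w = sample_categorical(topic_word[z], rng)       # w_mn: the word itself
            words.append(w)
        docs.append(words)
        thetas.append(theta)
    return docs, thetas

# Toy sizes, chosen only for illustration.
docs, thetas = generate_corpus(num_docs=3, doc_len=8, num_topics=2,
                               vocab_size=5, alpha=0.5, beta=0.5)
```

Each entry of `thetas` is the vector of mixture weights that serves as the document representation; in practice it is inferred from observed words rather than sampled.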
Document representations: beyond LDA
Replicated softmax (Salakhutdinov and Hinton, 2009).
DocNADE (Larochelle and Lauly, 2012).
Generative models: recent trends
Variational inference: Neural variational inference (Miao,
Yu, Blunsom, 2016).
Generative Adversarial Networks: ?
Generative Adversarial Networks
Generative Adversarial Networks (GANs) involve a
min-max adversarial game between a generative model G
and a discriminative model D.
G(z) is a neural network that is trained to map samples z
from a prior noise distribution p(z) to the data space.
D(x) is another neural network that takes a data sample x
as input and outputs a single scalar value representing the
probability that x came from the data distribution instead of
G(z).
Generative Adversarial Networks
Source: https://ishmaelbelghazi.github.io/ALI
Generative Adversarial Networks
D is trained to maximise the probability of assigning the
correct label to the input x.
G is trained to maximally confuse D, using the gradient of
D(x) with respect to x to update its parameters.
$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p(z)}[\log(1 - D(G(z)))]$$
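To make the objective concrete, the sketch below estimates the value function from a batch of discriminator outputs. It is an illustration only: `gan_value` is a hypothetical helper, and the probabilities passed to it are made-up numbers rather than outputs of a trained network.

```python
import math

def gan_value(d_real, d_fake):
    """Monte Carlo estimate of E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs D(x) on real samples, each in (0, 1).
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that cannot tell real from generated outputs 0.5 everywhere.
confused = gan_value([0.5, 0.5], [0.5, 0.5])
# A discriminator that separates the two well scores higher.
confident = gan_value([0.9, 0.95], [0.1, 0.05])
```

D is updated to increase this quantity and G to decrease it. When D outputs 0.5 for every input, the value is log 0.5 + log 0.5 = -log 4.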
GAN samples
Source: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
https://arxiv.org/abs/1511.06434v2
GAN samples
Source: "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network"
https://arxiv.org/abs/1609.04802
Energy-based Generative Adversarial Networks
Source: Yann LeCun's slides on energy-based GANs, NIPS 2016.
Energy function: outputs low values on the data manifold,
higher values everywhere else.
Energy-based Generative Adversarial Networks
Source: Yann LeCun's slides on energy-based GANs, NIPS 2016.
Easy to push down energy of observed data via SGD.
How to choose where to push energy up?
Energy-based Generative Adversarial Networks
Source: Yann LeCun's slides on energy-based GANs, NIPS 2016.
Generator learns to pick points where the energy should
be increased.
Can view D as a learned objective function.
Energy-based Generative Adversarial Networks
The energy function is trained to push down on the energy
of real samples x, and to push up on the energy of
generated samples x̂ (f_D is the value to be minimised at
each iteration and m is a margin between positive and
negative energies):
$$f_D(x, z) = D(x) + \max(0, m - D(G(z)))$$
At each iteration, the generator G is trained adversarially
against D to minimize f_G:
$$f_G(z) = D(G(z))$$
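The two objectives above translate directly into code. In this minimal sketch the energies D(x) and D(G(z)) are passed in as plain numbers; the function names and the example values are assumptions made for illustration.

```python
def discriminator_loss(energy_real, energy_fake, margin):
    """f_D = D(x) + max(0, m - D(G(z))): low energy for real data,
    energy of at least `margin` for generated data."""
    return energy_real + max(0.0, margin - energy_fake)

def generator_loss(energy_fake):
    """f_G = D(G(z)): the generator tries to produce low-energy samples."""
    return energy_fake

# Generated sample still below the margin: the hinge term is active.
loss_active = discriminator_loss(energy_real=0.2, energy_fake=0.5, margin=1.0)
# Generated sample already above the margin: only the real-sample term remains.
loss_inactive = discriminator_loss(energy_real=0.2, energy_fake=1.5, margin=1.0)
```

Once a generated sample's energy exceeds the margin m, the hinge term is zero, so D stops pushing that sample's energy any higher.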
Energy-based Generative Adversarial Networks
In practice, the energy-based GAN formulation seems to
be easier to train.
Empirical results in "Energy-based Generative Adversarial
Network" (https://arxiv.org/abs/1609.03126) with more than
6500 experiments.
An adversarial document model
Can we use the GAN formulation to learn representations
of natural language documents?
Questions:
1. How to represent documents? GANs require everything to
be differentiable, but need to deal with discrete text.
2. How to get a representation? No explicit mapping back to
latent (z) space.
An adversarial document model
[Figure: architecture of the adversarial document model, with components z, G, x, C, Enc, h, Dec, MSE and D.]
Using an Energy-Based GAN to learn document representations. G is the generator, Enc and Dec are DAE encoder
and decoder networks, C is a corruption process (bypassed at test time) and D is the discriminator.
Input to discriminator is the binary bag-of-words
representation of a document: x ∈ {0, 1}^V.
Energy-based GAN with Denoising Autoencoder
discriminator.
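The input representation and the autoencoder-style energy can be sketched as follows. This is an illustration under stated assumptions, not the model from the talk: masking noise is an assumed choice for the corruption process C, the energy is taken to be the mean squared reconstruction error suggested by the MSE node in the figure, and `reconstruct` stands in for the trained Enc/Dec pair.

```python
import random

def binary_bow(tokens, vocab):
    """Map a tokenised document to x in {0, 1}^V: 1 if the word occurs, else 0."""
    present = set(tokens)
    return [1.0 if word in present else 0.0 for word in vocab]

def corrupt(x, drop_prob, rng):
    """Masking noise (assumed form of C): zero each entry with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else v for v in x]

def energy(x, reconstruct, rng=None, drop_prob=0.0):
    """Energy of x: mean squared error between x and its reconstruction.

    `reconstruct` is a placeholder for Dec(Enc(.)). Corruption is applied only
    when an rng is given, mirroring C being bypassed at test time.
    """
    x_in = corrupt(x, drop_prob, rng) if rng is not None else x
    x_hat = reconstruct(x_in)
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

# Hypothetical four-word vocabulary and document.
vocab = ["gan", "topic", "energy", "word"]
x = binary_bow("energy gan gan".split(), vocab)

perfect = energy(x, lambda v: v)                  # identity reconstruction
uninformed = energy(x, lambda v: [0.5] * len(v))  # constant reconstruction
noisy = energy(x, lambda v: v, rng=random.Random(0), drop_prob=0.5)
```

Word counts are discarded by `binary_bow`, so the repeated "gan" contributes a single 1. With the encoder in place, the hidden activation h would be read off as the document representation.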
Document retrieval evaluation
[Figure: precision-recall curves for ADM, ADM (AE), DocNADE and DAE. x-axis: Recall, 0.0001 to 1.0; y-axis: Precision, 0.0 to 0.6.]
Precision-recall curves for the document retrieval task on the 20 Newsgroups dataset. DocNADE is described in
(Larochelle and Lauly, 2012), ADM is the adversarial document model, ADM (AE) is the adversarial document
model with a standard Autoencoder as the discriminator (and so is similar to the Energy-Based GAN), and DAE is a
Denoising Autoencoder.
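One point on such a curve could be computed roughly as below. The slides do not spell out the evaluation protocol, so this sketch assumes a common setup: each query document is compared to a database of labelled documents by cosine similarity of their representations, the top fraction is retrieved, and precision is the share of retrieved documents with the query's label. All names and the toy vectors are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def precision_at_fraction(query_vec, query_label, db_vecs, db_labels, fraction):
    """Precision among the top `fraction` of database documents ranked by similarity."""
    ranked = sorted(range(len(db_vecs)),
                    key=lambda i: cosine(query_vec, db_vecs[i]),
                    reverse=True)
    k = max(1, round(fraction * len(ranked)))
    top = ranked[:k]
    return sum(db_labels[i] == query_label for i in top) / k

# Toy two-dimensional representations for four database documents.
db_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
db_labels = ["a", "a", "b", "b"]
query_vec, query_label = [1.0, 0.05], "a"

p_half = precision_at_fraction(query_vec, query_label, db_vecs, db_labels, 0.5)
p_all = precision_at_fraction(query_vec, query_label, db_vecs, db_labels, 1.0)
```

Averaging this quantity over all query documents at each retrieved fraction would give one curve per model.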
Qualitative evaluation: t-SNE plot
t-SNE visualizations of the document representations learned by the adversarial document model on the held-out
test dataset of 20 Newsgroups. The documents belong to 20 different topics, which correspond to different coloured
points in the figure.
Future work
Understanding why the DAE in the GAN discriminator
appears to produce significantly better representations
than a standalone DAE.
Exploring the impact of applying additional constraints to
the representation layer.
Conclusion
Showed that a variation on the recently proposed
Energy-Based GAN can be used to learn document
representations in an unsupervised setting.
The current formulation is still short of the state of the art, but
it is very early days for this line of research, so it is likely that we
can push this a lot further.
Suggested some interesting areas for future research.
More information
Introduction to GANs: http://blog.aylien.com/introduction-generative-adversarial-networks-code-tensorflow
Paper: https://sites.google.com/site/nips2016adversarial/home/accepted-papers

Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...
Charateristics of the Angara-A5 spacecraft launched from the Vostochny Cosmod...Christina Parmionova
 

Modeling documents with Generative Adversarial Networks - John Glover

  • 1. Modeling documents with Generative Adversarial Networks John Glover
  • 2. Overview Learning representations of natural language documents; a brief introduction to Generative Adversarial Networks; Energy-based Generative Adversarial Networks; an adversarial document model; future work & conclusion.
  • 3. Representation learning The ability to learn robust, reusable feature representations from unlabelled data has potential applications in a wide variety of machine learning tasks, such as data retrieval and classification. One way to create such representations is to train deep generative models that can learn to capture the complex distributions of real-world data.
  • 5. Document representations: LDA The traditional approach to learning document representations is a topic model such as LDA. In LDA, documents consist of a mixture of topics, with each topic defining a probability distribution over the words in the vocabulary. Documents are represented by a vector of mixture weights over the associated topics.
  • 6. Document representations: LDA [Figure: LDA plate diagram with nodes α, β, θ, z, w and plates N, M.] α is the parameter of the Dirichlet prior on the per-document topic distributions, β is the parameter of the Dirichlet prior on the per-topic word distribution, $\theta_m$ is the topic distribution for document m, $z_{mn}$ is the topic for the nth word in document m, and $w_{mn}$ is the specific word.
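The generative process described on the LDA slides can be sketched in a few lines of Python. This is a minimal illustration rather than a topic-model implementation: it assumes symmetric Dirichlet priors, a fixed document length, and integer word ids, and the names `generate_corpus`, `sample_dirichlet` and `sample_categorical` are hypothetical helpers that do not come from the slides.

```python
import random


def sample_dirichlet(concentration, size, rng):
    """Draw one sample from a symmetric Dirichlet via normalised Gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Return an index drawn according to the probabilities in probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_corpus(num_docs, doc_length, num_topics, vocab_size,
                    alpha, beta, seed=0):
    """Sample documents (lists of word ids) from the LDA generative process."""
    rng = random.Random(seed)
    # One word distribution per topic, each drawn from Dirichlet(beta).
    topic_word = [sample_dirichlet(beta, vocab_size, rng)
                  for _ in range(num_topics)]
    corpus, thetas = [], []
    for _ in range(num_docs):                      # plate M: documents
        theta = sample_dirichlet(alpha, num_topics, rng)   # theta_m
        words = []
        for _ in range(doc_length):                # plate N: words
            z = sample_categorical(theta, rng)             # z_mn
            w = sample_categorical(topic_word[z], rng)     # w_mn
            words.append(w)
        corpus.append(words)
        thetas.append(theta)
    return corpus, thetas


corpus, thetas = generate_corpus(num_docs=3, doc_length=10, num_topics=4,
                                 vocab_size=20, alpha=0.5, beta=0.5)
```

Each entry of `thetas` is the vector of mixture weights that LDA would use as the representation of the corresponding document.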
  • 7. Document representations: beyond LDA Replicated softmax (Salakhutdinov and Hinton, 2009). DocNADE (Larochelle and Lauly, 2012).
  • 8. Generative models: recent trends Variational inference: Neural variational inference (Miao, Yu, Blunsom, 2016). Generative Adversarial Networks: ?
  • 9. Generative Adversarial Networks Generative Adversarial Networks (GANs) involve a min-max adversarial game between a generative model G and a discriminative model D. G(z) is a neural network that is trained to map samples z from a prior noise distribution p(z) to the data space. D(x) is another neural network that takes a data sample x as input and outputs a single scalar value representing the probability that x came from the data distribution instead of G(z).
  • 10. Generative Adversarial Networks source: https://ishmaelbelghazi.github.io/ALI
  • 11. Generative Adversarial Networks D is trained to maximise the probability of assigning the correct label to the input x. G is trained to maximally confuse D, using the gradient of D(x) with respect to x to update its parameters. $$\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p(z)}[\log(1 - D(G(z)))]$$
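As a rough illustration of the min-max objective above, the two expectations can be estimated from a batch of discriminator outputs. The sketch below assumes the outputs of D on real and generated samples are already available as lists of probabilities strictly between 0 and 1; `gan_value` is a hypothetical helper name, not something defined in the slides.

```python
import math


def gan_value(d_real, d_fake):
    """Monte Carlo estimate of E[log D(x)] + E[log(1 - D(G(z)))].

    d_real: discriminator outputs D(x) on real samples.
    d_fake: discriminator outputs D(G(z)) on generated samples.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# D tries to increase this value; G tries to decrease it.
confused = gan_value([0.5, 0.5], [0.5, 0.5])    # D cannot tell real from fake
confident = gan_value([0.9, 0.9], [0.1, 0.1])   # D separates the two well
```

When D outputs 0.5 everywhere the estimate equals -log 4 (about -1.386), and it moves towards 0 as D becomes more accurate, which is the direction D pushes in and G pushes against.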
  • 12. GAN samples Source: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks https://arxiv.org/abs/1511.06434v2
  • 13. GAN samples Source: "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network" https://arxiv.org/abs/1609.04802
  • 14. Energy-based Generative Adversarial Networks Source: Yann LeCun's slides on energy-based GANs, NIPS 2016. Energy function: outputs low values on the data manifold, higher values everywhere else.
  • 15. Energy-based Generative Adversarial Networks Source: Yann LeCun's slides on energy-based GANs, NIPS 2016. Easy to push down energy of observed data via SGD. How to choose where to push energy up?
  • 16. Energy-based Generative Adversarial Networks Source: Yann LeCun's slides on energy-based GANs, NIPS 2016. Generator learns to pick points where the energy should be increased. Can view D as a learned objective function.
  • 17. Energy-based Generative Adversarial Networks The energy function is trained to push down on the energy of real samples x, and to push up on the energy of generated samples $\hat{x} = G(z)$. With $f_D$ the value to be minimised at each iteration and m a margin between positive and negative energies: $$f_D(x, z) = D(x) + \max(0, m - D(G(z)))$$ At each iteration, the generator G is trained adversarially against D to minimise $f_G$: $$f_G(z) = D(G(z))$$
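The two energy-based losses are simple enough to write out directly. The sketch below assumes the energies D(x) and D(G(z)) have already been computed as plain floats; the function names are illustrative and not taken from the slides.

```python
def discriminator_loss(energy_real, energy_fake, margin):
    """f_D = D(x) + max(0, m - D(G(z))): low energy on data, hinge on samples."""
    return energy_real + max(0.0, margin - energy_fake)


def generator_loss(energy_fake):
    """f_G = D(G(z)): the generator seeks low-energy samples."""
    return energy_fake


# Generated sample still inside the margin: the hinge term is active.
inside = discriminator_loss(energy_real=0.25, energy_fake=0.5, margin=1.0)   # 0.75
# Generated sample already beyond the margin: only the real energy counts.
outside = discriminator_loss(energy_real=0.25, energy_fake=1.5, margin=1.0)  # 0.25
```

The hinge means D stops pushing up the energy of a generated sample once it exceeds the margin m, so the discriminator's effort goes to samples that still look plausible.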
  • 18. Energy-based Generative Adversarial Networks In practice, the energy-based GAN formulation seems to be easier to train. Empirical results in "Energy-based Generative Adversarial Network" (https://arxiv.org/abs/1609.03126) with more than 6500 experiments.
  • 19. An adversarial document model Can we use the GAN formulation to learn representations of natural language documents? Questions: 1. How to represent documents? GANs require everything to be differentiable, but text is discrete. 2. How to get a representation? There is no explicit mapping back to the latent (z) space.
  • 20. An adversarial document model [Figure: model diagram showing z, G, x, C, Enc, h, Dec, MSE and D.] Using an Energy-Based GAN to learn document representations. G is the generator, Enc and Dec are DAE encoder and decoder networks, C is a corruption process (bypassed at test time) and D is the discriminator. Input to the discriminator is the binary bag-of-words representation of a document: $x \in \{0, 1\}^V$. Energy-based GAN with Denoising Autoencoder discriminator.
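A minimal sketch of the discriminator side of this model follows, under several assumptions that the slides do not state: single-layer sigmoid encoder and decoder, masking noise as the corruption process C, randomly initialised weights, and the encoder output h taken as the document representation (our reading of the diagram). All function names are hypothetical.

```python
import math
import random


def bag_of_words(tokens, vocab):
    """Binary bag-of-words vector x in {0, 1}^V for one tokenised document."""
    present = set(tokens)
    return [1.0 if word in present else 0.0 for word in vocab]


def corrupt(x, drop_prob, rng):
    """Corruption process C (assumed here to be masking noise)."""
    return [0.0 if rng.random() < drop_prob else v for v in x]


def dense_sigmoid(weights, bias, inputs):
    """One fully connected layer followed by a sigmoid."""
    return [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, inputs)) + b)))
            for row, b in zip(weights, bias)]


def init_params(vocab_size, hidden_size, rng):
    """Small random weights for the encoder (Enc) and decoder (Dec)."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {"enc_w": matrix(hidden_size, vocab_size), "enc_b": [0.0] * hidden_size,
            "dec_w": matrix(vocab_size, hidden_size), "dec_b": [0.0] * vocab_size}


def dae_energy(x, params, rng=None, drop_prob=0.0):
    """Return (energy, h): reconstruction MSE and the encoder output."""
    # Passing rng=None bypasses the corruption step, as at test time.
    x_in = corrupt(x, drop_prob, rng) if rng is not None else x
    h = dense_sigmoid(params["enc_w"], params["enc_b"], x_in)
    x_hat = dense_sigmoid(params["dec_w"], params["dec_b"], h)
    energy = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    return energy, h


vocab = ["gan", "energy", "topic", "word"]
x = bag_of_words(["gan", "word", "gan"], vocab)      # [1.0, 0.0, 0.0, 1.0]
params = init_params(len(vocab), hidden_size=2, rng=random.Random(0))
energy, h = dae_energy(x, params)                     # test-time path, no corruption
```

In training, this energy would stand in for D(x) in the energy-based losses, with the generator producing candidate vectors in the same V-dimensional space.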
  • 21. Document retrieval evaluation [Figure: precision-recall curves for ADM, ADM (AE), DocNADE and DAE; x-axis Recall (0.0001 to 1.0), y-axis Precision (0.0 to 0.6).] Precision-recall curves for the document retrieval task on the 20 Newsgroups dataset. DocNADE is described in (Larochelle and Lauly, 2012), ADM is the adversarial document model, ADM (AE) is the adversarial document model with a standard Autoencoder as the discriminator (and so is similar to the Energy-Based GAN), and DAE is a Denoising Autoencoder.
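The slides do not spell out the retrieval protocol, so the following is only one plausible sketch: rank the stored documents by cosine similarity between representations and measure the fraction of retrieved documents that share the query's label, at several retrieval depths. The similarity measure, the use of retrieval fractions, and the function names are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def precision_at_fractions(query, query_label, docs, labels, fractions):
    """Precision among the top-ranked documents for each retrieval fraction."""
    ranked = sorted(range(len(docs)),
                    key=lambda i: cosine(query, docs[i]), reverse=True)
    results = {}
    for frac in fractions:
        k = max(1, int(round(frac * len(docs))))   # number of documents retrieved
        hits = sum(1 for i in ranked[:k] if labels[i] == query_label)
        results[frac] = hits / k
    return results


docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = ["a", "a", "b", "b"]
scores = precision_at_fractions([1.0, 0.0], "a", docs, labels, [0.5, 1.0])
```

Averaging such scores over all held-out queries gives one point per retrieval depth on a curve like the one in the figure.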
  • 22. Qualitative evaluation: t-SNE plot t-SNE visualization of the document representations learned by the adversarial document model on the held-out test dataset of 20 Newsgroups. The documents belong to 20 different topics, which correspond to different coloured points in the figure.
  • 23. Future work Understanding why the DAE in the GAN discriminator appears to produce significantly better representations than a standalone DAE. Exploring the impact of applying additional constraints to the representation layer.
  • 24. Conclusion Showed that a variation on the recently proposed Energy-Based GAN can be used to learn document representations in an unsupervised setting. The current formulation is still short of the state of the art, but it is very early days for this line of research, so it is likely that it can be pushed a lot further. Suggested some interesting areas for future research.
  • 25. More information Introduction to GANs: http://blog.aylien.com/introduction-generative-adversarial-networks-code-tensorflow Paper: https://sites.google.com/site/nips2016adversarial/home/accepted-papers