Advanced SEO
COMPETITIVE INTELLIGENCE,
MODERN WEB SCRAPING, & MORE
Hello!
I am Melissa Sciorra
Sr Manager, SEO @ SmarterTravel (a TripAdvisor Company)
@mel_arroics
#cmc2019
FamilyVacationCritic.com | SmarterTravel.com | Jetsetter.com | WhatToPack.com | AirfareWatchdog.com | Oyster.com
SHOW OF HANDS
How many people in this session work on Search Engine Optimization?
How many people in this session have used Screaming Frog?
How many people in this session have used XPath?
Agenda
How SEOs and Content Teams can work
together, SMARTER
Web Scraping technology & intro to XPath
Elements of webpages that can be
EXTRACTED
REAL LIFE USE CASES so you can go into
work next week and impress your boss
EVOLVE@mel_arroics #cmc2019
Content Strategy
Workflow
Ideation Stage = the time to brainstorm topics.
Brainstorming topics can consist of:
- aha moments
- discovering topics through reading
- watching TV
- something you care about
Workflow stages: Idea | Writing & Optimization | Publishing
“Web Scraping is a way of automating the
process of gathering information from
different websites on the Internet.”
XPath
A query language that describes a way to find and
process items in XML (and HTML) documents
(Short For XML Path Language)
It’s supported by modern web browsers
In plain ENGLISH:
You can select any element, attribute, table,
content of an element, or meta object in a webpage.
Let’s See an Example
“I want to find all <h3> tags in my blog post.”
Out of the box, SCREAMING FROG extracts up to 2 <h1> and 2 <h2> per page,
but it doesn’t extract <h3>, and it doesn’t crawl more than
2 header tag types.
Screaming Frog
Custom Extraction
//h1  extract all <h1>
//h2  extract all <h2>
//h3  extract all <h3>
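To try the same idea outside Screaming Frog, here is a minimal sketch using Python's standard library. It assumes a small, well-formed sample page (hypothetical, not from the talk); ElementTree supports only a subset of XPath and writes //h3 as .//h3.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (not from the talk).
PAGE = """<html><body>
<h1>Packing Guide</h1>
<h2>Carry-On</h2><h3>Liquids</h3><h3>Electronics</h3>
<h2>Checked Bags</h2><h3>Weight Limits</h3>
</body></html>"""

root = ET.fromstring(PAGE)

def extract_all(tag):
    # ".//h3" is ElementTree's spelling of the XPath "//h3":
    # match the tag anywhere below the root element.
    return [el.text for el in root.findall(f".//{tag}")]

h3_texts = extract_all("h3")
print(h3_texts)
```

Real-world pages are often not well-formed, so a forgiving HTML parser would likely be needed before running queries like this on live sites.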
PSA: The internet is a collection of pages. LOTS of pages.
Every website is built differently from the next.
It’s all HTML, CSS, JavaScript, etc.
Some are built well. Some are not.
Inconsistency in coding can make data collection hard.
…XPath can help!
XPath: Location Paths
XPath expressions can begin at the root node (the top of the document) with /
/ selects the entire document
/html/head selects contents of the head element only
/html/head/title selects contents of a title element
Node-by-node paths are important to understand for XPath, but not necessary to use
//title selects the title element no matter where it is
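A minimal sketch of the two styles of location path, assuming a tiny hypothetical page. Python's ElementTree starts from the <html> element rather than the document root, so /html/head/title becomes head/title and //title becomes .//title.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page; the root element returned here is <html>.
PAGE = "<html><head><title>Denver Guide</title></head><body><p>Hi</p></body></html>"
root = ET.fromstring(PAGE)

# Node-by-node: the XPath /html/head/title, written relative to <html>.
by_steps = root.find("head/title")

# Anywhere in the document: the XPath //title.
anywhere = root.find(".//title")

print(by_steps.text, "|", anywhere.text)
```

Both lookups land on the same element; the second keeps working if the title moves to a different depth.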
What if you want to extract all of the links on a page?
A link is defined by <a href="www.website.com/example"></a>
Your XPath syntax should be //a/@href
This is because //@href would give you ALL href attributes, from any element,
including <link> references to CSS files and so on.
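The difference between //a/@href and //@href can be sketched as below. The page markup is hypothetical, and because ElementTree cannot select attributes directly, the code selects the elements and then reads their href values.

```python
import xml.etree.ElementTree as ET

# Hypothetical page with one stylesheet reference and two real links.
PAGE = """<html><head>
<link rel="stylesheet" href="/css/site.css"/>
</head><body>
<a href="https://www.website.com/example">Example</a>
<a href="/about">About</a>
</body></html>"""
root = ET.fromstring(PAGE)

# Equivalent of //a/@href: href values from <a> elements only.
anchor_links = [a.get("href") for a in root.findall(".//a[@href]")]

# Equivalent of //@href: href values from any element, stylesheet included.
all_hrefs = [el.get("href") for el in root.findall(".//*[@href]")]

print(anchor_links)
print(all_hrefs)
```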
The Tools You
Need
Screaming Frog
https://bit.ly/29AEs8Q
Google Chrome
http://bit.ly/2CqZqp7
Scraper for Chrome
http://bit.ly/2W6dbAT
XPath Helper
https://bit.ly/2n8gtTC
Make sure developer
tools is enabled.
Screaming Frog
SCREAMING FROG XPATH EXTRACTION
• 10 fields allow you to insert XPath, CSSPath, or RegEx to search and extract custom elements
• Includes Syntax Validator
Extract HTML Element
The selected element and all of its inner HTML content.
Extract Inner HTML
The inner HTML content of the selected element; if the selected element contains other HTML elements, they’ll be included.
Extract Text
The text content of a selected element and the text content of any sub elements.
Tip: You choose what you want to extract
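To make the three extraction modes concrete, here is a rough sketch that mimics them on one hypothetical element. The helper names are made up for illustration; this is not how Screaming Frog implements the options.

```python
import xml.etree.ElementTree as ET

# Hypothetical element with mixed text and a child tag.
el = ET.fromstring('<div class="bio">Written by <b>Mel</b>, SEO</div>')

def outer_html(e):
    # "Extract HTML Element": the element itself plus everything inside it.
    return ET.tostring(e, encoding="unicode")

def inner_html(e):
    # "Extract Inner HTML": leading text plus each child serialized with its tail.
    return (e.text or "") + "".join(outer_html(child) for child in e)

def text_only(e):
    # "Extract Text": text of the element and all of its sub elements.
    return "".join(e.itertext())

print(outer_html(el))
print(inner_html(el))
print(text_only(el))
```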
Google Chrome
Developer Tools
Scraper For Chrome
XPath Helper For Chrome
How To Use XPath
In Your Day-to-Day
1. Extract External Lists
“Airlines + Luggage Policy” = Opportunity
1. I Need To Find All The Airlines To Create A Keyword Tree To Provide My Content Team.
2. I Find A Ranker.com Page That Lists Out All Airlines, But If I Copy + Paste Into Excel It Would Be Messy.
3. Right-click On An Airline Header > Scrape Similar
Shorten The XPath To //h2/div/a To Collect ALL Airline <h2>
2. Extract Article Publish Date
Updated Date Schema = HIGHER CTR%
I Need To Provide My Content Team With High-performing URLs That Need To Be Reviewed And Updated.
1. Identify Top Pages In Google Search Console, Export, And Open A Page In Your Browser:
https://www.jetsetter.com/magazine/cool-things-to-do-in-denver/
2. Find The Date Of Your Article And Right-click > Inspect.
3. Right-click On The Highlighted Entry > Copy > Copy XPath:
//*[@id="container-scroll"]/div/div[2]/div[2]/div[1]/div/span[2]/time
4. Close Source Code And Open XPath Helper. Paste Your Copied XPath Into “Query” And Make Sure It Returns The Date Result.
5. Open Screaming Frog > Configuration > Custom > Extraction.
6. Paste Your XPath Expression And Name It. Select Extract Inner HTML. Check For Checkmark Validation.
7. Paste Your Top URLs Into Screaming Frog And Crawl.
Find Your Extractions Under Custom Tab > Extraction Filter
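For a quick check outside the crawler, the date lookup can be sketched as below. The markup, the date value, and the datetime attribute are assumptions for illustration, not the real Jetsetter page. It uses a shorter query (find the container by id, then any <time> inside it) instead of the copied positional path.

```python
import xml.etree.ElementTree as ET

# Hypothetical article markup; the real page structure will differ.
PAGE = """<html><body><div id="container-scroll">
<div><span>By Staff</span>
<span><time datetime="2019-03-01">March 1, 2019</time></span></div>
</div></body></html>"""
root = ET.fromstring(PAGE)

# Step 1: locate the container, like //*[@id="container-scroll"].
container = root.find(".//*[@id='container-scroll']")

# Step 2: find the <time> element anywhere inside it.
time_el = container.find(".//time")

print(time_el.text, "|", time_el.get("datetime"))
```

A short query like this tends to survive small template changes better than a long chain of div[2]/div[1] steps.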
3. Analyze Competitor’s Article Titles
Competitive Analysis = Find Content Gaps
I Want To See The Main Themes Of What They Are Writing About To Begin My Competitive Analysis.
1. Run A Crawl Of Competitor’s Website, Or Extract Highest Performing URLs From SEMRush And Crawl.
2. Download <H1> Or Title Tags.
3. Paste Into A Text Analyzer, Like online-utility.org
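The text-analysis step can also be approximated with a short script. This sketch counts the most common words across a few hypothetical titles; the titles and the stopword list are made up for illustration.

```python
from collections import Counter
import re

# Hypothetical competitor titles exported from a crawl.
titles = [
    "Best Packing Tips For Europe",
    "Packing List For A Beach Trip",
    "What To Do In Denver",
    "Carry-On Packing Tips",
]

# Small illustrative stopword list; extend it for real data.
STOPWORDS = {"a", "for", "in", "to", "the", "what", "do", "best", "on"}

def top_terms(titles, n=5):
    words = []
    for title in titles:
        # Lowercase, split into words, and drop stopwords.
        words += [w for w in re.findall(r"[a-z']+", title.lower())
                  if w not in STOPWORDS]
    return Counter(words).most_common(n)

print(top_terms(titles, 2))
```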
4. Extract YouTube Video Titles And Tags
New Video Strategy = More Visibility
I Need To See Where To Start With My Video SEO Strategy.
1. Visit The YouTube Channel And Load Up Videos Until You Can’t Load Anymore Under Channel Videos
2. Right-click On A Video Title And Select Scrape Similar
3. Export To Google Docs
4. Add YouTube.com Through A Concatenate Formula Onto All URLs:
=CONCATENATE("https://www.youtube.com",B2)
5. Paste Full URLs Into Screaming Frog.
6. Export Crawl Into Excel To Analyze Title, Meta Description, And Meta Keywords.
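The same join can be done in a script when the list is long. A minimal sketch, assuming the scraped values are site-relative paths (the example paths are hypothetical):

```python
from urllib.parse import urljoin

BASE = "https://www.youtube.com"

# Hypothetical relative paths as exported from Scraper.
scraped_paths = ["/watch?v=abc123", "/watch?v=def456"]

# urljoin resolves each relative path against the base URL.
full_urls = [urljoin(BASE, path) for path in scraped_paths]

for url in full_urls:
    print(url)
```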
5. Find Pages With Specific Anchor Text
Extract Certain On-Page Links = More Opportunity
I Want To See If Any Of My On-page Link Anchor Text Contains “Amazon”.
1. Open Screaming Frog.
2. Enter The Below Formula Into Configuration > Custom > Extraction:
//a[contains(translate(.,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'amazon')]/@href
3. Replace 'amazon' With Other Anchor Text You Want To Search (Keep It Lowercase, Since The Formula Lowercases The Anchor Text Before Comparing). Extract Inner HTML.
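The translate() trick exists because XPath 1.0 has no lowercase function. In a script the same case-insensitive match is simpler. This sketch does the filtering in plain Python, since ElementTree does not support contains() or translate(); the page and function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical page with mixed-case anchor text.
PAGE = """<html><body>
<a href="https://example.com/a">Buy on AMAZON</a>
<a href="https://example.com/b">Read <b>Amazon</b> reviews</a>
<a href="https://example.com/c">Our packing list</a>
</body></html>"""
root = ET.fromstring(PAGE)

def hrefs_with_anchor_text(root, needle):
    needle = needle.lower()
    # Join all text inside each <a>, lowercase it, then test for the needle.
    return [a.get("href") for a in root.iter("a")
            if needle in "".join(a.itertext()).lower()]

amazon_links = hrefs_with_anchor_text(root, "Amazon")
print(amazon_links)
```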
6. Find Pages That Contain External Links From Specific Sites
Optimize Profitable Pages = More Money
I Want To Extract A List Of All My Affiliate URLs (fave.co)
1. Open Screaming Frog.
2. Enter The Below Formula Into Configuration > Custom > Extraction:
//a[contains(@href,'fave.co')]/@href
3. Extract Inner HTML And Crawl Your Website To Find Your URLs That Contain fave.co.
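A rough sketch of the same check across several pages, assuming the page markup is already downloaded. The URLs, markup, and function name are hypothetical; the substring test mirrors contains(@href,'fave.co').

```python
import xml.etree.ElementTree as ET

# Hypothetical crawl results: page URL -> well-formed page markup.
pages = {
    "/packing-list": '<html><body><a href="https://fave.co/abc">Bag</a>'
                     '<a href="/about">About</a></body></html>',
    "/denver": '<html><body><a href="https://example.com/">Source</a></body></html>',
}

def affiliate_links(markup, domain="fave.co"):
    root = ET.fromstring(markup)
    # Keep only <a> href values that contain the affiliate domain.
    return [a.get("href") for a in root.findall(".//a[@href]")
            if domain in a.get("href")]

# Map each page to its affiliate links, skipping pages that have none.
report = {}
for url, markup in pages.items():
    links = affiliate_links(markup)
    if links:
        report[url] = links

print(report)
```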
7. Find Your Content Fans For Outreach
Your Fans = Interested IN YOU
I Want To Reach Out To People Who Left Comments On My Site And Let Them Know About A New Piece Of Content.
Most Users Who Comment On WordPress Blogs Enter Their Name And Website.
If This Is Something You Or Your Competitor Has
Enabled, Scrape The Names And Websites Of The
Commenters To Reach Out And Tell Them About Your
Content.
8. Analyze Which Of Your Content Performs Best
Finding Valuable Category Types = Content Opportunity
I Want To Find Which Type Of Content Gets The Most Organic Clicks.
1. Pull Top 100 URLs From Google Search Console And Paste Into Screaming Frog.
2. Open A Sample URL And Find The Location Of Your Primary Tag.
3. Copy XPath (Right-click, Inspect, Copy XPath)
4. Paste Formula Into Screaming Frog Custom Extraction.
//*[@id="container-scroll"]/div/div[2]/div[1]/div[1]
5. Combine Tag Data With Google Search Console Data Via VLOOKUP And Create A Pivot Table. Create A Bar Chart.
Clicks By Tag
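The VLOOKUP plus pivot table step can be sketched in a few lines of code. All URLs, click counts, and tags below are hypothetical; the dictionary lookup plays the role of VLOOKUP and the running totals play the role of the pivot table.

```python
from collections import defaultdict

# Hypothetical Google Search Console export: URL -> organic clicks.
gsc_clicks = {"/a": 120, "/b": 80, "/c": 40}

# Hypothetical Screaming Frog extraction: URL -> primary tag.
page_tags = {"/a": "Packing", "/b": "Destinations", "/c": "Packing"}

# Look up each URL's tag and add its clicks to that tag's total.
clicks_by_tag = defaultdict(int)
for url, clicks in gsc_clicks.items():
    tag = page_tags.get(url, "(no tag)")
    clicks_by_tag[tag] += clicks

# Rough text bar chart, highest total first (one # per 20 clicks).
for tag, clicks in sorted(clicks_by_tag.items(), key=lambda kv: -kv[1]):
    print(f"{tag:<14}{'#' * (clicks // 20)} {clicks}")
```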
XPATH                                                 OUTPUT
//h1                                                  Extract all H1 tags
//h3[1]                                               Extract the first H3 tag
//h3[2]                                               Extract the second H3 tag
//div/p                                               Extract any <p> contained within a <div>
//div[@class='author']                                Extract any <div> with class “author”
//p[@class='bio']                                     Extract any <p> with class “bio”
//*[@class='bio']                                     Extract any element with class “bio”
//ul/li[last()]                                       Extract the last <li> in a <ul>
//ol[@class='cat']/li[1]                              Extract the first <li> in a <ol> with class “cat”
count(//h2)                                           Count the number of H2’s (set extraction filter to “Function Value”)
//a[contains(.,'click here')]                         Extract any link with anchor text containing “click here”
//a[starts-with(@title,'Written by')]                 Extract any link with a title starting with “Written by”
//@href                                               Extract all links
//a[starts-with(@href,'mailto')]/@href                Extract link that starts with “mailto” (email address)
//meta[@property='article:published_time']/@content   Extract the article publish date (commonly-found meta tag on WP)
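A few of the rows above can be tried with Python's standard library. This is a sketch on a hypothetical page, and ElementTree covers only part of XPath: attribute values are read with .get(), and count() is replaced by len().

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page for illustration.
PAGE = """<html><head>
<meta property="article:published_time" content="2019-04-02T09:00:00+00:00"/>
</head><body>
<h2>One</h2><h2>Two</h2>
<ul><li>first</li><li>last</li></ul>
<ol class="cat"><li>Travel</li><li>Tips</li></ol>
</body></html>"""
root = ET.fromstring(PAGE)

# //meta[@property='article:published_time']/@content
published = root.find(".//meta[@property='article:published_time']").get("content")

# count(//h2)
h2_count = len(root.findall(".//h2"))

# //ul/li[last()]
last_li = root.find(".//ul/li[last()]").text

# //ol[@class='cat']/li[1]
first_cat = root.find(".//ol[@class='cat']/li[1]").text

print(published, h2_count, last_li, first_cat)
```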
Keep Learning!
• https://www.linkedin.com/pulse/secret-increasing-organic-ctr-2019-updating-your-article-sciorra/
• https://builtvisible.com/seo-guide-to-xpath/
• https://www.screamingfrog.co.uk/web-scraping/
• https://www.w3schools.com/xml/xpath_intro.asp
• https://www.pmg.com/blog/how-to-use-xpath-in-screaming-frog/
• https://uproer.com/articles/screaming-frog-custom-extraction-xpath-regex/
• https://ahrefs.com/blog/web-scraping-for-marketers/
Thanks!
Any questions?
@mel_arroics
#cmc2019
Snapshot of Consumer Behaviors of March 2024-EOLiSurvey (EN).pdf
 
How videos can elevate your Google rankings and improve your EEAT - Benjamin ...
How videos can elevate your Google rankings and improve your EEAT - Benjamin ...How videos can elevate your Google rankings and improve your EEAT - Benjamin ...
How videos can elevate your Google rankings and improve your EEAT - Benjamin ...
 
BLOOM_April2024. Balmer Lawrie Online Monthly Bulletin
BLOOM_April2024. Balmer Lawrie Online Monthly BulletinBLOOM_April2024. Balmer Lawrie Online Monthly Bulletin
BLOOM_April2024. Balmer Lawrie Online Monthly Bulletin
 
Best Persuasive selling skills presentation.pptx
Best Persuasive selling skills  presentation.pptxBest Persuasive selling skills  presentation.pptx
Best Persuasive selling skills presentation.pptx
 
Do More with Less: Navigating Customer Acquisition Challenges for Today's Ent...
Do More with Less: Navigating Customer Acquisition Challenges for Today's Ent...Do More with Less: Navigating Customer Acquisition Challenges for Today's Ent...
Do More with Less: Navigating Customer Acquisition Challenges for Today's Ent...
 
Digital Marketing Spotlight: Lifecycle Advertising Strategies.pdf
Digital Marketing Spotlight: Lifecycle Advertising Strategies.pdfDigital Marketing Spotlight: Lifecycle Advertising Strategies.pdf
Digital Marketing Spotlight: Lifecycle Advertising Strategies.pdf
 
The Pitfalls of Keyword Stuffing in SEO Copywriting
The Pitfalls of Keyword Stuffing in SEO CopywritingThe Pitfalls of Keyword Stuffing in SEO Copywriting
The Pitfalls of Keyword Stuffing in SEO Copywriting
 
CALL ON ➥8923113531 🔝Call Girls Hazratganj Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Hazratganj Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Hazratganj Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Hazratganj Lucknow best sexual service Online
 
ASO Process: What is App Store Optimization
ASO Process: What is App Store OptimizationASO Process: What is App Store Optimization
ASO Process: What is App Store Optimization
 
BrightonSEO - Addressing SEO & CX - CMDL - Apr 24 .pptx
BrightonSEO -  Addressing SEO & CX - CMDL - Apr 24 .pptxBrightonSEO -  Addressing SEO & CX - CMDL - Apr 24 .pptx
BrightonSEO - Addressing SEO & CX - CMDL - Apr 24 .pptx
 
Cost-effective tactics for navigating CPC surges
Cost-effective tactics for navigating CPC surgesCost-effective tactics for navigating CPC surges
Cost-effective tactics for navigating CPC surges
 
Local SEO Domination: Put your business at the forefront of local searches!
Local SEO Domination:  Put your business at the forefront of local searches!Local SEO Domination:  Put your business at the forefront of local searches!
Local SEO Domination: Put your business at the forefront of local searches!
 
GreenSEO April 2024: Join the Green Web Revolution
GreenSEO April 2024: Join the Green Web RevolutionGreenSEO April 2024: Join the Green Web Revolution
GreenSEO April 2024: Join the Green Web Revolution
 
Forecast of Content Marketing through AI
Forecast of Content Marketing through AIForecast of Content Marketing through AI
Forecast of Content Marketing through AI
 
Call Us ➥9654467111▻Call Girls In Delhi NCR
Call Us ➥9654467111▻Call Girls In Delhi NCRCall Us ➥9654467111▻Call Girls In Delhi NCR
Call Us ➥9654467111▻Call Girls In Delhi NCR
 
The Impact of Digital Technologies
The Impact of Digital Technologies The Impact of Digital Technologies
The Impact of Digital Technologies
 
TAM AdEx 2023 Cross Media Advertising Recap - Auto Sector
TAM AdEx 2023 Cross Media Advertising Recap - Auto SectorTAM AdEx 2023 Cross Media Advertising Recap - Auto Sector
TAM AdEx 2023 Cross Media Advertising Recap - Auto Sector
 

#CMC2019: Advanced SEO: Competitive intelligence, Web Scraping, and More.

Editor's Notes

  1. Welcome to Advanced SEO, competitive intelligence, modern web scraping, and more.
  2. My name is Melissa Sciorra, and I’m currently the senior manager of SEO at SmarterTravel, a TripAdvisor company. We own and operate travel websites that reach nearly 200 million unique visitors each month. You may have heard of some of my sites, including Jetsetter.com, Airfarewatchdog.com, Oyster.com, and our newest site, WhatToPack.com. Feel free to tweet at me using my handle, @mel_arroics, and use the hashtag CMC2019. I want to preface this talk with a disclaimer: I’ve been in SEO for almost 9 years, and I’m by no means a developer who is proficient in Python. We all know that in SEO, sometimes things can get a little repetitive, and I’ve discovered ways of fueling my research that save time, automate processes, and provide competitive insights. This topic gets very technical very quickly, so I’m going to try to break it down to a level where anyone can understand and use these functions to make custom extraction easy.
  3. Quick poll: How many people in this session work in SEO full time? How many people in this session work in SEO part time? How many people have used Screaming Frog? How many people have never heard of Screaming Frog? How many people have used XPath? How many people have never heard of XPath?
  4. Today, we are going to learn how SEOs can automate research processes to help fuel their own competitive research and to help provide insights to content teams. We’re going to dive into web scraping technology in today’s age, and what XPath is. We’ll go over elements of webpages that can be extracted using real-life examples, and by the end of this session, you’ll have takeaways that you can start using at work to impress your boss, your colleagues, your friends, and maybe even your mothers.
  5. Let’s dive in. We know that SEO in 2019 is still about creating really awesome content for our users. This means you and your team must continuously come up with great ideas, or find great ideas from existing posts, search query reports, or competitive analysis and content gaps. Content strategy begins with the ideation stage, and brainstorming topics can consist of aha moments, watching TV, things you are passionate about, and more.
  6. You can also come up with ideas through web scraping. That is, scraping what your competitors are doing, and this starts with the type of content they are writing about. What is web scraping? A way of automating the process of gathering information from different sites on the internet. The trick with web scraping is that you have to have a basic understanding of how a web page’s markup is laid out. This, plus an understanding of XPath, helps you extract data quickly and easily.
  7. So what is XPath and how can it make my life easier? XPath is a query language for selecting pieces of information in an XML document. It allows you to extract elements, attributes, and objects from the HTML in a webpage. It’s supported by most web browsers. This means that any website, your own website and your competitors’ websites, can be scraped for information that you want based on commands you write in XPath.
  8. Let’s see an example. For those of you who have used Screaming Frog before, we know that the H1 and H2 tags can be pulled automatically with every site crawl, but let’s say we want to also identify and analyze H3 tags.
  9. I’d open the custom extraction field in Screaming Frog and enter the syntax for H3, which is //h3. The two slashes mean: search the entire document and match every <h3> element, wherever it sits.
  10. When I enter the syntax, I can find the extraction within the custom field in Screaming Frog.
  11. But there is more to it than just copying and pasting expressions. If only it were that simple… The internet is full of webpages, each built differently from the next. The only thing they have in common is that they are built from HTML, CSS, and JS. XPath can help automate the process of data collection, saving you time at your keyboard to work on more strategic goals.
  12. XPath moves through a document node by node. A single leading slash begins at the root node; two slashes search the whole document.
  13. Use XPath to extract any HTML element of a webpage, whether the information you want to scrape is contained in a div, span, p, heading tag, or really any other HTML element.
  14. The Screaming Frog SEO Spider is a website crawler that allows you to crawl websites’ URLs and fetch key elements to analyse and audit technical and onsite SEO.
  15. Inspect and live-edit the HTML and CSS of a page using the Chrome DevTools Elements panel. Google Chrome has a feature that makes writing XPath easier. Using the Inspect tool, you can right-click on any element and copy the XPath syntax. It’ll often be the case that you’ll need to modify what Chrome gives you before pasting the XPath into Screaming Frog, but it at least gets you started.
  16. Scraper for Chrome is a simple and fast tool that allows you to identify and refine XPath expressions.
  17. QA your XPath queries.
  18. Let’s start off with an easy example. Our content team came up with the idea to create a large piece of content that explained luggage policy by airline after doing a few searches on Google and using SEMrush. As an SEO, I have to provide the content team with the highest-volume search terms so they can narrow down their list. I google “list of American airlines” and find a ranker.com page that lists all the airlines in America. I could copy and paste this list into Excel, but I would be left with a really messy spreadsheet that would take time to clean up. Instead, I right-click on an airline header and use my tool, “Scrape similar.”
  19. Right click > Scrape similar
  20. From here, the XPath reference is /html/body/article/h2/div/a, but I remove my root node info and put two slashes in front of my h2 to find all H2s in the document. I can then export these into Excel, put together a concatenate formula based on popular luggage policy terms, and upload them into Google AdWords to find average monthly search volume.
  21. Let’s see another example. We know that having updated content not only makes Google happy, but it also makes users happy. For example, I search for best shows on Netflix and am presented with the results in positions 1 and 2. One shows me it’s been updated in April and the other has been updated in March. Which one do you think I’m going to click into?
  22. You should make this a normal deliverable to provide to your client or content team. Here’s how you do it. First, identify your top pages in Google Search Console and export them. Open one of those pages in your browser and find the date on the page. Right-click and inspect the element, which brings up the code in DevTools. Right-click on the highlighted entry within the code, and copy the XPath. For example, on my jetsetter.com URL for cool things to do in Denver, my XPath looks like this. To QA, I’m going to open my XPath Helper and paste the XPath into it.
  23. Analyze competitors’ recent post titles. Plug them into a text analysis tool to see what the posts are about.
  24. We advise being very careful with this strategy. Remember, these people may have left a comment, but they didn’t opt into your email list. That could have been for a number of reasons, but chances are they were only really interested in this post. We, therefore, recommend using this strategy only to tell commenters about the updates to the post and/or other new posts that are similar. In other words, don’t email people about stuff they’re unlikely to care about! Use the Hunter.io add-on in Google Sheets to find emails.
  25. CHEAT SHEET
  26. Resources
  27. Questions
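
The heading extraction from notes 9, 12, and 20 can also be tried outside Screaming Frog. What follows is a minimal sketch in Python, not the workflow used in the talk. It relies on the standard library's xml.etree.ElementTree, which supports only a subset of XPath and needs well-formed markup, so real pages usually call for a more forgiving HTML parser. The sample page, the airline names, and the luggage terms are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (not taken from the talk).
PAGE = """
<html>
  <body>
    <article>
      <h1>Luggage Policies by Airline</h1>
      <h2><div><a href="/alpha-air">Alpha Air</a></div></h2>
      <h3>Carry-on rules</h3>
      <h2><div><a href="/beta-jet">Beta Jet</a></div></h2>
      <h3>Checked bag fees</h3>
    </article>
  </body>
</html>
""".strip()

root = ET.fromstring(PAGE)  # root is the <html> element

# Equivalent of //h3: ElementTree paths are relative to an element,
# so the search-everywhere form is written ".//h3".
h3_texts = [h3.text for h3 in root.findall(".//h3")]

# Node-by-node path, like the one Chrome copies (/html/body/article/h2/div/a).
# Starting from <html>, the leading /html step becomes ".".
step_by_step = [a.text for a in root.findall("./body/article/h2/div/a")]

# Same links with the root node info removed: find every h2 anywhere,
# then follow div/a beneath it.
airlines = [a.text for a in root.findall(".//h2/div/a")]

# Stand-in for the spreadsheet concatenate formula: pair each airline
# with hypothetical luggage policy terms to build a keyword list.
LUGGAGE_TERMS = ["baggage policy", "carry on size"]
keywords = [f"{name} {term}" for name in airlines for term in LUGGAGE_TERMS]

print(h3_texts)
print(airlines)
print(keywords)
```

The shorter expression is usually the better one to keep, because it still matches if the page's outer layout changes while the headings stay the same.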