SlideShare a Scribd company logo
1 of 28
Spider
  Spider
“   ”




Spike
“   ”
vs.




                                                               http://www.flickr.com/photos/blueblankut/497571704/sizes/z/in/photostream/




http://www.flickr.com/photos/coreyburger/2481836757/sizes/z/in/photostream/
Dust
ip

     protocol   sitemap   robots.txt
Link
Trie
Query




Cache
……




http://www.flickr.com/photos/regolare/791385521/
Spider……


                    Map-
           Reduce
Crawler Architecture

                                                                      Repository
               Downloader


                 Download                              Extractor
                  Worker                                Worker
                                                                    save page
                                                                    to repository



                   if 302 founded
get a link         update link
                   http status
                                     put downloaded
                                      page to queue


          links queue
                                                      pages queue   extract links
                                                                    and save


       main loop will put
       peek site's links to
       queue




           Crawler                                                     Linkbase
          main loop


Site will refill itself
when it's empty
                                    TaskLoader


                                Priority Heap
         Scope.txt                    Sites

                                Ordered Site
                                and their links
Dust

            Simhash

       PageRank
robust   html      css selector   lxml   tidy




   url

           proxy            UA
1-NoSQL
          NOSQL
-NoSQL
               NOSQL

   Heap




FIFO
-NoSQL


HBase

Cassandra
-NoSQL
 Cassandra

              bug

                                       Random
patitioning




        Crawler              Crawler
-NoSQL
CAP

   HBase -> CP    Cassandra -> AP

      Cassandra   C
Crawler   link
2-Google
incremental processing system -
  Percolator. a.k.a. Caffeine
2-Google
   BigTable

                   timestamp oracle   lightweight
lock

              Notification                trigger

       Observer      Notification

                  Notification   Percolator Worker
2-Google
Map-Reduce




             Locality
2-Google
Trade-off

                    trillion:million
       Map-Reduce

Page   RPC     MR
                    10   MR   RPC
2-Google
Percolator           DBMS       DBMS




             scale
      Percolator




  Percolator           shared-nothing parallel
  databases
Thanks

More Related Content

What's hot

A quick introduction to Storm Crawler
A quick introduction to Storm CrawlerA quick introduction to Storm Crawler
A quick introduction to Storm CrawlerJulien Nioche
 
StormCrawler at Bristech
StormCrawler at BristechStormCrawler at Bristech
StormCrawler at BristechJulien Nioche
 
Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export Rupak Roy
 
Multi-threaded web crawler in Ruby
Multi-threaded web crawler in RubyMulti-threaded web crawler in Ruby
Multi-threaded web crawler in RubyPolcode
 
Confitura 2018 — Apache Beam — Promyk Nadziei Data Engineera
Confitura 2018 — Apache Beam — Promyk Nadziei Data EngineeraConfitura 2018 — Apache Beam — Promyk Nadziei Data Engineera
Confitura 2018 — Apache Beam — Promyk Nadziei Data EngineeraPiotr Wikiel
 
Ruby on rails online training
Ruby on rails online trainingRuby on rails online training
Ruby on rails online trainingTRAINING ICON
 
Git major commands
Git major commandsGit major commands
Git major commandsmyepicslides
 
Git major commands
Git major commandsGit major commands
Git major commandsmyepicslides
 
Using Embulk at Treasure Data
Using Embulk at Treasure DataUsing Embulk at Treasure Data
Using Embulk at Treasure DataMuga Nishizawa
 
Hadoop and Pig at Twitter__HadoopSummit2010
Hadoop and Pig at Twitter__HadoopSummit2010Hadoop and Pig at Twitter__HadoopSummit2010
Hadoop and Pig at Twitter__HadoopSummit2010Yahoo Developer Network
 
Configuration management
Configuration managementConfiguration management
Configuration managementLuca De Vitis
 

What's hot (11)

A quick introduction to Storm Crawler
A quick introduction to Storm CrawlerA quick introduction to Storm Crawler
A quick introduction to Storm Crawler
 
StormCrawler at Bristech
StormCrawler at BristechStormCrawler at Bristech
StormCrawler at Bristech
 
Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export
 
Multi-threaded web crawler in Ruby
Multi-threaded web crawler in RubyMulti-threaded web crawler in Ruby
Multi-threaded web crawler in Ruby
 
Confitura 2018 — Apache Beam — Promyk Nadziei Data Engineera
Confitura 2018 — Apache Beam — Promyk Nadziei Data EngineeraConfitura 2018 — Apache Beam — Promyk Nadziei Data Engineera
Confitura 2018 — Apache Beam — Promyk Nadziei Data Engineera
 
Ruby on rails online training
Ruby on rails online trainingRuby on rails online training
Ruby on rails online training
 
Git major commands
Git major commandsGit major commands
Git major commands
 
Git major commands
Git major commandsGit major commands
Git major commands
 
Using Embulk at Treasure Data
Using Embulk at Treasure DataUsing Embulk at Treasure Data
Using Embulk at Treasure Data
 
Hadoop and Pig at Twitter__HadoopSummit2010
Hadoop and Pig at Twitter__HadoopSummit2010Hadoop and Pig at Twitter__HadoopSummit2010
Hadoop and Pig at Twitter__HadoopSummit2010
 
Configuration management
Configuration managementConfiguration management
Configuration management
 

Similar to 爬虫点滴

Storm crawler apachecon_na_2015
Storm crawler apachecon_na_2015Storm crawler apachecon_na_2015
Storm crawler apachecon_na_2015ontopic
 
Can we run the Whole Web on Apache Sling?
Can we run the Whole Web on Apache Sling?Can we run the Whole Web on Apache Sling?
Can we run the Whole Web on Apache Sling?Bertrand Delacretaz
 
Harnessing the power of Nutch with Scala
Harnessing the power of Nutch with ScalaHarnessing the power of Nutch with Scala
Harnessing the power of Nutch with ScalaKnoldus Inc.
 
Low latency scalable web crawling on Apache Storm
Low latency scalable web crawling on Apache StormLow latency scalable web crawling on Apache Storm
Low latency scalable web crawling on Apache StormJulien Nioche
 
Design and Implementation of a High- Performance Distributed Web Crawler
Design and Implementation of a High- Performance Distributed Web CrawlerDesign and Implementation of a High- Performance Distributed Web Crawler
Design and Implementation of a High- Performance Distributed Web CrawlerGeorge Ang
 
Rails services in the walled garden
Rails services in the walled gardenRails services in the walled garden
Rails services in the walled gardenSidu Ponnappa
 
Mastering PostgreSQL Administration
Mastering PostgreSQL AdministrationMastering PostgreSQL Administration
Mastering PostgreSQL AdministrationCommand Prompt., Inc
 
Mastering PostgreSQL Administration
Mastering PostgreSQL AdministrationMastering PostgreSQL Administration
Mastering PostgreSQL AdministrationEDB
 
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3Databricks
 
Dok Talks #124 - Intro to Druid on Kubernetes
Dok Talks #124 - Intro to Druid on KubernetesDok Talks #124 - Intro to Druid on Kubernetes
Dok Talks #124 - Intro to Druid on KubernetesDoKC
 
Bloomreach - BloomStore Compute Cloud Infrastructure
Bloomreach - BloomStore Compute Cloud Infrastructure Bloomreach - BloomStore Compute Cloud Infrastructure
Bloomreach - BloomStore Compute Cloud Infrastructure bloomreacheng
 
Web Crawling and Data Gathering with Apache Nutch
Web Crawling and Data Gathering with Apache NutchWeb Crawling and Data Gathering with Apache Nutch
Web Crawling and Data Gathering with Apache NutchSteve Watt
 
SD, a P2P bug tracking system
SD, a P2P bug tracking systemSD, a P2P bug tracking system
SD, a P2P bug tracking systemJesse Vincent
 
Docker 對傳統 DevOps 工具鏈的衝擊 (Docker's Impact on traditional DevOps toolchain)
Docker 對傳統 DevOps 工具鏈的衝擊 (Docker's Impact on traditional DevOps toolchain)Docker 對傳統 DevOps 工具鏈的衝擊 (Docker's Impact on traditional DevOps toolchain)
Docker 對傳統 DevOps 工具鏈的衝擊 (Docker's Impact on traditional DevOps toolchain)William Yeh
 
Redis深入浅出
Redis深入浅出Redis深入浅出
Redis深入浅出ruoyi ruan
 
An introduction to Storm Crawler
An introduction to Storm CrawlerAn introduction to Storm Crawler
An introduction to Storm CrawlerJulien Nioche
 
Common Sense Performance Indicators in the Cloud
Common Sense Performance Indicators in the CloudCommon Sense Performance Indicators in the Cloud
Common Sense Performance Indicators in the CloudNick Gerner
 
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...Yahoo Developer Network
 
[Kotlin Serverless 工作坊] 單元 4 - 實作 RSS Aggregator
[Kotlin Serverless 工作坊] 單元 4 - 實作 RSS Aggregator[Kotlin Serverless 工作坊] 單元 4 - 實作 RSS Aggregator
[Kotlin Serverless 工作坊] 單元 4 - 實作 RSS AggregatorShengyou Fan
 

Similar to 爬虫点滴 (20)

Storm crawler apachecon_na_2015
Storm crawler apachecon_na_2015Storm crawler apachecon_na_2015
Storm crawler apachecon_na_2015
 
Can we run the Whole Web on Apache Sling?
Can we run the Whole Web on Apache Sling?Can we run the Whole Web on Apache Sling?
Can we run the Whole Web on Apache Sling?
 
Harnessing the power of Nutch with Scala
Harnessing the power of Nutch with ScalaHarnessing the power of Nutch with Scala
Harnessing the power of Nutch with Scala
 
Low latency scalable web crawling on Apache Storm
Low latency scalable web crawling on Apache StormLow latency scalable web crawling on Apache Storm
Low latency scalable web crawling on Apache Storm
 
Design and Implementation of a High- Performance Distributed Web Crawler
Design and Implementation of a High- Performance Distributed Web CrawlerDesign and Implementation of a High- Performance Distributed Web Crawler
Design and Implementation of a High- Performance Distributed Web Crawler
 
Rails services in the walled garden
Rails services in the walled gardenRails services in the walled garden
Rails services in the walled garden
 
Deployment de Rails
Deployment de RailsDeployment de Rails
Deployment de Rails
 
Mastering PostgreSQL Administration
Mastering PostgreSQL AdministrationMastering PostgreSQL Administration
Mastering PostgreSQL Administration
 
Mastering PostgreSQL Administration
Mastering PostgreSQL AdministrationMastering PostgreSQL Administration
Mastering PostgreSQL Administration
 
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
Spark Saturday: Spark SQL & DataFrame Workshop with Apache Spark 2.3
 
Dok Talks #124 - Intro to Druid on Kubernetes
Dok Talks #124 - Intro to Druid on KubernetesDok Talks #124 - Intro to Druid on Kubernetes
Dok Talks #124 - Intro to Druid on Kubernetes
 
Bloomreach - BloomStore Compute Cloud Infrastructure
Bloomreach - BloomStore Compute Cloud Infrastructure Bloomreach - BloomStore Compute Cloud Infrastructure
Bloomreach - BloomStore Compute Cloud Infrastructure
 
Web Crawling and Data Gathering with Apache Nutch
Web Crawling and Data Gathering with Apache NutchWeb Crawling and Data Gathering with Apache Nutch
Web Crawling and Data Gathering with Apache Nutch
 
SD, a P2P bug tracking system
SD, a P2P bug tracking systemSD, a P2P bug tracking system
SD, a P2P bug tracking system
 
Docker 對傳統 DevOps 工具鏈的衝擊 (Docker's Impact on traditional DevOps toolchain)
Docker 對傳統 DevOps 工具鏈的衝擊 (Docker's Impact on traditional DevOps toolchain)Docker 對傳統 DevOps 工具鏈的衝擊 (Docker's Impact on traditional DevOps toolchain)
Docker 對傳統 DevOps 工具鏈的衝擊 (Docker's Impact on traditional DevOps toolchain)
 
Redis深入浅出
Redis深入浅出Redis深入浅出
Redis深入浅出
 
An introduction to Storm Crawler
An introduction to Storm CrawlerAn introduction to Storm Crawler
An introduction to Storm Crawler
 
Common Sense Performance Indicators in the Cloud
Common Sense Performance Indicators in the CloudCommon Sense Performance Indicators in the Cloud
Common Sense Performance Indicators in the Cloud
 
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
October 2016 HUG: Architecture of an Open Source RDBMS powered by HBase and ...
 
[Kotlin Serverless 工作坊] 單元 4 - 實作 RSS Aggregator
[Kotlin Serverless 工作坊] 單元 4 - 實作 RSS Aggregator[Kotlin Serverless 工作坊] 單元 4 - 實作 RSS Aggregator
[Kotlin Serverless 工作坊] 單元 4 - 實作 RSS Aggregator
 

More from Open Party

Sunshine library introduction
Sunshine library introductionSunshine library introduction
Sunshine library introductionOpen Party
 
食品安全与生态农业──小毛驴市民农园项目介绍
食品安全与生态农业──小毛驴市民农园项目介绍食品安全与生态农业──小毛驴市民农园项目介绍
食品安全与生态农业──小毛驴市民农园项目介绍Open Party
 
网站优化实践
网站优化实践网站优化实践
网站优化实践Open Party
 
Introduction to scientific visualization
Introduction to scientific visualizationIntroduction to scientific visualization
Introduction to scientific visualizationOpen Party
 
西藏10日游
西藏10日游西藏10日游
西藏10日游Open Party
 
Applying BDD in refactoring
Applying BDD in refactoringApplying BDD in refactoring
Applying BDD in refactoringOpen Party
 
移动广告不是网盟
移动广告不是网盟移动广告不是网盟
移动广告不是网盟Open Party
 
Android 开源社区,10年后的再思考
Android 开源社区,10年后的再思考Android 开源社区,10年后的再思考
Android 开源社区,10年后的再思考Open Party
 
企业创业融资之路
企业创业融资之路企业创业融资之路
企业创业融资之路Open Party
 
夸父通讯中间件
夸父通讯中间件夸父通讯中间件
夸父通讯中间件Open Party
 
Java mobile 移动应用开发
Java mobile 移动应用开发Java mobile 移动应用开发
Java mobile 移动应用开发Open Party
 
如何做演讲
如何做演讲如何做演讲
如何做演讲Open Party
 
Positive psychology
Positive psychologyPositive psychology
Positive psychologyOpen Party
 
价值驱动的组织转型-王晓明
价值驱动的组织转型-王晓明价值驱动的组织转型-王晓明
价值驱动的组织转型-王晓明Open Party
 
淘宝广告技术部开发流程和Scrum实践
淘宝广告技术部开发流程和Scrum实践淘宝广告技术部开发流程和Scrum实践
淘宝广告技术部开发流程和Scrum实践Open Party
 
对云计算的理解
对云计算的理解对云计算的理解
对云计算的理解Open Party
 
Web前端标准在各浏览器中的实现差异
Web前端标准在各浏览器中的实现差异Web前端标准在各浏览器中的实现差异
Web前端标准在各浏览器中的实现差异Open Party
 
Hs java open_party
Hs java open_partyHs java open_party
Hs java open_partyOpen Party
 
Evolutionary db development
Evolutionary db development Evolutionary db development
Evolutionary db development Open Party
 

More from Open Party (20)

Sunshine library introduction
Sunshine library introductionSunshine library introduction
Sunshine library introduction
 
食品安全与生态农业──小毛驴市民农园项目介绍
食品安全与生态农业──小毛驴市民农园项目介绍食品安全与生态农业──小毛驴市民农园项目介绍
食品安全与生态农业──小毛驴市民农园项目介绍
 
Cs open-party
Cs open-partyCs open-party
Cs open-party
 
网站优化实践
网站优化实践网站优化实践
网站优化实践
 
Introduction to scientific visualization
Introduction to scientific visualizationIntroduction to scientific visualization
Introduction to scientific visualization
 
西藏10日游
西藏10日游西藏10日游
西藏10日游
 
Applying BDD in refactoring
Applying BDD in refactoringApplying BDD in refactoring
Applying BDD in refactoring
 
移动广告不是网盟
移动广告不是网盟移动广告不是网盟
移动广告不是网盟
 
Android 开源社区,10年后的再思考
Android 开源社区,10年后的再思考Android 开源社区,10年后的再思考
Android 开源社区,10年后的再思考
 
企业创业融资之路
企业创业融资之路企业创业融资之路
企业创业融资之路
 
夸父通讯中间件
夸父通讯中间件夸父通讯中间件
夸父通讯中间件
 
Java mobile 移动应用开发
Java mobile 移动应用开发Java mobile 移动应用开发
Java mobile 移动应用开发
 
如何做演讲
如何做演讲如何做演讲
如何做演讲
 
Positive psychology
Positive psychologyPositive psychology
Positive psychology
 
价值驱动的组织转型-王晓明
价值驱动的组织转型-王晓明价值驱动的组织转型-王晓明
价值驱动的组织转型-王晓明
 
淘宝广告技术部开发流程和Scrum实践
淘宝广告技术部开发流程和Scrum实践淘宝广告技术部开发流程和Scrum实践
淘宝广告技术部开发流程和Scrum实践
 
对云计算的理解
对云计算的理解对云计算的理解
对云计算的理解
 
Web前端标准在各浏览器中的实现差异
Web前端标准在各浏览器中的实现差异Web前端标准在各浏览器中的实现差异
Web前端标准在各浏览器中的实现差异
 
Hs java open_party
Hs java open_partyHs java open_party
Hs java open_party
 
Evolutionary db development
Evolutionary db development Evolutionary db development
Evolutionary db development
 

Recently uploaded

"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 

Recently uploaded (20)

"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort ServiceHot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
Hot Sexy call girls in Panjabi Bagh 🔝 9953056974 🔝 Delhi escort Service
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 

爬虫点滴

Editor's Notes

  1. \n
  2. \n
  3. \n
  4. \n
  5. \n
  6. \n
  7. \n
  8. \n
  9. \n
  10. \n
  11. \n
  12. \n
  13. \n
  14. \n
  15. \n
  16. \n
  17. \n
  18. \n
  19. \n
  20. \n
  21. \n
  22. \n
  23. \n
  24. \n
  25. \n
  26. \n
  27. \n
  28. \n