This repository was archived by the owner on Mar 6, 2024. It is now read-only.

code4everything/visual-spider


Try our new desktop productivity tool, RunFlow

https://myrest.top/myflow

Image crawling

Currently supported image formats: bmp, gif, jpeg, png, tiff, pcx, tga, svg, pic

Media crawling

Currently supported media formats: avi, mov, swf, asf, navi, wmv, 3gp, mkv, flv, rmvb, webm, mpg, mp4, qsv, mpeg, mp3, aac, ogg, wav, flac, ape, wma, aif, au, ram, mmf, amr

Link crawling

This simply downloads the HTML source code.

Document crawling

Currently supported document formats: pdf, docx, txt, log, conf, java, xml, json, css, js, html, hml, php, wps, rtf

Other file crawling

Currently supported file formats: zip, exe, dmg, iso, jar, msi, rar, tmp, xlsx, mdf, com, casm, for, lib, lst, msg, obj, pas, wki, bas, map, bak, dot, bat, sh, rpm
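
The format lists above suggest that a crawled URL is routed to a category by its file extension. The README does not say how visual-spider does this internally, so the following is only a minimal Python sketch of the idea. The category names, the `classify` function, and the fallback to "link" for unknown extensions are all assumptions made for illustration; only the extension lists come from this README.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extension lists copied from this README. The category names are hypothetical.
CATEGORIES = {
    "image": "bmp,gif,jpeg,png,tiff,pcx,tga,svg,pic",
    "media": ("avi,mov,swf,asf,navi,wmv,3gp,mkv,flv,rmvb,webm,mpg,mp4,qsv,mpeg,"
              "mp3,aac,ogg,wav,flac,ape,wma,aif,au,ram,mmf,amr"),
    "document": "pdf,docx,txt,log,conf,java,xml,json,css,js,html,hml,php,wps,rtf",
    "other": ("zip,exe,dmg,iso,jar,msi,rar,tmp,xlsx,mdf,com,casm,for,lib,lst,msg,"
              "obj,pas,wki,bas,map,bak,dot,bat,sh,rpm"),
}

# Reverse lookup table: extension -> category name.
EXT_TO_CATEGORY = {
    ext: name for name, exts in CATEGORIES.items() for ext in exts.split(",")
}


def classify(url):
    """Return the category for a URL based on the extension of its path.

    The query string and fragment are ignored. URLs without a known
    extension fall back to "link" (an assumption, not documented behavior).
    """
    path = urlparse(url).path
    ext = PurePosixPath(path).suffix.lstrip(".").lower()
    return EXT_TO_CATEGORY.get(ext, "link")


if __name__ == "__main__":
    for sample in ("https://example.com/a/photo.PNG?size=large",
                   "https://example.com/report.pdf",
                   "https://example.com/"):
        print(sample, "->", classify(sample))
```

Matching on the URL path rather than the full URL keeps query parameters such as `?size=large` from hiding the extension.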

Custom crawling

Define a custom XPath expression; the matching page content is stored in a MySQL database.
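
As a rough illustration of the custom crawling idea (evaluate an XPath expression against a page, then persist the matches), here is a small Python sketch. It is not the project's code, and it makes several assumptions: the standard library's `xml.etree.ElementTree` is used, which supports only a limited XPath subset and needs well-formed markup; SQLite stands in for MySQL so the example runs without a database server; and the `matches` table and the `extract` and `store` helpers are hypothetical names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def extract(markup, xpath):
    """Return the text content of every element matching `xpath`.

    ElementTree implements only a subset of XPath and requires
    well-formed markup, so this is a simplification.
    """
    root = ET.fromstring(markup)
    return ["".join(node.itertext()).strip() for node in root.findall(xpath)]


def store(conn, url, xpath, values):
    """Persist matches in a hypothetical `matches` table.

    SQLite is a stand-in here; the README says the tool writes to MySQL.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS matches (url TEXT, xpath TEXT, content TEXT)"
    )
    conn.executemany(
        "INSERT INTO matches (url, xpath, content) VALUES (?, ?, ?)",
        [(url, xpath, value) for value in values],
    )
    conn.commit()


if __name__ == "__main__":
    page = ("<html><body><h1>Title</h1>"
            "<p class='intro'>one</p><p>two</p></body></html>")
    conn = sqlite3.connect(":memory:")
    store(conn, "https://example.com/", ".//p", extract(page, ".//p"))
    print(conn.execute("SELECT content FROM matches").fetchall())
```

Real-world HTML is often not well-formed, so a production crawler would need a tolerant HTML parser with full XPath support in place of `ElementTree`.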

xpath

Learn XPath syntax

Crawler workflow

Workflow

Screenshots

Screenshot

Click here to download
