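0x01 Preface
When playing CTFs, hidden sensitive files such as swp and bak files can be a real nuisance. There is already a tool for this, FileSensor, but it is not very smart in practice. So I wondered whether directory scanning could be combined with sensitive-file leak detection, which led to the work below: hiddenSensor.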
0x02 Analysis of dirsearch
- Entry point
class Program(object):
    def __init__(self):
        self.script_path = (os.path.dirname(os.path.realpath(__file__)))
        self.arguments = ArgumentParser(self.script_path)
        self.output = CLIOutput()
        self.controller = Controller(self.script_path, self.arguments, self.output)
ArgumentParser and CLIOutput are not very interesting, so go straight to Controller:
self.fuzzer = Fuzzer(self.requester, self.dictionary, testFailPath=self.arguments.testFailPath, threads=self.arguments.threadsCount, matchCallbacks=matchCallbacks, notFoundCallbacks=notFoundCallbacks, errorCallbacks=errorCallbacks)
Following the call into the fuzzer:
def setupScanners(self):
    if len(self.scanners) != 0:
        self.scanners = {}
    self.defaultScanner = Scanner(self.requester, self.testFailPath, "")
    self.scanners['/'] = Scanner(self.requester, self.testFailPath, "/")
    for extension in self.dictionary.extensions:
        self.scanners[extension] = Scanner(
            self.requester, self.testFailPath, "." + extension)
The fuzzer calls Scanner, so the core logic appears to live in Scanner:
import re
from difflib import SequenceMatcher
from lib.utils import RandomUtils
from thirdparty.sqlmap import DynamicContentParser


class ScannerException(Exception):
    pass


class Scanner(object):
    def __init__(self, requester, testPath=None, suffix=None):
        if testPath is None or testPath is "":
            self.testPath = RandomUtils.randString()
        else:
            self.testPath = testPath
        self.suffix = suffix if suffix is not None else ""
        self.requester = requester
        self.tester = None
        self.redirectRegExp = None
        self.invalidStatus = None
        self.dynamicParser = None
        self.ratio = 0.98
        self.redirectStatusCodes = [301, 302, 307]
        self.setup()

    def setup(self):
        firstPath = self.testPath + self.suffix
        firstResponse = self.requester.request(firstPath)
        self.invalidStatus = firstResponse.status
        if self.invalidStatus == 404:
            # Using the response status code is enough :-}
            return
        # look for redirects
        secondPath = RandomUtils.randString(omit=self.testPath) + self.suffix
        secondResponse = self.requester.request(secondPath)
        if firstResponse.status in self.redirectStatusCodes and firstResponse.redirect and secondResponse.redirect:
            self.redirectRegExp = self.generateRedirectRegExp(firstResponse.redirect, secondResponse.redirect)
        # Analyze response bodies
        self.dynamicParser = DynamicContentParser(self.requester, firstPath, firstResponse.body, secondResponse.body)
        baseRatio = float("{0:.2f}".format(self.dynamicParser.comparisonRatio))  # Rounding to 2 decimals
        # If response length is small, adjust ratio
        if len(firstResponse) < 2000:
            baseRatio -= 0.1
        if baseRatio < self.ratio:
            self.ratio = baseRatio

    def generateRedirectRegExp(self, firstLocation, secondLocation):
        if firstLocation is None or secondLocation is None:
            return None
        sm = SequenceMatcher(None, firstLocation, secondLocation)
        marks = []
        for blocks in sm.get_matching_blocks():
            i = blocks[0]
            n = blocks[2]
            # empty block
            if n == 0:
                continue
            mark = firstLocation[i:i + n]
            marks.append(mark)
        regexp = "^.*{0}.*$".format(".*".join(map(re.escape, marks)))
        return regexp

    def scan(self, path, response):
        if self.invalidStatus == 404 and response.status == 404:
            return False
        if self.invalidStatus != response.status:
            return True
        redirectToInvalid = False
        if self.redirectRegExp is not None and response.redirect is not None:
            redirectToInvalid = re.match(self.redirectRegExp, response.redirect) is not None
            # If redirection doesn't match the rule, mark as found
            if not redirectToInvalid:
                return True
        ratio = self.dynamicParser.compareTo(response.body)
        if ratio >= self.ratio:
            return False
        elif redirectToInvalid and ratio >= (self.ratio - 0.15):
            return False
        return True
To interpret this code, consider one question: how do you decide whether a file exists?
You might come up with the following:
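1. Take a random string and append it to the url. If the response is 404, the status alone can be used as the criterion.
2. What if the response is a 301|302 status? Large sites such as Taobao and Baidu, for the sake of user experience, do not return 404 directly. They redirect to an error page instead.
3. To handle point 2, that is, to decide whether a 301|302 is a real file's redirect or a redirect to an error page, dirsearch uses the Location response header together with page similarity.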
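4. Flow
- Request two nonexistent pages, take the Location of each, and build a Location regular expression with generateRedirectRegExp(). If a response's Location does not match this regular expression, the page is taken to exist. In other words, the status is enough to judge that page.
- If the Location does match the regular expression, compare the content with self.dynamicParser.compareTo(response.body). If the similarity falls within a certain range, the page is treated as nonexistent. Otherwise it is treated as existing.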
5. Advantage: checking Location first reduces the number of content similarity comparisons, which makes the program faster.
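6. Disadvantages:
- Suppose that for one suffix, 1xxxx.php and 2xxxx.php return error.html, while 1xxxx.jsp and 2xxxx.jsp return 404.html. This produces many false positives, because the author only uses one suffix. If you don't believe it, scan Baidu with dirsearch and you get a pile of false 301|302 hits.
- You may have noticed that dirsearch sets allow_redirects=False for requests, so it does not follow 301|302. The catch is that the page similarity check does not follow redirects either, so it compares the bodies of the 301 or 302 responses themselves. When a redirect is not followed, that body is empty in the vast majority of cases, so the similarity comparison is useless. So I suspect the author was just handing in homework (hehe). Just kidding: dirsearch is still very powerful. Its pause, reconnect, and output features in particular are excellent, which is why I chose to operate on dirsearch to build my own hiddenSensor.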
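The Location step in point 4 can be tried in isolation. The following is a minimal sketch rather than dirsearch's actual code path: it reuses the same SequenceMatcher idea, and the function name build_redirect_regex and all example.com URLs are made up for illustration.

```python
import re
from difflib import SequenceMatcher


def build_redirect_regex(first_location, second_location):
    """Join the substrings shared by two Location values into one regex."""
    matcher = SequenceMatcher(None, first_location, second_location)
    marks = [
        first_location[block.a:block.a + block.size]
        for block in matcher.get_matching_blocks()
        if block.size > 0  # the final block is always an empty sentinel
    ]
    return "^.*{0}.*$".format(".*".join(map(re.escape, marks)))


# Hypothetical Location headers returned for two random, nonexistent paths.
first = "http://example.com/error.html?from=/aK3xQ9.php"
second = "http://example.com/error.html?from=/Zt81Lm.php"
pattern = build_redirect_regex(first, second)

# Another missing page is redirected to the same error page, so it matches.
missing = "http://example.com/error.html?from=/admin.php"
# A real page redirects somewhere else (here, a hypothetical login page).
real = "http://example.com/login.php?next=/admin.php"

redirects_to_error = re.match(pattern, missing) is not None
redirects_elsewhere = re.match(pattern, real) is None
print(pattern)
print(redirects_to_error, redirects_elsewhere)
```

Only the parts common to both error redirects end up in the pattern, so the random file name drops out and any later redirect to the same error page matches.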
0x03 hiddenSensor
1. Fixing the problems from point 6
- Use multiple suffixes. Here I chose the common php|jsp|asp
- Follow redirects directly
- Key code
import sys
sys.path.append('../../')
from difflib import SequenceMatcher
from thirdparty.sqlmap import DynamicContentParser
import re
import random
import string
import urllib.parse
import requests  # used by the demo under __main__
#from .Requester import Requester


class Fuzzer(object):
    def __init__(self, requester, path=None):
        self.requester = requester
        self.path = path
        self.suffix = ['php', 'jsp', 'asp']
        self.redirection_code = ['301', '302', '303', '307']
        self.base_ratio = 0.98
        # True when every probe for a nonexistent page returned a plain 404
        self.flag = False
        self.redirection_regexp = []
        self.setup()

    def getRandomPath(self):
        letters = string.ascii_letters + string.digits
        return ''.join(random.choice(letters) for i in range(8))

    def generateRedirectRegExp(self, firstLocation, secondLocation):
        if firstLocation is None or secondLocation is None:
            return None
        sm = SequenceMatcher(None, firstLocation, secondLocation)
        marks = []
        for blocks in sm.get_matching_blocks():
            i = blocks[0]
            n = blocks[2]
            # empty block
            if n == 0:
                continue
            mark = firstLocation[i:i + n]
            # keep only the shared parts that begin with the URL scheme
            if mark.startswith('http'):
                marks.append(mark)
        # loosen the scheme so an http:// pattern also matches https://
        regexp = "^.*{0}.*$".format(".*".join(map(re.escape, marks))
                                    ).replace('http', '(https|http)')
        return regexp

    def getDmain(self, url):
        url_parser = urllib.parse.urlparse(url)
        return url_parser.scheme + '://' + url_parser.netloc

    def getHistory(self, history):
        # history is str(response.history), e.g. "[<Response [302]>]";
        # return the first status code in it, or '' if nothing was redirected
        codes = re.findall(r'\d+', history)
        return codes[0] if codes else ''

    def setup(self):
        if self.path is None or self.path == '':
            self.path = self.getRandomPath()
        # two probes per suffix: the test path and a fresh random path
        firstpath_php = self.path + '.' + self.suffix[0]
        res1_php = self.requester.request(firstpath_php, True)
        secondpath_php = self.getRandomPath() + '.' + self.suffix[0]
        res2_php = self.requester.request(secondpath_php, True)
        firstpath_jsp = self.path + '.' + self.suffix[1]
        res1_jsp = self.requester.request(firstpath_jsp, True)
        secondpath_jsp = self.getRandomPath() + '.' + self.suffix[1]
        res2_jsp = self.requester.request(secondpath_jsp, True)
        firstpath_asp = self.path + '.' + self.suffix[2]
        res1_asp = self.requester.request(firstpath_asp, True)
        secondpath_asp = self.getRandomPath() + '.' + self.suffix[2]
        res2_asp = self.requester.request(secondpath_asp, True)
        if res1_asp.status_code == 404 and res1_php.status_code == 404 and res1_jsp.status_code == 404:
            # every suffix answers with a plain 404, so the status is enough
            self.flag = True
        else:
            # collect one redirect pattern per suffix whose probes were redirected
            if self.getHistory(str(res1_php.history)) in self.redirection_code and self.getHistory(str(res2_php.history)) in self.redirection_code:
                regExp = self.generateRedirectRegExp(
                    res1_php.url, res2_php.url)
                if regExp not in self.redirection_regexp:
                    self.redirection_regexp.append(regExp)
            if self.getHistory(str(res1_jsp.history)) in self.redirection_code and self.getHistory(str(res2_jsp.history)) in self.redirection_code:
                regExp = self.generateRedirectRegExp(
                    res1_jsp.url, res2_jsp.url)
                if regExp not in self.redirection_regexp:
                    self.redirection_regexp.append(regExp)
            if self.getHistory(str(res1_asp.history)) in self.redirection_code and self.getHistory(str(res2_asp.history)) in self.redirection_code:
                regExp = self.generateRedirectRegExp(
                    res1_asp.url, res2_asp.url)
                if regExp not in self.redirection_regexp:
                    self.redirection_regexp.append(regExp)
            # one content baseline per suffix; keep the lowest ratio seen
            self.dynamic_php = DynamicContentParser(
                self.requester, firstpath_php, res1_php.text, res2_php.text)
            if self.dynamic_php is not None:
                ratio = float('{0:.2f}'.format(
                    self.dynamic_php.comparisonRatio))
                if self.base_ratio > ratio:
                    self.base_ratio = ratio
            self.dynamic_jsp = DynamicContentParser(
                self.requester, firstpath_jsp, res1_jsp.text, res2_jsp.text)
            if self.dynamic_jsp is not None:
                ratio = float('{0:.2f}'.format(
                    self.dynamic_jsp.comparisonRatio))
                if self.base_ratio > ratio:
                    self.base_ratio = ratio
            self.dynamic_asp = DynamicContentParser(
                self.requester, firstpath_asp, res1_asp.text, res2_asp.text)
            if self.dynamic_asp is not None:
                ratio = float('{0:.2f}'.format(
                    self.dynamic_asp.comparisonRatio))
                if self.base_ratio > ratio:
                    self.base_ratio = ratio

    def fuzz(self, cmp_page):
        # a 404 always means "not found"
        if cmp_page.status_code == 404:
            return False
        if self.flag:
            # nonexistent pages give a plain 404 here, so anything else is a hit
            return True
        redirectToInvalid = []
        for express in self.redirection_regexp:
            if express is not None:
                redirectToInvalid.append(
                    re.match(express, cmp_page.url) is not None)
        # the final URL matches none of the known error-page redirects
        if not any(redirectToInvalid):
            return True
        ratio_php = self.dynamic_php.compareTo(cmp_page.text)
        ratio_jsp = self.dynamic_jsp.compareTo(cmp_page.text)
        ratio_asp = self.dynamic_asp.compareTo(cmp_page.text)
        best_ratio = max(ratio_php, ratio_jsp, ratio_asp)
        if self.base_ratio <= best_ratio:
            return False
        # the original referenced self.ratio, which Fuzzer never defines
        elif any(redirectToInvalid) and (self.base_ratio - 0.15) <= best_ratio:
            return False
        return True


if __name__ == '__main__':
    # needs the Requester (and requests) imports near the top of the file to be enabled
    req = Requester('https://www.baidu.com/')
    fuzzer = Fuzzer(req)
    print(fuzzer.fuzz(requests.get('https://www.baidu.com/hello.php')))
2. A --ctf option that targets those annoying bak, swp, and similar files
3. If you like it, give it a star (:
4. Source code: https://github.com/youncyb/hiddenSensor
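The multi-suffix idea from item 1 can be illustrated without any network access. This is only a sketch under stated assumptions, not hiddenSensor's implementation: difflib's SequenceMatcher.ratio() stands in for DynamicContentParser, and fetch, build_baselines, looks_missing, and the page bodies are all hypothetical.

```python
from difflib import SequenceMatcher

SUFFIXES = ("php", "jsp", "asp")

# Hypothetical site: missing .php pages get one error page, other suffixes another.
PHP_ERROR = "<html><body><h1>Oops</h1><p>The page you requested is gone.</p></body></html>"
OTHER_ERROR = "<html><body><h1>404</h1><p>Nothing to see here.</p></body></html>"
REAL_PAGES = {"index.php": "<html><body><form>user / password</form></body></html>"}


def fetch(path):
    """Stand-in for an HTTP request that follows redirects and returns the body."""
    if path in REAL_PAGES:
        return REAL_PAGES[path]
    return PHP_ERROR if path.endswith(".php") else OTHER_ERROR


def similarity(a, b):
    return SequenceMatcher(None, a, b).ratio()


def build_baselines(suffixes):
    """Record one known-missing body per suffix and the lowest ratio seen."""
    baselines, threshold = {}, 0.98
    for suffix in suffixes:
        first = fetch("aK3xQ9zz." + suffix)
        second = fetch("Zt81LmQq." + suffix)
        baselines[suffix] = first
        threshold = min(threshold, round(similarity(first, second), 2))
    return baselines, threshold


def looks_missing(body, baselines, threshold):
    """A page counts as missing if it resembles any suffix's error page."""
    return any(similarity(body, known) >= threshold for known in baselines.values())


baselines, threshold = build_baselines(SUFFIXES)
php_only = {"php": baselines["php"]}

print(looks_missing(fetch("nothere.jsp"), baselines, threshold))
print(looks_missing(fetch("nothere.jsp"), php_only, threshold))
print(looks_missing(fetch("index.php"), baselines, threshold))
```

With this stand-in data, the missing .jsp page is recognised as an error page only when a jsp baseline is present. With the php baseline alone it would be reported as found, which is the false positive described in point 6.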
0x04 hiddenSensor
1. Supported platforms
macOS|Linux|Windows
python3
2. Usage
usage: hiddenSensor.py [-h] [-u URL] [-L URLLIST] [-e EXTENSION] [-H HEADERS]
[--user-agent USER_AGENT] [--random-agent] [-c COOKIES]
[-r RECURSIVE] [--proxy PROXY] [-s DELAY]
[--timeout TIMEOUT] [-m MAX_RETRIES] [-t THREADS_COUNT]
[-404 PATH_404] [--lowercase] [--uppercase]
[--dicts-path WORDLIST] [--ctf]
optional arguments:
-h, --help show this help message and exit
madatory settings:
-u URL, --url URL target
-L URLLIST, --urlList URLLIST
url file path
-e EXTENSION, --extension EXTENSION
the extension of website type (default : "php")
connection settings:
-H HEADERS, --headers HEADERS
set headers
--user-agent USER_AGENT
user-agent you want to specify
--random-agent random-agent (default: False)
-c COOKIES, --cookie COOKIES
cookie you want to specify (example: -c
"domain=xxx;path=xxx")
-r RECURSIVE, --recursive RECURSIVE
Recursive blasting subdir (default: 0 layers)
--proxy PROXY set proxy (http proxy,example:--proxy
http://127.0.0.1:1090)
-s DELAY, --delay DELAY
time.sleep(delay) every request (default: 0)
--timeout TIMEOUT max time every request is waiting (default: 30 s)
-m MAX_RETRIES, --max-retries MAX_RETRIES
max retries when meeting network problem (default: 5)
other settings:
-t THREADS_COUNT, --thread THREADS_COUNT
max thread count you want to specify (default: 10)
-404 PATH_404, --404-page PATH_404
the 404 page you want to specify (example: if
error.php -404 "error")
--lowercase force to be lowercase
--uppercase force to be uppercase
--dicts-path WORDLIST
other dictionary you want to specify
--ctf if it's specified, process will find sensor file
(xxx.php.bak, .xxx.php.swp ...)
example:python3 hiddenSensor.py -u http://www.xxx.com/ -e php --ctf
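3. Features
- Multithreading
- Custom HTTP headers
- Scanning multiple urls
- Pause (ctrl+c) and resume
- Custom dictionaries, although the ones in db should be enough
- Custom delay and maximum retry count
- HTTP proxy
- Custom 404 path
- Configurable depth for recursive scanning
- Scanning for .bak|.swp and similar files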
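The help text for --ctf names two patterns, xxx.php.bak and .xxx.php.swp. As a rough sketch of how a dictionary entry could be expanded into such candidates (the function sensitive_candidates and its parameters are hypothetical, and only the two patterns from the help text are covered):

```python
import posixpath


def sensitive_candidates(path, backup_suffixes=(".bak",), swap_suffixes=(".swp",)):
    """Expand one dictionary entry into backup and editor-swap candidates."""
    directory, name = posixpath.split(path)
    candidates = []
    for suffix in backup_suffixes:
        # Backup copy kept next to the original: index.php -> index.php.bak
        candidates.append(posixpath.join(directory, name + suffix))
    for suffix in swap_suffixes:
        # Vim-style swap file, hidden by a leading dot: index.php -> .index.php.swp
        candidates.append(posixpath.join(directory, "." + name + suffix))
    return candidates


print(sensitive_candidates("index.php"))
print(sensitive_candidates("admin/login.php"))
```

Each candidate would then be requested like any other dictionary entry, so the same existence checks apply to it.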