1. 研究情况简介
威胁情报质量测试,全称“Threat Intelligence Quotient Test”,简称“tiq-test”,是对威胁情报指标feeds的数据可视化和统计分析(Dataviz and Statistical Analysis of Threat Intelligence Indicator feeds)。
该研究在BSides LV 2014、DEF CON 22、OpenDNS S4 IRespond和HushCon 2014上以“测量你的威胁情报feeds的智商(Measuring the IQ of your threat intelligence feeds)”,以及在nbtcon 2014和SANS CTI Summit 2015上以“从威胁情报到智慧防御:一种数据科学方法(From Threat Intelligence to Defense Cleverness: A Data Science Approach)”两个议题进行公开介绍和分享。
本文也围绕并总结这两个议题,对威胁情报质量测试的具体方法和内容进行介绍和分析。
2.两个议题内容归纳
议题:测量你的威胁威胁情报feeds的智商,对该部分的解读,将结合参考【3】的网页文章以及参考【4】的PPT进行介绍。
议题:从威胁情报到智慧防御:一种数据科学方法,对该部分的解读,将结合参考【6】的网页文章以及参考【7】的PPT进行介绍。
对两个议题进行归纳总结,主要包括对TIQ-Test的介绍及相关测试步骤,如下。
2.1 准备测试步骤
该部分内容主要参考【3】和【4】。
- 1.从indicator feeds中提取原始数据信息
主要提取IP地址和主机名等信息。
代码:
outbound.ti = tiq.data.loadTI("raw", "public_outbound", "20140701")
outbound.ti[, list(entity, type, direction, source, date)]
提取结果:
## entity type direction source date
## 1: 1.224.163.26 IPv4 outbound alienvault 2014-07-01
## 2: 1.242.99.155 IPv4 outbound alienvault 2014-07-01
## 3: 1.85.2.118 IPv4 outbound alienvault 2014-07-01
## 4: 1.93.1.162 IPv4 outbound alienvault 2014-07-01
## 5: 1.93.161.204 IPv4 outbound alienvault 2014-07-01
## ---
## 16298: winscoft.com FQDN outbound zeus 2014-07-01
## 16299: wmzbase.ru FQDN outbound zeus 2014-07-01
## 16300: zhabademon.net FQDN outbound zeus 2014-07-01
## 16301: zhangleetranding.com FQDN outbound zeus 2014-07-01
## 16302: znatnydom.by FQDN outbound zeus 2014-07-01
- 2.选择Feeds
数据根据网络流方向,分为“inbound”和“outbound”,“ inbound”可以看作网络流的来源,“outbound”可以看作网络流的目标。
outbound代码
outbound.ti = tiq.data.loadTI("raw", "public_outbound", "20140701")
unique(outbound.ti$source)
outbound结果
## [1] "alienvault" "botscout" "malcode"
## [4] "malcode_zones" "malwaredomainlist" "malwaredomains"
## [7] "malwaregroup" "palevotracker" "spyeye"
## [10] "zeus"
inbound代码
inbound.ti = tiq.data.loadTI("raw", "public_inbound", "20140701")
unique(inbound.ti$source)
inbound结果
## [1] "alienvault" "autoshun" "blocklistde"
## [4] "bruteforceblocker" "charleshaley" "ciarmy"
## [7] "dragonresearch" "dshield" "honeypot"
## [10] "openbl" "packetmail" "virbl"
- 3.数据预处理和扩充
移除IP中的局域网的IP地址。
对于每个IP记录,添加asnumber和asname(使用Maxmind ASN DB)、添加国家信息(使用Max Mind GeoLite DB)、添加rhost信息(使用DNSDB)
数据预处理和扩充代码:
enrich.ti = tiq.data.loadTI("enriched", "public_outbound", "20140710")
enrich.ti = enrich.ti[, notes := NULL]
enrich.ti[c(2,22264, 22266)]
填充后的部分数据样式:
## entity type direction source date asnumber
## 1: 1.224.163.26 IPv4 outbound alienvault 2014-07-10 9318
## 2: 95.181.178.177 IPv4 outbound zeus 2014-07-10 57311
## 3: 98.131.185.136 IPv4 outbound zeus 2014-07-10 32392
## asname country
## 1: Hanaro Telecom Inc. KR
## 2: FOP ILIUSHENKO VOLODYMYR OLEXANDROVUCH GB
## 3: Ecommerce Corporation US
## host rhost
## 1: NA NA
## 2: newdomaininfo.ru host178-177.neohost.net
## 3: projects.globaltronics.net NA
2.2测试数据
结合两个议题的内容,主要的测试内容有新鲜度测试(Novelty Test)、覆盖测试(Overlap Test)、地域测试(Population Test)、过期测试(Aging Test)、非凡测试(Uniqueness Test)。测试的内容主要参考【6】【7】。
- 1.新鲜度测试
新鲜度测试是用来测量和比较添加或删除指标的情况(Measuring added and droppend indicators)。
Inbound测试代码如下:
inbound.novelty = tiq.test.noveltyTest("public_inbound", "20141001", "20141130",
select.sources=c("alienvault", "blocklistde",
"dshield", "charleshaley"),
.progress=FALSE)
tiq.test.plotNoveltyTest(inbound.novelty, title="Novelty Test - Inbound Indicators")
Inbound测试结果如下:
Outbound测试代码如下:
outbound.novelty = tiq.test.noveltyTest("public_outbound", "20141001", "20141130",
select.sources=c("alienvault", "malwaregroup",
"malcode", "zeus"),
.progress=FALSE)
tiq.test.plotNoveltyTest(outbound.novelty, title="Novelty Test - Outbound Indicators")
Outbound测试结果如下:
- 2.覆盖测试
覆盖测试用来测量和比较数据的重复情况。
覆盖度测试代码如下:
overlap = tiq.test.overlapTest("public_inbound", "20141101", "enriched",
select.sources=NULL)
tiq.test.plotOverlapTest(overlap, title="Overlap Test - Inbound Data - 20141101")
inbound数据覆盖度测试结果如下:
- 3.地域测试
地域测试是用来测量和比较流入、流出数据的地域分布情况。
地域测试代码如下:
outbound.pop = tiq.test.extractPopulationFromTI("public_outbound", "country",
date = "20141111",
select.sources=NULL, split.ti=F)
inbound.pop = tiq.test.extractPopulationFromTI("public_inbound", "country",
date = "20141111",
select.sources=NULL, split.ti=F)
complete.pop = tiq.data.loadPopulation("mmgeo", "country")
tiq.test.plotPopulationBars(c(inbound.pop, outbound.pop, complete.pop), "country")
地域测试结果:
地域测试参考代码:
outbound.pop = tiq.test.extractPopulationFromTI("public_outbound", "country",
date = "20141111",
select.sources=NULL,
split.ti=FALSE)
complete.pop = tiq.data.loadPopulation("mmgeo", "country")
tests = tiq.test.populationInference(complete.pop$mmgeo,
outbound.pop$public_outbound, "country",
exact = TRUE, top=10)
# Whose proportion is bigger than it should be?
tests[p.value < 0.05/10 & conf.int.end > 0][order(conf.int.end, decreasing=T)]
地域测试参考结果:
## country conf.int.start conf.int.end p.value
## 1: CN 0.065323470 0.08150539 4.762343e-97
## 2: RU 0.037520304 0.04752410 3.878014e-141
## 3: HK 0.036605616 0.04563014 2.614466e-265
## 4: UA 0.025893666 0.03373351 1.034106e-168
## 5: NL 0.013268828 0.02084241 7.764204e-30
## 6: DE 0.009467889 0.01878853 3.370376e-11
- 4.过期测试
过期测试是用来观察指标信息最终是否被清除的情况。
过期测试代码:
outbound.aging = tiq.test.agingTest("public_outbound", "20141001", "20141130")
tiq.test.plotAgingTest(outbound.aging, title="Aging Test - Outbound Data")
过期测试结果:
- 5.非凡测试
非凡测试代码:
uniqueTest = rbind(
tiq.test.uniquenessTest("public_inbound", "20141001","20141001", "raw", split.tii = T),
tiq.test.uniquenessTest("public_inbound", "20141001","20141015", "raw", split.tii = T),
tiq.test.uniquenessTest("public_inbound", "20141001","20141030", "raw", split.tii = T),
tiq.test.uniquenessTest("public_inbound", "20141001","20141129", "raw", split.tii = T)
)
uniqueTest[count == 1]
非凡测试结果:
## count ratio days
## 1: 1 0.9759170 1
## 2: 1 0.9737684 15
## 3: 1 0.9727630 30
## 4: 1 0.9710713 60
测试结果图示:
2.个人总结及点评
本方案主要基于威胁情报信息的统计特征进行分析,通过对统计特征的时间属性、地理位置属性等进行统计分析判断,并根据统计分析结果判断不同威胁情报feeds源的质量。
本情报测试的测试结果,仅仅针对威胁情报源的feeds的特征,而没有考虑情报数据之间的关联属性。而根据实际基于威胁情报分析的场景和用例,对威胁情报进行关联分析判断的结果,对于判断威胁情报本身的质量,是起着重要的作用的。
参考
【1】tiq-test - Threat Intelligence Quotient Test,https://github.com/mlsecproject/tiq-test
【2】tiq-test-Summer2014,https://github.com/mlsecproject/tiq-test-Summer2014
【3】Measuring the IQ of your Threat Intelligence - Summer 2014,http://rpubs.com/alexcpsec/tiq-test-Summer2014-2
【4】Measuring the IQ of your Threat Intelligence Feeds (#tiqtest),https://www.slideshare.net/AlexandrePinto10/defcon-22-measuring-the
【5】tiq-test-Winter2015,https://github.com/mlsecproject/tiq-test-Winter2015
【6】From Threat Intelligence to Defense Cleverness: A Data Science Approach,http://rpubs.com/alexcpsec/tiq-test-Winter2015
【7】From Threat Intelligence to Defense Cleverness: A Data Science Approach ,https://digital-forensics.sans.org/summit-archives/cti2015/From-Threat-Intelligence-to-Defense-Cleverness–A-Data-Science-Approach–Alex-Pinto-Niddel.pdf