ShadowTrackr

How to monitor keywords

27 September 2017
Keyword monitoring needed improvement. Recently I was confronted with a possible data leak and wanted to monitor a few specific names and phrases in the news. Until now, the bots simply checked whether the word appeared with a space before or after it in the text on news sites and copy-paste sites:

        if " "+keyword in text or keyword+" " in text:
    
While this did a reasonable job of not matching things like "blakeywordbla", it produced far too many false positives. I needed finer control and ended up implementing literal matches, multiple keyword matches and negative keywords.
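To see where the false positives come from, here is a small sketch. `old_match` wraps the check shown above in a function so it can be called. `word_match` is a hypothetical stricter variant added purely for comparison (an assumption, not necessarily how the ShadowTrackr bots work): it requires that the keyword is not glued to other word characters on either side.

```python
import re

def old_match(keyword, text):
    # The check from the post: a space before OR after the keyword is enough.
    return " " + keyword in text or keyword + " " in text

def word_match(keyword, text):
    # Hypothetical alternative: no word character directly before or after
    # the keyword. re.escape keeps characters like "." literal.
    pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

# One space on either side satisfies old_match, so "keywordbla" slips through,
# while a keyword followed by punctuation and no surrounding space is missed.
print(old_match("keyword", "my keywordbla list"))
print(old_match("keyword", "keyword."))
print(word_match("keyword", "my keywordbla list"))
print(word_match("keyword", "Keyword."))
```

Because the old check only needs a space on one side, a keyword that is a prefix or suffix of a longer word still triggers it.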

Literal matches: "the keyword"

This works the way people are used to from Google searches. The exact words need to be present in the order specified. It triggers on "This is the keyword you are looking for", but not on "this keyword is not the one you are looking for".

Multiple matches: the keyword

Both "the" and "keyword" need to be present in the text in order to trigger, but the order is not important. They can easily be sentences apart, as in: "This keyword is what you want. The other words are ignored."

Negative matches: keyword -the

Again like in Google searches, any text with the word "keyword" in which "the" does not appear at all is a match. For example: "The keyword is not enough" doesn't match, and "Keyword matches are very useful for finding leaked data" does.
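The three rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual ShadowTrackr implementation: the function names `parse_query`, `term_in_text` and `matches` are hypothetical, terms are matched as whole words, matching is case-insensitive, and a query needs at least one positive term.

```python
import re

def parse_query(query):
    """Split a query into (required, excluded) terms.

    "quoted phrases" stay together, a leading "-" marks a negative term.
    """
    required, excluded = [], []
    for minus, phrase, word in re.findall(r'(-?)(?:"([^"]+)"|(\S+))', query):
        term = phrase or word
        (excluded if minus else required).append(term)
    return required, excluded

def term_in_text(term, text):
    # Assumption: whole-word matching. The lookarounds are only added where
    # the term starts or ends with a letter or digit, so a term that begins
    # with punctuation (for example "@") can still match inside an address.
    prefix = r"(?<!\w)" if term[0].isalnum() else ""
    suffix = r"(?!\w)" if term[-1].isalnum() else ""
    pattern = prefix + re.escape(term) + suffix
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

def matches(query, text):
    required, excluded = parse_query(query)
    return (
        bool(required)
        and all(term_in_text(t, text) for t in required)
        and not any(term_in_text(t, text) for t in excluded)
    )

# The examples from the sections above:
print(matches('"the keyword"', "This is the keyword you are looking for"))
print(matches('the keyword', "This keyword is what you want. The other words are ignored."))
print(matches('keyword -the', "The keyword is not enough"))
```

A literal match is simply a term that contains spaces, so literal, multiple and negative terms can share one code path.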

Of course you can mix all of the above. Do note that all keyword matches are case-insensitive, and that news articles from the news feeds we monitor trigger on keyword matches in both the headline and the text. Data dump sites often don't have a title, so there the matches are on keywords in the dump itself. Here are some keyword combinations to get your creativity started:

shadowtrackr leak -water
@shadowtrackr.com
shadowtrackr password
"Tracking your online footprint"

You can add them under Assets in the side menu. Happy keyword hunting! And don't forget to set push notifications to get a heads-up on those really bad days.