Reconnaissance tools, as the name suggests, are those that we use to gather information, usually passively, about the networks and systems that we might plan to take action against in a logical sense. Such efforts may include gathering information from public websites, looking up Domain Name System (DNS) records, collecting metadata from accessible documents, retrieving very specific information through the use of search engines, or any number of similar activities. Many such tools align closely with Open Source Intelligence (OSINT) techniques.
General Information Gathering
When looking for general information that can be used to provide intelligence on a target, there are a variety of sources that we can turn to. We can mine websites for data on companies and individuals, we can search job postings for a variety of information, we can look for personal and technical information in resumes, we can use search engines both in a general and very specific sense, and we can also use specialized searching tools such as Maltego. In all likelihood, we will utilize a combination of such techniques to assemble a more complete picture of our target.
Websites and Web Servers
All manner of interesting information can be found on the websites of individuals and organizations. Some such information may be intentionally displayed, such as corporate organizational information, and some of it may be shared in an unintentional or unauthorized manner.
Search engines, such as Google, can be of great use when conducting research for an attack. They can be used to collect information regarding a particular target, look up application or hardware details, or even collect very specific information to locate vulnerabilities in the target environment.
Even more specifically, search engines can be used to collect data that does not appear during casual searching. Such methods often involve very specifically targeted queries to the standard search engines, or the use of specially tuned search engines, such as Pipl.com, that return information within a particular area of focus.
Google hacking is the use of advanced operators in search engine queries, in order to enable more directly targeted searches. Although the name would tend to indicate that such searching would be specific to the Google search engine, in actuality, similar search parameters can be used with almost any search engine. Lists of such operators can generally be found on the page for the search engine in question.
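To make the idea of advanced operators concrete, the following is a minimal sketch of how such a query might be assembled. The helper name build_query is hypothetical, and example.com is a placeholder target. The operators shown (site, filetype, intitle) are among those commonly supported by Google, but support varies between search engines.

```python
from urllib.parse import quote_plus


def build_query(terms="", **operators):
    """Combine free-text terms with operator:value pairs into one query string.

    build_query is a hypothetical helper for illustration only.
    """
    parts = [terms] if terms else []
    for name, value in operators.items():
        # Quote values containing spaces so the operator applies to the whole phrase.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)


# Look for PDF documents on a placeholder domain that mention "confidential".
query = build_query("confidential", site="example.com", filetype="pdf")

# URL-encode the query so it can be placed in a search request.
encoded = quote_plus(query)

print(query)
print(encoded)
```

Keeping the operators as keyword arguments makes it easy to vary one restriction at a time, which is how such searches are usually narrowed in practice.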
The Deep Web
When search engines crawl the Internet to construct the indexes on which their search results are based, they touch only the very surface of the information that is available.
Great unexplored depths of information exist unplumbed due to the nature of such indexing. When we are conducting research on a target, we may very well like to see some of this information.
In recent years, several specialized search engines, such as Shodan, have come into being to provide access to some portion of this hidden information. Each of these search engines generally covers only a narrow category of information.
Whois is a tool used to query the globally distributed set of databases that contain the information regarding domain names around the world. The databases contain information regarding when the domain was registered or last updated, which registrar it was registered with, contact information for the owners of the domain, and the name servers that are used to resolve requests sent to the domain name.
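The whois protocol itself is simple: a client opens a TCP connection to port 43 on a whois server, sends the query followed by a line ending, and reads the reply until the server closes the connection. The sketch below illustrates this under a few assumptions: the function names are hypothetical, the correct server must be supplied by the caller because it depends on the top-level domain, and the sample response is invented for illustration and does not come from a real lookup. Response formats also differ between registries, so the parser is deliberately loose.

```python
import socket


def whois_query(domain, server, port=43, timeout=10.0):
    """Send a whois query to the given server and return the raw text reply."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_whois(text):
    """Collect 'Key: value' lines into a dict mapping each key to a list of values."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comment lines, and lines without a key/value separator.
        if not line or line.startswith(("%", "#")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if value:
            fields.setdefault(key.strip(), []).append(value)
    return fields


# Invented sample reply, for illustration only.
SAMPLE = """\
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Creation Date: 1995-08-14T04:00:00Z
Updated Date: 2023-08-14T07:01:38Z
Name Server: A.EXAMPLE-DNS.NET
Name Server: B.EXAMPLE-DNS.NET
"""

fields = parse_whois(SAMPLE)
print(fields["Registrar"], fields["Name Server"])
```

Against a live server, the two functions would be chained as parse_whois(whois_query(domain, server)). Values are stored as lists because fields such as name servers commonly appear more than once.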
Metadata is data about data. For instance, if we have a file containing the text of this chapter, and the file has a file size, last accessed timestamp, and bits set for file permissions, none of this data has anything directly to do with the contents of the file itself, but is data about the file storing the text. Although such information may seem to be rather mundane outside of digital forensics circles, some of the information contained in document or image metadata may be very interesting indeed. We may find items such as the usernames that have edited the file, paths where the file has been stored, previous revisions of the text, coordinates that indicate where a picture was taken, image thumbnails, or any of hundreds of other items of information, all stored in the file with the actual intended content.
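The file system metadata mentioned above (size, last accessed timestamp, and permission bits) can be read without opening the file at all. The following is a small sketch using the Python standard library; the helper name file_metadata and the temporary sample file are assumptions made for the example.

```python
import os
import stat
import tempfile
import time


def file_metadata(path):
    """Return file system metadata for path without reading its contents."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "last_accessed": time.ctime(info.st_atime),
        "last_modified": time.ctime(info.st_mtime),
        # Render the permission bits in the familiar ls style, e.g. -rw-r--r--
        "permissions": stat.filemode(info.st_mode),
    }


# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("chapter text")
    sample_path = handle.name

meta = file_metadata(sample_path)
os.remove(sample_path)

for name, value in meta.items():
    print(f"{name}: {value}")
```

None of the values returned depend on what the file says, only on how it is stored, which is exactly the distinction between data and metadata drawn above.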
Metagoofil is an excellent tool for hunting down metadata. Metagoofil is a script that conducts very directed Google searches, using some of the advanced operators.
Exiftool is another wonderful tool for extracting metadata from documents and images. Exiftool is named for a type of metadata, called EXIF data, that is normally attached to image files. This data can include information regarding the equipment on which the image was created, including serial numbers; thumbnails of the original image; coordinates where the image was created, for GPS-enabled devices; and a host of other information.
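Exiftool can emit its findings as JSON, which makes it convenient to drive from a script. The sketch below assumes that exiftool is installed and on the PATH. The function names, the short list of tags treated as interesting, and the sample record are all assumptions made for illustration; the tags actually present depend entirely on the file being examined.

```python
import json
import subprocess

# Assumed selection of tags that tend to matter during reconnaissance.
INTERESTING = (
    "Author", "Creator", "Make", "Model",
    "SerialNumber", "GPSLatitude", "GPSLongitude",
)


def run_exiftool(path):
    """Run exiftool on path and return its JSON output as a list of dicts.

    Requires the exiftool command to be installed and on the PATH.
    """
    result = subprocess.run(
        ["exiftool", "-json", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def pick_interesting(record, wanted=INTERESTING):
    """Keep only the tags from one exiftool record that appear in wanted."""
    return {tag: record[tag] for tag in wanted if tag in record}


# Hypothetical record in the shape exiftool returns, for illustration only.
sample_record = {
    "SourceFile": "photo.jpg",
    "Make": "ExampleCam",
    "Model": "X100",
    "ImageWidth": 4000,
    "GPSLatitude": "40 deg 0' 0.00\" N",
}

print(pick_interesting(sample_record))
```

For a real file, the call would be pick_interesting(run_exiftool(path)[0]), since exiftool returns one record per file examined.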
Strings is a utility that will parse a given file for strings of text, generally consisting of several printable characters in a row. Strings can be very helpful in finding data hidden in files, even data, such as deleted content, which may not be accessible through normal applications that are used to access and manipulate the file. Strings is a common tool on Linux and Unix distributions and is available as a download for Microsoft operating systems.
We can use strings to locate metadata not only in documents and images, but also in a variety of other files. Although some of the other metadata-centric tools may be more efficient at finding known metadata, strings will find all of the strings in a given file. We may get back quite a bit of irrelevant or useless data, but we will likely get back all of the data that is stored in the file in plaintext.
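The core behavior of strings is easy to approximate, which helps show why it finds text that ordinary applications hide. The following is a simplified sketch, not a reimplementation of the real utility: it looks only for runs of printable ASCII characters, with a default minimum length of four to match the usual default of strings. The function name and the sample bytes are invented for the example.

```python
import re


def extract_strings(data, min_length=4):
    """Return runs of at least min_length printable ASCII characters found in data."""
    # Bytes 0x20 through 0x7e cover the space character and all visible ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Invented binary blob with text fragments embedded between non-printable bytes.
blob = (
    b"\x00\x01Author: jsmith\x00\xff\x10ab\x00"
    b"C:\\Users\\jsmith\\draft.doc\x00"
)

found = extract_strings(blob)
for item in found:
    print(item)
```

The two-character run "ab" is dropped because it falls below the minimum length, which is how the tool filters out much of the noise in binary data. To scan a real file, its contents can be read in binary mode and passed to extract_strings.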
Throughout this section, we have discussed a number of types of data that can be helpful when conducting reconnaissance on a target. We have also talked about a number of tools that can be used to collect various items of such information. Several tools exist that can collect multiple items of information, but one tool shines in this area: Maltego from Paterva. Maltego allows us to start with a particular item of information, such as an email address, phone number, or IP address, and use it as the basis for collecting other information. In Maltego, such links between items of information are referred to as transforms, and they can be very powerful for collecting large amounts of information in a very short period of time.
Maltego is available in both free and commercial editions, varying largely in the information available and what can be done with the information once it is discovered. Maltego often returns information that would have required a considerable amount of manual searching to discover.
Defense against the various tools that can be used for reconnaissance against a target generally revolves around one simple concept: limit the sources of data and the data that is available from each source to the greatest extent that is reasonable. Although it may not always be feasible to completely sever the flow of outgoing data, and in some cases may be outright harmful to do so, we can certainly attempt to keep a handle on the information that we do allow out.
In the case of information gained from general data found on websites, we can limit the information to a certain extent, but, in the case of a business, we cannot afford to be without such methods of communication. We can, however, be careful not to release overly detailed information, particularly in cases where we can very easily leak information, such as the job postings that we discussed earlier in this section. We can also implement policy in organizations to guide those who might post sensitive information to internal or external websites or social networking sites.