Open Source Intelligence (OSINT): A Practical Example

This is part 2 of our series of articles on OSINT. Find all articles here.

OSINT is the practice of gathering intelligence from publicly available sources to support intelligence needs. In the cybersecurity arena, OSINT is used widely to discover vulnerabilities in IT systems and is commonly named Technical Footprinting. Footprinting is the first task conducted by hackers – both black and white hat hackers – before attacking computer systems. Gathering technical information about the target computer network is the first phase in any penetration testing methodology.

In this article, I will demonstrate how various OSINT techniques can be used to gain useful intelligence from public sources about a target's computer systems.

Technical Investigation of the Target Website

By knowing the programming language, web framework, and content management system (CMS) used to create the target website, we can search for vulnerabilities that affect these components (especially zero-day vulnerabilities) and then work to exploit them as soon as they are discovered.

Several online services identify the technology used to build a website. To use such a service, all you need to do is supply the target domain name; the service returns a list of technical specifications, including the libraries and programming languages used to build the site. These services also reveal the target website’s hosting provider, the name the SSL certificate is registered to, and the type of email system in use. The following are some popular services:

  1. https://builtwith.com
  2. https://www.wappalyzer.com

In the following screen capture, I use the BuiltWith service to investigate the technical specifications of a target website. This reveals a range of technical information (see Figure 1) and opens the door to closer examination of each technology used to build the subject website. Next, I need to check the list of technical specifications for any unpatched operating system or outdated content management system with known vulnerabilities that I could exploit to gain entry to the target system.

Figure 1 – Using builtwith to investigate technology used to build the target website
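Part of what such services report can also be read straight from a site's HTTP response headers. The following is a minimal Python sketch of that idea, not a description of how BuiltWith or Wappalyzer work internally. The function name fingerprint_headers and the sample headers are my own illustrative assumptions; the header names themselves (Server, X-Powered-By, X-AspNet-Version, X-Generator) are ones that web servers and frameworks commonly send.

```python
from typing import Dict, List

# Response headers that commonly disclose server-side technology.
REVEALING_HEADERS = ("server", "x-powered-by", "x-aspnet-version", "x-generator")


def fingerprint_headers(headers: Dict[str, str]) -> List[str]:
    """Return 'header: value' hints for headers that disclose technology."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    return [
        f"{name}: {normalized[name]}"
        for name in REVEALING_HEADERS
        if name in normalized
    ]


# Hypothetical response headers, made up for illustration only.
sample_headers = {
    "Server": "Microsoft-IIS/10.0",
    "X-Powered-By": "ASP.NET",
    "Content-Type": "text/html; charset=utf-8",
}
hints = fingerprint_headers(sample_headers)
print(hints)
```

In practice the headers would come from a real HTTP response; many sites strip or fake these values, so treat them as hints rather than proof.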

For example, a large number of ASP.NET websites use Telerik controls (https://www.telerik.com) to enrich their design. To find security vulnerabilities associated with Telerik controls, you can go to https://www.cvedetails.com and search for Telerik (see Figure 2).

Figure 2 – List of security vulnerabilities for Telerik Controls

Many websites list security vulnerabilities in operating systems, software, and web applications. The following are among the most popular ones we can use to search for common vulnerabilities and exposures:

  1. https://vulmon.com
  2. https://sploitus.com
  3. https://www.saucs.com
  4. https://www.shodan.io

Analytics and Tracking

Most websites use Google services to analyze traffic and serve advertisements. We can use this fact to discover linked domain names. For example, I can find all websites that use the same Google AdSense or Google Analytics account. Dnslytics (https://dnslytics.com/reverse-analytics) is a free online service that finds domains sharing the same Google Analytics ID (see Figure 3).

Figure 3 – Using a reverse Google Analytics service to reveal domain names that belong to the same entity
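Before running a reverse lookup, you need the tracking ID itself, which is normally embedded in the page source. Here is a minimal Python sketch for pulling such IDs out of HTML. The patterns are assumptions based on the common ID shapes (UA-1234567-1 for classic Google Analytics, pub- followed by 16 digits for AdSense) and will miss other formats; the sample HTML is made up.

```python
import re
from typing import Set

# Assumed ID shapes: classic Analytics IDs such as UA-12345678-1 and
# AdSense publisher IDs such as pub-1234567890123456 (often written ca-pub-...).
GA_ID = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")
ADSENSE_ID = re.compile(r"\bpub-\d{16}\b")


def extract_tracking_ids(html: str) -> Set[str]:
    """Collect the distinct Analytics and AdSense IDs found in page source."""
    return set(GA_ID.findall(html)) | set(ADSENSE_ID.findall(html))


# Hypothetical page source, for illustration only.
sample_html = """
<script>ga('create', 'UA-12345678-1', 'auto');</script>
<ins class="adsbygoogle" data-ad-client="ca-pub-1234567890123456"></ins>
"""
tracking_ids = extract_tracking_ids(sample_html)
print(sorted(tracking_ids))
```

Any ID found this way can then be fed into a reverse analytics service to list other domains that share it.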

Target Website’s Previous History

In many instances, checking old versions of the target website can reveal important information. For example, an old version of a corporation’s website may reveal top managers’ email addresses and phone numbers that were removed from the current version. The Wayback Machine (https://archive.org/web) is a good place to start your search for old versions of websites (see Figure 4).

Figure 4 – Using the Wayback Machine to see previous versions of websites
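If you want to script these lookups rather than browse by hand, the Wayback Machine exposes an availability endpoint and a predictable snapshot URL layout. The sketch below only builds the URLs and makes no network request; the helper names are my own, and you should check the Internet Archive's documentation for the current behaviour of the endpoint.

```python
from urllib.parse import urlencode


def wayback_availability_url(domain: str, timestamp: str = "") -> str:
    """Build a query URL asking for the snapshot closest to a timestamp.

    The timestamp uses the YYYYMMDDhhmmss layout and may be shortened
    (for example "2015" or "20150101").
    """
    params = {"url": domain}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)


def wayback_snapshot_url(domain: str, timestamp: str) -> str:
    """Build a direct link to the archived copy nearest to a timestamp."""
    return f"https://web.archive.org/web/{timestamp}/{domain}"


print(wayback_availability_url("example.com", "20150101"))
print(wayback_snapshot_url("example.com", "2015"))
```

Requesting the first URL returns a JSON description of the closest archived snapshot, if one exists; the second URL opens the archived page in a browser.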

Sub-domain Name Discovery

Finding a target website’s sub-domains is important and can reveal sensitive information about the target, such as the addresses of the VPN portal, the email system, and an FTP server where some files may have been left unprotected. To find all sub-domain names of a target indexed by Google, use the following Google search command (see Figure 5).

Figure 5 – Replace example.com with your target domain name
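The exact query from the figure is not reproduced in the text, so the following Python sketch uses one commonly used form as an assumption: the site: operator combined with -inurl:www to hide the main site. It also shows how the hostnames in the result URLs can be reduced to a set of sub-domains. All function names and sample URLs are illustrative.

```python
from typing import Iterable, Set
from urllib.parse import urlparse


def subdomain_query(domain: str) -> str:
    """Build a search query listing indexed hosts other than www (assumed form)."""
    return f"site:{domain} -inurl:www"


def collect_subdomains(result_urls: Iterable[str], domain: str) -> Set[str]:
    """Keep the distinct hostnames under `domain`, skipping the www host."""
    suffix = "." + domain.lower()
    found = set()
    for url in result_urls:
        host = (urlparse(url).hostname or "").lower()
        if host.endswith(suffix) and host != "www" + suffix:
            found.add(host)
    return found


# Hypothetical search results, for illustration only.
sample_results = [
    "https://vpn.example.com/login",
    "http://mail.example.com",
    "https://www.example.com/about",
    "https://example.org/",
]
print(subdomain_query("example.com"))
print(sorted(collect_subdomains(sample_results, "example.com")))
```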

Type and versions of IT infrastructure of the target company

Job websites – and any job announcement posted on the target website – should be analyzed to discover the exact IT infrastructure used by the target organization. For example, I conducted a simple search of employee resumes on job websites and was able to capture important information about the target organization’s security systems (e.g. firewalls and intrusion detection systems), server operating system type, email system, networking devices, types of backup systems and much more (see Figure 6).

Figure 6 – Sample resume found on a job website that reveals the type of IT infrastructure of the target organization

Harvest digital files hosted on the target domain name

Advanced Google search techniques (also known as Google dorks) can reveal a great deal of information about the target organization’s IT systems, as well as confidential files left on the public server. There are thousands of Google dorks, and you can practice creating your own. A comprehensive list of Google dorks can be found in the Google Hacking Database (https://www.exploit-db.com/google-hacking-database).

I will use a Google dork to locate all PDF files posted on the target website (see Figure 7):

Figure 7 – Find all PDF files on the target domain name

In the above example, I searched for PDF files; however, you can change the file type to any other you want (doc, docx, xls, txt).
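Since the only thing that changes between these searches is the file extension, the queries are easy to generate. A minimal Python sketch, assuming the usual site: and filetype: operator form (the query in the figure is not reproduced in the text) and a helper name of my own:

```python
from typing import Iterable, List


def filetype_queries(domain: str, extensions: Iterable[str]) -> List[str]:
    """Build one search query per file extension for the given domain."""
    return [f"site:{domain} filetype:{ext}" for ext in extensions]


queries = filetype_queries("example.com", ["pdf", "doc", "docx", "xls", "txt"])
for query in queries:
    print(query)
```

Replace example.com with your target domain name and paste each query into the search box.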

Information contained within file metadata

For each file found on the target website, we should investigate its metadata. Metadata is data about data. In technical terms, it contains hidden descriptive information about the file it belongs to. For example, the metadata in an MS Office document might include the author’s name, the date and time of creation, comments, the software used to create the file, and the type of OS on the device used to create it (see Figure 8).

Figure 8 – Checking a PDF file’s metadata

From Figure 8, I found the following facts about the subject PDF file metadata:

  1. PDF format version of the file: 1.5
  2. Application used to create the report: MS PowerPoint 2010 (using the “Save As” function)
  3. Type of OS used on the target device: Windows
  4. File creation date/time: July 2017
  5. Author name (the person who created the file).
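To show where such fields live, here is a deliberately naive Python sketch that scans the raw bytes of a PDF for its header version and a few standard document information entries (Author, Creator, Producer, CreationDate). It is an illustration under strong assumptions: it only works when the information dictionary is stored unencrypted, uncompressed, and as plain parenthesized strings without nested or escaped parentheses. A dedicated metadata tool is the better choice for real work. The function name and sample bytes are made up.

```python
import re
from typing import Dict

# Matches entries such as "/Author (Jane Doe)" in an uncompressed Info dictionary.
INFO_FIELD = re.compile(rb"/(Author|Creator|Producer|CreationDate)\s*\(([^)]*)\)")
# The file header, e.g. "%PDF-1.5", states the PDF format version.
PDF_VERSION = re.compile(rb"%PDF-(\d\.\d)")


def naive_pdf_metadata(raw: bytes) -> Dict[str, str]:
    """Extract the PDF version and simple Info entries from raw PDF bytes."""
    meta = {}
    version = PDF_VERSION.search(raw[:1024])
    if version:
        meta["PDFVersion"] = version.group(1).decode("ascii")
    for key, value in INFO_FIELD.findall(raw):
        meta[key.decode("ascii")] = value.decode("latin-1")
    return meta


# Hypothetical, heavily simplified PDF content for illustration only.
sample_pdf = (
    b"%PDF-1.5\n1 0 obj\n"
    b"<< /Author (Jane Doe) /Creator (Microsoft PowerPoint 2010) "
    b"/CreationDate (D:20170712093000) >>\nendobj\n"
)
metadata = naive_pdf_metadata(sample_pdf)
print(metadata)
```

For a real file, read it with open(path, "rb").read() and pass the bytes to the function; an empty result usually means the metadata is compressed or stored in a form this sketch does not handle.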

If the file contains an author name, an additional search can be conducted to look up more details about the file’s author using specialized people search websites. The following are some popular people search engines:

  1. Spokeo (https://www.spokeo.com) (see Figure 9)
  2. Truepeoplesearch (https://www.truepeoplesearch.com)
  3. Truthfinder (https://www.truthfinder.com)
  4. 411 (https://www.411.com)

Figure 9 – Using Spokeo to look up information about people you know

Email naming criteria

To predict the naming criteria used by the target organization when creating new email accounts, we should investigate the naming of its current email addresses. For example, many organizations use the following naming criteria:

  • Most common pattern for naming new emails: {first}.{first three characters of last}@exampleWebsite.com
  • Other naming criteria include: {first}@exampleWebsite.com

I usually use https://www.email-format.com to find the email address formats in use at thousands of companies.
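Once a pattern is known, candidate addresses for a named employee can be generated mechanically. The Python sketch below covers the two patterns listed above; it reads the first pattern as the first name, a dot, then the first three characters of the last name, which is my interpretation. The function name and the sample person are hypothetical.

```python
from typing import List


def candidate_emails(first: str, last: str, domain: str) -> List[str]:
    """Generate likely addresses for a person from the two patterns above."""
    first, last = first.strip().lower(), last.strip().lower()
    return [
        f"{first}.{last[:3]}@{domain}",  # {first}.{first three characters of last}
        f"{first}@{domain}",             # {first}
    ]


# Hypothetical employee name, for illustration only.
guesses = candidate_emails("Jane", "Doering", "exampleWebsite.com")
print(guesses)
```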

Leaked Credentials

Leaked account credentials are spread everywhere online, especially on the darknet. For example, pastebin websites (see Figure 10) contain a vast amount of leaked credentials. Anonymous file sharing websites, such as https://anonfile.com (see Figure 11), also host large numbers of leaked credential files with billions of records.

Figure 10 – Leaked credentials found on Pastebin.com

Figure 11 – A file hosted on https://anonfile.com containing thousands of leaked credentials

Conclusion

In this article, I tried to give a brief overview of OSINT capabilities and how to use them to gather useful intelligence about different entities.

In today’s information age, OSINT skills are valuable; however, there are several prerequisites you should master to make your OSINT searches rich and effective. For instance, before you begin your OSINT search, you should learn how to conceal your digital identity and become anonymous online. This is essential to prevent threat actors from discovering your search activities. OSINT is also strongly related to digital forensics, and knowing the basics of digital forensics operations will prove useful when conducting OSINT gathering activities.

In the next article, I will cover how to protect your online privacy. I will discuss the techniques currently used to track and profile Internet users and how to avoid them. I will also explore the layers of the web and show you how to access the darknet and how to use anonymity networks such as Tor to surf the ordinary web anonymously.

The first part of this series, an introduction to OSINT, can be found here.

Chief Strategy Officer ITSEC Group / Co-Founder ITSEC Thailand

Dr. Khera is a veteran cybersecurity executive with more than two decades’ worth of experience working with information security technology, models and processes. He is currently the Chief Strategy Officer of ITSEC Group and the Co-founder and CEO of ITSEC (Thailand). ITSEC is an international information security firm offering a wide range of high-quality information security services and solutions, with operations in Indonesia, Malaysia, Philippines, Singapore, Thailand and Dubai.

Previously the head of cyber security Presales for NOKIA, Dr. Khera has worked with every major telecom provider and government in the APAC region to design and deliver security solutions to a constantly evolving cybersecurity threat landscape.

Dr. Khera holds a Doctor of Information Technology (DIT) from Murdoch University, a Postgraduate Certificate in Network Computing from Monash University and a Certificate of Executive Leadership from Cornell University.

Dr. Khera was one of the first professionals to be awarded the prestigious Asia Pacific Information Security Leadership Awards (ISLA) from ISC2, a world-leading information security certification body, under the category of distinguished IT Security Practitioner for APAC.
