
Extract data using devtools

Sometimes it is necessary to extract or collect data from web pages. This can be done without any special tool: in the browser where you are viewing the page, press F12 to open the developer tools and type some JavaScript code into the console window.

For example, suppose you need a list of all the H3 headings on the current page, formatted as a Markdown bullet list.

Let's take the page https://geraldo.dev/certificate/tag/security.html as the starting point for data extraction.

Searching for H3 tags in the DevTools console:

document.getElementsByTagName("h3")

This statement returns:

HTMLCollection(16) [h3, h3, h3, h3, h3, h3, h3, h3, h3, h3, h3, h3, h3, h3, h3, h3]
Fig. 1 - HTMLCollection
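
The result above is an HTMLCollection, which is array-like (it has a length and indexed items) but is not a real array, so array methods such as forEach and map are not available on it directly. A minimal sketch of converting it first, where `collectionToTexts` is a hypothetical helper name and not part of the DOM API:

```javascript
// Hypothetical helper: turn any array-like collection of elements
// into a plain array of trimmed text strings.
function collectionToTexts(collection) {
  // Array.from accepts array-likes (objects with a length and indexed items),
  // which is what an HTMLCollection is. The second argument maps each item.
  return Array.from(collection, (el) => el.textContent.trim());
}

// In the DevTools console a real `document` exists. The guard lets the
// same code load outside a browser (for example in Node.js) without errors.
if (typeof document !== "undefined") {
  console.log(collectionToTexts(document.getElementsByTagName("h3")));
}
```

The spread syntax `[...collection]` does the same conversion in modern browsers; Array.from is used here only because it can map in the same call.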

The following commands build a list of the H3 titles, format it as Markdown, and copy it to the clipboard:

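// Accumulate one Markdown bullet per H3 heading.
myList = "";
// Spread the HTMLCollection into an array so that forEach is available.
[...document.getElementsByTagName("h3")].forEach(
  e => myList += "* " + e.textContent + "\n"
);
// copy() is a DevTools console utility that puts the string on the clipboard.
copy(myList);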

After running the previous snippet, the clipboard will contain text like this:

* Application Security in DevSecOps
* Crucial Role of Penetration Testing and Vulnerability Assessments in Cybersecurity
* CSSLP Cert Prep 1 Secure Software Concepts
* CSSLP Cert Prep 2 Secure Software Requirements
* CSSLP Cert Prep 3 Secure Software Design
* CSSLP Cert Prep 4 Secure Software Implementation
* CSSLP Cert Prep 5 Secure Software Testing
* CSSLP Cert Prep 6 Secure Lifecycle Management
* CSSLP Cert Prep 7 Software Deployment Operations and Maintenance
* CSSLP Cert Prep 8 Supply Chain and Software Acquisition
* CSSLP Cert Prep The Basics
* Cyber Security Foundation certiprof
* Desenvolvimento Seguro
* DevSecOps SAST and Code Review for DevSecOps
* Dynamic Application Security Testing DAST
* ISO 27001 Segurança da Informação
* Pen Test Analise e Testes de Vulnerabilidades em Redes Corporativas
* Secure Coding in Java
* Segurança da Informação
* TryHackMe Badge webbed
* Web Security OAuth and OpenID Connect
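
The same extraction can be wrapped in a small reusable function. This is only a sketch: `toMarkdownList` and its `bullet` parameter are hypothetical names introduced here, and trimming whitespace and skipping empty headings are assumptions about what is wanted rather than something the original snippet does.

```javascript
// Hypothetical helper: build a Markdown bullet list from a collection
// of elements (anything array-like or iterable with textContent items).
function toMarkdownList(elements, bullet = "*") {
  return Array.from(elements)
    .map((el) => el.textContent.trim())   // assumption: strip surrounding whitespace
    .filter((text) => text.length > 0)    // assumption: skip empty headings
    .map((text) => `${bullet} ${text}`)
    .join("\n");
}

// Only touch the DOM when one exists (i.e. when pasted into the console).
if (typeof document !== "undefined") {
  const markdown = toMarkdownList(document.querySelectorAll("h3"));
  // copy() exists only in the DevTools console, so fall back to logging.
  if (typeof copy === "function") {
    copy(markdown);
  } else {
    console.log(markdown);
  }
}
```

Passing a different selector to `document.querySelectorAll`, for example "h2", collects other heading levels without changing the function.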