Meet DarkBERT: New AI Trained on the Dark Web - GeeksforGeeks

May 20, 12:35



In a daring exploration of the hidden corners of the internet, South Korean researchers have developed DarkBERT, an AI model specifically designed to navigate the obscure realms of the dark web. With its ability to index and analyze clandestine domains, DarkBERT sheds light on the enigmatic and often illicit aspects of online activity that are typically concealed from public view.

Venturing into the shadowy realms of the World Wide Web, researchers have embarked on a captivating exploration of the darkest corners, where illegal and malicious activities thrive. 

In their quest to combat cybercrime, a rapidly evolving field heavily reliant on natural language processing, these researchers have developed DarkBERT—an AI model that aims to illuminate the hidden intricacies of the digital underworld. 

By delving into the murky depths of leaked data sharing and illicit drug trade, DarkBERT offers a unique perspective on the relentless fight against online wrongdoing, with noble intentions at its core.

What’s the Dark Web?

The dark web is the part of the internet that search engines do not index and that can only be reached with special software such as Tor. And yes, a big part of the dark web hosts illicit content, including a large share of criminal activity.

This side of the web might even hold traces of your own activity. Sounds scary? It is. In a 2019 study, Dr. Michael McGuire showed that the number of potentially harmful listings on the dark web had risen compared to 2016.

You might find fake credit card numbers, stolen platform credentials, hacked accounts, software, and even methods for breaking into these! Some visitors might even come across disturbing imagery that is scary to the core.

Yes, not everything on the dark web is illegal, and there might be helpful and fun things too. But the dark side contains a rabbit hole you wouldn’t like.

What is DarkBERT?

In a groundbreaking endeavor, a team of researchers has ventured into the depths of the dark web, harnessing the power of DarkBERT—a cutting-edge language model—to illuminate the previously impenetrable corners of the internet. The team’s pioneering efforts have resulted in the development of a sophisticated tool capable of comprehending and analyzing the elusive domains hidden from search engines, potentially revolutionizing the fight against cybercrime. 

With performance that surpasses previous models, DarkBERT holds promise for bolstering cybersecurity measures and monitoring illicit activities within the shadowy recesses of the digital underworld.

In a yet-to-be-peer-reviewed paper titled “DarkBERT: A Language Model for the Dark Side of the Internet,” the research team outlines its approach. The team crawled the dark web through the Tor network, the anonymizing network used to reach hidden sites, and collected a large database of raw text on which to train DarkBERT.

According to the team’s paper, DarkBERT outperforms its predecessors, including the notable RoBERTa model introduced by Facebook researchers in 2019. RoBERTa is trained to predict intentionally hidden (masked) sections of text within unannotated language samples, while DarkBERT applies this kind of language model to the text of the dark web.
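To make the idea of predicting hidden sections of text more concrete, here is a minimal sketch of the masking step in plain Python. It is an illustration only, not the researchers' code: the function name `mask_tokens`, the whitespace tokenization, and the 15% default masking rate are assumptions chosen for this example, and real models use subword tokenizers and somewhat more elaborate masking rules.

```python
import random

# Placeholder symbol standing in for a hidden token (illustrative choice).
MASK_TOKEN = "<mask>"


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide a random subset of tokens.

    Returns the masked sequence plus a dict mapping each hidden
    position to the original token the model would have to predict.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    masked = []
    targets = {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            targets[i] = tok  # remember what was hidden
        else:
            masked.append(tok)
    return masked, targets


if __name__ == "__main__":
    sentence = "stolen credentials are traded on hidden forums".split()
    masked, targets = mask_tokens(sentence, mask_prob=0.3, seed=1)
    print(masked)   # the input the model sees
    print(targets)  # the words it is asked to recover
```

During training, a model sees only the masked sequence and is scored on how well it recovers the entries in `targets`, which is why no human annotation of the text is needed.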


The researchers propose a wide range of cybersecurity applications for DarkBERT’s advanced functionality. It possesses the potential to detect websites engaged in selling ransomware or leaking confidential data—a crucial aspect in combating the ever-evolving landscape of cyber threats. 

Moreover, DarkBERT’s capabilities extend to monitoring the constantly changing dark web forums and proactively identifying and scrutinizing illicit information exchanges.

DarkBERT emerges as a beacon of hope in the relentless battle against online malevolence. By harnessing the power of natural language processing and delving into the enigmatic world of the dark web, this formidable AI model offers unprecedented insights, empowering cybersecurity professionals to counteract cybercrime with increased efficacy.

As the curtain lifts on the previously concealed aspects of the internet, DarkBERT ushers in a new era of resilience and vigilance, safeguarding the digital landscape against the clandestine forces lurking within.

How Does DarkBERT Function?

DarkBERT is still a work in progress. The developers are adapting the model to the language used on the dark web, and they train it on data gathered by crawling the Tor network.

It has also been reported that the pre-training data is filtered and deduplicated. The data is also preprocessed to address concerns about the sensitive information the crawled text is expected to contain.
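The filtering and deduplication mentioned above can be pictured with a small sketch. This is not the team's pipeline: the function name `filter_and_deduplicate`, the minimum length of five words, and the use of exact matching on normalized text are all assumptions made for illustration.

```python
import hashlib
import re


def normalize(text):
    """Collapse whitespace and lowercase so trivial variants compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def filter_and_deduplicate(pages, min_words=5):
    """Drop very short pages and exact duplicates (after normalization).

    `min_words` is a hypothetical threshold chosen for this example.
    """
    seen = set()
    kept = []
    for page in pages:
        norm = normalize(page)
        if len(norm.split()) < min_words:
            continue  # filter: too little text to be useful
        digest = hashlib.sha256(norm.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # deduplicate: same content already kept
        seen.add(digest)
        kept.append(page)
    return kept


if __name__ == "__main__":
    crawled = [
        "Leaked  database for sale today only",
        "leaked database for sale today only",
        "too short",
        "Another page with enough words here",
    ]
    for page in filter_and_deduplicate(crawled):
        print(page)
```

Hashing the normalized text keeps memory use low on a large crawl, at the cost of catching only exact repeats; near-duplicate pages would need a fuzzier comparison.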

What’s next?

A lot is going on as DarkBERT is developed. The researchers plan to incorporate multiple languages into the pre-trained model. DarkBERT’s performance is also expected to improve as the team adopts more recent pre-trained language models and crawls additional data.

Last Updated : 20 May, 2023
