jsoup: Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety | Website analytics by TrustRadar

Open source Java HTML parser, with the best of HTML5 DOM methods and CSS selectors, for easy data extraction.

jsoup is a Java library designed for working with real-world HTML. It provides a convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods. jsoup implements the WHATWG HTML5 specification and parses HTML to the same DOM as modern browsers do.

Unique Visits

0 (0 / day)

Total Views

0 (0 / day)

Visit Duration, avg.

0 min (0 pages per visit)

Bounce Rate

0%

Founded in

2009

Supported Languages

English, etc

Website Key Features

HTML Parsing

Parse HTML from a URL, file, or string; find and extract data, using DOM traversal or CSS selectors.
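
As a minimal sketch of the string case, the snippet below parses an HTML string into a Document and reads its title. The class name ParseExample and the helper titleOf are hypothetical names chosen for illustration; Jsoup.parse and Document.title are jsoup calls.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class ParseExample {
    // Parses an HTML string and returns the text of its <title> element.
    // titleOf is an illustrative helper, not part of jsoup.
    static String titleOf(String html) {
        Document doc = Jsoup.parse(html);
        return doc.title();
    }

    public static void main(String[] args) {
        String html = "<html><head><title>First parse</title></head>"
                + "<body><p>Parsed HTML into a doc.</p></body></html>";
        System.out.println(titleOf(html));
    }
}
```

The other two sources follow the same pattern: Jsoup.connect(url).get() fetches and parses a URL, and Jsoup.parse(file, charsetName) reads a file. Both return a Document.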

Data Extraction

Extract and manipulate data, using DOM traversal or CSS selectors.
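
A common extraction task is listing the links on a page. The sketch below assumes the HTML is already in memory and that the caller supplies a base URI for resolving relative links. LinkExtractor and links are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class LinkExtractor {
    // Returns "text -> absolute URL" for every <a> element that has an href.
    // The base URI lets jsoup resolve relative links such as "/cookbook/".
    static List<String> links(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> out = new ArrayList<>();
        for (Element a : doc.select("a[href]")) {
            out.add(a.text() + " -> " + a.absUrl("href"));
        }
        return out;
    }
}
```

Anchors without an href are skipped by the a[href] selector, so the result contains only usable links.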

HTML Manipulation

Manipulate HTML elements, attributes, and text; clean user-submitted content against a safelist to prevent XSS attacks.
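
The cleaning step might look like the sketch below, which assumes a jsoup release that provides the Safelist class (older releases name it Whitelist). Sanitizer and sanitize are hypothetical names, and Safelist.basic() is only one of the predefined safelists.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

class Sanitizer {
    // Keeps only the tags and attributes permitted by the basic safelist.
    // Script elements and event-handler attributes such as onclick are dropped.
    static String sanitize(String untrustedHtml) {
        return Jsoup.clean(untrustedHtml, Safelist.basic());
    }

    public static void main(String[] args) {
        String unsafe = "<p><a href='http://example.com/' onclick='stealCookies()'>Link</a></p>";
        System.out.println(sanitize(unsafe));
    }
}
```

Which safelist is appropriate depends on how much markup users are meant to submit; a stricter or custom safelist may suit some applications better.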

CSS Selectors

Use CSS selectors to find elements, and then manipulate their attributes, text, and HTML.
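
To illustrate selecting and then editing, this sketch marks every link whose href starts with "http" by setting a rel attribute and adding a class. The class name SelectorEdit, the method markExternalLinks, and the "external" CSS class are assumptions made for the example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

class SelectorEdit {
    // Adds rel="nofollow" and class="external" to links whose href starts with "http",
    // then returns the modified body HTML.
    static String markExternalLinks(String html) {
        Document doc = Jsoup.parse(html);
        Elements external = doc.select("a[href^=http]");
        external.attr("rel", "nofollow");   // applies to every matched element
        external.addClass("external");
        return doc.body().html();
    }
}
```

Operations on an Elements collection apply to each matched element, so no explicit loop is needed here.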

DOM Methods

Provides a very convenient API for extracting and manipulating data, using the best of DOM methods.
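
The DOM-style methods can be used without selectors. The sketch below looks up an element by id and reads its first child element; DomWalk and firstItemText are hypothetical names.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class DomWalk {
    // Finds the element with the given id and returns the text of its first
    // child element, or null if the element is missing or has no children.
    static String firstItemText(String html, String listId) {
        Document doc = Jsoup.parse(html);
        Element list = doc.getElementById(listId);
        if (list == null || list.children().isEmpty()) {
            return null;
        }
        return list.child(0).text();
    }
}
```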

HTML5 Support

Implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do.
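
One visible effect of spec-compliant parsing is that incomplete markup is repaired the way a browser would repair it. In the sketch below, two unclosed paragraphs with no html, head, or body tags still end up as two p elements directly under body. TagSoup and bodyParagraphs are hypothetical names.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class TagSoup {
    // Counts <p> elements that are direct children of <body> after parsing.
    // The parser supplies the missing html/head/body structure and closes open <p> tags.
    static int bodyParagraphs(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("html > body > p").size();
    }

    public static void main(String[] args) {
        System.out.println(bodyParagraphs("<p>One<p>Two"));
    }
}
```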

Cross-Platform

Runs on any platform with a Java runtime. Early releases supported Java 1.5 and later; recent releases require Java 8 or later.

Open Source

jsoup is an open-source project distributed under the MIT License.

Additional information

License

MIT License

Repository

https://github.com/jhy/jsoup

Documentation

Comprehensive documentation available at https://jsoup.org/cookbook/

Community

Active community support through forums and GitHub issues.

Contributing

Contributions are welcome. Please read the contributing guide on GitHub.

Version

The latest stable version is 1.14.3 as of the last update.

Dependencies

jsoup has minimal dependencies, making it lightweight and easy to integrate into projects.
