scikit-learn: machine learning in Python — scikit-learn 0.16.1 documentation | Website analytics by TrustRadar
scikit-learn.org Machine Learning Data Science Artificial Intelligence Python Libraries

Scikit-learn is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.

Unique Visits

900K

30000 / day

Total Views

1.5M

50000 / day

Visit Duration, avg.

5.5 min

3.5 pages per visit

Bounce Rate

40%

Founded in

2007

Supported Languages

English, etc

Website Key Features

Simple and efficient tools for data mining and data analysis

Accessible to everybody, and reusable in various contexts.

Built on NumPy, SciPy, and matplotlib

Integrates well with the scientific Python ecosystem.

Open source, commercially usable - BSD license

Allows for unrestricted use in both academic and commercial settings.

Wide range of supervised and unsupervised learning algorithms

From classical linear models to more advanced techniques like neural networks.
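
As a rough illustration of the supervised workflow, the sketch below fits a linear support vector machine on the bundled iris dataset and predicts a few labels. The choice of dataset, estimator, and variable names is an assumption made for this example, not something the page prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

# Load a small bundled dataset (150 samples, 4 features, 3 classes).
iris = load_iris()
X, y = iris.data, iris.target

# Every scikit-learn estimator follows the same fit / predict pattern.
clf = SVC(kernel="linear")
clf.fit(X, y)

predictions = clf.predict(X[:5])   # class labels for the first five samples
train_accuracy = clf.score(X, y)   # mean accuracy on the training data
print(predictions, train_accuracy)
```

Accuracy measured on the training data is optimistic; it is shown here only to keep the example short.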

Cross-validation and model selection tools

Helps in evaluating the performance of models and selecting the best one.
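
A minimal sketch of both ideas, assuming a recent scikit-learn release in which these helpers live in `sklearn.model_selection` (the 0.16.1 release named in the title exposed them through `sklearn.cross_validation` and `sklearn.grid_search` instead). The dataset and the candidate values for `C` are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

iris = load_iris()
X, y = iris.data, iris.target

# Evaluate one model with 5-fold cross-validation: one score per fold.
scores = cross_val_score(SVC(kernel="linear"), X, y, cv=5)
print("fold accuracies:", scores)

# Model selection: try several values of C and keep the best one,
# where "best" means highest mean cross-validated accuracy.
search = GridSearchCV(SVC(kernel="linear"),
                      param_grid={"C": [0.1, 1.0, 10.0]},
                      cv=5)
search.fit(X, y)
best_c = search.best_params_["C"]
print("selected C:", best_c)
```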

Feature extraction and feature selection

Supports transforming raw data into features that can be used by machine learning algorithms.
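
To make this concrete, here is a small sketch that turns raw text into word-count features and then keeps the two features most associated with the labels. The four toy documents and their labels are made up for the example.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical toy corpus: 1 = positive, 0 = negative.
docs = ["good movie", "bad movie", "good plot", "bad acting"]
labels = [1, 0, 1, 0]

# Feature extraction: each distinct word becomes one count column.
vectorizer = CountVectorizer()
X_counts = vectorizer.fit_transform(docs)
print(sorted(vectorizer.vocabulary_))

# Feature selection: keep the 2 columns with the highest chi-squared score.
selector = SelectKBest(chi2, k=2)
X_selected = selector.fit_transform(X_counts, labels)
print(X_counts.shape, "->", X_selected.shape)
```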

Dimensionality reduction

Techniques like PCA and t-SNE for reducing the number of random variables under consideration.
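
For instance, PCA can project the four iris measurements onto two components. This is a sketch on a bundled dataset chosen for convenience; the number of components is an arbitrary choice.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()

# Project the 4-dimensional samples onto the 2 directions of largest variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(iris.data)

# Fraction of the total variance retained by the two components.
explained = pca.explained_variance_ratio_.sum()
print(X_2d.shape, explained)
```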

Clustering

Grouping unlabeled data into clusters, useful for exploratory data analysis.
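
A brief sketch with k-means on synthetic data: two well-separated point clouds are generated and then recovered without using any labels. The blob positions and sizes are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two synthetic 2-D blobs, centered at (0, 0) and (5, 5).
rng = np.random.RandomState(0)
blob_a = rng.normal(loc=0.0, scale=0.3, size=(20, 2))
blob_b = rng.normal(loc=5.0, scale=0.3, size=(20, 2))
X = np.vstack([blob_a, blob_b])

# Ask for two clusters; fit_predict returns one cluster index per sample.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
print(labels)
```

Cluster indices are arbitrary, so only the grouping is meaningful, not which blob receives index 0.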

Preprocessing and normalization

Tools for scaling, centering, normalization, binarization, and imputation of missing values.
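
The sketch below chains two of these steps: mean imputation of a missing value, then standardization. It assumes a recent scikit-learn release where the imputer is `sklearn.impute.SimpleImputer` (older releases such as 0.16.1 provided `sklearn.preprocessing.Imputer` instead). The tiny array is made up for the example.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Toy data with one missing entry in the second column.
X = np.array([[1.0, 10.0],
              [2.0, np.nan],
              [3.0, 30.0]])

# Replace the missing value with the mean of its column (here 20.0).
imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X)

# Center each column to zero mean and scale it to unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X_filled)
print(X_scaled)
```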

Model persistence

Save and load models using Python’s built-in persistence model, pickle.
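
A minimal round trip might look like the following, with a decision tree standing in for any fitted estimator. The model is serialized to bytes in memory here; writing those bytes to a file works the same way.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

# Serialize the fitted model to bytes, then restore it.
payload = pickle.dumps(clf)
restored = pickle.loads(payload)

# The restored model should make the same predictions as the original.
same = bool((restored.predict(iris.data) == clf.predict(iris.data)).all())
print(same)
```

Two caveats apply to pickle in general: only unpickle data from a trusted source, and do not expect a model pickled under one scikit-learn version to load reliably under another.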

Additional information

Community and Support

Scikit-learn has a large and active community. It offers extensive documentation, user guides, and examples. There are also mailing lists and a GitHub repository for support and contributions.

Performance

While scikit-learn is not the fastest machine learning library, it is optimized for ease of use, clarity, and consistency. For performance-critical applications, it can be combined with tools such as Cython (compiled extensions) or joblib (parallel execution and caching).

Educational Use

Scikit-learn is widely used in academia for teaching and research due to its simplicity and the breadth of algorithms it covers.

Integration with Other Libraries

Scikit-learn can be integrated with other Python libraries such as Pandas for data manipulation, Matplotlib for plotting, and IPython for interactive computing.
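
As a sketch of the Pandas case, a DataFrame can be passed straight to an estimator. The column names and values below are hypothetical; the target is constructed as `y = 2 * x1 + 1`, so a linear model can fit it exactly.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical table: two feature columns and one target column.
df = pd.DataFrame({
    "x1": [0.0, 1.0, 2.0, 3.0],
    "x2": [1.0, 0.0, 1.0, 0.0],
    "y":  [1.0, 3.0, 5.0, 7.0],
})

# Select feature columns with Pandas, then fit as with any array input.
model = LinearRegression()
model.fit(df[["x1", "x2"]], df["y"])

# Predict for a new row that uses the same column names.
new_rows = pd.DataFrame({"x1": [4.0], "x2": [1.0]})
predicted = model.predict(new_rows)
print(predicted)
```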

Development and Maintenance

The library is actively developed and maintained by a team of volunteers. It originated as a SciKit (SciPy Toolkit), one of a family of separately developed add-on packages built around SciPy.

HTTP headers

A security headers report is an important part of user data protection. Learn more about the HTTP headers for scikit-learn.org.