
Releases: qubvel-org/segmentation_models.pytorch

Segmentation Models - v0.5.0

17 Apr 10:13
@qubvel · 420ce84

New Models

DPT

The DPT model adapts the Vision Transformer (ViT) architecture for dense prediction tasks like semantic segmentation. It uses a ViT as a powerful backbone, processing image information with a global receptive field at each stage. The key innovation lies in its decoder, which reassembles token representations from various transformer stages into image-like feature maps at different resolutions. These are progressively combined with convolutional fusion blocks to produce full-resolution, high-detail predictions.

The DPT model in smp can be used with a wide variety of transformer-based encoders:

import segmentation_models_pytorch as smp
# initialize with your own pretrained encoder
model = smp.DPT("tu-mobilevitv2_175.cvnets_in1k", classes=2)
# load fully-pretrained on ADE20K 
model = smp.from_pretrained("smp-hub/dpt-large-ade20k")
# load the same checkpoint for finetuning
model = smp.from_pretrained("smp-hub/dpt-large-ade20k", classes=1, strict=False)

The full table of DPT's supported timm encoders can be found here.

Models export

A lot of work was done to add support for torch.jit.script, torch.compile (without graph breaks: fullgraph=True), and torch.export across all encoders and models.

This provides several advantages:

  • torch.jit.script: Enables serialization of models into a static graph format, enabling deployment in environments without a Python interpreter and allowing for graph-based optimizations.
  • torch.compile (with fullgraph=True): Leverages Just-In-Time (JIT) compilation (e.g., via Triton or Inductor backends) to generate optimized kernels, reducing Python overhead and enabling significant performance improvements through techniques like operator fusion, especially on GPU hardware. fullgraph=True minimizes graph breaks, maximizing the scope of these optimizations.
  • torch.export: Produces a standardized Ahead-Of-Time (AOT) graph representation, simplifying the process of exporting models to various inference backends and edge devices (e.g., through ExecuTorch) while preserving model dynamism where possible.
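
The snippet below is a minimal sketch of the three export paths; the Unet/resnet34 combination and the input shape are arbitrary choices for illustration.

import torch
import segmentation_models_pytorch as smp

model = smp.Unet("resnet34").eval()
sample = torch.rand(1, 3, 256, 256)

# serialize to a static graph for deployment without a Python interpreter
scripted = torch.jit.script(model)

# JIT-compile without graph breaks
compiled = torch.compile(model, fullgraph=True)
output = compiled(sample)

# produce an AOT graph for inference backends and edge devices
exported = torch.export.export(model, (sample,))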

PRs:

Core

All encoders from third-party libraries such as efficientnet-pytorch and pretrainedmodels.pytorch are now vendored by SMP. This means we have copied and refactored the underlying code and moved all checkpoints to the smp-hub. As a result, you will have fewer additional dependencies when installing smp and much faster weight downloads.

🚨🚨🚨 Breaking changes

  1. The UperNet model was significantly changed to reflect the original implementation and to bring pretrained checkpoints into SMP. Unfortunately, UperNet model weights trained with v0.4.0 will not be compatible with SMP v0.5.0.

    • Fix UperNet model and add pretrained checkpoints by @qubvel in #1124
  2. While the high-level API for modeling should be backward compatible with v0.4.0, internal modules (such as encoders, decoders, blocks) might have changed initialization and forward interfaces.

  3. timm- prefixed encoders are deprecated; tu- variants are now the recommended way to use encoders from the timm library. Most timm- encoders are internally switched to their tu- equivalents with state_dict re-mapping (backward-compatible), but this support will be dropped in upcoming versions; see the migration sketch below.
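
A migration sketch for item 3 (the efficientnet encoder pair is an illustrative assumption; check the encoders table for the exact tu- equivalents):

import segmentation_models_pytorch as smp

# deprecated timm- prefixed name (still works via state_dict re-mapping)
model = smp.Unet("timm-efficientnet-b0")

# recommended tu- variant of the same backbone
model = smp.Unet("tu-efficientnet_b0")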

Other changes

New Contributors

Full Changelog: v0.4.0...v0.5.0


Segmentation Models - v0.4.0

08 Jan 15:28
@qubvel · 12f8394

New models

Segformer

contributed by @brianhou0208

SegFormer is a transformer-based semantic segmentation model known for its simplicity and efficiency. It uses a lightweight hierarchical encoder to capture multi-scale features and a minimal decoder for fast inference.

With segmentation-models-pytorch you can use the model with its native Mix Vision Transformer encoder, as well as with the 800+ other encoders supported by the library. The original weights are also supported and can be loaded as follows:

import segmentation_models_pytorch as smp
model = smp.from_pretrained("smp-hub/segformer-b5-640x640-ade-160k")

or with any other encoder:

import segmentation_models_pytorch as smp
model = smp.Segformer("resnet34")

See more checkpoints on the HF Hub.

UperNet

contributed by @brianhou0208

UPerNet (Unified Perceptual Parsing Network) is a versatile semantic segmentation model designed to handle diverse scene parsing tasks. It combines a Feature Pyramid Network (FPN) with a Pyramid Pooling Module (PPM) to effectively capture multi-scale context.

import segmentation_models_pytorch as smp
model = smp.UPerNet("resnet34")

New Encoders

Thanks to @brianhou0208's contribution, 800+ timm encoders are now supported in segmentation_models.pytorch. New modern encoders like convnext, efficientvit, efficientformerv2, hiera, mambaout, and more can be used as easily as:

import segmentation_models_pytorch as smp
model = smp.create_model("upernet", encoder_name="tu-mambaout_small")
# or
model = smp.UPerNet("tu-mambaout_small")

New examples

Other changes

  • Project migrated to pyproject.toml by @adamjstewart
  • Better dependency management and testing (minimal and latest dependencies; Linux/Windows/macOS platforms) by @adamjstewart
  • Better type annotations
  • Tests are refactored for faster CI and local testing by @qubvel

All changes

New Contributors

Full Changelog: v0.3.4...v0.4.0


Segmentation Models - v0.3.4

23 Aug 13:15
@qubvel

Updates

  • 🤗 Hugging Face integration: you can save, load, and share models with the HF Hub, see the example notebook and the short sketch below.
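
A minimal sketch of the save/load round trip (the local directory name is an assumption; pushing to the Hub additionally requires a repo id and authentication):

import segmentation_models_pytorch as smp

model = smp.Unet("resnet34")

# save in the Hugging Face Hub format
model.save_pretrained("./my-unet-resnet34")

# load it back (a hub repo id works here as well)
model = smp.from_pretrained("./my-unet-resnet34")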

Full log

New Contributors

Full Changelog: v0.3.3...v0.3.4


Segmentation Models - v0.3.3

28 May 15:49
@qubvel · e5d3db2

Updates

  • PyTorch Image Models (timm) version upgraded to 0.9.2

Segmentation Models - v0.3.2

07 Jan 10:37
@qubvel · c39de0c

Updates

  • Added Apple's MobileOne encoder from the official repo (use encoder_name="mobileone_s{0..4}"); see the sketch below this list.
  • PyTorch Image Models (timm) version upgraded to 0.6.12 (500+ encoders available)
  • Minor typo fixes and docs updates
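
A usage sketch (the Unet architecture is an arbitrary choice):

import segmentation_models_pytorch as smp

# any of the mobileone_s0 .. mobileone_s4 variants
model = smp.Unet(encoder_name="mobileone_s0")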

Breaking changes

  • Minimum Python version 3.6 -> 3.7

Thanks @VadimLevin, @kevinpl07, @Abd-elr4hman


Segmentation Models - v0.3.1

30 Nov 12:31
@qubvel

Updates

  • Added the Mix Vision Transformer encoder from SegFormer [official code] [paper]. Use encoder_name="mit_b0" (or mit_b1..b5) to create a model; see the sketch below.
  • Minor typo fixes and docs updates
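
A usage sketch (the FPN architecture is an arbitrary choice; not every decoder supports this encoder):

import segmentation_models_pytorch as smp

# SegFormer's hierarchical transformer encoder, variants mit_b0 .. mit_b5
model = smp.FPN(encoder_name="mit_b0")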

Segmentation Models - v0.3.0

29 Jul 10:32
@qubvel

Updates

  • Added the smp.metrics module with various metrics based on the confusion matrix, see docs and the sketch after this list
  • Added a new notebook with a training example using pytorch-lightning
  • Improved error handling for incorrect input image sizes (checks that the image size is divisible by 2^n)
  • Codebase refactoring and style checks (black, flake8)
  • Minor typo fixes and bug fixes
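
A minimal sketch of the smp.metrics workflow mentioned above (the tensor shapes and threshold are arbitrary):

import torch
import segmentation_models_pytorch as smp

# predicted probabilities and integer ground-truth masks
output = torch.rand(4, 1, 64, 64)
target = (torch.rand(4, 1, 64, 64) > 0.5).long()

# first compute confusion matrix statistics, then derive metrics from them
tp, fp, fn, tn = smp.metrics.get_stats(output, target, mode="binary", threshold=0.5)
iou = smp.metrics.iou_score(tp, fp, fn, tn, reduction="micro")
f1 = smp.metrics.f1_score(tp, fp, fn, tn, reduction="micro")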

Breaking changes

  • The utils module is going to be deprecated; if you still need it, import it manually: from segmentation_models_pytorch import utils

Thanks a lot to all contributors!


Segmentation Models - v0.2.1

18 Nov 10:48
@qubvel · a288d33

Updates

  • Universal timm encoder: 400+ pretrained encoders from timm are available with the tu- prefix. The list of available encoders is here; see the sketch below.
  • Minor fixes and improvements.
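
A usage sketch (Unet and the resnet34 backbone are arbitrary example choices):

import segmentation_models_pytorch as smp

# any timm model can be plugged in as an encoder via the tu- prefix
model = smp.Unet(encoder_name="tu-resnet34")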

Segmentation Models - v0.2.0

05 Jul 09:05
@qubvel · 914f2bf

Updates

  • New architecture: MANet (#310); see the sketch after this list
  • New encoders from timm: mobilenetv3 (#355) and gernet (#344)
  • New loss functions in the smp.losses module (smp.utils.losses will be deprecated in future versions)
  • New pretrained weight initialization for the first convolution if in_channels > 3
  • Updated timm version (0.4.12)
  • Bug fixes and docs improvement
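
A combined sketch of the new pieces (the encoder name and channel count are illustrative assumptions):

import segmentation_models_pytorch as smp

# MANet architecture (class MAnet) with a timm mobilenetv3 encoder;
# in_channels > 3 triggers the new pretrained first-convolution initialization
model = smp.MAnet("timm-mobilenetv3_large_100", in_channels=6)

# new losses module
loss = smp.losses.DiceLoss(mode="binary")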

Thanks to @azkalot1 @JulienMaille @originlake @Kupchanski @loopdigga96 @zurk @nmerty @ludics @Vozf @markson14 and others!


Segmentation Models - v0.1.3

13 Dec 10:22
@qubvel

Updates

  • New architecture Unet++ (#279); see the sketch below this list
  • New encoders RegNet, ResNest, SK-Net, Res2Net (#286)
  • Updated timm version (0.3.2)
  • Improved docstrings and type hints for models
  • Project documentation on https://smp.readthedocs.io
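
A usage sketch combining the new architecture with one of the new encoders (the exact encoder name is an assumption; see the encoders table):

import segmentation_models_pytorch as smp

# Unet++ with a Res2Net backbone, both new in this release
model = smp.UnetPlusPlus(encoder_name="timm-res2net50_26w_4s")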

Thanks to @azkalot1 for the new encoders and architecture!
