The SBMnet dataset provides a realistic and diverse set of videos. They have been selected to cover a wide range of detection challenges and are representative of typical indoor and outdoor visual data captured today in surveillance, smart environment, and video database scenarios. These videos come from our personal collection as well as from public datasets, namely CDnet, BMC2012, VSSN, SABS, LASIESTA, LIMU, CMU, ICRA, IPPR, CIRL, ATON, UCF, MIT, Fish4Knowledge and PETS.

    NOTE: If you use any data from the SBMnet dataset, please cite the following paper:
    Jodoin P.-M., Maddalena L., Petrosino A., Wang Y.
    Extensive Benchmark and Survey of Background Modeling Methods
    IEEE Transactions on Image Processing, 26(11), 2017, p. 5244-5256

  • SBMnet was developed as part of the ICPR 2016 Scene Background Modeling Contest (SBMC2016 tab). This dataset consists of 79 camera-captured videos spanning 8 categories selected to include diverse change and motion detection challenges:
    • Basic category represents a mixture of mild challenges typical of the Shadows, Dynamic Background, Camera Jitter and Intermittent Object Motion categories. Some videos have subtle background motion, others have isolated shadows, some have an abandoned object and others have pedestrians that stop for a short while and then move away. These videos are fairly easy, but not trivial, to process, and are provided mainly as reference.
    • Intermittent Motion category includes videos with scenarios known for causing “ghosting” artifacts in the detected motion, i.e., objects move, then stop for a short while, after which they start moving again. Some videos include still objects that suddenly start moving, e.g., a parked vehicle driving away, as well as abandoned objects. This category is intended for testing how various algorithms adapt to background changes.
    • Clutter category contains videos with a large number of foreground moving objects occluding a large portion of the background.
    • Jitter category contains indoor and outdoor videos captured by unstable (e.g., vibrating) cameras. The jitter magnitude varies from one video to another.
    • Illumination Changes category contains indoor videos with strong and mild illumination changes caused by a light switch, curtains opening, or automatic camera brightness adjustment.
    • Background Motion category includes scenes with strong (parasitic) background motion: boats on shimmering water, cars passing next to a fountain, or pedestrians, cars and trucks passing in front of a tree shaken by the wind.
    • Very Long: videos containing more than 3,500 frames.
    • Very Short: videos containing a limited number of frames (fewer than 20) with a very low frame rate.

The videos have been obtained with different cameras, ranging from low-resolution IP cameras to mid-resolution camcorders. As a consequence, spatial resolutions of the videos vary from 240x240 to 800x600. Also, due to the diverse lighting conditions and compression parameters used, the level of noise and compression artifacts varies from one video to another. The length of the videos varies from 6 to 9,370 frames, and the videos shot by low-end IP cameras suffer from noticeable radial distortion. Different cameras may have different hue bias (due to the different white balancing algorithms employed), and some cameras apply automatic exposure adjustment, resulting in global brightness fluctuations over time. The frame rate also varies from one video to another, often due to limited bandwidth.

We believe that the fact that our videos have been captured under a wide range of settings will help prevent this dataset from favoring a certain family of background estimation methods over others.

Ground truth

The ground truth consists of one or more background color images. These background images have been obtained by removing foreground objects with a semi-automatic method.
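To illustrate what a background estimation method produces on such videos, a common simple baseline is the per-pixel temporal median over the frame stack: any pixel occluded by foreground in fewer than half of the frames recovers its background value. This is only an illustrative sketch (using NumPy), not the semi-automatic method used to build the SBMnet ground truth.

```python
import numpy as np

def median_background(frames):
    """Estimate a background image as the per-pixel temporal median
    of a stack of frames (shape: [num_frames, H, W] or [num_frames, H, W, 3])."""
    stack = np.asarray(frames)
    return np.median(stack, axis=0).astype(stack.dtype)

# Toy example: a static background (value 100) briefly occluded by a
# foreground "object" (value 255) at one pixel in a single frame.
frames = np.full((5, 4, 4), 100, dtype=np.uint8)
frames[2, 1, 1] = 255  # foreground occludes this pixel in one frame only
bg = median_background(frames)
print(bg[1, 1])  # the median rejects the transient foreground value: 100
```

The median baseline fails precisely on the challenges the dataset categories target, e.g. Clutter and Intermittent Motion, where the background is occluded in most frames, which is why more elaborate modeling methods are benchmarked.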