Notes on “Frontiers of Video Analysis”

Image Features

Color

  • RGB: normalizing the RGB space makes the color information independent of light intensity.
  • YIQ
  • YUV
  • HSI
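
The RGB normalization mentioned above can be illustrated with a minimal sketch. It assumes the common chromaticity form, where each channel is divided by R + G + B; the function name `normalize_rgb` and the handling of pure black are illustrative choices, not part of the original notes.

```python
def normalize_rgb(r, g, b):
    """Return chromaticity coordinates: each channel divided by R + G + B."""
    total = r + g + b
    if total == 0:
        # Pure black has no chromaticity; equal shares is an assumed convention.
        return (1 / 3, 1 / 3, 1 / 3)
    return (r / total, g / total, b / total)

# The same surface seen under full light and under half the light.
bright = normalize_rgb(200, 100, 50)
dim = normalize_rgb(100, 50, 25)
print(bright, dim)
```

Both calls give the same normalized triple, which is the sense in which the normalized color no longer depends on the light intensity.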

Edge

Compared with color features, edge features are insensitive to illumination changes and are therefore more robust.

Optical Flow

Optical flow is computed under an assumption that does not strictly hold.

Texture

Texture measures the intensity variation of an object's surface, which characterizes how smooth or rough the image is.

Object Tracking Approaches

Point Tracking

Kalman Filter

Kernel Filter

Object detection

Background Subtraction

Segmentation

From a Single Image (Machine Learning)

Density Estimation

Histograms

The bins of the histogram are defined as the intervals $[x_0 + mh, x_0 + (m+1)h)$ for positive and negative integers $m$, where $x_0$ is the origin and $h$ is the bin width.
$$
\hat f(x) = \frac{1}{nh} \times (\text{number of } X_i \text{ in the same bin as } x)
$$
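The $h$ in the denominator: a probability density must integrate to 1, so for a histogram the areas of all bins must sum to 1. When the cumulative probability is computed from a histogram, each bin contributes an area of height × width, which is why the count must also be divided by the bin width $h$.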
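
As a quick check of the normalization argument, here is a minimal sketch of the histogram estimator. The sample values, the origin `x0 = 0` and the bin width `h = 0.5` are made up for illustration.

```python
import math

def histogram_density(samples, x, x0=0.0, h=1.0):
    """Histogram estimate at x: (count in the bin of x) / (n * h)."""
    n = len(samples)
    # Bin m covers the half-open interval [x0 + m*h, x0 + (m+1)*h).
    bin_index = math.floor((x - x0) / h)
    count = sum(1 for s in samples if math.floor((s - x0) / h) == bin_index)
    return count / (n * h)

samples = [0.1, 0.4, 0.45, 0.9, 1.2, 1.3, 2.7, 2.9]
h = 0.5
# Evaluate at the centers of the six bins covering [0, 3) and add up height * width.
centers = [m * h + h / 2 for m in range(6)]
area = sum(histogram_density(samples, c, h=h) * h for c in centers)
print(area)
```

Without the division by `h`, the bin heights would sum to 1 but the total area would be `h`, so the estimate would not be a density.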

Drawbacks:

  • In procedures like cluster analysis and nonparametric discriminant analysis, using a histogram
    results in inefficient use of the data.
  • The histogram is not continuous so trouble arises when derivatives are required.
  • Choice of origin may have an effect in the interpretation.
  • Representing bivariate or trivariate data by histogram is difficult.

Naive Estimator

If the random variable X has density f, then:
$$
f(x) = \lim_{h\to0}\frac{1}{2h}P(x-h<X<x+h)
$$
Thus, the naive estimator is written as:
$$
\hat f(x) = \frac{1}{2hn} \, [\text{number of } X_1, \dots, X_n \text{ falling in } (x-h, x+h)]
$$
We define a weight function as follows:
$$
\omega(x) =
\begin{cases}
\frac{1}{2} & \text{if } -1 < x < 1 \\
0 & \text{otherwise}
\end{cases}
$$
Then the naive estimator $\hat f(x)$ can be written as
$$
\hat{f}(x) = \frac{1}{nh}\sum_{i=1}^{n}\omega\left(\frac{x-X_i}{h}\right)
$$
Drawbacks:

  • $\hat f$ is not continuous: it has jumps at the points $X_i \pm h$ and has zero derivative everywhere else.
  • As a result, the estimate has a ragged, “stepwise” appearance.
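
The stepwise behaviour can be seen in a small sketch of the naive estimator written with the box weight function. The three sample points and `h = 0.5` are made up for illustration.

```python
def weight(u):
    """Box weight function: 1/2 on the open interval (-1, 1), 0 elsewhere."""
    return 0.5 if -1 < u < 1 else 0.0

def naive_estimate(samples, x, h):
    """Naive estimator: (1 / (n*h)) * sum of weight((x - X_i) / h)."""
    n = len(samples)
    return sum(weight((x - xi) / h) for xi in samples) / (n * h)

samples = [1.0, 1.2, 3.0]
h = 0.5
# The estimate is flat between jump points and drops at X_1 + h = 1.5.
flat_a = naive_estimate(samples, 1.40, h)
flat_b = naive_estimate(samples, 1.45, h)
after_jump = naive_estimate(samples, 1.55, h)
print(flat_a, flat_b, after_jump)
```

Moving `x` from 1.40 to 1.45 does not change the estimate at all, while crossing 1.5 makes it drop abruptly because the point at 1.0 leaves the window.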

Kernel Estimator

$h$ is the smoothing factor (bandwidth). It plays a role similar to $\sigma$ in the Gaussian distribution.
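
A minimal sketch of a kernel estimator follows. It assumes the standard form in which the box weight function of the naive estimator is replaced by a smooth kernel $K$, here a Gaussian: $\hat f(x) = \frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right)$. The sample values and bandwidths are illustrative.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used as the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def kernel_estimate(samples, x, h):
    """Kernel estimator: (1 / (n*h)) * sum of K((x - X_i) / h)."""
    n = len(samples)
    return sum(gaussian_kernel((x - xi) / h) for xi in samples) / (n * h)

samples = [1.0, 1.2, 3.0]
# Riemann sum over [-5, 10] to check that the estimate integrates to about 1.
grid = [i * 0.01 for i in range(-500, 1001)]
area = sum(kernel_estimate(samples, x, 0.5) for x in grid) * 0.01
# A small h gives a sharp peak near the data; a large h smooths it out.
sharp = kernel_estimate(samples, 1.1, 0.2)
smooth = kernel_estimate(samples, 1.1, 1.0)
print(area, sharp, smooth)
```

With a Gaussian kernel, $h$ is literally the standard deviation of the bump placed on each sample, which is the analogy with $\sigma$.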

References

  • “Density estimation for statistics and data analysis”, B.W. Silverman, 1998, London: Chapman & Hall/CRC.

Segmentation

Pixel-based Figure-Ground Segmentation

Challenges in Scene Modeling

Illumination changes:

  • Gradual change in illumination as might occur in outdoor scenes due to the change in the relative location of the sun during the day.
  • Sudden change in illumination as might occur in an indoor environment by switching the lights on or off, or in an outdoor environment, e.g. a change between cloudy and sunny conditions.
  • Shadows cast on the background by objects in the background itself (e.g., buildings and trees) or by moving foreground objects.

Motion changes:

  • Global image motion due to small camera displacements. Small camera displacements are common in outdoor situations due to wind load or other sources of motion, which causes global motion in the images.
  • Motion in parts of the background. For example, tree branches moving with the wind, or rippling water.

Structural changes:

  • These are changes introduced to the background, including any change in the geometry or the appearance of the background of the scene introduced by targets. Such changes typically occur when something relatively permanent is introduced into or removed from the scene background (for example, a background object being removed, or a car parking).

Parametric Background Models

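In the experiment, a Gaussian mixture model (GMM) is used to train the background model. The GMM method assumes that the intensity of every pixel in the image follows a mixture of Gaussians (MoG) and models it accordingly, then updates the background model incrementally with an on-line method.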

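For a video sequence, let $X_t$ denote the intensity of a pixel $(x_0, y_0)$ at time $t$. The history of pixel $(x_0, y_0)$ up to time $t$ is then $\{X_1, \dots, X_t\} = \{V(x_0, y_0, i) : i \le t\}$, where $V(x_0, y_0, i)$ is the value of that pixel in frame $i$. This history can be modeled by a mixture of $K$ Gaussian distributions:
$$
P(X_t) = \sum_{i=1}^{K} \omega_{i,t} \, N(X_t \mid \mu_{i,t}, \Sigma_{i,t})
$$
where $N(X_t \mid \mu_{i,t}, \Sigma_{i,t}) = \frac{1}{(2\pi)^{D/2}} \frac{1}{|\Sigma_{i,t}|^{1/2}} \exp\left(-\frac{1}{2}(X_t - \mu_{i,t})^T \Sigma_{i,t}^{-1} (X_t - \mu_{i,t})\right)$ and $D$ is the dimension of $X_t$.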

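At $t = 0$, every pixel in the image is modeled with the MoG formula above, and the $K$ weights $\omega_{i,t}$ and the $K$ Gaussians $N(X_t \mid \mu_{i,t}, \Sigma_{i,t})$ are all given the same initialization. At $t = 1, 2, \dots, T$, each newly observed pixel value $X_t$ is used to update the weights and the parameters of the Gaussians.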

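The weights are updated as $\omega_{i,t} = (1-\alpha)\,\omega_{i,t-1} + \alpha M_{i,t}$, where $\alpha$ is the learning rate, $M_{i,t} = 1$ if the new pixel value $X_t$ matches the $i$-th Gaussian, and $M_{i,t} = 0$ otherwise.

The parameters of the $i$-th Gaussian are updated if and only if $M_{i,t} = 1$, with $\rho$ the learning rate of the matched Gaussian:
$$
\mu_t = (1-\rho)\,\mu_{t-1} + \rho X_t
$$

$$
\sigma_t^2 = (1-\rho)\,\sigma_{t-1}^2 + \rho\,(X_t - \mu_t)^T (X_t - \mu_t)
$$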

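Once the parameters of the Gaussians have been updated on the training set, the $K$ Gaussians of each pixel are sorted in descending order of $\omega_j/\sigma_j^2$, and the first $B$ Gaussians are selected as the background model of that pixel, where $B = \arg\min_b \left(\sum_{j=1}^{b} \omega_j > T\right)$ and $T$ is a global threshold. A new pixel value $X_t$ is detected as foreground if it lies more than 2.5 standard deviations away from every one of the $B$ background Gaussians.

This completes the construction of the GMM-based background model.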
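
The update rules can be sketched for a single grayscale pixel. The class name `PixelGMM`, the parameter values, and the toy training sequence are assumptions made for illustration. Two steps are not spelled out in the notes and follow Stauffer and Grimson's paper: the per-component rate $\rho = \alpha N(X_t \mid \mu, \sigma^2)$, and replacing the weakest Gaussian when no component matches. The ranking uses $\omega/\sigma^2$ as written in the notes, whereas the paper ranks by $\omega/\sigma$.

```python
import math

class PixelGMM:
    """Mixture of K 1-D Gaussians modeling the intensity of one pixel."""

    def __init__(self, first_value, k=3, alpha=0.05, init_var=225.0, threshold=0.7):
        self.alpha = alpha            # weight learning rate
        self.init_var = init_var      # large initial variance (assumed value)
        self.threshold = threshold    # global threshold T
        # Identical initialization for all K components, as in the notes.
        self.w = [1.0 / k] * k
        self.mu = [float(first_value)] * k
        self.var = [init_var] * k

    def _ranked(self):
        # Component indices in descending order of w / sigma^2.
        return sorted(range(len(self.w)),
                      key=lambda i: self.w[i] / self.var[i], reverse=True)

    def _matches(self, i, x):
        # Match test: within 2.5 standard deviations of component i.
        return abs(x - self.mu[i]) <= 2.5 * math.sqrt(self.var[i])

    def background_components(self):
        # First B ranked components whose cumulative weight exceeds T.
        chosen, total = [], 0.0
        for i in self._ranked():
            chosen.append(i)
            total += self.w[i]
            if total > self.threshold:
                break
        return chosen

    def is_foreground(self, x):
        return not any(self._matches(i, x) for i in self.background_components())

    def update(self, x):
        matched = next((i for i in self._ranked() if self._matches(i, x)), None)
        for i in range(len(self.w)):
            m = 1.0 if i == matched else 0.0
            self.w[i] = (1 - self.alpha) * self.w[i] + self.alpha * m
        if matched is None:
            # No match: replace the weakest component (from the paper, not the notes).
            j = min(range(len(self.w)), key=lambda i: self.w[i])
            self.mu[j], self.var[j], self.w[j] = float(x), self.init_var, self.alpha
        else:
            i = matched
            pdf = (math.exp(-0.5 * (x - self.mu[i]) ** 2 / self.var[i])
                   / math.sqrt(2 * math.pi * self.var[i]))
            rho = self.alpha * pdf
            self.mu[i] = (1 - rho) * self.mu[i] + rho * x
            self.var[i] = (1 - rho) * self.var[i] + rho * (x - self.mu[i]) ** 2
        total = sum(self.w)
        self.w = [w / total for w in self.w]   # keep the weights summing to 1

model = PixelGMM(first_value=100)
for value in [98, 100, 102, 101, 99] * 40:     # a stable background around 100
    model.update(value)
print(model.is_foreground(100), model.is_foreground(200))
```

A full background subtractor would keep one such model per pixel (and per color channel, or a full covariance), which is why practical implementations vectorize these updates.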

Nonparametric Background Models

Moving Shadow Suppression

References

  • Toyama, K., Krumm, J., Brumitt, B., Meyers, B.: Wallflower: Principles and practice of background maintenance. In: IEEE International Conference on Computer Vision (1999)
  • Wren, C.R., Azarbayejani, A., Darrell, T., Pentland, A.: Pfinder: Real-time tracking of the human body. IEEE Trans. Pattern Anal. Mach. Intell. (1997)
  • Stauffer, C., Grimson, W.E.L.: Adaptive background mixture models for real-time tracking. In: IEEE Conference on Computer Vision and Pattern Recognition (1999)
  • Elgammal, A., Duraiswami, R., Harwood, D., Davis, L.S.: Background and foreground modeling using non-parametric kernel density estimation for visual surveillance. Proc. IEEE 90(7), 1151–1163 (2002)

Mean Shift

Feature Space Analysis

Mean Shift Approach

Proof for Mean Shift Convergence

Pros and Cons

The Bandwidth Selection

Applications

Clustering

Smoothing

Segmentation

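The role of the kernel function:

In the basic mean shift formula, every sample point that falls inside $S_h$ contributes equally to the final $M_h(x)$, no matter how far it is from the center $x$. In real tracking, however, when the target is affected by occlusion, the outer pixels are more easily influenced by the occluder or the background, so pixels near the center of the target model are more reliable than those farther out. The sample points should therefore not all be equally important: the farther a point is from the center, the smaller its weight should be. A kernel function and weight coefficients are introduced for this reason, to make the tracking algorithm more robust and to strengthen its search ability.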
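
The difference between the two weightings can be sketched as follows. It assumes the basic form $M_h(x) = \frac{1}{k}\sum_{x_i \in S_h}(x_i - x)$ for the flat case and a Gaussian profile truncated to $S_h$ for the kernel-weighted case. The function name and the three sample points are illustrative.

```python
import math

def mean_shift_vector(points, x, h, kernel="flat"):
    """Mean shift vector at x from the samples inside the ball S_h of radius h."""
    num = [0.0] * len(x)
    den = 0.0
    for p in points:
        d2 = sum((pi - xi) ** 2 for pi, xi in zip(p, x))
        if d2 > h * h:
            continue                      # sample lies outside S_h
        # Flat kernel: every sample counts the same.
        # Gaussian profile: weight decays with distance from the center.
        w = 1.0 if kernel == "flat" else math.exp(-0.5 * d2 / (h * h))
        for j in range(len(x)):
            num[j] += w * (p[j] - x[j])
        den += w
    if den == 0.0:
        return [0.0] * len(x)
    return [v / den for v in num]

# Two samples near the center and one near the border of S_h.
points = [(0.1, 0.0), (-0.1, 0.0), (0.9, 0.0)]
flat = mean_shift_vector(points, (0.0, 0.0), h=1.0, kernel="flat")
weighted = mean_shift_vector(points, (0.0, 0.0), h=1.0, kernel="gaussian")
print(flat, weighted)
```

The border sample pulls the flat estimate by 0.3 along the first axis, while the kernel-weighted estimate moves less because that sample is down-weighted.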

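Mean shift motion tracking:

Motion tracking boils down to telling the program at the start which target to track, after which the program looks for that target in the following video frames. Specifying the target is simple: give the program a region of interest (ROI) in the image. The program must then find this ROI in the next frame, where it has moved away from its previous position.

In computer vision this is commonly solved as follows. First, the target is described: the target region is converted to the HSV color space and the histogram of the H channel is computed. With this description, the task is to find a region in the next frame that matches it. Because a perfectly identical region is hard to find, a similarity function is used to measure how close a candidate region is to the target region; the larger its value, the more similar the two regions are. The goal is therefore to find the region with the maximum similarity.

This is where mean shift comes in. By iterating, it reaches the region with the maximum similarity (for the details of the computation, see the references at the end of this section). Mean shift keeps moving the search window in the direction of the greatest color change between the two models, until the distance moved between the last two iterations falls below a threshold. The resulting position is the target position in the current frame, and it also serves as the center of the initial search window for the next frame. Repeating this, every pair of consecutive frames produces a mean shift vector, and connecting these vectors over the whole sequence gives the motion path of the target.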
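
The tracking loop described above can be sketched on a toy example. This is a simplified back-projection variant: each pixel is weighted by the target histogram value of its hue bin, and the window is moved to the weighted centroid until it stops moving. The frames are small grids of already-quantized hue bins, and all function names and values are made up for illustration.

```python
import math

def make_frame(size, top_left, block=4, hue_bin=3):
    """Toy frame of quantized hue bins: background bin 0, square target in hue_bin."""
    frame = [[0] * size for _ in range(size)]
    bx, by = top_left
    for y in range(by, by + block):
        for x in range(bx, bx + block):
            frame[y][x] = hue_bin
    return frame

def hue_histogram(frame, window, bins=8):
    """Normalized hue histogram of the ROI given as (x0, y0, width, height)."""
    x0, y0, w, h = window
    hist = [0.0] * bins
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            hist[frame[y][x]] += 1.0
    total = sum(hist)
    return [v / total for v in hist]

def mean_shift_track(frame, window, target_hist, max_iter=20, min_shift=1):
    """Move the window to the weighted centroid until it moves less than min_shift."""
    x0, y0, w, h = window
    rows, cols = len(frame), len(frame[0])
    for _ in range(max_iter):
        m00 = m10 = m01 = 0.0
        for y in range(max(0, y0), min(rows, y0 + h)):
            for x in range(max(0, x0), min(cols, x0 + w)):
                p = target_hist[frame[y][x]]    # similarity weight of this pixel
                m00 += p
                m10 += p * x
                m01 += p * y
        if m00 == 0.0:
            break                               # nothing target-like in the window
        cx, cy = m10 / m00, m01 / m00           # weighted centroid
        new_x0 = math.floor(cx - (w - 1) / 2 + 0.5)
        new_y0 = math.floor(cy - (h - 1) / 2 + 0.5)
        shift = abs(new_x0 - x0) + abs(new_y0 - y0)
        x0, y0 = new_x0, new_y0
        if shift < min_shift:
            break
    return (x0, y0, w, h)

roi = (2, 2, 4, 4)                      # ROI given in the first frame
frame1 = make_frame(12, (2, 2))
frame2 = make_frame(12, (4, 3))         # the target has moved by (+2, +1)
target_hist = hue_histogram(frame1, roi)
tracked = mean_shift_track(frame2, roi, target_hist)
print(tracked)                          # (4, 3, 4, 4)
```

The returned window would be used as the starting window for the following frame. On real video the hue channel would first be quantized into bins, and the search only succeeds if the old and new target positions overlap.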

References

  • Mean-Shift算法 (“The Mean-Shift Algorithm”)
  • [综] meanshift算法 (“[Survey] The meanshift algorithm”)
  • Dorin Comaniciu, Peter Meer: Mean Shift: A Robust Approach Toward Feature Space Analysis. IEEE Trans. Pattern Anal. Mach. Intell. 24(5): 603-619 (2002).
  • Dorin Comaniciu, Peter Meer: Distribution Free Decomposition of Multivariate Data. Pattern Anal. Appl. 2(1): 22-30 (1999).

Visual Tracking

Kalman Filter

Cons:

  • Kalman filtering is inadequate because it is based on the unimodal Gaussian distribution assumption, and it cannot represent simultaneous alternative hypotheses.
  • It works relatively poorly in clutter, which causes the density to be multi-modal and therefore non-Gaussian.

The Kalman filter is based on a single Gaussian model, and different components affect the Gaussian density in different ways:

  • The deterministic component causes the density function to drift bodily.
  • The random component of the dynamical model leads to spreading—increasing uncertainty.
  • The effect of an external observation is to superimpose a reactive effect on the diffusion.
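
These three effects can be traced in a one-dimensional sketch, where the density is described by a mean and a variance. The constant-velocity drift and all numeric values are assumptions made for illustration.

```python
def kalman_predict(mean, var, velocity, process_var):
    """Drift shifts the whole density; diffusion adds the process variance."""
    return mean + velocity, var + process_var

def kalman_update(mean, var, z, obs_var):
    """An observation z pulls the mean toward it and shrinks the variance."""
    gain = var / (var + obs_var)
    return mean + gain * (z - mean), (1 - gain) * var

mean, var = 0.0, 1.0
mean_p, var_p = kalman_predict(mean, var, velocity=2.0, process_var=0.5)
mean_u, var_u = kalman_update(mean_p, var_p, z=3.0, obs_var=0.5)
print(mean_p, var_p)   # 2.0 1.5
print(mean_u, var_u)   # 2.75 0.375
```

At every stage the density stays a single Gaussian, which is why the filter cannot keep several competing hypotheses alive in clutter.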


Particle Filter

The CONDENSATION Algorithm

[Figure: one time-step of the CONDENSATION algorithm]

At the top of the diagram, the output from time-step $t-1$ is the weighted sample-set $\{(s_{t-1}^{(n)}, \pi_{t-1}^{(n)}),\ n = 1, \dots, N\}$. The aim is to maintain, at successive time-steps, sample sets of fixed size $N$.

  • The first operation is to sample $N$ times, with replacement, from the set $\{s_{t-1}^{(n)}\}$, choosing a given element with probability $\pi_{t-1}^{(n)}$. Some elements, especially those with high weights, may be chosen several times, leading to identical copies of elements in the new set. Others with relatively low weights may not be chosen at all.
  • Each element chosen from the new set is now subjected to the predictive steps. First, an element undergoes drift and, since this is deterministic, identical elements in the new set undergo the same drift.
  • The second predictive step, diffusion, is random, and identical elements now split because each undergoes its own independent motion step. At this stage, the sample set for the new time-step has been generated but, as yet, without its weights.
  • Finally, the observation step is applied, generating weights from the observation density.
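
The four operations above can be sketched for a one-dimensional state. The constant drift, the Gaussian diffusion and the Gaussian observation density are assumptions chosen to keep the example small, and the observation is noise-free so the run is easy to follow.

```python
import math
import random

def condensation_step(samples, weights, z, rng,
                      drift=1.0, diffusion_std=0.5, obs_std=1.0):
    """One CONDENSATION time-step: select, drift, diffuse, measure."""
    n = len(samples)
    # 1. Select: sample N times with replacement, proportionally to the weights.
    chosen = rng.choices(samples, weights=weights, k=n)
    # 2. Drift: deterministic, so identical copies stay identical.
    drifted = [s + drift for s in chosen]
    # 3. Diffuse: independent random motion splits identical copies apart.
    diffused = [s + rng.gauss(0.0, diffusion_std) for s in drifted]
    # 4. Measure: weight each sample by the observation density and normalize.
    new_w = [math.exp(-0.5 * ((z - s) / obs_std) ** 2) for s in diffused]
    total = sum(new_w)
    return diffused, [w / total for w in new_w]

rng = random.Random(0)
n = 500
samples = [rng.uniform(-5.0, 5.0) for _ in range(n)]
weights = [1.0 / n] * n
true_position = 0.0
for _ in range(10):
    true_position += 1.0                       # the target moves by the drift
    samples, weights = condensation_step(samples, weights, true_position, rng)
estimate = sum(w * s for s, w in zip(samples, weights))
print(true_position, estimate)
```

The weighted mean is only one way to read out the sample set; because the set can represent several modes at once, a multi-modal posterior is kept rather than collapsed to a single Gaussian.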

Algorithm:

[Figure: CONDENSATION algorithm listing]

Color-based Particle Filter

Color histograms have many advantages for tracking non-rigid objects as they are robust to partial occlusion, are rotation and scale invariant and are calculated efficiently.

A target is tracked with a particle filter by comparing its histogram with the histograms of the sample positions using the Bhattacharyya distance.

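Bhattacharyya distance: in statistics, the Bhattacharyya distance measures the similarity of two discrete or continuous probability distributions. Its computation is closely related to the Bhattacharyya coefficient.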
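
For two discrete histograms the comparison can be sketched as below. The coefficient is $\rho = \sum_u \sqrt{p_u q_u}$, and the distance is taken here as $d = \sqrt{1 - \rho}$, the form used in the adaptive color-based particle filter paper; the three-bin histograms are made up for illustration.

```python
import math

def bhattacharyya_coefficient(p, q):
    """Sum over bins of sqrt(p_u * q_u); equals 1 for identical normalized histograms."""
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))

def bhattacharyya_distance(p, q):
    """d = sqrt(1 - coefficient); the max() guards against rounding below zero."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q)))

target = [0.5, 0.3, 0.2]
same = [0.5, 0.3, 0.2]
other = [0.0, 0.0, 1.0]
d_same = bhattacharyya_distance(target, same)
d_other = bhattacharyya_distance(target, other)
print(d_same, d_other)
```

In a tracker, sample positions whose histogram has a small distance to the target histogram would receive a large weight.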

Algorithm:

[Figure: color-based particle filter algorithm listing]

Kernel-based Particle Filter

A PF does not perform well when the dynamic system has a very small system noise or if the observation noise has very small variance. In these cases, the particle set quickly collapses to one single point in the state space.

The standard PF often fails to produce a particle set that captures the “irregular” motion, leading to gradually drifting estimates and ultimate loss of target.

KPF estimates the gradient of the kernel density and moves particles toward the modes of the posterior, leading to a more effective allocation of particles.

The gradient estimation and particle allocation are implemented by the mean shift algorithm.

A Boosted Particle Filter

The problem of tracking a varying number of non-rigid objects has two major difficulties:

  • First, the observation models and target distributions can be highly non-linear and non-Gaussian.
  • Second, the presence of a large, varying number of objects creates complex interactions with overlap and ambiguities.

Mixture particle filters and Adaboost:

An effective way is to combine mixture particle filters and Adaboost. The crucial issues in mixture particle filters are the choice of the proposal distribution and the treatment of objects leaving and entering the scene.

The mixture particle filter is ideally suited to multi-target tracking as it assigns a mixture component to each player. The proposal distribution can be constructed by using a mixture model that incorporates information from the dynamic models of each player and the detection hypotheses generated by Adaboost.

Methods:

  • Most multi-target tracking methods assume a fixed number of objects.
  • BraMBLe has an automatic object detection system that relies on modeling a fixed background.
  • The authors relax the fixed-background assumption so that the tracker can cope with a changing background.
  • Particle filters may perform poorly when the posterior is multimodal for multiple targets. Vermaak et al. introduce a mixture particle filter (MPF), where each component is modelled with an individual particle filter. BPF is based on MPF.
  • The authors adopt a multi-color observation model based on Hue-Saturation-Value (HSV) color histograms.

The boosted particle filter introduces two important extensions of the MPF:

  • First, it uses Adaboost to construct the proposal distribution. It incorporates the recent observations in proposal distributions (through the Adaboost detections), and outperforms naive transition prior proposals considerably.
  • Second, Adaboost provides a mechanism for obtaining and maintaining the mixture representation. It allows us to detect objects leaving and entering the scene efficiently.
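
The idea of a proposal that mixes detections with the dynamic model can be sketched for a one-dimensional state. This is an illustrative simplification: the mixing weight `alpha`, the Gaussian shapes and the function name are assumptions, and the detection is passed in as a number rather than produced by an actual Adaboost detector.

```python
import random

def sample_mixture_proposal(prev_state, detection, rng,
                            alpha=0.5, dyn_std=1.0, det_std=0.5):
    """Draw a new state from a mixture of a detection term and the transition prior."""
    if detection is not None and rng.random() < alpha:
        # Detection-driven component: propose near the detector output.
        return rng.gauss(detection, det_std)
    # Transition prior component: propose near the previous state.
    return rng.gauss(prev_state, dyn_std)

rng = random.Random(1)
draws = [sample_mixture_proposal(0.0, 10.0, rng) for _ in range(2000)]
near_detection = sum(1 for d in draws if abs(d - 10.0) < 3.0)
near_previous = sum(1 for d in draws if abs(d) < 3.0)
no_detection = [sample_mixture_proposal(0.0, None, rng) for _ in range(200)]
print(near_detection, near_previous)
```

In a complete filter the importance weights would also have to be divided by this proposal density, so that samples drawn near a detection are not counted twice.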

References

  • M. Isard and A. Blake. Condensation–conditional density propagation for visual tracking. Int. J. Computer Vision, 29(1):5–28, 1998.
  • S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, “A tutorial on particle filters for on-line non-linear/non-Gaussian Bayesian tracking,” IEEE Transactions on Signal Processing, vol. 50, pp. 174–188, Feb. 2002.
  • K. Nummiaro, E. Koller-Meier, L. Van Gool, “An adaptive color-based particle filter”, Image and Vision Computing 21 (2003) 99–110.
  • C. Chang and R. Ansari, “Kernel Particle Filter for Visual Tracking”, IEEE Signal Processing Letters, vol. 12, no. 3, pp. 242-245, 2005.
  • K. Okuma, et al., “A Boosted Particle Filter: Multitarget Detection and Tracking”, ECCV 2004 (2004), pp. 28-39.
  • 基于粒子滤波器的目标跟踪算法及实现 (“Particle-Filter-Based Object Tracking: Algorithm and Implementation”)

Multiple Cues for Tracking

Ensemble Tracking

Face Detection

References
