Published by Amberly Grace Horton. Modified over 8 years ago.
Outline
Video Carving
1. Introduction
2. Related Work
3. Video Carving
   3.1 Min-Cut/Max-Flow Algorithm
       3.1.1 Growth stage
       3.1.2 Augmentation stage
       3.1.3 Adoption stage
4. Implementation
5. Results
6. Conclusions and Future Work
1. Introduction
Motivation: why do we need condensed video, or video synopsis?
This is a particularly significant problem with video, since raw, unedited footage contains long stretches where nothing important happens, with only a few short moments of interest in between.
Example: extracting the key information from a video is particularly important in security and surveillance applications.
1. Introduction
Several techniques have been proposed to condense a long video into a shorter and more useful synopsis.
Downsampling (fast-forwarding): the video is cut down in size by keeping only every nth frame.
Drawback: this can fail to capture a rapidly moving object, since the object may appear and disappear between two temporal samples (an example of temporal aliasing).
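The temporal aliasing drawback can be illustrated with a small sketch. The frame counts and the helper names `downsample` and `event_captured` are made up for illustration and do not come from the paper.

```python
def downsample(num_frames, n):
    """Indices of the frames kept when only every nth frame is retained."""
    return list(range(0, num_frames, n))


def event_captured(kept_frames, start, end):
    """True if at least one kept frame falls inside the event [start, end]."""
    return any(start <= f <= end for f in kept_frames)


kept = downsample(100, 10)             # frames 0, 10, 20, ..., 90
print(event_captured(kept, 41, 44))    # short event between samples: missed
print(event_captured(kept, 38, 44))    # longer event: frame 40 is kept
```

An object visible only in frames 41 to 44 falls entirely between the samples at frames 40 and 50, so it never appears in the fast-forwarded clip.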
1. Introduction
The authors of this paper present a novel scheme that takes a long video stream with m frames and condenses it into a short, viewable clip with n frames (where n << m) that preserves the most important information.
1. Introduction
Idea
Most approaches prune the video by eliminating whole frames from the video stream. The authors observe that a deleted frame does not have to consist of pixels from a single time step.
They think of the frames to be deleted as “sheets” within the space-time volume: each pixel on a sheet has one and only one time step, but different pixels can have different time steps.
2. Related Work
Several techniques have been proposed to create video summaries.
Frame-based approaches: simply play the video faster.
Drawback: fast activities may be lost in the process.
=> To avoid this problem, techniques have been developed that identify activities and adaptively adjust the frame rate.
Object-based approaches: represent activities as 3D objects in the space-time domain (e.g. the video cube) and seek a tighter packing of these objects along the time axis.
2. Related Work
The idea of incrementally removing regions is inspired by Avidan and Shamir's work on seam carving (SIGGRAPH 2007). To resize an image, they incrementally remove seams, which are 8-connected paths through the image.
Complementary to video summarization is video retargeting, the task of adapting a video to different output resolutions.
3. Video Carving
A long video can be summarized through video carving by incrementally removing 2D sheets from the video cube to reduce its total time.
The sheet must fully cut across the xy-plane of the video cube.
To compute this sheet, the authors use a min-cut formulation.
3. Video Carving
A min-cut will pass through regions of low difference (i.e. high similarity). When the low-difference sheet has been found and removed, the resulting video will have few visual artifacts, since the removed pixels are similar to their surroundings both spatially and temporally.
By creating an appropriate graph of video pixels and augmenting it with source and sink nodes, they can find the min-cut of this graph and from it compute the corresponding sheet to remove from the video cube.
3. Video Carving
First, they define a node for each pixel of the video cube. Each node has edges to its top, bottom, left, and right neighbors, and to the nodes at the same pixel location in the next and previous frames.
Edge weights are computed using a measure of spatio-temporal difference.
[Figure: a node (pixel) and its edges]
3. Video Carving
A source node is connected to all the nodes in the first frame, and a sink node is connected to all the nodes in the last frame.
[Figure: first frame and last frame]
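The graph construction described on these slides could be sketched as follows. The slides do not specify the exact difference measure or the capacity of the terminal links, so the absolute intensity difference and the infinite terminal capacities used here are assumptions, and `build_graph` is a hypothetical helper name.

```python
import numpy as np


def build_graph(video):
    """Build a 6-connected graph over a grayscale video cube of shape (T, H, W).

    Returns (edges, source, sink), where edges maps (u, v) -> weight.
    ASSUMPTIONS: weight = absolute intensity difference; the source and sink
    links get infinite capacity so that the cut never passes through them.
    """
    T, H, W = video.shape

    def node(t, y, x):
        return (t * H + y) * W + x

    source, sink = T * H * W, T * H * W + 1
    edges = {}
    for t in range(T):
        for y in range(H):
            for x in range(W):
                u = node(t, y, x)
                # right, bottom, and next-frame neighbors (each pair stored once)
                for dt, dy, dx in ((0, 0, 1), (0, 1, 0), (1, 0, 0)):
                    t2, y2, x2 = t + dt, y + dy, x + dx
                    if t2 < T and y2 < H and x2 < W:
                        diff = float(video[t, y, x]) - float(video[t2, y2, x2])
                        edges[(u, node(t2, y2, x2))] = abs(diff)
    for y in range(H):
        for x in range(W):
            edges[(source, node(0, y, x))] = float("inf")    # first frame
            edges[(node(T - 1, y, x), sink)] = float("inf")  # last frame
    return edges, source, sink


video = np.zeros((3, 2, 2))
edges, source, sink = build_graph(video)
print(len(edges))  # 20 neighbor edges + 8 terminal edges
```

Storing each neighbor pair once keeps the sketch short; a flow solver would treat each stored pair as an undirected edge or add the reverse direction itself.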
3.1 Min-Cut/Max-Flow Algorithm
To compute the min-cut on the graph, they use Boykov and Kolmogorov’s min-cut/max-flow algorithm (IEEE Transactions on PAMI, 2004).
First, we have a directed graph G = (V, E).
Notation:
S, T: search trees rooted at the source and the sink
s, t: source node and sink node
O: set of orphans
A: set of active nodes
S ⊂ V, s ∈ S, T ⊂ V, t ∈ T, S ∩ T = ∅
Terminology:
Active node: active nodes form the outer border of each tree, while passive nodes are internal. Active nodes allow the trees to “grow” by acquiring new children (along non-saturated edges) from the set of free nodes.
Passive node: passive nodes cannot grow, as they are completely blocked by other nodes from the same tree.
Free node: nodes that are in neither S nor T are called “free”.
3.1 Min-Cut/Max-Flow Algorithm
It is convenient to store the content of the search trees S and T via flags TREE(p) indicating the affiliation of each node p:
TREE(p) = S if p ∈ S
TREE(p) = T if p ∈ T
TREE(p) = ∅ if p is a free node
If node p belongs to one of the search trees, the information about its parent is stored as PARENT(p). Roots of the search trees (the source and the sink), orphans, and all free nodes have no parents, i.e. PARENT(p) = ∅.
We will also use the notation tree_cap(p → q) to describe the residual capacity of either edge (p, q) if TREE(p) = S or edge (q, p) if TREE(p) = T.
3.1 Min-Cut/Max-Flow Algorithm
The algorithm iteratively repeats the following three stages:
“growth” stage: search trees S and T grow until they touch, giving an s → t path
“augmentation” stage: the found path is augmented, and the search tree(s) break into forest(s)
“adoption” stage: trees S and T are restored
3.1 Min-Cut/Max-Flow Algorithm
initialize: S = {s}, T = {t}, A = {s, t}, O = ∅
while true
    grow S or T to find an augmenting path P from s to t
    if P = ∅ terminate
    augment on P
    adopt orphans
end while
3.1.1 Growth stage
Growth stage: at this stage active nodes acquire new children from the set of free nodes. The growth stage terminates if an active node encounters a neighboring node that belongs to the opposite tree. In this case we have detected a path from the source to the sink.
while A != ∅
    pick an active node p ∈ A
    for every neighbor q such that tree_cap(p → q) > 0
        if TREE(q) = ∅ then add q to the search tree as an active node:
            TREE(q) := TREE(p), PARENT(q) := p, A := A ∪ {q}
        if TREE(q) != ∅ and TREE(q) != TREE(p) return P = PATH s → t
    end for
    remove p from A
end while
return P = ∅
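A minimal Python sketch of the growth stage, following the pseudocode on this slide. The data layout is an assumption made for illustration: `residual` is a dict of residual capacities keyed by directed edge, `tree` and `parent` are dicts, and `active` is a deque. Instead of the full path, this sketch returns the edge where the two trees meet; the path itself follows from the `parent` links on both sides.

```python
from collections import deque


def tree_cap(residual, tree, p, q):
    """Residual capacity of (p, q) if p is in S, or of (q, p) if p is in T."""
    if tree[p] == "S":
        return residual.get((p, q), 0)
    return residual.get((q, p), 0)


def grow(residual, neighbors, tree, parent, active):
    """Grow both trees; return the meeting edge (node in S, node in T) or None."""
    while active:
        p = active[0]
        for q in neighbors[p]:
            if tree_cap(residual, tree, p, q) <= 0:
                continue
            if tree.get(q) is None:          # free node: adopt it as a child of p
                tree[q] = tree[p]
                parent[q] = p
                active.append(q)
            elif tree[q] != tree[p]:         # the two trees touch
                return (p, q) if tree[p] == "S" else (q, p)
        active.popleft()                     # p has no more room to grow
    return None


# Tiny example: s -> a -> t, each edge with capacity 1
residual = {("s", "a"): 1, ("a", "t"): 1}
neighbors = {"s": ["a"], "a": ["s", "t"], "t": ["a"]}
tree = {"s": "S", "t": "T"}
parent = {}
print(grow(residual, neighbors, tree, parent, deque(["s", "t"])))
```

In this example the source tree first acquires a, and the sink tree then meets it across the edge (a, t).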
3.1.2 Augmentation stage
Augmentation stage: the augmentation phase may split the search trees S and T into forests. The source s and the sink t are still roots of two of the trees, while orphans form the roots of all other trees.
find the bottleneck capacity ∆ on P
update the residual graph by pushing flow ∆ through P
for each edge (p, q) in P that becomes saturated
    if TREE(p) = TREE(q) = S then set PARENT(q) := ∅ and O := O ∪ {q} (q is an orphan)
    if TREE(p) = TREE(q) = T then set PARENT(p) := ∅ and O := O ∪ {p} (p is an orphan)
end for
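The flow update in the augmentation stage can be sketched as below. The dict-based residual graph and the helper name `augment` are assumptions for illustration; the orphan bookkeeping from the pseudocode is left out, and the function simply reports which edges became saturated.

```python
def augment(residual, path):
    """Push the bottleneck flow along path (a list of nodes from s to t).

    Returns (delta, saturated), where saturated lists the path edges whose
    residual capacity dropped to zero.
    """
    edges = list(zip(path, path[1:]))
    delta = min(residual[e] for e in edges)          # bottleneck capacity
    saturated = []
    for p, q in edges:
        residual[(p, q)] -= delta                    # use up forward capacity
        residual[(q, p)] = residual.get((q, p), 0) + delta  # allow later undo
        if residual[(p, q)] == 0:
            saturated.append((p, q))
    return delta, saturated


residual = {("s", "a"): 3, ("a", "b"): 2, ("b", "t"): 4}
print(augment(residual, ["s", "a", "b", "t"]))
```

Here the bottleneck is the edge (a, b), so 2 units of flow are pushed and only that edge becomes saturated.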
3.1.3 Adoption stage
Adoption stage: during this stage all orphan nodes in O are processed until O becomes empty. Each node p being processed tries to find a new valid parent within the same search tree. If it succeeds, p remains in the tree but with a new parent; otherwise it becomes a free node and all its children are added to O.
The goal of the adoption stage is to restore the single-tree structure of sets S and T with roots in the source and the sink.
while O != ∅
    pick an orphan node p ∈ O and remove it from O
    process p
end while
3.1.3 Adoption stage
Process p
Try to find a new valid parent for p among its neighbors.
A valid parent q should satisfy: TREE(q) = TREE(p), tree_cap(q → p) > 0, and the “origin” of q should be either the source or the sink.
If node p finds a new valid parent q, then set PARENT(p) = q. (In this case p remains in its search tree and the active or passive status of p remains unchanged.)
If p does not find a valid parent, then:
scan all neighbors q of p such that TREE(q) = TREE(p):
    - if tree_cap(q → p) > 0, add q to the active set A
    - if PARENT(q) = p, add q to the set of orphans O and set PARENT(q) := ∅
TREE(p) := ∅, A := A − {p} (p becomes a free node)
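The “process p” step could be sketched in Python as follows. The dict and list layout, the terminal names "s" and "t", and the helper names `has_terminal_origin` and `process_orphan` are assumptions chosen for illustration, not the reference implementation.

```python
def tree_cap(residual, tree, p, q):
    """Residual capacity of (p, q) if p is in S, or of (q, p) if p is in T."""
    if tree[p] == "S":
        return residual.get((p, q), 0)
    return residual.get((q, p), 0)


def has_terminal_origin(q, parent, terminals):
    """Follow parent links from q; valid only if the root is s or t."""
    while parent.get(q) is not None:
        q = parent[q]
    return q in terminals


def process_orphan(p, residual, neighbors, tree, parent, active, orphans,
                   terminals=("s", "t")):
    """Return True if p found a new parent, False if p became a free node."""
    same_tree = [q for q in neighbors[p] if tree.get(q) == tree[p]]
    for q in same_tree:
        if (tree_cap(residual, tree, q, p) > 0
                and has_terminal_origin(q, parent, terminals)):
            parent[p] = q
            return True
    for q in same_tree:
        if tree_cap(residual, tree, q, p) > 0 and q not in active:
            active.append(q)                 # q may grow into p later
        if parent.get(q) == p:
            parent[q] = None                 # children of p become orphans
            orphans.append(q)
    tree[p] = None                           # p becomes a free node
    if p in active:
        active.remove(p)
    return False


# a lost its parent s (edge s -> a saturated) but can be adopted by c
neighbors = {"a": ["s", "b", "c"]}
tree = {"s": "S", "a": "S", "b": "S", "c": "S"}
parent = {"s": None, "a": None, "b": "a", "c": "s"}
residual = {("s", "a"): 0, ("c", "a"): 1, ("a", "b"): 1}
print(process_orphan("a", residual, neighbors, tree, parent, [], []))
print(parent["a"])
```

Node b is rejected as a parent both because the edge b → a has no residual capacity and because its origin is the orphan a itself, not a terminal.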
3.1 Min-Cut/Max-Flow Algorithm
Termination condition
The algorithm terminates when the search trees S and T cannot grow (there are no active nodes) and the trees are separated by saturated edges. This implies that a maximum flow has been achieved.
The corresponding minimum cut is then given by the two search trees: the source side of the cut is the final tree S and the sink side is the final tree T.
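To make the link between maximum flow and the minimum cut concrete, here is a small self-contained sketch. It deliberately swaps in the simpler breadth-first augmenting-path method (Edmonds-Karp) rather than the Boykov-Kolmogorov algorithm from the slides; both end with the same max-flow value, and the source side of the cut is read off as the set of nodes still reachable from s in the residual graph. The function name `min_cut` and the dict layout are assumptions.

```python
from collections import deque


def min_cut(capacity, s, t):
    """capacity: dict {(u, v): c}. Returns (max_flow, source_side_nodes)."""
    residual = dict(capacity)
    adj = {}
    for u, v in capacity:
        residual.setdefault((v, u), 0)       # reverse edge for undoing flow
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def reachable():
        """BFS over edges with positive residual capacity; returns parent map."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable()
        if t not in parent:                  # no augmenting path: flow is maximal
            return flow, set(parent)
        path, v = [], t
        while parent[v] is not None:         # walk back from t to s
            path.append((parent[v], v))
            v = parent[v]
        delta = min(residual[e] for e in path)
        for u, v in path:
            residual[(u, v)] -= delta
            residual[(v, u)] += delta
        flow += delta


flow, source_side = min_cut({("s", "a"): 5, ("a", "b"): 1, ("b", "t"): 5}, "s", "t")
print(flow, sorted(source_side))
```

In the example the cheapest edge to cut is (a, b), so the source side ends up containing s and a.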
3. Video Carving
Finally, they find a min-cut on this graph and compute a corresponding sheet that has only one temporal value at every projected pixel location.
To do this, they first find the set of nodes “S” that have edges crossing the min-cut. They then use a “front-surface” strategy to determine which nodes to remove.
3. Video Carving
For each pixel location, they project along the time axis of the video cube, from the first frame to the last frame.
The first node n ∈ S encountered is the pixel removed from the video cube.
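The “front-surface” projection can be sketched with NumPy. The boolean mask `in_set` (True where a node belongs to the set of nodes with edges crossing the cut) and the function name `front_surface` are assumptions for illustration.

```python
import numpy as np


def front_surface(in_set):
    """For each pixel location, return the first time index t where in_set is True.

    in_set: boolean array of shape (T, H, W).
    Returns an integer array of shape (H, W): the sheet to remove.
    """
    if not in_set.any(axis=0).all():
        raise ValueError("every pixel location must be crossed by the cut")
    # argmax returns the first occurrence of the maximum (True) along time
    return in_set.argmax(axis=0)


mask = np.zeros((3, 1, 2), dtype=bool)
mask[1, 0, 0] = mask[2, 0, 0] = True   # pixel (0, 0): candidates at t = 1, 2
mask[0, 0, 1] = True                   # pixel (0, 1): candidate at t = 0
print(front_surface(mask))             # time index chosen per pixel
```

Picking the first candidate per pixel guarantees exactly one removed time step at every location, which is the property the sheet needs.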
3. Video Carving Once a sheet is removed from the video cube, the remaining pixels are packed to cover the empty space. Because every pixel location had one and only one frame removed, the total video cube is shortened by one frame.
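Removing a sheet and packing the remaining pixels might look like the following NumPy sketch. The array layout (T, H, W) and the function name `remove_sheet` are assumptions; the slides describe a linked-grid structure instead, so this only illustrates the effect on the data.

```python
import numpy as np


def remove_sheet(video, sheet):
    """Remove one pixel per location from a (T, H, W) video cube.

    sheet: integer array (H, W) giving the time index to delete at each pixel.
    Returns a (T - 1, H, W) cube with the remaining pixels packed along time.
    """
    T, H, W = video.shape
    keep = np.arange(T)[:, None, None] != sheet[None, :, :]   # (T, H, W) mask
    # Put time last so each pixel's surviving samples stay contiguous and ordered
    packed = video.transpose(1, 2, 0)[keep.transpose(1, 2, 0)]
    return packed.reshape(H, W, T - 1).transpose(2, 0, 1)


video = np.arange(6).reshape(3, 1, 2)      # pixel x=0: 0,2,4  pixel x=1: 1,3,5
sheet = np.array([[0, 2]])                 # drop t=0 at x=0 and t=2 at x=1
print(remove_sheet(video, sheet).tolist())
```

Each pixel location loses exactly one sample, so the output cube is one frame shorter while the temporal order at every location is kept.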
4. Implementation
Restriction: the memory requirements of storing the entire data structure can be significant.
=> The video stream is stored as a 3D doubly-linked grid of pixels, with each “pixel” storing the color and gradient information as well as pointers to its neighbors, resulting in a structure 40 bytes in size per pixel.
This limits the maximum number of pixels in the graph to about 50 million (32-bit Windows gives applications only 2 GB of total memory).
Example: for a 720×480 video at 30 frames per second, this yields only about 150 frames (5 seconds), which is unacceptable.
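The memory estimate on this slide can be checked with a few lines of arithmetic. The calculation assumes the full 2 GB is available to the pixel grid, which is optimistic; the variable names are illustrative.

```python
BYTES_PER_PIXEL = 40                 # size of one pixel structure (from the slide)
MEMORY_BYTES = 2 * 1024**3           # 2 GB address space on 32-bit Windows
WIDTH, HEIGHT, FPS = 720, 480, 30

max_pixels = MEMORY_BYTES // BYTES_PER_PIXEL
max_frames = max_pixels // (WIDTH * HEIGHT)
seconds = max_frames / FPS

print(max_pixels)          # roughly 50 million pixels
print(max_frames)          # roughly 150 frames
print(round(seconds, 1))   # roughly 5 seconds
```

The upper bound comes out slightly above the slide's rounded figures, which is consistent with part of the 2 GB being needed for other data.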
4. Implementation
In order to process larger videos, they take the input video and break it up into smaller video subsets, each of which fits entirely within memory.
They then remove a single frame (one sheet) from each subset with the min-cut algorithm.
Therefore, after the first pass through the entire video is finished, they have removed as many frames as there were video subsets.
They continue making passes through the video, removing frames until the video reaches the desired size.
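The multi-pass schedule can be sketched as a simple frame count simulation. The function name `carve_schedule` is hypothetical, and stopping partway through the last pass once the target length is reached is an assumption; the slide only says that passes continue until the desired size is reached.

```python
def carve_schedule(total_frames, target_frames, chunk_frames):
    """Simulate the passes: each pass removes one frame per in-memory subset.

    Returns the list of video lengths after each pass.
    """
    lengths = []
    frames = total_frames
    while frames > target_frames:
        num_chunks = -(-frames // chunk_frames)           # ceiling division
        removed = min(num_chunks, frames - target_frames)  # do not overshoot
        frames -= removed
        lengths.append(frames)
    return lengths


print(carve_schedule(10, 4, 5))   # lengths after each pass
```

With 10 frames, a target of 4, and subsets of 5 frames, each pass removes 2 frames, so three passes are needed.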
5. Results
Video carving preserves important information that is missing from the fast-forwarded version.
5. Results
However, the video carving technique has artifacts that show up as “motion tails” following rapidly moving objects. These are caused by video sheets that cut across the path of the object, placing it in the same frame as a previous image of itself.
=> These artifacts are a direct result of having to process small subsets of the video at a time: each subset was only a few seconds long, yet still required the removal of a video sheet.
6. Conclusions and Future Work
First, the motion tails in the condensed video might be reduced by processing larger blocks of video at one time.
In addition, it would be of interest to be able to enforce temporal order in the final video. Because the method does not use any object information during processing, the carving of video sheets can cause discontinuities to appear as objects move.
6. Conclusions and Future Work
By carving out low-gradient video sheets from a long video, they are able to produce a much shorter version that preserves important information, even going as far as compositing objects that occur at different times into the same frame.