Summary

Summary of paper End-to-end Object Detection with Transformers

Pages
7
Uploaded on
05-07-2024
Written in
2023/2024

This is a summary of the paper "End-to-end Object Detection with Transformers", written for the course Seminar of Computer Vision by Deep Learning at TU Delft.











Preview of the content

End-to-end Object Detection with Transformers
Abstract
DETR removes the need for many hand-designed components, such as the
non-maximum suppression procedure and anchor generation. The main ingredients
of the new framework, called DEtection TRansformer (DETR), are a set-based
global loss that forces unique predictions via bipartite matching, and a
transformer encoder-decoder architecture.
Prior methods: Current object detection pipelines include hand-crafted
components such as spatial anchor generation and non-maximum suppression (NMS).
Each of these components is tuned specifically for a given task. For example,
NMS is threshold-based: both an IoU (intersection over union) threshold and a
confidence threshold must be tuned before it discards overlapping bounding
boxes effectively.
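To make the threshold dependence concrete, here is a minimal sketch of greedy NMS in plain Python. It is not taken from the paper or from any particular detector: the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, the default thresholds, and the toy boxes are all assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5, score_threshold=0.05):
    """Greedy NMS: returns indices of kept boxes, highest score first."""
    # Drop low-confidence boxes, then visit the rest from best to worst.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Toy input (hypothetical): two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept_default = nms(boxes, scores)                   # IoU threshold 0.5
kept_loose = nms(boxes, scores, iou_threshold=0.7)  # IoU threshold 0.7
```

With these toy inputs the first two boxes have an IoU of about 0.68, so the second box is suppressed at a threshold of 0.5 but kept at 0.7. This sensitivity to a hand-picked threshold is the kind of tuning that DETR is designed to avoid.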


Introduction
Object detection aims to predict a set of bounding boxes and category labels,
one for each object of interest. Modern detectors address this set prediction
task indirectly, by defining surrogate regression and classification problems
on a large set of proposals, anchors, or window centers. Their performance is
significantly influenced by the postprocessing steps that collapse
near-duplicate predictions.




DETR directly predicts (in parallel) the final set of detections by combining a common CNN
with a transformer architecture. During training, bipartite matching uniquely assigns
predictions to ground-truth boxes.


Our DEtection TRansformer predicts all objects at once, and is trained
end-to-end with a set loss function which performs bipartite matching between
predicted and ground-truth objects.
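The matching step can be illustrated with a small sketch. The paper computes the optimal assignment with the Hungarian algorithm; the version below instead brute-forces all permutations, which reaches the same optimum but is only feasible for a handful of boxes. The cost used here (plain L1 distance between boxes) is a simplifying assumption, since DETR's matching cost also includes class-probability and generalized IoU terms. The names `l1_cost` and `match` and the toy boxes are hypothetical.

```python
from itertools import permutations


def l1_cost(pred, target):
    """L1 distance between two boxes given as equal-length tuples."""
    return sum(abs(p - t) for p, t in zip(pred, target))


def match(preds, targets):
    """Assign each target to a distinct prediction with minimal total cost.

    Returns (assignment, total_cost), where assignment[j] is the index of the
    prediction matched to target j. Brute force: suitable for toy inputs only.
    """
    assert len(preds) >= len(targets)
    best, best_cost = None, float("inf")
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(l1_cost(preds[i], targets[j]) for j, i in enumerate(perm))
        if cost < best_cost:
            best, best_cost = list(perm), cost
    return best, best_cost


# Three predictions, two ground-truth boxes, as normalized (cx, cy, w, h).
preds = [
    (0.50, 0.50, 0.20, 0.20),
    (0.10, 0.10, 0.10, 0.10),
    (0.52, 0.48, 0.20, 0.22),  # near-duplicate of the first prediction
]
targets = [(0.10, 0.12, 0.10, 0.10), (0.50, 0.50, 0.20, 0.20)]

assignment, total_cost = match(preds, targets)
# Predictions without a match would be trained to predict "no object".
unmatched = sorted(set(range(len(preds))) - set(assignment))
```

In this toy case the near-duplicate third prediction receives no ground-truth box, because each target is matched to exactly one prediction. Training such unmatched predictions toward "no object" is what lets the set loss discourage duplicates without NMS.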





Compared to most previous work on direct set prediction, the main feature of
DETR is the combination of the bipartite matching loss with transformers that
use (non-autoregressive) parallel decoding.


Related Work
Set Prediction
A task where a model predicts multiple elements whose ordering is not relevant
for correctness (essentially, predicting multiple objects in an image).
Current detectors handle this by building relationships or predefined
knowledge into the model. For instance, the predicted bounding boxes should
not overlap significantly and should cover all detected objects.
Avoiding near-duplicates: In object detection, a model sometimes produces
several near-identical bounding boxes for the same object. This is usually
resolved with NMS as a postprocessing step, whereas direct set prediction aims
to make that step unnecessary.

Transformers and Parallel Decoding
Transformers introduced self-attention layers, which, similarly to Non-Local
Neural Networks, scan through each element of a sequence and update it by
aggregating information from the whole sequence.
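As a rough illustration of "update each element by aggregating information from the whole sequence", the sketch below implements single-head scaled dot-product self-attention in plain Python. It assumes identity query, key, and value projections (a real transformer layer learns these weights) and omits multi-head splitting and positional encodings. The names `softmax` and `self_attention` and the toy sequence are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Single-head self-attention with identity Q/K/V projections.

    seq is a list of equal-length vectors. Each output vector is a weighted
    average of all input vectors, weighted by scaled dot-product similarity.
    Returns (outputs, attention_weights).
    """
    d = len(seq[0])
    outputs, weights = [], []
    for q in seq:
        # Similarity of this element to every element of the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in seq]
        w = softmax(scores)
        weights.append(w)
        # Aggregate information from the whole sequence.
        outputs.append(
            [sum(w[j] * seq[j][c] for j in range(len(seq))) for c in range(d)]
        )
    return outputs, weights


sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
updated, attn = self_attention(sequence)
```

Each row of `attn` sums to 1, so every updated element is a convex combination of all input elements. All rows are computed independently of each other, which is what makes the operation parallelizable across the sequence.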

Object Detection
Set-based loss: Several earlier object detectors used a bipartite matching loss.
Recurrent detectors: Closest to DETR are end-to-end set prediction approaches
for object detection and instance segmentation. Like DETR, they use
bipartite-matching losses with encoder-decoder architectures based on CNN
activations to directly produce a set of bounding boxes. These approaches,
however, were only evaluated on small datasets and not against modern
baselines. In particular, they are based on autoregressive models (more
precisely, RNNs), so they do not leverage recent transformers with parallel
decoding.


The DETR model
Object Detection set prediction loss




Seller: guillemribes (Technische Universiteit Delft)
