• SuperGlue: Learning Feature Matching With Graph Neural Networks

    • http://cvpr20.com/event/superglue-learning-feature-matching-with-graph-neural-networks-2/
    • Keypoint detection/description can use any method, e.g. SIFT or SuperPoint
    • Older pipelines filter out incorrect matches with the ratio test, mutual check, etc. before RANSAC
    • SuperGlue uses a GNN with attention (self- and cross-attention) to match descriptors (sketch below)
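    • A minimal PyTorch sketch of the self-/cross-attention idea over two sets of keypoint descriptors (my own illustration, not the official SuperGlue code; the descriptor dimension, head count and toy usage are assumptions):

```python
# Illustrative sketch only (not the official SuperGlue implementation): one
# attentional GNN layer that applies self-attention within each image and
# cross-attention between the two images' keypoint descriptors.
import torch
import torch.nn as nn

class AttentionalGNNLayer(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, desc_a, desc_b):
        # Self-attention: keypoints attend to keypoints in the same image.
        desc_a = desc_a + self.self_attn(desc_a, desc_a, desc_a)[0]
        desc_b = desc_b + self.self_attn(desc_b, desc_b, desc_b)[0]
        # Cross-attention: keypoints attend to keypoints in the other image.
        desc_a2 = desc_a + self.cross_attn(desc_a, desc_b, desc_b)[0]
        desc_b2 = desc_b + self.cross_attn(desc_b, desc_a, desc_a)[0]
        return desc_a2, desc_b2

# Toy usage: descriptors from any detector (e.g. SIFT or SuperPoint).
layer = AttentionalGNNLayer()
a, b = torch.randn(1, 100, 256), torch.randn(1, 120, 256)
a, b = layer(a, b)
scores = a @ b.transpose(1, 2)  # pairwise match scores (before Sinkhorn/OT)
```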
  • Is self-supervised pretraining always helpful?
    • http://cvpr20.com/event/how-useful-is-self-supervised-pretraining-for-visual-tasks-2/
    • Mostly helpful when labeled data is limited
      • With a huge number of labels, training from scratch is just as good
  • Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition From a Domain Adaptation Perspective
    • http://cvpr20.com/event/rethinking-class-balanced-methods-for-long-tailed-visual-recognition-from-a-domain-adaptation-perspective-2/
    • Views long-tailed recognition from a domain adaptation perspective
    • Domain adaptation: covariate shift between source and target
    • The conditional distribution p(x | y) for tail classes differs from that of head classes
    • Class-balanced loss + learning to reweight (meta-learning); a loss sketch follows below
    • Two stages: first train normally; the second stage is meta-learning, where a balanced set is used to learn the reweighting, after which the final model is obtained
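    • A minimal sketch of the class-balanced loss ingredient (effective-number weighting) that the two-stage scheme pairs with meta-learned reweighting; beta and the toy class counts are assumptions, not the paper's settings:

```python
# Minimal sketch of a class-balanced cross-entropy using "effective number of
# samples" weighting; the paper combines this kind of loss with meta-learned
# reweighting in stage two. beta and the toy counts are assumptions.
import torch
import torch.nn.functional as F

def class_balanced_ce(logits, targets, samples_per_class, beta=0.999):
    counts = torch.as_tensor(samples_per_class, dtype=torch.float32)
    effective_num = (1.0 - beta ** counts) / (1.0 - beta)  # effective samples
    weights = 1.0 / effective_num
    weights = weights / weights.sum() * len(weights)  # normalize to mean 1
    return F.cross_entropy(logits, targets, weight=weights)

# Toy long-tailed setup: 3 classes with 1000 / 100 / 10 training samples.
logits = torch.randn(8, 3)
targets = torch.randint(0, 3, (8,))
loss = class_balanced_ce(logits, targets, samples_per_class=[1000, 100, 10])
```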
  • CNN-generated fakes are easy to spot, for now:
    • http://cvpr20.com/event/cnn-generated-images-are-surprisingly-easy-to-spot-for-now2nd-time/
    • New dataset introduced: CNN-generated fakes produced with GANs, deepfakes, etc.
    • Can we make a universal detector ?
    • Train on images generated by one method and test on images produced by all other methods
    • Average precision metric
    • Results:
      • Training also uses additional augmentations such as blur + JPEG compression (sketch below)
      • Good on many generators, but low performance on deepfakes and super-resolution
    • BUT detection is not easy when:
      • The architecture changes, e.g. a non-CNN model
      • Deepfakes
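    • A small sketch of the blur + JPEG augmentation idea used when training the detector; the probabilities and parameter ranges below are assumptions, not the paper's exact settings:

```python
# Sketch of the blur + JPEG augmentation idea used when training the detector.
# Probabilities and parameter ranges are assumptions, not the paper's settings.
import io
import random
from PIL import Image, ImageFilter

def blur_jpeg_augment(img: Image.Image, p_blur=0.5, p_jpeg=0.5) -> Image.Image:
    if random.random() < p_blur:
        # Gaussian blur with a random sigma.
        img = img.filter(ImageFilter.GaussianBlur(radius=random.uniform(0.0, 3.0)))
    if random.random() < p_jpeg:
        # Re-encode as JPEG at a random quality to add compression artifacts.
        buf = io.BytesIO()
        img.convert("RGB").save(buf, format="JPEG", quality=random.randint(30, 95))
        buf.seek(0)
        img = Image.open(buf).convert("RGB")
    return img
```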
  • High-frequency components help explain generalization (interesting)
    • http://cvpr20.com/event/high-frequency-component-helps-explain-the-generalization-of-convolutional-neural-networks/
    • DNNs might be relying on high-frequency components (HFC) more than low-frequency components (LFC); a frequency-split sketch follows below
    • Humans mostly focus on LFC
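    • A minimal sketch of splitting an image into LFC/HFC with an FFT and a radial cutoff, in the spirit of the paper's analysis; the cutoff radius is an assumption:

```python
# Minimal sketch: split a grayscale image into low- and high-frequency
# components with an FFT and a radial cutoff. The cutoff radius r is an
# assumption, not the paper's exact setting.
import numpy as np

def split_frequency(img, r=12):
    """img: 2-D array. Returns (low_freq, high_freq) components."""
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= r ** 2  # low-freq disk
    low = np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))
    high = np.real(np.fft.ifft2(np.fft.ifftshift(f * ~mask)))
    return low, high

# A CNN can often still classify `high` correctly even though to a human it
# looks mostly like noise.
low, high = split_frequency(np.random.rand(64, 64))
```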
  • Eternal Sunshine of the Spotless Net
    • How to forget: Fisher-based noise is added to the weights so that unwanted knowledge is deleted (sketch below)
    • No need to retrain the network from scratch
    • http://cvpr20.com/event/eternal-sunshine-of-the-spotless-net-selective-forgetting-in-deep-networks/
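    • A highly simplified sketch of the idea (not the paper's exact scrubbing procedure): add noise to weights in inverse proportion to a diagonal Fisher information estimated on the data to retain; the noise scale is an assumption:

```python
# Highly simplified sketch (not the paper's exact procedure): estimate a
# diagonal Fisher information on the data to *retain*, then add Gaussian noise
# that is large where the Fisher is small, so weights used only for the data
# to forget get destroyed. `scale` is an assumption.
import torch
import torch.nn.functional as F

def fisher_scrub(model, retain_loader, scale=1e-3, eps=1e-8):
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in retain_loader:
        model.zero_grad()
        F.cross_entropy(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    with torch.no_grad():
        for n, p in model.named_parameters():
            std = scale / (fisher[n] + eps).sqrt()  # small noise on important weights
            p.add_(torch.randn_like(p) * std)
```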
  • Revisiting knowledge distillation (KD)
    • http://cvpr20.com/event/revisiting-knowledge-distillation-via-label-smoothing-regularization/
    • KD acts like a regularizer
    • A weak teacher also improves student performance
    • A student can also improve teacher performance
    • Analogy between label smoothing and KD (sketch below)
      • KD = learned label smoothing; LSR = ad-hoc KD with a virtual teacher
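    • A minimal sketch of the analogy: the usual KD loss next to label smoothing written as KD against a uniform "virtual teacher"; temperature and smoothing values are illustrative:

```python
# Minimal sketch of the analogy: standard KD loss vs. label smoothing viewed
# as KD with a uniform "virtual teacher". T and eps are illustrative.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=4.0):
    # KL divergence between softened teacher and student distributions.
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T

def lsr_as_virtual_kd(student_logits, targets, eps=0.1):
    # Label smoothing = cross-entropy against a mix of the one-hot label and a
    # uniform distribution, i.e. a hand-crafted (virtual) teacher.
    num_classes = student_logits.size(-1)
    one_hot = F.one_hot(targets, num_classes).float()
    virtual_teacher = (1 - eps) * one_hot + eps / num_classes
    return -(virtual_teacher * F.log_softmax(student_logits, dim=-1)).sum(-1).mean()
```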
  • Computing the Testing Error Without a Testing Set

    • http://cvpr20.com/event/computing-the-testing-error-without-a-testing-set2nd-time/
  • Learning to forget for meta learning
    • http://cvpr20.com/event/learning-to-forget-for-meta-learning/
    • Key: a task-wise initialization strategy for MAML instead of a single general one
    • Three ways of doing meta learning:
      • Metric-based
      • External memory (MANN = memory-augmented NN)
      • Optimization-based (MAML)
    • Gradient conflict across tasks makes the optimization landscape:
      • Sharper = harder to optimize, poor generalization
      • More conflict in the last layers, as they are more task-specific
    • To selectively forget, attenuate some weights (sketch below)
      • Magnitude of the weights ≈ how much prior learning is retained
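    • A simplified sketch of the attenuation idea: scale the shared initialization with learned layer-wise gates before the MAML inner-loop step; the gate parameterization and the toy linear task are assumptions, not the paper's exact formulation:

```python
# Simplified sketch of "selective forgetting" in meta-learning: before the
# MAML inner loop, attenuate the shared initialization layer-wise with learned
# gates so that conflicting knowledge is forgotten. Details are assumptions.
import torch

def attenuated_inner_step(params, gates, loss_fn, task_batch, lr=0.01):
    # params: dict of weight tensors; gates: one learnable scalar per layer.
    attenuated = {k: torch.sigmoid(gates[k]) * p for k, p in params.items()}
    loss = loss_fn(attenuated, task_batch)
    grads = torch.autograd.grad(loss, list(attenuated.values()), create_graph=True)
    # One SGD step from the attenuated initialization (standard MAML-style update).
    return {k: w - lr * g for (k, w), g in zip(attenuated.items(), grads)}

# Toy usage: a single linear "task" with one weight matrix.
params = {"lin": torch.randn(3, 5, requires_grad=True)}
gates = {"lin": torch.zeros((), requires_grad=True)}  # sigmoid(0) = 0.5
x, y = torch.randn(4, 5), torch.randn(4, 3)
mse = lambda p, batch: ((batch[0] @ p["lin"].t() - batch[1]) ** 2).mean()
adapted = attenuated_inner_step(params, gates, mse, (x, y))
```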
  • Robust Learning Through Cross-Task Consistency
    • http://cvpr20.com/event/robust-learning-through-cross-task-consistency/
  • Talks:

    • Elizabeth Spelke + Jitendra Malik + Larry Zitnick (recommended: watch all the keynotes)
      • http://mindsvsmachines.com/
    • Jitendra Malik: Turing’s Baby
      • Symbols need to be linked to entities, which is done through perceptual and motor experiences or interactions
      • Turing: instead of trying to simulate the adult mind, simulate the child's brain
      • Features of a child (six lessons):
        • Multi-modal
        • Incremental: compositionality
        • Physical: act in the world
        • Explore: explore vs. exploit
        • Social: learn from others
        • Piaget's constructionism; lifelong, active, curriculum learning
      • Imitation: learning by imitating experts or learning by asking questions
      • Use language: weak supervision
      • Habitat: embodied AI; AI2-THOR -> active perception
      • Commonsense is not just facts; it is a collection of models
        • Currently: knowledge graphs for facts (e.g. VCR etc.)
        • Need to build mental models
      • Model-free reinforcement learning does not work right now; need to go back
      • Continual learning:
        • Currently just focuses on catastrophic forgetting
      • Vision exists to enable action, not just to work on static datasets
      • Learn from child development and do it in a compositional way
      • Why? Current methods:
        • Do not generalize well
        • Tasks are independent
        • A new model for every task
      • Correct paradigm for vision:
        • Vision = intermediate representation
        • Abstractions should be robust and support a variety of downstream tasks
        • Generic and transferable
        • How to build it?
          • Previously only at the category level, e.g. WordNet categories; now we need tasks
          • How to find related tasks: Taskonomy (Amir Zamir, CVPR 2018 best paper)
          • Robust learning through cross-task consistency
      • Currently in robotics:
        • Brittle
        • Sample-inefficient
        • Retrain for each task
        • Train in mid-level visual feature space instead of raw pixels (CoRL 2019: learning to navigate using mid-level visual priors)
        • With mid-level features you lose the "end-to-end" property but might gain modularity
        • Small modules trained end-to-end, then used as building blocks for bigger tasks
        • Fixed features may cause performance to plateau (asymptote)
          • Solution/trick: do side fine-tuning
      • Humans are second-best at lots of tasks; we need systems that can do that now
      • Connections are pruned and not added ???
      • Symbolic approaches:
        • Premature symbolization is a failure
        • Maybe not the right way, since it adds additional constraints
        • In images, people used to operate on edges, which worked well only up to a point
      • In language: people went from symbols to vectors, which brought huge improvements
        • Soft symbolism is better than hard-coded symbols
          • Since hard symbols fail on the long tail, etc.
      • Jitendra’s aha moment:
        • SGD works with over-parameterized models despite non-convex optimization
        • Language works best when sub-symbolic
      • Advice for junior researchers:
        • Keep a portfolio of projects with different success probabilities (e.g. 20% vs. 80%); diversify as with stocks
          • Stage 1: vague (can talk it over with friends over a beer)
          • Stage 2: more advanced/concrete ideas (to give to grad students to work on)