Meta-learning

This project implements several meta-learning methods, including prototypical networks and data hallucination for few-shot learning.

Prototypical network

Meta-learning trains on small episodes (N-way, K-shot) sampled from the full dataset so that the model learns how to learn. Each episode contains N classes with K labeled support examples per class.
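
The episode sampling described above can be sketched as follows. This is a minimal illustration, not the project's data loader: the function name `sample_episode`, the `n_query` parameter, and the toy `labels` list are assumptions made for the example.

```python
import random
from collections import defaultdict

def sample_episode(labels, n_way, k_shot, n_query, rng=None):
    """Return (support_idx, query_idx) index lists for one N-way K-shot episode.

    `labels` holds one class label per dataset item (hypothetical input).
    """
    rng = rng or random.Random()
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    # Choose N classes, then K support + n_query query items from each class.
    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(by_class[c], k_shot + n_query)
        support.extend(picked[:k_shot])
        query.extend(picked[k_shot:])
    return support, query

labels = [i // 20 for i in range(200)]  # toy dataset: 10 classes x 20 items
support_idx, query_idx = sample_episode(
    labels, n_way=5, k_shot=1, n_query=15, rng=random.Random(0)
)
```

Support indices are returned grouped by class, so the first `n_way * k_shot` rows of a batch built from them are the support set, one contiguous group per class.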


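  • Feature extractor
    import torch
    import torch.nn as nn

    def conv_block(in_channel, out_channel):
      # The layers after the convolution are elided in the source. BatchNorm,
      # ReLU and 2x2 max-pooling are assumed here (the usual block, and
      # consistent with the 1600-dim embedding for 84x84 inputs).
      bn = nn.BatchNorm2d(out_channel)
      return nn.Sequential(
          nn.Conv2d(in_channel, out_channel, 3, padding=1),
          bn,
          nn.ReLU(),
          nn.MaxPool2d(2),
      )

    class convnet(nn.Module):
      def __init__(self, in_channel=3, hid_channel=64, out_channel=64):
          super().__init__()
          self.encoder = nn.Sequential(
              conv_block(in_channel, hid_channel),
              conv_block(hid_channel, hid_channel),
              conv_block(hid_channel, hid_channel),
              conv_block(hid_channel, hid_channel),
          )

      def forward(self, x):
          x = self.encoder(x)
          # flatten each feature map into one embedding vector per image
          return x.view(x.size(0), -1)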
  • Loss function (distance calculation)
    x holds the query embeddings with shape (n_x, d) and y holds the class prototypes with shape (n_y, d).
        n_x, n_y = x.shape[0], y.shape[0]
    • l2 norm
        # squared Euclidean distance between each query embedding and each of the 5 ways
        distances = (
                x.unsqueeze(1).expand(n_x, n_y, -1) -
                y.unsqueeze(0).expand(n_x, n_y, -1)
        ).pow(2).sum(dim=2)
    • cosine
        # normalise to unit length; 1e-8 avoids division by zero
        normalised_x = x / (x.pow(2).sum(dim=1, keepdim=True).sqrt() + 1e-8)
        normalised_y = y / (y.pow(2).sum(dim=1, keepdim=True).sqrt() + 1e-8)
        expanded_x = normalised_x.unsqueeze(1).expand(n_x, n_y, -1)
        expanded_y = normalised_y.unsqueeze(0).expand(n_x, n_y, -1)
        cosine_similarities = (expanded_x * expanded_y).sum(dim=2)
        distances = 1 - cosine_similarities
    • dot
        # negative dot product, so that a larger similarity gives a smaller distance
        expanded_x = x.unsqueeze(1).expand(n_x, n_y, -1)
        expanded_y = y.unsqueeze(0).expand(n_x, n_y, -1)
        distances = -(expanded_x * expanded_y).sum(dim=2)
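
    How such distances turn into a training loss can be sketched as below. This is a minimal illustration of the usual prototypical-network recipe (class prototype = mean support embedding, logits = negative distances), not necessarily the exact code used in this project. The function name `prototypical_loss` and the toy data are assumptions.

```python
import torch
import torch.nn.functional as F

def prototypical_loss(support, query, query_labels, n_way, k_shot):
    """support: (n_way * k_shot, d), grouped by class; query: (n_q, d);
    query_labels: (n_q,) with values in [0, n_way)."""
    # Prototype = mean embedding of each class's support samples.
    prototypes = support.view(n_way, k_shot, -1).mean(dim=1)
    # Squared Euclidean distance between every query and every prototype.
    distances = (query.unsqueeze(1) - prototypes.unsqueeze(0)).pow(2).sum(dim=2)
    # A smaller distance means a higher score, so logits are negative distances.
    logits = -distances
    loss = F.cross_entropy(logits, query_labels)
    accuracy = (logits.argmax(dim=1) == query_labels).float().mean()
    return loss, accuracy

torch.manual_seed(0)
n_way, k_shot, d = 5, 3, 16
centers = torch.randn(n_way, d) * 10  # toy, well-separated class centers
support = (centers.unsqueeze(1) + 0.1 * torch.randn(n_way, k_shot, d)).view(-1, d)
query_labels = torch.arange(n_way).repeat_interleave(4)
query = centers[query_labels] + 0.1 * torch.randn(len(query_labels), d)
loss, accuracy = prototypical_loss(support, query, query_labels, n_way, k_shot)
```

    Swapping in the cosine or dot variant only changes how `distances` is computed; the cross-entropy over negative distances stays the same.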


  • Accuracy on the Mini-ImageNet dataset (used for both training and testing) with different numbers of shots
    • 5 ways 1 shot : 43.65 ± 2.10%
    • 5 ways 5 shot : 45.55 ± 0.82%
    • 5 ways 10 shot : 44.01 ± 0.92%
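
    Figures of the form "mean ± margin" are commonly obtained by averaging accuracy over many test episodes and reporting a 95% confidence interval. The source does not state how its margins were computed, so the sketch below is only one plausible way to do it. The function name and the per-episode accuracies are made up for the example.

```python
import math
import statistics

def mean_confidence_interval(accuracies, z=1.96):
    """Return (mean, half_width) of a normal-approximation 95% interval."""
    mean = statistics.mean(accuracies)
    # Standard error of the mean, from the sample standard deviation.
    sem = statistics.stdev(accuracies) / math.sqrt(len(accuracies))
    return mean, z * sem

episode_acc = [0.40, 0.48, 0.44, 0.42, 0.46]  # made-up per-episode accuracies
mean, half_width = mean_confidence_interval(episode_acc)
print(f"{100 * mean:.2f} ± {100 * half_width:.2f}%")
```

    With only a handful of episodes the normal approximation is rough; evaluations of this kind usually average over several hundred episodes.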

Data Hallucination

To enlarge the data available for learning, a generator that hallucinates additional samples is added to the model above.


  • Generator
    class generator(nn.Module):
      def __init__(self):
          super().__init__()
          # input: a 1600-dim embedding concatenated with 1600-dim noise
          self.linear1 = nn.Linear(1600*2,2400)
          self.relu1 = nn.ReLU()
          self.drop1 = nn.Dropout()
          self.linear2 = nn.Linear(2400,2400)
          self.relu2 = nn.ReLU()
          self.drop2 = nn.Dropout()
          self.linear3 = nn.Linear(2400,1600)
          self.relu3 = nn.ReLU()
      def forward(self,x_in,M,seed):
          # x_in: one 1600-dim support embedding; returns M hallucinated embeddings
          device = x_in.device
          H_stack = torch.tensor([], device=device)
          for _ in range(M):
              # NOTE: both branches are identical in the source, so `seed`
              # has no effect as written
              if seed :
                  noise = torch.randn(x_in.size(), device=device)
              else :
                  noise = torch.randn(x_in.size(), device=device)
              x = torch.cat((x_in,noise))
              x = self.drop1(self.relu1(self.linear1(x)))
              x = self.drop2(self.relu2(self.linear2(x)))
              x = self.relu3(self.linear3(x))
              H_stack = torch.cat((H_stack, x))
          return H_stack.view(-1,1600)

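    Combine the generator output with the original support data

    # requires: import numpy as np
    output = model(data)
    support_output = output[:ways * N_shot]
    # pick one random support sample per class as the input to the generator
    rand_sample = np.random.randint(0,N_shot,(ways,)) + np.arange(0, N_shot*ways, N_shot)
    new_support = torch.tensor([], device=output.device)
    for index , r in enumerate(rand_sample):
      hallucinate = generator(support_output[r,...] , M , False)
      # append the M hallucinated embeddings to this class's real support embeddings
      hallucinate = torch.cat((support_output[index*N_shot:(index+1)*N_shot,...] , hallucinate))
      new_support = torch.cat((new_support,hallucinate) , dim = 0)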


  • t-SNE visualization of the real and hallucinated data in the latent space

  • Accuracy for different numbers (M) of generated samples

    • M = 10 : 44.1 ± 1.1%
    • M = 50 : 44.26 ± 0.62%
    • M = 100 : 47.59 ± 0.79%
