How to change architecture of conditional GAN to generate 224x224x3 images?

Question

Alok 2022년 8월 4일

0
링크

이 질문에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/1773905-how-to-change-architecture-of-conditional-gan-to-generate-224x224x3-images

답변: Ayush Aniket 2025년 5월 9일

I am following matlab example on conditional GAN at https://www.mathworks.com/help/deeplearning/ug/train-conditional-generative-adversarial-network.html

This example is for image size 64x64x3. I am wondering what changes should be done in layersGenerator and layersDiscriminator to generate 224x224x3 images.

This is my code:

inputSize = [224 224 3] or [256 256 3];

Note if Factor=2 (below) then I get image size 128x128x3. If Factor=4 then generated image size if 256x256x3. However, during the loop, it gives an error that trainedVariance is negative.

inputSize = [64 64 3];
Factor = 4; %if Factor =2 then 128x128x3 image size is generated; 
inputSize = Factor*inputSize(1:2);
numClasses = 2;
augimds = augmentedImageDatastore(inputSize(1:2),XTrain,YTrain);
augimdsValidation = augmentedImageDatastore(inputSize(1:2),XValidation,YValidation);
numLatentInputs = 100;%100
embeddingDimension = 50;
numFilters = Factor*64;%224;
filterSize = 5;
projectionSize = Factor*[4 4 1024];
layersGenerator = [
    featureInputLayer(numLatentInputs)
    fullyConnectedLayer(prod(projectionSize))
    functionLayer(@(X) feature2image(X,projectionSize),Formattable=true)
    concatenationLayer(3,2,Name="cat");
    transposedConv2dLayer(filterSize,4*numFilters,Stride=2,Cropping="same")
    batchNormalizationLayer
    reluLayer
    transposedConv2dLayer(filterSize,2*numFilters,Stride=2,Cropping="same")
    batchNormalizationLayer
    reluLayer
    transposedConv2dLayer(filterSize,numFilters,Stride=2,Cropping="same")
    batchNormalizationLayer
    reluLayer
    transposedConv2dLayer(filterSize,3,Stride=2,Cropping="same")
    tanhLayer];
lgraphGenerator = layerGraph(layersGenerator);
layers = [
    featureInputLayer(1)
    embeddingLayer(embeddingDimension,numClasses)
    fullyConnectedLayer(prod(projectionSize(1:2)))
    functionLayer(@(X) feature2image(X,[projectionSize(1:2) 1]),Formattable=true,Name="emb_reshape")];
lgraphGenerator = addLayers(lgraphGenerator,layers);
lgraphGenerator = connectLayers(lgraphGenerator,"emb_reshape","cat/in2");
netG = dlnetwork(lgraphGenerator);
dropoutProb = 0.75;
%numFilters = 64;
scale = 0.2;
filterSize = 5;
layersDiscriminator = [
    imageInputLayer(inputSize,Normalization="none")
    dropoutLayer(dropoutProb)
    concatenationLayer(3,2,Name="cat")
    convolution2dLayer(filterSize,numFilters,Stride=2,Padding="same")
    leakyReluLayer(scale)
    convolution2dLayer(filterSize,2*numFilters,Stride=2,Padding="same")
    batchNormalizationLayer
    leakyReluLayer(scale)
    convolution2dLayer(filterSize,4*numFilters,Stride=2,Padding="same")
    batchNormalizationLayer
    leakyReluLayer(scale)
    convolution2dLayer(filterSize,8*numFilters,Stride=2,Padding="same")
    batchNormalizationLayer
    leakyReluLayer(scale)
    convolution2dLayer(Factor*4,1)];
lgraphDiscriminator = layerGraph(layersDiscriminator);
layers = [
    featureInputLayer(1)
    embeddingLayer(embeddingDimension,numClasses)
    fullyConnectedLayer(prod(inputSize(1:2)))
    functionLayer(@(X) feature2image(X,[inputSize(1:2) 1]),Formattable=true,Name="emb_reshape")];
lgraphDiscriminator = addLayers(lgraphDiscriminator,layers);
lgraphDiscriminator = connectLayers(lgraphDiscriminator,"emb_reshape","cat/in2");
netD = dlnetwork(lgraphDiscriminator);

However, the above code gives an error at

[~,~,gradientsG,gradientsD,stateG,scoreG,scoreD] = ...
            dlfeval(@modelLoss2,netG,netD,X,T,Z,flipFactor);

The size of generated image at

[XGenerated,stageG] = forward(netG,Z,T);

is 256x256x3. However, an error comes stating that trainedVariance is not positive

Could you assist me which transposedConv2dLayer to change to adjust the size to 224x224x3 or 256x256x3?

Thanks for your help

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Answer 1

Ayush Aniket 2025년 5월 9일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/1773905-how-to-change-architecture-of-conditional-gan-to-generate-224x224x3-images#answer_1564956

If you use a projection size that doesn't align with the upsampling path, the generator output won't match the expected image size, which can cause downstream errors (such as negative variance or shape mismatches).

Each transposedConv2dLayer with Stride=2 doubles the spatial resolution. The number of upsampling layers and the initial projection size must align so that after all upsampling, you reach your desired output size.The general rule is that if your initial projection size is [h, w, c] and you have n upsampling layers (each with Stride=2), your output size will be [h*2^n, w*2^n, outputChannels]. Therefore, for

1. 256x256x3 Output -

Start with: [4, 4, ...] projection size
Number of upsampling layers: 4
Calculation: 4 → 8 → 16 → 32 → 64 → 128 → 256 (for 6 layers, but typically 4 layers from 4 to 64, then up to 256)
But: 4 upsampling layers from [4,4] gives [64,64]`\, so you need 6 layers to go from 4 to 256.
However, your code uses 4 upsampling layers, so your projection should be [16,16, ...] for 256x256 output: 16 → 32 → 64 → 128 → 256 (4 layers, 16*2^4 = 256)

2. 224x224x3 Output -

224 is not a power of 2, so you need to start with a projection size that, after upsampling, results in 224.
224 = 14 * 2^4
So, start with [14,14, ...] and 4 upsampling layers.

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

How to change architecture of conditional GAN to generate 224x224x3 images?

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

답변 (1개)

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

참고 항목

카테고리

태그

제품

릴리스

Community Treasure Hunt

How to change architecture of conditional GAN to generate 224x224x3 images?

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

답변 (1개)

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

참고 항목

카테고리

태그

제품

릴리스

Community Treasure Hunt

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기